Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

syncobj: use eventfd instead of stalling fd checks #9437

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

gulafaran
Copy link
Contributor

@gulafaran gulafaran commented Feb 18, 2025

use eventfd and add it to the event loop and when it recieves a signal release the buffer, this means we dont stall entire compositor when waiting for materilisation of the fd. and change its related usage, this also means we need to store release points that can popup in a container and signal them all on buffer release.

im not sure why directscanout previously manually tried to signal releasepoints. remove that and let the buffers releaser handle it.

should solve a lot of "compositor 5 fps while ingame overlay shows 300fps" and other similiar weird stalls.

on nvidia this shuffles compositor stalls to sometimes a noticeable rendering stutter, technically the acquire point isnt signaled fast enough so the buffer is stuck and nothing gets rendered until it is. pretty much the same stall as seen before only now its per surface rendering. but that i contribute to nvidia driver being simply shit, and is visible on other compositors.

NVIDIA/open-gpu-kernel-modules#777
NVIDIA/open-gpu-kernel-modules#743

Possibly fixes:
#7857

Copy link
Member

@vaxerski vaxerski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huge if works

use eventfd and add it to the event loop and when it recieves a signal
release the buffer, this means we dont stall entire compositor when
waiting for materilisation of the fd. and change its related usage, this
also means we need to store release points that can popup in a container
and signal them all on buffer release.

im not sure why directscanout previously manually tried to signal
releasepoints. remove that and let the buffers releaser handle it.
lets use unique_pointers all over, add defaulted destructors and make
fromResource directly check existing timelines instead of C style
casting and return a WP from that. instead of moving the releasepoint
vector lets swap it.
@gulafaran gulafaran force-pushed the eventfd branch 3 times, most recently from 25e5e5c to 53df780 Compare February 24, 2025 09:52
remove early buffer release config that previously tried to workaround
flickers while it was more of an syncobj issue, it also cant be used
since the buffer really is used for release point syncing in the
renderer later. and use std::move instead of std::exchange to avoid a
local temporar copy being created inside std::exchange.
@ikalco
Copy link
Contributor

ikalco commented Feb 26, 2025

wait... for ds, shouldn't we release the wl_buffer whenever DRM signals the out fence?
so the client can reuse that buffer, and the output won't artifact cause drm said scanout is done

edit:
uuuh it seems HL just isn't using the out fence lol, am I missing something?
for reference (https://docs.kernel.org/gpu/drm-kms.html)

“OUT_FENCE_PTR”:
Use this property to pass a file descriptor pointer to DRM. Once the Atomic Commit request call returns OUT_FENCE_PTR will be filled with the file descriptor number of a Sync File. This Sync File contains the CRTC fence that will be signaled when all framebuffers present on the Atomic Commit * request for that given CRTC are scanned out on the screen.

@gulafaran
Copy link
Contributor Author

wait... for ds, shouldn't we release the wl_buffer whenever DRM signals the out fence? so the client can reuse that buffer, and the output won't artifact cause drm said scanout is done

edit: uuuh it seems HL just isn't using the out fence lol, am I missing something? for reference (https://docs.kernel.org/gpu/drm-kms.html)

“OUT_FENCE_PTR”:
Use this property to pass a file descriptor pointer to DRM. Once the Atomic Commit request call returns OUT_FENCE_PTR will be filled with the file descriptor number of a Sync File. This Sync File contains the CRTC fence that will be signaled when all framebuffers present on the Atomic Commit * request for that given CRTC are scanned out on the screen.

There is a lot wrong at the moment, explicit sync currently relies on new buffer attached, the client can commit multiple times on the same buffer if it wants to, and does sometimes, and this PR probably exposed that quirk even more, and yeah direct scanout i havent begun reading into but removed that buffer manual buffer shenanigans because it was conflicting with the changes i made, there is also a few other things im checking that has popped up while testing

@gulafaran
Copy link
Contributor Author

wait... for ds, shouldn't we release the wl_buffer whenever DRM signals the out fence? so the client can reuse that buffer, and the output won't artifact cause drm said scanout is done

edit: uuuh it seems HL just isn't using the out fence lol, am I missing something? for reference (https://docs.kernel.org/gpu/drm-kms.html)

“OUT_FENCE_PTR”:
Use this property to pass a file descriptor pointer to DRM. Once the Atomic Commit request call returns OUT_FENCE_PTR will be filled with the file descriptor number of a Sync File. This Sync File contains the CRTC fence that will be signaled when all framebuffers present on the Atomic Commit * request for that given CRTC are scanned out on the screen.

And "releasing" when it comes to explicit sync is sending the release point not the buffer .release

@UjinT34
Copy link
Contributor

UjinT34 commented Feb 26, 2025

output->state->enableExplicitOutFenceForNextCommit() should be called before the commit if out fence is needed. It fills output->state->state().explicitOutFence.
Who should handle that fd?

Note that OUT_FENCE_PTR shouldn't be used when tearing is allowed.

@JunaidQrysh
Copy link

I tested this pr & it fixed glmark2 lag but minecraft lag is still the same and also it introduced slight lag in other areas like workspace switching and cursor lag when wlogout is opened.

@ikalco
Copy link
Contributor

ikalco commented Feb 26, 2025

output->state->enableExplicitOutFenceForNextCommit() should be called before the commit if out fence is needed. It fills output->state->state().explicitOutFence. Who should handle that fd?

Note that OUT_FENCE_PTR shouldn't be used when tearing is allowed.

we use that fd, i think we import it as a sync_file, then wait for it to signal, meaning aq plane is done with the primary buffer

@gulafaran
Copy link
Contributor Author

I tested this pr & it fixed glmark2 lag but minecraft lag is still the same and also it introduced slight lag in other areas like workspace switching and cursor lag when wlogout is opened.

yeah got a few very broken things to figure out, im trying tho.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants