Fix crashes on exit caused by wlroots listener checks #8578
base: master
On second thought, it might be more useful to remove the bulk of the listeners in Destroy listeners are still needed for a bunch of other components that aren't directly created in
If we could have a 1:1 mapping between what is set up in One thing is what is easiest to implement, another is what is easiest to maintain and to catch errors in later.
Force-pushed from 8ea9677 to 307ce0c.
IMO components that have This is sometimes hard to distinguish though; I'm currently handling these by making the I think I'll try to get this PR far enough to exit sway without crashes with this method, keeping as much as possible in separate commits for reviewability, and we'll sort out what the best path forward is during review.
Force-pushed from adfe709 to 8ee1fc7.
The current state makes sway exit correctly from a desktop containing a few konsole windows and a firefox window. I'm not completely sure about the correct handling of all of these cases, and I'm sure there are many more objects with listeners that need to be removed before destroy, but those don't trigger on a regular exit in my setup. I've already started writing a script to search for all objects in wlroots that have listener checks and their corresponding usages in sway, but that might take a while. In the meantime, I think it's better to mark this as reviewable now and see where we go afterwards.
Force-pushed from 8ee1fc7 to 6df3682.
Addendum to this: this works fine for The current implementation has this 1:1 mapping for objects created in
Force-pushed from 6df3682 to ca19d20.
Diff to state before last force-push:
diff --git a/sway/server.c b/sway/server.c
index 683ce169..79c8f542 100644
--- a/sway/server.c
+++ b/sway/server.c
@@ -458,6 +458,7 @@ bool server_init(struct sway_server *server) {
 void server_fini(struct sway_server *server) {
 	// remove listeners
+	wl_list_remove(&server->renderer_lost.link);
 	wl_list_remove(&server->new_output.link);
 	wl_list_remove(&server->layer_shell_surface.link);
 	wl_list_remove(&server->xdg_shell_toplevel.link);
@@ -474,14 +475,12 @@ void server_fini(struct sway_server *server) {
 	wl_list_remove(&server->xdg_activation_v1_request_activate.link);
 	wl_list_remove(&server->xdg_activation_v1_new_token.link);
 	wl_list_remove(&server->request_set_cursor_shape.link);
-#if WLR_HAS_XWAYLAND
-	wl_list_remove(&server->xwayland_surface.link);
-	wl_list_remove(&server->xwayland_ready.link);
-#endif
 	input_manager_finish(server);
 	// TODO: free sway-specific resources
 #if WLR_HAS_XWAYLAND
+	wl_list_remove(&server->xwayland_surface.link);
+	wl_list_remove(&server->xwayland_ready.link);
 	wlr_xwayland_destroy(server->xwayland.wlr_xwayland);
 #endif
 	wl_display_destroy_clients(server->wl_display);
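For readers unfamiliar with why these removals matter, here is a minimal sketch of the pattern. It does not use libwayland or wlroots: `toy_link`, `toy_object` and every `toy_*` function are hypothetical stand-ins, and the assumption that a listener check amounts to "the listener list must be empty at destroy time" is mine, not something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an intrusive doubly linked list node.
 * This is NOT the libwayland wl_list API, only a toy model of it. */
struct toy_link {
	struct toy_link *prev;
	struct toy_link *next;
};

static void toy_list_init(struct toy_link *head) {
	head->prev = head;
	head->next = head;
}

static void toy_list_insert(struct toy_link *head, struct toy_link *elm) {
	elm->prev = head;
	elm->next = head->next;
	head->next = elm;
	elm->next->prev = elm;
}

static void toy_list_remove(struct toy_link *elm) {
	elm->prev->next = elm->next;
	elm->next->prev = elm->prev;
	elm->prev = NULL;
	elm->next = NULL;
}

static bool toy_list_empty(const struct toy_link *head) {
	return head->next == head;
}

/* Hypothetical object that owns one signal (a list of listeners). */
struct toy_object {
	struct toy_link listeners;
};

/* Models a listener check: destruction is only allowed once every
 * listener has been removed. A real check would presumably abort
 * instead of returning false (assumption). */
static bool toy_object_can_destroy(const struct toy_object *obj) {
	return toy_list_empty(&obj->listeners);
}

static void toy_demo(void) {
	struct toy_object obj;
	struct toy_link listener;

	toy_list_init(&obj.listeners);
	toy_list_insert(&obj.listeners, &listener);
	assert(!toy_object_can_destroy(&obj)); /* listener still attached */

	toy_list_remove(&listener);            /* what server_fini() adds */
	assert(toy_object_can_destroy(&obj));
}
```

In this model, moving the removal directly in front of the destroy call (as the diff does for the Xwayland listeners) keeps setup and teardown visibly paired.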
Force-pushed from ca19d20 to 8932123.
I found an interesting crash related to the The error occurs while sway is trying to destroy the old renderer after a GPU reset and also exists before this PR. Call Trace: wlroots Detailed Stack Tracewlroots
We can fix this in sway if we can break the stack without causing issues elsewhere by either:
The first one is cleaner and is hopefully fine, but I'm not sure if there will be a bunch of things failing in between the renderer being lost and the idle callback running.
These seem like sensible approaches. I'll look into it and try to see if the first approach works without complications. IMO this could be done in a separate PR, since it doesn't really touch the exit logic and is more than just removing listeners. EDIT: implemented this as a separate commit in this PR.
Force-pushed from 8932123 to 9122400.
Force-pushed from 9122400 to e53f77a.
I just implemented this in the latest push. I haven't fully tested it yet since I don't know of a way to force a GPU reset or renderer loss. I'm still looking for a simple way to do this (for info: I'm using amdgpu on this laptop); otherwise I'll hack something in to make sway think a GPU reset happened to test this. I've already tried
Force-pushed from 2126919 to 9f0e3c0.
Rebased onto latest master. I've also tested the new delayed renderer recreation by faking the return value of @kennylevinsen / @llyyr: could you look at the new renderer lost handling and confirm that nothing looks out of the ordinary? Otherwise I think this PR is good to go. I've been running this for weeks now and haven't had any more crashes on exit. There might be a few listeners still somewhere in sway, but for daily use this should be fine, and the rest can be handled when we come across them (IMO anything is better than the current state of master where sway just crashes on exit).
This fixes a crash in wlroots listener checks. See swaywm#8509.
sway_input_method_relay can be destroyed from two sources: either the seat is destroyed, or the manager protocol objects are destroyed due to compositor exit. Therefore, finish must check whether it has already been called. This fixes a crash in wlroots listener checks. See swaywm#8509.
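A minimal sketch of such an "already finished" guard follows. It is a toy model under stated assumptions, not the code from this commit: `toy_relay`, its `finished` flag, the `listener_a`/`listener_b` fields and all `toy_*` helpers are hypothetical names. The list helper clears the node's pointers on removal, so in this model a second removal of the same node would dereference NULL, which is why the guard is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical intrusive list node; a toy model, not the libwayland API. */
struct toy_link {
	struct toy_link *prev;
	struct toy_link *next;
};

static void toy_list_init(struct toy_link *head) {
	head->prev = head;
	head->next = head;
}

static void toy_list_insert(struct toy_link *head, struct toy_link *elm) {
	elm->prev = head;
	elm->next = head->next;
	head->next = elm;
	elm->next->prev = elm;
}

/* Clears the node after unlinking, so removing it twice would crash. */
static void toy_list_remove(struct toy_link *elm) {
	elm->prev->next = elm->next;
	elm->next->prev = elm->prev;
	elm->prev = NULL;
	elm->next = NULL;
}

/* Hypothetical relay that listens on two signals. */
struct toy_relay {
	bool finished;  /* set once teardown has run */
	int removals;   /* counts listener removals, for the demo only */
	struct toy_link listener_a;
	struct toy_link listener_b;
};

/* Safe to call from both destruction paths: the second call is a no-op. */
static void toy_relay_finish(struct toy_relay *relay) {
	if (relay->finished) {
		return;
	}
	relay->finished = true;
	toy_list_remove(&relay->listener_a);
	toy_list_remove(&relay->listener_b);
	relay->removals += 2;
}

static void toy_demo(void) {
	struct toy_link signal_a, signal_b;
	struct toy_relay relay = {0};

	toy_list_init(&signal_a);
	toy_list_init(&signal_b);
	toy_list_insert(&signal_a, &relay.listener_a);
	toy_list_insert(&signal_b, &relay.listener_b);

	toy_relay_finish(&relay); /* first path, e.g. seat destroyed */
	toy_relay_finish(&relay); /* second path, e.g. compositor exit */

	assert(relay.removals == 2);
	assert(signal_a.next == &signal_a);
	assert(signal_b.next == &signal_b);
}
```

A boolean flag is only one way to make teardown idempotent; whether the actual commit uses a flag or some other check is not visible in this thread.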
Change begin_destroy to remove event listeners before the final destroy, since otherwise event listeners would be removed twice, which crashes. This fixes a crash in wlroots listener checks. See swaywm#8509.
Destroying the wlr_renderer in a callback to its own renderer_lost event is unsafe due to wl_signal_emit*() still accessing it after it was destroyed. Delegate recreation of renderer to an idle callback and ensure that only one such idle callback is scheduled at a time by storing the returned event source.
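The deferral described in this commit message can be sketched as follows. This is a self-contained toy model, not the sway implementation: `toy_server`, `toy_idle_source` and the `toy_*` functions are hypothetical, and a one-slot idle queue stands in for the event loop. In libwayland, the corresponding call would be `wl_event_loop_add_idle()`, which returns a `struct wl_event_source *`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*toy_idle_func)(void *data);

/* Hypothetical one-slot idle queue standing in for an event loop. */
struct toy_idle_source {
	toy_idle_func func;
	void *data;
	bool pending;
};

struct toy_server {
	int renderer_generation;          /* bumped on each recreation */
	int scheduled_count;              /* how often work was scheduled */
	struct toy_idle_source slot;      /* storage for the idle source */
	struct toy_idle_source *recreate; /* non-NULL while scheduled */
};

/* Runs later, outside the signal emission: safe to replace the renderer. */
static void toy_recreate_renderer(void *data) {
	struct toy_server *server = data;
	server->recreate = NULL; /* allow a future loss to schedule again */
	server->renderer_generation++;
}

/* Called from inside the "renderer lost" emission: only schedules work,
 * and only if no recreation is already pending. */
static void toy_handle_renderer_lost(struct toy_server *server) {
	if (server->recreate != NULL) {
		return;
	}
	server->slot.func = toy_recreate_renderer;
	server->slot.data = server;
	server->slot.pending = true;
	server->recreate = &server->slot;
	server->scheduled_count++;
}

/* Models the event loop running idle callbacks after emission returned. */
static void toy_dispatch_idle(struct toy_server *server) {
	if (server->slot.pending) {
		server->slot.pending = false;
		server->slot.func(server->slot.data);
	}
}

static void toy_demo(void) {
	struct toy_server server = {0};

	toy_handle_renderer_lost(&server);
	toy_handle_renderer_lost(&server); /* duplicate event is ignored */
	assert(server.scheduled_count == 1);
	assert(server.renderer_generation == 0); /* nothing replaced yet */

	toy_dispatch_idle(&server);
	assert(server.renderer_generation == 1);
	assert(server.recreate == NULL);
}
```

Storing the pending source in the server struct is what prevents a second lost event from scheduling a second recreation before the first one has run.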
Force-pushed from 9f0e3c0 to 95117e8.
Rebased again onto latest master. I also did a quick grep over the codebase and from a first look it seems like every remaining
I've also tested this a bit and haven't seen any exit crashes.
This PR fixes sway crashing on exit due to event listeners not being removed from wlroots objects on exit.
Listeners fixed:
- server_init(), reworked handle_renderer_lost() to use an idle callback
- wlr_backend.new_input, wlr_virtual_keyboard, wlr_virtual_pointer, wlr_keyboard_shortcuts_inhibit, wlr_transient_seat_manager
- wlr_idle_inhibit_manager
- wlr_text_input_manager and wlr_input_method_manager
- wlr_scene_buffer.output_enter and wlr_scene_buffer.output_leave
This doesn't fix all wlroots objects with listener checks, just those that lead to immediate crashes on exit. I'll wait for a review of my general method of fixing this before continuing.