Cursor movement starts to lag after prolonged execution of a browser app #1037

Open
Schweber opened this issue Jan 24, 2025 · 33 comments

Comments

@Schweber

When playing the browser sim/game www.lstsim.de for a long time (at least about an hour), the cursor movement becomes more and more laggy (like a game running at 10 fps or so). Closing the browser doesn't help; I have to restart niri for it to work normally again (log out and log in again, or reboot, of course).

This started happening with niri 25.01 and didn't happen before. It happens with both LibreWolf and Brave.

System Information

  • niri version: niri 25.01 (unknown commit)
  • Distro: NixOS 24.11
  • GPU: Intel i5 11600 iGPU
  • CPU: Intel i5 11600
Schweber added the bug label Jan 24, 2025
@YaLTeR
Owner

YaLTeR commented Jan 24, 2025

Could you please record a Tracy profile?

You'll need to build and run niri with cargo build --release --features=profile-with-tracy-ondemand. Then you'll need to build and run the Tracy profiler v0.11.1 from https://github.com/wolfpld/tracy/tree/v0.11.1

When the cursor is laggy enough, attach it to niri, record a few seconds of moving the cursor around, then save the recording and upload it somewhere here.

@Schweber
Author

I added this feature to the package.nix of nixpkgs, but I'm getting the following errors:

warning: The interpretation of store paths arguments ending in `.drv` recently changed. If this command is now failing try again with '/nix/store/fcyk2wm0lahlqvl5khq0j79s052lx72r-niri-25.01.drv^*'
Running phase: unpackPhase
@nix { "action": "setPhase", "phase": "unpackPhase" }
unpacking source archive /nix/store/6j3g9say3dmww0zfvsj00b5fbmvh485j-source
source root is source
Executing cargoSetupPostUnpackHook
Finished cargoSetupPostUnpackHook
Running phase: patchPhase
@nix { "action": "setPhase", "phase": "patchPhase" }
patching script interpreter paths in resources/niri-session
resources/niri-session: interpreter directive changed from "#!/bin/sh" to "/nix/store/gwgqdl0242ymlikq9s9s62gkp5cvyal3-bash-5.2p37/bin/sh"
Executing cargoSetupPostPatchHook
Validating consistency between /build/source/Cargo.lock and /build/niri-25.01-vendor/Cargo.lock
Finished cargoSetupPostPatchHook
Running phase: updateAutotoolsGnuConfigScriptsPhase
@nix { "action": "setPhase", "phase": "updateAutotoolsGnuConfigScriptsPhase" }
Running phase: configurePhase
@nix { "action": "setPhase", "phase": "configurePhase" }
Running phase: buildPhase
@nix { "action": "setPhase", "phase": "buildPhase" }
Executing cargoBuildHook
cargoBuildHook flags: -j 12 --target x86_64-unknown-linux-gnu --offline --profile release --no-default-features --features=profile-with-tracy-ondemand\,dbus\,>
   Compiling proc-macro2 v1.0.92
   Compiling unicode-ident v1.0.14
   Compiling pkg-config v0.3.31
   Compiling serde v1.0.217
   Compiling libc v0.2.169
   Compiling hashbrown v0.15.2
   Compiling winnow v0.6.22
   Compiling equivalent v1.0.1
   Compiling autocfg v1.4.0
   Compiling cfg-if v1.0.0
   Compiling heck v0.5.0
   Compiling target-lexicon v0.12.16
   Compiling smallvec v1.13.2
   Compiling once_cell v1.20.2
...skipping...
test utils::watcher::tests::create_dir_and_file ... ok
test utils::watcher::tests::change_file ... ok
test utils::watcher::tests::recreate_dir ... ok
test utils::watcher::tests::change_file_in_linked_dir ... ok
test utils::watcher::tests::change_linked_file ... ok
test utils::watcher::tests::recreate_file ... ok
test utils::watcher::tests::swap_dir ... ok
test utils::watcher::tests::swap_dir_link ... ok
test utils::watcher::tests::swap_just_link ... ok

failures:

---- input::tests::bindings_suppress_keys stdout ----
thread 'input::tests::bindings_suppress_keys' panicked at niri-config/src/lib.rs:1721:21:
span! without a running Client
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- layout::tests::config_change_updates_cached_sizes stdout ----
thread 'layout::tests::config_change_updates_cached_sizes' panicked at niri-config/src/lib.rs:1721:21:
span! without a running Client

---- layout::tests::empty_workspaces_dont_move_back_to_original_output stdout ----
thread 'layout::tests::empty_workspaces_dont_move_back_to_original_output' panicked at src/animation/mod.rs:175:21:
span! without a running Client

---- layout::tests::interactive_move_drop_on_other_output_during_animation stdout ----
thread 'layout::tests::interactive_move_drop_on_other_output_during_animation' panicked at src/animation/mod.rs:175:21:
span! without a running Client

---- layout::tests::fixed_height_takes_max_non_auto_into_account stdout ----
thread 'layout::tests::fixed_height_takes_max_non_auto_into_account' panicked at src/animation/mod.rs:175:21:
span! without a running Client

---- layout::tests::interactive_move_onto_empty_output stdout ----
thread 'layout::tests::interactive_move_onto_empty_output' panicked at src/animation/mod.rs:175:21:
   Compiling async-broadcast v0.7.2
   Compiling blocking v1.6.1
   Compiling rayon v1.10.0
   Compiling async-io v2.4.0
   Compiling async-fs v2.1.2
   Compiling async-signal v0.2.10
   Compiling async-process v2.3.0
   Compiling zbus v5.2.0
   Compiling niri v25.1.0 (/build/source)
    Finished `release` profile [optimized + debuginfo] target(s) in 1m 31s
     Running unittests src/lib.rs (target/x86_64-unknown-linux-gnu/release/deps/niri-c995a5e4dd3c823f)

running 101 tests
test animation::clock::tests::frozen_clock ... ok
test animation::clock::tests::rate_change ... ok
test layout::scrolling::tests::large_fractional_strut ... ok
test layout::scrolling::tests::working_area_starts_at_physical_pixel ... ok
test input::tests::comp_mod_handling ... ok
test input::tests::bindings_suppress_keys ... FAILED
test layout::tests::config_change_updates_cached_sizes ... FAILED
test layout::tests::close_window_empty_ws_above_first ... ok
test layout::tests::empty_workspaces_dont_move_back_to_original_output ... FAILED
test layout::tests::add_and_remove_output ... ok
test layout::tests::fullscreen ... ok
test layout::tests::interactive_move_drop_on_other_output_during_animation ... FAILED
test layout::tests::large_max_size ... ok
test layout::tests::fixed_height_takes_max_non_auto_into_account ... FAILED
test layout::tests::large_negative_height_change ... ok
test layout::tests::interactive_move_onto_empty_output ... FAILED
test layout::tests::named_workspace_to_output ... ok
test layout::tests::interactive_resize_to_negative ... FAILED
test layout::tests::interactive_move_onto_empty_output_ewaf ... FAILED
test layout::tests::interactive_move_onto_last_workspace ... FAILED
test layout::tests::interactive_move_onto_first_empty_workspace ... FAILED
test layout::tests::named_workspace_to_output_ewaf ... ok
...skipping...
    layout::tests::unfullscreen_view_offset_not_reset_on_gesture
    layout::tests::unfullscreen_view_offset_not_reset_on_removal
    layout::tests::unfullscreen_view_offset_set_on_fullscreening_inactive_tile_in_column
    layout::tests::unfullscreen_window_in_column
    layout::tests::window_closed_on_previous_workspace
    layout::tests::windows_on_other_workspaces_remain_activated
    layout::tests::workspace_cleanup_during_switch
    layout::tests::workspace_transfer_during_switch
    layout::tests::workspace_transfer_during_switch_from_last
    layout::tests::workspace_transfer_during_switch_gets_cleaned_up
    tests::floating::floating_doesnt_store_fullscreen_size
    tests::floating::floating_respects_non_fixed_min_max_rule
    tests::floating::interactive_move_restores_floating_size_when_set_to_floating
    tests::floating::interactive_move_unfullscreen_to_floating_restores_size
    tests::floating::moving_across_workspaces_doesnt_cancel_resize
    tests::floating::moving_to_floating_doesnt_cancel_resize
    tests::floating::resize_during_interactive_move_propagates_to_floating
    tests::floating::resize_in_steps
    tests::floating::resize_to_different_size
    tests::floating::resize_to_different_then_same
    tests::floating::resize_to_same_size
    tests::floating::restore_floating_size
    tests::floating::set_window_height_uses_current_width
    tests::floating::set_window_width_uses_current_height
    tests::floating::state_change_doesnt_break_use_window_size
    tests::floating::unfocus_preserves_current_size
    tests::window_opening::dont_ack_initial_configure
    tests::window_opening::simple
    tests::window_opening::simple_no_workspaces
    tests::window_opening::target_output_and_workspaces
    tests::window_opening::target_size

test result: FAILED. 43 passed; 58 failed; 0 ignored; 0 measured; 0 filtered out; finished in 1.02s

error: test failed, to rerun pass `--lib`

Can you help me with that? My knowledge about that is very limited.

@YaLTeR
Owner

YaLTeR commented Jan 24, 2025

Do not run tests with that feature flag.

@Schweber
Author

When I launch Tracy, it fails with wp_viewport#16: error 2: source rectangle x=0,y=0,w=1915.5,h=2155.5 extends outside of the content area x=0,y=0,w=1277,h=1437. Is there something that I need to configure?

@YaLTeR
Owner

YaLTeR commented Jan 24, 2025

I think if you run it a few times it'll launch

@Schweber
Author

Unfortunately it's not working, no matter how often I try.

@YaLTeR
Owner

YaLTeR commented Jan 25, 2025

Strange. Could you try a blanket open-floating true rule? Maybe it'll help.

@Schweber
Author

It makes no difference

@YaLTeR
Owner

YaLTeR commented Jan 25, 2025

Are you using fractional scale? Maybe if you set scale = 1 then it will work. Strange because I can run Tracy fine, if it errors it's usually only once. It's some race condition in how it handles OpenGL probably.

@Schweber
Author

scale = 1 is a good idea but doesn't work either. I'm sorry that I can't be of more help regarding this issue.

I'll leave this issue open and update it if the problem goes away, if that's OK.

@YaLTeR
Owner

YaLTeR commented Jan 25, 2025

Could you try something different then? Record a profile with perf like I described here: #602 (comment)

@Schweber
Author

Schweber commented Jan 25, 2025

I added debuginfo as a buildFeature but it says that this feature does not exist:

> cargoBuildHook flags: -j 12 --target x86_64-unknown-linux-gnu --offline --profile release --no-default-features --features=debuginfo\,profile-with-tracy-ondemand\,dbus\,xdp-gnome-screencast\,systemd
> error: none of the selected packages contains these features: debuginfo

What is meant by build and run niri with debuginfo?

@YaLTeR
Owner

YaLTeR commented Jan 25, 2025

It just means not stripping out debug information. It shouldn't be specific to niri.
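
One way to check whether a given niri binary still carries debug information is to look for a `.debug_info` section in its ELF section headers. The sketch below assumes `readelf` (from binutils) is installed and that `niri` is on `PATH`; the helper name `has_debug_info` is made up for illustration and is not part of niri or nixpkgs.

```shell
# Succeeds if the `readelf -S` output on stdin lists a .debug_info section.
# (has_debug_info is an illustrative helper name.)
has_debug_info() {
  grep -q '\.debug_info'
}

# Example usage (assumes readelf is installed and niri is on PATH):
#   readelf -S "$(command -v niri)" | has_debug_info \
#     && echo "debug info present" || echo "binary looks stripped"
```

If the section is missing, the profile will most likely show raw addresses instead of function names.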

@YaLTeR
Owner

YaLTeR commented Jan 25, 2025

@sodiboo hey do you know how to do this on nix, or does it already do the right thing?

@sodiboo
Contributor

sodiboo commented Jan 25, 2025

set dontStrip = true; and also pass some rust flags. something like this. nixpkgs doesn't do it I believe, but I think my flake does what is desired here already? so maybe as simple as just using that as a package source.

@Schweber
Author

Schweber commented Jan 25, 2025

Image

In this time frame, the mouse was laggy. Brave was still running with an instance of www.lstsim.de running for about an hour and a youtube video playing in the background.

@mitsuami-megane

mitsuami-megane commented Jan 26, 2025

Are you sure you're only getting this on 25.01? I've been having a similar issue that I couldn't put my finger on, for a good couple of months now (maybe since October).

For me the issue happens in games like ARK and Cyberpunk, the mouse lag starts maybe 30-60 minutes after starting the game, and restarting the game fixes it.

Maybe in your case there's a browser background process that stays alive and prevents the browser restart from fixing it?

I first thought the game is simply lagging so I did the usual kernel/mesa/proton dance with no luck. Then I thought it might be an xwayland-satellite issue, but it seems to also happen in rootful XWayland and also with gamescope (when running under Niri).

I will try and see if the same can be reproduced under River.

Edit: also happens on River

@YaLTeR
Owner

YaLTeR commented Jan 26, 2025

Wtf, this is really weird. I wonder if something regressed in calloop, maybe some kind of leak? Might be worth trying to downgrade calloop in niri, for example to v0.14.0. Maybe this is some oversight in the change to timer reinsertion in v0.14.1 (released on the 5th of September).

@Schweber
Author

Schweber commented Jan 26, 2025

> Are you sure you're only getting this on 25.01?

Then I'll rephrase my statement as "I never noticed it before 25.01, and I'm pretty sure that I would have".

> Maybe in your case there's a browser background process that stays alive and prevents the browser restart from fixing it?

I was thinking the same thing, but I checked with btop that everything is shut down. Memory consumption isn't excessive either, from what I can tell, so I don't have an answer as to the cause.

> but it seems to also happen in rootful XWayland and also with gamescope (when running under Niri).

I'm using niri in a pure Wayland environment.

@mitsuami-megane

Updated the above comment, seems like my issue is not directly related to Niri as it also happens on River. Not sure if OP's issue is the same one.

@Myned

Myned commented Jan 28, 2025

I'll try to find time to look into this further, but reducing the polling rate of my mouse to 500Hz from 1000Hz helped dramatically reduce the CPU usage of niri while moving the cursor, which mostly prevented the lag from occurring.

@YaLTeR
Owner

YaLTeR commented Jan 29, 2025

It makes sense that this would help, but it would be really good to get to the bottom of this.

@mitsuami-megane

Are you on a high refresh rate monitor? Moving the mouse on my 144hz shows around 10% CPU usage, with around 4-6% on the 60hz. I can imagine how on 240hz or higher it could become problematic.

@YaLTeR
Owner

YaLTeR commented Jan 29, 2025

Also, to reiterate: that flame graph is very strange. That unregister function absolutely should not dominate it, regardless of mouse Hz or monitor refresh rate. So if this is some kind of leak in calloop, then it should be fixed.

@Schweber
Author

Schweber commented Jan 29, 2025

I'm on a 60 Hz 4K monitor scaled to 1.5, and my mouse is indeed set to 1000 Hz, but this wasn't a problem until now.

@Myned

Myned commented Feb 2, 2025

Some additional observations on an Intel UHD 770 iGPU with 3440x1440@100Hz monitor:

  • niri --session runs on a single core that spikes to 70-100% usage when rapidly moving a 1000Hz mouse
  • powerprofilesctl power-saver makes the stutter more apparent, likely due to lower performance of a single core
  • Lowering refresh rate of the monitor from 100Hz to 60Hz did not change the CPU usage nor reduce the amount of stutter
  • libinput errors in niri service log during mouse movement:
libinput error: event22 - Razer Razer Viper Ultimate Dongle: client bug: event processing lagging behind by 70ms, your system is too slow
  • hide-after-inactive-ms 15000 heavily increases CPU usage during mouse movement even without power-saver mode

Flame graph after a couple seconds of minor movement with hide-after-inactive-ms 15000, polling rate 1000Hz, and power-saver mode:

Image
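
To quantify the libinput warnings quoted above, the largest reported delay can be pulled out of the log. This is a minimal sketch, assuming niri runs as a systemd user unit named `niri.service` and that the messages have the same shape as the quoted line; `max_libinput_lag_ms` is a hypothetical helper name.

```shell
# Print the largest "lagging behind by <N>ms" value found on stdin.
max_libinput_lag_ms() {
  grep -o 'lagging behind by [0-9][0-9]*ms' \
    | grep -o '[0-9][0-9]*' \
    | sort -n \
    | tail -n 1
}

# Example usage (assumes a systemd user unit named niri.service):
#   journalctl --user -u niri.service -b | max_libinput_lag_ms
```

Comparing this number across settings (polling rate, power profile, hide-after-inactive-ms) gives a rough measure that is easier to compare than perceived stutter.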

@YaLTeR
Owner

YaLTeR commented Feb 2, 2025

Again, this unregister() call is taking an unreasonable amount of time, odd. I mean, yeah, it's called on every mouse input event in this case, but I don't think it's supposed to take this long. Maybe you could try the older calloop =0.14.0, from before some changes to timer re-registering?

@Myned

Myned commented Feb 2, 2025

Not positive I built niri-flake correctly, but it seems similar to calloop 0.14.2 with the same performance characteristics:

~/.dev/niri/Cargo.toml

calloop = { version = "0.14.0", features = ["executor", "futures-io"] }

cargo update calloop@0.14.2 --precise 0.14.0

~/.dev/niri/Cargo.lock

name = "calloop"
version = "0.14.0"

~/.dev/niri-flake/flake.nix

niri-unstable.url = "git+file:///home/user/.dev/niri";

/etc/nixos/flake.nix

inputs.niri-flake.url = "git+file:///home/user/.dev/niri-flake";

Image

@YaLTeR
Owner

YaLTeR commented Feb 3, 2025

Thanks. Let's assume the problem isn't in the calloop updates then.

Maybe someone manages to run Tracy and record a Tracy profile? It would be very helpful in diagnosing this.

@YaLTeR
Owner

YaLTeR commented Feb 3, 2025

Actually, you don't even need to run the profile UI to capture. There's a separate capture folder in Tracy that builds a CLI recording tool. You can use it like this: ./tracy-capture -o ~/output.tracy -p 8087 (try port number 8086 and then go +1 until it works)
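
Instead of guessing the port, it may be possible to read it from the list of listening TCP sockets, since the Tracy client inside niri waits for the profiler on a TCP port. This is a rough sketch, assuming `ss` from iproute2 is available and that niri shows up under the process name `niri`; `niri_listen_ports` is an illustrative helper, not a Tracy or niri tool.

```shell
# Print the TCP ports that a process named "niri" is listening on.
# Reads `ss -Hltnp`-style lines on stdin, where field 4 is "address:port".
niri_listen_ports() {
  awk '/"niri"/ { n = split($4, parts, ":"); print parts[n] }'
}

# Example usage (assumes a niri built with profile-with-tracy-ondemand):
#   port=$(ss -Hltnp | niri_listen_ports | head -n 1)
#   ./tracy-capture -o ~/output.tracy -p "$port"
```

If nothing is printed, the running niri was probably not built with the Tracy feature.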

@hallettj
Contributor

hallettj commented Feb 3, 2025

Hi! I've had the same issue. In my case it comes up when playing Overwatch in Wine, and things start getting laggy after around 10 minutes. I first ran into this problem around June when I upgraded to NixOS 24.05. There was no problem in NixOS 23.11 even with the same niri version in both OS versions. I explained my findings at that point in the Matrix chat, and I kept a copy of what I wrote: https://gist.github.com/hallettj/f68ae8fc9dccbdc709f6cce75fdd4cfb

Eventually the problem went away for a long time. I wish I had paid more attention to exactly when it got better. But now I've been seeing the problem again starting sometime in the last month I think. It's hard to pin down because I had a separate issue with gamescope that discouraged me from playing games in Niri until I found a workaround a couple of days ago.

My hunch is there's an issue with some non-Rust system dependency that somehow affects Niri but not Gnome or Sway, and the problem has more to do with a high rate of mouse events over a prolonged period of time than with graphics or system load.

I also tested building niri v25.01 with calloop downgraded to 0.14.0, and I'm certain I built correctly, and I agree that the issue is still present. I did notice that there are two copies of calloop present - 0.13.0 is also in there. I downgraded the one that niri references directly, the one that was at 0.14.2.

I can work on a Tracy profile when I next have a bit of time.
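
Since two copies of calloop can coexist in the dependency graph, it helps to confirm which versions the lock file actually pins after a downgrade. This is a small sketch that reads `Cargo.lock` on stdin; `lock_versions` is a hypothetical helper, and it relies on each `version` line directly following its `name` line, as in the lock file excerpt earlier in the thread.

```shell
# Print every version of package $1 recorded in a Cargo.lock read from stdin.
lock_versions() {
  awk -v pkg="$1" '
    $1 == "name"             { found = ($3 == ("\"" pkg "\"")); next }
    found && $1 == "version" { gsub(/"/, "", $3); print $3; found = 0 }
  '
}

# Example usage from the niri checkout:
#   lock_versions calloop < Cargo.lock
```

`cargo tree --duplicates` reports similar information and also shows which crates pull in each copy.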

@YaLTeR
Owner

YaLTeR commented Feb 3, 2025

Thanks. Interesting. Unfortunately, this makes it less clear if anything, heh. But yeah, a Tracy profile could maybe shed more light on this.

@Myned

Myned commented Feb 12, 2025

It's odd, but the stutter seems to smooth out while a video is playing onscreen, and starts stuttering again when paused or there's minimal movement. That may make this related to #855 with VRR and direct scanout.

Tracy profiles with the same settings as before:
idle.zip
video.zip
