Bug: kernel 6.12.8 is not stable #1208

Open
petuhovskiy opened this issue Jan 17, 2025 · 3 comments
Labels
t/bug Issue Type: Bug

Comments

@petuhovskiy
Member

Environment

prod, staging

Steps to reproduce

Run many compute VMs on Linux kernel 6.12.8 (#1193) and trigger autoscaling (upscaling, CPU hotplug, etc.).
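
One way to exercise the CPU online/offline path from inside a guest is the sysfs `online` file under `/sys/devices/system/cpu/cpuN/`. The sketch below is only an assumption about how to stress that path. It may not match how the autoscaler actually triggers hotplug, and the helper names and the `root` parameter are hypothetical. Writing to the real sysfs files needs root, and cpu0 often cannot be taken offline.

```python
from pathlib import Path
import time

# Standard sysfs location for per-CPU controls on Linux.
SYSFS_CPU_ROOT = Path("/sys/devices/system/cpu")


def set_cpu_online(cpu: int, online: bool, root: Path = SYSFS_CPU_ROOT) -> None:
    """Write "1" or "0" to cpuN/online (requires root on a real system)."""
    (Path(root) / f"cpu{cpu}" / "online").write_text("1" if online else "0")


def toggle_cpu(cpu: int, cycles: int, delay_s: float = 0.5,
               root: Path = SYSFS_CPU_ROOT) -> None:
    """Take one CPU offline and bring it back online, `cycles` times."""
    for _ in range(cycles):
        set_cpu_online(cpu, False, root)
        time.sleep(delay_s)
        set_cpu_online(cpu, True, root)
        time.sleep(delay_s)
```

For example, `toggle_cpu(1, cycles=20)` run as root inside the VM cycles cpu1 twenty times and leaves it online afterwards.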

Expected result

Everything works without issues, and there are no errors in the kernel logs.
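
A rough way to check this expectation automatically is to scan the kernel log for suspicious lines. The sketch below is a minimal example: the function name is hypothetical, and the patterns cover only the kinds of messages reported in this issue, not every possible kernel error.

```python
import re

# Patterns based on the messages reported in this issue; not an exhaustive list.
SUSPICIOUS = re.compile(r"WARNING: CPU: \d+ PID: \d+ at |alloc failure|Tainted: ")


def find_kernel_warnings(dmesg_text: str) -> list[str]:
    """Return the kernel log lines that match any suspicious pattern."""
    return [line for line in dmesg_text.splitlines() if SUSPICIOUS.search(line)]
```

The input can come from `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`. An empty result means none of these patterns were seen, not that the kernel is healthy.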

Actual result

Warnings appear quite frequently in dmesg, usually right after a CPU hotplug:

[  128.464639] WARNING: CPU: 1 PID: 0 at arch/x86/kernel/cpu/cpuid-deps.c:117 do_clear_cpu_cap+0xdf/0x140
...stacktrace...
[  128.576531] kworker/0:1: vmemmap alloc failure: order:9, mode:0x4cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL), nodemask=(null),cpuset=/,mems_allowed=0
[  128.577673] CPU: 0 UID: 0 PID: 9 Comm: kworker/0:1 Tainted: G        W          6.12.8 #1	
[  128.578099] Tainted: [W]=WARN

The kernel becomes tainted (the W flag, set by the WARN above), which means its behavior from that point on can no longer be trusted.
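
The taint state can also be read as a bitmask from `/proc/sys/kernel/tainted`. The sketch below decodes that mask into flag letters, following the bit assignments in the kernel's tainted-kernels documentation (bits 0 to 17 only; newer kernels define more). The function names are hypothetical.

```python
from pathlib import Path

# Bit position -> (letter, meaning), per the kernel's tainted-kernels docs.
TAINT_FLAGS = {
    0: ("P", "proprietary module loaded"),
    1: ("F", "module force-loaded"),
    2: ("S", "out-of-spec system"),
    3: ("R", "module force-unloaded"),
    4: ("M", "machine check exception"),
    5: ("B", "bad page reference"),
    6: ("U", "taint requested by userspace"),
    7: ("D", "kernel oops or BUG"),
    8: ("A", "ACPI table overridden"),
    9: ("W", "kernel issued a warning"),
    10: ("C", "staging driver loaded"),
    11: ("I", "platform firmware workaround"),
    12: ("O", "out-of-tree module loaded"),
    13: ("E", "unsigned module loaded"),
    14: ("L", "soft lockup occurred"),
    15: ("K", "kernel live-patched"),
    16: ("X", "auxiliary taint (distro-defined)"),
    17: ("T", "built with struct randomization"),
}


def decode_taint(mask: int) -> list[str]:
    """Return the letters of all taint flags set in `mask`, lowest bit first."""
    return [letter for bit, (letter, _) in sorted(TAINT_FLAGS.items())
            if mask & (1 << bit)]


def read_taint(path: str = "/proc/sys/kernel/tainted") -> int:
    """Read the current taint bitmask from procfs."""
    return int(Path(path).read_text().strip())
```

A kernel tainted only by a warning reports a mask of 512 (bit 9), which `decode_taint` maps to `["W"]`.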

Other logs, links

https://neondb.slack.com/archives/C03TN5G758R/p1736944153528689

@petuhovskiy petuhovskiy added the t/bug Issue Type: Bug label Jan 17, 2025
@petuhovskiy
Member Author

The do_clear_cpu_cap warning still reproduces (locally) on 6.12.11.

@petuhovskiy
Member Author

I needed to know the latest stable kernel version, so I spent some time bisecting this issue. v6.9.12 looks stable and v6.10.1 does not, so the regression was most likely introduced at some point during development of v6.10.
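
For reference, the idea behind this kind of bisection can be sketched as a binary search over an ordered list of candidates (kernel versions or commits). This is a generic illustration, not the procedure actually used here. The `is_bad` callback is hypothetical and stands for "build this kernel, boot it, trigger hotplug, check dmesg"; the search assumes the first candidate is good, the last is bad, and everything after the first bad one is also bad.

```python
from typing import Callable, Sequence


def first_bad(candidates: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad candidate, assuming candidates[0] is good,
    candidates[-1] is bad, and badness never reverts once introduced."""
    if len(candidates) < 2:
        raise ValueError("need at least one known-good and one known-bad candidate")
    lo, hi = 0, len(candidates) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(candidates[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return candidates[hi]
```

With v6.9.12 as the known-good end and v6.10.1 as the known-bad end, the candidates in between would be the commits separating the two, which is what `git bisect` walks through.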

@petuhovskiy
Member Author
