One big difference that I’ve noticed between Windows and Linux is that Windows does a much better job ensuring that the system stays responsive even under heavy load.
For instance, I often need to compile Rust code. Anyone who writes Rust knows that the Rust compiler is very good at using all your cores and all the CPU time it can get its hands on (which is good, you want it to compile as fast as possible after all). But that means that for a time while my Rust code is compiling, I will be maxing out all my CPU cores at 100% usage.
When this happens on Windows, I’ve never really noticed. I can use my web browser or my code editor just fine while the code compiles, so I’ve never really thought about it.
However, on Linux when all my cores reach 100%, I start to notice it. It seems like every window I have open starts to lag, and I get stuttering as the programs struggle to get the little bit of CPU that’s left. My web browser starts lagging with whole seconds of no response and my editor behaves the same. Even my KDE Plasma desktop environment starts lagging.
I suppose Windows must be doing something clever to somehow prioritize user-facing GUI applications even in the face of extreme CPU starvation, while Linux doesn’t seem to do a similar thing (or doesn’t do it as well).
Is this an inherent problem of Linux at the moment or can I do something to improve this? I’m on Kubuntu 24.04 if it matters. Also, I don’t believe it is a memory or I/O problem as my memory is sitting at around 60% usage when it happens with 0% swap usage, while my CPU sits at basically 100% on all cores. I’ve also tried disabling swap and it doesn’t seem to make a difference.
EDIT: Tried nice -n +19, still lags my other programs.
EDIT 2: Tried installing the Liquorix kernel, which is supposedly better for this kinda thing. I dunno if it’s placebo but stuff feels a bit snappier now? My mouse feels more responsive. Again, dunno if it’s placebo. But anyways, I tried compiling again and it still lags my other stuff.
“The kernel runs out of time to solve the NP-complete scheduling problem in time.”
More responsiveness requires more context-switching, which then subtracts from the available total CPU bandwidth. There is a point where the task scheduler and CPUs get so overloaded that a non-RT kernel can no longer guarantee timed events.
So, web browsing is basically poison for the task scheduler under high load. Unless you reserve some CPU bandwidth (with cgroups, etc.) beforehand for the foreground task.
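A rough sketch of the cgroups idea, assuming a systemd-based distro where the user manager has the cpu controller delegated (that part is an assumption about the setup, not something confirmed in this thread). It only builds a systemd-run command line that puts the build into its own transient scope with a low CPUWeight; the helper name low_priority_build_argv and the value 20 are made up for illustration. CPUWeight is a relative weight (default 100), not a hard reservation, so it shifts CPU share under contention but does not by itself guarantee wake-up latency.

```python
import shlex


def low_priority_build_argv(build_cmd, cpu_weight=20):
    """Wrap build_cmd so it would run in a transient scope with a low CPU weight.

    cpu_weight maps to the systemd CPUWeight= property (valid range 1-10000,
    default 100). The default of 20 here is an arbitrary illustrative value.
    """
    if not 1 <= cpu_weight <= 10000:
        raise ValueError("CPUWeight must be between 1 and 10000")
    return [
        "systemd-run", "--user", "--scope",
        "-p", f"CPUWeight={cpu_weight}",
    ] + list(build_cmd)


if __name__ == "__main__":
    # Only prints the command; run it yourself if it looks right for your setup.
    print(shlex.join(low_priority_build_argv(["cargo", "build", "--release"])))
```

Whether this helps in practice depends on the desktop processes living in a different cgroup from the build, which is worth checking with systemd-cgls before relying on it.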
Since SMT threads also aren’t real cores (roughly 0.4 to 0.7 of an actual core), putting 16 tasks on a 16-thread/8-core machine is only going to slow down the execution of all other tasks on the shared cores. I usually leave one CPU thread for “housekeeping” if I need to do something else. If I don’t, some random task is going to be very pleased by not having to share a core. That “spare” CPU thread will be running literally everything else, so it may get saturated by the kernel tasks alone.
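The "leave one thread for housekeeping" habit can be sketched like this. The helper name build_jobs is hypothetical; it just subtracts a reserved count from whatever os.cpu_count() reports (logical CPUs, so SMT threads are included) and never goes below one job.

```python
import os


def build_jobs(reserved_threads=1, total_threads=None):
    """Return a job count that leaves reserved_threads logical CPUs unused.

    total_threads defaults to os.cpu_count(), which counts logical CPUs
    (SMT siblings included). The result is never less than 1.
    """
    if total_threads is None:
        total_threads = os.cpu_count() or 1
    return max(1, total_threads - reserved_threads)


if __name__ == "__main__":
    jobs = build_jobs()
    # Both make and cargo accept a -j job count.
    print(f"make -j{jobs}")
    print(f"cargo build -j {jobs}")
```

On a 16-thread machine this gives 15, i.e. the make -j15 case rather than make -j16.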
nice +5 is more of a suggestion to “please run this task with a worse latency on a contended CPU.” (I think I should benchmark make -j15 vs. make -j16 to see what the difference is.)
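To put rough numbers on why nice is only a "suggestion": the fair scheduler turns nice values into load weights, with 1024 at nice 0 and each nice step changing the weight by a factor of about 1.25. The sketch below uses that approximation rather than the kernel's exact lookup table, and the function names are made up for illustration.

```python
def approx_weight(nice):
    """Approximate scheduler load weight for a nice value.

    Uses 1024 / 1.25**nice, which is close to (but not identical to)
    the kernel's fixed weight table.
    """
    if not -20 <= nice <= 19:
        raise ValueError("nice must be between -20 and 19")
    return 1024 / (1.25 ** nice)


def cpu_share(task_nice, other_nices):
    """Long-run CPU share of one task competing with others on a single CPU."""
    own = approx_weight(task_nice)
    return own / (own + sum(approx_weight(n) for n in other_nices))


if __name__ == "__main__":
    # One UI task at nice 0 sharing a CPU with one compile job.
    print(f"vs nice +5:  {cpu_share(0, [5]):.2f}")
    print(f"vs nice +19: {cpu_share(0, [19]):.2f}")
```

So the UI task is entitled to most of the CPU time on paper. The weights only describe the long-run split though, not how quickly a task that just woke up actually gets onto a CPU, which is the part that matters for lag.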
That’s all fine, but as I said, Windows seems to handle this situation without a hitch. Why can Windows do it when Linux can’t?
Also, it sounds like you suggest there is a tradeoff between bandwidth and responsiveness. That sounds reasonable. But shouldn’t Linux then allow me to easily decide where I want that tradeoff to lie? Currently I only have workarounds. Why isn’t there some setting somewhere to say “Yes, please prioritise responsiveness even if it reduces bandwidth a little bit”? And that probably ought to be the default setting. I don’t think a responsive UI should be questioned - that should just be a given.
Windows lies to you. The only way they don’t get this problem is by reserving some CPU bandwidth for the UI beforehand, which explains the 1-2% worse y-cruncher results on Windows.
If that’s the solution to the problem, it’s a good solution. Linux ought to do the same thing, cause none of the suggestions in this thread have worked for me.
nohz_full confusingly also helps with power usage… if the CPU doesn’t have anything to run, no point waking it up with a scheduler-tick IPI… but also no point trying to run the scheduler if a core is peaking with a single task… With nohz the kernel overhead basically ceases to exist for a task while it is running. (Though the overhead just moves to non-nohz CPU cores.)
You’re right of course. I think the issue is that Linux doesn’t care about the UI. As far as it is concerned GUI is just another program. That’s the same reason you don’t have things like ctrl-alt-del on Linux.
To be fair, there should be some heuristics to boost the priority of anything that has received input from the hardware (e.g. a button click). The no-care-latency jobs can be delayed indefinitely.
I agree that UI should always take priority. I shouldn’t have to do anything to guarantee this.
I have a HZ_1000, tickless kernel with nohz_full set up. This all has a throughput/bandwidth cost (about 2%) in exchange for better responsiveness by default.
But this is not enough, because the short burst UI tasks need near-zero wake-up latency… By the time the task scheduler has done its re-balancing the UI task is already sleeping/halted again, and this cycle repeats. So the nice/priorities don’t work very well for UI tasks. Only way a UI task can run immediately is if it can preempt something or if the system has a somewhat idle CPU to put it on.
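For anyone wanting to check whether nohz_full is actually set on their own machine, here is a small sketch that pulls the CPU list out of a kernel command line string. The function names are made up, and the parser only handles the simple forms like 1-7 or 1,3-5 (the kernel's CPU-list syntax has more variants that this ignores).

```python
def parse_cpu_list(spec):
    """Parse a simple CPU list such as '1-3,7' into a set of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def nohz_full_cpus(cmdline):
    """Return the CPUs named by nohz_full= in a kernel command line, or an empty set."""
    for token in cmdline.split():
        if token.startswith("nohz_full="):
            return parse_cpu_list(token.split("=", 1)[1])
    return set()


if __name__ == "__main__":
    # Example string; on a real system pass the contents of /proc/cmdline instead.
    example = "root=/dev/sda1 nohz_full=1-3,7 quiet"
    print(sorted(nohz_full_cpus(example)))
```

An empty result just means the parameter isn't on the command line; it says nothing about whether the kernel was built with the tickless options in the first place.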
The kernel doesn’t know any better which tasks are like this. The on-going EEVDF and sched_ext scheduler projects attempt to improve the situation. (EEVDF should allow specifying the desired latency, while sched_ext will likely allow tuning the latency automatically.)