That's what comes with actually using the instruction set of modern CPUs instead of the lowest-common-denominator instruction set.
However, note that some of the modern instructions like AVX-512 use a lot of electrical power on the CPU, for example pushing a 12th-gen Intel chip above 300W.
I wouldn't chalk it all up to AVX-512 in this case. As you can see in https://github.com/clearlinux-pkgs/php and other clearlinux repositories, the differences from other distros are:
1) a number of backported optimization patches
2) a few patches to the Linux kernel and glibc that were not accepted upstream
3) and maybe the most important part: PGO (profile-guided optimization) for Python, PHP and other performance-critical components. It is a well-known fact that PGO helps interpreted languages. And it is benchmark-oriented: for example, even though PGO helps bash, they don't enable it (maybe until somebody invents bash-bench).
ClearLinux pulls phpbench-0.8.2 (and no other applications) for PGO[1] and then Phoronix benchmarks with phpbench-0.8.1. In my opinion, that's a valid reason to disqualify the result. It does not mean that PGO makes PHP faster or slower; it means that there are no measurements to base a conclusion on. I only have a theory: PGO generally helps to reduce startup time, so when an app starts thousands of times per second, it helps even if the PGO data is irrelevant. But modern PHP does not behave this way, and real-world PHP apps are usually CPU- or I/O-bound in the database, so one may see no difference...
Some time ago I tried to apply PGO to Blender, which only hurt performance, but that's another story (I read it as: if an application already has its kernels logically separated per CPU, an attempt to merge and re-separate them in PGO may hurt performance).
I am not sure offhand how that CPU level compares to Intel Clear Linux's options. In the Phoronix comparison of Ubuntu with and without v3, the differences were mostly small, single-digit percentages though, on an Intel CPU.
Notes: Canonical (Ubuntu) is my employer, I am not working on this project.
Not really, because AVX-512 isn't supported by those platforms; supposedly that's because processes in general get very unhappy if they are randomly scheduled onto E-cores that don't handle those instructions.
> Clear Linux makes use of compiler function multi versioning, performance-minded defaults, aggressive compiler CFLAGS/CXXFLAGS defaults, optional AVX-512 usage for more libraries, and many other patches and optimizations in the name of delivering the greatest x86_64 Linux performance.
Compiling a few dozen important libraries with a higher x86_64 instruction version unlocking AVX2 or even AVX-512 by default can make a large difference. Normally these optimisations aren't used by distro packages because the libraries aren't written in such a way that one binary runs as fast as possible on new CPUs and still runs at all on older CPUs. Instead processes just crash with an illegal instruction exception (aka SIGILL). The tooling exists to allow such libraries and executables to be created and have the runtime linker pick the right implementation at link time, but support for it is spotty.
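To make the "one binary, fast on new CPUs, still runs on old ones" idea concrete, here is a minimal sketch of doing the dispatch by hand. It assumes GCC or Clang on x86-64; the names `sum_baseline`, `sum_avx2` and `pick_sum` are made up for illustration and don't come from any real library.

```c
#include <stddef.h>

typedef long (*sum_fn)(const int *, size_t);

/* Baseline build: only uses instructions every x86-64 CPU has. */
static long sum_baseline(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

#if defined(__GNUC__) && defined(__x86_64__)
/* Same source, but the compiler is allowed to emit AVX2 here.
   Calling this on a CPU without AVX2 may die with SIGILL. */
__attribute__((target("avx2")))
static long sum_avx2(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}
#endif

/* Hypothetical dispatcher: check the CPU once, hand back a safe pointer. */
sum_fn pick_sum(void)
{
#if defined(__GNUC__) && defined(__x86_64__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return sum_avx2;
#endif
    return sum_baseline;
}
```

A caller would do something like `sum_fn f = pick_sum(); f(data, n);`. Both versions return the same result; only the instructions the compiler is allowed to use differ.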
> The tooling exists to allow such libraries and executables to be created and have the runtime linker pick the right implementation at link time, but support for it is spotty.
The target_clones attribute works quite well with gcc in my experience, but it has to be applied by hand to the functions where it is worthwhile, and it probably interferes with inlining and short jumps.
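For anyone who hasn't seen it, a rough sketch of what that looks like. It assumes GCC on x86-64 Linux with glibc (the clones are selected through an IFUNC resolver); `sum_i32` and the `MULTIVERSION` macro are names invented for this example, and the list of targets is just one plausible choice.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC + glibc on x86-64. Elsewhere the macro expands to
   nothing and only a single baseline version is built. */
#if defined(__GNUC__) && !defined(__clang__) && \
    defined(__x86_64__) && defined(__linux__) && defined(__GLIBC__)
#define MULTIVERSION \
    __attribute__((target_clones("default", "avx2", "avx512f")))
#else
#define MULTIVERSION
#endif

/* The compiler emits one copy of this function per listed target plus
   a resolver that picks a copy the running CPU supports. */
MULTIVERSION
long sum_i32(const int *data, size_t n)
{
    assert(data != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}
```

Callers just call `sum_i32(data, n)` as usual. Since the call goes through the resolver rather than to a fixed body, it generally can't be inlined into the caller, which is the cost mentioned above.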
In addition to what other commenters said, Clear Linux goes all the way but you can get somewhat there with CachyOS. It's Arch with compiler optimizations and targets CPUs with AVX2 and newer.
That's a staggering improvement.
https://www.phoronix.com/review/linux-os-amd-ryzen9-9950x/6