You want to optimize for specific chips because different chips have different capabilities that are not captured by just what extensions they support.
A simple example is macro-operation fusion: the CPU may execute two specific instructions faster when they are adjacent than when other instructions separate them (https://en.wikichip.org/wiki/macro-operation_fusion). So the optimizer can try to place those instructions next to each other. LLVM has target features for this, like "lui-addi-fusion" for CPUs that will fuse a `lui; addi` sequence into a single immediate load.
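To see why that pair is worth fusing, here's a minimal sketch (in Python, with a made-up function name) of the arithmetic a `lui; addi` pair implements: `lui` supplies the upper 20 bits of a 32-bit constant and `addi` adds the low 12 bits, with a twist because `addi` sign-extends its immediate:

```python
def materialize(imm32):
    """Split a 32-bit constant into RISC-V `lui; addi` immediates.

    `addi` sign-extends its 12-bit immediate, so when bit 11 of the
    constant is set we bump the `lui` part up by one page to compensate.
    """
    lo = imm32 & 0xFFF
    if lo >= 0x800:                    # addi immediate goes negative after sign-extension
        lo -= 0x1000
    hi = (imm32 - lo) & 0xFFFFFFFF     # upper 20 bits, already adjusted
    assert hi & 0xFFF == 0
    return hi, lo                      # emit: lui rd, hi >> 12 ; addi rd, rd, lo

# A fusing CPU treats the adjacent pair as a single immediate-load micro-op.
hi, lo = materialize(0x12345678)
print(hex((hi + lo) & 0xFFFFFFFF))    # → 0x12345678
```

If a scheduler splits the pair apart, the CPU has to execute them as two dependent ops instead of one, which is exactly what the "lui-addi-fusion" feature tells the compiler to avoid.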
A more complex example is keeping track of the CPU's internal state. The optimizer models the state of the CPU's functional units (integer, address generation, etc.) so that it has an idea of which units will be in use at what time. If the optimizer has to schedule multiple instructions that will use some combination of those units, it can try to lay them out in an order that minimizes stalling on busy units while other units sit idle.
That model also gives the optimizer the latency of each instruction, so when it has a choice between multiple ways to compute the same result it can pick the one that is cheaper on this particular CPU.
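For instance (with hypothetical latency numbers, and core names invented here), whether `x * 5` is better as a multiply or as a shift-plus-add depends entirely on the target core's multiplier:

```python
# Hypothetical per-CPU latency tables; real numbers vary widely by core.
LATENCY = {
    "fast-mul-core": {"mul": 2, "shift": 1, "add": 1},
    "slow-mul-core": {"mul": 6, "shift": 1, "add": 1},
}

def best_mul5(cpu):
    """Pick between `x * 5` as a multiply or as `(x << 2) + x`."""
    lat = LATENCY[cpu]
    mul_cost = lat["mul"]
    shift_add_cost = lat["shift"] + lat["add"]
    return "mul" if mul_cost <= shift_add_cost else "shift+add"

print(best_mul5("fast-mul-core"))   # → mul
print(best_mul5("slow-mul-core"))   # → shift+add
```

Both sequences are valid on any conforming CPU; only the cost model changes which one the compiler emits.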
Wonder if we could generalize this so you could just give the optimizer a file containing all this info, without needing to explicitly add support for each CPU.
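Worth noting that LLVM already keeps most of this data declaratively in per-target TableGen scheduling models, though those are compiled into the compiler rather than loaded at run time. A loadable format along the lines you describe might look something like this (every name and field here is invented for illustration):

```python
import json

# Hypothetical machine-readable CPU description, covering the three kinds
# of information discussed above: fusion pairs, functional units, latencies.
cpu_desc = json.loads("""
{
  "name": "example-rv64",
  "fusions": [["lui", "addi"]],
  "units":   {"ALU": 2, "MUL": 1},
  "latency": {"add": 1, "mul": 3, "div": 20}
}
""")

print(cpu_desc["latency"]["mul"])   # → 3
```

The hard part isn't the file format, it's that fusion and scheduling behavior interact with microarchitectural details (decode width, port assignments, forwarding paths) that are awkward to capture in a flat table.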
Do compilers optimize for specific RISC-V CPUs, not just profiles/extensions? Same for drivers and kernel support.
My understanding was that if it's RISC-V compliant, no extra work is needed for existing software to run on it.