Not my area of expertise, but what exactly is the difference between RISC-V and PowerPC? Didn't PowerPC get a good run in the 90s and 2000s? Just wondering why there's renewed interest in RISC-like architectures when industry already had a good exploration of that area.
The interest is BECAUSE it's well explored territory. The concept is proven and works fine.
On the low end where RISC-V currently lives, simplicity is a virtue.
On the high end, RISC isn't inherently bad; it just couldn't keep up with the massive R&D investment on the x86 side. It can go fast if you sink some money into it, as Apple, Qualcomm, etc. have done with ARM.
> Do you think Apple spends more money than Intel on chip design?
Absolutely. Apple's R&D budget for 2025 was $34 billion to Intel's ~$18 billion. And while the majority of Intel's R&D budget goes to architecture, Apple's process R&D is all on TSMC's side: Apple pays TSMC another ~$20 billion a year, of which something like $8 billion is probably TSMC R&D that goes into Apple's chips.
Sure, not all of Apple's $34B is CPU R&D, but on a like-for-like basis, Apple probably has at least 50% more chip design budget (and they only make ~10-20 different chips a year, compared to Intel's ~100-200).
Correct, ARM does not dominate x86 in desktops and servers. Just everywhere else.
Apple is top 5 in laptop and desktop market share, so I'm pretty sure Apple's RISC-based silicon has a presence in those markets. Very recently, Qualcomm has entered as well. And of course many Chromebooks are ARM.
ARM has only recently entered the server market. Already it is having some success, especially with hyperscalers.
RISC-V is about to enter all those markets. I mean, RISC-V silicon is already in use in the cloud, but it is still an experiment at this stage. And you can buy a RISC-V laptop, but those are only for devs.
x86_64 machines are RISC under the hood and have been for ages, I believe; microcode translates your x64 instructions into RISC instructions that run on the real CPU, or something akin to that. RISC never died; CISC did, but it's still presented as the front-facing ISA for compatibility.
That's a common factoid that gets bandied about, but it's not really accurate, or at least it's overstated.
To start, modern x86 chips are more hard-wired than you might think; certain very complex operations are microcoded, but the bulk of common instructions aren't (they decode to single micro-ops), including ones that are quite CISC-y.
Micro-ops also aren't really "RISC" instructions that look anything like typical RISC ISAs. The exact structure of the microcode is secret, but as an example, the Pentium Pro used 118-bit micro-ops when most contemporary RISCs were fixed at 32 bits. Most microcoded CPUs, in any case, have microcode that is in some sense simpler than the user-facing ISA but also far lower-level and more tied to the microarchitecture.
But I think most importantly, this idea itself - that a microcoded CISC chip isn't truly CISC, but just RISC in disguise - is kind of confused, or even backwards. We've had microcoded CPUs since the 50s; the idea predates RISC. All the classic CISC examples (8086, 68000, VAX-11) are microcoded. The key idea behind RISC, arguably, was to get rid of the friendly user-facing ISA layer and just expose the microarchitecture, since you didn't need to be friendly if the compiler could deal with ugliness. This then turned out to be a bad idea (e.g. branch delay slots) that was backtracked on, and you could argue instead that RISC chips have thus actually become more CISC-y! A chip with a CISC ISA and a simpler microcode underneath isn't secretly a RISC chip... it's just a CISC chip. The definition of a CISC chip is to have a CISC layer on top, regardless of the implementation underneath; the definition of a RISC chip is to not have a CISC layer on top.
I think you are conflating microcode with micro-ops. The distinction is very important to the fundamental workings of the CPU. Microcode is an alternative to a completely hard-wired instruction decoder: it allows tweaking the behavior of the rest of the CPU for a given instruction without re-making the chip. Micro-ops are a way to break complex instructions into multiple independently executing instructions, and in the case of x86 I think comparing them to RISC is completely apt.
The way I understand it, back when the RISC vs. CISC battle started, CPUs were being pipelined for performance, but the complexity of the CISC instructions most CPUs had at the time directly limited how fast that pipeline could be made. The RISC innovation was changing the ISA: breaking complex instructions with sources and destinations in memory into sequences of simpler loads and stores, and adding a lot more registers to hold the temporary values for computation. RISC allowed shorter pipelines (lower cost of branches and other pipeline flushes) that could also run at higher frequencies because of the relative simplicity.
What Intel did went much further than just microcode. They broke up the loads and stores into micro-ops, using hidden registers to store the intermediates (see the sketch below). This allowed them to profit from the innovations RISC represented without changing the user-facing ISA. That internal load/store architecture is what people typically mean by the RISC hiding inside x86 (although I will admit most of them don't understand the nuance). Of course, Intel also added out-of-order execution to the mix, so the CPU is no longer a fixed-length pipeline but more like a series of queues waiting for their inputs to be ready.
These days, high-performance RISC architectures contain all the same microarchitectural elements as x86 CPUs (including micro-ops and extra registers), and the primary difference is the instruction decoding. I believe AMD even designed (but never released) an ARM CPU [1] that put a RISC instruction decoder in front of what I believe was the Zen 1 backend.
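To make the micro-op idea concrete, here's a toy sketch in C of cracking a memory-destination add into load/add/store micro-ops with a hidden temp register. The struct, field names, and register numbering are all invented for illustration; real x86 micro-op formats are undocumented.

```c
#include <stdio.h>

/* A toy micro-op format, purely illustrative: real formats are secret
 * and far wider (the Pentium Pro reportedly used 118-bit micro-ops). */
typedef enum { UOP_LOAD, UOP_ADD, UOP_STORE } UopKind;

typedef struct {
    UopKind kind;
    int dst;   /* destination register (may be a hidden temp) */
    int src1;  /* first source register / address base */
    int src2;  /* second source register, if any */
} Uop;

/* Crack "add [rbase], rsrc" into three RISC-style micro-ops, using a
 * hidden physical register (arbitrarily numbered 64) as the temp. */
int crack_add_mem(int rbase, int rsrc, Uop out[3]) {
    const int tmp = 64;
    out[0] = (Uop){ UOP_LOAD,  tmp,   rbase, 0    }; /* tmp <- [rbase]    */
    out[1] = (Uop){ UOP_ADD,   tmp,   tmp,   rsrc }; /* tmp <- tmp + rsrc */
    out[2] = (Uop){ UOP_STORE, rbase, tmp,   0    }; /* [rbase] <- tmp    */
    return 3;
}

int main(void) {
    Uop uops[3];
    int n = crack_add_mem(1, 2, uops);
    for (int i = 0; i < n; i++)
        printf("uop %d: kind=%d dst=%d src1=%d src2=%d\n",
               i, uops[i].kind, uops[i].dst, uops[i].src1, uops[i].src2);
    return 0;
}
```

Once cracked like this, each micro-op can queue independently at the load/store units and ALUs, which is what the out-of-order machinery actually schedules.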
That's an excellent rebuttal to this common factoid.
Recently I encountered a view that has me thinking. They characterized the PIO "ISA" in the Raspberry Pi RP2040 MCU as CISC. I wonder what you think of that.
The instructions are indeed complex, having side effects, implied branches and other features that appear to defy the intent of RISC. And yet they're all single cycle, uniform in size and few in number, likely avoiding any microcode, and certainly any pipelining and other complex evaluation.
If it is CISC, then I believe it is a small triumph of CISC. It's also possible that even characterizing it as an ISA at all is folly, in which case the point is moot.
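For concreteness, here's a minimal C sketch of pulling apart one 16-bit PIO instruction, following the encoding in the RP2040 datasheet: a 3-bit opcode in the top bits, plus a 5-bit delay/side-set field packed into every instruction, which is part of what makes such a tiny encoding feel CISC-y. The example word is "set pindirs, 1".

```c
#include <stdint.h>
#include <stdio.h>

/* The eight PIO opcodes (PUSH and PULL share an encoding). */
static const char *opnames[8] = {
    "JMP", "WAIT", "IN", "OUT", "PUSH/PULL", "MOV", "IRQ", "SET"
};

int main(void) {
    uint16_t insn  = 0xe081;              /* "set pindirs, 1"    */
    unsigned op    = (insn >> 13) & 0x7;  /* 3-bit opcode        */
    unsigned delay = (insn >> 8)  & 0x1f; /* delay/side-set bits */
    unsigned args  = insn & 0xff;         /* operands vary by op */
    printf("%s delay/side-set=%u args=0x%02x\n", opnames[op], delay, args);
    return 0;
}
```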
I think this is something of a misunderstanding. There isn't a literal RISC processor inside the x86 processor with a tiny little compiler sitting in the middle. It's more that the out-of-order execution model breaks up instructions into μops so that the μops can separately queue at the core's dozens of ALUs, multiple load/store units, virtual-to-physical address translation units, etc. The units all work together in parallel to chug through the incoming instructions. High-performance RISC-V processors do exactly the same thing, despite already being "RISC".
Ah, PowerPC. For a RISC processor it surely had a lot of instructions, most of them quite peculiar. But hey, it had fixed-length instruction encoding and couldn't address memory in instructions other than "explicit memory load/store", so it was RISC, right?
Silly opinion that has no relevance to building competitive CPUs, but I like that RISC-V is modular and you can pick and choose which extensions to adopt.
Makes writing a simulator so easy (you just have to focus on RV32I to get started), and also makes RISC-V a great bytecode alternative for a homegrown register-based virtual machine: chances are RV32I covers all the operations you will need on any Turing-complete VM. No need to reinvent the wheel. In a weekend I implemented all of RV32IM, passing all the official tests, and now I can target my VM with any major compiler (GCC, Rust) with no effort.
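For anyone curious what that looks like, here's a minimal C sketch of the decode/execute core of such a VM, handling just one RV32I instruction (ADDI); the field positions follow the RV32I base encoding:

```c
#include <stdint.h>
#include <stdio.h>

static uint32_t regs[32];  /* x0..x31; x0 is hardwired to zero */

static void execute(uint32_t insn) {
    uint32_t opcode = insn & 0x7f;
    uint32_t rd     = (insn >> 7)  & 0x1f;
    uint32_t funct3 = (insn >> 12) & 0x07;
    uint32_t rs1    = (insn >> 15) & 0x1f;
    int32_t  imm    = (int32_t)insn >> 20;  /* sign-extended I-type immediate */

    if (opcode == 0x13 && funct3 == 0x0) {  /* ADDI rd, rs1, imm */
        if (rd != 0)                        /* writes to x0 are discarded */
            regs[rd] = regs[rs1] + (uint32_t)imm;
    }
    /* ...the rest of RV32I dispatches the same way... */
}

int main(void) {
    execute(0x02a00093);  /* addi x1, x0, 42 */
    printf("x1 = %u\n", regs[1]);  /* prints 42 */
    return 0;
}
```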
If there is any architecture that scales linearly from the most minimal of low-energy cores to advanced desktop hardware, it is RISC-V.
Disclaimer: I don't know much about ARM, but 1) it isn't as open, and 2) it's been around long enough to have accumulated as much historical cruft as x86.
When ARM moved to 64-bit, the ISA was reworked much more substantially than in AMD's x86-64 transition (which mainly added modes and repurposed the one-byte INC/DEC opcodes to provide the REX prefix, which supplies a 64-bit operand-size specification and one additional bit for each register specifier; obviously the page table format also changed). I am not particularly familiar with AArch64, but I got the impression that the main retained cruft from 32-bit ARM is condition codes, and the tradeoffs of providing condition codes would lead some not to consider them cruft. The use of four bits in almost every instruction to support predication, a major cruft point for 32-bit ARM, was eliminated, and the shift-then-ALU-operation orientation of the original ARM (which had timing slack from the slowness of instruction fetch) was de-emphasized.
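For the curious, the REX byte itself is simple enough to show in a few lines of C: it occupies the old one-byte INC/DEC opcode space (0x40-0x4F), with bit layout 0100WRXB per the Intel/AMD manuals:

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t byte = 0x48;             /* REX.W, as in "mov rax, rbx" */
    if ((byte & 0xf0) == 0x40) {     /* REX prefixes are 0x40-0x4F  */
        int w = (byte >> 3) & 1;     /* 64-bit operand size         */
        int r = (byte >> 2) & 1;     /* extends the ModRM.reg field */
        int x = (byte >> 1) & 1;     /* extends the SIB.index field */
        int b = byte & 1;            /* extends ModRM.rm / SIB.base */
        printf("REX W=%d R=%d X=%d B=%d\n", w, r, x, b);
    }
    return 0;
}
```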
AArch64 is accumulating cruft, perhaps particularly with respect to SIMD, but it is less crufty than x86-64.
ISA modularity/diversity can be useful for embedded systems, where the software is really firmware. If one is going to have to serve a diversity of compilation targets, via either a common distribution format that is compiled to the local machine code or an app store that receives a software format that can be compiled for the diverse targets, the best distribution format (for users or the app store) is likely to be significantly different from an encoding best suited for direct execution.
Some optional features can be hidden by system libraries (particularly when the main use of the feature is suitable for a separate accelerator). E.g., an instruction that performs a round of AES encryption could be hidden behind an encryption library. However, some uses of an AES instruction involve a very short "message" for which library overhead would be excessive or for which good enough software alternatives would be faster than actual AES.
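As a concrete illustration of the kind of instruction being hidden, here's a sketch using the x86 AES-NI intrinsic for one encryption round (the state and round key here are dummy values, just to exercise the instruction; build with something like gcc -maes):

```c
#include <wmmintrin.h>  /* AES-NI intrinsics */
#include <stdio.h>

int main(void) {
    /* Dummy 128-bit state and round key. */
    __m128i state    = _mm_set_epi32(0x03020100, 0x07060504,
                                     0x0b0a0908, 0x0f0e0d0c);
    __m128i roundkey = _mm_setzero_si128();

    /* AESENC performs one full AES round: ShiftRows, SubBytes,
     * MixColumns, then XOR with the round key. */
    __m128i out = _mm_aesenc_si128(state, roundkey);

    unsigned char bytes[16];
    _mm_storeu_si128((__m128i *)bytes, out);
    for (int i = 0; i < 16; i++) printf("%02x", bytes[i]);
    printf("\n");
    return 0;
}
```

A library call wrapping this is fine for bulk data, but for a 16-byte "message" the call overhead can dwarf the instruction itself, which is the point above.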
Indexed memory accesses and conditional select/move, for example, are not really suitable to system libraries (or trapping to software even with a very fast trap handler).
ISA scaling is not necessarily a good design feature. An ISA optimized for the market targeted by ARM M Profile is unlikely to be optimal for future 16-wide decode high performance processors. E.g., if a context only has 16 registers, using 5-bit register specifiers is suboptimal even though it allows software to be "upward compatible" with a 32-register design.
That's not really how it works. There are only a few companies on the planet licensed to create their own cores that can run ARM instructions. This is an artificial constraint, though, and at present China is (as far as I know) cut off from those licenses. Everyone else that makes ARM chips takes the core design directly from ARM and integrates it with other pieces of IP, like IO controllers, power management, the GPU, and accelerators such as NPUs, to make a system on a chip. But with RISC-V, lots of Chinese companies have been making their own core designs, and that leads to a flexibility in design that is not generally available (and certainly not cost-effective) with ARM.
Maybe. People are free to partake in whatever cognitive misadventures they wish. I merely cite the incontrovertible fact that Berkeley RISC predates essentially all of the modern economic history of China, and also the rise of ARM. It came from academia in the US, for better or worse, whether it's crap or the finest ISA ever, and for whatever purpose these US academics had or have. That is all anyone can truthfully say about its pedigree. The rest is just bullshit from the internet.