On microcode

There has been one case too many of "I don't trust microcode, so I don't want microcode blobs in coreboot", so I felt the need for an answer. And since I don't like stuff ending up in silos, here's a copy.

Microcode vs. microcode updates

Let's get this out of the way first: the blobs that ship with coreboot, Linux, Windows, macOS etc. aren't microcode but microcode updates. The CPU already comes with microcode built in, so if you don't want microcode at all, choose a different vendor (good luck).

The updates provide a way for the vendor to run a newer version of parts of the microcode on the CPU. Note that these updates aren't persistent: they need to be installed again after every power-on, or you're back to the original state.
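Because updates are lost at power-off, it can be useful to check which revision is currently loaded. The following is a minimal sketch, assuming an x86 Linux system where /proc/cpuinfo reports a "microcode" field per logical CPU; the function name microcode_revisions is made up for this example, and other platforms expose the revision differently or not at all.

```python
import re


def microcode_revisions(cpuinfo_text):
    """Collect the distinct microcode revisions found in /proc/cpuinfo-style text.

    Assumes lines of the form "microcode : 0x..." as printed by x86 Linux.
    """
    pattern = re.compile(r"^microcode[ \t]*:[ \t]*0x([0-9a-fA-F]+)[ \t]*$", re.MULTILINE)
    return {int(match.group(1), 16) for match in pattern.finditer(cpuinfo_text)}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as cpuinfo:
            revisions = microcode_revisions(cpuinfo.read())
    except OSError:
        # Not on Linux, or the file is not readable: report nothing.
        revisions = set()
    print([hex(revision) for revision in sorted(revisions)])
```

An empty list means the field was not found, not that the CPU runs without microcode.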

The rest of this article discusses what microcode is, why there are updates, and what you miss out on if you don't install them.

What is microcode

CPUs have internal components for all kinds of operations that they support. But some parts of the architecture (x86, but also most others) are too complex to implement directly as a distinct component.

For these complex parts, microcode is used (mainframes have done this since the 1960s at the latest, a _real_ long time ago! Home computers were a bit late, but they always are): small program snippets that implement a single instruction of the architecture as a whole bunch of instructions for the components that do exist.

A bit like "if the instruction says to raise a to the power of b, multiply b copies of a together". (CPUs generally don't have an exponentiation instruction, but it would be a great example of a microcoded one: a complex operation that can be solved by repeating a simpler operation several times.)

Now, Intel (and the others) build both the internal components and everything that brings them together: some instructions are implemented as real components, others are microcoded.
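To make the exponentiation analogy concrete, here is a toy sketch in Python. It is not how any real CPU works: hw_multiply stands in for a hardware component, and microcoded_pow for the microcode routine behind a hypothetical POW instruction. Both names are invented for this illustration.

```python
def hw_multiply(x, y):
    """Stand-in for an operation that exists as a real hardware component."""
    return x * y


def microcoded_pow(a, b):
    """Hypothetical POW instruction, expanded into repeated multiplications.

    Only handles non-negative integer exponents, which is all the analogy needs.
    """
    result = 1  # the empty product, so b == 0 yields 1
    for _ in range(b):
        result = hw_multiply(result, a)
    return result


print(microcoded_pow(2, 10))  # 1024
print(microcoded_pow(7, 0))   # 1
```

From the programmer's point of view there is just one instruction; the loop over simpler operations is invisible.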

Product cycles and some history

As with all products, there is no perfection: the product is developed, there's a quality bar that needs to be met, a deadline by which the product should be ready, and a budget that determines how many people can work on it. As soon as the chip is "good enough", it can be produced.

Now, Intel had some really bad accidents with that strategy, the most famous and expensive being the Pentium fdiv bug: certain division operations produced wrong results for certain values. Depending on who you ask, that one could have taken down Intel for good if they had had to ship a replacement CPU to every customer (they didn't).

With that experience, these days they leave some room in the CPU that lets them apply band-aids to certain parts of its operation. The details aren't well-known, but here are two educated guesses. First, they can update the microcode programs for existing instructions: for example, if the exponentiation example above forgot to account for a^0 == 1, they could provide an update that takes care of that. Second, they might put multiple variants of a basic component in a chip and allow the update to disable the new (and experimental) version if it is less reliable than intended, falling back to an older (and probably slower) version instead.
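The first guess can be sketched as a toy model. Everything here is hypothetical (ToyCPU, load_update, power_cycle and the POW instruction are invented for illustration) and says nothing about how Intel's actual mechanism works. It shows a built-in routine with the a^0 bug, an update that overrides it, and the override disappearing after a power cycle.

```python
class ToyCPU:
    """Toy model of patchable microcode; not a description of real hardware."""

    def __init__(self):
        # Built-in microcode, fixed at manufacturing time.
        self._rom = {"POW": self._rom_pow}
        # Volatile patch storage: empty after every power-on.
        self._patches = {}

    @staticmethod
    def _rom_pow(a, b):
        # Deliberately buggy: starts from a, so b == 0 wrongly returns a.
        result = a
        for _ in range(b - 1):
            result *= a
        return result

    def load_update(self, patches):
        """Install replacement routines; they take precedence over the ROM."""
        self._patches = dict(patches)

    def power_cycle(self):
        """Losing power wipes the patches; only the ROM survives."""
        self._patches = {}

    def execute(self, instruction, *operands):
        routine = self._patches.get(instruction, self._rom[instruction])
        return routine(*operands)


def fixed_pow(a, b):
    """Replacement routine shipped in the hypothetical update."""
    result = 1
    for _ in range(b):
        result *= a
    return result


cpu = ToyCPU()
print(cpu.execute("POW", 5, 0))  # 5: the shipped routine gets a^0 wrong
cpu.load_update({"POW": fixed_pow})
print(cpu.execute("POW", 5, 0))  # 1: the update fixes it
cpu.power_cycle()
print(cpu.execute("POW", 5, 0))  # 5: the fix is gone until reinstalled
```

This is also why firmware or the operating system has to install the update again on every boot.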

The impact

When you don't install updates, you mostly forgo whatever development happened on the CPU after a cut-off date that its product manager chose more or less arbitrarily to get the product out the door.

Since that's when the majority of developers move on to the next project, these updates will likely fix significant issues: significant enough that somebody was assigned to look into an old product instead of working on the Next Big Thing. Something with an impact like "we run out of money if customers can sue us over this", something like failures in the CPU's security architecture. (I don't expect there to be math problems anymore.)

For all that we outside Intel know (and we don't, really), any unpatched CPU for which there are updates has grave security issues that are trivially exploited. For all we know, there might be bad actors who know the issues in detail. Intel provides updates and even ensures that "typical" platforms (e.g. Windows, macOS, popular Linux distros) install them, so that they aren't liable the way they were with the fdiv thing.