CPU.fail … is that really necessary?

May 20, 2019

Another day, another CPU-specific ‘speculative execution’ bug and another scary headline. The latest entries on the list are described at https://cpu.fail.

Each disclosure is usually accompanied by speculation (pun intended) and hysteria, much of it attributable to a lack of understanding of the inner workings of ‘processor microarchitectures’. Over the course of the last year, I’ve had the opportunity to speak with many different security teams that deal with these attacks and their variants.

Invariably, all of them are grappling with the new variants coming their way. I think the problem lies in our approach: every variant ends up being treated as an entirely new problem that requires a different set of mitigations. I agree that the 3 new ones are different from Spectre, Meltdown or Foreshadow, and that no single mitigation deals with all of them. But these attacks all exploit one particular vulnerability class, the ‘speculative execution side channel’, and each new attack is just another variant of it.

Then why can’t we have a systematic approach to ‘mitigating’ these attacks? I think we can and, what’s more, we already have a pretty good system (developed in-house at Microsoft).

Before we delve into the details of the systematic approach, here is a super simplified description of the attack.

**A typical out-of-order CPU microarchitecture. Instructions flow *in-order* from fetch to commit (green arrow), but (between rename and commit) can *execute* in dataflow order (shaded region).**

The attack exploits a processor’s microarchitectural design. Modern processors don’t execute program instructions strictly sequentially. They can (and will) execute instructions out of order, or speculatively based on predictions (e.g. a processor can predict that an “if” branch will be taken and start executing it before or while the “if” condition is evaluated).
When a processor speculates and gets it wrong, it has to discard the result of that speculation (i.e. the prediction) and restart execution from the point where it went wrong.
But here’s the problem: not everything is discarded. Side effects of the speculation may already have been written to the processor cache or other temporary microarchitectural buffers.
Examining these processor-specific buffers via side-channel techniques can reveal those results, which in these attacks may include private data.
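
To make the "not everything is discarded" point concrete, here is a minimal toy model in Python. It is only a sketch, not how any real processor is implemented: `ToyCore`, `run_guarded_load` and the register name `r1` are hypothetical names invented for this illustration. It models a guarded load (`if condition: r1 = memory[address]`) whose branch is predicted, and shows that a squashed misprediction restores the registers but leaves the cache footprint behind.

```python
class ToyCore:
    """Toy model of a core with architectural registers and a cache."""

    def __init__(self):
        self.registers = {}            # architectural state (rolled back on a squash)
        self.cached_addresses = set()  # microarchitectural state (NOT rolled back)

    def run_guarded_load(self, memory, address, condition, predicted):
        """Model `if condition: r1 = memory[address]` with a predicted branch."""
        checkpoint = dict(self.registers)
        if predicted:
            # Speculative execution of the branch body.
            self.registers["r1"] = memory[address]
            self.cached_addresses.add(address)  # the load pulls the line into cache
        if predicted != condition:
            # Misprediction: squash the speculative register state...
            self.registers = checkpoint
            # ...but the cache is left exactly as the speculation left it.
            if condition:
                # The branch should have been taken: re-execute on the correct path.
                self.registers["r1"] = memory[address]
                self.cached_addresses.add(address)


core = ToyCore()
core.run_guarded_load([10, 20, 30], address=2, condition=False, predicted=True)
print(core.registers, core.cached_addresses)  # prints {} {2}
```

The program never "saw" `memory[2]` in a register, yet the cache state now depends on which address was touched. That leftover state is what a side channel reads.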

If you think about it, a speculative execution side-channel attack has 4 steps:

1. Make the CPU speculate.
2. Get the CPU to execute your code during the speculation period or window.
3. Persist/write the result of the speculative execution to cache or some internal buffer.
4. Finally, examine or read the cache/buffers for your stored result.
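
The four steps can be walked through end to end in a small simulation. This is a hedged sketch of a bounds-check-bypass style attack against a toy model, not working exploit code: `ToyCPU`, `victim`, `leak_byte` and the `mispredict` flag are all hypothetical, and a simple set lookup stands in for the cache-timing measurement a real attacker would have to perform.

```python
PUBLIC = bytes([1, 2, 3, 4])  # data the victim is allowed to return
SECRET = b"K"                 # private byte that happens to sit right after it
MEMORY = PUBLIC + SECRET      # flat toy address space


class ToyCPU:
    def __init__(self):
        self.cached_lines = set()  # which of the 256 probe lines are cached

    def victim(self, index, mispredict):
        """Bounds-checked read; `mispredict` models a mistrained branch predictor."""
        in_bounds = index < len(PUBLIC)
        # Steps 1 and 2: the bounds check is predicted "taken", so the body
        # runs transiently even for an out-of-bounds index.
        if in_bounds or mispredict:
            value = MEMORY[index]
            # Step 3: a value-dependent access leaves one probe line in cache.
            self.cached_lines.add(value)
        # Architecturally, an out-of-bounds read returns nothing.
        return MEMORY[index] if in_bounds else None

    def flush(self):
        self.cached_lines.clear()

    def reload_is_fast(self, line):
        # Stand-in for timing a reload of the probe line.
        return line in self.cached_lines


def leak_byte(cpu, index):
    cpu.flush()                          # start from a clean probe array
    cpu.victim(index, mispredict=True)   # steps 1 to 3
    # Step 4: find the single probe line that reloads quickly.
    fast = [line for line in range(256) if cpu.reload_is_fast(line)]
    return fast[0] if len(fast) == 1 else None


cpu = ToyCPU()
print(cpu.victim(len(PUBLIC), mispredict=False))  # prints None
print(chr(leak_byte(cpu, len(PUBLIC))))           # prints K
```

The victim function never returns the secret, but the index of the cached probe line does.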

Breaking it down this way makes it much simpler to define our mitigation strategy and reason about it (rather than deal with every variant as a completely new attack). Simply put, if we can do any one of the following, we win.

Prevent the CPU from speculating.
Remove sensitive content from memory.
Remove observation channels.
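
Each of these three options breaks a different link in the attack chain, which the following toy simulation tries to show. It is a sketch under stated assumptions: `run_attack` and its parameters `barrier`, `secret_mapped` and `timer_resolution_ns` are hypothetical switches standing in for the three mitigation families, and the latency constants are made-up numbers. Real mitigations are considerably more involved, and a coarse timer in practice tends to slow an attacker down rather than stop them outright.

```python
HIT_NS, MISS_NS = 40, 200  # made-up cache hit / miss latencies


def run_attack(barrier=False, secret_mapped=True, timer_resolution_ns=1):
    """Return the leaked byte, or None if the attack learns nothing."""
    public = [1, 2, 3, 4]
    memory = dict(enumerate(public))
    if secret_mapped:
        memory[len(public)] = 75  # secret byte adjacent to the public array

    cached = set()
    index = len(public)  # out-of-bounds index chosen by the attacker

    # Mitigation 1 (prevent speculation): with a barrier, the mispredicted
    # branch body never runs transiently.
    if not barrier:
        # Mitigation 2 (remove sensitive content): if the secret is not
        # mapped, the transient read has nothing to pick up.
        value = memory.get(index)
        if value is not None:
            cached.add(value)

    # Mitigation 3 (remove observation channels): a coarse timer rounds
    # hit and miss latencies to the same reading.
    def timed(line):
        latency = HIT_NS if line in cached else MISS_NS
        return latency // timer_resolution_ns * timer_resolution_ns

    timings = [timed(line) for line in range(256)]
    fastest = min(timings)
    candidates = [line for line, t in enumerate(timings) if t == fastest]
    return candidates[0] if len(candidates) == 1 else None


print(run_attack())                          # prints 75
print(run_attack(barrier=True))              # prints None
print(run_attack(secret_mapped=False))       # prints None
print(run_attack(timer_resolution_ns=1000))  # prints None
```

Any one switch is enough to make the toy attack fail, which is the sense in which doing any of the three means "we win".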

References:

Microsoft’s talk on the subject — https://youtu.be/_J9MpK4MQWk

Written by Nihal Pasham
