Can and should an x86 sandbox run unmodified x86 code?
Well, it’s complicated, and there are more tradeoffs and unsolved problems than one might expect.
The first argument is that yes, running unmodified x86 (and x86-64) is a very good idea, because it keeps the sandbox modular with respect to the software running in it. In other words, x86 is a very solid interface, and conforming to it makes your sandbox a drop-in replacement for anything else (including no sandbox at all).
The trouble, of course, comes from the fact that x86 is notoriously hard to virtualize. High performance and low overhead are necessary for making sandboxing practical and popular.
There are basically four approaches that I know of in use so far:
- Hardware virtualization (Xen, everything these days)
- Dynamic recompiler (old VMware)
- Architecture subset (NaCl)
- Alternate architecture (WebAssembly)
The first two run unmodified x86 code; the last two basically need a special compiler target.
Hardware virtualization is obviously the fastest. It can theoretically be full speed, depending on how powerful the feature is. (Of course in the extreme case, you have two separate, air-gapped computers. But let’s put that aside. We’re focusing on solutions for a single computer.)
The problem with using hardware virtualization is that we want our sandbox to be robust against CPU bugs. What does that mean? It means two things:
- It means that if a CPU bug is found in some instruction (somewhat poor example: Rowhammer being exploitable via CLFLUSH), we want to be able to rapidly and easily modify our sandbox to block/avoid that instruction.
- It means that if you are very paranoid, you should be able to choose a Turing-complete subset of instructions to trust, rather than trusting all of them, based on the security-vs-performance tradeoffs you are willing to make.
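To make the two requirements concrete, here is a minimal sketch of an instruction policy as plain data. Everything in it is an assumption for illustration: the mnemonic strings, the baseline set, and the helper names `make_policy` and `rejected` are made up, and a real sandbox would apply the policy to decoded machine code rather than to a list of names.

```python
# Hypothetical baseline of permitted mnemonics (illustrative only).
BASELINE_ALLOWED = frozenset({"mov", "add", "sub", "cmp", "jmp", "jne", "clflush"})


def make_policy(allowed=BASELINE_ALLOWED, blocked=()):
    """Return the set of mnemonics the sandbox will accept."""
    return frozenset(allowed) - frozenset(blocked)


def rejected(mnemonics, policy):
    """Return the mnemonics in a program that the policy does not permit, in order."""
    return [m for m in mnemonics if m not in policy]


program = ["mov", "clflush", "add", "jne"]

# Default policy: trust the whole baseline.
default_policy = make_policy()

# A bug is reported in one instruction: block it with a one-line change.
patched_policy = make_policy(blocked={"clflush"})

# A paranoid user trusts only a small hand-picked set.
paranoid_policy = make_policy(allowed={"mov", "sub", "jne"})

print(rejected(program, default_policy))
print(rejected(program, patched_policy))
print(rejected(program, paranoid_policy))
```

The point of the sketch is that both reactions (blocking one instruction, or shrinking the trusted set) are data changes rather than redesigns. That only works if the check is done in software.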
Thus, relying too much on hardware virtualization is risky, because you have no defense/recourse in case a hardware bug is found. (Note that I am not talking about malicious hardware backdoors; those are a separate problem that sandboxing cannot hope to defend against.)
For this reason, something like VT-x isn’t very useful for a secure sandbox. The real, hard problem of doing it in software has to be addressed.
Now, I want to explain two different ways of doing secure sandboxing in software:
- Define an accepted input language (which might be plain x86), and use a trusted compiler/interpreter/JIT to securely execute it.
- Define a secure x86 subset and use a trusted verifier to ensure the machine code conforms to it. Compilers, interpreters and JITs do not need to be trusted and it doesn’t matter how the code is generated.
In my opinion, the choice here is clear. A formally proven compiler is very difficult and expensive to build, and then difficult and expensive to maintain, forever. The chance of error, no matter how low (thanks to formal proofs), still grows linearly with the amount of maintenance or new development over time. In other words, this option is completely undesirable.
The other option is a secure verifier. If you come up with simple rules for your secure x86 subset, then the verifier can be very simple. There are NaCl verifiers that are 500 lines long and formally proven. This simplicity also lets you easily change them to work around CPU bugs. Beyond that, they rarely need to change at all.
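As a rough illustration of how small such a verifier's rules can be, here is a toy sketch. It is not NaCl's verifier. It assumes the instructions have already been decoded into a hypothetical `Insn` record (decoding x86 is the hard part and is skipped here), and it checks only a few rules loosely inspired by NaCl-style subsets: every instruction is in the allowed set, no instruction straddles a fixed-size bundle, and direct branches land on instruction starts. The 32-byte `BUNDLE_SIZE` and all names are assumptions for this sketch; real rule sets also constrain indirect jumps and more.

```python
from dataclasses import dataclass
from typing import Optional

BUNDLE_SIZE = 32  # assumed bundle size for this sketch


@dataclass(frozen=True)
class Insn:
    """Hypothetical pre-decoded instruction."""
    offset: int                   # byte offset within the code region
    length: int                   # encoded length in bytes
    mnemonic: str
    target: Optional[int] = None  # direct branch target offset, if any


def verify(insns, allowed):
    """Return a list of rule violations; an empty list means the code is accepted."""
    errors = []
    starts = {i.offset for i in insns}
    expected = 0
    for insn in insns:
        # Instructions must tile the region with no gaps or overlaps.
        if insn.offset != expected:
            errors.append(f"gap or overlap at offset {insn.offset}")
        expected = insn.offset + insn.length
        # Only instructions from the trusted subset are accepted.
        if insn.mnemonic not in allowed:
            errors.append(f"disallowed instruction {insn.mnemonic!r} at {insn.offset}")
        # No instruction may cross a bundle boundary.
        first_bundle = insn.offset // BUNDLE_SIZE
        last_bundle = (insn.offset + insn.length - 1) // BUNDLE_SIZE
        if first_bundle != last_bundle:
            errors.append(f"instruction at {insn.offset} crosses a bundle boundary")
        # Direct branches must land on the start of a verified instruction.
        if insn.target is not None and insn.target not in starts:
            errors.append(f"branch at {insn.offset} targets mid-instruction offset {insn.target}")
    return errors


ALLOWED = {"mov", "add", "jmp"}

good = [Insn(0, 5, "mov"), Insn(5, 3, "add"), Insn(8, 2, "jmp", target=0)]
bad = [
    Insn(0, 10, "mov"), Insn(10, 10, "mov"), Insn(20, 10, "mov"),
    Insn(30, 5, "mov"),              # spans offsets 30..34, crossing offset 32
    Insn(35, 2, "jmp", target=33),   # jumps into the middle of the previous mov
    Insn(37, 3, "clflush"),          # not in the allowed subset
]

print(verify(good, ALLOWED))
for problem in verify(bad, ALLOWED):
    print(problem)
```

Note that the verifier never looks at how the code was produced. The compiler or JIT that emitted it can be arbitrarily buggy, and the only consequence is that its output gets rejected.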
What I am saying here is that an x86 sandbox will need a secure verifier for x86 code regardless of what program format it accepts. This verifier should be the absolute last stage of sandbox execution, before the sandboxed code is run on hardware.
(This is trickier if you want to run self-modifying code. In theory, if your sandbox uses W^X (write xor execute) memory pages, then you can run the verifier when pages are about to become executable. Of course you want to be sure that the verifier is run in all necessary cases, and perhaps you don’t want to trust the hardware memory protection.)
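The W^X idea can be sketched as a toy page table in which the only way to make a page executable is through a call that runs the verifier first. This models the control flow only and does not touch real memory protection. The `PageTable` class, the `Perm` flags, and `toy_verifier` (which just rejects pages containing the bytes `BAD`) are all hypothetical stand-ins.

```python
from enum import Flag


class Perm(Flag):
    READ = 1
    WRITE = 2
    EXEC = 4


class VerificationError(Exception):
    pass


class PageTable:
    """Toy model of sandbox-managed pages (hypothetical, for illustration)."""

    def __init__(self, verifier):
        self._verifier = verifier  # callable: bytes -> bool
        self._pages = {}           # page number -> (bytearray, Perm)

    def map_page(self, page, data=b""):
        # New pages always start out writable and non-executable.
        self._pages[page] = (bytearray(data), Perm.READ | Perm.WRITE)

    def write(self, page, offset, data):
        buf, perm = self._pages[page]
        if Perm.WRITE not in perm:
            raise PermissionError("page is not writable")
        buf[offset:offset + len(data)] = data

    def protect(self, page, perm):
        # W^X: never writable and executable at the same time.
        if Perm.WRITE in perm and Perm.EXEC in perm:
            raise PermissionError("W^X: a page may not be writable and executable")
        buf, old = self._pages[page]
        # Run the verifier on every transition into the executable state.
        if Perm.EXEC in perm and Perm.EXEC not in old:
            if not self._verifier(bytes(buf)):
                raise VerificationError(f"page {page} failed verification")
        self._pages[page] = (buf, perm)


def toy_verifier(code):
    """Stand-in for a real verifier: rejects pages containing b'BAD'."""
    return b"BAD" not in code


pt = PageTable(toy_verifier)
pt.map_page(0, b"good code")
pt.protect(0, Perm.READ | Perm.EXEC)      # verified, now executable

# Self-modifying code has to drop EXEC, write, then be verified again.
pt.protect(0, Perm.READ | Perm.WRITE)
pt.write(0, 0, b"BAD")
try:
    pt.protect(0, Perm.READ | Perm.EXEC)
except VerificationError as exc:
    print(exc)
```

In this model the verifier cannot be bypassed because `protect` is the single path to an executable page. A real sandbox would need the same guarantee for every mapping path it exposes.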
At this point, the security portion of the sandbox is done. The only question left is the interface (execution format).
The simplest option is just to leave it as an instruction subset. This is what NaCl does. However, the fact that NaCl isn’t wildly popular suggests why this might be a bad option.
The biggest problem with this approach is that you are effectively exposing the internal implementation of the sandbox. If there’s a different sandbox that uses a slightly different subset, or if you need to change which instructions are permitted, then all applications are affected. (On the other hand, this option has the best possible performance.)
A new instruction set is theoretically even worse for running existing software, but in practice it can be better. The main selling point of WebAssembly is cross-platform portability, which is unrelated to its security properties.
The final option is x86 dynamic recompilation. As I understand it, VMware basically became a $40 billion business just by doing this one thing first and well. Then hardware support came along and ate their lunch (to some extent).
As far as I know, VMware's recompiler was never "highly secure" (i.e. it was written using conventional development techniques), nor did it have a formally proven output verifier. (To be clear, you don't need both.) In other words, x86 support for a secure sandbox is a (somewhat) harder problem than the one VMware was founded to solve, and VMware itself moved away from that approach as soon as an alternative was found. There is also an implicit performance hit on top of the instruction subset cost.
In conclusion, no, a secure sandbox cannot (directly) run unmodified x86 code. However, it probably should, for the sake of adoption. A WebAssembly front-end might be a viable compromise thanks to an easier implementation and reduced performance expectations relative to x86.
Keywords: sandboxing, security