Apple Silicon Docker amd64/x86 emulation via Rosetta is fast(er)
Summary: Phoronix 7zip compression test benchmarks around 2-4 times faster when running via Rosetta rather than QEMU.
Recently, Docker released a beta feature for Docker Desktop that allows x86/AMD64 images to be run via Rosetta rather than emulated with QEMU. QEMU gets the job done, but the performance overhead of emulating an AMD64 container on ARM64 is costly. From personal experience, I've seen cross-architecture Docker builds take around 5-10 times longer than builds for the host architecture. This problem is much more apparent on Apple Silicon Macs, which use ARM64 rather than the more common x86/AMD64. Apple's solution is Rosetta, a dynamic binary translator that translates Intel/x86 instructions to Apple Silicon/ARM64 instructions on the fly with a small performance hit. I've heard that translated code runs at around 80% of the speed of native Apple Silicon code.
After learning about the new option, I decided to test the performance difference between QEMU and Rosetta emulation in Docker. I used the Phoronix Test Suite, a collection of tools, tests, and test suites meant to profile computer performance. I made a simple Dockerfile and script to run Phoronix, which you can check out here. I chose to run the Prime Sieve test since it runs relatively quickly and is CPU-bound. Between the two tests, I switched Docker's beta emulation setting from QEMU (default) to Rosetta.
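As a rough illustration of how an amd64 container can be requested on an ARM64 host, here is a minimal Python sketch that builds a `docker run` command using the `--platform linux/amd64` flag. The image name `phoronix-bench:latest` and the helper `amd64_run_command` are hypothetical and are not taken from my Dockerfile or script; whether QEMU or Rosetta ends up executing the container is still decided by the Docker Desktop setting, not by this command.

```python
import shlex


def amd64_run_command(image, command):
    """Build a `docker run` argument list that requests an amd64 container."""
    return ["docker", "run", "--rm", "--platform", "linux/amd64", image, *command]


# Hypothetical image name, used only for illustration.
# `uname -m` is a quick way to check which architecture the container reports.
cmd = amd64_run_command("phoronix-bench:latest", ["uname", "-m"])

# Print the command as a shell-safe string for inspection.
print(shlex.join(cmd))
```

To actually execute it, the list can be passed to `subprocess.run(cmd, check=True)` on a machine with Docker installed.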
The QEMU container performed worse than the Rosetta container. Results are reported as the time taken to execute the test, in seconds. The Docker virtualization engine was given all 8 cores, 4 GB of memory, and 1 GB of swap. The native run was not limited and also ran across all 8 cores. Summary results for the tests are below:
Here is a plot showing how the two emulation methods compare to native execution, relative to the native run time.
And here are the actual averages and distributions of the tests:
- Apple M1: avg 26.476, stdev 0.642
- Rosetta 2: avg 32.276, stdev 0.152
- QEMU: avg 253.651, stdev 0.364
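To put these averages in relative terms, the short Python sketch below divides each mean run time by the native one. It only uses the three averages listed above; the variable names are my own.

```python
# Mean run times in seconds, copied from the results listed above.
mean_seconds = {
    "Apple M1": 26.476,
    "Rosetta 2": 32.276,
    "QEMU": 253.651,
}

native = mean_seconds["Apple M1"]

# Slowdown relative to native: how many times longer each run took.
slowdown = {name: t / native for name, t in mean_seconds.items()}

# Direct comparison of the two emulation methods.
qemu_vs_rosetta = mean_seconds["QEMU"] / mean_seconds["Rosetta 2"]

for name, factor in slowdown.items():
    print(f"{name}: {factor:.2f}x native run time")
print(f"QEMU vs Rosetta 2: {qemu_vs_rosetta:.2f}x slower")
```

With these numbers, Rosetta takes roughly 1.22 times as long as native (about 82% of native speed, in line with the 80% figure mentioned earlier), QEMU takes roughly 9.58 times as long, and QEMU is roughly 7.86 times slower than Rosetta on this test.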
Run the tests for yourself:
Below are additional tests comparing 7zip compression and decompression speeds between Rosetta and QEMU (no native baseline).