Ruby Raytracer on M2 Max
A few years — and iterations of the Apple M-series — has passed, and I recently found myself in possession of a Mac Studio M2, in its base configuration of 8P + 4E CPU cores, and even more ridiculous1 amounts of cache.
But the burning question is, does it deliver on the promise of up to 18% CPU speed improvement?
Hard shadows
The base-line benchmark, as before, is the single-threaded render without any fancy rendering techniques active
Mac Studio M2 (2023)
[1] pry(#<GlisteningRuby::DSL::Scene>)> render verbose:true
Setup time: 0.178938 s
Render time: 104.306864 s
Total time: 104.485802 s
That’s a 17% improvement in single-core performance over the M1, so right on the spot.
MacBook Pro M1 (2021)
[1] pry(#<GlisteningRuby::DSL::Scene>)> render verbose:true
Setup time: 0.240659 s
Render time: 122.153057 s
Total time: 122.393716 s
Then we get into the unfair territory, because the M1 did the multi-threaded benchmark using all its P-, and E-cores, where the M2 Max can let its P-cores do all the work.
Mac Studio M2 (2023)
[1] pry(#<GlisteningRuby::DSL::Scene>)> render verbose:true, threads:8
Setup time: 0.173121 s
Render time: 15.689671 s
Total time: 15.862792 s
BOOM even lightning faster — over six and a half times as fast multi-threaded, and almost twice (1.79x) as fast as the M1 was.
MacBook Pro M1 (2021)
[2] pry(#<GlisteningRuby::DSL::Scene>)> render verbose:true, threads:8
Setup time: 0.266847 s
Render time: 28.061283 s
Total time: 28.32813 s
Going All In — Soft Shadows and Anti-Aliasing
This time, we’re throwing everything we have, letting the tiny E-cores do what they can to help along.
Mac Studio M2 (2023)
[1] pry(#<GlisteningRuby::DSL::Scene>)> render verbose:true, threads:12, ssaa:3
Setup time: 0.17466 s
Render time: 304.752032 s
Total time: 304.926692 s
Just shy of twice (1.97x) the crunch of an M1, in the extremely unfair final round.
MacBook Pro M1 (2021)
[1] pry(#<GlisteningRuby::DSL::Scene>)> render verbose:true, threads:8, ssaa:3
Setup time: 0.23379 s
Render time: 601.335931 s
Total time: 601.569721 s
Still, this is an extremely inefficient raytracer, and nothing we can throw at it will make renders reasonable. All the benchmarks yield a 320x256 image, which of course is the standard Amiga PAL low-res, and would take days to run with an optimized raytracer on an actual Amiga.
Which raises the next question, is it more efficient to emulate an Amiga, running an optimized raytracer, than this nonsense one I made in Ruby?
-
This monster has 87 MiB L1+L2+L3 cache in total. It can fit my entire first hard-drive (52 MB) in CPU cache. ↩
Previous posts
RasPi 400 bonus benchmarks
I didn’t quite finish the bonus benchmarks, so for the sake of completeness, here we go. The missing ones for the RasPi 400.
Ruby raytracer on M1
While doing something else than I was planning to, I accedentally came back to the raytracer I once wrote in Ruby, and decided that the important thing just now is to benchmark my MacBook Pro M1 against those old results. I guess this will actually be Ruby 3 on M1 vs Ruby 2 on i7, because time happened.
Further down the rabbit hole
Hello darkness, my old friend