Writing that changed how I think about PL
May 13, 2025
Every so often I come across a paper, blog post, or (occasionally) video that
completely changes how I think about a topic in programming languages and
compilers. For some of these posts, I can’t even remember how I thought about
the idea before reading it—it was that impactful.
Here are some of those posts in no particular order:
- a simple semi-space collector by Andy Wingo brought the concept
of a Cheney/copying/compacting garbage collector from theory to practice for
me. The garbage collector core in the post is tiny, extensible, and
can be understood in an afternoon (sketch below).
- Implementing a Toy Optimizer by CF Bolz-Tereick changed how
I think about instruction rewrites in an optimizer. No more find-and-replace.
Instead, use a forwarding pointer! I love union-find (sketch below). The whole toy optimizer
series is fantastic: each post brings something new and interesting to the
table.
- A Knownbits Abstract Domain for the Toy Optimizer, Correctly by
CF Bolz-Tereick is a two-for-one. It both introduced me to a new abstract
domain for optimizers and changed how I think about Z3. Before, I vaguely
knew about Z3 as this thing that can check numeric operations, kind of. The
post, however, uses Z3 as a proof engine: if Z3 can’t find a counterexample,
the code is correct. Furthermore, it turns Z3 into a verifier for Python code by running the same Python functions on both Z3 objects and normal Python objects (sketch below).
- Cranelift, Part 3: Correctness in Register Allocation by
Chris Fallin also made proofs more accessible, but in a different way.
Instead of proving your register allocator correct on all inputs, prove it
correct on one input: the current code. This means that in production, you
either get a correct register allocation or a meaningful crash (sketch below). Additionally, it uses fuzzing to explore the state space, surfacing bugs as verifier failures.
- Regular Expression Matching: the Virtual Machine Approach by Russ
Cox made regular expression engines make sense to me. The post builds a small regular expression engine in under 50 lines of readable code (sketch below). As a side effect, it also made coroutines/fibers/schedulers more understandable, because it implements nondeterminism with user-space regex “threads”.
- micrograd by Andrej Karpathy is a tiny implementation of neural
networks with no libraries (not even NumPy). It’s how I came to understand
machine learning and even write some blog posts about it (sketch below).
- How I implement SSA form by Fil Pizlo changed how I think about
union-find. Instead of either storing an additional pointer inside every object or having a side-table, add an Identity tag that can store the pointer (sketch below). Unlike the other two approaches, this is a destructive rewrite, but it saves space. It also introduces two new things that I am still mulling over: Phi/Upsilon form and TBAA-style heap effects.
- Speculation in JavaScriptCore is another absolute banger
by Fil Pizlo. I learn new things every time I re-read this post, which
describes how JSC’s various optimizers work (and I guess how Fil thinks about
optimizers).
- Modernizing Compiler Design for Carbon Toolchain is a talk by
Chandler Carruth on the design of the Carbon compiler. About 29 minutes in, Chandler sets incredibly aggressive compile-time budgets and explains
how the compiler is architected to fit in that time budget. At around 40
minutes, he starts explaining per-layer how this works, starting at the
lexer.
- A Python Interpreter Written in Python by Allison Kaptur
made bytecode interpreters (and, specifically, how CPython works internally)
click for me (sketch below).
- Parsing expressions by precedence climbing by Eli
Bendersky presented an understandable and easier-to-develop alternative to
traditional hand-written recursive descent parsers. It made parsers less scary to me and illustrated that precedence climbing is the same algorithm as recursive descent, except that instead of ten different functions with similar structure, it uses one function and a table (sketch below).
- Ruby JIT Challenge by Takashi Kokubun is a great introduction to code generation, and it generalizes beyond JITs, too. In the post he also shows off a cool approach to register allocation that I had not seen before: fold your stack operations at compile time to operate on a virtual stack of physical registers (sketch below).
- An Incremental Approach to Compiler Construction
(PDF) by Abdulaziz Ghuloum brought compilers and code generation from a
mystical multi-pass beast down to an understandable single-pass form. I like
how it implements each feature end-to-end, one by one.
- Lessons from Writing a Compiler by Fernando Borretti puts this stripey, feature-at-a-time implementation strategy into words in its first section, which is really neat.
- egg: Fast and extensible equality saturation by [multiple authors]
changed how I think about optimizers and pass ordering. Why not just
generate the compressed hypergraph of all possible versions of an expression
and then pick the “best” one? I mean, there are a couple of reasons, but it
was mind-expanding.
- Cranelift: Using E-Graphs for Verified, Cooperating Middle-End
Optimizations by Chris Fallin showed that e-graphs are
workable and remarkably effective in a production compiler, even if you don’t
do the “full shebang”.
- Acyclic Egraphs and Smart Constructors by Phil Zucker further
explores acyclic egraphs in the small (sketch below). This one was a slow burn: I read it
when it came out, didn’t really get it, and then months later it clicked
more. I’m sure that in a few months it’ll click again.
- This Reddit comment by Bob Nystrom and Flattening
ASTs by Adrian Sampson are a double whammy that led to two
things: 1) the realization that AST storage can be really compact, almost like bytecode (sketch below), and 2) a really interesting discussion on Mastodon about massively-parallel lock-free abstract interpretation if IRs were stored like this. This (and some off-hand comments by Cliff Click about bump allocation only taking “a few clocks”) changed how I think about allocating and referring to IR nodes.
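Some of these ideas are easier to show than to describe, so here are a few tiny sketches, in the same order as the list above. They are my own from-memory reconstructions in Python, with made-up names and heavy simplification, not the original authors’ code.

First, the Cheney copy at the heart of a semi-space collector: copy the roots into to-space, then scan to-space left to right, copying everything reachable and recording forwarding pointers so sharing is preserved. Modeling the heap as a Python list and the forwarding pointers as a dict is a cheat; Wingo’s C version threads them through the objects themselves.

```python
# A tiny model of the Cheney copy in a semi-space collector. Objects
# are Python lists whose slots hold ints (immediates) or other lists
# (pointers).

def collect(roots):
    tospace = []     # the new heap; doubles as the Cheney worklist
    forwarded = {}   # id(from-space object) -> its to-space copy

    def copy(obj):
        if not isinstance(obj, list):
            return obj                      # immediate, not a pointer
        if id(obj) not in forwarded:
            forwarded[id(obj)] = list(obj)  # shallow copy into to-space
            tospace.append(forwarded[id(obj)])
        return forwarded[id(obj)]

    new_roots = [copy(root) for root in roots]
    scan = 0
    while scan < len(tospace):          # everything between scan and the
        obj = tospace[scan]             # end of to-space is copied but
        for i, slot in enumerate(obj):  # not yet scanned: the gray set
            obj[i] = copy(slot)
        scan += 1
    return new_roots

a = [1, 2]
b = [a, a, 3]                 # shares a twice
new_b, = collect([b])
assert new_b[2] == 3
assert new_b[0] is new_b[1]   # sharing is preserved by forwarding
```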
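The forwarding-pointer trick from the toy optimizer series, roughly as I remember it: instead of find-and-replacing every use of an operation, forward the operation to its replacement and chase pointers on read, union-find style.

```python
# "Replace" an operation by forwarding it; later passes just call
# .find() on operands instead of maintaining use lists.

class Operation:
    def __init__(self, name, *args):
        self.name = name
        self.args = list(args)
        self.forwarded = None  # union-find parent, set once replaced

    def find(self):
        # Follow the forwarding chain to the representative.
        op = self
        while isinstance(op, Operation) and op.forwarded is not None:
            op = op.forwarded
        return op

    def make_equal_to(self, other):
        self.find().forwarded = other

def simplify(op):
    # A peephole rule: x + 0 => x. No rewriting of users anywhere.
    if op.name == "add" and op.args[1] == 0:
        op.make_equal_to(op.args[0])

x = Operation("getarg", 0)
y = Operation("add", x, 0)
simplify(y)
assert y.find() is x
```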
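Z3 as a proof engine, in the spirit of the knownbits post: write the transfer function once as ordinary Python, run it on Z3 bitvectors instead of ints, and ask Z3 for a counterexample; unsat means the transfer function is sound. The (ones, unknowns) encoding below is my assumption, not necessarily the post’s exact representation.

```python
import z3

WIDTH = 8

def and_transfer(x_ones, x_unk, y_ones, y_unk):
    # Abstract AND: a bit is known-one iff known-one in both inputs,
    # and known-zero iff known-zero in either input.
    ones = x_ones & y_ones
    zeros = (~x_ones & ~x_unk) | (~y_ones & ~y_unk)
    return ones, ~(ones | zeros)

def matches(value, ones, unk):
    # A concrete value matches an abstract one if all known bits agree.
    return (value & ~unk) == ones

x, y = z3.BitVecs("x y", WIDTH)
x_ones, x_unk, y_ones, y_unk = z3.BitVecs("x_ones x_unk y_ones y_unk", WIDTH)
r_ones, r_unk = and_transfer(x_ones, x_unk, y_ones, y_unk)

solver = z3.Solver()
solver.add(x_ones & x_unk == 0, y_ones & y_unk == 0)  # well-formed inputs
solver.add(matches(x, x_ones, x_unk), matches(y, y_ones, y_unk))
solver.add(z3.Not(matches(x & y, r_ones, r_unk)))     # hunt for a counterexample
assert solver.check() == z3.unsat  # none found: the transfer function is sound
```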
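A toy of the verify-the-current-input idea from the Cranelift post: a symbolic checker that tracks which vreg’s value each machine register holds, and fails loudly when an allocation is wrong. The three-instruction IR here is invented for the example.

```python
def check(allocated_code):
    held = {}  # machine register -> which vreg's value it currently holds
    for inst in allocated_code:
        if inst[0] == "def":     # ("def", vreg, reg): vreg is defined into reg
            _, vreg, reg = inst
            held[reg] = vreg
        elif inst[0] == "move":  # ("move", dst_reg, src_reg)
            _, dst, src = inst
            held[dst] = held.get(src)
        elif inst[0] == "use":   # ("use", vreg, reg): reg must hold vreg's value
            _, vreg, reg = inst
            assert held.get(reg) == vreg, f"{reg} holds {held.get(reg)}, not {vreg}"

# A correct allocation passes silently...
check([("def", "v0", "r1"), ("move", "r2", "r1"), ("use", "v0", "r2")])
# ...and a clobber is a meaningful crash, not a silent miscompile.
try:
    check([("def", "v0", "r1"), ("def", "v1", "r1"), ("use", "v0", "r1")])
except AssertionError as e:
    print("caught:", e)
```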
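The simplest of the regex VMs in Cox’s post, a backtracking one: the pattern compiles to char/split/jmp/match instructions, and a “thread” is just a (program counter, string position) pair.

```python
def run(prog, text):
    def thread(pc, sp):  # one thread of execution: (program counter, position)
        while True:
            op, *args = prog[pc]
            if op == "char":
                if sp >= len(text) or text[sp] != args[0]:
                    return False
                pc, sp = pc + 1, sp + 1
            elif op == "jmp":
                pc = args[0]
            elif op == "split":  # nondeterministic choice: try both branches
                if thread(args[0], sp):
                    return True
                pc = args[1]
            elif op == "match":
                return True
    return thread(0, 0)

# a+b, compiled by hand:
prog = [
    ("char", "a"),    # 0
    ("split", 0, 2),  # 1: loop for another 'a', or fall through
    ("char", "b"),    # 2
    ("match",),       # 3
]
assert run(prog, "aaab") and not run(prog, "b")
```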
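The heart of micrograd, as I remember it: a scalar Value that records what produced it, so backpropagation is a reverse topological walk that applies the chain rule. The method names are approximate.

```python
class Value:
    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._backward = lambda: None  # leaves have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def backward():  # d(a+b)/da = d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def backward():  # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = backward
        return out

    def backprop(self):
        # Topologically sort the graph, then chain-rule in reverse.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for child in v._children:
                    visit(child)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

x, y = Value(2.0), Value(3.0)
z = x * y + x  # dz/dx = y + 1 = 4, dz/dy = x = 2
z.backprop()
assert (x.grad, y.grad) == (4.0, 2.0)
```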
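Pizlo’s Identity trick, as I understand it: to redirect a node, destructively overwrite the node itself with an Identity tag pointing at the replacement, then resolve like find() in union-find. No side-table, no extra pointer field in every object.

```python
class Node:
    def __init__(self, op, *args):
        self.op = op
        self.args = list(args)

    def replace_with(self, other):
        # Destructive: this node *becomes* an Identity of other,
        # reusing its own storage to hold the forwarding pointer.
        self.op = "identity"
        self.args = [other]

def resolve(node):
    # Chase Identity tags to the real node.
    while node.op == "identity":
        node = node.args[0]
    return node

x = Node("const", 4)
y = Node("mul", x, Node("const", 1))
y.replace_with(x)  # fold x * 1 => x
assert resolve(y) is x
```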
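The skeleton that made bytecode interpreters click for me: a loop, a value stack, and a dispatch on opcode. The tiny instruction set below is made up for the example, not CPython’s.

```python
def interpret(code, consts):
    stack = []
    pc = 0
    while pc < len(code):
        op, arg = code[pc]
        if op == "LOAD_CONST":
            stack.append(consts[arg])
        elif op == "BINARY_ADD":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif op == "RETURN_VALUE":
            return stack.pop()
        pc += 1

# 7 + 5
code = [("LOAD_CONST", 0), ("LOAD_CONST", 1),
        ("BINARY_ADD", None), ("RETURN_VALUE", None)]
assert interpret(code, consts=[7, 5]) == 12
```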
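Precedence climbing compressed to its core, after Bendersky’s post: one function and a precedence table instead of a function per precedence level. Tokens are pre-split and atoms are just numbers here.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}  # all left-associative

def parse(tokens, pos=0, min_prec=1):
    node, pos = int(tokens[pos]), pos + 1  # parse an atom
    # Greedily consume operators that bind at least as tightly as min_prec.
    while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
        op, prec = tokens[pos], PRECEDENCE[tokens[pos]]
        # Left-associative: the right operand may only bind *tighter*.
        rhs, pos = parse(tokens, pos + 1, prec + 1)
        node = (op, node, rhs)
    return node, pos

tree, _ = parse("1 + 2 * 3 - 4".split())
assert tree == ("-", ("+", 1, ("*", 2, 3)), 4)
```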
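The virtual-stack trick from the Ruby JIT Challenge, as I understood it: run the stack effects at compile time, but push register names instead of values, so stack bytecode becomes register code with no real stack traffic. The bytecode and register bank are invented.

```python
def compile_to_regs(bytecode):
    free = ["r0", "r1", "r2", "r3"]  # a tiny bank of scratch registers
    stack, asm = [], []              # the virtual stack holds register names
    for inst in bytecode:
        op, *args = inst
        if op == "push":
            reg = free.pop(0)
            asm.append(f"mov {reg}, {args[0]}")
            stack.append(reg)
        elif op == "add":
            rhs, lhs = stack.pop(), stack.pop()
            asm.append(f"add {lhs}, {rhs}")
            free.append(rhs)         # rhs is dead after the add
            stack.append(lhs)
        elif op == "ret":
            asm.append(f"mov rax, {stack.pop()}")
            asm.append("ret")
    return asm

# (1 + 2) + 3, never touching the machine stack:
for line in compile_to_regs([("push", 1), ("push", 2), ("add",),
                             ("push", 3), ("add",), ("ret",)]):
    print(line)
```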
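Smart constructors in the small, loosely after Zucker’s post: the constructor both hash-conses (structurally equal terms become the same object) and applies rewrites before a node ever exists. The rules and representation here are mine.

```python
table = {}  # hash-cons table: structural key -> canonical term

def term(op, *args):
    # Rewrite rules fire before the node is ever built...
    if op == "add" and args[1] == term("const", 0):
        return args[0]  # x + 0 => x
    if op == "add" and all(arg[0] == "const" for arg in args):
        return term("const", args[0][1] + args[1][1])  # constant folding
    # ...and identical terms share one node (hash-consing).
    key = (op, *args)
    if key not in table:
        table[key] = key
    return table[key]

x = term("var", "x")
a = term("add", x, term("const", 0))
b = term("add", term("const", 2), term("const", 2))
assert a is x                   # rewritten away at construction time
assert b is term("const", 4)    # folded, and shared with later builds
```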
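Finally, a flattened AST in the style of Sampson’s post: nodes live in one flat array and refer to each other by index rather than by pointer, so “allocation” is just an append.

```python
nodes = []  # the whole AST: a flat array of (op, lhs, rhs) triples

def add_node(op, lhs=None, rhs=None):
    nodes.append((op, lhs, rhs))
    return len(nodes) - 1  # indices are compact, almost like bytecode

def interp(index):
    op, lhs, rhs = nodes[index]
    if op == "lit":
        return lhs  # literals store their value in the lhs slot
    if op == "add":
        return interp(lhs) + interp(rhs)
    if op == "mul":
        return interp(lhs) * interp(rhs)

# (1 + 2) * 3
e = add_node("mul",
             add_node("add", add_node("lit", 1), add_node("lit", 2)),
             add_node("lit", 3))
assert interp(e) == 9
```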
I hope you enjoy the readings!