<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/assets/rss.xsl"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Max Bernstein&apos;s Blog</title>
        <description></description>
        <link>https://bernsteinbear.com</link>
        <atom:link href="https://bernsteinbear.com/feed.xml" rel="self" type="application/rss+xml" />
        <item shouldShow="false">
            <title>Sorry for marking all the posts as unread</title>
            <description>
              I noticed that the URLs were all a little off (had two slashes
              instead of one) and went in and fixed it. I did not think
              everyone's RSS software was going to freak out the way it did.

              PS: this is a special RSS-only post that is not visible on the
              site. Enjoy.
            </description>
            <pubDate>Wed, 31 Jan 2024 00:00:00 +0000</pubDate>
            <guid isPermaLink="false">rss-only-post-1</guid>
        </item>
        
        <item>
            <title>Value numbering</title>
            <description>&lt;p&gt;Welcome back to compiler land. Today we’re going to talk about &lt;em&gt;value
numbering&lt;/em&gt;, which is like SSA, but more.&lt;/p&gt;

&lt;p&gt;Static single assignment (SSA) gives names to values: every expression has a
name, and each name corresponds to exactly one expression. It transforms
programs like this:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;where the variable &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x&lt;/code&gt; is assigned more than once in the program text, into
programs like this:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;v0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;where each assignment to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x&lt;/code&gt; has been replaced with an assignment to a new
fresh name.&lt;/p&gt;

&lt;p&gt;It’s great because it makes clear the differences between the two &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x + 1&lt;/code&gt;
expressions. Though they textually look similar, they compute different values.
The first computes 1 and the second computes 2. In this example, it is not
possible to substitute in a variable and re-use the value of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x + 1&lt;/code&gt;, because
the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x&lt;/code&gt;s are different.&lt;/p&gt;

&lt;p&gt;But what if we see two “textually” identical instructions in SSA? That sounds
much more promising than non-SSA because the transformation into SSA form has
removed (much of) the statefulness of it all. When can we re-use the result?&lt;/p&gt;

&lt;p&gt;Identifying instructions that are known at compile-time to always produce the
same value at run-time is called &lt;em&gt;value numbering&lt;/em&gt;. &lt;!-- This is also called common
subexpression elimination (CSE), though for some reason the two mean slightly
different things to different groups of people. --&gt;&lt;/p&gt;

&lt;h2 id=&quot;eliminating-common-subexpressions&quot;&gt;Eliminating common subexpressions&lt;/h2&gt;

&lt;p&gt;To understand value numbering, let’s extend the above IR snippet with two more
instructions, v3 and v4.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;v0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v3&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# new
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v4&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;do_something&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# new
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In this new snippet, v3 looks the same as v1: adding v0 and 1. Assuming our
addition operation is some ideal mathematical addition, we can absolutely
re-use v1; no need to compute the addition again. We can rewrite the IR to
something like:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;v0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v3&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v1&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v4&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;do_something&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This is kind of similar to the destructive union-find representation that
JavaScriptCore and a couple other compilers use, where the optimizer doesn’t
eagerly re-write all uses but instead leaves a little breadcrumb
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Identity&lt;/code&gt;/&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Assign&lt;/code&gt; instruction&lt;sup id=&quot;fnref:cinder&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:cinder&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;We could then run our copy propagation pass (“union-find cleanup”?) and get:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;v0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v4&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;do_something&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Great. But how does this happen? How does an optimizer identify reusable
instruction candidates that are “textually identical”? Generally, there is &lt;a href=&quot;https://pointersgonewild.com/2011/10/07/optimizing-global-value-numbering/&quot;&gt;no
actual text in the
IR&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;One popular solution is to compute a hash of each instruction. Then any
instructions with the same hash (that also compare equal, in case of
collisions) are considered equivalent. This is called &lt;em&gt;hash-consing&lt;/em&gt;.&lt;/p&gt;
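
&lt;p&gt;To make that concrete, here is a minimal Python sketch. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Instr&lt;/code&gt; class and its &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;key&lt;/code&gt; method are made up for illustration and do not come from any real compiler. Operands are compared by identity, and the dictionary takes care of both the hashing and the equality check on collisions.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;class Instr:
    def __init__(self, opcode, *operands):
        self.opcode = opcode
        self.operands = operands

    def key(self):
        # Hashable summary: the opcode plus the identity of each operand.
        return (self.opcode, tuple(id(op) for op in self.operands))


v0 = Instr("Const", 0)
one = Instr("Const", 1)
v1 = Instr("Add", v0, one)
v2 = Instr("Add", v1, one)
v3 = Instr("Add", v0, one)

seen = {}        # key -&gt; first instruction seen with that key
duplicates = []  # (later instruction, earlier equivalent) pairs
for instr in (v1, v2, v3):
    # setdefault returns the earlier instruction if an equal key exists,
    # otherwise it stores and returns the current one.
    first = seen.setdefault(instr.key(), instr)
    if first is not instr:
        duplicates.append((instr, first))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;After the loop, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;duplicates&lt;/code&gt; holds a single pair: v3 and the earlier v1.&lt;/p&gt;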

&lt;p&gt;When trying to figure all this out, I read through a couple of different
implementations. I particularly like the &lt;a href=&quot;https://maxine-vm.readthedocs.io/en/stable/&quot;&gt;Maxine VM&lt;/a&gt; implementation.
For example, here is the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;valueNumber&lt;/code&gt; (hashing) and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;valueEqual&lt;/code&gt;
functions for most binary operations, slightly modified for clarity:&lt;/p&gt;

&lt;div class=&quot;language-java highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;abstract&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Instruction&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;// The base class for binary operations&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;abstract&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Op2&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Instruction&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// Each binary operation has an opcode and two operands&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opcode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;// (IMUL, IADD, ...)&lt;/span&gt;
    &lt;span class=&quot;nc&quot;&gt;Value&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;nc&quot;&gt;Value&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;valueNumber&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;// There are other fields but only the opcode and operands get hashed.&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;// Always set at least one bit in case the hash wraps to zero.&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;mh&quot;&gt;0x20000000&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opcode&lt;/span&gt;
           &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;identityHashCode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
           &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;identityHashCode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;boolean&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;valueEqual&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Instruction&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;instanceof&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Op2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;nc&quot;&gt;Op2&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;o&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Op2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opcode&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;o&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;opcode&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;o&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;o&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The rest of the value numbering implementation assumes that if a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;valueNumber&lt;/code&gt;
function returns 0, it does not wish to be considered for value
numbering. Why might an instruction opt-out of value numbering?&lt;/p&gt;

&lt;h2 id=&quot;pure-vs-impure&quot;&gt;Pure vs impure&lt;/h2&gt;

&lt;p&gt;An instruction might opt out of value numbering if it is not “pure”.&lt;/p&gt;

&lt;p&gt;Some instructions are not pure. Purity is in the eye of the beholder, but in
general it means that an instruction does not interact with the state of the
outside world, except for trivial computation on its operands. (What does it
mean to de-duplicate/cache/reuse &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;printf&lt;/code&gt;?)&lt;/p&gt;

&lt;p&gt;A load from an array object is also not a pure operation&lt;sup id=&quot;fnref:heap-ssa&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:heap-ssa&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;. The load operation
implicitly relies on the state of the memory. Also, even if the array was
known-constant, in some runtime
systems, the load might raise an exception. Changing the source location where
an exception is raised is generally frowned upon. Languages such as Java often
have requirements about where exceptions are raised codified in their
specifications.&lt;/p&gt;

&lt;p&gt;We’ll work only on pure operations for now, but we’ll come back to this later.
We do often want to optimize impure operations as well!&lt;/p&gt;

&lt;p&gt;We’ll start off with the simplest form of value numbering, which operates only
on linear sequences of instructions, like basic blocks or traces.&lt;/p&gt;

&lt;h2 id=&quot;local-value-numbering&quot;&gt;Local value numbering&lt;/h2&gt;

&lt;p&gt;Let’s build a small implementation of local value numbering (LVN). We’ll start with
straight-line code—no branches or anything tricky.&lt;/p&gt;

&lt;p&gt;Most compiler optimizations on control-flow graphs (CFGs) iterate over the
instructions “top to bottom”&lt;sup id=&quot;fnref:order&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:order&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; and it seems like we can do the same thing
here too.&lt;/p&gt;

&lt;p&gt;From what we’ve seen so far optimizing our made-up IR snippet, we can do
something like this:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;initialize a map from value numbers to instruction pointers&lt;/li&gt;
  &lt;li&gt;for each instruction &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i&lt;/code&gt;
    &lt;ul&gt;
      &lt;li&gt;if &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i&lt;/code&gt; wants to participate in value numbering
        &lt;ul&gt;
          &lt;li&gt;if &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i&lt;/code&gt;’s value number is already in the map, replace all pointers to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i&lt;/code&gt;
in the rest of the program with the corresponding value from the map&lt;/li&gt;
          &lt;li&gt;otherwise, add &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i&lt;/code&gt; to the map&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The find-and-replace, remember, is not a literal find-and-replace, but instead
something like:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opcode&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Assign&quot;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;operands&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;replacement&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;or&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;make_equal_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;replacement&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;(if you have been following along with the &lt;a href=&quot;https://pypy.org/categories/toy-optimizer.html&quot;&gt;toy optimizer&lt;/a&gt; series)&lt;/p&gt;

&lt;p&gt;This several-line function (as long as you already have a hash map and a
union-find available to you) is enough to build local value numbering! And real
compilers are built this way, too.&lt;/p&gt;
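
&lt;p&gt;Here is roughly what that could look like in Python. This is a sketch with made-up names: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Instr&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;find&lt;/code&gt;, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PURE&lt;/code&gt; set are illustrative, in the spirit of the toy optimizer series, and it assumes every operand of a pure instruction is itself an instruction.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;class Instr:
    def __init__(self, opcode, *operands):
        self.opcode = opcode
        self.operands = list(operands)
        self.forwarded = None  # union-find parent pointer

    def find(self):
        # Follow forwarding pointers to the representative instruction.
        instr = self
        while instr.forwarded is not None:
            instr = instr.forwarded
        return instr

    def make_equal_to(self, other):
        self.find().forwarded = other


PURE = {"Add", "Mul"}  # opcodes that participate (an assumption)


def local_value_numbering(block, value_map=None):
    if value_map is None:
        value_map = {}
    for instr in block:
        # Look through earlier replacements before hashing.
        instr.operands = [op.find() if isinstance(op, Instr) else op
                          for op in instr.operands]
        if instr.opcode not in PURE:
            continue  # impure instructions opt out
        key = (instr.opcode, tuple(id(op) for op in instr.operands))
        earlier = value_map.setdefault(key, instr)
        if earlier is not instr:
            instr.make_equal_to(earlier)
    return value_map


v0 = Instr("Const", 0)
one = Instr("Const", 1)
v1 = Instr("Add", v0, one)
v2 = Instr("Add", v1, one)
v3 = Instr("Add", v0, one)
v4 = Instr("Call", v2, v3)
local_value_numbering([v0, one, v1, v2, v3, v4])
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Afterwards, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v3.find()&lt;/code&gt; is v1, and the call’s operands have been rewritten to v2 and v1, matching the hand-optimized snippet from earlier.&lt;/p&gt;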

&lt;p&gt;If you don’t believe me, take a look at this slightly edited snippet from
&lt;a href=&quot;https://maxine-vm.readthedocs.io/en/stable/&quot;&gt;Maxine’s&lt;/a&gt; value numbering implementation. It has all of the components
we just talked about: iterating over instructions, map lookup, and some
substitution.&lt;/p&gt;

&lt;div class=&quot;language-java highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;// Local value numbering&lt;/span&gt;
&lt;span class=&quot;nc&quot;&gt;BlockBegin&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;...;&lt;/span&gt;
&lt;span class=&quot;nc&quot;&gt;ValueMap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currentMap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ValueMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;nc&quot;&gt;InstructionSubstituter&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;subst&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;InstructionSubstituter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;// visit all instructions of this block&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Instruction&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;next&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;next&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// attempt value numbering (uses valueNumber() and valueEqual())&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;//&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// return a previous instruction if it exists in the map, or insert the&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// current instruction into the map and return it&lt;/span&gt;
    &lt;span class=&quot;nc&quot;&gt;Instruction&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currentMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;findInsert&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;// remember the replacement in the union-find&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;subst&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setSubst&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This alone will get you pretty far. Code generators of all shapes tend to leave
messy repeated computations all over their generated code and this will make
short work of them.&lt;/p&gt;

&lt;p&gt;Sometimes, though, your computations are spread across control flow—over
multiple basic blocks. What do you do then?&lt;/p&gt;

&lt;!--
## Equivalence classes
--&gt;

&lt;h2 id=&quot;global-value-numbering&quot;&gt;Global value numbering&lt;/h2&gt;

&lt;p&gt;Computing value numbers for an entire function is called &lt;em&gt;global value
numbering&lt;/em&gt; (GVN), and it requires dealing with control flow (ifs, loops, etc.). I
don’t just mean that for an entire function, we run local value numbering
block-by-block. Global value numbering implies that expressions can be
de-duplicated and shared across blocks.&lt;/p&gt;

&lt;p&gt;Let’s tackle control flow case by case.&lt;/p&gt;

&lt;p&gt;First is the simple case from above: one block. In this case, we can go top to
bottom with our value numbering and do alright.&lt;/p&gt;

&lt;figure&gt;
  &lt;object class=&quot;svg&quot; type=&quot;image/svg+xml&quot; data=&quot;/assets/img/gvn-one-block.svg&quot;&gt;&lt;/object&gt;
&lt;/figure&gt;

&lt;p&gt;The second case is also reasonable to handle: one block flowing into another. In this
case, we can still go top to bottom. We just have to find a way to iterate over
the blocks.&lt;/p&gt;

&lt;p&gt;If we’re not going to share value maps between blocks, the order doesn’t
matter. But since the point of global value numbering is to share values, we
have to iterate over them in topological order, such as reverse post-order (RPO). This
ensures that predecessors get visited before successors. If you have &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bb0 -&amp;gt;
bb1&lt;/code&gt;, we have to visit first &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bb0&lt;/code&gt; and then &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bb1&lt;/code&gt;.&lt;/p&gt;
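
&lt;p&gt;If you have not computed a reverse post-order before, one way to do it is a depth-first search that records each block after its successors and then reverses the list. The successor-dictionary CFG encoding and the function below are assumptions for this sketch, not an API from a real compiler.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;def reverse_post_order(successors, entry):
    visited = set()
    post_order = []

    def visit(block):
        visited.add(block)
        for succ in successors.get(block, []):
            if succ not in visited:
                visit(succ)
        # Emit a block only after its not-yet-visited successors.
        post_order.append(block)

    # Recursive for brevity; a very deep CFG would want an explicit stack.
    visit(entry)
    return post_order[::-1]


cfg = {"bb0": ["bb1"], "bb1": []}
order = reverse_post_order(cfg, "bb0")  # ['bb0', 'bb1']
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;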

&lt;p&gt;Because of how SSA works and how CFGs work, the second block can “look up” into
the first block and use the values from it. To get global value numbering
working, we have to copy &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bb0&lt;/code&gt;’s value map before we start processing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bb1&lt;/code&gt; so
we can re-use the instructions.&lt;/p&gt;

&lt;figure&gt;
  &lt;object class=&quot;svg&quot; type=&quot;image/svg+xml&quot; data=&quot;/assets/img/gvn-two-blocks.svg&quot;&gt;&lt;/object&gt;
&lt;/figure&gt;

&lt;p&gt;Maybe something like:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;value_map&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ValueMap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reverse_post_order&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;local_value_numbering&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value_map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then the expressions can accrue across blocks. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bb1&lt;/code&gt; can re-use the
already-computed &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Add v0, 1&lt;/code&gt; from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bb0&lt;/code&gt; because it is still in the map.&lt;/p&gt;

&lt;p&gt;…but this breaks as soon as you have control-flow splits. Consider a graph
with the following shape:&lt;/p&gt;

&lt;!--
digraph G {
  node [shape=square];
  A -&gt; B;
  A -&gt; C;
}
--&gt;
&lt;figure&gt;
  &lt;object class=&quot;svg&quot; type=&quot;image/svg+xml&quot; data=&quot;/assets/img/gvn-split.svg&quot;&gt;&lt;/object&gt;
&lt;/figure&gt;

&lt;p&gt;We’re going to iterate over that graph in one of two orders: A B C or A C B. In
either case, we end up adding expressions from one block (say, B) to the value
map even though they are not actually available to its sibling block (say,
C).&lt;/p&gt;

&lt;p&gt;When I say “not available”, I mean “would not have been computed before”. This
is because we execute either A then B or A then C. There’s no world in which we
execute B then C.&lt;/p&gt;

&lt;p&gt;But alright, look at a third case where there is such a world: a control-flow
join. In this diagram, we have two predecessor blocks, B and C, each flowing into
D. B &lt;em&gt;always&lt;/em&gt; flows into D and C &lt;em&gt;always&lt;/em&gt; flows into D.
So the iteration order is fine, right?&lt;/p&gt;

&lt;!--
digraph G {
  node [shape=square];
  A -&gt; B;
  A -&gt; C;
  B -&gt; D;
  C -&gt; D;
}
--&gt;
&lt;figure&gt;
  &lt;object class=&quot;svg&quot; type=&quot;image/svg+xml&quot; data=&quot;/assets/img/gvn-join.svg&quot;&gt;&lt;/object&gt;
&lt;/figure&gt;

&lt;p&gt;Well, still no. We have the same sibling problem as before. B and C still can’t
share value maps.&lt;/p&gt;

&lt;p&gt;We also have a weird question when we enter D: where did we come from? If we
came from B, we can re-use expressions from B. If we came from C, we can re-use
expressions from C. But we cannot in general know which predecessor block we
came from.&lt;/p&gt;

&lt;p&gt;The only block we know &lt;em&gt;for sure&lt;/em&gt; that we executed before D is A. This means we
can re-use A’s value map in D because we can guarantee that all execution paths
that enter D have previously gone through A.&lt;/p&gt;

&lt;p&gt;This relationship is called a &lt;em&gt;dominator&lt;/em&gt; relationship and this is the key to
one style of global value numbering that we’re going to talk about in this
post. A block can always use the value map from any other block that dominates
it. For completeness’ sake, in the diamond diagram, A dominates each of B and
C, too.&lt;/p&gt;
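
&lt;p&gt;To make the diamond concrete, here is a minimal sketch that computes the set
of dominators for each block using the textbook set-based iterative
formulation. It is an illustration only, not a tuned algorithm: the function
name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dominators&lt;/code&gt; and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;preds&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rpo&lt;/code&gt; inputs are made up for this
example, and it assumes every block other than the entry is reachable.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;def dominators(preds, rpo):
    # Textbook iterative formulation:
    #   dom(entry) = {entry}
    #   dom(b) = {b} union (intersection of dom(p) for each predecessor p)
    # Assumes every non-entry block has at least one predecessor.
    entry = rpo[0]
    dom = {block: set(rpo) for block in rpo}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for block in rpo[1:]:
            new = set.intersection(*(dom[p] for p in preds[block])) | {block}
            if new != dom[block]:
                dom[block] = new
                changed = True
    return dom


# The diamond from the diagram above.
A, B, C, D = range(4)
preds = {A: [], B: [A], C: [A], D: [B, C]}
dom = dominators(preds, [A, B, C, D])
# dom[D] is {A, D}: apart from D itself, only A dominates D.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;B and C drop out of D’s set because neither is on &lt;em&gt;every&lt;/em&gt; path into D,
which matches the “where did we come from?” problem above.&lt;/p&gt;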

&lt;p&gt;We can compute dominators a couple of ways&lt;sup id=&quot;fnref:compute-doms&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:compute-doms&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;, but that’s a little
bit out of scope for this blog post. If we assume that we have dominator
information available in our CFG, we can use that for global value numbering.
And that’s just what—you guessed it—Maxine VM does.&lt;/p&gt;

&lt;p&gt;It iterates over all blocks in reverse post-order, doing local value numbering,
threading through value maps from dominator blocks. In this case, their method
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dominator&lt;/code&gt; gets the &lt;em&gt;immediate dominator&lt;/em&gt;: the “closest” dominator block of
all the blocks that dominate the current one.&lt;/p&gt;

&lt;div class=&quot;language-java highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;GlobalValueNumberer&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;HashMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;BlockBegin&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ValueMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;valueMaps&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;InstructionSubstituter&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;subst&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;nc&quot;&gt;ValueMap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currentMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;GlobalValueNumberer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;IR&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ir&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;subst&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;InstructionSubstituter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ir&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;// reverse post-order&lt;/span&gt;
        &lt;span class=&quot;nc&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;BlockBegin&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;blocks&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ir&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;linearScanOrder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;valueMaps&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;HashMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;BlockBegin&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ValueMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;blocks&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;optimize&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;blocks&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;subst&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;finish&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;BlockBegin&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;blocks&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;numBlocks&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;blocks&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
        &lt;span class=&quot;nc&quot;&gt;BlockBegin&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;startBlock&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;blocks&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;// initial value map, with nesting 0&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;valueMaps&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;put&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;startBlock&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ValueMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;numBlocks&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;c1&quot;&gt;// iterate through all the blocks&lt;/span&gt;
            &lt;span class=&quot;nc&quot;&gt;BlockBegin&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;blocks&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
            &lt;span class=&quot;nc&quot;&gt;BlockBegin&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dominator&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;dominator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;

            &lt;span class=&quot;c1&quot;&gt;// create new value map with increased nesting&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;currentMap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ValueMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;valueMaps&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dominator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;

            &lt;span class=&quot;c1&quot;&gt;// &amp;lt;&amp;lt; INSERT LOCAL VALUE NUMBERING HERE &amp;gt;&amp;gt;&lt;/span&gt;

            &lt;span class=&quot;c1&quot;&gt;// remember value map for successors&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;valueMaps&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;put&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currentMap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And that’s it! That’s the core of Maxine’s &lt;a href=&quot;https://github.com/beehive-lab/Maxine-VM/blob/e213a842f78983e2ba112ae46de8c64317bc206e/com.sun.c1x/src/com/sun/c1x/opt/GlobalValueNumberer.java&quot;&gt;GVN implementation&lt;/a&gt;. I
love how short it is. For not very much code, you can remove a lot of duplicate
pure SSA instructions.&lt;/p&gt;
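
&lt;p&gt;For comparison, here is roughly the same loop as a Python sketch with the
local value numbering filled in. This is not Maxine’s code: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Op&lt;/code&gt; enum, the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(result, op, operands)&lt;/code&gt; instruction triples, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;idom&lt;/code&gt; dictionary are all
hypothetical stand-ins, values are plain integers, and every instruction is
assumed to be pure.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;from enum import Enum, auto


class Op(Enum):
    ADD = auto()
    MUL = auto()


def global_value_numbering(rpo, idom, blocks):
    # rpo: block ids in reverse post-order, entry block first.
    # idom: maps a block id to its immediate dominator (None for the entry).
    # blocks: maps a block id to a list of (result, op, operands) triples.
    value_maps = {}    # block id to {(op, operands): result}
    replacements = {}  # redundant result to the earlier equivalent result
    for block in rpo:
        parent = idom[block]
        # Start from a copy of the immediate dominator map, mirroring
        # new ValueMap(valueMaps.get(dominator)) in the Java snippet.
        current = dict(value_maps[parent]) if parent is not None else {}
        for result, op, operands in blocks[block]:
            # Rewrite operands first so that chains of duplicates collapse.
            key = (op, tuple(replacements.get(v, v) for v in operands))
            if key in current:
                replacements[result] = current[key]
            else:
                current[key] = result
        value_maps[block] = current
    return replacements


# The diamond again. Values 0 and 1 are parameters x and y.
A, B, C, D = range(4)
x, y = 0, 1
blocks = {
    A: [(2, Op.ADD, (x, y))],
    B: [(3, Op.ADD, (x, y)), (4, Op.MUL, (3, y))],
    C: [(5, Op.MUL, (2, y))],
    D: [(6, Op.ADD, (x, y)), (7, Op.MUL, (6, y))],
}
idom = {A: None, B: A, C: A, D: A}
replacements = global_value_numbering([A, B, C, D], idom, blocks)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In this example the two repeated additions (values 3 and 6) are replaced by
value 2 from A. The multiplications in B, C, and D all compute the same thing
after rewriting, but none of them is replaced: they live in sibling blocks, and
D only inherits A’s map.&lt;/p&gt;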

&lt;p&gt;This does still work with loops, but with some caveats. From p7 of &lt;a href=&quot;/assets/img/briggs-gvn.pdf&quot;&gt;Briggs GVN&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The φ-functions require special treatment. Before the compiler can analyze
the φ-functions in a block, it must previously have assigned value numbers to
all of the inputs. This is not possible in all cases; specifically, any
φ-function input whose value flows along a back edge (with respect to the
dominator tree) cannot have a value number. If any of the parameters of a
φ-function have not been assigned a value number, then the compiler cannot
analyze the φ-function, and it must assign a unique, new value number to the
result.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It also talks about eliminating useless phis, which is optional, but would
strengthen the global value numbering pass: it makes more information
transparent.&lt;/p&gt;

&lt;p&gt;But what if we want to handle impure instructions?&lt;/p&gt;

&lt;h2 id=&quot;state-management-and-invalidation&quot;&gt;State management and invalidation&lt;/h2&gt;

&lt;p&gt;Languages such as Java allow for reading fields from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;this&lt;/code&gt;/&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;self&lt;/code&gt; object within
methods as if the field were a variable name. This makes code like the
following common:&lt;/p&gt;

&lt;div class=&quot;language-java highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;CPU&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;exec_adc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result_int&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;regA&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fetched_data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;flagCARRY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result_int&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
        &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result_int&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;^&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;regA&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result_int&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;^&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fetched_data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;regA&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Each of these references to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;regA&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fetched_data&lt;/code&gt; is an implicit reference
to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;this.regA&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;this.fetched_data&lt;/code&gt;, which is semantically a field load off
an object. You can see it in &lt;a href=&quot;https://godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(filename:&apos;1&apos;,fontScale:14,fontUsePx:&apos;0&apos;,j:1,lang:java,selection:(endColumn:19,endLineNumber:14,positionColumn:19,positionLineNumber:14,selectionStartColumn:19,selectionStartLineNumber:14,startColumn:19,startLineNumber:14),source:&apos;class+CPU+%7B%0A++++private+void+exec_adc()+%7B%0A++++++++int+result_int+%3D+regA+%2B+fetched_data+%2B+flagCARRY%3B%0A++++++++byte+result+%3D+(byte)+result_int%3B%0A++++++++//+...%0A++++++++int+a+%3D+result_int+%5E+regA%3B%0A++++++++int+b+%3D+result_int+%5E+fetched_data%3B%0A++++++++//+...%0A++++++++regA+%3D+result%3B%0A++++%7D%0A%0A++++int+regA%3B%0A++++int+fetched_data%3B%0A++++int+flagCARRY%3B%0A%7D%0A&apos;),l:&apos;5&apos;,n:&apos;0&apos;,o:&apos;Java+source+%231&apos;,t:&apos;0&apos;)),k:50,l:&apos;4&apos;,n:&apos;0&apos;,o:&apos;&apos;,s:0,t:&apos;0&apos;),(g:!((h:compiler,i:(compiler:java2501,filters:(b:&apos;0&apos;,binary:&apos;1&apos;,binaryObject:&apos;1&apos;,commentOnly:&apos;0&apos;,debugCalls:&apos;1&apos;,demangle:&apos;0&apos;,directives:&apos;0&apos;,execute:&apos;1&apos;,intel:&apos;0&apos;,libraryCode:&apos;0&apos;,trim:&apos;1&apos;,verboseDemangling:&apos;0&apos;),flagsViewOpen:&apos;1&apos;,fontScale:14,fontUsePx:&apos;0&apos;,j:1,lang:java,libs:!(),options:&apos;&apos;,overrides:!(),selection:(endColumn:19,endLineNumber:40,positionColumn:1,positionLineNumber:1,selectionStartColumn:19,selectionStartLineNumber:40,startColumn:1,startLineNumber:1),source:1),l:&apos;5&apos;,n:&apos;0&apos;,o:&apos;+jdk+25.0.1+(Editor+%231)&apos;,t:&apos;0&apos;)),k:50,l:&apos;4&apos;,n:&apos;0&apos;,o:&apos;&apos;,s:0,t:&apos;0&apos;)),l:&apos;2&apos;,n:&apos;0&apos;,o:&apos;&apos;,t:&apos;0&apos;)),version:4&quot;&gt;the bytecode&lt;/a&gt; (thanks, Matt Godbolt):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;class CPU {
  int regA;

  int fetched_data;

  int flagCARRY;

  CPU();
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object.&quot;&amp;lt;init&amp;gt;&quot;:()V
         4: return


  private void exec_adc();
         0: aload_0
         1: getfield      #7                  // Field regA:I
         4: aload_0
         // ...
        20: getfield      #7                  // Field regA:I
        23: ixor
        24: istore_3
        25: iload_1
        26: aload_0
        27: getfield      #13                 // Field fetched_data:I
        30: ixor
        31: istore        4
        33: aload_0
        34: iload_2
        35: putfield      #7                  // Field regA:I
        38: return
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;When straightforwardly building an SSA IR from the JVM bytecode for this
method, you will end up with a bunch of IR that looks like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;v0 = LoadField self, :regA
v1 = LoadField self, :fetched_data
v2 = LoadField self, :flagCARRY
v3 = IntAdd v0, v1
v4 = IntAdd v3, v2
// ...
v7 = LoadField self, :regA
v8 = IntXor v4, v7
v9 = LoadField self, :fetched_data
v10 = IntXor v4, v9
// ...
StoreField self, :regA, ...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Pretty much the same as the bytecode. Even though no code in the middle could
modify the field &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;regA&lt;/code&gt; (which would require a re-load), we still have a
duplicate load. Bummer.&lt;/p&gt;

&lt;p&gt;I don’t want to re-hash this too much but it’s possible to fold &lt;a href=&quot;/blog/toy-load-store/&quot;&gt;Load and store
forwarding&lt;/a&gt; into your GVN implementation by either:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;doing load-store forwarding as part of local value numbering and clearing
memory information from the value map at the end of each block, or&lt;/li&gt;
  &lt;li&gt;keeping track of effects across blocks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;See, there’s nothing fundamentally stopping you from tracking the state of your
heap at compile-time across blocks. You just have to do a little more
bookkeeping. In our dominator-based GVN implementation, for example, you can:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;track heap write effects for each block&lt;/li&gt;
  &lt;li&gt;at the start of each block B, union the “kill” sets of every block on a
path from B’s immediate dominator to B&lt;/li&gt;
  &lt;li&gt;finally, remove the stuff that got killed from the dominator’s value map&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Not so bad.&lt;/p&gt;
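
&lt;p&gt;The three steps above can be sketched as follows. This is one possible
reading, not a reference implementation: it assumes the per-block kill sets
have already been computed, that loads are kept in their own map keyed by
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(receiver, field)&lt;/code&gt;, and that a write to a field kills every load of that field
regardless of receiver. The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;killed_on_entry&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;inherit_loads&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Field&lt;/code&gt;
are hypothetical.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;from enum import Enum, auto


class Field(Enum):
    REG_A = auto()
    FETCHED_DATA = auto()


def killed_on_entry(block, idom, preds, kills):
    # Step 2: union the kill sets of every block that can run between the
    # immediate dominator and `block`. Walk predecessors backwards and stop
    # at the dominator, whose own writes its value map already accounts for.
    stop = idom[block]
    killed = set()
    seen = set()
    worklist = list(preds[block])
    while worklist:
        current = worklist.pop()
        if current == stop or current in seen:
            continue
        seen.add(current)
        killed |= kills[current]
        worklist.extend(preds[current])
    return killed


def inherit_loads(dominator_loads, killed):
    # Step 3: keep only the loads whose field could not have been written.
    return {
        key: value
        for key, value in dominator_loads.items()
        if key[1] not in killed
    }


# Step 1 is assumed done: in the diamond, only C writes regA.
A, B, C, D = range(4)
SELF = 0
idom = {A: None, B: A, C: A, D: A}
preds = {A: [], B: [A], C: [A], D: [B, C]}
kills = {A: set(), B: set(), C: {Field.REG_A}, D: set()}
loads_after_A = {(SELF, Field.REG_A): 1, (SELF, Field.FETCHED_DATA): 2}

loads_into_B = inherit_loads(loads_after_A, killed_on_entry(B, idom, preds, kills))
loads_into_D = inherit_loads(loads_after_A, killed_on_entry(D, idom, preds, kills))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;B inherits both loads from A untouched, while D loses the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;regA&lt;/code&gt; load because C
might have run in between. With loops, the same backwards walk from a loop
header reaches the loop body (and the header itself) through the back edge, so
writes inside the loop are counted too.&lt;/p&gt;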

&lt;p&gt;Maxine doesn’t do global memory tracking, but they do a limited form of
load-store forwarding while building their HIR from bytecode: see
&lt;a href=&quot;https://github.com/beehive-lab/Maxine-VM/blob/e213a842f78983e2ba112ae46de8c64317bc206e/com.sun.c1x/src/com/sun/c1x/graph/GraphBuilder.java#L871&quot;&gt;GraphBuilder&lt;/a&gt; which uses the &lt;a href=&quot;https://github.com/beehive-lab/Maxine-VM/blob/e213a842f78983e2ba112ae46de8c64317bc206e/com.sun.c1x/src/com/sun/c1x/graph/MemoryMap.java&quot;&gt;MemoryMap&lt;/a&gt; to help track this stuff. At least
they would not have the same duplicate &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LoadField&lt;/code&gt; instructions in the example
above!&lt;/p&gt;

&lt;p&gt;We’ve now looked at one kind of value numbering and one implementation of it.
What else is out there?&lt;/p&gt;

&lt;h2 id=&quot;out-in-the-world&quot;&gt;Out in the world&lt;/h2&gt;

&lt;p&gt;Apparently, you can get better results by having a unified hash table (p9 of
&lt;a href=&quot;/assets/img/briggs-gvn.pdf&quot;&gt;Briggs GVN&lt;/a&gt;) of expressions, not limiting the
value map to dominator-available expressions. Not 100% on how this works yet.
&lt;!-- TODO What do you do in the second pass for available expressions? --&gt;
They note:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Using a unified hash-table has one important algorithmic consequence.
Replacements cannot be performed on-line because the table no longer reflects
availability.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Which is the first time that it occurred to me that hash-based value numbering
with dominators was an approximation of available expression analysis.&lt;/p&gt;

&lt;p&gt;There’s also a totally different kind of value numbering called value
partitioning (p12 of &lt;a href=&quot;/assets/img/briggs-gvn.pdf&quot;&gt;Briggs GVN&lt;/a&gt;). See also a nice
blog post about this by Allen Wang from the &lt;a href=&quot;https://www.cs.cornell.edu/courses/cs6120/2025sp/blog/global-value-numbering/&quot;&gt;Cornell compiler
course&lt;/a&gt;.
I think this mostly replaces the hashing bit, and you still need some other
thing for the available expressions bit.&lt;/p&gt;

&lt;p&gt;Ben Titzer and Seth Goldstein have some good &lt;a href=&quot;https://www.cs.cmu.edu/~411/slides/s25-24-gvn-inlining.pdf&quot;&gt;slides from
CMU&lt;/a&gt;, where they
talk about the worklist dataflow approach. Apparently this is slower but gets
you more available expressions than just looking to dominator blocks. I wonder
how much it differs from dominator+unified hash table.&lt;/p&gt;

&lt;p&gt;While Maxine uses hash table cloning to copy value maps from dominator blocks,
there are also compilers such as Cranelift that use
&lt;a href=&quot;https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/codegen/src/scoped_hash_map.rs&quot;&gt;scoped hash maps&lt;/a&gt;
to track this information more efficiently. (Though &lt;a href=&quot;https://github.com/bytecodealliance/wasmtime/issues/4371#issuecomment-1255956651&quot;&gt;Amanieu
notes&lt;/a&gt; that you may
not need a scoped hash map and instead can tag values in your value map with the
block they came from, ignoring non-dominating values with a quick check. The
dominance check makes sense but I haven’t internalized how this affects the set
of available expressions yet.)&lt;/p&gt;
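
&lt;p&gt;To show the idea behind scoping (as opposed to cloning), here is a small
sketch that uses a stack of dictionaries and a depth-first walk over the
dominator tree. It is not Cranelift’s data structure, which is implemented
differently; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ScopedMap&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;number_block&lt;/code&gt;, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;children&lt;/code&gt; dictionary are
hypothetical names, and operand rewriting is left out to keep it short.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;from enum import Enum, auto


class Op(Enum):
    ADD = auto()
    MUL = auto()


class ScopedMap:
    # A stack of dicts. Lookups search from the innermost scope outward,
    # and leaving a scope drops everything that was inserted inside it.
    def __init__(self):
        self.scopes = [{}]

    def push_scope(self):
        self.scopes.append({})

    def pop_scope(self):
        self.scopes.pop()

    def get(self, key):
        for scope in reversed(self.scopes):
            if key in scope:
                return scope[key]
        return None

    def insert(self, key, value):
        self.scopes[-1][key] = value


def number_block(block, children, blocks, table, replacements):
    # Visit the dominator tree depth-first: one scope per block, so a block
    # sees exactly the expressions from the blocks that dominate it.
    table.push_scope()
    for result, key in blocks[block]:
        earlier = table.get(key)
        if earlier is None:
            table.insert(key, result)
        else:
            replacements[result] = earlier
    for child in children[block]:
        number_block(child, children, blocks, table, replacements)
    table.pop_scope()


# A dominates B and C. Values 0 and 1 are parameters.
A, B, C = range(3)
children = {A: [B, C], B: [], C: []}
blocks = {
    A: [(2, (Op.ADD, 0, 1))],
    B: [(3, (Op.ADD, 0, 1)), (4, (Op.MUL, 0, 1))],
    C: [(5, (Op.MUL, 0, 1))],
}
replacements = {}
number_block(A, children, blocks, ScopedMap(), replacements)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The addition in B is replaced by the one from A, but the multiplication in C
is not replaced by the one from B, because B’s scope has already been popped by
the time C is visited. Nothing is copied; the trade-off in this naive version
is that a lookup may search several scopes.&lt;/p&gt;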

&lt;p&gt;You may be wondering if this kind of algorithm even helps at all in a dynamic
language JIT context. Surely everything is too dynamic, right? Actually, no!
The JIT hopes to eliminate a lot of method calls and dynamic behaviors,
replacing them with guards, assumptions, and simpler operations. These strength
reductions often leave behind a lot of repeated instructions. Just the other
day, Kokubun filed a &lt;a href=&quot;https://github.com/ruby/ruby/pull/16654&quot;&gt;value-numbering-like
PR&lt;/a&gt; to clean up some of the waste.&lt;/p&gt;

&lt;p&gt;ART has a recent &lt;a href=&quot;https://android-developers.googleblog.com/2025/12/18-faster-compiles-0-compromises.html&quot;&gt;blog
post&lt;/a&gt;
about speeding up GVN.&lt;/p&gt;

&lt;h3 id=&quot;implementations&quot;&gt;Implementations&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://android.googlesource.com/platform/art/+/refs/heads/main/compiler/optimizing/gvn.cc&quot;&gt;ART&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/tekknolagi/v8/blob/f030838700a83cde6992cb8ebcb3facc6a8fc1f1/src/crankshaft/hydrogen-gvn.cc&quot;&gt;V8 Hydrogen&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/facebook/hhvm/blob/1a885fae7421c759d70a8ed85aab1defcf5cc68f/hphp/runtime/vm/jit/gvn.cpp&quot;&gt;HHVM&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/openjdk/jdk/blob/f21e47db805b56d5bf183d7a2cfba076f380612a/src/hotspot/share/c1/c1_ValueMap.cpp#L517&quot;&gt;HotSpot C1&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;wrapping-up-bits-and-bobbles&quot;&gt;Wrapping up; bits and bobbles&lt;/h2&gt;

&lt;p&gt;Go forth and give your values more numbers.&lt;/p&gt;

&lt;p&gt;There’s been an ongoing discussion with Phil Zucker on SSI, GVN, acyclic
e-graphs, and scoped union-find.&lt;/p&gt;

&lt;h3 id=&quot;acyclic-e-graphs&quot;&gt;Acyclic e-graphs&lt;/h3&gt;

&lt;p&gt;Commutativity; canonicalization&lt;/p&gt;

&lt;p&gt;Seeding alternative representations into the GVN&lt;/p&gt;

&lt;p&gt;Aegraphs and union-find during GVN&lt;/p&gt;

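&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/bytecodealliance/rfcs/blob/main/accepted/cranelift-egraph.md&quot;&gt;https://github.com/bytecodealliance/rfcs/blob/main/accepted/cranelift-egraph.md&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/bytecodealliance/wasmtime/issues/9049&quot;&gt;https://github.com/bytecodealliance/wasmtime/issues/9049&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/bytecodealliance/wasmtime/issues/4371&quot;&gt;https://github.com/bytecodealliance/wasmtime/issues/4371&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;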

&lt;h3 id=&quot;partial-redundancy-elimination&quot;&gt;Partial redundancy elimination&lt;/h3&gt;
&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:cinder&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Writing this post is roughly the time when I realized that the whole
time I was wondering why Cinder did not use union-find for rewriting, it
actually did! Optimizing instruction &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;X = A + 0&lt;/code&gt; by replacing with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;X =
Assign A&lt;/code&gt; followed by copy propagation is equivalent to union-find. &lt;a href=&quot;#fnref:cinder&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:heap-ssa&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;In some forms of SSA, like heap-array SSA or sea of nodes, it’s
possible to more easily de-duplicate loads because the memory
representation has been folded into (modeled in) the IR. &lt;a href=&quot;#fnref:heap-ssa&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:order&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The order is a little more complicated than that: &lt;a href=&quot;https://stackoverflow.com/questions/36131500/what-is-the-reverse-postorder&quot;&gt;reverse
post-order&lt;/a&gt;
(RPO). And there’s a paper called “A Simple Algorithm for Global Data Flow
Analysis Problems” that I don’t yet have a PDF for that claims that RPO is
optimal for solving dataflow problems. &lt;a href=&quot;#fnref:order&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:compute-doms&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;There’s the iterative dataflow way (described in the &lt;a href=&quot;/assets/img/dominators-engineered.pdf&quot;&gt;Cooper
paper&lt;/a&gt; (PDF)),
&lt;a href=&quot;/assets/img/dominators-lengauer-tarjan.pdf&quot;&gt;Lengauer-Tarjan&lt;/a&gt; (PDF), the
&lt;a href=&quot;/assets/img/dominators-engineered.pdf&quot;&gt;Engineered Algorithm&lt;/a&gt; (PDF),
&lt;a href=&quot;/assets/img/dominators-practice.pdf&quot;&gt;hybrid/Semi-NCA approach&lt;/a&gt; (PDF), … &lt;a href=&quot;#fnref:compute-doms&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
            <pubDate>Sat, 04 Apr 2026 00:00:00 +0000</pubDate>
            <niceDate>April 4, 2026</niceDate>
            <link>https://bernsteinbear.com/blog/value-numbering/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/value-numbering/</guid>
        </item>
        
        <item>
            <title>Using Perfetto in ZJIT</title>
            <description>&lt;p&gt;&lt;em&gt;Originally published on &lt;a href=&quot;https://railsatscale.com/2026-03-27-using-perfetto-in-zjit/&quot;&gt;Rails At Scale&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Look! A trace of slow events in a benchmark! Hover over the image to see it get bigger.&lt;/p&gt;

&lt;style&gt;
img {
    max-width: 100%;
}
img:hover {
  transform: scale(2);
  transition: transform 0.1s ease-in;
}
img:not(:hover) {
  transition: transform 0.1s ease-out;
}
&lt;/style&gt;

&lt;figure&gt;

  &lt;p&gt;&lt;img src=&quot;/assets/img/perfetto-demo.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

  &lt;figcaption&gt;
A sneak preview of what the trace looks like.
&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Now read on to see what the slow events are and how we got this pretty picture.&lt;/p&gt;

&lt;h2 id=&quot;the-rules&quot;&gt;The rules&lt;/h2&gt;

&lt;p&gt;The first rule of just-in-time compilers is: you stay in JIT code. The second
rule of JIT is: you STAY in JIT code!&lt;/p&gt;

&lt;p&gt;When control leaves the compiled code to run in the interpreter—what the ZJIT
team calls either a “side-exit” or a “deopt”, depending on who you talk
to—things slow down. In a well-tuned system, this should happen pretty
rarely. Right now, because we’re still bringing up the compiler and runtime
system, it happens more than we would like.&lt;/p&gt;

&lt;p&gt;We’re reducing the number of exits over time.&lt;/p&gt;

&lt;h2 id=&quot;lies-damned-lies-and-statistics&quot;&gt;Lies, damned lies, and statistics&lt;/h2&gt;

&lt;p&gt;We can track our side-exit reduction progress with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--zjit-stats&lt;/code&gt;, which,
on process exit, prints out a tidy summary of the counters for all of the bad
stuff we track. It’s got side-exits. It’s got calls to C code. It’s got calls
to slow-path runtime helpers. It’s got everything.&lt;/p&gt;

&lt;p&gt;Here is a chopped-up sample of stats output for the Lobsters benchmark,
which is a large Rails app:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ WARMUP_ITRS=0 MIN_BENCH_ITRS=20 MIN_BENCH_TIME=0 ruby --zjit-stats benchmarks/lobsters/benchmark.rb
...
***ZJIT: Printing ZJIT statistics on exit***
...
Top-20 side exit reasons (100.0% of total 12,549,876):
                   guard_type_failure: 6,020,734 (48.0%)
                  guard_shape_failure: 5,556,147 (44.3%)
  block_param_proxy_not_iseq_or_ifunc:   445,358 ( 3.5%)
                   unhandled_hir_insn:   215,168 ( 1.7%)
                        compile_error:   181,474 ( 1.4%)
...
compiled_iseq_count:                               5,581
failed_iseq_count:                                     2
compile_time:                                    1,443ms
...
guard_type_count:                            133,425,094
guard_type_exit_ratio:                              4.5%
guard_shape_count:                            49,386,694
guard_shape_exit_ratio:                            11.3%
...
code_region_bytes:                            31,571,968
side_exit_size_ratio:                              33.1%
zjit_alloc_bytes:                             19,329,659
total_mem_bytes:                              50,901,627
...
ratio_in_zjit:                                     82.8%
$
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;(I’ve cut out significant chunks of the stats output and replaced them with
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;...&lt;/code&gt; because it’s overwhelming the first time you see it.)&lt;/p&gt;

&lt;p&gt;The first thing you might note is that the thing I just described as terrible
for performance is happening &lt;em&gt;over twelve million times&lt;/em&gt;. The second thing you
might notice is that despite this, we seem to be staying in JIT code a high
percentage of the time. Or are we? Is 80% high? Is a 4.5% class guard miss
ratio high? What about 11% for shapes? It’s hard to say.&lt;/p&gt;
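
&lt;p&gt;As a rough sanity check, the two exit ratios line up with dividing each
side-exit count by the matching guard count. That relationship is an assumption
inferred from the numbers above, not something the stats output states, and the
names in this sketch are made up for illustration:&lt;/p&gt;

```python
# Counts copied from the --zjit-stats sample above.
guard_type_exits = 6_020_734
guard_type_count = 133_425_094
guard_shape_exits = 5_556_147
guard_shape_count = 49_386_694


def exit_ratio(exits, total):
    """Return exits as a percentage of total (assumed definition)."""
    return 100.0 * exits / total


type_ratio = exit_ratio(guard_type_exits, guard_type_count)
shape_ratio = exit_ratio(guard_shape_exits, guard_shape_count)
print(f"guard_type_exit_ratio:  {type_ratio:.1f}%")
print(f"guard_shape_exit_ratio: {shape_ratio:.1f}%")
```

&lt;p&gt;Both values round to the 4.5% and 11.3% shown in the stats, so under that
assumption roughly one in nine shape guards fails.&lt;/p&gt;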

&lt;p&gt;The counters are great because they’re &lt;em&gt;quick&lt;/em&gt; and they’re reasonably stable
proxies for performance. There’s no substitute for painstaking measurements on
a quiet machine but if the counter for Bad Slow Thing goes down (and others do
not go up), we’re probably doing a good job.&lt;/p&gt;

&lt;p&gt;But they’re not great for building intuition. For intuition, we want more
tangible-feeling numbers. We want to see things.&lt;/p&gt;

&lt;h2 id=&quot;building-intuition&quot;&gt;Building intuition&lt;/h2&gt;

&lt;p&gt;The third thing you might do is ask yourself, “Self, where are these exits
coming from?” Unfortunately, counters cannot tell you that. For that, we
want stack traces. They tell us where in the guest (Ruby) code an exit is
triggered.&lt;/p&gt;

&lt;p&gt;Ideally, we would also want some notion of time: we would want to know not just
where these events happen but also when. Are the exits happening early, at
application boot? At warmup? Even during what should be steady state
application time? Hard to say.&lt;/p&gt;

&lt;p&gt;So we need more tools. Thankfully, &lt;a href=&quot;https://perfetto.dev/&quot;&gt;Perfetto&lt;/a&gt; exists.
Perfetto is a system for visualizing and analyzing traces and profiles that your
application generates. It has both a web UI and a command-line UI.&lt;/p&gt;

&lt;p&gt;We can emit traces for Perfetto and visualize them there.&lt;/p&gt;

&lt;h2 id=&quot;a-look-at-perfetto&quot;&gt;A look at Perfetto&lt;/h2&gt;

&lt;p&gt;Take a look at this &lt;a href=&quot;https://ui.perfetto.dev/#!/?url=https://bernsteinbear.com/assets/misc/perfetto-36885.fxt&quot;&gt;sample ZJIT Perfetto
trace&lt;/a&gt;
generated by running Ruby with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--zjit-trace-exits&lt;/code&gt;&lt;sup id=&quot;fnref:sampled&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:sampled&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. What do you see?&lt;/p&gt;

&lt;p&gt;I see a couple arrows on the left. Arrows indicate “instant” point-in-time
events. Then I see a mess of purple to the right of that until the end of the
trace.&lt;/p&gt;

&lt;p&gt;Hover over an arrow. Find out that each arrow is a side-exit. Scream silently.&lt;/p&gt;

&lt;p&gt;But it’s a friendly arrow. It tells you what the side-exit reason is. If you
click it, it even tells you the stack trace in the pop-up panel on the bottom.
If we click a couple of them, maybe we can learn more.&lt;/p&gt;

&lt;p&gt;We can also zoom by mousing over the track, holding Ctrl, and scrolling. That
will get us a closer look. But there are so many…&lt;/p&gt;

&lt;p&gt;Fortunately, Perfetto also provides a SQL interface to the traces. We can write
a query to aggregate all of the side exit events from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;slice&lt;/code&gt; table and
line them up with the topmost method from the backtrace arguments in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;args&lt;/code&gt;
table:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reason&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;display_value&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;method&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;COUNT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;count&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;slice&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg_set_id&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg_set_id&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AND&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;0&apos;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;GROUP&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;display_value&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;ORDER&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESC&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This pulls up a query box at the bottom showing us that there are a couple big
hotspots:&lt;/p&gt;

&lt;figure&gt;

  &lt;p&gt;&lt;img src=&quot;/assets/img/perfetto-method-query.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

  &lt;figcaption&gt;
Query results showing in columns left to right: reason for side-exit, method
that exited, and count. The top three are above 1k but it quickly falls off
after that.
&lt;/figcaption&gt;

&lt;/figure&gt;

&lt;p&gt;It even has a helpful option to export the results as a Markdown table so I can
paste (an edited version) into this blog post:&lt;/p&gt;

&lt;div style=&quot;overflow-x: auto; font-size: 0.75em; margin-left: max(-10em, calc(-50vw + 50%)); margin-right: max(-10em, calc(-50vw + 50%));&quot;&gt;

  &lt;table&gt;
    &lt;thead&gt;
      &lt;tr&gt;
        &lt;th&gt;reason&lt;/th&gt;
        &lt;th&gt;method&lt;/th&gt;
        &lt;th&gt;count&lt;/th&gt;
      &lt;/tr&gt;
    &lt;/thead&gt;
    &lt;tbody&gt;
      &lt;tr&gt;
        &lt;td&gt;GuardShape(ShapeId(2475))&lt;/td&gt;
        &lt;td&gt;ActiveModel::AttributeRegistration::ClassMethods#attribute_types&lt;/td&gt;
        &lt;td&gt;5119&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;GuardShape(ShapeId(2099268))&lt;/td&gt;
        &lt;td&gt;ActiveRecord::ConnectionAdapters::AbstractAdapter#extended_type_map_key&lt;/td&gt;
        &lt;td&gt;2295&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;GuardType(FalseClass)&lt;/td&gt;
        &lt;td&gt;ActiveModel::Type::Value#cast&lt;/td&gt;
        &lt;td&gt;1025&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;GuardShape(ShapeId(2099698))&lt;/td&gt;
        &lt;td&gt;ActiveRecord::Associations#association_instance_get&lt;/td&gt;
        &lt;td&gt;904&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;BlockParamProxyNotIseqOrIfunc&lt;/td&gt;
        &lt;td&gt;ActiveRecord::AttributeMethods::Read#_read_attribute&lt;/td&gt;
        &lt;td&gt;902&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;GuardShape(ShapeId(526450))&lt;/td&gt;
        &lt;td&gt;Rack::Request::Env#get_header&lt;/td&gt;
        &lt;td&gt;636&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;GuardType(Class[class_exact*:Class@VALUE(0x128c60100)])&lt;/td&gt;
        &lt;td&gt;ActiveRecord::Base._reflections&lt;/td&gt;
        &lt;td&gt;622&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;GuardType(ObjectSubclass[class_exact:Story])&lt;/td&gt;
        &lt;td&gt;ActiveRecord::Associations#association&lt;/td&gt;
        &lt;td&gt;565&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;GuardShape(ShapeId(2098982))&lt;/td&gt;
        &lt;td&gt;ActiveRecord::Reflection::AssociationReflection#polymorphic?&lt;/td&gt;
        &lt;td&gt;510&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;GuardType(StringSubclass[class_exact:ActiveSupport::SafeBuffer])&lt;/td&gt;
        &lt;td&gt;ActionView::OutputBuffer#&amp;lt;&amp;lt;&lt;/td&gt;
        &lt;td&gt;500&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;GuardShape(ShapeId(2475))&lt;/td&gt;
        &lt;td&gt;ActiveRecord::AttributeMethods::PrimaryKey::ClassMethods#primary_key&lt;/td&gt;
        &lt;td&gt;492&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;GuardType(ObjectSubclass[class_exact:ActiveModel::Type::String])&lt;/td&gt;
        &lt;td&gt;ActiveModel::Type::Value#deserialize&lt;/td&gt;
        &lt;td&gt;442&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;GuardShape(ShapeId(2098982))&lt;/td&gt;
        &lt;td&gt;ActiveRecord::Reflection::AssociationReflection#deprecated?&lt;/td&gt;
        &lt;td&gt;376&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;GuardType(ObjectSubclass[class_exact:Bundler::Dependency])&lt;/td&gt;
        &lt;td&gt;Gem::Dependency#matches_spec?&lt;/td&gt;
        &lt;td&gt;355&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;UnhandledHIRInvokeBuiltin&lt;/td&gt;
        &lt;td&gt;Time#initialize&lt;/td&gt;
        &lt;td&gt;346&lt;/td&gt;
      &lt;/tr&gt;
    &lt;/tbody&gt;
  &lt;/table&gt;

&lt;/div&gt;

&lt;p&gt;Looks like we should figure out why we’re having so many shape misses; that will
clear up a lot of exits. (Hint: it’s because once we make our first guess about
what we think the object shape will be, we don’t re-assess… &lt;strong&gt;yet&lt;/strong&gt;.)&lt;/p&gt;

&lt;p&gt;This has been a taste of Perfetto. There’s probably a lot more to explore.
Please join the &lt;a href=&quot;https://zjit.zulipchat.com&quot;&gt;ZJIT Zulip&lt;/a&gt; and let us know if you have any cool
tracing or exploring tricks.&lt;/p&gt;

&lt;p&gt;Now I’ll explain how you too can use Perfetto from your system. Adding support
to ZJIT was pretty straightforward.&lt;/p&gt;

&lt;h2 id=&quot;implementation&quot;&gt;Implementation&lt;/h2&gt;

&lt;p&gt;The first thing is that you’ll need some way to get trace data out of your
system. We write to a file with a well-known location
(&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/tmp/perfetto-PID.fxt&lt;/code&gt;), but you could do any number of things. Perhaps you
can stream events over a socket to another process, or to a server that
aggregates them, or store them internally and expose a webserver that serves
them over the internet, or… anything, really.&lt;/p&gt;

&lt;p&gt;Once you have that, you need a couple lines of code to emit the data. Perfetto
accepts a number of formats. For example, in his &lt;a href=&quot;https://thume.ca/2023/12/02/tracing-methods/&quot;&gt;excellent blog post&lt;/a&gt;,
Tristan Hume opens with a simple snippet of code for logging Chromium
Trace JSON-formatted events (lightly modified by me):&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;event_name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;timestamp&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;duration&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;trace.json&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;a&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;[&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# ... emit some events here ...
&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# Log a single event
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;{&quot;name&quot;: &quot;%s&quot;, &quot;ts&quot;: %d, &quot;dur&quot;: %d, &quot;cat&quot;: &quot;hi&quot;, &quot;ph&quot;: &quot;X&quot;, &quot;pid&quot;: 1, &quot;tid&quot;: 1, &quot;args&quot;: {}},&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;event_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timestamp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;duration&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# ... emit some events here ...
&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# ... at process exit, close the file ...
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;]&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# this closing ] isn&apos;t actually required
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;close&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This snippet is great. It shows, end-to-end, writing a stream of one event. It
is a &lt;em&gt;complete&lt;/em&gt; (X) event, as opposed to either:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;two discrete timestamped &lt;em&gt;begin&lt;/em&gt; (B) and &lt;em&gt;end&lt;/em&gt; (E) events that book-end
something, or&lt;/li&gt;
  &lt;li&gt;an &lt;em&gt;instant&lt;/em&gt; (i) event that has no duration, or&lt;/li&gt;
  &lt;li&gt;a couple other event types in the &lt;a href=&quot;https://docs.google.com/document/d/1CvAClvFfyA5R-PhYUmn5OOQtYMH4h6I0nSsKchNAySU/preview&quot;&gt;Chromium Trace Event Format doc&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
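
&lt;p&gt;Side exits have no duration, so the instant (i) event is probably the
closest fit for them. Here is a minimal sketch of building one in the same JSON
style. The helper name, the example reason, and the argument contents are made
up for illustration; this is not ZJIT’s tracer:&lt;/p&gt;

```python
import json


def instant_event(name, timestamp_us, args=None):
    """Build one Chromium Trace Event Format instant event as a dict."""
    return {
        "name": name,
        "ph": "i",           # instant: a point in time, no duration
        "ts": timestamp_us,  # timestamp in microseconds
        "s": "t",            # scope: draw the marker on this thread only
        "pid": 1,
        "tid": 1,
        "cat": "hi",
        "args": args or {},
    }


# Hypothetical side-exit event with a made-up method name as its argument.
event = instant_event("GuardShape", 1234, {"0": "Foo#bar"})
line = json.dumps(event) + ",\n"
print(line, end="")
```

&lt;p&gt;Using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;json.dumps&lt;/code&gt; instead of string formatting also takes care of
escaping quotes in event names.&lt;/p&gt;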

&lt;p&gt;It was enough to get me started. Since it’s JSON, and we have a lot of side
exits, the trace quickly ballooned to 8GB for a benchmark that runs for several seconds.
Not great. Now, part of this is our fault—we should side exit less—and part
of it is just the verbosity of JSON.&lt;/p&gt;

&lt;p&gt;Thankfully, Perfetto ingests more compact binary formats, such as the &lt;a href=&quot;https://fuchsia.dev/fuchsia-src/reference/tracing/trace-format&quot;&gt;Fuchsia
trace format&lt;/a&gt;.
In addition to being more compact, FXT even supports string interning. After
modifying the tracer to emit FXT, we ended up with closer to 100MB for the same
benchmark.&lt;/p&gt;

&lt;p&gt;We can reduce further by &lt;em&gt;sampling&lt;/em&gt;—not writing every exit to the trace, but
instead every &lt;em&gt;K&lt;/em&gt; exits (for some (probably prime) K). This is why we provide
the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--zjit-trace-exits-sample-rate=K&lt;/code&gt; option.&lt;/p&gt;
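
&lt;p&gt;A minimal sketch of that kind of every-Kth sampling, assuming a plain
counter. The class and method names are hypothetical and this is not the ZJIT
implementation:&lt;/p&gt;

```python
class ExitSampler:
    """Record one out of every `rate` events."""

    def __init__(self, rate):
        if not rate >= 1:
            raise ValueError("rate must be at least 1")
        self.rate = rate
        self.seen = 0

    def should_record(self):
        """Return True on every `rate`-th call, False otherwise."""
        self.seen += 1
        return self.seen % self.rate == 0


# Pretend 100 side exits happen and keep every 7th one.
sampler = ExitSampler(rate=7)
recorded = [i for i in range(1, 101) if sampler.should_record()]
print(len(recorded), recorded[:3])
```

&lt;p&gt;With a rate of 7, only 14 of the 100 events reach the trace, so the file
shrinks by about the same factor.&lt;/p&gt;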

&lt;p&gt;Check out the &lt;a href=&quot;https://github.com/ruby/ruby/blob/eb8051185122d4b7bc9c6a6df694a85f34ced681/zjit/src/stats.rs#L988&quot;&gt;trace writer&lt;/a&gt; implementation as of the time this article
was written.&lt;/p&gt;

&lt;h2 id=&quot;tracing-more-things&quot;&gt;Tracing more things&lt;/h2&gt;

&lt;p&gt;We could trace:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;When methods get compiled&lt;/li&gt;
  &lt;li&gt;How big the generated code is&lt;/li&gt;
  &lt;li&gt;How long each compile phase takes&lt;/li&gt;
  &lt;li&gt;When (and where) invalidation events happen&lt;/li&gt;
  &lt;li&gt;When (and where) allocations happen from JITed code&lt;/li&gt;
  &lt;li&gt;Garbage collection events&lt;/li&gt;
  &lt;li&gt;and more!&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Visualizations are awesome. Get your data in the right format so you can ask
the right questions easily. Thanks for Perfetto!&lt;/p&gt;

&lt;p&gt;Also, looks like visualizations are now available in Perfetto canary. Time to
go make some fun histograms…&lt;/p&gt;
&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:sampled&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;This is also sampled/strobed, so not every exit is in there. This
is just 1/K of them for some K that I don’t remember. &lt;a href=&quot;#fnref:sampled&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
            <pubDate>Fri, 27 Mar 2026 00:00:00 +0000</pubDate>
            <niceDate>March 27, 2026</niceDate>
            <link>https://bernsteinbear.com/blog/zjit-perfetto/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/zjit-perfetto/</guid>
        </item>
        
        <item>
            <title>A fuzzer for the Toy Optimizer</title>
            <description>&lt;p&gt;&lt;em&gt;Another entry in the &lt;a href=&quot;https://pypy.org/categories/toy-optimizer.html&quot;&gt;Toy Optimizer series&lt;/a&gt;&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;It’s hard to get compiler optimizers right. Even if you build up a painstaking test
suite by hand, you will likely miss corner cases, especially corner cases at
the interactions of multiple components or multiple optimization passes.&lt;/p&gt;

&lt;p&gt;I wanted to see if I could write a fuzzer to catch some of these bugs
automatically. But a fuzzer alone isn’t much use without some correctness
oracle—in this case, we want a more interesting bug than accidentally
crashing the optimizer. We want to see if the optimizer introduces a
correctness bug in the program.&lt;/p&gt;

&lt;p&gt;So I set off in the most straightforward way possible, inspired by my
hazy memories of a former &lt;a href=&quot;https://pypy.org/posts/2024/03/fixing-bug-incremental-gc.html&quot;&gt;CF blog post&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;generating-programs&quot;&gt;Generating programs&lt;/h2&gt;

&lt;p&gt;Generating random programs isn’t so bad. We have program generation APIs and we
can dynamically pick which ones we want to call. I wrote a small loop that
generates &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;load&lt;/code&gt;s from and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;store&lt;/code&gt;s to the arguments at random offsets and with
random values, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;escape&lt;/code&gt;s to random instructions with outputs. The idea
with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;escape&lt;/code&gt; is to keep track of the values as if there were some other
function relying on them.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;generate_program&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;num_ops&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;randint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;30&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ops_with_values&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[:]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num_ops&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;choice&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;load&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;store&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;escape&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;choice&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;a_value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;choice&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ops_with_values&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;randint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;load&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;ops_with_values&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;store&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;randint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;escape&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;NotImplementedError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Unknown operation &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This generates random programs. Here is an example stringified random program:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;var0 = getarg(0)
var1 = getarg(1)
var2 = getarg(2)
var3 = load(var2, 0)
var4 = load(var0, 1)
var5 = load(var1, 1)
var6 = escape(var0)
var7 = store(var0, 2, 3)
var8 = store(var2, 0, 7)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;No idea what would generate something like this, but oh well.&lt;/p&gt;

&lt;h2 id=&quot;verifying-programs&quot;&gt;Verifying programs&lt;/h2&gt;

&lt;p&gt;Then we want to come up with our invariants. I picked the invariant that, under
the same preconditions, the heap will look the same after running an optimized
program as it would under an un-optimized program&lt;sup id=&quot;fnref:equivalence&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:equivalence&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. So we can delete
instructions, but if we delete a load-bearing store, store the wrong
information, or cache stale loads, we will probably catch that.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;verify_program&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;before_no_alias&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interpret_program&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;a&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;b&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;c&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;a&quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;before_alias&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interpret_program&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;optimized&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;after_no_alias&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interpret_program&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;optimized&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;a&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;b&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;c&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;after_alias&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interpret_program&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;optimized&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;before_no_alias&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;after_no_alias&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;before_alias&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;after_alias&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I have a very silly verifier that tests two cases: one where the arguments do
not alias and one where they are all the same object. Generating partial
aliases would be a good extension here.&lt;/p&gt;
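
&lt;p&gt;As a rough sketch of that extension, assuming three arguments like the
verifier above, we could enumerate every way the arguments can alias each other
and run the before/after comparison on each pattern. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;alias_patterns&lt;/code&gt;
helper is hypothetical and not part of the original code.&lt;/p&gt;

```python
def alias_patterns(num_args):
    """Yield every aliasing pattern for num_args arguments.

    Each pattern is a list of object names in which equal names mean
    "these arguments are the same object". Hypothetical helper for this
    sketch; it supports at most 26 arguments.
    """
    names = "abcdefghijklmnopqrstuvwxyz"

    def go(prefix, num_used):
        if len(prefix) == num_args:
            yield list(prefix)
            return
        # Reuse any object seen so far, or introduce exactly one fresh one.
        # Only ever introducing the next unused name avoids generating
        # patterns that differ only by renaming.
        for i in range(num_used + 1):
            yield from go(prefix + [names[i]], max(num_used, i + 1))

    yield from go([], 0)


patterns = list(alias_patterns(3))
```

&lt;p&gt;For three arguments this yields five patterns, including the two that
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;verify_program&lt;/code&gt; already covers (all distinct and all the same object). The
verifier could loop over the patterns and compare heaps for each one.&lt;/p&gt;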

&lt;p&gt;Last, we have the interpreter.&lt;/p&gt;

&lt;h2 id=&quot;running-programs&quot;&gt;Running programs&lt;/h2&gt;

&lt;p&gt;The interpreter is responsible for keeping track of the heap (as indexed by
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(object, offset)&lt;/code&gt; pairs) as well as the results of the various instructions.&lt;/p&gt;

&lt;p&gt;We keep track of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;escape&lt;/code&gt;d values so we can see results of some
instructions even if they do not get written back to the heap. Maybe we should
be &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;escape&lt;/code&gt;ing all instructions with output instead of only random ones. Who
knows.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;interpret_program&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;heap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ssa&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;escaped&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;getarg&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;ssa&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;store&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ssa&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;load&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ssa&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;unknown&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;ssa&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;escape&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;isinstance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Constant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;escaped&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;escaped&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ssa&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;NotImplementedError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Unknown operation &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;escaped&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;escaped&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;heap&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

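&lt;p&gt;Then we return the heap, with the escaped values stashed under the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;escaped&quot;&lt;/code&gt; key, so that the verifier can compare runs.&lt;/p&gt;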

&lt;h2 id=&quot;the-harness&quot;&gt;The harness&lt;/h2&gt;

&lt;p&gt;Then we run a bunch of random tests through the verifier!&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_random_programs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Remove random.seed if using in CI... instead print the seed out so you
&lt;/span&gt;    &lt;span class=&quot;c1&quot;&gt;# can reproduce crashes if you find them
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;seed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;num_programs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100000&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num_programs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;program&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;generate_program&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;verify_program&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;program&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The number of programs is configurable. Or you could make this &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;while True&lt;/code&gt;.
But due to how simple the optimizer is, we will find all the possible bugs
pretty quickly.&lt;/p&gt;
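
&lt;p&gt;Here is a minimal sketch of what the comment about seeds is getting at: pick a
fresh seed per run, print it, and allow replaying it later. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FUZZ_SEED&lt;/code&gt;
environment variable and both helper names are made up for this sketch.&lt;/p&gt;

```python
import os
import random


def pick_seed(environ=os.environ):
    """Return the seed to use for this run.

    FUZZ_SEED is a made-up environment variable for this sketch; set it to
    replay a failing run.
    """
    if "FUZZ_SEED" in environ:
        return int(environ["FUZZ_SEED"])
    # Otherwise draw a fresh seed from the OS so each run explores new programs
    return random.SystemRandom().randrange(2**32)


def seed_random(environ=os.environ):
    """Seed the global random module and report which seed was used."""
    seed = pick_seed(environ)
    # pytest shows captured stdout for failing tests, so the seed is visible
    # in the report when a generated program trips an assertion
    print(f"random seed: {seed}")
    random.seed(seed)
    return seed
```

&lt;p&gt;Calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;seed_random()&lt;/code&gt; in place of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;random.seed(0)&lt;/code&gt; would give every CI run a
different batch of programs while keeping failures reproducible.&lt;/p&gt;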

&lt;p&gt;I initially started writing this post because I thought I had found a bug. It
turns out that in 2022, with CF’s help, I had already walked through every
possible case in the “buggy” situation, and the optimizer handles those cases
correctly. That explains why the verifier didn’t find that bug!&lt;/p&gt;

&lt;h2 id=&quot;testing-the-verifier&quot;&gt;Testing the verifier&lt;/h2&gt;

&lt;p&gt;So does it work? If you run it, it’ll hang for a bit and then report no issues.
That’s helpful, in a sense… it’s revealing that it is unable to find a
certain class of bug in the optimizer.&lt;/p&gt;

&lt;p&gt;Let’s comment out the main load-bearing pillar of correctness in the
optimizer—removing aliasing writes—and see what happens.&lt;/p&gt;

&lt;p&gt;We get a crash nearly instantly:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ uv run --with pytest pytest loadstore.py -k random
...
=========================================== FAILURES ============================================
_____________________________________ test_random_programs ______________________________________

    def test_random_programs():
        random.seed(0)
        num_programs = 100000
        for i in range(num_programs):
            program = generate_program()
&amp;gt;           verify_program(program)

loadstore.py:617:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

bb = [Operation(getarg, [Constant(0)], None, None), Operation(getarg, [Constant(1)], None, None), Operation(getarg, [Consta...], None, None)], None, None), Operation(load, [Operation(getarg, [Constant(0)], None, None), Constant(0)], None, None)]

    def verify_program(bb):
        before_no_alias = interpret_program(bb, [&quot;a&quot;, &quot;b&quot;, &quot;c&quot;])
        a = &quot;a&quot;
        before_alias = interpret_program(bb, [a, a, a])
        optimized = optimize_load_store(bb)
        after_no_alias = interpret_program(optimized, [&quot;a&quot;, &quot;b&quot;, &quot;c&quot;])
        after_alias = interpret_program(optimized, [a, a, a])
        assert before_no_alias == after_no_alias
&amp;gt;       assert before_alias == after_alias
E       AssertionError: assert {(&apos;a&apos;, 0): 4,...&apos;, 3): 1, ...} == {(&apos;a&apos;, 0): 9,...&apos;, 3): 1, ...}
E
E         Omitting 4 identical items, use -vv to show
E         Differing items:
E         {(&apos;a&apos;, 0): 4} != {(&apos;a&apos;, 0): 9}
E         Use -v to get more diff

loadstore.py:610: AssertionError
==================================== short test summary info ====================================
FAILED loadstore.py::test_random_programs - AssertionError: assert {(&apos;a&apos;, 0): 4,...&apos;, 3): 1, ...} == {(&apos;a&apos;, 0): 9,...&apos;, 3): 1, ...}
=============================== 1 failed, 15 deselected in 0.04s ================================
$
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We should probably use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bb_to_str(bb)&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bb_to_str(optimized)&lt;/code&gt; to print out
the un-optimized and optimized traces in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;assert&lt;/code&gt; failure messages. But we
get a nice diff of the heap automatically, which is neat. And it points to an
aliasing problem!&lt;/p&gt;

&lt;h2 id=&quot;full-code&quot;&gt;Full code&lt;/h2&gt;

&lt;p&gt;See the &lt;a href=&quot;https://github.com/tekknolagi/tekknolagi.github.com/blob/fbccf9696e98721ca77c8d5ec5f828a11492b04c/loadstore.py&quot;&gt;full code&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;extensions&quot;&gt;Extensions&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Synthesize (different) types for non-aliasing objects and add them in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;info&lt;/code&gt;
  &lt;!--
    * CF notes that we could maybe do this by, instead of adding `.info`, have a
      `checktype` guard instruction that the optimizer can use to learn types and
      change aliasing from inside the trace
  --&gt;&lt;/li&gt;
  &lt;li&gt;Shrink/reduce failing examples down for easier debugging&lt;/li&gt;
  &lt;li&gt;Use Hypothesis for property-based testing, which CF notes also gives you
shrinking&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://pypy.org/posts/2022/12/jit-bug-finding-smt-fuzzing.html&quot;&gt;Use Z3 to encode&lt;/a&gt; the generated programs instead of randomly interpreting them&lt;/li&gt;
&lt;/ul&gt;
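
&lt;p&gt;For the shrinking idea, a naive sketch might greedily drop one operation at a
time while the failure still reproduces. Both &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;shrink&lt;/code&gt; and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;still_fails&lt;/code&gt;
callback are hypothetical names. This assumes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;still_fails&lt;/code&gt; returns false for
malformed candidates, such as a program where a deleted operation is still used
as an argument.&lt;/p&gt;

```python
def shrink(program, still_fails):
    """Greedily drop operations while still_fails(program) stays true.

    program is any list of operations; still_fails is a caller-supplied
    predicate (hypothetical here) that re-runs the verifier on a candidate.
    """
    current = list(program)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            # Try the program with operation i removed
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                current = candidate
                changed = True
                # Indices shifted, so restart the scan from the beginning
                break
    return current
```

&lt;p&gt;This is quadratic in the worst case and much less clever than what Hypothesis
does, but the generated programs are short enough that it may not matter.&lt;/p&gt;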

&lt;h2 id=&quot;thanks&quot;&gt;Thanks&lt;/h2&gt;

&lt;p&gt;Thank you to &lt;a href=&quot;https://cfbolz.de/&quot;&gt;CF Bolz-Tereick&lt;/a&gt; for feedback on this post!&lt;/p&gt;
&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:equivalence&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;CF notes that this notion of equivalence works for this
optimizer but not for one that does allocation removal (escape analysis).
If we removed allocations and writes to them, we would be changing the heap
results and our verifier would appear to fail. This means we have to, if we
are to delete allocations, pick a more subtle definition of equivalence.&lt;/p&gt;

      &lt;p&gt;Perhaps something that looks like escape analysis in the verifier’s
interpreter? &lt;a href=&quot;#fnref:equivalence&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
            <pubDate>Wed, 25 Feb 2026 00:00:00 +0000</pubDate>
            <niceDate>February 25, 2026</niceDate>
            <link>https://bernsteinbear.com/blog/toy-fuzzer/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/toy-fuzzer/</guid>
        </item>
        
        <item>
            <title>Type-based alias analysis in the Toy Optimizer</title>
            <description>&lt;p&gt;&lt;em&gt;Another entry in the &lt;a href=&quot;https://pypy.org/categories/toy-optimizer.html&quot;&gt;Toy Optimizer series&lt;/a&gt;&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Last time, we did &lt;a href=&quot;/blog/toy-load-store/&quot;&gt;load-store forwarding&lt;/a&gt; in the context
of our Toy Optimizer. We managed to cache the results of both reads from and
writes to the heap—at compile-time!&lt;/p&gt;

&lt;p&gt;We were careful to mind object aliasing: we separated our heap information into
alias classes based on what offset the reads/writes referenced. This way, if we
didn’t know whether objects &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;b&lt;/code&gt; aliased, we could at least know that
different offsets would never alias (assuming our objects don’t overlap and
memory accesses are on word-sized slots). This is a coarse-grained heuristic.&lt;/p&gt;

&lt;p&gt;Fortunately, we often have much more information available at compile-time than
just the offset, so we should use it. I mentioned in a footnote that we could
use type information, for example, to improve our alias analysis. We’ll add
a lightweight form of &lt;a href=&quot;/assets/img/tbaa.pdf&quot;&gt;type-based alias analysis (TBAA)&lt;/a&gt;
(PDF) in this post.&lt;/p&gt;

&lt;h2 id=&quot;representing-types&quot;&gt;Representing types&lt;/h2&gt;

&lt;p&gt;We return once again to Fil Pizlo land, specifically &lt;a href=&quot;https://gist.github.com/pizlonator/cf1e72b8600b1437dda8153ea3fdb963&quot;&gt;How I implement SSA
form&lt;/a&gt;.
We’re going to be using the hierarchical heap effect representation from the
post in our implementation, but you can use your own type representation if you
have one already.&lt;/p&gt;

&lt;p&gt;This representation divides the heap into disjoint regions by type. Consider,
for example, that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Array&lt;/code&gt; objects and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;String&lt;/code&gt; objects do not overlap. A
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LinkedList&lt;/code&gt; pointer is never going to alias an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Integer&lt;/code&gt; pointer. They can
therefore be reasoned about separately.&lt;/p&gt;

&lt;p&gt;But sometimes you don’t have perfect type information available. If your
language has an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Object&lt;/code&gt; base class for all objects, then the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Object&lt;/code&gt; heap
overlaps with, say, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Array&lt;/code&gt; heap. So you need some way to represent that
too—just having an enum doesn’t work cleanly.&lt;/p&gt;

&lt;p&gt;Here is an example simplified type hierarchy:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Any
  Object
    Array
    String
  Other
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Where &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Other&lt;/code&gt; might represent different parts of the runtime’s data structures,
and could be further segmented into &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GC&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Thread&lt;/code&gt;, etc.&lt;/p&gt;

&lt;p&gt;Fil’s idea is that we can represent each node in that hierarchy with a tuple of
integers &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[start, end)&lt;/code&gt; (inclusive, exclusive) that represent the pre- and
post-order traversals of the tree. Or, if tree traversals are not engraved into
your bones, they represent the range of all the nested objects within them.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Any [0, 3)
  Object [0, 2)
    Array [0, 1)
    String [1, 2)
  Other [2, 3)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then the “does this write interfere with this read” check—the aliasing
check—is a range overlap query.&lt;/p&gt;
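
&lt;p&gt;To make that concrete, here is a tiny sketch with plain tuples, using the ranges
from the hierarchy above. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RANGES&lt;/code&gt; table and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;overlaps&lt;/code&gt; and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;may_alias&lt;/code&gt; functions are illustrative names only.&lt;/p&gt;

```python
# Half-open [start, end) ranges copied from the hierarchy above
RANGES = {
    "Any": (0, 3),
    "Object": (0, 2),
    "Array": (0, 1),
    "String": (1, 2),
    "Other": (2, 3),
}


def overlaps(left, right):
    """Return True if two non-empty half-open ranges share any element."""
    left_start, left_end = left
    right_start, right_end = right
    return left_end > right_start and right_end > left_start


def may_alias(left_name, right_name):
    """Return True if accesses to the two named heaps could interfere."""
    return overlaps(RANGES[left_name], RANGES[right_name])
```

&lt;p&gt;With these ranges, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Array&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;String&lt;/code&gt; do not overlap, but &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Object&lt;/code&gt; overlaps
with both of them, which is the behavior a flat enum could not express.&lt;/p&gt;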

&lt;p&gt;Here’s a perhaps over-engineered Python implementation of the range and heap
hierarchy based on the Ruby generator and C++ runtime code from JavaScriptCore:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;HeapRange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__repr__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;[&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;, &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;)&quot;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;is_empty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;overlaps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;HeapRange&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# Empty ranges interfere with nothing
&lt;/span&gt;        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is_empty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;or&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is_empty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;AbstractHeap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parent&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;children&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add_child&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;AbstractHeap&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parent&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;children&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;compute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;current&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;children&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;child&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;children&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;child&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;compute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;current&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;child&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;Any&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Any&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Object&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Any&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_child&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Object&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Object&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_child&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Array&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Object&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_child&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;String&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Other&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Any&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_child&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Other&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Any&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;compute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

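&lt;p&gt;The final call, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Any.compute(0)&lt;/code&gt;, kicks off the tree-numbering scheme: each leaf heap gets a range of width one, and each parent gets a range that spans its children.&lt;/p&gt;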
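
&lt;p&gt;To see what the numbering produces, here is a small self-contained sketch that rebuilds the same tree and prints each heap’s range. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HeapRange&lt;/code&gt; dataclass below is an assumed stand-in (a half-open interval with an emptiness check), and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dump&lt;/code&gt; is a made-up helper for illustration only.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class HeapRange:
    # Assumed shape: a half-open interval [start, end).
    start: int
    end: int

    def is_empty(self) -> bool:
        return self.start == self.end

    def overlaps(self, other: "HeapRange") -> bool:
        # Empty ranges interfere with nothing
        if self.is_empty() or other.is_empty():
            return False
        return self.end > other.start and other.end > self.start


class AbstractHeap:
    def __init__(self, name: str) -> None:
        self.name = name
        self.children: List["AbstractHeap"] = []
        self.range: Optional[HeapRange] = None

    def add_child(self, name: str) -> "AbstractHeap":
        child = AbstractHeap(name)
        self.children.append(child)
        return child

    def compute(self, start: int) -> None:
        current = start
        if not self.children:
            # A leaf occupies exactly one slot.
            self.range = HeapRange(start, current + 1)
            return
        for child in self.children:
            child.compute(current)
            current = child.range.end
        # A parent spans all of its children.
        self.range = HeapRange(start, current)


def dump(heap: AbstractHeap, depth: int = 0) -> List[str]:
    # Hypothetical helper: one indented line per heap in the tree.
    lines = [f"{'  ' * depth}{heap.name}: [{heap.range.start}, {heap.range.end})"]
    for child in heap.children:
        lines.extend(dump(child, depth + 1))
    return lines


Any = AbstractHeap("Any")
Object = Any.add_child("Object")
Array = Object.add_child("Array")
String = Object.add_child("String")
Other = Any.add_child("Other")
Any.compute(0)

print("\n".join(dump(Any)))
# Any: [0, 3)
#   Object: [0, 2)
#     Array: [0, 1)
#     String: [1, 2)
#   Other: [2, 3)
```

&lt;p&gt;Sibling heaps such as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Array&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;String&lt;/code&gt; get disjoint ranges, while every heap overlaps its ancestors, so a single interval comparison answers the subtree question.&lt;/p&gt;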

&lt;p&gt;Fil’s implementation also covers a bunch of abstract heaps such as SSAState and
Control because his is used for code motion and whatnot. That can be added on
later but we will not do so in this post.&lt;/p&gt;

&lt;p&gt;So there you have it: a type representation. Now we need to use it in our
load-store forwarding.&lt;/p&gt;

&lt;h2 id=&quot;load-store-forwarding&quot;&gt;Load-store forwarding&lt;/h2&gt;

&lt;p&gt;Recall that our load-store optimization pass looks like this:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Stores things we know about the heap at... compile-time.
&lt;/span&gt;    &lt;span class=&quot;c1&quot;&gt;# Key: an object and an offset pair acting as a heap address
&lt;/span&gt;    &lt;span class=&quot;c1&quot;&gt;# Value: a previous SSA value we know exists at that address
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;store&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;store_info&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;current_value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;store_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;new_value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eq_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;items&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;store_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_value&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;load&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;make_equal_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;At its core, it iterates over the instructions, keeping a representation of the
heap at compile-time. Reads get cached, writes get cached, and writes also
invalidate the state of compile-time information about fields that may alias.&lt;/p&gt;

&lt;p&gt;In this case, our &lt;em&gt;may alias&lt;/em&gt; check asks only whether the offsets match; it
ignores which objects are involved. This means that the following unit test will fail:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_store_to_same_offset_different_heaps_does_not_invalidate_load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;info&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;info&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var3&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var4&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;bb_to_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;var0 = getarg(0)
var1 = getarg(1)
var2 = store(var0, 0, 3)
var3 = store(var1, 0, 4)
var4 = escape(3)&quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This test expects the write to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;var0&lt;/code&gt; to remain cached even though
we wrote to the same offset in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;var1&lt;/code&gt;, because we have annotated &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;var0&lt;/code&gt; as
being an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Array&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;var1&lt;/code&gt; as being a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;String&lt;/code&gt;. If we account for type
information in our alias analysis, we can get this test to pass.&lt;/p&gt;

&lt;p&gt;After doing a bunch of fussing around with the load-store forwarding (many
rewrites), I eventually got it down to a very short diff:&lt;/p&gt;

&lt;div class=&quot;language-diff highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gi&quot;&gt;+def may_alias(left: Value, right: Value) -&amp;gt; bool:
+    return (left.info or Any).range.overlaps((right.info or Any).range)
+
+
&lt;/span&gt; def optimize_load_store(bb: Block):
     opt_bb = Block()
     # Stores things we know about the heap at... compile-time.
&lt;span class=&quot;p&quot;&gt;@@ -138,6 +210,10 @@&lt;/span&gt; def optimize_load_store(bb: Block):
                 load_info: value
                 for load_info, value in compile_time_heap.items()
                 if load_info[1] != offset
&lt;span class=&quot;gi&quot;&gt;+                or not may_alias(load_info[0], obj)
&lt;/span&gt;             }
             compile_time_heap[store_info] = new_value
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If we don’t have any type/alias information, we default to “I know nothing”
(&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Any&lt;/code&gt;) for each object. Then we check range overlap.&lt;/p&gt;

&lt;p&gt;The boolean logic in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;optimize_load_store&lt;/code&gt; may look a little weird, but we
can rewrite it (via De Morgan’s law) as:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt;
            &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;may_alias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So we keep every cached entry that is known, either by offset or by type, not
to alias the store. Maybe that is clearer (but not as nice a diff).&lt;/p&gt;
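
&lt;p&gt;As a quick sanity check, the two forms of the filter can be compared over every combination of inputs. This is only a sketch: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;keep_original&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;keep_rewritten&lt;/code&gt; are made-up names, and the two booleans stand in for the offset comparison and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;may_alias&lt;/code&gt; result.&lt;/p&gt;

```python
from itertools import product


def keep_original(same_offset: bool, aliases: bool) -> bool:
    # Mirrors the diff: `load_info[1] != offset or not may_alias(...)`
    return (not same_offset) or (not aliases)


def keep_rewritten(same_offset: bool, aliases: bool) -> bool:
    # Mirrors the rewrite: `not (load_info[1] == offset and may_alias(...))`
    return not (same_offset and aliases)


# Exhaustively compare both conditions on all four input combinations.
for same_offset, aliases in product([False, True], repeat=2):
    assert keep_original(same_offset, aliases) == keep_rewritten(same_offset, aliases)

# The only entry that gets dropped is one with the same offset that may alias.
dropped = [
    (same_offset, aliases)
    for same_offset, aliases in product([False, True], repeat=2)
    if not keep_rewritten(same_offset, aliases)
]
print(dropped)  # [(True, True)]
```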

&lt;blockquote&gt;
  &lt;p&gt;Note that the type representation is not so important here! You could use a
bitset version of the type information if you want. The important things are
that you can cheaply construct types and check overlap between them.&lt;/p&gt;
&lt;/blockquote&gt;
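
&lt;p&gt;For comparison, a bitset flavor of the same idea might look something like the sketch below. The bit assignments and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;may_alias&lt;/code&gt; signature here are assumptions for illustration, not the code from this post: each leaf heap gets one bit, a parent is the union of its children, and overlap is a non-zero intersection.&lt;/p&gt;

```python
import operator
from typing import Optional

# Hypothetical bit assignment: one bit per leaf heap.
ARRAY = 1
STRING = 2
OTHER = 4
# Parents are the union of their children.
OBJECT = ARRAY | STRING
ANY = OBJECT | OTHER
EMPTY = 0


def overlaps(left: int, right: int) -> bool:
    # operator.and_(a, b) is the function form of Python's bitwise AND.
    # Two heaps overlap if they share at least one bit.
    return operator.and_(left, right) != 0


def may_alias(left_info: Optional[int], right_info: Optional[int]) -> bool:
    # Missing information means "I know nothing", so fall back to ANY.
    # Compare against None explicitly: EMPTY is 0, which is falsy.
    left = ANY if left_info is None else left_info
    right = ANY if right_info is None else right_info
    return overlaps(left, right)


print(may_alias(ARRAY, STRING))  # False
print(may_alias(ARRAY, OBJECT))  # True
print(may_alias(None, STRING))   # True
```

&lt;p&gt;One wrinkle with this encoding: the empty set is the integer zero, so a truthiness-based default such as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;info or Any&lt;/code&gt; would silently turn “empty” into “anything”. Hence the explicit &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;None&lt;/code&gt; check in the sketch.&lt;/p&gt;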

&lt;!--
Note that we do not currently have a notion of &quot;must-alias&quot; other than if two
SSA values are equal. Therefore we can&apos;t make use of writes to object A for
loads from object B even if A and B must alias.
--&gt;

&lt;p&gt;Nice, now our test passes! We can differentiate between memory accesses on
objects of different types.&lt;/p&gt;

&lt;p&gt;But what if we knew more?&lt;/p&gt;

&lt;h2 id=&quot;object-provenance--allocation-site&quot;&gt;Object provenance / allocation site&lt;/h2&gt;

&lt;p&gt;Sometimes we know where an object came from. For example, we may have seen it
get allocated in the trace. If we saw an object’s allocation, we know that it
does not alias (for example) any object that was passed in via a parameter. We
can use this kind of information to our advantage.&lt;/p&gt;

&lt;p&gt;For example, in the following made-up IR snippet:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;trace(arg0):
  v0 = malloc(8)
  v1 = malloc(16)
  ...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We know that (among other facts) &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v0&lt;/code&gt; doesn’t alias &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;arg0&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v1&lt;/code&gt; because we
have seen its allocation site.&lt;/p&gt;

&lt;p&gt;I saw this in the old V8 IR Hydrogen’s lightweight alias analysis&lt;sup id=&quot;fnref:fork&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:fork&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;:&lt;/p&gt;

&lt;div class=&quot;language-c++ highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;enum&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HAliasing&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;kMustAlias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;kMayAlias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;kNoAlias&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;HAliasing&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Query&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HValue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;// The same SSA value always references the same object.&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kMustAlias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsAllocate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsInnerAllocatedObject&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// Two non-identical allocations can never be aliases.&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsAllocate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kNoAlias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsInnerAllocatedObject&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kNoAlias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// An allocation can never alias a parameter or a constant.&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsParameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kNoAlias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsConstant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kNoAlias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsAllocate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsInnerAllocatedObject&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// An allocation can never alias a parameter or a constant.&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsParameter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kNoAlias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsConstant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kNoAlias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

  &lt;span class=&quot;c1&quot;&gt;// Constant objects can be distinguished statically.&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsConstant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsConstant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Equals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kMustAlias&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kNoAlias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kMayAlias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There is plenty of other useful information such as:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;If we know at compile-time that object A has 5 at offset 0 and object B has 7
at offset 0, then A and B don’t alias (thanks, CF)
    &lt;ul&gt;
      &lt;li&gt;In the RPython JIT in PyPy, this is used to determine if two user (Python)
objects don’t alias because we know the contents of the user (Python) class
field&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Object size (though perhaps that is a special case of the above bullet)&lt;/li&gt;
  &lt;li&gt;Field size/type&lt;/li&gt;
  &lt;li&gt;Deferring alias checks to run-time
    &lt;ul&gt;
      &lt;li&gt;Have a branch &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;if (a == b) { ... } else { ... }&lt;/code&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;…&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you have other fun ones, please write in.&lt;/p&gt;
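
&lt;p&gt;As a rough illustration of the first bullet, here is a minimal sketch of a
check that proves two objects distinct from field contents known at
compile-time. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;provably_distinct&lt;/code&gt; helper and the offset-to-constant
dictionaries are hypothetical names made up for this example; they are not part
of the toy optimizer or of PyPy.&lt;/p&gt;

```python
def provably_distinct(known_a, known_b):
    """Return True if two objects cannot alias.

    Each argument maps a field offset to the constant that the compiler
    knows is stored there. Hypothetical representation for illustration.
    """
    for offset, value in known_a.items():
        # The same object cannot hold two different values at one offset.
        if offset in known_b and known_b[offset] != value:
            return True
    # No contradiction found: the objects may still alias.
    return False


print(provably_distinct({0: 5}, {0: 7}))
print(provably_distinct({0: 5}, {0: 5, 8: 1}))
print(provably_distinct({0: 5}, {8: 7}))
```

&lt;p&gt;Only the first call can rule out aliasing; the other two have to fall back to
“may alias”.&lt;/p&gt;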

&lt;h2 id=&quot;interacting-with-other-instructions&quot;&gt;Interacting with other instructions&lt;/h2&gt;

&lt;p&gt;We only handle loads and stores in our optimizer. Unfortunately, this means we
may accidentally cache stale information. Consider: what happens if a function
call (or any other opaque instruction) writes into an object we are tracking?&lt;/p&gt;

&lt;p&gt;The conservative approach is to invalidate all cached information on a function
call. This is definitely correct, but it’s a bummer for the optimizer. Can we
do anything?&lt;/p&gt;

&lt;p&gt;Well, perhaps we are calling a well-known function or a specific IR
instruction. In that case, we can annotate it with effects in the same abstract
heap model: if the instruction does not write, or only writes to some heaps, we
can at least only partially invalidate our heap.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;known_builtin_functions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;Array_length&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Effects&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reads&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;writes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Empty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()),&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;Object_setShape&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Effects&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reads&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Empty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;writes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Object&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;String_setEncoding&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Effects&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reads&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Empty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;writes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
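
&lt;p&gt;To make the partial invalidation concrete, here is a small sketch under
simplifying assumptions: heaps are modeled as flat sets of names rather than
with the abstract heap encoding used elsewhere in this post, and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;invalidate_for_call&lt;/code&gt; and the cache layout are hypothetical.&lt;/p&gt;

```python
# Hypothetical flat heap model: every heap is just a name.
ANY = frozenset({"Array", "Object", "String"})

# Callee name mapped to a (reads, writes) pair of heap sets.
known_effects = {
    "Array_length": (frozenset({"Array"}), frozenset()),
    "Object_setShape": (frozenset(), frozenset({"Object"})),
    "String_setEncoding": (frozenset(), frozenset({"String"})),
}


def invalidate_for_call(cache, callee):
    """Return the cache entries that survive a call to callee.

    cache maps (object name, offset) to a (heap, value) pair.
    """
    # Unknown callees are assumed to read and write every heap.
    _reads, writes = known_effects.get(callee, (ANY, ANY))
    return {
        key: (heap, value)
        for key, (heap, value) in cache.items()
        if heap not in writes
    }


cache = {
    ("arr", 0): ("Array", "v1"),
    ("obj", 8): ("Object", "v2"),
}
print(invalidate_for_call(cache, "Array_length"))
print(invalidate_for_call(cache, "Object_setShape"))
print(invalidate_for_call(cache, "mystery"))
```

&lt;p&gt;A call that writes nothing keeps the whole cache, a call that writes to one
heap drops only the entries in that heap, and an unknown call drops
everything.&lt;/p&gt;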

&lt;p&gt;However, if the function is unknown or otherwise opaque, we need at least more
advanced alias information and perhaps even (partial) escape analysis.&lt;/p&gt;

&lt;p&gt;Consider: even if an instruction takes no operands, we have no idea what state
it has access to. If it writes to any object A, we cannot safely cache
information about any other object B unless we know &lt;em&gt;for sure&lt;/em&gt; that A and B do
not alias. And we don’t know what the instruction writes to. So we may only be
able to keep cached information about B if B was allocated locally and has not
escaped.&lt;/p&gt;
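
&lt;p&gt;Here is a minimal sketch of that rule, assuming we already track a set of
locally allocated objects that have not escaped. The function name and the
cache layout are hypothetical, and the sketch ignores other ways to escape
(such as being stored into an object that has itself escaped).&lt;/p&gt;

```python
def after_opaque_call(cache, unescaped, call_operands):
    """Model an opaque call over a load/store cache.

    cache maps (object, offset) to a known value; unescaped is the set of
    locally allocated objects that nothing else can reach yet.
    """
    # Anything handed to the callee may be stored anywhere, so it escapes.
    still_private = unescaped - set(call_operands)
    # The callee cannot reach private objects, so their entries stay valid.
    surviving = {
        key: value for key, value in cache.items() if key[0] in still_private
    }
    return surviving, still_private


cache = {("param", 0): "v1", ("fresh", 0): "v2", ("passed", 8): "v3"}
surviving, private = after_opaque_call(cache, {"fresh", "passed"}, ["passed"])
print(surviving)
print(private)
```

&lt;p&gt;Only the entry for the fresh, never-passed allocation survives the call.&lt;/p&gt;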

&lt;h2 id=&quot;storing-vs-computing-on-the-fly&quot;&gt;Storing vs computing on the fly&lt;/h2&gt;

&lt;p&gt;Some runtimes such as ART &lt;a href=&quot;https://github.com/LineageOS/android_art/blob/8ce603e0c68899bdfbc9cd4c50dcc65bbf777982/compiler/optimizing/load_store_analysis.h#L395&quot;&gt;pre-compute all of their alias information&lt;/a&gt; in a bit
matrix. This makes more sense if you are using alias information in a full
control-flow graph, where you might need to iterate over the graph a few times.
In a trace context, you can do a lot in one single pass—no need to make a
matrix.&lt;/p&gt;

&lt;h2 id=&quot;when-is-this-useful-how-much&quot;&gt;When is this useful? How much?&lt;/h2&gt;

&lt;p&gt;As usual, this is a toy IR and a toy optimizer, so it’s hard to say how much
faster it makes its toy programs.&lt;/p&gt;

&lt;p&gt;In general, though, there is a dial for analysis and optimization that goes
between precision and speed. This is a happy point on that dial: it costs only a
tiny bit more analysis time than offset-only invalidation and buys higher
precision. I like that tradeoff.&lt;/p&gt;

&lt;p&gt;Also, it is very useful in JIT compilers where generally the managed language
is a little &lt;a href=&quot;https://blog.regehr.org/archives/959&quot;&gt;better-behaved than a C-like
language&lt;/a&gt;. Somewhere in your IR there
will be a lot of duplicate loads and stores from a strength reduction pass, and
this can clean up the mess.&lt;/p&gt;

&lt;!--
## In other languages

Taking address of objects throws a wrench in it

Can&apos;t really do it in C, even though UB
--&gt;

&lt;!--
https://github.com/WebKit/WebKit/blob/main/Source/JavaScriptCore/dfg/DFGObjectAllocationSinkingPhase.cpp
--&gt;

&lt;h2 id=&quot;wrapping-up&quot;&gt;Wrapping up&lt;/h2&gt;

&lt;p&gt;See the &lt;a href=&quot;https://github.com/tekknolagi/tekknolagi.github.com/blob/67a1c5cbcf81d96cc63f8b3904619c018d1f2be1/loadstore.py&quot;&gt;full code&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Thanks for joining as I work through a small use of type-based alias analysis
for myself. I hope you enjoyed.&lt;/p&gt;

&lt;p&gt;See also &lt;a href=&quot;https://wingolog.org/archives/2026/02/18/two-mechanisms-for-dynamic-type-checks&quot;&gt;two mechanisms for dynamic type
checks&lt;/a&gt;
by Andy Wingo. CRuby uses the latter technique described in the article.&lt;/p&gt;

&lt;h2 id=&quot;thanks&quot;&gt;Thanks&lt;/h2&gt;

&lt;p&gt;Thank you to &lt;a href=&quot;https://www.chrisgregory.me/&quot;&gt;Chris Gregory&lt;/a&gt; for helpful feedback.&lt;/p&gt;
&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:fork&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I made &lt;a href=&quot;https://github.com/tekknolagi/v8&quot;&gt;a fork of V8&lt;/a&gt; to go spelunk
around the Hydrogen IR. I reset the V8 repo to the last commit before they
deleted it in favor of their new Sea of Nodes based IR called TurboFan. &lt;a href=&quot;#fnref:fork&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
            <pubDate>Mon, 16 Feb 2026 00:00:00 +0000</pubDate>
            <niceDate>February 16, 2026</niceDate>
            <link>https://bernsteinbear.com/blog/toy-tbaa/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/toy-tbaa/</guid>
        </item>
        
        <item>
            <title>A multi-entry CFG design conundrum</title>
            <description>&lt;h2 id=&quot;background-and-bytecode-design&quot;&gt;Background and bytecode design&lt;/h2&gt;

&lt;p&gt;The ZJIT compiler compiles Ruby bytecode (YARV) to machine code. It starts by
transforming the stack machine bytecode into a high-level graph-based
intermediate representation called HIR.&lt;/p&gt;

&lt;p&gt;We use a more or less typical&lt;sup id=&quot;fnref:ebb&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:ebb&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; control-flow graph (CFG) in HIR. We have a
compilation unit, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Function&lt;/code&gt;, which has multiple basic blocks, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Block&lt;/code&gt;. Each
block contains multiple instructions, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Insn&lt;/code&gt;. HIR is always in SSA form, and we
use the variant of SSA with block parameters instead of phi nodes.&lt;/p&gt;

&lt;p&gt;Where it gets weird, though, is our handling of multiple entrypoints. See, YARV
handles default positional parameters (but &lt;em&gt;not&lt;/em&gt; default keyword parameters) by
embedding the code to compute the defaults inside the callee bytecode. Then
callers are responsible for figuring out what offset in the bytecode they
should start running the callee, depending on the number of arguments the
caller provides.&lt;sup id=&quot;fnref:keywords&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:keywords&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;In the following example, we have a function that takes two optional positional
parameters &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;b&lt;/code&gt;. If neither is provided, we start at offset &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0000&lt;/code&gt;. If
just &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; is provided, we start at offset &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0005&lt;/code&gt;. If both are provided, we can
start at offset &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0010&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ ruby --dump=insns -e &apos;def foo(a=compute_a, b=compute_b) = a + b&apos;
...
== disasm: #&amp;lt;ISeq:foo@-e:1 (1,0)-(1,41)&amp;gt;
local table (size: 2, argc: 0 [opts: 2, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 2] a@0&amp;lt;Opt=0&amp;gt; [ 1] b@1&amp;lt;Opt=5&amp;gt;
0000 putself                                                          (   1)
0001 opt_send_without_block   &amp;lt;calldata!mid:compute_a, argc:0, FCALL|VCALL|ARGS_SIMPLE&amp;gt;
0003 setlocal_WC_0            a@0
0005 putself
0006 opt_send_without_block   &amp;lt;calldata!mid:compute_b, argc:0, FCALL|VCALL|ARGS_SIMPLE&amp;gt;
0008 setlocal_WC_0            b@1
0010 getlocal_WC_0            a@0[Ca]
0012 getlocal_WC_0            b@1
0014 opt_plus                 &amp;lt;calldata!mid:+, argc:1, ARGS_SIMPLE&amp;gt;[CcCr]
0016 leave                    [Re]
$
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;(See the jump table debug output: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[ 2] a@0&amp;lt;Opt=0&amp;gt; [ 1] b@1&amp;lt;Opt=5&amp;gt;&lt;/code&gt;)&lt;/p&gt;
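
&lt;p&gt;As a rough model of what the caller has to do, here is a sketch that picks the
starting offset from a table of entry offsets. The table contents come from the
example above; the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;entry_offset&lt;/code&gt; function and its parameters are hypothetical
and are not how CRuby represents this internally.&lt;/p&gt;

```python
# Entry offsets from the example above: computing a starts at 0000,
# computing b starts at 0005, and the method body starts at 0010.
OPT_TABLE = [0, 5, 10]


def entry_offset(opt_table, required_count, argc):
    """Pick the bytecode offset at which the callee should start running."""
    provided_optionals = argc - required_count
    if provided_optionals not in range(len(opt_table)):
        raise TypeError("wrong number of arguments")
    # Skip the default-computing code for every optional that was passed.
    return opt_table[provided_optionals]


for argc in range(3):
    print(argc, entry_offset(OPT_TABLE, 0, argc))
```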

&lt;p&gt;Unlike in Python, where default arguments are evaluated &lt;em&gt;at function creation
time&lt;/em&gt;, Ruby computes the default values &lt;em&gt;at function call time&lt;/em&gt;. This includes
arbitrary function calls, raising exceptions, doing long I/O, or whatever your
heart desires. For this reason, embedding the default code inside the callee
makes a lot of sense; we have a full call frame already set up, so any
optimizations (!), side-exits, exception handling machinery, profiling, etc
don’t need special treatment.&lt;/p&gt;

&lt;p&gt;Since the caller knows what arguments it is passing, and often to what
function, we can efficiently support this in the JIT. We just need to know what
offset in the compiled callee to call into. The interpreter can also call into
the compiled function, which just has a stub to do dispatch to the appropriate
entry block.&lt;/p&gt;

&lt;p&gt;This has led us to design the HIR to support &lt;em&gt;multiple function entrypoints&lt;/em&gt;.
Instead of having just a single entry block, as most control-flow graphs do,
each of our functions now has an array of function entries: one for the
interpreter, at least one for the JIT, and more for default parameter handling.
Each of these entry blocks is separately callable from the outside world.&lt;/p&gt;

&lt;p&gt;Here is what the (slightly cleaned up) HIR looks like for the above example:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Optimized HIR:
fn foo@tmp/branchnil.rb:4:
bb0():
  EntryPoint interpreter
  v1:BasicObject = LoadSelf
  v2:BasicObject = GetLocal :a, l0, SP@5
  v3:BasicObject = GetLocal :b, l0, SP@4
  v4:CPtr = LoadPC
  v5:CPtr[CPtr(0x16d27e908)] = Const CPtr(0x16d282120)
  v6:CBool = IsBitEqual v4, v5
  IfTrue v6, bb2(v1, v2, v3)
  v8:CPtr[CPtr(0x16d27e908)] = Const CPtr(0x16d282120)
  v9:CBool = IsBitEqual v4, v8
  IfTrue v9, bb4(v1, v2, v3)
  Jump bb6(v1, v2, v3)
bb1(v13:BasicObject):
  EntryPoint JIT(0)
  v14:NilClass = Const Value(nil)
  v15:NilClass = Const Value(nil)
  Jump bb2(v13, v14, v15)
bb2(v27:BasicObject, v28:BasicObject, v29:BasicObject):
  v65:HeapObject[...] = GuardType v27, HeapObject[class_exact*:Object@VALUE(0x1043aed00)]
  v66:BasicObject = SendWithoutBlockDirect v65, :compute_a (0x16d282148)
  Jump bb4(v27, v66, v29)
bb3(v18:BasicObject, v19:BasicObject):
  EntryPoint JIT(1)
  v20:NilClass = Const Value(nil)
  Jump bb4(v18, v19, v20)
bb4(v38:BasicObject, v39:BasicObject, v40:BasicObject):
  v69:HeapObject[...] = GuardType v38, HeapObject[class_exact*:Object@VALUE(0x1043aed00)]
  v70:BasicObject = SendWithoutBlockDirect v69, :compute_b (0x16d282148)
  Jump bb6(v38, v39, v70)
bb5(v23:BasicObject, v24:BasicObject, v25:BasicObject):
  EntryPoint JIT(2)
  Jump bb6(v23, v24, v25)
bb6(v49:BasicObject, v50:BasicObject, v51:BasicObject):
  v73:Fixnum = GuardType v50, Fixnum
  v74:Fixnum = GuardType v51, Fixnum
  v75:Fixnum = FixnumAdd v73, v74
  CheckInterrupts
  Return v75
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If you’re not a fan of text HIR, here is an embedded clickable visualization of
HIR thanks to our former intern &lt;a href=&quot;https://aidenfoxivey.com/&quot;&gt;Aiden&lt;/a&gt; porting
Firefox’s &lt;a href=&quot;https://github.com/mozilla-spidermonkey/iongraph&quot;&gt;Iongraph&lt;/a&gt;:&lt;/p&gt;

&lt;iframe width=&quot;100%&quot; height=&quot;400&quot; src=&quot;/assets/zjit-multi-entry-iongraph.html&quot;&gt;&lt;/iframe&gt;

&lt;p&gt;(You might have to scroll sideways and down and zoom around. Or you can &lt;a href=&quot;/assets/zjit-multi-entry-iongraph.html&quot;&gt;open it
in its own window&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;Each entry block also comes with block parameters which mirror the function’s
parameters. These get passed in (roughly) the System V ABI registers.&lt;/p&gt;

&lt;p&gt;This is kind of gross. We have to handle these blocks specially in reverse
post-order (RPO) graph traversal. And, recently, I ran into an even worse case
when trying to implement the Cooper-style “engineered” dominator algorithm: if
we walk backwards in block dominators, the walk is not guaranteed to converge.
All non-entry blocks are dominated by all entry blocks, which are only
dominated by themselves. There is no one “start block”. So what is there to do?&lt;/p&gt;

&lt;h2 id=&quot;the-design-conundrum&quot;&gt;The design conundrum&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Approach 1&lt;/strong&gt; is to keep everything as-is, but handle entry blocks specially
in the dominator algorithm too. I’m not exactly sure what would be needed, but
it seems possible. Most of the existing block infra could be left alone, but
it’s not clear how much this would “spread” within the compiler. What else in
the future might need to be handled specially?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Approach 2&lt;/strong&gt; is to synthesize a super-entry block and make it a predecessor
of every interpreter and JIT entry block. Inside this approach there are two
ways to do it: one (&lt;strong&gt;2.a&lt;/strong&gt;) is to fake it and report some non-existent block.
Another (&lt;strong&gt;2.b&lt;/strong&gt;) is to actually make a block and a new instruction that is a
quasi-jump instruction. In this approach, we would either need to synthesize
fake block arguments for the JIT entry block parameters or add some kind of new
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LoadArg&amp;lt;i&amp;gt;&lt;/code&gt; instruction that reads the argument &lt;em&gt;i&lt;/em&gt; passed in.&lt;/p&gt;

&lt;p&gt;(suggested by Iain Ireland, as seen in the IBM COBOL compiler)&lt;/p&gt;
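
&lt;p&gt;To see why a super-entry block helps, here is a small sketch that adds a
synthetic entry to the example CFG above and computes dominator sets. It uses
the simple iterative set-based formulation, not the Cooper-style algorithm, and
it is not ZJIT’s implementation; the function names and the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;entry&lt;/code&gt; block name are made up for illustration. It assumes every block is
reachable from the synthetic entry.&lt;/p&gt;

```python
def with_super_entry(succs, entries, name="entry"):
    """Return a copy of the CFG with a synthetic block preceding all entries."""
    result = {name: list(entries)}
    result.update(succs)
    return result


def dominators(succs, entry):
    """Compute the full dominator set of every block by iterating to a fixpoint."""
    blocks = list(succs)
    preds = {block: [] for block in blocks}
    for block, targets in succs.items():
        for target in targets:
            preds[target].append(block)
    # Start from "everything dominates everything" and shrink.
    dom = {block: set(blocks) for block in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for block in blocks:
            if block == entry:
                continue
            new = set.intersection(*(dom[p] for p in preds[block])) | {block}
            if new != dom[block]:
                dom[block] = new
                changed = True
    return dom


# Edges transcribed from the HIR dump above.
cfg = {
    "bb0": ["bb2", "bb4", "bb6"],
    "bb1": ["bb2"],
    "bb2": ["bb4"],
    "bb3": ["bb4"],
    "bb4": ["bb6"],
    "bb5": ["bb6"],
    "bb6": [],
}
dom = dominators(with_super_entry(cfg, ["bb0", "bb1", "bb3", "bb5"]), "entry")
for block in sorted(dom):
    print(block, sorted(dom[block]))
```

&lt;p&gt;With the synthetic block in place, every block is dominated by a single start
block, so a backwards walk through dominators always has somewhere to stop.&lt;/p&gt;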

&lt;p&gt;&lt;strong&gt;Approach 3&lt;/strong&gt; is to duplicate the entire CFG per entrypoint. This would return
us to having one entry block per CFG at the expense of code duplication. It
handles the problem pretty cleanly but then &lt;em&gt;forces&lt;/em&gt; code duplication. I think
I want the duplication to be opt-in instead of having it be the only way we
support multiple entrypoints. What if it increases memory too much? The
specialization probably would make the generated code faster, though.&lt;/p&gt;

&lt;p&gt;(suggested by Ben Titzer)&lt;/p&gt;

&lt;p&gt;None of these approaches feel great to me. The probable candidate is &lt;strong&gt;2.b&lt;/strong&gt;
where we have &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LoadArg&lt;/code&gt; instructions. That gives us flexibility to also later
add full specialization without forcing it.&lt;/p&gt;

&lt;p&gt;Cameron Zwarich also notes that this is an analogue to the common problem
people have when implementing the reverse: postdominators. This is because
often functions have multiple return IR instructions. He notes the usual
solution is to transform them into branches to a single return instruction.&lt;/p&gt;

&lt;p&gt;Do you have this problem? What does your compiler do?&lt;/p&gt;

&lt;h2 id=&quot;update-a-conclusion&quot;&gt;Update: a conclusion&lt;/h2&gt;

&lt;p&gt;We have decided to go with &lt;a href=&quot;https://github.com/ruby/ruby/pull/16200&quot;&gt;the superblock
approach&lt;/a&gt;.&lt;/p&gt;
&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:ebb&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;We use extended basic blocks (EBBs), but this doesn’t matter for this
post. It makes dominators and predecessors slightly more complicated (now
you have dominating &lt;em&gt;instructions&lt;/em&gt;), but that’s about it as far as I can
tell. We’ll see how they fare in the face of more complicated analysis
later. &lt;a href=&quot;#fnref:ebb&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:keywords&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Keyword parameters have some mix of caller/callee presence checks
in the callee because they are passed in un-ordered. The caller handles
simple constant defaults whereas the callee handles anything that may
raise. Check out &lt;a href=&quot;https://kddnewton.com/2022/12/17/advent-of-yarv-part-17&quot;&gt;Kevin Newton’s awesome overview&lt;/a&gt;. &lt;a href=&quot;#fnref:keywords&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
            <pubDate>Thu, 22 Jan 2026 00:00:00 +0000</pubDate>
            <niceDate>January 22, 2026</niceDate>
            <link>https://bernsteinbear.com/blog/multiple-entry/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/multiple-entry/</guid>
        </item>
        
        <item>
            <title>The GDB JIT interface</title>
            <description>&lt;p&gt;GDB is great for stepping through machine code to figure out what is going on.
It uses debug information under the hood to present you with a tidy backtrace
and also determine how much machine code to print when you type &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;disassemble&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This debug information comes from your compiler. Clang, GCC, rustc, etc all
produce debug data in a format called &lt;a href=&quot;https://dwarfstd.org/&quot;&gt;DWARF&lt;/a&gt; and then embed that debug
information inside the binary (ELF, Mach-O, …) when you do &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-ggdb&lt;/code&gt; or
equivalent.&lt;/p&gt;

&lt;p&gt;Unfortunately, this means that by default, GDB has no idea what is going on if
you break in a JIT-compiled function. You can step instruction-by-instruction
and whatnot, but that’s about it. This is because the current instruction
pointer is nowhere to be found in any of the existing debug info tables from
the host runtime code, so your terminal is filled with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;???&lt;/code&gt;. See this example
from the V8 docs:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#8  0x08281674 in v8::internal::Runtime_SetProperty (args=...) at src/runtime.cc:3758
#9  0xf5cae28e in ?? ()
#10 0xf5cc3a0a in ?? ()
#11 0xf5cc38f4 in ?? ()
#12 0xf5cbef19 in ?? ()
#13 0xf5cb09a2 in ?? ()
#14 0x0809e0a5 in v8::internal::Invoke (...) at src/execution.cc:97
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Fortunately, there is a &lt;em&gt;JIT interface&lt;/em&gt; to GDB. If you implement a couple of
functions in your JIT and run them every time you finish compiling a function,
you can get the debugging niceties for your JIT code too. See again a V8
example:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#6  0x082857fc in v8::internal::Runtime_SetProperty (args=...) at src/runtime.cc:3758
#7  0xf5cae28e in ?? ()
#8  0xf5cc3a0a in loop () at test.js:6
#9  0xf5cc38f4 in test.js () at test.js:13
#10 0xf5cbef19 in ?? ()
#11 0xf5cb09a2 in ?? ()
#12 0x0809e1f9 in v8::internal::Invoke (...) at src/execution.cc:97
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Unfortunately, the GDB docs are &lt;a href=&quot;https://sourceware.org/gdb/current/onlinedocs/gdb.html/JIT-Interface.html&quot;&gt;somewhat sparse&lt;/a&gt;. So I went
spelunking through a bunch of different projects to try and understand what is
going on.&lt;/p&gt;

&lt;h2 id=&quot;the-big-picture-and-the-old-interface&quot;&gt;The big picture (and the old interface)&lt;/h2&gt;

&lt;p&gt;GDB expects your runtime to expose a function called
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_register_code&lt;/code&gt; and a global variable called
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_descriptor&lt;/code&gt;. GDB automatically adds its own internal breakpoints
at this function, if it exists. Then, when you compile code, you call this
function from your runtime.&lt;/p&gt;

&lt;p&gt;In slightly more detail:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Compile a function in your JIT compiler. This gives you a function name,
maybe other metadata, an executable code address, and a code size&lt;/li&gt;
  &lt;li&gt;Generate an &lt;em&gt;entire&lt;/em&gt; ELF/Mach-O/… object in-memory (!) for that one
function, describing its name, code region, maybe other DWARF metadata such
as line number maps&lt;/li&gt;
  &lt;li&gt;Write a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jit_code_entry&lt;/code&gt; linked list node that points at your object
(“symfile”)&lt;/li&gt;
  &lt;li&gt;Link it into the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_descriptor&lt;/code&gt; linked list&lt;/li&gt;
  &lt;li&gt;Call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_register_code&lt;/code&gt;, which gives GDB control of the process so it can
pick up the new function’s metadata&lt;/li&gt;
  &lt;li&gt;Optionally, break into (or crash inside) one of your JITed functions&lt;/li&gt;
  &lt;li&gt;At some point, later, when your function gets GCed, unregister your code by
editing the linked list and calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_register_code&lt;/code&gt; again&lt;/li&gt;
&lt;/ol&gt;
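
&lt;p&gt;Steps 3 and 4 can be sketched as follows. This is only a model: it mirrors the
two structs from the GDB documentation using Python’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ctypes&lt;/code&gt; and shows the
linked-list bookkeeping, but GDB cannot see it, because a real runtime has to
export the descriptor and the registration function as native symbols. The
class names, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;register&lt;/code&gt; helper, and the placeholder symfile bytes are
hypothetical.&lt;/p&gt;

```python
import ctypes

# Values of jit_actions_t in the GDB documentation.
JIT_NOACTION, JIT_REGISTER_FN, JIT_UNREGISTER_FN = 0, 1, 2


class JitCodeEntry(ctypes.Structure):
    pass


# Mirrors struct jit_code_entry; fields are set afterwards because the
# struct refers to itself.
JitCodeEntry._fields_ = [
    ("next_entry", ctypes.POINTER(JitCodeEntry)),
    ("prev_entry", ctypes.POINTER(JitCodeEntry)),
    ("symfile_addr", ctypes.c_char_p),
    ("symfile_size", ctypes.c_uint64),
]


class JitDescriptor(ctypes.Structure):
    # Mirrors struct jit_descriptor.
    _fields_ = [
        ("version", ctypes.c_uint32),
        ("action_flag", ctypes.c_uint32),
        ("relevant_entry", ctypes.POINTER(JitCodeEntry)),
        ("first_entry", ctypes.POINTER(JitCodeEntry)),
    ]


def register(descriptor, entry):
    """Link entry at the head of the list and mark it as newly registered."""
    old_head = descriptor.first_entry
    entry.next_entry = old_head
    if old_head:  # A NULL pointer is falsy.
        old_head.contents.prev_entry = ctypes.pointer(entry)
    descriptor.first_entry = ctypes.pointer(entry)
    descriptor.relevant_entry = ctypes.pointer(entry)
    descriptor.action_flag = JIT_REGISTER_FN
    # A real runtime would now call __jit_debug_register_code().


descriptor = JitDescriptor(version=1)
symfile = b"placeholder bytes standing in for an in-memory object file"
entry = JitCodeEntry(symfile_addr=symfile, symfile_size=len(symfile))
register(descriptor, entry)
print(descriptor.action_flag, descriptor.first_entry.contents.symfile_size)
```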

&lt;p&gt;This is why you see compiler projects such as V8 including large swaths of code
just to make object files:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/v8/v8/blob/5668ed57de1c7c8dd5c3dc1598bf071e17d29c8c/src/diagnostics/gdb-jit.cc&quot;&gt;V8&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/facebookincubator/cinderx/blob/e6e925b20e6fa3fe1e100f147e1c8cd03076ebfb/cinderx/Jit/jit_gdb_support.cpp&quot;&gt;Cinder&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/zendtech/php-src/blob/f82e5b3abe1ff1d3ffc7954b0810bc584fd650a5/ext/opcache/jit/zend_jit_gdb.c#L473&quot;&gt;Zend PHP&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/dotnet/runtime/blob/3c040478f19e0f317790acab05dbe3ada9f52dc4/src/coreclr/vm/gdbjit.cpp&quot;&gt;CoreCLR/.NET&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/qemu/qemu/blob/942b0d378a1de9649085ad6db5306d5b8cef3591/tcg/tcg.c#L7064&quot;&gt;QEMU&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/WebKit/WebKit/blob/0afc2a867ab45651ac6c353c7b6ade5482b7bba7/Source/JavaScriptCore/jit/GdbJIT.cpp&quot;&gt;JavaScriptCore&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/LuaJIT/LuaJIT/blob/7152e15489d2077cd299ee23e3d51a4c599ab14f/src/lj_gdbjit.c&quot;&gt;LuaJIT&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/LineageOS/android_art/blob/8ce603e0c68899bdfbc9cd4c50dcc65bbf777982/runtime/jit/debugger_interface.cc#L187&quot;&gt;ART&lt;/a&gt;
    &lt;ul&gt;
      &lt;li&gt;which looks like it does something smart about grouping the JIT code
entries together (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RepackEntries&lt;/code&gt;), but I’m not sure exactly what it does&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/facebook/hhvm/blob/b1c47dcfbc574b508fd084f27ba4a06bcf4ba188/hphp/runtime/vm/debug/elfwriter.cpp#L622&quot;&gt;HHVM&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/TomatOrg/TomatoDotNet/blob/80266bb8dc0e7f0644f0638ecd98dfad4fb74427/src/dotnet/jit/gdb.c&quot;&gt;TomatoDotNet&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/jatovm/jato/blob/bb1c7d4fd987e016b2e0379182c4bfbb8c1c1a78/jit/elf.c#L164&quot;&gt;Jato JVM&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://gist.github.com/yyny/4a012029b5889853c18b1efc19bb598e&quot;&gt;a minimal example&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/sisshiki1969/jit-debug/blob/213c72512761f815fc0b067ce68ee0ae12962e2a/src/main.rs&quot;&gt;monoruby&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/mono/mono/blob/0f53e9e151d92944cacab3e24ac359410c606df6/mono/mini/dwarfwriter.c&quot;&gt;Mono&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;It looks like Dart &lt;a href=&quot;https://github.com/dart-lang/sdk/commit/c4238c71da13d61ff32332058d371c5b2e92694b&quot;&gt;used to&lt;/a&gt;
have support for this but has since removed it&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/bytecodealliance/wasmtime/blob/b5272a5f103053f5ada2a38d5302a8d1e2de442d/crates/wasmtime/src/runtime/code_memory.rs#L509&quot;&gt;wasmtime&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because this is a huge hassle, GDB also has a newer interface that does not
require making an ELF/Mach-O/…+DWARF object.&lt;/p&gt;

&lt;h2 id=&quot;custom-debug-info-the-new-interface&quot;&gt;Custom debug info (the new interface)&lt;/h2&gt;

&lt;p&gt;This new interface requires writing a binary format of your choice. You make
the writer and you make the reader. Then, when you are in GDB, you load your
reader as a shared object.&lt;/p&gt;

&lt;p&gt;The reader must implement &lt;a href=&quot;https://sourceware.org/gdb/current/onlinedocs/gdb.html/Writing-JIT-Debug-Info-Readers.html#Writing-JIT-Debug-Info-Readers&quot;&gt;the interface specified by GDB&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;GDB_DECLARE_GPL_COMPATIBLE_READER&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;extern&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gdb_reader_funcs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;gdb_init_reader&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gdb_reader_funcs&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;cm&quot;&gt;/* Must be set to GDB_READER_INTERFACE_VERSION.  */&lt;/span&gt;
  &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reader_version&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

  &lt;span class=&quot;cm&quot;&gt;/* For use by the reader.  */&lt;/span&gt;
  &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;priv_data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

  &lt;span class=&quot;n&quot;&gt;gdb_read_debug_info&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;gdb_unwind_frame&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unwind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;gdb_get_frame_id&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_frame_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;gdb_destroy_reader&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;destroy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;read&lt;/code&gt; function pointer does the bulk of the work and is responsible for
matching code ranges to function names, line numbers, and more.&lt;/p&gt;

&lt;p&gt;Here are &lt;a href=&quot;https://pwparchive.wordpress.com/2011/11/20/new-jit-interface-for-gdb/&quot;&gt;some details from Sanjoy Das&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Only a few runtimes implement this interface. Most of them stub out the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unwind&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;get_frame_id&lt;/code&gt; function pointers:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/ykjit/yk/blob/755e533aa74ef5fa82a6586147727e23146b95fc/ykrt/src/compile/jitc_yk/gdb.rs#L216&quot;&gt;yk write&lt;/a&gt; &lt;br /&gt;
&lt;a href=&quot;https://github.com/ykjit/yk/blob/755e533aa74ef5fa82a6586147727e23146b95fc/ykrt/yk_gdb_plugin/yk_gdb_plugin.c#L22&quot;&gt;yk read&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/tetzank/asmjit-utilities/blob/2fdbb99f7e002df4f8d7aa97c29910743adfc991/gdb/gdbjit.cpp&quot;&gt;asmjit-utilities write&lt;/a&gt; &lt;br /&gt;
&lt;a href=&quot;https://github.com/tetzank/asmjit-utilities/blob/2fdbb99f7e002df4f8d7aa97c29910743adfc991/gdb/jit-reader/gdbjit-reader.c&quot;&gt;asmjit-utilities read&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/erlang/otp/blob/28a44634fb04b95ea666abb8aac7254e2c87ae05/erts/emulator/beam/jit/beam_jit_metadata.cpp#L123&quot;&gt;Erlang/OTP write&lt;/a&gt; &lt;br /&gt;
&lt;a href=&quot;https://github.com/erlang/otp-gdb-tools/blob/7b864f58c534699e4124e31ecfda86041b941037/jit-reader.c&quot;&gt;Erlang/OTP read&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/FEX-Emu/FEX/blob/c8d72eabe589392b962bec94d002c5ffdb7381c2/FEXCore/Source/Interface/GDBJIT/GDBJIT.cpp#L110&quot;&gt;FEX write&lt;/a&gt; &lt;br /&gt;
&lt;a href=&quot;https://github.com/FEX-Emu/FEX/blob/c8d72eabe589392b962bec94d002c5ffdb7381c2/Source/Tools/FEXGDBReader/FEXGDBReader.cpp#L8&quot;&gt;FEX read&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/bullno1/buxn-jit/blob/69effb96d5fe9725258fe367efcefd6911ef32fd/src/gdb/hook.c&quot;&gt;buxn-jit write&lt;/a&gt; &lt;br /&gt;
&lt;a href=&quot;https://github.com/bullno1/buxn-jit/blob/69effb96d5fe9725258fe367efcefd6911ef32fd/src/gdb/reader.c&quot;&gt;buxn-jit read&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/KreitinnSoftware/box64/blob/f224a93cc83f9da34bc85ebb5414168d476a135d/src/tools/gdbjit.c#L45&quot;&gt;box64 write&lt;/a&gt; &lt;br /&gt;
&lt;a href=&quot;https://github.com/KreitinnSoftware/box64/blob/f224a93cc83f9da34bc85ebb5414168d476a135d/gdbjit/reader.c&quot;&gt;box64 read&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/no-defun-allowed/ccl/blob/094a9ec5bf203db118e0ffc8ce2b5b80fc1c91dd/lisp-kernel/gdb.c&quot;&gt;ccl write&lt;/a&gt; &lt;br /&gt;
&lt;a href=&quot;https://gist.github.com/no-defun-allowed/32d38c5e664586c724cf2e0e97f0d2b1&quot;&gt;ccl read&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I think it also requires at least the reader to declare that it is released
under a GPL-compatible license via the macro &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GDB_DECLARE_GPL_COMPATIBLE_READER&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Since I wrote about the &lt;a href=&quot;/blog/jit-perf-map/&quot;&gt;perf map interface&lt;/a&gt; recently, I
have it on my mind. Why can’t we reuse it in GDB?&lt;/p&gt;

&lt;h2 id=&quot;adapting-to-the-linux-perf-interface&quot;&gt;Adapting to the Linux perf interface&lt;/h2&gt;

&lt;p&gt;I suppose it would be possible to try and upstream a patch to GDB to support
the Linux perf map interface for JITs. After all, why shouldn’t it be able to
automatically pick up symbols from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/tmp/perf-...&lt;/code&gt;? That would be great
baseline debug info for “free”.&lt;/p&gt;

&lt;p&gt;In the meantime, maybe it is reasonable to create a re-usable custom debug
reader:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;When registering code, write the address and name to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/tmp/perf-...&lt;/code&gt; as you normally would&lt;/li&gt;
  &lt;li&gt;Write the filename as the symfile (does this make &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/tmp&lt;/code&gt; the magic number?)&lt;/li&gt;
  &lt;li&gt;Have the debug info reader just parse the perf map file&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It would be less flexible than what either the DWARF or the custom reader
approach supports: it would only be able to handle symbol names and code
regions. No embedding source code for GDB to display in your debugger. But
maybe that is okay for a partial solution?&lt;/p&gt;
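
&lt;p&gt;As a rough illustration of the parsing step, here is a minimal Python sketch
(not the C reader that GDB would actually load). It assumes the usual perf map
line layout of hexadecimal start address, hexadecimal size, and symbol name.
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Symbol&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;parse_perf_map&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lookup&lt;/code&gt; are hypothetical names, and
the addresses in the example are made up.&lt;/p&gt;

```python
from typing import NamedTuple


class Symbol(NamedTuple):
    start: int  # first address of the JIT-compiled code region
    size: int   # length of the region in bytes
    name: str   # symbol name; may contain spaces


def parse_perf_map(text):
    """Parse perf map text: one 'START SIZE name' entry per line, hex numbers."""
    symbols = []
    for line in text.splitlines():
        # Split on the first two runs of whitespace only, so that names
        # containing spaces stay in one piece.
        parts = line.strip().split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start, size, name = parts
        symbols.append(Symbol(int(start, 16), int(size, 16), name))
    return symbols


def lookup(symbols, pc):
    """Return the name of the region containing pc, or None if there is none."""
    for sym in symbols:
        if pc in range(sym.start, sym.start + sym.size):
            return sym.name
    return None


# Made-up example contents of a /tmp/perf-... file.
example = "7f0000001000 40 jit:add\n7f0000001040 1c jit:fib (tier 1)\n"
symbols = parse_perf_map(example)
print(lookup(symbols, 0x7f0000001050))  # prints: jit:fib (tier 1)
```

&lt;p&gt;A real reader would do the same work in C inside its &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;read&lt;/code&gt; callback
and hand each region to GDB instead of returning a list.&lt;/p&gt;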

&lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt; Here is &lt;a href=&quot;https://github.com/tekknolagi/gdb-jit-linux-perf-map&quot;&gt;my small attempt&lt;/a&gt;
at such a plugin.&lt;/p&gt;

&lt;h2 id=&quot;the-n-squared-problem&quot;&gt;The n-squared problem&lt;/h2&gt;

&lt;p&gt;V8 notes in their &lt;a href=&quot;https://v8.dev/docs/gdb-jit&quot;&gt;GDB JIT docs&lt;/a&gt; that because the JIT interface is
a linked list and we only keep a pointer to the head, we get O(n&lt;sup&gt;2&lt;/sup&gt;)
behavior. Bummer. This becomes especially noticeable since they register
additional code objects not just for functions, but also trampolines, cache
stubs, etc.&lt;/p&gt;

&lt;h2 id=&quot;garbage-collection&quot;&gt;Garbage collection&lt;/h2&gt;

&lt;p&gt;Since GDB expects the code pointer in your symbol object file not to move, you
have to make sure to have a stable symbol file pointer and stable executable
code pointer. To make this happen, V8 disables its moving GC.&lt;/p&gt;

&lt;p&gt;Additionally, if your compiled function gets collected, you have to make sure
to unregister the function. Instead of doing this eagerly, ART treats the GDB
JIT linked list as a weakref and periodically removes dead code entries from
it.&lt;/p&gt;
</description>
            <pubDate>Tue, 30 Dec 2025 00:00:00 +0000</pubDate>
            <niceDate>December 30, 2025</niceDate>
            <link>https://bernsteinbear.com/blog/gdb-jit/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/gdb-jit/</guid>
        </item>
        
        <item>
            <title>Load and store forwarding in the Toy Optimizer</title>
            <description>&lt;p&gt;&lt;em&gt;Another entry in the &lt;a href=&quot;https://pypy.org/categories/toy-optimizer.html&quot;&gt;Toy Optimizer series&lt;/a&gt;&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;A long, long time ago (two years!) &lt;a href=&quot;https://cfbolz.de/&quot;&gt;CF Bolz-Tereick&lt;/a&gt; and I made a
&lt;a href=&quot;https://www.youtube.com/watch?v=w-UHg0yOPSE&quot;&gt;video&lt;/a&gt; and an accompanying &lt;a href=&quot;https://gist.github.com/tekknolagi/4e3fa26d350f6d3b39ede40d372b97fe&quot;&gt;GitHub Gist&lt;/a&gt;
about load/store forwarding (also called load elimination) in the Toy Optimizer. I
said I would write a blog post about it, but never found the time—it got lost
amid a sea of large life changes.&lt;/p&gt;

&lt;p&gt;It’s a neat idea: do an abstract interpretation over the trace, modeling the
heap at compile-time, eliminating redundant loads and stores. That means it’s
possible to optimize traces like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;v0 = ...
v1 = load(v0, 5)
v2 = store(v0, 6, 123)
v3 = load(v0, 6)
v4 = load(v0, 5)
v5 = do_something(v1, v3, v4)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;into traces like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;v0 = ...
v1 = load(v0, 5)
v2 = store(v0, 6, 123)
v5 = do_something(v1, 123, v1)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;(where &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;load(v0, 5)&lt;/code&gt; is equivalent to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;*(v0+5)&lt;/code&gt; in C syntax and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;store(v0, 6,
123)&lt;/code&gt; is equivalent to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;*(v0+6)=123&lt;/code&gt; in C syntax)&lt;/p&gt;

&lt;p&gt;This indicates that we were able to eliminate two redundant loads by keeping
around information about previous loads and stores. Let’s get to work making
this possible.&lt;/p&gt;

&lt;h2 id=&quot;the-usual-infrastructure&quot;&gt;The usual infrastructure&lt;/h2&gt;

&lt;p&gt;We’ll start off with the usual infrastructure from the &lt;a href=&quot;https://pypy.org/categories/toy-optimizer.html&quot;&gt;Toy
Optimizer series&lt;/a&gt;: a very &lt;a href=&quot;https://wiki.c2.com/?StringlyTyped&quot;&gt;stringly-typed&lt;/a&gt; representation of a
&lt;a href=&quot;https://gist.github.com/tekknolagi/4e3fa26d350f6d3b39ede40d372b97fe#file-port-py-L4-L112&quot;&gt;trace-based SSA IR&lt;/a&gt; and a union-find rewrite mechanism.&lt;/p&gt;
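
&lt;p&gt;For readers who have not seen the earlier posts, the rewrite mechanism can be
sketched roughly as follows. This is a condensed approximation written for this
explanation, not the code from the Gist: the class and method names mirror the
ones used in the snippets in this post, but the bodies are assumptions and the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Block&lt;/code&gt; builder is left out.&lt;/p&gt;

```python
class Value:
    def find(self):
        # By default a value is its own representative.
        return self


class Constant(Value):
    def __init__(self, value):
        self.value = value


class Operation(Value):
    def __init__(self, name, args):
        self.name = name
        self.args = args
        self.forwarded = None  # union-find parent pointer; None at a root

    def find(self):
        # Follow forwarding pointers until we reach the representative.
        op = self
        while isinstance(op, Operation) and op.forwarded is not None:
            op = op.forwarded
        return op

    def arg(self, index):
        # Always read arguments through find() so earlier rewrites are visible.
        return self.args[index].find()

    def make_equal_to(self, value):
        # Redirect this operation's representative to another value.
        root = self.find()
        assert isinstance(root, Operation), "cannot forward a constant"
        root.forwarded = value


# Two loads from the same place; the second one gets rewritten to the first.
arg0 = Operation("getarg", [Constant(0)])
first = Operation("load", [arg0, Constant(0)])
second = Operation("load", [arg0, Constant(0)])
user = Operation("escape", [second])
second.make_equal_to(first)
print(user.arg(0) is first)  # True
```

&lt;p&gt;The point of the indirection is that a pass never has to patch up the users
of a rewritten operation: anyone who reads an argument through
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;arg&lt;/code&gt; sees the replacement automatically.&lt;/p&gt;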

&lt;p&gt;This means we can start writing a new optimization pass and our first test:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# TODO: copy an optimized version of bb into opt_bb
&lt;/span&gt;    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_two_loads&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb_to_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;var0 = getarg(0)
var1 = load(var0, 0)
var2 = escape(var1)
var3 = escape(var1)&quot;&quot;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This test is asserting that we can remove duplicate loads. Why load twice if we
can cache the result? Let’s make that happen.&lt;/p&gt;

&lt;h2 id=&quot;caching-loads&quot;&gt;Caching loads&lt;/h2&gt;

&lt;p&gt;To do this, we’ll model the heap at compile-time. When I say “model”, I
mean that we will have an imprecise but correct abstract representation of the
heap: we don’t (and can’t) have knowledge of every value, but we can know for
sure that some addresses have certain values.&lt;/p&gt;

&lt;p&gt;For example, if we have observed a load from object &lt;em&gt;O&lt;/em&gt; at offset &lt;em&gt;8&lt;/em&gt;, say &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v0 =
load(O, 8)&lt;/code&gt;, we know that the SSA value &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v0&lt;/code&gt; is at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;heap[(O, 8)]&lt;/code&gt;. That sounds
tautological, but it’s not. Future loads can make use of this information.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Operation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;isinstance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Constant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Stores things we know about the heap at... compile-time.
&lt;/span&gt;    &lt;span class=&quot;c1&quot;&gt;# Key: an object and an offset pair acting as a heap address
&lt;/span&gt;    &lt;span class=&quot;c1&quot;&gt;# Value: a previous SSA value we know exists at that address
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;load&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;previous&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;previous&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;make_equal_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;previous&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This pass records information about loads and uses the result of a previous
cached load operation if available. We treat the pair of (SSA value, offset) as
an address into our abstract heap.&lt;/p&gt;

&lt;p&gt;That’s great! If you run our simple test, it should now pass. But what happens
if we store into that address before the second load? Oops…&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_store_to_same_object_offset_invalidates_load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var3&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb_to_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;var0 = getarg(0)
var1 = load(var0, 0)
var2 = store(var0, 0, 5)
var3 = load(var0, 0)
var4 = escape(var1)
var5 = escape(var3)&quot;&quot;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This test fails because we are incorrectly keeping around &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;var1&lt;/code&gt; in our
abstract heap. We need to get rid of it and not replace &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;var3&lt;/code&gt; with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;var1&lt;/code&gt;.&lt;/p&gt;

&lt;h2 id=&quot;invalidating-cached-loads&quot;&gt;Invalidating cached loads&lt;/h2&gt;

&lt;p&gt;So it turns out we have to also model stores in order to cache loads correctly.
One valid, albeit aggressive, way to do that is to throw away all the
information we know at each store operation:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;store&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;clear&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;load&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;c1&quot;&gt;# ...
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That makes our test pass—yay!—but at great cost. It means any store
operation mucks up redundant loads. In our world where we frequently read from
and write to objects, this is what we call a huge bummer.&lt;/p&gt;

&lt;p&gt;For example, a store to offset 4 on some object should never interfere with a
load from a different offset on the same object&lt;sup id=&quot;fnref:size&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:size&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. We should be able to
keep our load from offset 0 cached here:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_store_to_same_object_different_offset_does_not_invalidate_load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var3&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb_to_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;var0 = getarg(0)
var1 = load(var0, 0)
var2 = store(var0, 4, 5)
var3 = escape(var1)
var4 = escape(var1)&quot;&quot;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We could try instead checking if our specific (object, offset) pair is in the
heap and only removing cached information about that offset and that object.
That would definitely help!&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;store&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;del&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;load&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;c1&quot;&gt;# ...
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It makes our test pass, too, which is great news.&lt;/p&gt;

&lt;p&gt;Unfortunately, this runs into problems due to aliasing: it’s entirely possible
that our compile-time heap could contain a pair &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(v0, 0)&lt;/code&gt; and a pair &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(v1, 0)&lt;/code&gt; where &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v0&lt;/code&gt;
and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v1&lt;/code&gt; are the same object (but not known to the optimizer). A store through
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v1&lt;/code&gt; at offset 0 would then delete only the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(v1, 0)&lt;/code&gt; entry and leave a stale
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(v0, 0)&lt;/code&gt; entry behind. A later load through &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v0&lt;/code&gt; would be incorrectly replaced
with the old value, all because the optimizer doesn’t know our abstract
addresses &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(v0, 0)&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(v1, 0)&lt;/code&gt; are actually the same pointer at run-time.&lt;/p&gt;

&lt;p&gt;Clearing only exact matches breaks the rules of abstract interpretation: our
abstract interpreter has to correctly model &lt;em&gt;all&lt;/em&gt; possible outcomes at
run-time. So we should instead pick some tactic in between clearing all
information (correct but over-eager) and clearing only exact matches of
object+offset (incorrect).&lt;/p&gt;
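
&lt;p&gt;To see the exact-match failure concretely, here is a minimal sketch that
models the compile-time heap as a plain dict, with strings standing in for SSA
values. The helper &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;invalidate_exact&lt;/code&gt; is a made-up name for illustration and is not
part of the optimizer in this post.&lt;/p&gt;

```python
# Hypothetical model: strings stand in for SSA values, and the compile-time
# heap maps (object, offset) pairs to the value last known to live there.
def invalidate_exact(heap, obj, offset):
    """Drop only the entry for this exact (object, offset) pair."""
    return {key: value for key, value in heap.items() if key != (obj, offset)}


# Two earlier loads cached what lives at offset 0 of v0 and of v1.
heap = {("v0", 0): "a", ("v1", 0): "b"}

# Now the program stores to v1 at offset 0.
heap = invalidate_exact(heap, "v1", 0)

# The entry for v0 survives. If v0 and v1 are the same object at run-time,
# that entry is stale and a later load(v0, 0) would be wrongly replaced.
print(heap)  # prints {('v0', 0): 'a'}
```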

&lt;p&gt;The concept that will help us here is the &lt;em&gt;alias class&lt;/em&gt;. Alias classes are a
way to efficiently partition the locations in your abstract heap into completely
disjoint sets. Writes to any location in one class never affect locations in
another class.&lt;/p&gt;

&lt;p&gt;Our very scrappy alias classes will be just based on the offset: each offset is
a different alias class. If we write to any object at offset K, we have to
invalidate all of our compile-time offset K knowledge—even if it’s for
another object. This is a nice middle ground, and it’s possible because our
(made up) object system guarantees that distinct objects do not overlap, and
also that we are not writing out-of-bounds.&lt;sup id=&quot;fnref:tbaa&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:tbaa&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;So let’s remove all of the entries from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compile_time_heap&lt;/code&gt; where the offset
matches the offset in the current &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;store&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;store&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;items&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;load&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;c1&quot;&gt;# ...
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Great! Now our test passes.&lt;/p&gt;
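
&lt;p&gt;The same rule can be tried outside the optimizer. This is only a sketch on a
plain dict, with strings standing in for SSA values; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;invalidate_offset&lt;/code&gt; is a
made-up helper name.&lt;/p&gt;

```python
# Hypothetical helper: each offset is treated as its own alias class.
def invalidate_offset(heap, offset):
    """Drop every entry at this offset, whatever the object."""
    return {key: value for key, value in heap.items() if key[1] != offset}


heap = {("v0", 0): "a", ("v1", 0): "b", ("v0", 4): "c"}

# A store to any object at offset 0 clears all offset-0 knowledge, so it does
# not matter whether v0 and v1 alias. The offset-4 entry is untouched.
heap = invalidate_offset(heap, 0)
print(heap)  # prints {('v0', 4): 'c'}
```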

&lt;p&gt;This concludes the load optimization section of the post. We have modeled
enough of loads and stores that we can eliminate redundant loads. Very cool.
But we can go further.&lt;/p&gt;

&lt;h2 id=&quot;caching-stores&quot;&gt;Caching stores&lt;/h2&gt;

&lt;p&gt;Stores don’t just invalidate information. They also give us new information!
Any time we see an operation of the form &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v1 = store(v0, 8, 5)&lt;/code&gt; we also learn
that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;load(v0, 8) == 5&lt;/code&gt;! Until it gets invalidated, anyway.&lt;/p&gt;

&lt;p&gt;For example, in this test, we can eliminate the load from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;var0&lt;/code&gt; at offset 0:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_load_after_store_removed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb_to_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;var0 = getarg(0)
var1 = store(var0, 0, 5)
var2 = load(var0, 1)
var3 = escape(5)
var4 = escape(var2)&quot;&quot;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Making that work is thankfully not very hard; we need only add that new
information to the compile-time heap after removing all the
potentially-aliased info:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;store&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# ... as before ...
&lt;/span&gt;            &lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;new_value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_value&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# NEW!
&lt;/span&gt;        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;load&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;c1&quot;&gt;# ...
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This makes the test pass. It makes another test fail, but only
because—oops—we now know more. You can delete the old test because the new
test supersedes it.&lt;/p&gt;
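
&lt;p&gt;The order of the two steps matters. Here is a standalone sketch on a plain
dict, with strings and integers standing in for SSA values and constants;
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;apply_store&lt;/code&gt; is a hypothetical name, not a function from the optimizer.&lt;/p&gt;

```python
# Hypothetical model of what a store does to the compile-time heap.
def apply_store(heap, obj, offset, value):
    """Forget the offset's alias class, then record the new fact."""
    heap = {key: old for key, old in heap.items() if key[1] != offset}
    heap[(obj, offset)] = value
    return heap


heap = {("v0", 0): "a", ("v1", 0): "b", ("v1", 8): "c"}
heap = apply_store(heap, "v0", 0, 5)

# The possibly-aliased (v1, 0) entry is gone, the other offset survives, and a
# later load(v0, 0) can be answered with 5 at compile time.
print(heap.get(("v0", 0)))  # prints 5
print(heap.get(("v1", 0)))  # prints None
```

&lt;p&gt;Recording first and invalidating second would throw away the fact we just
learned, since the new entry lives in the very alias class being cleared.&lt;/p&gt;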

&lt;p&gt;Now, note that we are not removing the store. This is because we have nothing
in our optimizer that keeps track of what might have observed the side-effects
of the store. What if the object got &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;escape&lt;/code&gt;d? Or someone did a load later on?
We would only be able to remove the store (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;continue&lt;/code&gt;) if we could guarantee it
was not observable.&lt;/p&gt;

&lt;p&gt;In our current framework, this only happens in one case: someone is doing a
store of the exact same value that already exists in our compile-time heap.
That is, either the same constant, or the same SSA value. If we see this, then
we can completely skip the second store instruction.&lt;/p&gt;

&lt;p&gt;Here’s a test case for that, where we have gained information from the load
instruction that we can then use to get rid of the store instruction:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_load_then_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;arg1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb_to_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;var0 = getarg(0)
var1 = load(var0, 0)
var2 = escape(var1)&quot;&quot;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Let’s make it pass. To do that, first we’ll make an equality function that
works for both constants and operations. Constants are equal if their values
are equal, and operations are equal only if they are the identical operation
(by address/pointer).&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;eq_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;isinstance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Constant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;isinstance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Constant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;left&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This is a partial equality: if two operations are not equal under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;eq_value&lt;/code&gt;,
it doesn’t mean that they are different, only that we don’t know that they are
the same.&lt;/p&gt;
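
&lt;p&gt;To see the partial equality in action, here is a minimal, self-contained sketch. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Value&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Constant&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Operation&lt;/code&gt; classes below are bare-bones stand-ins I am assuming for illustration; they are not the toy optimizer’s real classes.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Optional


class Value:
    """Hypothetical stand-in for the toy optimizer's Value base class."""


@dataclass(eq=False)  # eq=False keeps identity-based comparison
class Constant(Value):
    value: object


@dataclass(eq=False)
class Operation(Value):
    name: str
    args: tuple = ()


def eq_value(left: Optional[Value], right: Value) -> bool:
    # Constants compare by value; everything else compares by identity.
    if isinstance(left, Constant) and isinstance(right, Constant):
        return left.value == right.value
    return left is right


five_a, five_b = Constant(5), Constant(5)
load_a = Operation("load", (five_a,))
load_b = Operation("load", (five_a,))

same_constants = eq_value(five_a, five_b)  # two distinct objects, equal values
same_operation = eq_value(load_a, load_a)  # the identical operation
unknown = eq_value(load_a, load_b)         # may be equal at run time; not known
missing = eq_value(None, five_a)           # nothing in the heap yet
```

&lt;p&gt;Only the first two comparisons are known to be equal. The two separate loads might well produce the same value at run time, but this function cannot prove it, so it answers “no”.&lt;/p&gt;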

&lt;p&gt;Then, after that, we need only check if the current value in the compile-time
heap is the same as the value being stored in. If it is, wonderful. No need to
store. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;continue&lt;/code&gt; and don’t append the operation to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;opt_bb&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;store&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;store_info&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;current_value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;store_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;new_value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eq_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# NEW!
&lt;/span&gt;                &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# ... as before ...
&lt;/span&gt;            &lt;span class=&quot;c1&quot;&gt;# ...
&lt;/span&gt;        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;load&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;make_equal_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This makes our load-then-store test pass, and it makes other tests pass too,
like eliminating a store after another store!&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_store_after_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;arg1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb_to_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;var0 = getarg(0)
var1 = store(var0, 0, 5)&quot;&quot;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Unfortunately, this only works if the values—constants or SSA values—are
known to be the same. If we store &lt;em&gt;different&lt;/em&gt; values, we can’t optimize. In the
live stream, we left this as an exercise for the viewer:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pytest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mark&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xfail&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_exercise_for_the_reader&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;arg0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb_to_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;var0 = getarg(0)
var1 = store(var0, 0, 7)
var2 = escape(7)&quot;&quot;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

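&lt;p&gt;We would only be able to optimize this away if we had some notion of a store
being &lt;em&gt;dead&lt;/em&gt;. In this case, that is a store whose value is never read
before being overwritten: the store of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;5&lt;/code&gt; above is dead because the store of
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;7&lt;/code&gt; replaces it before any load happens.&lt;/p&gt;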
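
&lt;p&gt;To make that definition concrete, here is a rough sketch of how one might &lt;em&gt;find&lt;/em&gt; dead stores by walking a block backwards. This is not code from the stream. The tuple encoding of operations and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;find_dead_stores&lt;/code&gt; name are assumptions made up for this example, and it conservatively treats every operation it does not recognize as something that may read any memory.&lt;/p&gt;

```python
from typing import List, Set, Tuple

# Hypothetical op encoding (not the toy optimizer's real classes):
#   ("store", obj, offset, value), ("load", obj, offset), (other_name, ...)
Op = Tuple[object, ...]


def find_dead_stores(ops: List[Op]) -> List[int]:
    """Return indices of stores that are overwritten before any possible read."""
    # Locations that a later store will definitely overwrite before any read.
    overwritten: Set[Tuple[object, object]] = set()
    dead: List[int] = []
    for index in range(len(ops) - 1, -1, -1):  # walk backwards
        op = ops[index]
        if op[0] == "store":
            location = (op[1], op[2])
            if location in overwritten:
                dead.append(index)
            else:
                overwritten.add(location)
        elif op[0] == "load":
            # A load at this offset may read through any aliasing object.
            overwritten = {loc for loc in overwritten if loc[1] != op[2]}
        else:
            # Unknown effects: assume the op may read anything.
            overwritten.clear()
    return sorted(dead)


ops = [
    ("getarg", 0),
    ("store", "arg0", 0, 5),
    ("store", "arg0", 0, 7),
    ("load", "arg0", 0),
    ("escape", "var2"),
]
dead_indices = find_dead_stores(ops)
```

&lt;p&gt;Here &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dead_indices&lt;/code&gt; contains only the index of the store of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;5&lt;/code&gt;. The set starts out empty at the end of the block, so the final store to each location is always kept; memory may be observed after the block ends.&lt;/p&gt;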

&lt;h2 id=&quot;removing-dead-stores&quot;&gt;Removing dead stores&lt;/h2&gt;

&lt;p&gt;TODO, I suppose. I have not gotten this far yet. If I get around to it, I will
come back and update the post.&lt;/p&gt;

&lt;h2 id=&quot;in-the-real-world&quot;&gt;In the real world&lt;/h2&gt;

&lt;p&gt;This small optimization pass may seem silly or fiddly—when would we ever see
something like this in a real IR?—but it’s pretty useful. Here’s the Ruby
code that got me thinking about it again some years later for ZJIT:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;C&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;initialize&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;CRuby has a shape system and ZJIT makes use of it, so we end up optimizing this
code (if it’s monomorphic) into a series of shape checks and stores. The HIR
might end up looking something like the mess below, where I’ve annotated the
shape guards (can be thought of as loads) and stores with asterisks:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;fn initialize@tmp/init.rb:3:
# ...
bb2(v6:BasicObject):
  v10:Fixnum[1] = Const Value(1)
  v31:HeapBasicObject = GuardType v6, HeapBasicObject
* v32:HeapBasicObject = GuardShape v31, 0x400000
* StoreField v32, :@a@0x10, v10
  WriteBarrier v32, v10
  v35:CShape[0x40008e] = Const CShape(0x40008e)
* StoreField v32, :_shape_id@0x4, v35
  v16:Fixnum[2] = Const Value(2)
  v37:HeapBasicObject = GuardType v6, HeapBasicObject
* v38:HeapBasicObject = GuardShape v37, 0x40008e
* StoreField v38, :@b@0x18, v16
  WriteBarrier v38, v16
  v41:CShape[0x40008f] = Const CShape(0x40008f)
* StoreField v38, :_shape_id@0x4, v41
  v22:Fixnum[3] = Const Value(3)
  v43:HeapBasicObject = GuardType v6, HeapBasicObject
* v44:HeapBasicObject = GuardShape v43, 0x40008f
* StoreField v44, :@c@0x20, v22
  WriteBarrier v44, v22
  v47:CShape[0x400090] = Const CShape(0x400090)
* StoreField v44, :_shape_id@0x4, v47
  CheckInterrupts
  Return v22
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If we had store-load forwarding in ZJIT, we could get rid of the intermediate
shape guards; they would know the shape from the previous &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;StoreField&lt;/code&gt;
instruction. If we had dead store elimination, we could get rid of the
intermediate shape writes; they are never read. (And the repeated type guards
that check if it’s a heap object are still just silly and need to be removed
eventually.)&lt;/p&gt;

&lt;p&gt;This is on the roadmap and will make object initialization even faster than it
is right now.&lt;/p&gt;

&lt;h2 id=&quot;wrapping-up&quot;&gt;Wrapping up&lt;/h2&gt;

&lt;p&gt;See the &lt;a href=&quot;https://github.com/tekknolagi/tekknolagi.github.com/blob/74aad4d26d166f9bc847bafcd503378d78d294ee/loadstore.py&quot;&gt;full code&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Thanks for reading the text version of the video that CF and I made a while
back. Now you know how to do load/store elimination on traces. For a full
compiler that has other operations, you will need to model their
effects/potential heap writes in your optimizer… perhaps even using the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AbstractHeap&lt;/code&gt; machinery we have here. For example, a function call probably
writes to all heaps and therefore clears &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compile_time_heap&lt;/code&gt;.&lt;/p&gt;
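
&lt;p&gt;As a hedged illustration of that last point, here is a small sketch in which a call wipes out everything the optimizer knows. It uses a simplified tuple encoding of operations rather than the classes from the full code, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;call&lt;/code&gt; operation and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;removable_loads&lt;/code&gt; helper are hypothetical names invented for this example.&lt;/p&gt;

```python
from typing import Dict, List, Tuple

# Hypothetical op encoding (not the toy optimizer's real classes):
#   ("store", obj, offset, value), ("load", obj, offset), ("call", name)
Op = Tuple[object, ...]


def removable_loads(ops: List[Op]) -> List[int]:
    """Return indices of loads whose value is already known at compile time."""
    compile_time_heap: Dict[Tuple[object, object], object] = {}
    removable: List[int] = []
    for index, op in enumerate(ops):
        kind = op[0]
        if kind == "store":
            _, obj, offset, value = op
            # Assumption: a store may alias any object at the same offset,
            # so drop every entry at that offset before recording this one.
            compile_time_heap = {
                key: known
                for key, known in compile_time_heap.items()
                if key[1] != offset
            }
            compile_time_heap[(obj, offset)] = value
        elif kind == "load":
            _, obj, offset = op
            if (obj, offset) in compile_time_heap:
                removable.append(index)
            else:
                compile_time_heap[(obj, offset)] = ("loaded_at", index)
        elif kind == "call":
            # Unknown callee: assume it may write to every heap location.
            compile_time_heap.clear()
    return removable


ops = [
    ("store", "arg0", 0, 5),
    ("load", "arg0", 0),          # known from the store above
    ("call", "unknown_function"),  # forgets everything
    ("load", "arg0", 0),          # must really load again
    ("load", "arg0", 0),          # known from the previous load
]
result = removable_loads(ops)
```

&lt;p&gt;The load right after the call survives, because the call might have changed the field. A finer-grained effect model could clear only part of the heap instead of all of it.&lt;/p&gt;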

&lt;p&gt;I think this does not need too much extra work to get it going on full CFGs; a
block is pretty much the same as a trace, so you can do a block-local version
without much fuss. If you want to go global, you need dominator information and
gen-kill sets.&lt;/p&gt;

&lt;p&gt;Maybe check out the implementation in other compilers:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/tekknolagi/v8/blob/f030838700a83cde6992cb8ebcb3facc6a8fc1f1/src/crankshaft/hydrogen-load-elimination.cc&quot;&gt;V8’s old Hydrogen load elimination&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/tekknolagi/v8/blob/f030838700a83cde6992cb8ebcb3facc6a8fc1f1/src/crankshaft/hydrogen-escape-analysis.cc&quot;&gt;V8’s old Hydrogen escape analysis&lt;/a&gt;
    &lt;ul&gt;
      &lt;li&gt;Which also does some store-load forwarding&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/tekknolagi/v8/blob/f030838700a83cde6992cb8ebcb3facc6a8fc1f1/src/crankshaft/hydrogen-alias-analysis.h&quot;&gt;V8’s old Hydrogen simple alias analysis&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/LineageOS/android_art/blob/8ce603e0c68899bdfbc9cd4c50dcc65bbf777982/compiler/optimizing/load_store_elimination.cc&quot;&gt;Android ART’s load-store elimination&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Maybe I will touch on this in a future post…&lt;/p&gt;

&lt;h2 id=&quot;thank-you&quot;&gt;Thank you&lt;/h2&gt;

&lt;p&gt;Thank you to CF, who walked me through this live on a stream two years ago!
This blog post wouldn’t be possible without you.&lt;/p&gt;
&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:size&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;In this toy optimizer example, we are assuming that all reads and writes
are the same size and different offsets don’t overlap at all. This is often
the case for managed runtimes, where object fields are pointer-sized and
all reads/writes are pointer-aligned. &lt;a href=&quot;#fnref:size&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:tbaa&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;We could do better. If we had type information, we could also use that
to make alias classes. Writes to a List will never overlap with writes to a
Map, for example. This requires your compiler to have strict aliasing—if
you can freely cast between types, as in C, then this tactic goes out the
window.&lt;/p&gt;

      &lt;p&gt;This is called &lt;a href=&quot;/assets/img/tbaa.pdf&quot;&gt;Type-based alias analysis&lt;/a&gt; (PDF), or
TBAA. I cover it in &lt;a href=&quot;/blog/toy-tbaa/&quot;&gt;the next post&lt;/a&gt;. &lt;a href=&quot;#fnref:tbaa&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
            <pubDate>Wed, 24 Dec 2025 00:00:00 +0000</pubDate>
            <niceDate>December 24, 2025</niceDate>
            <link>https://bernsteinbear.com/blog/toy-load-store/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/toy-load-store/</guid>
        </item>
        
        <item>
            <title>ZJIT is now available in Ruby 4.0</title>
            <description>&lt;p&gt;&lt;em&gt;Originally published on &lt;a href=&quot;https://railsatscale.com/2025-12-24-launch-zjit/&quot;&gt;Rails At Scale&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;ZJIT is a new just-in-time (JIT) Ruby compiler built into the reference Ruby
implementation, &lt;a href=&quot;https://en.wikipedia.org/wiki/YARV&quot;&gt;YARV&lt;/a&gt;, by the same compiler group that brought you YJIT.
We (Aaron Patterson, Aiden Fox Ivey, Alan Wu, Jacob Denbeaux, Kevin Menard, Max
Bernstein, Maxime Chevalier-Boisvert, Randy Stauner, Stan Lo, and Takashi
Kokubun) have been working on ZJIT since the beginning of this year.&lt;/p&gt;

&lt;p&gt;In case you missed the last post, we’re building a new compiler for Ruby
because we want to both raise the performance ceiling (bigger compilation unit
size and SSA IR) and encourage more outside contribution (by becoming a more
traditional method compiler).&lt;/p&gt;

&lt;p&gt;It’s been a long time since we gave an official update on ZJIT. Things are
going well. We’re excited to share our progress with you. We’ve done a lot
&lt;a href=&quot;/blog/merge-zjit/&quot;&gt;since May&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;in-brief&quot;&gt;In brief&lt;/h2&gt;

&lt;p&gt;ZJIT is compiled by default—but not enabled by default—in Ruby 4.0. Enable
it by passing the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--zjit&lt;/code&gt; flag, setting the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RUBY_ZJIT_ENABLE&lt;/code&gt; environment variable,
or calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RubyVM::ZJIT.enable&lt;/code&gt; after starting your application.&lt;/p&gt;

&lt;p&gt;It’s faster than the interpreter, but not yet as fast as YJIT. &lt;strong&gt;Yet.&lt;/strong&gt; But we
have a plan, and we have some more specific numbers below. The TL;DR is we have
a great new foundation and now need to pull out all the Ruby-specific stops to
match YJIT.&lt;/p&gt;

&lt;p&gt;We encourage you to experiment with ZJIT, but maybe hold off on deploying it in
production for now. This is a very new compiler. You should expect crashes and
wild performance degradations (or, perhaps, improvements). Please test locally,
try to run CI, etc., and let us know what you run into on &lt;a href=&quot;https://bugs.ruby-lang.org/projects/ruby-master/issues?set_filter=1&amp;amp;tracker_id=1&quot;&gt;the Ruby issue
tracker&lt;/a&gt; (or, if you don’t want to make a Ruby Bugs account, we would
also take reports &lt;a href=&quot;https://github.com/Shopify/ruby/issues&quot;&gt;on GitHub&lt;/a&gt;).&lt;/p&gt;

&lt;h2 id=&quot;state-of-the-compiler&quot;&gt;State of the compiler&lt;/h2&gt;

&lt;p&gt;To underscore how much has happened since the &lt;a href=&quot;/blog/merge-zjit/&quot;&gt;announcement of being merged
into CRuby&lt;/a&gt;, we present to you a series of comparisons:&lt;/p&gt;

&lt;h3 id=&quot;side-exits&quot;&gt;Side-exits&lt;/h3&gt;

&lt;p&gt;Back in May, we could not side-exit from JIT code into the interpreter. This
meant that the code we were running had to continue to have the same
preconditions (expected types, no method redefinitions, etc) or the JIT would
safely abort. &lt;strong&gt;Now,&lt;/strong&gt; we can side-exit and use this feature liberally.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;For example, we gracefully handle the phase transition from integer to string;
a guard instruction fails and transfers control to the interpreter.&lt;/p&gt;

  &lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;three&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;four&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;  &lt;/div&gt;
&lt;/blockquote&gt;

&lt;p&gt;This enables running a lot more code!&lt;/p&gt;

&lt;h3 id=&quot;more-code&quot;&gt;More code&lt;/h3&gt;

&lt;p&gt;Back in May, we could only run a handful of small benchmarks. &lt;strong&gt;Now,&lt;/strong&gt; we can
run all sorts of code, including passing the full Ruby test suite, the test
suite and shadow traffic of a large application at Shopify, and the test suite
of GitHub.com! Also a bank, apparently.&lt;/p&gt;

&lt;p&gt;Back in May, we did not optimize much; we only really optimized operations
on fixnums (small integers) and method sends to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt; object. &lt;strong&gt;Now,&lt;/strong&gt;
we optimize a lot more: all sorts of method sends, instance variable reads
and writes, attribute accessor/reader/writer use, struct reads and writes,
object allocations, certain string operations, optional parameters, and more.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;For example, we can &lt;a href=&quot;https://en.wikipedia.org/wiki/Constant_folding&quot;&gt;constant-fold&lt;/a&gt; numeric operations. Because we also have a
(small, limited) inliner borrowed from YJIT, we can constant-fold the entirety
of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add&lt;/code&gt; down to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;3&lt;/code&gt;—and still handle redefinitions of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;one&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;two&lt;/code&gt;,
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Integer#+&lt;/code&gt;, …&lt;/p&gt;

  &lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;one&lt;/span&gt;
  &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;two&lt;/span&gt;
  &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;one&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;two&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;  &lt;/div&gt;
&lt;/blockquote&gt;

&lt;h3 id=&quot;register-spilling&quot;&gt;Register spilling&lt;/h3&gt;

&lt;p&gt;Back in May, we could not compile many large functions due to limitations of
our backend that we borrowed from YJIT. &lt;strong&gt;Now,&lt;/strong&gt; we can compile absolutely
enormous functions just fine. And quickly, too. Though we have not been
focusing specifically on compiler performance, we compile even large methods in
under a millisecond.&lt;/p&gt;

&lt;h3 id=&quot;c-methods&quot;&gt;C methods&lt;/h3&gt;

&lt;p&gt;Back in May, we could not even optimize calls to built-in C methods. &lt;strong&gt;Now,&lt;/strong&gt;
we have a feature similar to JavaScriptCore’s DOMJIT, which allows us to emit
inline HIR versions of certain well-known C methods. This allows the optimizer
to reason about these methods and their effects (more on this in a future post)
much more… er, effectively.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;For example, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Integer#succ&lt;/code&gt;, which is defined as adding &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1&lt;/code&gt; to an integer, is a
C method. It’s used in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Integer#times&lt;/code&gt; to drive the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;while&lt;/code&gt; loop. Instead of
emitting a call to it, our C method “inliner” can emit our existing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FixnumAdd&lt;/code&gt;
instruction and take advantage of the rest of the type inference and
constant-folding.&lt;/p&gt;

  &lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;inline_integer_succ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                       &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BlockId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                       &lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InsnId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                       &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InsnId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;
                       &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InsnId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Option&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InsnId&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.is_empty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.likely_a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;types&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Fixnum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;left&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.coerce_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;types&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Fixnum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push_insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;Const&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Const&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;VALUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;fixnum_from_usize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push_insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FixnumAdd&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Some&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;None&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;  &lt;/div&gt;
&lt;/blockquote&gt;
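
&lt;p&gt;To make the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Integer#times&lt;/code&gt; connection concrete, here is a simplified stand-in for that loop. It is a sketch, not the actual CRuby source: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;times_sketch&lt;/code&gt; is a made-up free function, and the real method has more to it.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical, simplified version of the loop inside Integer#times.
def times_sketch(n)
  i = 0
  while i &amp;lt; n
    yield i
    i = i.succ   # the send that can become a FixnumAdd instead of a C call
  end
  n
end

seen = []
times_sketch(3) { |i| seen.push(i) }
# seen is now [0, 1, 2]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;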

&lt;h3 id=&quot;fewer-c-calls&quot;&gt;Fewer C calls&lt;/h3&gt;

&lt;p&gt;Back in May, the machine code ZJIT generated called a lot of C functions from
the CRuby runtime to implement our HIR instructions in LIR. We have pared this
down significantly and now “open code” the implementations in LIR.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;For example, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GuardNotFrozen&lt;/code&gt; used to call out to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rb_obj_frozen_p&lt;/code&gt;. Now, it
requires that its input is a heap-allocated object and can instead do a load, a
test, and a conditional jump.&lt;/p&gt;

  &lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;gen_guard_not_frozen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;jit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;JITState&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                        &lt;span class=&quot;n&quot;&gt;asm&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Assembler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                        &lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Opnd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                        &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FrameState&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Opnd&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;asm&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// It&apos;s a heap object, so check the frozen flag&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;flags&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;asm&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Opnd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;mem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RUBY_OFFSET_RBASIC_FLAGS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;asm&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.test&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;flags&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RUBY_FL_FREEZE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.into&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// Side-exit if frozen&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;asm&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.jnz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;side_exit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;jit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GuardNotFrozen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;  &lt;/div&gt;
&lt;/blockquote&gt;

&lt;h3 id=&quot;more-teammates&quot;&gt;More teammates&lt;/h3&gt;

&lt;p&gt;Back in May, we had four people working full-time on the compiler. &lt;strong&gt;Now,&lt;/strong&gt; we
have more internally at Shopify—and also more from the community! We have
had several interested people reach out, learn about ZJIT, and successfully
land complex changes. For this reason, we have opened up &lt;a href=&quot;https://zjit.zulipchat.com&quot;&gt;a chat
room&lt;/a&gt; to discuss and improve ZJIT.&lt;/p&gt;

&lt;h3 id=&quot;a-cool-graph-visualization-tool&quot;&gt;A cool graph visualization tool&lt;/h3&gt;

&lt;p&gt;You &lt;em&gt;have to&lt;/em&gt; check out our intern Aiden’s &lt;a href=&quot;https://railsatscale.com/2025-11-19-adding-iongraph-support/&quot;&gt;integration of Iongraph into
ZJIT&lt;/a&gt;. Now we
have clickable, zoomable, scrollable graphs of all our functions and all our
optimization passes. It’s great!&lt;/p&gt;

&lt;p&gt;Try zooming (Ctrl-scroll), clicking the different optimization passes on the
left, clicking the instruction IDs in each basic block (definitions and uses),
and seeing how the IR for the below Ruby code changes over time.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Point&lt;/span&gt;
  &lt;span class=&quot;nb&quot;&gt;attr_accessor&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:y&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;initialize&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;no&quot;&gt;P&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;freeze&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;P&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;P&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;y&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;iframe title=&quot;Iongraph Viewer&quot; aria-label=&quot;Interactive compiler graph visualization&quot; src=&quot;/assets/html/zjit-viewer.html&quot; width=&quot;100%&quot; height=&quot;400&quot;&gt;&lt;/iframe&gt;

&lt;h3 id=&quot;more&quot;&gt;More&lt;/h3&gt;

&lt;p&gt;…and so, so many garbage collection fixes.&lt;/p&gt;

&lt;p&gt;There’s still a lot to do, though.&lt;/p&gt;

&lt;h2 id=&quot;to-do&quot;&gt;To do&lt;/h2&gt;

&lt;p&gt;We’re going to optimize &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;invokeblock&lt;/code&gt; (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;yield&lt;/code&gt;) and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;invokesuper&lt;/code&gt; (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;super&lt;/code&gt;)
instructions, each of which behaves similarly, but not identically, to a
normal &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;send&lt;/code&gt; instruction. These are pretty common.&lt;/p&gt;

&lt;p&gt;We’re going to optimize &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;setinstancevariable&lt;/code&gt; in the case where we have to
transition the object’s shape. This will help normal &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@a = b&lt;/code&gt; situations. It
will also help &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@a ||= b&lt;/code&gt;, but I think we can do even better with the latter
using some kind of value numbering.&lt;/p&gt;

&lt;p&gt;We only optimize monomorphic calls right now—cases where a method send only
sees one class of receiver while being profiled. We’re going to optimize
polymorphic sends, too. Right now we’re laying the groundwork (a new register
allocator; see below) to make this much easier. It’s not as much of an
immediate focus, though, because most sends (high 80s to low 90s percent) are
monomorphic. &lt;!-- TODO throwback to Smalltalk-80 --&gt;&lt;/p&gt;
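
&lt;p&gt;If the terminology is unfamiliar, here is a small sketch of the difference. The classes and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;total_area&lt;/code&gt; method are invented for illustration; the point is only that one call site can see one receiver class or several.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;class Square
  def area = 4
end

class Circle
  def area = 3
end

def total_area(shapes)
  # The single call site for area that gets profiled.
  shapes.sum { |shape| shape.area }
end

mono = total_area([Square.new, Square.new])   # the site only ever sees Square
poly = total_area([Square.new, Circle.new])   # same site, two receiver classes
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If only the first call happened during profiling, the send would count as monomorphic; the second call makes it polymorphic.&lt;/p&gt;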

&lt;p&gt;We’re in the middle of re-writing the register allocator after reading the
entire history of linear scan papers and several implementations. That will
unlock performance improvements and also allow us to make the IRs easier to
use.&lt;/p&gt;

&lt;p&gt;We don’t handle phase changes particularly well yet; if your method call
patterns change significantly after your code has been compiled, we will
frequently side-exit into the interpreter. Instead, we would like to use these
side-exits as additional profile information and re-compile the function.&lt;/p&gt;

&lt;p&gt;Right now we have a lot of traffic to the VM frame. JIT frame pushes are
reasonably fast, but with every effectful operation, we have to flush our local
variable state and stack state to the VM frame. The instances in which code
might want to read this reified frame state are rare: frame unwinding due to
exceptions, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Binding#local_variable_get&lt;/code&gt;, etc. In the future, we will instead
defer writing this state until it needs to be read.&lt;/p&gt;
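
&lt;p&gt;For a sense of why that state has to be available at all, here is a minimal sketch of code that reads a local variable out of another frame. The method names are made up for the example.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;def read_callers_local(frame_binding)
  # Looks up a local by name in the frame the binding was captured from.
  frame_binding.local_variable_get(:secret)
end

def compute
  secret = 42
  # Capturing the binding means the locals of compute must be readable.
  read_callers_local(binding)
end

result = compute   # 42
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;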

&lt;p&gt;We only have a limited inliner that inlines constants, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;self&lt;/code&gt;, and parameters.
In the fullness of time, we will add a general-purpose method inlining
facility. This will allow us to reduce the number of polymorphic sends, do some
branch folding, and reduce the number of method sends overall.&lt;/p&gt;

&lt;p&gt;We only support optimizing positional parameters, required keyword parameters,
and optional parameters right now, but we will work on optimizing optional
keyword parameters as well. Most of this work is in marshaling the complex
Ruby calling convention into one coherent form that the JIT can understand.&lt;/p&gt;
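
&lt;p&gt;As a quick reference for the parameter kinds named above, here is one method per kind. The method names are invented for the example, and nothing here is specific to ZJIT.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;def positional(a, b) = [a, b]
def optional(a, b = 2) = [a, b]
def required_keyword(a:, b:) = [a, b]
def optional_keyword(a:, b: 2) = [a, b]   # the kind still on the to-do list

calls = [
  positional(1, 2),
  optional(1),
  required_keyword(a: 1, b: 2),
  optional_keyword(a: 1),
]
# every entry in calls is [1, 2]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;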

&lt;h2 id=&quot;performance&quot;&gt;Performance&lt;/h2&gt;

&lt;p&gt;We have public performance numbers for a selection of macro- and
micro-benchmarks on &lt;a href=&quot;https://rubybench.github.io/&quot;&gt;rubybench&lt;/a&gt;. Here is a screenshot of what those
per-benchmark graphs look like. The Y axis is speedup multiplier vs the
interpreter and the X axis is time. Higher is better:&lt;/p&gt;

&lt;figure style=&quot;display: block; margin: 0 auto; max-width: 80%;&quot;&gt;
  &lt;img src=&quot;/assets/img/zjit-benchmark.png&quot; /&gt;
  &lt;figcaption&gt;A line chart of ZJIT performance on railsbench&amp;mdash;represented as a
  speedup multiplier when compared to the interpreter&amp;mdash;improving over
  time, passing interpreter performance, catching up to YJIT.&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;You can see that we are improving performance on nearly all benchmarks over
time. Some of this comes from optimizing the same way YJIT does
today (e.g. specializing ivar reads and writes), and some of it comes from optimizing
in a way that takes advantage of ZJIT’s high-level IR (e.g. constant folding,
branch folding, more precise type inference).&lt;/p&gt;

&lt;p&gt;We are using both raw time numbers and also our internal performance counters
(e.g. number of calls to C functions from generated code) to drive
optimization.&lt;/p&gt;

&lt;h2 id=&quot;try-it-out&quot;&gt;Try it out&lt;/h2&gt;

&lt;p&gt;While Ruby now ships with ZJIT compiled into the binary by default, it is not
&lt;em&gt;enabled&lt;/em&gt; by default at run-time. For performance and stability reasons, YJIT is
still the default compiler choice in Ruby 4.0.&lt;/p&gt;

&lt;p&gt;If you want to run your test suite with ZJIT to see what happens, you
absolutely can. Enable it by passing the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--zjit&lt;/code&gt; flag, setting the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RUBY_ZJIT_ENABLE&lt;/code&gt; environment variable, or calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RubyVM::ZJIT.enable&lt;/code&gt; after
starting your application.&lt;/p&gt;
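
&lt;p&gt;For the run-time route, a minimal sketch might look like the following. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;defined?&lt;/code&gt; guard and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;zjit_available&lt;/code&gt; variable are additions for the example, so that the same file still loads on a Ruby built without ZJIT.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Assumption: put this somewhere early in application boot.
zjit_available = defined?(RubyVM::ZJIT) ? true : false

# Only call into ZJIT when the constant exists in this build.
RubyVM::ZJIT.enable if zjit_available
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;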

&lt;h2 id=&quot;on-yjit&quot;&gt;On YJIT&lt;/h2&gt;

&lt;p&gt;We devoted a lot of our resources this year to developing ZJIT. While we did
not spend much time on YJIT (outside of a great &lt;a href=&quot;https://railsatscale.com/2025-05-21-fast-allocations-in-ruby-3-5/&quot;&gt;allocation speed
up&lt;/a&gt;), YJIT isn’t going anywhere soon.&lt;/p&gt;

&lt;h2 id=&quot;thank-you&quot;&gt;Thank you&lt;/h2&gt;

&lt;p&gt;This compiler was made possible by contributions to your &lt;del&gt;PBS station&lt;/del&gt; open
source project from programmers like you. Thank you!&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Aaron Patterson&lt;/li&gt;
  &lt;li&gt;Abrar Habib&lt;/li&gt;
  &lt;li&gt;Aiden Fox Ivey&lt;/li&gt;
  &lt;li&gt;Alan Wu&lt;/li&gt;
  &lt;li&gt;Alex Rocha&lt;/li&gt;
  &lt;li&gt;André Luiz Tiago Soares&lt;/li&gt;
  &lt;li&gt;Benoit Daloze&lt;/li&gt;
  &lt;li&gt;Charlotte Wen&lt;/li&gt;
  &lt;li&gt;Daniel Colson&lt;/li&gt;
  &lt;li&gt;Donghee Na&lt;/li&gt;
  &lt;li&gt;Eileen Uchitelle&lt;/li&gt;
  &lt;li&gt;Étienne Barrié&lt;/li&gt;
  &lt;li&gt;Godfrey Chan&lt;/li&gt;
  &lt;li&gt;Goshanraj Govindaraj&lt;/li&gt;
  &lt;li&gt;Hiroshi SHIBATA&lt;/li&gt;
  &lt;li&gt;Hoa Nguyen&lt;/li&gt;
  &lt;li&gt;Jacob Denbeaux&lt;/li&gt;
  &lt;li&gt;Jean Boussier&lt;/li&gt;
  &lt;li&gt;Jeremy Evans&lt;/li&gt;
  &lt;li&gt;John Hawthorn&lt;/li&gt;
  &lt;li&gt;Ken Jin&lt;/li&gt;
  &lt;li&gt;Kevin Menard&lt;/li&gt;
  &lt;li&gt;Max Bernstein&lt;/li&gt;
  &lt;li&gt;Max Leopold&lt;/li&gt;
  &lt;li&gt;Maxime Chevalier-Boisvert&lt;/li&gt;
  &lt;li&gt;Nobuyoshi Nakada&lt;/li&gt;
  &lt;li&gt;Peter Zhu&lt;/li&gt;
  &lt;li&gt;Randy Stauner&lt;/li&gt;
  &lt;li&gt;Satoshi Tagomori&lt;/li&gt;
  &lt;li&gt;Shannon Skipper&lt;/li&gt;
  &lt;li&gt;Stan Lo&lt;/li&gt;
  &lt;li&gt;Takashi Kokubun&lt;/li&gt;
  &lt;li&gt;Tavian Barnes&lt;/li&gt;
  &lt;li&gt;Tobias Lütke&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(via a lightly touched up &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git log --pretty=&quot;%an&quot; zjit | sort -u&lt;/code&gt;)&lt;/p&gt;
</description>
            <pubDate>Wed, 24 Dec 2025 00:00:00 +0000</pubDate>
            <niceDate>December 24, 2025</niceDate>
            <link>https://bernsteinbear.com/blog/launch-zjit/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/launch-zjit/</guid>
        </item>
        
        <item>
            <title>How to annotate JITed code for perf/samply</title>
            <description>&lt;p&gt;Brief one today. I got asked “does YJIT/ZJIT have support for [Linux] perf?”&lt;/p&gt;

&lt;p&gt;The answer is yes, and it also works with &lt;a href=&quot;https://github.com/mstange/samply&quot;&gt;samply&lt;/a&gt; (including on macOS!),
because both understand the &lt;a href=&quot;https://github.com/torvalds/linux/blob/516471569089749163be24b973ea928b56ac20d9/tools/perf/Documentation/jit-interface.txt&quot;&gt;perf map interface&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This is the entirety of the implementation in ZJIT&lt;sup id=&quot;fnref:hex&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:hex&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;register_with_perf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iseq_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start_ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;usize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;code_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;usize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;io&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;perf_map&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;format!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;/tmp/perf-{}.map&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;fs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;OpenOptions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.create&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.open&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;perf_map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;nd&quot;&gt;debug!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Failed to open perf map file: {perf_map}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;file&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;io&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;BufWriter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;writeln!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;{start_ptr:x} {code_size:x} zjit::{iseq_name}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;nd&quot;&gt;debug!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Failed to write {iseq_name} to perf map file: {perf_map}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Whenever you generate a function, append a one-line entry consisting of&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;START SIZE symbolname
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/tmp/perf-{PID}.map&lt;/code&gt;. Per the Linux docs linked above,&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;START and SIZE are hex numbers without 0x.&lt;/p&gt;

  &lt;p&gt;symbolname is the rest of the line, so it could contain special characters.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You can now happily run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;perf record your_jit [...]&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;samply record your_jit
[...]&lt;/code&gt; and have JIT frames be named in the output. We hide this behind
the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--zjit-perf&lt;/code&gt; flag to avoid file I/O overhead when we don’t need it.&lt;/p&gt;
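
&lt;p&gt;To make the file format concrete, here is a minimal Python sketch of the same idea. It is not ZJIT’s code: the function names and the optional &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;path&lt;/code&gt; parameter are made up for illustration, and it assumes the process writing the file is the one being profiled.&lt;/p&gt;

```python
import os


def format_perf_map_entry(start, size, symbol):
    # START and SIZE are hex numbers without a 0x prefix; the symbol name
    # is the rest of the line.
    return f"{start:x} {size:x} {symbol}\n"


def append_perf_map_entry(start, size, symbol, path=None):
    # Default to /tmp/perf-PID.map for the current process.
    if path is None:
        path = f"/tmp/perf-{os.getpid()}.map"
    # Append mode: one line per generated function.
    with open(path, "a") as f:
        f.write(format_perf_map_entry(start, size, symbol))
```

&lt;p&gt;Opening the file on every call keeps the sketch short; a real JIT would more likely keep a buffered handle around, as the Rust snippet above does.&lt;/p&gt;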

&lt;h2 id=&quot;there-is-also-the-jit-dump-interface&quot;&gt;There is also the JIT dump interface&lt;/h2&gt;

&lt;p&gt;Perf map is the older way to interact with perf: a newer, more complicated way
involves &lt;a href=&quot;https://theunixzoo.co.uk/blog/2025-09-14-linux-perf-jit.html&quot;&gt;generating a “dump” file&lt;/a&gt; and then &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;perf inject&lt;/code&gt;ing it.&lt;/p&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:hex&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;We actually use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{:#x}&lt;/code&gt;, which I noticed today is wrong. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{:#x}&lt;/code&gt; leaves
in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0x&lt;/code&gt;, and it shouldn’t; instead &lt;strong&gt;use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{:x}&lt;/code&gt;&lt;/strong&gt;. &lt;a href=&quot;#fnref:hex&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
            <pubDate>Thu, 18 Dec 2025 00:00:00 +0000</pubDate>
            <niceDate>December 18, 2025</niceDate>
            <link>https://bernsteinbear.com/blog/jit-perf-map/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/jit-perf-map/</guid>
        </item>
        
        <item>
            <title>A catalog of side effects</title>
            <description>&lt;p&gt;Optimizing compilers like to keep track of each IR instruction’s &lt;em&gt;effects&lt;/em&gt;. An
instruction’s effects vary wildly from having no effects at all, to writing a
specific variable, to completely unknown (writing all state).&lt;/p&gt;

&lt;p&gt;This post can be thought of as a continuation of &lt;a href=&quot;/blog/irs/&quot;&gt;What I talk about when I talk
about IRs&lt;/a&gt;, specifically the section talking about asking the right
questions. When we talk about effects, we should ask the right questions: not
&lt;em&gt;what opcode is this?&lt;/em&gt; but instead &lt;em&gt;what effects does this opcode have?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Different compilers represent and track these effects differently. I’ve been
thinking about how to represent these effects all year, so I have been doing
some reading. In this post I will give some summaries of the landscape of
approaches. Please feel free to suggest more.&lt;/p&gt;

&lt;h2 id=&quot;some-background&quot;&gt;Some background&lt;/h2&gt;

&lt;p&gt;Internal IR effect tracking is similar to the programming language notion of
algebraic effects in type systems, but internally, compilers keep track of
finer-grained effects. Effects such as “writes to a local variable”, “writes to
a list”, or “reads from the stack” indicate what instructions can be
re-ordered, duplicated, or removed entirely.&lt;/p&gt;

&lt;p&gt;For example, consider the following pseudocode for some made-up language that
stands in for a snippet of compiler IR:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# ...
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;some_var&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;another_var&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# ...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The goal of effects is to communicate to the compiler if, for example, these two IR
instructions can be re-ordered. The second instruction &lt;em&gt;might&lt;/em&gt; write to a
location that the first one reads. But it also might not! This is about knowing
if &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;some_var&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;another_var&lt;/code&gt; &lt;em&gt;alias&lt;/em&gt;—if they are different names that
refer to the same object.&lt;/p&gt;

&lt;p&gt;We can sometimes answer that question directly, but often it’s cheaper to
compute an approximate answer: &lt;em&gt;could&lt;/em&gt; they even alias? It’s possible that
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;some_var&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;another_var&lt;/code&gt; have different types, meaning that (as long as you
have strict aliasing) the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Load&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Store&lt;/code&gt; operations that implement these
reads and writes by definition touch different locations. And if they look
at disjoint locations, there need not be any explicit order enforced.&lt;/p&gt;

&lt;p&gt;Different compilers keep track of this information differently. The null effect
analysis gives up and says “every instruction is maximally effectful” and
therefore “we can’t re-order or delete any instructions”. That’s probably fine
for a first stab at a compiler, where you will get a big speedup purely based
on strength reductions. Over-approximations of effects should always be
valid.&lt;/p&gt;

&lt;p&gt;But at some point you start wanting to do dead code elimination (DCE), or
common subexpression elimination (CSE), or load/store elimination, or move
instructions around, and you start wondering how to represent effects. That’s
where I am right now. So here’s a catalog of different compilers I have looked
at recently.&lt;/p&gt;

&lt;p&gt;There are two main ways I have seen to represent effects: bitsets and heap
range lists. We’ll look at one example compiler for each, talk a bit about
tradeoffs, then give a bunch of references to other major compilers.&lt;/p&gt;

&lt;p&gt;We’ll start with &lt;a href=&quot;https://github.com/facebookincubator/cinder&quot;&gt;Cinder&lt;/a&gt;, a Python JIT, because that’s what I used to
work on.&lt;/p&gt;

&lt;h2 id=&quot;cinder&quot;&gt;Cinder&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/facebookincubator/cinder&quot;&gt;Cinder&lt;/a&gt; tracks heap effects for its high-level IR (HIR) in
&lt;a href=&quot;https://github.com/facebookincubator/cinderx/blob/8bf5af94e2792d3fd386ab25b1aeedae27276d50/cinderx/Jit/hir/instr_effects.h&quot;&gt;instr_effects.h&lt;/a&gt;. Pretty much everything happens in
the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memoryEffects(const Instr&amp;amp; instr)&lt;/code&gt; function, which is expected to know
everything about what effects the given instruction might have.&lt;/p&gt;

&lt;p&gt;The data representation is a bitset representation of a lattice called an
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AliasClass&lt;/code&gt; and that is defined in &lt;a href=&quot;https://github.com/facebookincubator/cinderx/blob/8bf5af94e2792d3fd386ab25b1aeedae27276d50/cinderx/Jit/hir/alias_class.h&quot;&gt;alias_class.h&lt;/a&gt;. Each
bit in the bitset represents a distinct location in the heap: reads from and
writes to each of these locations are guaranteed not to affect any of the other
locations.&lt;/p&gt;

&lt;p&gt;Here is the X-macro that defines it:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;cp&quot;&gt;#define HIR_BASIC_ACLS(X) \
  X(ArrayItem)            \
  X(CellItem)             \
  X(DictItem)             \
  X(FuncArgs)             \
  X(FuncAttr)             \
  X(Global)               \
  X(InObjectAttr)         \
  X(ListItem)             \
  X(Other)                \
  X(TupleItem)            \
  X(TypeAttrCache)        \
  X(TypeMethodCache)
&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;enum&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BitIndexes&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;#define ACLS(name) k##name##Bit,
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;HIR_BASIC_ACLS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ACLS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;#undef ACLS
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Note that each bit implicitly represents a set: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ListItem&lt;/code&gt; does not refer to a
&lt;em&gt;specific&lt;/em&gt; list index, but the infinite set of all possible list indices. It’s
&lt;em&gt;any&lt;/em&gt; list index. Still, every list index is completely disjoint from, say, every
entry in a global variable table.&lt;/p&gt;

&lt;p&gt;(And, to be clear, an object in a list might be the same as an object in a
global variable table. The objects themselves can alias. But the thing being
written to or read from, the thing &lt;em&gt;being side effected&lt;/em&gt;, is the container.)&lt;/p&gt;

&lt;p&gt;Like other bitset lattices, it’s possible to union the sets by or-ing the bits.
It’s possible to query for overlap by and-ing the bits.&lt;/p&gt;

&lt;div class=&quot;language-c++ highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;AliasClass&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;// The union of two AliasClass&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;AliasClass&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;operator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AliasClass&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AliasClass&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

  &lt;span class=&quot;c1&quot;&gt;// The intersection (overlap) of two AliasClass&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;AliasClass&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;operator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AliasClass&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AliasClass&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If this sounds familiar, it’s because (as the repo notes) it’s a similar idea
to Cinder’s &lt;a href=&quot;/blog/lattice-bitset/&quot;&gt;type lattice representation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Like other lattices, there is both a bottom element (no effects) and a top
element (all possible effects):&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;cp&quot;&gt;#define HIR_OR_BITS(name) | k##name
&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;#define HIR_UNION_ACLS(X)                           \
  &lt;/span&gt;&lt;span class=&quot;cm&quot;&gt;/* Bottom union */&lt;/span&gt;&lt;span class=&quot;cp&quot;&gt;                                \
  X(Empty, 0)                                       \
  &lt;/span&gt;&lt;span class=&quot;cm&quot;&gt;/* Top union */&lt;/span&gt;&lt;span class=&quot;cp&quot;&gt;                                   \
  X(Any, 0 HIR_BASIC_ACLS(HIR_OR_BITS))             \
  &lt;/span&gt;&lt;span class=&quot;cm&quot;&gt;/* Memory locations accessible by managed code */&lt;/span&gt;&lt;span class=&quot;cp&quot;&gt; \
  X(ManagedHeapAny, kAny &amp;amp; ~kFuncArgs)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Union operations naturally hit a fixpoint at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Any&lt;/code&gt; and intersection operations
naturally hit a fixpoint at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Empty&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;All of this together lets the optimizer ask and answer questions such as:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;where might this instruction write?&lt;/li&gt;
  &lt;li&gt;(because CPython is reference counted and incref implies ownership) where
does this instruction borrow its input from?&lt;/li&gt;
  &lt;li&gt;do these two instructions’ write destinations overlap?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;and more.&lt;/p&gt;

&lt;p&gt;Let’s take a look at an (imaginary) IR version of the code snippet in the intro
and see what analyzing it might look like in the optimizer. Here is the fake
IR:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;v0: Tuple = ...
v1: List = ...
v2: Int[5] = ...
# v = some_var[0]
v3: Object = LoadTupleItem v0, 0
# another_var[0] = 5
StoreListItem v1, 0, v2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can imagine that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LoadTupleItem&lt;/code&gt; declares that it reads from the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TupleItem&lt;/code&gt; heap and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;StoreListItem&lt;/code&gt; declares that it writes to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ListItem&lt;/code&gt;
heap. Because tuple and list pointers cannot be cast into one another and
therefore cannot alias, these are
disjoint heaps in our bitset. Therefore &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ListItem &amp;amp; TupleItem == 0&lt;/code&gt;, therefore
these memory operations can never interfere! They can (for example) be
re-ordered arbitrarily.&lt;/p&gt;
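
&lt;p&gt;Here is a small Python sketch of what such a re-ordering check could look like. This is not Cinder code: the three-bit heap list is a cut-down stand-in, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;may_load&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;can_reorder&lt;/code&gt; are hypothetical names, and the check only considers memory effects (not control flow or deopts).&lt;/p&gt;

```python
import operator
from dataclasses import dataclass

# One bit per disjoint class of heap locations (a hypothetical subset).
EMPTY = 0
LIST_ITEM = 2 ** 0
TUPLE_ITEM = 2 ** 1
GLOBAL = 2 ** 2
ANY = LIST_ITEM | TUPLE_ITEM | GLOBAL


@dataclass(frozen=True)
class MemoryEffects:
    may_load: int   # hypothetical field: heaps the instruction might read
    may_store: int  # heaps the instruction might write


def overlaps(a, b):
    # Bitwise AND of the two sets; nonzero means a shared location class.
    return operator.and_(a, b) != EMPTY


def can_reorder(first, second):
    # Swapping is safe (memory-wise) if neither instruction writes
    # something the other one reads or writes.
    return not (
        overlaps(first.may_store, second.may_load | second.may_store)
        or overlaps(second.may_store, first.may_load)
    )


load_tuple_item = MemoryEffects(may_load=TUPLE_ITEM, may_store=EMPTY)
store_list_item = MemoryEffects(may_load=EMPTY, may_store=LIST_ITEM)
print(can_reorder(load_tuple_item, store_list_item))  # True
```

&lt;p&gt;An instruction with unknown effects would get &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ANY&lt;/code&gt; in both fields, which overlaps with everything that touches memory and so pins it in place.&lt;/p&gt;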

&lt;p&gt;In Cinder, these memory effects could in the future be used for instruction
re-ordering, but they are today mostly used in two places: the refcount
insertion pass and DCE.&lt;/p&gt;

&lt;p&gt;DCE involves first finding the set of instructions that need to be kept around
because they are useful/important/have effects. So here is what the Cinder DCE
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;isUseful&lt;/code&gt; looks like:&lt;/p&gt;

&lt;div class=&quot;language-c++ highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;isUseful&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Instr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsTerminator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsSnapshot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;asDeoptBase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;nullptr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsPrimitiveBox&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsPhi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;memoryEffects&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;may_store&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AEmpty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There are some other checks in there but &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memoryEffects&lt;/code&gt; is right there at the
core of it!&lt;/p&gt;
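
&lt;p&gt;To show where a predicate like that fits, here is a rough mark-and-sweep DCE sketch in Python. It is a simplification under assumptions, not Cinder’s pass: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Instr&lt;/code&gt; class and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MAY_STORE&lt;/code&gt; table are hypothetical, every instruction gets a unique name even if it produces no value, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Return&lt;/code&gt; stands in for all terminators.&lt;/p&gt;

```python
from dataclasses import dataclass

EMPTY = 0
ANY = 0b111  # hypothetical: every heap location class


@dataclass
class Instr:
    name: str            # unique name, e.g. "v3"
    opcode: str
    operands: tuple = ()  # names of the instructions this one reads


# Hypothetical effect table. Opcodes missing from it fall back to ANY,
# the conservative over-approximation.
MAY_STORE = {
    "Const": EMPTY,
    "Add": EMPTY,
    "LoadTupleItem": EMPTY,
    "StoreListItem": 0b001,
}


def is_useful(instr):
    return instr.opcode == "Return" or MAY_STORE.get(instr.opcode, ANY) != EMPTY


def eliminate_dead_code(instrs):
    by_name = {instr.name: instr for instr in instrs}
    live = set()
    # Mark: start from the useful instructions and follow operands.
    worklist = [instr for instr in instrs if is_useful(instr)]
    while worklist:
        instr = worklist.pop()
        if instr.name in live:
            continue
        live.add(instr.name)
        worklist.extend(by_name[op] for op in instr.operands if op in by_name)
    # Sweep: keep live instructions in their original order.
    return [instr for instr in instrs if instr.name in live]
```

&lt;p&gt;The effect information only enters through the root set; everything after that is ordinary reachability over operands.&lt;/p&gt;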

&lt;p&gt;Now that we have seen the bitset representation of effects and an
implementation in Cinder, let’s take a look at a different representation and
an implementation in JavaScriptCore.&lt;/p&gt;

&lt;h2 id=&quot;javascriptcore&quot;&gt;JavaScriptCore&lt;/h2&gt;

&lt;p&gt;I keep coming back to &lt;a href=&quot;https://gist.github.com/pizlonator/cf1e72b8600b1437dda8153ea3fdb963&quot;&gt;How I implement SSA form&lt;/a&gt; by &lt;a href=&quot;http://www.filpizlo.com/&quot;&gt;Fil
Pizlo&lt;/a&gt;, one of the significant contributors to JavaScriptCore (JSC). In
particular, I keep coming back to the &lt;a href=&quot;https://gist.github.com/pizlonator/cf1e72b8600b1437dda8153ea3fdb963#uniform-effect-representation&quot;&gt;Uniform Effect
Representation&lt;/a&gt; section. This notion of “abstract heaps” felt
very… well, abstract. Somehow more abstract than the bitset representation.
The pre-order and post-order integer pair as a way to represent nested heap
effects just did not click.&lt;/p&gt;

&lt;p&gt;It didn’t make any sense until I actually went spelunking in JavaScriptCore and
found one of several implementations—because, you know, JSC is six compilers
in a trenchcoat&lt;sup&gt;[&lt;a href=&quot;https://en.wikipedia.org/wiki/Wikipedia:Citation_needed&quot;&gt;&lt;i&gt;citation needed&lt;/i&gt;&lt;/a&gt;]&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;DFG, B3, DOMJIT, and probably others all have their own abstract heap
implementations. We’ll look at DOMJIT mostly because it’s a smaller example and
also illustrates something else that’s interesting: builtins. We’ll come back
to builtins in a minute.&lt;/p&gt;

&lt;p&gt;Let’s take a look at how DOMJIT structures its &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/989c9f9cd5b1f0c9606820e219ee51da32a34c6b/Source/WebCore/domjit/DOMJITAbstractHeapRepository.yaml&quot;&gt;abstract
heaps&lt;/a&gt;: a YAML file.&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;DOM&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;Tree&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;Node&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Node_firstChild&lt;/span&gt;
            &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Node_lastChild&lt;/span&gt;
            &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Node_parentNode&lt;/span&gt;
            &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Node_nextSibling&lt;/span&gt;
            &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Node_previousSibling&lt;/span&gt;
            &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Node_ownerDocument&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;Document&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Document_documentElement&lt;/span&gt;
            &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Document_body&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It’s a hierarchy. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node_firstChild&lt;/code&gt; is a subheap of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node&lt;/code&gt; is a subheap of…
and so on. A write to any &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node_nextSibling&lt;/code&gt; is a write to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node&lt;/code&gt; is a write to
… Sibling heaps are unrelated: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node_firstChild&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node_lastChild&lt;/code&gt;, for
example, are disjoint.&lt;/p&gt;

&lt;p&gt;To get a feel for this, I wired up a &lt;a href=&quot;https://github.com/tekknolagi/tekknolagi.github.com/tree/main/assets/code/gen_bitset.rb&quot;&gt;simplified version&lt;/a&gt; of
ZJIT’s bitset generator (for &lt;em&gt;types!&lt;/em&gt;) to read a YAML document and generate a
bitset. It generated the following Rust code:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;mod&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bits&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Empty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0u64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document_body&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document_documentElement&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document_body&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document_documentElement&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_firstChild&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_lastChild&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_nextSibling&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_ownerDocument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_parentNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_previousSibling&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_firstChild&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_lastChild&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_nextSibling&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_ownerDocument&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_parentNode&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_previousSibling&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tree&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DOM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tree&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NumTypeBits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It’s not a fancy X-macro, but it’s a short and flexible Ruby script.&lt;/p&gt;

&lt;p&gt;Then I took the &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/989c9f9cd5b1f0c9606820e219ee51da32a34c6b/Source/WebCore/domjit/generate-abstract-heap.rb&quot;&gt;DOMJIT abstract heap
generator&lt;/a&gt;—also funnily enough a short Ruby
script—modified the output format slightly, and had it generate its int
pairs:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;mod&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bits&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;cm&quot;&gt;/* DOMJIT Abstract Heap Tree.
  DOM&amp;lt;0,8&amp;gt;:
      Tree&amp;lt;0,8&amp;gt;:
          Node&amp;lt;0,6&amp;gt;:
              Node_firstChild&amp;lt;0,1&amp;gt;
              Node_lastChild&amp;lt;1,2&amp;gt;
              Node_parentNode&amp;lt;2,3&amp;gt;
              Node_nextSibling&amp;lt;3,4&amp;gt;
              Node_previousSibling&amp;lt;4,5&amp;gt;
              Node_ownerDocument&amp;lt;5,6&amp;gt;
          Document&amp;lt;6,8&amp;gt;:
              Document_documentElement&amp;lt;6,7&amp;gt;
              Document_body&amp;lt;7,8&amp;gt;
  */&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DOM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tree&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_firstChild&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_lastChild&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_parentNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_nextSibling&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_previousSibling&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_ownerDocument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document_documentElement&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document_body&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It already comes with a little diagram, which is super helpful for readability.&lt;/p&gt;

&lt;p&gt;Any empty range represents empty heap effects: if the start and end are the
same number, there are no effects. There is no single &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Empty&lt;/code&gt; value, but any empty
range could be normalized to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HeapRange { start: 0, end: 0 }&lt;/code&gt;.&lt;/p&gt;
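
&lt;p&gt;Here is a minimal sketch of that normalization. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;u32&lt;/code&gt; field type and the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EMPTY&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;is_empty&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;normalized&lt;/code&gt; names are assumptions made for
illustration; the generator does not emit them.&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Hypothetical mirror of the generated HeapRange: start is inclusive and
// end is exclusive. The u32 field type is an assumption.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct HeapRange {
    pub start: u32,
    pub end: u32,
}

impl HeapRange {
    // Assumed canonical empty value; any start == end pair would do.
    pub const EMPTY: HeapRange = HeapRange { start: 0, end: 0 };

    // A range whose start equals its end covers no heaps at all.
    pub fn is_empty(self) -&amp;gt; bool {
        self.start == self.end
    }

    // Collapse every empty range to EMPTY so that == treats them alike.
    pub fn normalized(self) -&amp;gt; HeapRange {
        if self.is_empty() { HeapRange::EMPTY } else { self }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Normalizing only matters if you compare ranges for equality; overlap checks
can treat every empty range the same without it.&lt;/p&gt;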

&lt;p&gt;Maybe this was obvious to you, dear reader, but this pre-order/post-order thing
is about nested ranges&lt;sup id=&quot;fnref:dominance&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:dominance&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;! Seeing the output of the generator laid out clearly
like this made it make a lot more sense for me.&lt;/p&gt;
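
&lt;p&gt;To make the nesting concrete, here is a small sketch of how such ranges could
be assigned by walking the heap tree. This is not the WebKit generator’s code:
the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Heap&lt;/code&gt; type and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;assign&lt;/code&gt; function are hypothetical, and I am assuming
every leaf takes exactly one slot, as in the diagram above.&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Hypothetical heap tree node: a name plus child heaps.
pub struct Heap {
    pub name: &amp;amp;&apos;static str,
    pub children: Vec&amp;lt;Heap&amp;gt;,
}

// Walk the tree depth-first. Each leaf takes the next integer; each parent
// spans from its first leaf to just past its last leaf (a half-open range).
pub fn assign(heap: &amp;amp;Heap, next: &amp;amp;mut u32, out: &amp;amp;mut Vec&amp;lt;(&amp;amp;&apos;static str, u32, u32)&amp;gt;) {
    let start = *next; // on the way in: remember where this subtree begins
    if heap.children.is_empty() {
        *next += 1; // a leaf occupies exactly one slot
    }
    for child in &amp;amp;heap.children {
        assign(child, next, out);
    }
    out.push((heap.name, start, *next)); // on the way out: the subtree ends here
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Because a parent’s range is built from its children’s ranges, a child is
always contained in its parent and siblings never overlap. Fed the tree from
the diagram, this should reproduce the same pairs.&lt;/p&gt;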

&lt;!--
So how do we compute subtyping relationships with `HeapRange`s? We check range
overlap! Here is [DOMJIT&apos;s C++ implementation][domjit-is-subtype-of]:

[domjit-is-subtype-of]: https://github.com/WebKit/WebKit/blob/989c9f9cd5b1f0c9606820e219ee51da32a34c6b/Source/JavaScriptCore/domjit/DOMJITHeapRange.h#L99

```c++
class HeapRange {
    constexpr explicit operator bool() const {
        return m_begin != m_end;
    }

    bool isStrictSubtypeOf(const HeapRange&amp; other) const {
        if (!*this || !other)
            return false;
        if (*this == other)
            return false;
        return other.m_begin &lt;= m_begin &amp;&amp; m_end &lt;= other.m_end;
    }

    bool isSubtypeOf(const HeapRange&amp; other) const {
        if (!*this || !other)
            return false;
        if (*this == other)
            return true;
        return isStrictSubtypeOf(other);
    }
```

This is represented by the `operator bool()`
and implicit boolean conversions. To reinforce the whole nested heap ranges
thing, `isSubtypeOf` is asking if one `HeapRange` contains another.
--&gt;

&lt;p&gt;What about checking overlap? Here is the &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/989c9f9cd5b1f0c9606820e219ee51da32a34c6b/Source/JavaScriptCore/domjit/DOMJITHeapRange.h#L108&quot;&gt;implementation in
JSC&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-c++ highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;namespace&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;WTF&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Check if two ranges overlap assuming that neither range is empty.&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;template&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;typename&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;constexpr&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nonEmptyRangesOverlap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ASSERT_UNDER_CONSTEXPR_CONTEXT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;leftMin&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ASSERT_UNDER_CONSTEXPR_CONTEXT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rightMin&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMax&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMin&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMax&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;// Pass ranges with the min being inclusive and the max being exclusive.&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;template&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;typename&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;constexpr&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rangesOverlap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ASSERT_UNDER_CONSTEXPR_CONTEXT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;leftMin&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ASSERT_UNDER_CONSTEXPR_CONTEXT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rightMin&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;// Empty ranges interfere with nothing.&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;leftMin&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rightMin&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nonEmptyRangesOverlap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;leftMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;overlaps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;WTF&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rangesOverlap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m_begin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m_end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m_begin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m_end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;(See also &lt;a href=&quot;https://zayenz.se/blog/post/how-to-check-for-overlapping-intervals/&quot;&gt;How to check for overlapping intervals&lt;/a&gt; and
&lt;a href=&quot;https://nedbatchelder.com/blog/201310/range_overlap_in_two_compares.html&quot;&gt;Range overlap in two compares&lt;/a&gt; for more fun.)&lt;/p&gt;
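
&lt;p&gt;Translated to Rust, the same check might look like the sketch below. The
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;start&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;end&lt;/code&gt; field names come from the generated constants; the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;u32&lt;/code&gt;
type and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;overlaps&lt;/code&gt; method are assumptions that mirror the C++.&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Hypothetical Rust counterpart of the C++ HeapRange; start is inclusive
// and end is exclusive.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct HeapRange {
    pub start: u32,
    pub end: u32,
}

impl HeapRange {
    pub fn overlaps(self, other: HeapRange) -&amp;gt; bool {
        // Empty ranges interfere with nothing.
        if self.start == self.end || other.start == other.end {
            return false;
        }
        // Two non-empty half-open ranges overlap exactly when each one
        // starts before the other one ends.
        self.start &amp;lt; other.end &amp;amp;&amp;amp; other.start &amp;lt; self.end
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With the generated ranges, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node&lt;/code&gt; (0 to 6) overlaps &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node_firstChild&lt;/code&gt; (0 to 1)
but not &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Document&lt;/code&gt; (6 to 8), because the end of a range is exclusive.&lt;/p&gt;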

&lt;p&gt;While bitsets are a dense representation (you have to hold every bit), they are
very compact and very precise. You can hold any combination of 64 or 128 bits
in a single register. The union and intersection operations
are very cheap.&lt;/p&gt;
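
&lt;p&gt;For comparison, here is roughly what those operations look like on a 64-bit
bitset. The bit assignments and function names below are made up for
illustration and do not match the generated constants.&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Hypothetical bit assignments, for illustration only.
pub const NODE_FIRST_CHILD: u64 = 1 &amp;lt;&amp;lt; 0;
pub const NODE_LAST_CHILD: u64 = 1 &amp;lt;&amp;lt; 1;
pub const DOCUMENT_BODY: u64 = 1 &amp;lt;&amp;lt; 2;

// Union: every heap that is in either set.
pub fn union(a: u64, b: u64) -&amp;gt; u64 {
    a | b
}

// Intersection: only the heaps that are in both sets.
pub fn intersection(a: u64, b: u64) -&amp;gt; u64 {
    a &amp;amp; b
}

// Two effect sets interfere if they share at least one heap.
pub fn overlaps(a: u64, b: u64) -&amp;gt; bool {
    (a &amp;amp; b) != 0
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Each operation is a single bitwise instruction, and the union of two
unrelated heaps stays exact: no extra heaps sneak into the result.&lt;/p&gt;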

&lt;p&gt;With int ranges, it’s a little more complicated. An imprecise union of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;b&lt;/code&gt; can take the smallest single range that covers both &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;b&lt;/code&gt;. To get a more
precise union, you have to keep track of both. In the worst case, if you want
efficient arbitrary queries, you need to store your int ranges in an interval
tree. So what gives?&lt;/p&gt;
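
&lt;p&gt;A sketch of the imprecise union, assuming the same &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;start&lt;/code&gt;/&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;end&lt;/code&gt; layout as the
generated constants (the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;u32&lt;/code&gt; type and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hull&lt;/code&gt; name are mine, not JSC’s):&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Hypothetical range type; start is inclusive and end is exclusive.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct HeapRange {
    pub start: u32,
    pub end: u32,
}

impl HeapRange {
    // Imprecise union: one range that covers both inputs. Anything that
    // sits between the two inputs gets swept in as well.
    pub fn hull(self, other: HeapRange) -&amp;gt; HeapRange {
        // An empty range contributes nothing to the union.
        if self.start == self.end {
            return other;
        }
        if other.start == other.end {
            return self;
        }
        HeapRange {
            start: self.start.min(other.start),
            end: self.end.max(other.end),
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For example, the hull of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node_firstChild&lt;/code&gt; (0 to 1) and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node_ownerDocument&lt;/code&gt;
(5 to 6) is 0 to 6, which is all of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node&lt;/code&gt;. That is still correct, but it now
also claims the four heaps in between.&lt;/p&gt;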

&lt;p&gt;I asked Fil: if both bitsets and int ranges answer the same question, why use
int ranges? He said that ranges are more flexible long-term: bitsets get expensive as
soon as you need over 128 bits (you might need to heap allocate them!) whereas
ranges have no such ceiling. But doesn’t holding sequences of ranges require
heap allocation? Well, despite Fil writing this in his SSA post:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The purpose of the effect representation baked into the IR is to provide a
precise always-available baseline for alias information that is super easy to
work with. […] you can have instructions report that they read/write
multiple heaps […] you can have a utility function that produces such lists
on demand.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It’s important to note that this doesn’t actually involve any allocation of
lists. JSC does this very clever thing where they have “functors” that they
pass in as arguments that compress/summarize what they want out of an
instruction’s effects.&lt;/p&gt;

&lt;p&gt;Let’s take a look at how the DFG (for example) uses these heap ranges in
analysis. The DFG is structured in such a way that it can make use of the
DOMJIT heap ranges directly, which is neat.&lt;/p&gt;

&lt;p&gt;Note that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AbstractHeap&lt;/code&gt; in the example below is a thin wrapper over the DFG
compiler’s own &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DOMJIT::HeapRange&lt;/code&gt; equivalent:&lt;/p&gt;

&lt;div class=&quot;language-c++ highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;AbstractHeapOverlaps&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nl&quot;&gt;public:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;AbstractHeapOverlaps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m_result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;operator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;otherHeap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m_result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;m_result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;overlaps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;otherHeap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m_result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;nl&quot;&gt;private:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;mutable&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m_result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;

&lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;writesOverlap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Graph&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;graph&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;NoOpClobberize&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;noOp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;AbstractHeapOverlaps&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;addWrite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;clobberize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;graph&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;noOp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;addWrite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;noOp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;addWrite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/WebKit/WebKit/blob/4865155d2fcb7cf39aad87597da8f29909d9b7f7/Source/JavaScriptCore/dfg/DFGClobberize.h#L49&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;clobberize&lt;/code&gt;&lt;/a&gt; is the function that calls these functors (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;noOp&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;addWrite&lt;/code&gt; in
this case) for each effect that the given IR instruction &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;node&lt;/code&gt; declares.&lt;/p&gt;
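
&lt;p&gt;The same shape can be sketched in Rust with closures standing in for the
functors. Everything here is hypothetical: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Insn&lt;/code&gt; variants, their effects,
and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;effects&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;writes_overlap&lt;/code&gt; functions are made up to show the pattern, and
do not reflect JSC’s real instructions.&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#[derive(Clone, Copy)]
pub struct HeapRange {
    pub start: u32,
    pub end: u32,
}

// Ranges copied from the generated diagram.
pub const DOM: HeapRange = HeapRange { start: 0, end: 8 };
pub const NODE: HeapRange = HeapRange { start: 0, end: 6 };
pub const NODE_FIRST_CHILD: HeapRange = HeapRange { start: 0, end: 1 };
pub const DOCUMENT: HeapRange = HeapRange { start: 6, end: 8 };
pub const DOCUMENT_BODY: HeapRange = HeapRange { start: 7, end: 8 };

// Non-empty half-open ranges overlap when each starts before the other ends.
fn overlaps(a: HeapRange, b: HeapRange) -&amp;gt; bool {
    a.start != a.end &amp;amp;&amp;amp; b.start != b.end &amp;amp;&amp;amp; a.start &amp;lt; b.end &amp;amp;&amp;amp; b.start &amp;lt; a.end
}

// Made-up instructions with made-up effects, for illustration only.
pub enum Insn {
    Const,
    LoadFirstChild,
    SetBody,
}

// Report each effect of insn to a callback instead of building a list.
pub fn effects(insn: Insn, mut read: impl FnMut(HeapRange), mut write: impl FnMut(HeapRange)) {
    match insn {
        Insn::Const =&amp;gt; {}
        Insn::LoadFirstChild =&amp;gt; read(NODE_FIRST_CHILD),
        Insn::SetBody =&amp;gt; {
            read(DOCUMENT);
            write(DOCUMENT_BODY);
        }
    }
}

// The write callback folds every reported write into a single bool.
pub fn writes_overlap(insn: Insn, heap: HeapRange) -&amp;gt; bool {
    let mut result = false;
    effects(insn, |_| {}, |w| { result |= overlaps(heap, w); });
    result
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The only state the query needs is one boolean on the stack, no matter how
many effects an instruction reports.&lt;/p&gt;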

&lt;p&gt;I’ve pulled out some snippets of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;clobberize&lt;/code&gt;, which is quite long, that I
think are interesting.&lt;/p&gt;

&lt;p&gt;First, some instructions (constants, here) have no effects. There’s some
utility in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;def(PureValue(...))&lt;/code&gt; call but I didn’t fully understand it.&lt;/p&gt;

&lt;p&gt;Then there are some instructions that conditionally have effects depending on
the use types of their operands.&lt;sup id=&quot;fnref:dfg-use-type&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:dfg-use-type&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; Taking the absolute value of an
Int32 or a Double is effect-free but otherwise looks like it can run arbitrary
code.&lt;/p&gt;

&lt;p&gt;Some run-time IR guards that might cause side exits are annotated as
such—they write to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SideState&lt;/code&gt; heap.&lt;/p&gt;

&lt;p&gt;Local variable instructions read &lt;em&gt;specific&lt;/em&gt; heaps indexed by what looks like
the local index but I’m not sure. This means accessing two different locals
won’t alias!&lt;/p&gt;

&lt;p&gt;It looks like instructions that allocate can’t be reordered with one another:
each one both reads and writes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HeapObjectCount&lt;/code&gt;. This probably limits the
amount of allocation sinking that can be done.&lt;/p&gt;

&lt;p&gt;Then there’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CallDOM&lt;/code&gt;, which is the builtins stuff I was talking about. We’ll
come back to that after the code block.&lt;/p&gt;

&lt;div class=&quot;language-c++ highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;template&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;typename&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ReadFunctor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;typename&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;WriteFunctor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;typename&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;DefFunctor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;typename&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ClobberTopFunctor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;clobberize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Graph&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;graph&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ReadFunctor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;WriteFunctor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DefFunctor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ClobberTopFunctor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clobberTopFunctor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;switch&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;JSConstant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DoubleConstant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Int52Constant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PureValue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;constant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()));&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ArithAbs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;child1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;useKind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Int32Use&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;child1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;useKind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DoubleRepUse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PureValue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arithMode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()));&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;clobberTop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AssertInBounds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AssertNotEmpty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SideState&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GetLocal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Stack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;operand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()));&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HeapLocation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;StackLoc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Stack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;operand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;LazyNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NewArrayWithSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NewArrayWithSizeAndStructure&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HeapObjectCount&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HeapObjectCount&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CallDOM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DOMJIT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Signature&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;signature&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;signature&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;DOMJIT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Effect&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;signature&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reads&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reads&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DOMJIT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;top&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;World&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DOMState&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reads&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rawRepresentation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()));&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;writes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;writes&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DOMJIT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;top&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;validateDFGClobberize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
                    &lt;span class=&quot;n&quot;&gt;clobberTopFunctor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DOMState&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;writes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rawRepresentation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()));&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ASSERT_WITH_MESSAGE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DOMJIT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;top&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Currently, we do not accept any def for CallDOM.&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;(Remember that these &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AbstractHeap&lt;/code&gt; operations are very similar to DOMJIT’s
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HeapRange&lt;/code&gt; with a couple more details—and in some cases even contain DOMJIT
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HeapRange&lt;/code&gt;s!)&lt;/p&gt;
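
&lt;p&gt;To make the functor pattern concrete, here is a toy Python sketch of the same
shape. It is not the DFG’s code: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Op&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Heap&lt;/code&gt; enums, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node&lt;/code&gt;
class, and both helper functions are made up for illustration, and only four of
the cases above are modeled.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;from dataclasses import dataclass
from enum import Enum, auto


class Op(Enum):
    CONSTANT = auto()
    ASSERT_IN_BOUNDS = auto()
    GET_LOCAL = auto()
    NEW_ARRAY = auto()


class Heap(Enum):
    SIDE_STATE = auto()
    STACK = auto()
    HEAP_OBJECT_COUNT = auto()


@dataclass(frozen=True)
class Node:
    op: Op
    operand: int = 0


def clobberize(node, read, write):
    # Report every effect of node by calling the matching functor.
    if node.op is Op.CONSTANT:
        return  # no effects at all
    if node.op is Op.ASSERT_IN_BOUNDS:
        write((Heap.SIDE_STATE, None))
    elif node.op is Op.GET_LOCAL:
        # In this sketch each local slot is its own abstract heap.
        read((Heap.STACK, node.operand))
    elif node.op is Op.NEW_ARRAY:
        read((Heap.HEAP_OBJECT_COUNT, None))
        write((Heap.HEAP_OBJECT_COUNT, None))
    else:
        raise NotImplementedError(node.op)


def effects_of(node):
    # One consumer: collect everything into two lists.
    reads, writes = [], []
    clobberize(node, reads.append, writes.append)
    return reads, writes


def writes_anything(node):
    # Another consumer: ignore reads and only notice writes.
    written = []
    clobberize(node, lambda heap: None, written.append)
    return bool(written)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The appeal of this shape is that the effect table is written once and each
analysis brings its own callbacks, so nothing has to build a list of effects
unless it actually wants one.&lt;/p&gt;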

&lt;p&gt;This &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CallDOM&lt;/code&gt; node is the way for the DOM APIs in the browser—a significant
chunk of the builtins, which are written in C++—to communicate what they do
to the optimizing compiler. Without any annotations, the JIT has to assume that
a call into C++ could do anything to the JIT state. Bummer!&lt;/p&gt;

&lt;p&gt;But because, for example, &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/Node/firstChild&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node.firstChild&lt;/code&gt;&lt;/a&gt; &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/32bda1b1d73527ba1d05ccba0aa8e463ddeac56d/Source/WebCore/domjit/JSNodeDOMJIT.cpp#L86&quot;&gt;annotates what
memory it reads from&lt;/a&gt; and what it &lt;em&gt;doesn’t&lt;/em&gt; write to,
the JIT can optimize around it better—or even remove the access completely.
It means the JIT can reason about calls to known builtins &lt;em&gt;the same way&lt;/em&gt; that
it reasons about normal JIT opcodes.&lt;/p&gt;
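
&lt;p&gt;Here is a rough Python sketch of why those annotations pay off. Everything in
it is an assumption made for illustration: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HeapRange&lt;/code&gt; is modeled as a
half-open integer interval, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Call&lt;/code&gt; class and the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;drop_repeated_reads&lt;/code&gt; pass are hypothetical names, not anything from
JavaScriptCore. A real pass would also rewrite uses of a dropped call to point
at the earlier result.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;from dataclasses import dataclass


@dataclass(frozen=True)
class HeapRange:
    begin: int
    end: int  # half-open interval [begin, end)

    def is_empty(self):
        return self.begin &gt;= self.end

    def overlaps(self, other):
        if self.is_empty() or other.is_empty():
            return False
        return self.begin &lt; other.end and other.begin &lt; self.end


NOTHING = HeapRange(0, 0)


@dataclass(frozen=True)
class Call:
    name: str
    receiver: str
    reads: HeapRange
    writes: HeapRange = NOTHING


def drop_repeated_reads(calls):
    available = {}  # (name, receiver) -&gt; range that the earlier call read
    kept = []
    for call in calls:
        key = (call.name, call.receiver)
        if call.writes.is_empty() and key in available:
            continue  # nothing it reads was written since the earlier call
        # A write invalidates every remembered call whose reads it overlaps.
        available = {k: r for k, r in available.items()
                     if not r.overlaps(call.writes)}
        if call.writes.is_empty():
            available[key] = call.reads
        kept.append(call)
    return kept
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With this in hand, a second read-only call on the same receiver can be dropped
as long as every call in between only writes ranges that don’t overlap what the
first call read. An unannotated call would have to be modeled as writing
everything, which empties &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;available&lt;/code&gt;.&lt;/p&gt;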

&lt;p&gt;(Incidentally it looks like it doesn’t even make a C call, but instead is
inlined as a little memory read snippet using a JIT builder API. Neat.)&lt;/p&gt;

&lt;!-- TODO tie it back to the original example --&gt;

&lt;!--
B3 from JSC
https://github.com/WebKit/WebKit/blob/main/Source/JavaScriptCore/b3/B3Effects.h
https://github.com/WebKit/WebKit/blob/5811a5ad27100acab51f1d5ba4518eed86bbf00b/Source/JavaScriptCore/b3/B3AbstractHeapRepository.h

DOMJIT from JSC
https://github.com/WebKit/WebKit/blob/main/Source/WebCore/domjit/generate-abstract-heap.rb
generates from https://github.com/WebKit/WebKit/blob/b99cb96a7a3e5978b475d2365b72196e15a1a326/Source/WebCore/domjit/DOMJITAbstractHeapRepository.yaml#L4

DFG from JSC
https://github.com/WebKit/WebKit/blob/b99cb96a7a3e5978b475d2365b72196e15a1a326/Source/JavaScriptCore/dfg/DFGAbstractHeap.h
https://github.com/WebKit/WebKit/blob/b99cb96a7a3e5978b475d2365b72196e15a1a326/Source/JavaScriptCore/dfg/DFGClobberize.h
https://github.com/WebKit/WebKit/blob/b99cb96a7a3e5978b475d2365b72196e15a1a326/Source/JavaScriptCore/dfg/DFGClobberize.cpp
https://github.com/WebKit/WebKit/blob/b99cb96a7a3e5978b475d2365b72196e15a1a326/Source/JavaScriptCore/dfg/DFGStructureAbstractValue.cpp
https://github.com/WebKit/WebKit/blob/b99cb96a7a3e5978b475d2365b72196e15a1a326/Source/JavaScriptCore/dfg/DFGStructureAbstractValue.h
https://github.com/WebKit/WebKit/blob/b99cb96a7a3e5978b475d2365b72196e15a1a326/Source/JavaScriptCore/dfg/DFGClobberSet.h
--&gt;

&lt;p&gt;Last, we’ll look at Simple, which has a slightly different take on all of this.&lt;/p&gt;

&lt;h2 id=&quot;simple&quot;&gt;Simple&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/seaofnodes/simple&quot;&gt;Simple&lt;/a&gt; is Cliff Click’s pet Sea of
Nodes (SoN) project to try and showcase the idea to the world—outside of a
HotSpot C2 context.&lt;/p&gt;

&lt;p&gt;This one is a little harder for me to understand but it looks like each
translation unit has a &lt;a href=&quot;https://github.com/SeaOfNodes/Simple/blob/1426384fc7d0e9947e38ad6d523a5e53c324d710/chapter10/src/main/java/com/seaofnodes/simple/node/StartNode.java#L33&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;StartNode&lt;/code&gt;&lt;/a&gt; that doles out
different classes of memory nodes for each alias class. Each IR node then takes
data dependencies on whatever effect nodes it might use.&lt;/p&gt;

&lt;p&gt;Alias classes are split up based on the paper &lt;a href=&quot;/assets/img/tbaa.pdf&quot;&gt;Type-Based Alias Analysis&lt;/a&gt;
(PDF): “Our approach is a form of TBAA similar to the ‘FieldTypeDecl’ algorithm
described in the paper.”&lt;/p&gt;
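
&lt;p&gt;As I understand it, the idea is roughly one alias class per declared field,
with each class getting its own independent slice of memory state. The sketch
below is my own toy model of that in Python, not code from Simple:
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AliasClasses&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MemorySlices&lt;/code&gt; are hypothetical names, and the version
counter stands in for the memory edges that a Sea of Nodes graph would use.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;class AliasClasses:
    # Hands out one small integer per (struct, field) pair.
    def __init__(self):
        self._ids = {}

    def of(self, struct_name, field_name):
        key = (struct_name, field_name)
        if key not in self._ids:
            self._ids[key] = len(self._ids)
        return self._ids[key]


class MemorySlices:
    # Tracks one version number per alias class.
    def __init__(self):
        self._version = {}

    def load(self, alias, base):
        # A load is keyed by the current state of its own slice only.
        return (alias, self._version.get(alias, 0), base)

    def store(self, alias):
        # A store produces a new state for its slice and no other.
        self._version[alias] = self._version.get(alias, 0) + 1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Two loads with equal keys are the same value, so a store to one field leaves
loads of every other field untouched. Note that the split is by field, not by
object: a store to some field of one object still invalidates loads of that
same field from every other object.&lt;/p&gt;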

&lt;!--

Cliff Click says:

All effects are represented as edges in the graph, the same edges as normal value flows, and all edges in Simple/C2 are simple pointers (and hence are unlabeled).

StartNode produces all effects and StopNode consumes them; same for Call and CallEnd.
Effects, being just another form of value, can be merged in PhiNodes.
Effects are generally split into smaller disjoint pieces, and recombined before Stop/Call.  Splitting into disjoint pieces allows more precision in the IR, and so more optimizations.
The common first split is the Memory effect from all other effects.  Other effects are generally some form of abstract i/o (all file system operations, reading/writing device controller memory, all external calls to disjoint address spaces, etc), or control.  Control is Just Another Edge denoting normal control flow, and e.g. data ops that depend on a prior control op use it to guard for safety.  Things like div-by-0, or null-ptr-check, or array-index-OOB are all done with a control edge to the guarding test.

Memory effects are further split into disjoint aliases; operations in one alias class can never overlap with another (this is a Y/N choice, not a may/must choice).  These aliases are equivalence classes; all mem ops belong in exactly one class, and the set of classes exactly partitions all of memory.  Common splits are fields in a struct (no &apos;f&apos; field ever overlaps with any &apos;g&apos; field), or kinds of arrays (no int[] overlaps with a flt[]).

In this example a = l[0]; l[0] = 5, we might have as IR:

a = Load(ctrl-for-AIOOB, mem-for-int[], offset);
mem-for-int[] = Store(ctrl-for-AIOOB, mem-for-int[], offset, 5)



Note that the Load and Store are not ordered here.  This Store IS ordered against all other int[] Stores.
The serializing algo Global Code Motion will add an anti-dep as needed, and then order the Load &amp; Store.

Splitting is basically by having a &quot;narrow&quot; user read from a &quot;fat memory&quot;.  Narrow, because its using a single alias and is one of the memops (e.g. Loads and Stores).  A &quot;fat memory&quot; always comes from Start &amp; CallEnd.  A MemMerge can merge a bunch of narrow aliases (and one fat) and make a fat memory.  Basically its all done lazily by &quot;doing nothing&quot;, and requiring the graph builder not produce a junk graph.

Splitting happens when the Parser decides you are manipulating a slice.
THere are some peephole&apos;s for widening the split region over a larger area, allowing more memory optimizations in the larger wider area.
Load &amp; Stores have a peep to move &quot;up past&quot; a MemMerge on the correct alias edge.
--&gt;

&lt;p&gt;The Simple project is structured into sequential implementation stages and
alias classes come into the picture in &lt;a href=&quot;https://github.com/SeaOfNodes/Simple/tree/main/chapter10&quot;&gt;Chapter 10&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I spent a while spelunking through other implementations to see how they
handle this, so here is a list of the projects I looked at. Mostly,
they use bitsets.&lt;/p&gt;

&lt;h2 id=&quot;other-implementations&quot;&gt;Other implementations&lt;/h2&gt;

&lt;h3 id=&quot;hhvm&quot;&gt;HHVM&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/facebook/hhvm&quot;&gt;HHVM&lt;/a&gt;, a JIT for the
&lt;a href=&quot;https://hacklang.org/&quot;&gt;Hack&lt;/a&gt; language, also uses a bitset for its memory
effects. See for example: &lt;a href=&quot;https://github.com/facebook/hhvm/blob/0395507623c2c08afc1d54c0c2e72bc8a3bd87f1/hphp/runtime/vm/jit/alias-class.h&quot;&gt;alias-class.h&lt;/a&gt; and
&lt;a href=&quot;https://github.com/facebook/hhvm/blob/0395507623c2c08afc1d54c0c2e72bc8a3bd87f1/hphp/runtime/vm/jit/memory-effects.h&quot;&gt;memory-effects.h&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;HHVM has a couple places that use this information, such as &lt;a href=&quot;https://github.com/facebook/hhvm/blob/4cdb85bf737450bf6cb837d3167718993f9170d7/hphp/runtime/vm/jit/def-sink.cpp&quot;&gt;a
definition-sinking pass&lt;/a&gt;, &lt;a href=&quot;https://github.com/facebook/hhvm/blob/0395507623c2c08afc1d54c0c2e72bc8a3bd87f1/hphp/runtime/vm/jit/alias-analysis.h&quot;&gt;alias
analysis&lt;/a&gt;, &lt;a href=&quot;https://github.com/facebook/hhvm/blob/4cdb85bf737450bf6cb837d3167718993f9170d7/hphp/runtime/vm/jit/dce.cpp&quot;&gt;DCE&lt;/a&gt;, &lt;a href=&quot;https://github.com/facebook/hhvm/blob/4cdb85bf737450bf6cb837d3167718993f9170d7/hphp/runtime/vm/jit/store-elim.cpp&quot;&gt;store
elimination&lt;/a&gt;, &lt;a href=&quot;https://github.com/facebook/hhvm/blob/1f9eda80656b79634b6956084481ed5a43d8bc2e/hphp/runtime/vm/jit/refcount-opts.cpp&quot;&gt;refcount opts&lt;/a&gt;, and
more.&lt;/p&gt;

&lt;p&gt;If you are wondering why the HHVM representation looks similar to the Cinder
representation, it’s because some former HHVM engineers such as Brett Simmers
also worked on Cinder!&lt;/p&gt;

&lt;h3 id=&quot;android-art&quot;&gt;Android ART&lt;/h3&gt;

&lt;p&gt;(Note that I am linking an ART fork on GitHub as a reference, but the upstream
code is &lt;a href=&quot;https://android.googlesource.com/platform/art/+/refs/heads/main/compiler/optimizing/nodes.h&quot;&gt;hosted on googlesource&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;Android’s &lt;a href=&quot;https://source.android.com/docs/core/runtime&quot;&gt;ART Java runtime&lt;/a&gt; also
uses a bitset for its effect representation. It’s a very compact class called
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SideEffects&lt;/code&gt; in &lt;a href=&quot;https://github.com/LineageOS/android_art/blob/c09a5c724799afdc5f89071b682b181c0bd23099/compiler/optimizing/nodes.h#L1602&quot;&gt;nodes.h&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The side effects are used in &lt;a href=&quot;https://github.com/LineageOS/android_art/blob/c09a5c724799afdc5f89071b682b181c0bd23099/compiler/optimizing/licm.cc#L104&quot;&gt;loop-invariant code motion&lt;/a&gt;, &lt;a href=&quot;https://github.com/LineageOS/android_art/blob/c09a5c724799afdc5f89071b682b181c0bd23099/compiler/optimizing/gvn.cc#L204&quot;&gt;global
value numbering&lt;/a&gt;, &lt;a href=&quot;https://github.com/LineageOS/android_art/blob/c09a5c724799afdc5f89071b682b181c0bd23099/compiler/optimizing/write_barrier_elimination.cc#L45&quot;&gt;write barrier
elimination&lt;/a&gt;, &lt;a href=&quot;https://github.com/LineageOS/android_art/blob/c09a5c724799afdc5f89071b682b181c0bd23099/compiler/optimizing/scheduler.cc#L55&quot;&gt;scheduling&lt;/a&gt;,
and more.&lt;/p&gt;

&lt;h3 id=&quot;netcoreclr&quot;&gt;.NET/CoreCLR&lt;/h3&gt;

&lt;p&gt;CoreCLR mostly &lt;a href=&quot;https://github.com/dotnet/runtime/blob/a0878687d02b42034f4ea433ddd7a72b741510b8/src/coreclr/jit/sideeffects.h#L169&quot;&gt;uses a bitset&lt;/a&gt; for its &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SideEffectSet&lt;/code&gt;
class. This one is interesting though because it also splits out effects
specifically to include sets of local variables (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LclVarSet&lt;/code&gt;).&lt;/p&gt;

&lt;h3 id=&quot;v8&quot;&gt;V8&lt;/h3&gt;

&lt;p&gt;V8 is also about six completely different compilers in a trenchcoat.&lt;/p&gt;

&lt;p&gt;Turboshaft uses a struct in &lt;a href=&quot;https://github.com/v8/v8/blob/e817fdf31a2947b2105bd665067d92282e4b4d59/src/compiler/turboshaft/operations.h#L577&quot;&gt;operations.h&lt;/a&gt; called
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OpEffects&lt;/code&gt; which is two bitsets for reads/writes of effects. This is used in
&lt;a href=&quot;https://github.com/v8/v8/blob/42f5ff65d12f0ef9294fa7d3875feba938a81904/src/compiler/turboshaft/value-numbering-reducer.h#L164&quot;&gt;value numbering&lt;/a&gt; as well as a bunch of
other small optimization passes they call “reducers”.&lt;/p&gt;
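
&lt;p&gt;For a feel of how a pass might use a pair of bitsets like that, here is a
small Python sketch. It borrows only the name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OpEffects&lt;/code&gt;; the effect
categories, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;reads&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;writes&lt;/code&gt; fields, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;can_reorder&lt;/code&gt; are
assumptions of mine and not V8’s actual layout or API.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;from dataclasses import dataclass

# One bit per kind of effect. These categories are invented for the sketch.
LOCALS = 0b001
FIELDS = 0b010
ELEMENTS = 0b100
ANYTHING = LOCALS | FIELDS | ELEMENTS


@dataclass(frozen=True)
class OpEffects:
    reads: int = 0
    writes: int = 0


def can_reorder(a, b):
    # Swapping is safe unless one op writes what the other reads or writes.
    if a.writes &amp; (b.reads | b.writes):
        return False
    if b.writes &amp; a.reads:
        return False
    return True
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Two loads always commute, a load and a store commute only if their bits are
disjoint, and an op marked as reading and writing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ANYTHING&lt;/code&gt; pins every
other effectful op in place. Each check is a couple of machine instructions,
which is a big part of why bitsets are so popular.&lt;/p&gt;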

&lt;p&gt;Maglev also has this thing called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NodeT::kProperties&lt;/code&gt; in &lt;a href=&quot;https://github.com/v8/v8/blob/42f5ff65d12f0ef9294fa7d3875feba938a81904/src/maglev/maglev-ir.h&quot;&gt;their IR
nodes&lt;/a&gt; that also looks like a bitset and is used in their various
reducers. It has effect query methods on it such as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;can_eager_deopt&lt;/code&gt; and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;can_write&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Until recently, V8 also used Sea of Nodes as its IR, which
tracks side effects more explicitly in the structure of the IR itself.&lt;/p&gt;

&lt;h3 id=&quot;guile&quot;&gt;Guile&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://www.gnu.org/software/guile/&quot;&gt;Guile Scheme&lt;/a&gt; looks like it has a &lt;a href=&quot;https://wingolog.org/archives/2014/05/18/effects-analysis-in-guile&quot;&gt;custom tagging
scheme&lt;/a&gt; type thing.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Both bitsets and int ranges are perfectly cromulent ways of representing heap
effects for your IR. The Sea of Nodes approach is also probably okay since it
powers HotSpot C2 and (for a time) V8.&lt;/p&gt;

&lt;p&gt;Remember to ask &lt;em&gt;the right questions&lt;/em&gt; of your IR when doing analysis.&lt;/p&gt;

&lt;h2 id=&quot;thank-you&quot;&gt;Thank you&lt;/h2&gt;

&lt;p&gt;Thank you to &lt;a href=&quot;http://www.filpizlo.com/&quot;&gt;Fil Pizlo&lt;/a&gt; for writing his initial
GitHub Gist and sending me on this journey and thank you to &lt;a href=&quot;https://www.chrisgregory.me/&quot;&gt;Chris
Gregory&lt;/a&gt;, Brett Simmers, and &lt;a href=&quot;https://ufuk.dev/&quot;&gt;Ufuk
Kayserilioglu&lt;/a&gt; for feedback on making some of the
explanations more helpful.&lt;/p&gt;

&lt;!--

TODO Dart
https://github.com/dart-lang/sdk/blob/59905c43f1a0394394ad5545ee439bcba63dea55/runtime/vm/constants_riscv.h#L968
https://github.com/dart-lang/sdk/blob/59905c43f1a0394394ad5545ee439bcba63dea55/runtime/vm/compiler/backend/redundancy_elimination.cc#L758
https://github.com/dart-lang/sdk/blob/59905c43f1a0394394ad5545ee439bcba63dea55/runtime/vm/compiler/backend/redundancy_elimination.cc#L1096

ChakraCore
https://github.com/chakra-core/ChakraCore/blob/2dba810c925eb366e44a1f7d7a5b2e289e2f8510/lib/Runtime/Types/RecyclableObject.h#L172

SpiderMonkey
https://github.com/servo/mozjs/blob/77645ed41f588297fd8d7edaee71500f4c83d070/mozjs-sys/mozjs/js/src/jit/MIR.h#L935
https://github.com/servo/mozjs/blob/77645ed41f588297fd8d7edaee71500f4c83d070/mozjs-sys/mozjs/js/src/jit/MIR.h#L9658

Cinder LIR
https://github.com/facebookincubator/cinderx/blob/main/cinderx/Jit/lir/instruction.h

HotSpot C1

HotSpot C2

PyPy
https://github.com/pypy/pypy/blob/main/rpython/jit/codewriter/effectinfo.py
https://github.com/pypy/pypy/blob/main/rpython/jit/metainterp/optimizeopt/heap.py#L59

LLVM
https://llvm.org/docs/LangRef.html#tbaa-metadata

LLVM MemorySSA
https://llvm.org/docs/MemorySSA.html

MLIR
https://mlir.llvm.org/docs/Rationale/SideEffectsAndSpeculation/

MEMOIR
https://conf.researchr.org/details/cgo-2024/cgo-2024-main-conference/31/Representing-Data-Collections-in-an-SSA-Form

Scala LMS graph IR
https://2023.splashcon.org/details/splash-2023-oopsla/46/Graph-IRs-for-Impure-Higher-Order-Languages-Making-Aggressive-Optimizations-Affordab

MIR and borrow checker
https://rustc-dev-guide.rust-lang.org/part-3-intro.html#source-code-representation

&gt; &quot;Fabrice Rastello, Florent Bouchez Tichadou (2022) SSA-based Compiler Design&quot;--most (all?) chapters in Part III, Extensions, are pretty much motivated by doing alias analysis in some way

Intermediate Representations in Imperative Compilers: A Survey
http://kameken.clique.jp/Lectures/Lectures2013/Compiler2013/a26-stanier.pdf

Partitioned Lattice per Variable (PLV) -- that&apos;s in Chapter 13 on SSI

TODO maybe lattice in ascent

--&gt;
&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:dominance&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Update: After reading &lt;a href=&quot;https://github.com/bytecodealliance/wasmtime/issues/4371#issuecomment-1255956651&quot;&gt;Amanieu’s
comment&lt;/a&gt;
while writing my post on &lt;a href=&quot;/blog/value-numbering/&quot;&gt;value numbering&lt;/a&gt;, I
realized that heap-range-subtype is literally the same as checking dominance with
pre-order-post-order comparison. This in retrospect makes a lot of sense.
It’s just a tree. &lt;a href=&quot;#fnref:dominance&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:dfg-use-type&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;This is because the DFG compiler does this interesting thing
where they track and guard the input types on &lt;em&gt;use&lt;/em&gt; vs having types
attached to the input’s own &lt;em&gt;def&lt;/em&gt;. It might be a clean way to handle shapes
inside the type system while also allowing the type+shape of an object to
change over time (which it can do in many dynamic language runtimes). &lt;a href=&quot;#fnref:dfg-use-type&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
            <pubDate>Tue, 11 Nov 2025 00:00:00 +0000</pubDate>
            <niceDate>November 11, 2025</niceDate>
            <link>https://bernsteinbear.com/blog/compiler-effects/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/compiler-effects/</guid>
        </item>
        
    </channel>
</rss>
