<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title><![CDATA[In Pursuit of Laziness]]></title>
  <link href="http://manishearth.github.io/atom.xml" rel="self"/>
  <link href="http://manishearth.github.io/"/>
  <updated>2024-08-21T01:01:09+00:00</updated>
  <id>http://manishearth.github.io/</id>
  <author>
    <name><![CDATA[Manish Goregaokar]]></name>
    
  </author>
  <generator uri="http://octopress.org/">Octopress</generator>

  
  <entry>
    <title type="html"><![CDATA[So Zero It's ... Negative? (Zero-Copy #3)]]></title>
    <link href="http://manishearth.github.io/blog/2022/08/03/zero-copy-3-so-zero-its-dot-dot-dot-negative/"/>
    <updated>2022-08-03T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2022/08/03/zero-copy-3-so-zero-its-dot-dot-dot-negative</id>
    <content type="html"><![CDATA[<p><em>This is part 3 of a three-part series on interesting abstractions for zero-copy deserialization I’ve been working on over the last year. This part is about eliminating the deserialization step entirely. Part 1 is about making it more pleasant to work with and can be found <a href="http://manishearth.github.io/blog/2022/08/03/zero-copy-1-not-a-yoking-matter/">here</a>; while Part 2 is about making it work for more types and can be found <a href="http://manishearth.github.io/blog/2022/08/03/zero-copy-2-zero-copy-all-the-things/">here</a>.  The posts can be read in any order, though only the first post contains an explanation of what zero-copy deserialization</em> is.</p>

<blockquote>
  <p>And when Alexander saw the breadth of his work, he wept. For there were no more copies left to zero.</p>

  <p>—Hans Gruber, after designing three increasingly unhinged zero-copy crates</p>
</blockquote>

<p><a href="http://manishearth.github.io/blog/2022/08/03/zero-copy-1-not-a-yoking-matter/">Part 1</a> of this series attempted to answer the question “how can we make zero-copy deserialization <em>pleasant</em>”, while <a href="http://manishearth.github.io/blog/2022/08/03/zero-copy-2-zero-copy-all-the-things/">part 2</a> answered “how do we make zero-copy deserialization <em>more useful</em>?”.</p>

<p>This part goes one step further and asks “what if we could avoid deserialization altogether?”.</p>

<div class="discussion discussion-example">
            <img class="bobblehead" width="60px" height="60px" title="Confused pion" alt="Speech bubble for character Confused pion" src="http://manishearth.github.io/images/pion-nought.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             Wait, what?
            </div>
        </div>

<p>Bear with me.</p>

<p>As mentioned in the previous posts, internationalization libraries like <a href="https://github.com/unicode-org/icu4x">ICU4X</a> need to be able to load and manage a lot of internationalization data. ICU4X in particular wants this part of the process to be as flexible and efficient as possible. The focus on efficiency is why we use zero-copy deserialization for basically everything, whereas the focus on flexibility has led to a robust and pluggable data loading infrastructure that allows you to mix and match data sources.</p>

<p>Deserialization is a <em>great</em> way to load data since it’s in and of itself quite flexible! You can put your data in a neat little package and load it off the filesystem! Or send it over the network! It’s even better when you have efficient techniques like zero-copy deserialization because the cost is low.</p>

<p>But the thing is, there is still a cost. Even with zero-copy deserialization, you have to <em>validate</em> the data you receive. It’s often a cost folks are happy to pay, but that’s not always the case.</p>

<p>For example, you might be, say, <a href="https://www.mozilla.org/en-US/firefox/">a web browser interested in using ICU4X</a>, and you <em>really</em> care about startup times. Browsers typically need to set up a lot of stuff when being started up (and when opening a new tab!), and every millisecond counts when it comes to giving the user a smooth experience. Browsers also typically ship with most of the internationalization data they need already. Spending precious time deserializing data that you shipped with is suboptimal.</p>

<p>What would be ideal would be something that works like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="n">DATA</span><span class="p">:</span> <span class="o">&amp;</span><span class="n">Data</span> <span class="o">=</span> <span class="o">&amp;</span><span class="nn">serde_json</span><span class="p">::</span><span class="nd">deserialize!</span><span class="p">(</span><span class="nd">include_bytes!</span><span class="p">(</span><span class="s">"./testdata.json"</span><span class="p">));</span>
</code></pre></div></div>

<p>where you can have stuff get deserialized at compile time and loaded into a static. Unfortunately, Rust <code class="language-plaintext highlighter-rouge">const</code> support is not at the stage where the above code is possible whilst working within serde’s generic framework, though it might be in a year or so.</p>

<p>You <em>could</em> write a very unsafe version of <code class="language-plaintext highlighter-rouge">serde::Deserialize</code> that operates on fully trusted data and uses some data format that is easy to zero-copy deserialize whilst avoiding any kind of validation. However, this would still have some cost: you still have to scan the data to reconstruct the full deserialized output. More importantly, it would require a parallel universe of unsafe serde-like traits that everyone has to derive or implement, where even small bugs in manual implementations would likely cause memory corruption.</p>

<div class="discussion discussion-note">
            <img class="bobblehead" width="60px" height="60px" title="Positive pion" alt="Speech bubble for character Positive pion" src="http://manishearth.github.io/images/pion-plus.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             Sounds like you need some format that needs no validation or scanning to zero-copy deserialize, and can be produced safely. But that doesn’t exist, does it?
            </div>
        </div>

<p>It does.</p>

<p>… but you’re not going to like where I’m going with this.</p>

<div class="discussion discussion-note">
            <img class="bobblehead" width="60px" height="60px" title="Positive pion" alt="Speech bubble for character Positive pion" src="http://manishearth.github.io/images/pion-plus.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             Oh no.
            </div>
        </div>

<p>There is such a format: <em>Rust code</em>. Specifically, Rust code in <code class="language-plaintext highlighter-rouge">static</code>s. When compiled, Rust <code class="language-plaintext highlighter-rouge">static</code>s are basically “free” to load, beyond the typical costs involved in paging in memory. The Rust compiler trusts itself to be good at codegen, so it doesn’t need validation when loading a compiled <code class="language-plaintext highlighter-rouge">static</code> from memory. There is the possibility of codegen bugs; however, we have to trust the compiler about that for the rest of our program anyway!</p>

<p>This is even more “zero” than “zero-copy deserialization”! Regular “zero-copy deserialization” still involves a scanning step and potentially a validation step; it’s really more about “zero allocations” than actually avoiding <em>all</em> of the copies. On the other hand, there are truly no copies or anything going on when you load Rust statics; the data is already ready to go as a <code class="language-plaintext highlighter-rouge">&amp;'static</code> reference!</p>

<p>We just have to figure out a way to “serialize to <code class="language-plaintext highlighter-rouge">const</code> Rust code” such that the resultant Rust code can just be compiled into the binary, and people who need to load trusted data into ICU4X can load it for free!</p>

<div class="discussion discussion-example">
            <img class="bobblehead" width="60px" height="60px" title="Confused pion" alt="Speech bubble for character Confused pion" src="http://manishearth.github.io/images/pion-nought.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             What does “<code class="language-plaintext highlighter-rouge">const</code> code” mean in this context?
            </div>
        </div>

<p>In Rust, <code class="language-plaintext highlighter-rouge">const</code> code essentially is code that can be proven to be side-effect-free, and it’s the only kind of code allowed in <code class="language-plaintext highlighter-rouge">static</code>s, <code class="language-plaintext highlighter-rouge">const</code>s, and <code class="language-plaintext highlighter-rouge">const fn</code>s.</p>

<div class="discussion discussion-example">
            <img class="bobblehead" width="60px" height="60px" title="Confused pion" alt="Speech bubble for character Confused pion" src="http://manishearth.github.io/images/pion-nought.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             I see! Does this code actually have to be “constant”?
            </div>
        </div>

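<p>Not quite! Rust supports mutation and even things like loops in <code class="language-plaintext highlighter-rouge">const</code> code! (On stable Rust that means <code class="language-plaintext highlighter-rouge">while</code> and <code class="language-plaintext highlighter-rouge">loop</code>; <code class="language-plaintext highlighter-rouge">for</code> loops go through the <code class="language-plaintext highlighter-rouge">Iterator</code> trait, which is not yet usable there.) Ultimately, it has to be the kind of code that <em>can</em> be computed at compile time with no difference in behavior: so no reading from files or the network, or using random numbers.</p>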

<p>For a long time only very simple code was allowed in <code class="language-plaintext highlighter-rouge">const</code>, but over the last year the scope of what that environment can do has expanded greatly, and it’s actually possible to do complicated things here, which is precisely what enables us to actually do “serialize to Rust code” in a reasonable way.</p>
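
<p>As a minimal sketch of what this looks like in practice (the names <code class="language-plaintext highlighter-rouge">squares</code> and <code class="language-plaintext highlighter-rouge">SQUARES</code> are made up for illustration), here is a <code class="language-plaintext highlighter-rouge">const fn</code> that mutates a local array inside a <code class="language-plaintext highlighter-rouge">while</code> loop, with the result stored in a <code class="language-plaintext highlighter-rouge">static</code>:</p>

```rust
const N: usize = 8;

// A `const fn` may mutate locals and use `while` loops. When it is used to
// initialize the `static` below, the compiler runs it at compile time.
const fn squares() -> [u32; N] {
    let mut table = [0u32; N];
    let mut i = 0;
    while i < N {
        table[i] = (i as u32) * (i as u32);
        i += 1;
    }
    table
}

// The finished table is stored in the binary; reading it at runtime
// involves no parsing and no validation.
static SQUARES: [u32; N] = squares();
```

<p>Reading <code class="language-plaintext highlighter-rouge">SQUARES[3]</code> at runtime is then just a memory load.</p>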

<h2 id="databake"><code class="language-plaintext highlighter-rouge">databake</code></h2>

<p><em>A lot of the design here can also be found in the <a href="https://docs.google.com/document/d/192l7yr6hVnG11Dr8a7mDLonIb6c8rr6zq-iswrZtlXE/edit">design doc</a>. While I did the bulk of the design for this crate, it was almost completely implemented by <a href="https://github.com/robertbastian">Robert</a>, who also worked on integrating it into ICU4X, and cleaned up the design in the process.</em></p>

<p>Enter <a href="https://docs.rs/databake"><code class="language-plaintext highlighter-rouge">databake</code></a> (née <code class="language-plaintext highlighter-rouge">crabbake</code>). <code class="language-plaintext highlighter-rouge">databake</code> is a crate that provides just this: the ability to serialize your types to <code class="language-plaintext highlighter-rouge">const</code> code that can then be used in <code class="language-plaintext highlighter-rouge">static</code>s, allowing for truly zero-cost data loading, no deserialization necessary!</p>

<p>The core entry point to <code class="language-plaintext highlighter-rouge">databake</code> is the <code class="language-plaintext highlighter-rouge">Bake</code> trait:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">pub</span> <span class="k">trait</span> <span class="n">Bake</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="nf">bake</span><span class="p">(</span><span class="o">&amp;</span><span class="k">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="o">&amp;</span><span class="n">CrateEnv</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="n">TokenStream</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>A <code class="language-plaintext highlighter-rouge">TokenStream</code> is the type typically used in Rust <a href="https://doc.rust-lang.org/reference/procedural-macros.html">procedural macros</a> to represent a snippet of Rust code. The <code class="language-plaintext highlighter-rouge">Bake</code> trait allows you to take an instance of a type, and convert it to Rust code that represents the same value.</p>

<p>The <code class="language-plaintext highlighter-rouge">CrateEnv</code> object is used to track which crates are needed, so that it is possible for tools generating this code to let the user know which direct dependencies are needed.</p>

<p>This trait is augmented by a <a href="https://docs.rs/databake/0.1.1/databake/derive.Bake.html"><code class="language-plaintext highlighter-rouge">#[derive(Bake)]</code></a> custom derive that can be used to apply it to most types automatically:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// inside crate `bar`, module `module.rs`</span>

<span class="k">use</span> <span class="nn">databake</span><span class="p">::</span><span class="n">Bake</span><span class="p">;</span>

<span class="nd">#[derive(Bake)]</span>
<span class="nd">#[databake(path</span> <span class="nd">=</span> <span class="nd">bar::module)]</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">Person</span><span class="o">&lt;</span><span class="nv">'a</span><span class="o">&gt;</span> <span class="p">{</span>
   <span class="k">pub</span> <span class="n">name</span><span class="p">:</span> <span class="o">&amp;</span><span class="nv">'a</span> <span class="nb">str</span><span class="p">,</span>
   <span class="k">pub</span> <span class="n">age</span><span class="p">:</span> <span class="nb">u32</span><span class="p">,</span>
<span class="p">}</span>
</code></pre></div></div>

<p>As with most custom derives, this only works on structs and enums that contain other types that already implement <code class="language-plaintext highlighter-rouge">Bake</code>. Most types not involving mandatory allocation should be able to.</p>
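
<p>To make the idea concrete, here is a dependency-free toy version of the same pattern. It is <em>not</em> <code class="language-plaintext highlighter-rouge">databake</code>’s API: the <code class="language-plaintext highlighter-rouge">ToyBake</code> trait is an invented stand-in that returns a <code class="language-plaintext highlighter-rouge">String</code> of Rust source rather than a <code class="language-plaintext highlighter-rouge">TokenStream</code> and takes no <code class="language-plaintext highlighter-rouge">CrateEnv</code>, and the exact text a real derive emits may well differ. The shape is the point: each field is baked recursively, and the results are wrapped in a struct expression that uses the type’s full path.</p>

```rust
/// Toy stand-in for a `Bake`-style trait (an assumption for illustration):
/// it returns Rust source as a `String` instead of a `TokenStream`.
pub trait ToyBake {
    fn toy_bake(&self) -> String;
}

impl ToyBake for u32 {
    fn toy_bake(&self) -> String {
        // Emit a suffixed literal so the type is unambiguous in the output.
        format!("{}u32", self)
    }
}

impl ToyBake for &str {
    fn toy_bake(&self) -> String {
        // `{:?}` produces a quoted, escaped string.
        format!("{:?}", self)
    }
}

pub struct Person<'a> {
    pub name: &'a str,
    pub age: u32,
}

// Roughly what a derive would write for us: bake every field, then wrap the
// results in a struct expression using the path `bar::module`.
impl ToyBake for Person<'_> {
    fn toy_bake(&self) -> String {
        format!(
            "bar::module::Person {{ name: {}, age: {} }}",
            self.name.toy_bake(),
            self.age.toy_bake()
        )
    }
}
```

<p>Baking a <code class="language-plaintext highlighter-rouge">Person</code> with this toy trait yields source text that could be pasted directly after <code class="language-plaintext highlighter-rouge">static P: bar::module::Person = </code> in another crate.</p>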

<h2 id="how-to-use-it">How to use it</h2>

<p><code class="language-plaintext highlighter-rouge">databake</code> itself doesn’t really prescribe any particular code generation strategy. It can be used in a proc macro, in a <code class="language-plaintext highlighter-rouge">build.rs</code>, or even in a separate binary. ICU4X does the latter, since that’s just what ICU4X’s model for data generation is: clients can use the binary to customize the format and contents of the data they need.</p>

<p>So a typical way of using this crate might be to do something like this in <code class="language-plaintext highlighter-rouge">build.rs</code>:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">some_dep</span><span class="p">::</span><span class="n">Data</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">databake</span><span class="p">::{</span><span class="n">Bake</span><span class="p">,</span> <span class="n">CrateEnv</span><span class="p">};</span>
<span class="k">use</span> <span class="nn">quote</span><span class="p">::</span><span class="n">quote</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">std</span><span class="p">::{</span><span class="n">env</span><span class="p">,</span> <span class="n">fs</span><span class="p">,</span> <span class="nn">path</span><span class="p">::</span><span class="n">Path</span><span class="p">};</span>

<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
   <span class="c1">// load data from file (the path is relative to build.rs)</span>
   <span class="k">let</span> <span class="n">json_data</span> <span class="o">=</span> <span class="nd">include_str!</span><span class="p">(</span><span class="s">"src/data.json"</span><span class="p">);</span>

   <span class="c1">// deserialize from json, panicking if the data is malformed</span>
   <span class="k">let</span> <span class="n">my_data</span><span class="p">:</span> <span class="n">Data</span> <span class="o">=</span> <span class="nn">serde_json</span><span class="p">::</span><span class="nf">from_str</span><span class="p">(</span><span class="n">json_data</span><span class="p">)</span><span class="nf">.unwrap</span><span class="p">();</span>

   <span class="c1">// get a token stream out of it; the CrateEnv tracks which crates</span>
   <span class="c1">// the generated code depends on</span>
   <span class="k">let</span> <span class="n">ctx</span> <span class="o">=</span> <span class="nn">CrateEnv</span><span class="p">::</span><span class="nf">default</span><span class="p">();</span>
   <span class="k">let</span> <span class="n">baked</span> <span class="o">=</span> <span class="n">my_data</span><span class="nf">.bake</span><span class="p">(</span><span class="o">&amp;</span><span class="n">ctx</span><span class="p">);</span>

   <span class="c1">// Construct rust code with this in a static</span>
   <span class="c1">// The quote macro is used by procedural macros to do easy codegen,</span>
   <span class="c1">// but it's useful in build scripts as well.</span>
   <span class="k">let</span> <span class="n">my_data_rs</span> <span class="o">=</span> <span class="nd">quote!</span> <span class="p">{</span>
      <span class="k">use</span> <span class="nn">some_dep</span><span class="p">::</span><span class="n">Data</span><span class="p">;</span>
      <span class="k">static</span> <span class="n">MY_DATA</span><span class="p">:</span> <span class="n">Data</span> <span class="o">=</span> #<span class="n">baked</span><span class="p">;</span>
   <span class="p">};</span>

   <span class="c1">// Write to file</span>
   <span class="k">let</span> <span class="n">out_dir</span> <span class="o">=</span> <span class="nn">env</span><span class="p">::</span><span class="nf">var_os</span><span class="p">(</span><span class="s">"OUT_DIR"</span><span class="p">)</span><span class="nf">.unwrap</span><span class="p">();</span>
   <span class="k">let</span> <span class="n">dest_path</span> <span class="o">=</span> <span class="nn">Path</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="o">&amp;</span><span class="n">out_dir</span><span class="p">)</span><span class="nf">.join</span><span class="p">(</span><span class="s">"data.rs"</span><span class="p">);</span>
   <span class="nn">fs</span><span class="p">::</span><span class="nf">write</span><span class="p">(</span>
      <span class="o">&amp;</span><span class="n">dest_path</span><span class="p">,</span>
      <span class="o">&amp;</span><span class="n">my_data_rs</span><span class="nf">.to_string</span><span class="p">()</span>
   <span class="p">)</span><span class="nf">.unwrap</span><span class="p">();</span>

   <span class="c1">// (Optional step omitted: run rustfmt on the file)</span>

   <span class="c1">// tell Cargo that we depend on this file</span>
   <span class="nd">println!</span><span class="p">(</span><span class="s">"cargo:rerun-if-changed=src/data.json"</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<h2 id="what-it-looks-like">What it looks like</h2>

<p>ICU4X generates all of its test data into JSON, <a href="https://docs.rs/postcard"><code class="language-plaintext highlighter-rouge">postcard</code></a>, and “baked” formats. For example, for <a href="https://github.com/unicode-org/icu4x/blob/7b52dbfe57043da5459c12627671a779d467dc0f/provider/testdata/data/json/decimal/symbols%401/ar-EG.json">this JSON data representing how a particular locale does numbers</a>, the “baked” data looks like <a href="https://github.com/unicode-org/icu4x/blob/7b52dbfe57043da5459c12627671a779d467dc0f/provider/testdata/data/baked/decimal/symbols_v1.rs#L24-L41">this</a>. That’s a rather simple data type, but we do use this for more complex data like <a href="https://raw.githubusercontent.com/unicode-org/icu4x/7b52dbfe57043da5459c12627671a779d467dc0f/provider/testdata/data/baked/datetime/datesymbols_v1.rs">date time symbol data</a>, which is unfortunately too big for GitHub to render normally.</p>

<p>ICU4X’s code for generating this is in <a href="https://github.com/unicode-org/icu4x/blob/3f4d841ef0b168031d837433d075308bbebf34b7/provider/datagen/src/databake.rs">this file</a>. It’s complicated primarily because ICU4X’s data generation pipeline is super configurable and complicated. The core thing that it does is, for each piece of data, it <a href="https://github.com/unicode-org/icu4x/blob/3f4d841ef0b168031d837433d075308bbebf34b7/provider/datagen/src/databake.rs#L118">calls <code class="language-plaintext highlighter-rouge">tokenize()</code></a>, which is a thin wrapper around <a href="https://github.com/unicode-org/icu4x/blob/882e23403327620e4aafde28a9a407bcc6245a54/provider/core/src/datagen/payload.rs#L131-L136">calling <code class="language-plaintext highlighter-rouge">.bake()</code> on the data and some other stuff</a>. It then takes all of the data and organizes it into files like those linked above, populated with a static for each piece of data. In our case, we include all this generated Rust code into our “testdata” crate as a module, but there are many possibilities here!</p>

<p>For our “test” data, which is currently 2.7 MB in the <a href="https://docs.rs/postcard"><code class="language-plaintext highlighter-rouge">postcard</code></a> format (which is optimized for being lightweight), the same data ends up being 11 MB of JSON, and 18 MB of generated Rust code! That’s … a lot of Rust code, and tools like rust-analyzer struggle to load it. It’s of course much smaller once compiled into the binary, though that’s much harder to measure, because Rust is quite aggressive at optimizing unused data out in the baked version (where it has ample opportunity to). From various unscientific tests, it seems like 2 MB of deduplicated postcard data corresponds to roughly 500 KB of deduplicated baked data. This makes sense, since one can expect baked data to be near the theoretical limit of how small the data is without applying some heavy compression. Furthermore, while we deduplicate baked data at a per-locale level, it can take advantage of LLVM’s ability to deduplicate statics further, so if, for example, two different locales have <em>mostly</em> the same data for a given data key<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> with some differences, LLVM may be able to use the same statics for sub-data.</p>

<h2 id="limitations">Limitations</h2>

<p><code class="language-plaintext highlighter-rouge">const</code> support in Rust still has a ways to go. For example, it doesn’t yet support creating objects like <code class="language-plaintext highlighter-rouge">String</code>s which are usually on the heap, though <a href="https://github.com/rust-lang/const-eval/issues/20">they are working on allowing this</a>. This isn’t a huge problem for us; all of our data already supports zero-copy deserialization, which means that for every instance of our data types, there is <em>some way</em> to represent it as a borrow from another <code class="language-plaintext highlighter-rouge">static</code>.</p>
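
<p>A small sketch of the “borrow from another static” trick, using only the standard library (the <code class="language-plaintext highlighter-rouge">Greeting</code> type is hypothetical and stands in for any zero-copy-friendly data struct): a <code class="language-plaintext highlighter-rouge">Cow</code> field cannot hold an owned <code class="language-plaintext highlighter-rouge">String</code> in a <code class="language-plaintext highlighter-rouge">static</code>, but it can hold a borrow of a string literal, which is itself static data.</p>

```rust
use std::borrow::Cow;

/// Hypothetical data struct whose text may be owned or borrowed.
pub struct Greeting<'a> {
    pub text: Cow<'a, str>,
}

// `Cow::Owned(String::from("hello"))` would need a heap allocation, which
// `const` evaluation does not support. `Cow::Borrowed` of a string literal
// needs none, so it is accepted in a `static`.
pub static GREETING: Greeting<'static> = Greeting {
    text: Cow::Borrowed("hello"),
};
```

<p>Code that consumes a <code class="language-plaintext highlighter-rouge">Greeting</code> does not need to care which variant it was handed, so the same type serves both deserialized and baked data.</p>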

<p>A more pesky limitation is that you can’t interact with traits in <code class="language-plaintext highlighter-rouge">const</code> environments. To some extent, were that possible, the purpose of this crate could also have been fulfilled by making the <code class="language-plaintext highlighter-rouge">serde</code> pipeline <code class="language-plaintext highlighter-rouge">const</code>-friendly<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>, and then the code snippet from the beginning of this post would work:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="n">DATA</span><span class="p">:</span> <span class="o">&amp;</span><span class="n">Data</span> <span class="o">=</span> <span class="o">&amp;</span><span class="nn">serde_json</span><span class="p">::</span><span class="nd">deserialize!</span><span class="p">(</span><span class="nd">include_bytes!</span><span class="p">(</span><span class="s">"./testdata.json"</span><span class="p">));</span>
</code></pre></div></div>

<p>This means that for things like <code class="language-plaintext highlighter-rouge">ZeroVec</code> (see <a href="http://manishearth.github.io/blog/2022/08/03/zero-copy-2-zero-copy-all-the-things/">part 2</a>), we can’t actually just make their safe constructors <code class="language-plaintext highlighter-rouge">const</code> and pass in data to be validated (the validation code is all behind traits), so we have to unsafely construct them. This is somewhat unfortunate. Ultimately, however, if the <code class="language-plaintext highlighter-rouge">zerovec</code> byte representation had trouble roundtripping we would have larger problems, so this doesn’t introduce a new surface of unsafety. We’re still able to validate things when <em>generating</em> the baked data; we just can’t get the compiler to also re-validate before agreeing to compile the <code class="language-plaintext highlighter-rouge">const</code> code.</p>
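
<p>The pattern can be sketched without <code class="language-plaintext highlighter-rouge">zerovec</code> at all. Everything below is invented for illustration (<code class="language-plaintext highlighter-rouge">Digits</code>, <code class="language-plaintext highlighter-rouge">Validate</code>, and <code class="language-plaintext highlighter-rouge">from_bytes_unchecked</code> are not <code class="language-plaintext highlighter-rouge">zerovec</code> APIs): the safe constructor goes through a trait and therefore cannot be a <code class="language-plaintext highlighter-rouge">const fn</code> on stable Rust, while the unchecked constructor is trivially <code class="language-plaintext highlighter-rouge">const</code> and is what generated code would call.</p>

```rust
/// Hypothetical wrapper over bytes that must contain ASCII digits only.
pub struct Digits<'a>(&'a [u8]);

/// Hypothetical validation trait, standing in for trait-based checks.
pub trait Validate {
    fn is_valid(bytes: &[u8]) -> bool;
}

pub struct AsciiDigits;

impl Validate for AsciiDigits {
    fn is_valid(bytes: &[u8]) -> bool {
        bytes.iter().all(|b| b.is_ascii_digit())
    }
}

impl<'a> Digits<'a> {
    /// Safe constructor. It calls a trait method, so it cannot be `const fn`.
    pub fn parse(bytes: &'a [u8]) -> Option<Self> {
        if AsciiDigits::is_valid(bytes) {
            Some(Digits(bytes))
        } else {
            None
        }
    }

    /// Unchecked constructor, usable in `const` contexts.
    ///
    /// # Safety
    /// The caller must guarantee that `bytes` contains only ASCII digits.
    pub const unsafe fn from_bytes_unchecked(bytes: &'a [u8]) -> Self {
        Digits(bytes)
    }

    pub fn as_bytes(&self) -> &[u8] {
        self.0
    }
}

// What generated code would contain. The generator is assumed to have run
// `Digits::parse` on these bytes before emitting this line.
pub static BAKED: Digits<'static> = unsafe { Digits::from_bytes_unchecked(b"2022") };
```

<p>The <code class="language-plaintext highlighter-rouge">unsafe</code> block is sound only because the tool that wrote it already validated the bytes; the compiler takes that on trust.</p>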

<h2 id="try-it-out">Try it out!</h2>

<p><a href="https://docs.rs/databake"><code class="language-plaintext highlighter-rouge">databake</code></a> is much less mature than <a href="https://docs.rs/yoke"><code class="language-plaintext highlighter-rouge">yoke</code></a> and <a href="https://docs.rs/zerovec"><code class="language-plaintext highlighter-rouge">zerovec</code></a>, but it does seem to work rather well so far. Try it out! Let me know what you think!</p>

<p><em>Thanks to <a href="https://twitter.com/plaidfinch">Finch</a>, <a href="https://twitter.com/yaahc_">Jane</a>, <a href="https://github.com/sffc">Shane</a>, and <a href="https://github.com/robertbastian">Robert</a> for reviewing drafts of this post</em></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>In ICU4X, a “data key” can be used to talk about a specific type of data, for example the decimal symbols data has a <code class="language-plaintext highlighter-rouge">decimal/symbols@1</code> data key. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>Mind you, this would not be an easy task, but it would likely integrate with the ecosystem really well. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Zero-Copy All the Things! (Zero-Copy #2)]]></title>
    <link href="http://manishearth.github.io/blog/2022/08/03/zero-copy-2-zero-copy-all-the-things/"/>
    <updated>2022-08-03T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2022/08/03/zero-copy-2-zero-copy-all-the-things</id>
    <content type="html"><![CDATA[<p><em>This is part 2 of a three-part series on interesting abstractions for zero-copy deserialization I’ve been working on over the last year. This part is about making zero-copy deserialization work for more types. Part 1 is about making it more pleasant to work with and can be found <a href="http://manishearth.github.io/blog/2022/08/03/zero-copy-1-not-a-yoking-matter/">here</a>; while Part 3 is about eliminating the deserialization step entirely and can be found <a href="http://manishearth.github.io/blog/2022/08/03/zero-copy-3-so-zero-its-dot-dot-dot-negative/">here</a>. The posts can be read in any order, though only the first post contains an explanation of what zero-copy deserialization</em> is.</p>

<h2 id="background">Background</h2>

<p><em>This section is the same as in the last article and can be skipped if you’ve read it</em></p>

<p>For the past year and a half I’ve been working full time on <a href="https://github.com/unicode-org/icu4x">ICU4X</a>, a new internationalization library in Rust being built under the Unicode Consortium as a collaboration between various companies.</p>

<p>There’s a lot I can say about ICU4X, but to focus on one core value proposition: we want it to be <em>modular</em> both in data and code. We want ICU4X to be usable on embedded platforms, where memory is at a premium. We want applications constrained by download size to be able to support all languages rather than pick a couple popular ones because they cannot afford to bundle in all that data. As a part of this, we want loading data to be <em>fast</em> and pluggable. Users should be able to design their own data loading strategies for their individual use cases.</p>

<p>See, a key part of performing correct internationalization is the <em>data</em>. Different locales<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> do things differently, and all of the information on this needs to go somewhere, preferably not code. You need data on how a particular locale formats dates<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>, or how plurals work in a particular language, or how to accurately segment languages like Thai which are typically not written with spaces so that you can insert linebreaks in appropriate positions.</p>

<p>Given the focus on data, a <em>very</em> attractive option for us is zero-copy deserialization. In the process of trying to do zero-copy deserialization well, we’ve built some cool new libraries; this article is about one of them.</p>

<h2 id="what-can-you-zero-copy">What can you zero-copy?</h2>

<div class="discussion discussion-note">
            <img class="bobblehead" width="60px" height="60px" title="Positive pion" alt="Speech bubble for character Positive pion" src="http://manishearth.github.io/images/pion-plus.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             If you’re unfamiliar with zero-copy deserialization, check out the explanation in the <a href="http://manishearth.github.io/blog/2022/08/03/zero-copy-1-not-a-yoking-matter/">previous article</a>!
            </div>
        </div>

<p>In the <a href="http://manishearth.github.io/blog/2022/08/03/zero-copy-1-not-a-yoking-matter/">previous article</a> we explored how zero-copy deserialization could be made more pleasant to work with by erasing the lifetimes. In essence, we were expanding our capabilities on <em>what you can do with</em> zero-copy data.</p>

<p>This article is about expanding our capabilities on <em>what we can make</em> zero-copy data.</p>

<p>We previously saw this struct:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[derive(Serialize,</span> <span class="nd">Deserialize)]</span>
<span class="k">struct</span> <span class="n">Person</span> <span class="p">{</span>
    <span class="c1">// this field is nearly free to construct</span>
    <span class="n">age</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span>
    <span class="c1">// constructing this will involve a small allocation and copy</span>
    <span class="n">name</span><span class="p">:</span> <span class="nb">String</span><span class="p">,</span>
    <span class="c1">// this may take a while</span>
    <span class="n">rust_files_written</span><span class="p">:</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">,</span>
<span class="p">}</span>
</code></pre></div></div>

<p>and made the <code class="language-plaintext highlighter-rouge">name</code> field zero-copy by replacing it with a <code class="language-plaintext highlighter-rouge">Cow&lt;'a, str&gt;</code>. However, we weren’t able to do the same with the <code class="language-plaintext highlighter-rouge">rust_files_written</code> field because <a href="https://docs.rs/serde"><code class="language-plaintext highlighter-rouge">serde</code></a> does not handle zero-copy deserialization for things other than <code class="language-plaintext highlighter-rouge">[u8]</code> and <code class="language-plaintext highlighter-rouge">str</code>. Forget nested collections like <code class="language-plaintext highlighter-rouge">Vec&lt;String&gt;</code> (as <code class="language-plaintext highlighter-rouge">&amp;[&amp;str]</code>), even <code class="language-plaintext highlighter-rouge">Vec&lt;u32&gt;</code> (as <code class="language-plaintext highlighter-rouge">&amp;[u32]</code>) can’t be made zero-copy easily!</p>

<p>This is not a fundamental restriction in zero-copy deserialization; indeed, the excellent <a href="https://docs.rs/rkyv"><code class="language-plaintext highlighter-rouge">rkyv</code></a> library is able to support data like this. However, it’s not as slam-dunk easy as <code class="language-plaintext highlighter-rouge">str</code> and <code class="language-plaintext highlighter-rouge">[u8]</code>, and it’s understandable that <a href="https://docs.rs/serde"><code class="language-plaintext highlighter-rouge">serde</code></a> wishes to not pick sides on any tradeoffs here and leave it up to the users.</p>

<p>So what’s the actual problem here?</p>

<h2 id="blefuscudian-bewilderment">Blefuscudian Bewilderment</h2>

<p>The short answer is: endianness, alignment, and for <code class="language-plaintext highlighter-rouge">Vec&lt;String&gt;</code>, indirection.</p>

<p>See, the way zero-copy deserialization works is by directly taking a pointer to the memory and declaring it to be the desired value. For this to work, that data <em>must</em> be of a kind that looks the same on all machines, and must be legal to take a reference to.</p>

<p>This is pretty straightforward for <code class="language-plaintext highlighter-rouge">[u8]</code> and <code class="language-plaintext highlighter-rouge">str</code>: their data is identical on every system. <code class="language-plaintext highlighter-rouge">str</code> does need a validation step to ensure it’s valid UTF-8, but the general thrust of zero-copy deserialization is to replace expensive deserialization with cheaper validation, so we’re fine with that.</p>
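<p>To make the “validate instead of deserialize” idea concrete, here is a minimal sketch using only the standard library. The helper <code class="language-plaintext highlighter-rouge">borrow_str</code> and its <code class="language-plaintext highlighter-rouge">start</code>/<code class="language-plaintext highlighter-rouge">len</code> parameters are made up for illustration; a real format would read the length from the buffer itself.</p>

```rust
/// Borrow a `&str` straight out of a serialized buffer.
/// Hypothetical helper: `start` and `len` stand in for whatever
/// framing a real wire format would provide.
fn borrow_str(buffer: &[u8], start: usize, len: usize) -> Option<&str> {
    let end = start.checked_add(len)?;
    // Bounds check: `None` if the requested range is not inside the buffer.
    let bytes = buffer.get(start..end)?;
    // The only real work is UTF-8 validation; nothing is allocated or copied.
    std::str::from_utf8(bytes).ok()
}
```

<p>The returned <code class="language-plaintext highlighter-rouge">&amp;str</code> points into the original buffer, so the cost is one pass of validation rather than an allocation plus a copy.</p>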

<p>On the other hand, the borrowed version of <code class="language-plaintext highlighter-rouge">Vec&lt;String&gt;</code>, <code class="language-plaintext highlighter-rouge">&amp;[&amp;str]</code>, is unlikely to look the same even across different executions of the program on the <em>same system</em>, because it contains pointers (indirection) that’ll change each time depending on the data source!</p>

<p>Pointers are hard. What about <code class="language-plaintext highlighter-rouge">Vec&lt;u32&gt;</code>/<code class="language-plaintext highlighter-rouge">[u32]</code>? Surely there’s nothing wrong with a pile of integers?</p>

<figure class="caption-wrapper center" style="width: 400px"><img class="caption" src="http://manishearth.github.io/images/post/castlevania-data.png" width="400" /><figcaption class="caption-text"><p><small>Dracula, dispensing wisdom on the subject of zero-copy deserialization.</small></p>
</figcaption></figure>

<p>This is where the endianness and alignment come in. Firstly, a <code class="language-plaintext highlighter-rouge">u32</code> doesn’t look exactly the same on all systems: some systems are “big endian”, where the integer <code class="language-plaintext highlighter-rouge">0x00ABCDEF</code> would be represented in memory as <code class="language-plaintext highlighter-rouge">[0x00, 0xAB, 0xCD, 0xEF]</code>, whereas others are “little endian” and would represent it as <code class="language-plaintext highlighter-rouge">[0xEF, 0xCD, 0xAB, 0x00]</code>. Most systems these days are little-endian, but not all, so you may need to care about this.</p>

<p>This would mean that a <code class="language-plaintext highlighter-rouge">[u32]</code> serialized on a little endian system would come out completely garbled on a big-endian system if we’re naïvely zero-copy deserializing.</p>
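<p>Here is a small sketch of that garbling using the standard library’s explicit byte-order conversions. The function names are invented for this example.</p>

```rust
/// Lay out a u32 the way a little-endian machine stores it in memory.
fn write_le(value: u32) -> [u8; 4] {
    value.to_le_bytes()
}

/// What a big-endian machine would see if it naively reinterpreted those bytes.
fn read_as_big_endian(bytes: [u8; 4]) -> u32 {
    u32::from_be_bytes(bytes)
}

/// Reading with an explicit byte order gives the same answer on every machine.
fn read_as_little_endian(bytes: [u8; 4]) -> u32 {
    u32::from_le_bytes(bytes)
}
```

<p>Feeding <code class="language-plaintext highlighter-rouge">write_le(0x00ABCDEF)</code> into <code class="language-plaintext highlighter-rouge">read_as_big_endian</code> yields <code class="language-plaintext highlighter-rouge">0xEFCDAB00</code>, while <code class="language-plaintext highlighter-rouge">read_as_little_endian</code> recovers the original value regardless of the host.</p>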

<p>Secondly, a lot of systems impose <em>alignment</em> restrictions on types like <code class="language-plaintext highlighter-rouge">u32</code>. A <code class="language-plaintext highlighter-rouge">u32</code> cannot be found at any old memory address; on most modern systems it must be found at a memory address that’s a multiple of 4. Similarly, a <code class="language-plaintext highlighter-rouge">u64</code> must be at a memory address that’s a multiple of 8, and so on. The subsection of data being serialized, however, may be found at any address. It’s possible to design a serialization framework where a particular field in the data is forced to have a particular alignment (<a href="https://docs.rs/rkyv/latest/rkyv/util/struct.AlignedVec.html">rkyv has this</a>); however, it’s kinda tricky and requires you to have control over the alignment of the original loaded data, which isn’t a part of serde’s model.</p>
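<p>As a rough illustration (standard library only, helper names invented here), the first function checks whether a byte slice happens to start at an address where a <code class="language-plaintext highlighter-rouge">u32</code> could legally live, and the second sidesteps the question by copying the four bytes out before interpreting them:</p>

```rust
use std::mem::align_of;

/// Would it be legal to treat the start of `bytes` as a `u32`?
/// This depends on where the slice happens to sit in memory.
fn is_aligned_for_u32(bytes: &[u8]) -> bool {
    (bytes.as_ptr() as usize) % align_of::<u32>() == 0
}

/// Read a little-endian u32 from any offset, aligned or not,
/// by copying its four bytes into a local array first.
fn read_u32_unaligned(bytes: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let mut chunk = [0u8; 4];
    chunk.copy_from_slice(bytes.get(offset..end)?);
    Some(u32::from_le_bytes(chunk))
}
```

<p>On targets where <code class="language-plaintext highlighter-rouge">u32</code> has an alignment of 4, only one in four consecutive byte offsets passes the first check, which is why a field in the middle of a serialized blob usually can’t be borrowed as <code class="language-plaintext highlighter-rouge">&amp;[u32]</code> directly.</p>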

<p>So how can we address this?</p>

<h2 id="zerovec-and-varzerovec">ZeroVec and VarZeroVec</h2>

<p><em>A lot of the design here can be found explained in the <a href="https://github.com/unicode-org/icu4x/blob/main/utils/zerovec/design_doc.md">design doc</a>.</em></p>

<p>After <a href="https://github.com/unicode-org/icu4x/issues/78#issuecomment-817090204">a bunch of discussions</a> with <a href="https://github.com/sffc">Shane</a>, we designed and wrote <a href="https://docs.rs/zerovec"><code class="language-plaintext highlighter-rouge">zerovec</code></a>, a crate that attempts to solve this problem, in a way that works with <a href="https://docs.rs/serde"><code class="language-plaintext highlighter-rouge">serde</code></a>.</p>

<p>The core abstractions of the crate are the two types, <a href="https://docs.rs/zerovec/latest/zerovec/enum.ZeroVec.html"><code class="language-plaintext highlighter-rouge">ZeroVec</code></a> and <a href="https://docs.rs/zerovec/latest/zerovec/enum.VarZeroVec.html"><code class="language-plaintext highlighter-rouge">VarZeroVec</code></a>, which are essentially zero-copy enabled versions of <code class="language-plaintext highlighter-rouge">Cow&lt;'a, [T]&gt;</code>, for fixed-size and variable-size <code class="language-plaintext highlighter-rouge">T</code> types.</p>

<p><a href="https://docs.rs/zerovec/latest/zerovec/enum.ZeroVec.html"><code class="language-plaintext highlighter-rouge">ZeroVec</code></a> can be used with any type implementing <a href="https://docs.rs/zerovec/latest/zerovec/ule/trait.ULE.html"><code class="language-plaintext highlighter-rouge">ULE</code></a> (more on what this means later), which is by default all of the integer types and can be extended to <em>most</em> <code class="language-plaintext highlighter-rouge">Copy</code> types. It’s rather similar to <code class="language-plaintext highlighter-rouge">&amp;[T]</code>, however instead of returning <em>references</em> to its elements, it copies them out. While <a href="https://docs.rs/zerovec/latest/zerovec/enum.ZeroVec.html"><code class="language-plaintext highlighter-rouge">ZeroVec</code></a> is a <code class="language-plaintext highlighter-rouge">Cow</code>-like borrowed-or-owned type<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup>, there is a fully borrowed variant <a href="https://docs.rs/zerovec/latest/zerovec/struct.ZeroSlice.html"><code class="language-plaintext highlighter-rouge">ZeroSlice</code></a> that it derefs to.</p>

<p>Similarly, <a href="https://docs.rs/zerovec/latest/zerovec/enum.VarZeroVec.html"><code class="language-plaintext highlighter-rouge">VarZeroVec</code></a> may be used with types implementing <a href="https://docs.rs/zerovec/latest/zerovec/ule/trait.VarULE.html"><code class="language-plaintext highlighter-rouge">VarULE</code></a> (e.g. <code class="language-plaintext highlighter-rouge">str</code>). It <em>is</em> able to hand out references: <code class="language-plaintext highlighter-rouge">VarZeroVec&lt;str&gt;</code> behaves very similarly to how <code class="language-plaintext highlighter-rouge">&amp;[str]</code> would work if such a type were allowed to exist in Rust. You can even nest them, making types like <code class="language-plaintext highlighter-rouge">VarZeroVec&lt;VarZeroSlice&lt;ZeroSlice&lt;u32&gt;&gt;&gt;</code>, the zero-copy equivalent of <code class="language-plaintext highlighter-rouge">Vec&lt;Vec&lt;Vec&lt;u32&gt;&gt;&gt;</code>.</p>

<p>There’s also a <a href="https://docs.rs/zerovec/latest/zerovec/enum.ZeroMap.html"><code class="language-plaintext highlighter-rouge">ZeroMap</code></a> type that provides a binary-search based map that works with types compatible with either <a href="https://docs.rs/zerovec/latest/zerovec/enum.ZeroVec.html"><code class="language-plaintext highlighter-rouge">ZeroVec</code></a> or <a href="https://docs.rs/zerovec/latest/zerovec/enum.VarZeroVec.html"><code class="language-plaintext highlighter-rouge">VarZeroVec</code></a>.</p>

<p>So, for example, to make the following struct zero-copy:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[derive(serde::Serialize,</span> <span class="nd">serde::Deserialize)]</span>
<span class="k">struct</span> <span class="n">DataStruct</span> <span class="p">{</span>
    <span class="n">nums</span><span class="p">:</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">u32</span><span class="o">&gt;</span><span class="p">,</span>
    <span class="n">chars</span><span class="p">:</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">char</span><span class="o">&gt;</span><span class="p">,</span>
    <span class="n">strs</span><span class="p">:</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">,</span>
<span class="p">}</span>
</code></pre></div></div>

<p>you can do something like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[derive(serde::Serialize,</span> <span class="nd">serde::Deserialize)]</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">DataStruct</span><span class="o">&lt;</span><span class="nv">'data</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="nd">#[serde(borrow)]</span>
    <span class="n">nums</span><span class="p">:</span> <span class="n">ZeroVec</span><span class="o">&lt;</span><span class="nv">'data</span><span class="p">,</span> <span class="nb">u32</span><span class="o">&gt;</span><span class="p">,</span>
    <span class="nd">#[serde(borrow)]</span>
    <span class="n">chars</span><span class="p">:</span> <span class="n">ZeroVec</span><span class="o">&lt;</span><span class="nv">'data</span><span class="p">,</span> <span class="nb">char</span><span class="o">&gt;</span><span class="p">,</span>
    <span class="nd">#[serde(borrow)]</span>
    <span class="n">strs</span><span class="p">:</span> <span class="n">VarZeroVec</span><span class="o">&lt;</span><span class="nv">'data</span><span class="p">,</span> <span class="nb">str</span><span class="o">&gt;</span><span class="p">,</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Once deserialized, the data can be accessed with <code class="language-plaintext highlighter-rouge">data.nums.get(index)</code> or <code class="language-plaintext highlighter-rouge">data.strs[index]</code>, etc.</p>

<p>Custom types can also be supported within these types with some effort. If you’d like the following complex data to be zero-copy:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[derive(Copy,</span> <span class="nd">Clone,</span> <span class="nd">PartialEq,</span> <span class="nd">Eq,</span> <span class="nd">Ord,</span> <span class="nd">PartialOrd,</span> <span class="nd">serde::Serialize,</span> <span class="nd">serde::Deserialize)]</span>
<span class="k">struct</span> <span class="n">Date</span> <span class="p">{</span>
    <span class="n">y</span><span class="p">:</span> <span class="nb">u64</span><span class="p">,</span>
    <span class="n">m</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span>
    <span class="n">d</span><span class="p">:</span> <span class="nb">u8</span>
<span class="p">}</span>

<span class="nd">#[derive(Clone,</span> <span class="nd">PartialEq,</span> <span class="nd">Eq,</span> <span class="nd">Ord,</span> <span class="nd">PartialOrd,</span> <span class="nd">serde::Serialize,</span> <span class="nd">serde::Deserialize)]</span>
<span class="k">struct</span> <span class="n">Person</span> <span class="p">{</span>
    <span class="n">birthday</span><span class="p">:</span> <span class="n">Date</span><span class="p">,</span>
    <span class="n">favorite_character</span><span class="p">:</span> <span class="nb">char</span><span class="p">,</span>
    <span class="n">name</span><span class="p">:</span> <span class="nb">String</span><span class="p">,</span>
<span class="p">}</span>

<span class="nd">#[derive(serde::Serialize,</span> <span class="nd">serde::Deserialize)]</span>
<span class="k">struct</span> <span class="n">Data</span> <span class="p">{</span>
    <span class="n">important_dates</span><span class="p">:</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Date</span><span class="o">&gt;</span><span class="p">,</span>
    <span class="n">important_people</span><span class="p">:</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Person</span><span class="o">&gt;</span><span class="p">,</span>
    <span class="n">birthdays_to_people</span><span class="p">:</span> <span class="n">HashMap</span><span class="o">&lt;</span><span class="n">Date</span><span class="p">,</span> <span class="n">Person</span><span class="o">&gt;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>you can do something like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// custom fixed-size ULE type for ZeroVec</span>
<span class="nd">#[zerovec::make_ule(DateULE)]</span>
<span class="nd">#[derive(Copy,</span> <span class="nd">Clone,</span> <span class="nd">PartialEq,</span> <span class="nd">Eq,</span> <span class="nd">Ord,</span> <span class="nd">PartialOrd,</span> <span class="nd">serde::Serialize,</span> <span class="nd">serde::Deserialize)]</span>
<span class="k">struct</span> <span class="n">Date</span> <span class="p">{</span>
    <span class="n">y</span><span class="p">:</span> <span class="nb">u64</span><span class="p">,</span>
    <span class="n">m</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span>
    <span class="n">d</span><span class="p">:</span> <span class="nb">u8</span>
<span class="p">}</span>

<span class="c1">// custom variable sized VarULE type for VarZeroVec</span>
<span class="nd">#[zerovec::make_varule(PersonULE)]</span>
<span class="nd">#[zerovec::derive(Serialize,</span> <span class="nd">Deserialize)]</span> <span class="c1">// add Serde impls to PersonULE</span>
<span class="nd">#[derive(Clone,</span> <span class="nd">PartialEq,</span> <span class="nd">Eq,</span> <span class="nd">Ord,</span> <span class="nd">PartialOrd,</span> <span class="nd">serde::Serialize,</span> <span class="nd">serde::Deserialize)]</span>
<span class="k">struct</span> <span class="n">Person</span><span class="o">&lt;</span><span class="nv">'data</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="n">birthday</span><span class="p">:</span> <span class="n">Date</span><span class="p">,</span>
    <span class="n">favorite_character</span><span class="p">:</span> <span class="nb">char</span><span class="p">,</span>
    <span class="nd">#[serde(borrow)]</span>
    <span class="n">name</span><span class="p">:</span> <span class="n">Cow</span><span class="o">&lt;</span><span class="nv">'data</span><span class="p">,</span> <span class="nb">str</span><span class="o">&gt;</span><span class="p">,</span>
<span class="p">}</span>

<span class="nd">#[derive(serde::Serialize,</span> <span class="nd">serde::Deserialize)]</span>
<span class="k">struct</span> <span class="n">Data</span><span class="o">&lt;</span><span class="nv">'data</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="nd">#[serde(borrow)]</span>
    <span class="n">important_dates</span><span class="p">:</span> <span class="n">ZeroVec</span><span class="o">&lt;</span><span class="nv">'data</span><span class="p">,</span> <span class="n">Date</span><span class="o">&gt;</span><span class="p">,</span>
    <span class="c1">// note: VarZeroVec always must reference the unsized ULE type directly</span>
    <span class="nd">#[serde(borrow)]</span>
    <span class="n">important_people</span><span class="p">:</span> <span class="n">VarZeroVec</span><span class="o">&lt;</span><span class="nv">'data</span><span class="p">,</span> <span class="n">PersonULE</span><span class="o">&gt;</span><span class="p">,</span>
    <span class="nd">#[serde(borrow)]</span>
    <span class="n">birthdays_to_people</span><span class="p">:</span> <span class="n">ZeroMap</span><span class="o">&lt;</span><span class="nv">'data</span><span class="p">,</span> <span class="n">Date</span><span class="p">,</span> <span class="n">PersonULE</span><span class="o">&gt;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Unfortunately the inner “ULE type” workings are not <em>completely</em> hidden from the user, especially for <code class="language-plaintext highlighter-rouge">VarZeroVec</code>-compatible types, but the crate does a fair number of things to attempt to make it pleasant to work with.</p>

<p>In general, <code class="language-plaintext highlighter-rouge">ZeroVec</code> should be used for types that are fixed-size and implement <code class="language-plaintext highlighter-rouge">Copy</code>, whereas <code class="language-plaintext highlighter-rouge">VarZeroVec</code> is to be used with types that logically contain a variable amount of data, like vectors, maps, strings, and aggregates of the same. <code class="language-plaintext highlighter-rouge">VarZeroVec</code> will always be used with a dynamically sized type, yielding references to that type.</p>

<p>I’ve noted before that these types are like <code class="language-plaintext highlighter-rouge">Cow&lt;'a, T&gt;</code>; they can be dealt with in a mutable-owned fashion, but it’s not the primary focus of the crate. In particular, <code class="language-plaintext highlighter-rouge">VarZeroVec&lt;T&gt;</code> will be significantly slower to mutate than something like <code class="language-plaintext highlighter-rouge">Vec&lt;String&gt;</code>, since all operations are done on the same buffer format. The general idea of this crate is that you probably will be <em>generating</em> your data in a situation without too many performance constraints, but you want the operation of <em>reading</em> the data to be fast. So, where necessary, the crate trades off mutation performance for deserialization/read performance. Still, it’s not terribly slow, just something to look out for and benchmark if necessary.</p>

<h2 id="how-it-works">How it works</h2>

<p>Most of the crate is built on the <a href="https://docs.rs/zerovec/latest/zerovec/ule/trait.ULE.html"><code class="language-plaintext highlighter-rouge">ULE</code></a> and <a href="https://docs.rs/zerovec/latest/zerovec/ule/trait.VarULE.html"><code class="language-plaintext highlighter-rouge">VarULE</code></a> traits. Both of these traits are <code class="language-plaintext highlighter-rouge">unsafe</code> traits (though as shown above most users need not manually implement them). “ULE” stands for “unaligned little-endian”, and marks types which have no alignment requirements and have the same representation across endiannesses, preferring to be identical to the little-endian representation where relevant<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup>.</p>

<p>There’s also a safe <a href="https://docs.rs/zerovec/latest/zerovec/ule/trait.AsULE.html"><code class="language-plaintext highlighter-rouge">AsULE</code></a> trait that allows one to convert a type between itself and some corresponding <code class="language-plaintext highlighter-rouge">ULE</code> type.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">pub</span> <span class="k">unsafe</span> <span class="k">trait</span> <span class="n">ULE</span><span class="p">:</span> <span class="nb">Sized</span> <span class="o">+</span> <span class="nb">Copy</span> <span class="o">+</span> <span class="k">'static</span> <span class="p">{</span>
    <span class="c1">// Validate that a byte slice is appropriate to treat as a reference to this type</span>
    <span class="k">fn</span> <span class="nf">validate_byte_slice</span><span class="p">(</span><span class="n">bytes</span><span class="p">:</span> <span class="o">&amp;</span><span class="p">[</span><span class="nb">u8</span><span class="p">])</span> <span class="k">-&gt;</span> <span class="nb">Result</span><span class="o">&lt;</span><span class="p">(),</span> <span class="n">ZeroVecError</span><span class="o">&gt;</span><span class="p">;</span>

    <span class="c1">// less relevant utility methods omitted</span>
<span class="p">}</span>

<span class="k">pub</span> <span class="k">trait</span> <span class="n">AsULE</span><span class="p">:</span> <span class="nb">Copy</span> <span class="p">{</span>
    <span class="k">type</span> <span class="n">ULE</span><span class="p">:</span> <span class="n">ULE</span><span class="p">;</span>

    <span class="c1">// Convert to the ULE type</span>
    <span class="k">fn</span> <span class="nf">to_unaligned</span><span class="p">(</span><span class="k">self</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="k">Self</span><span class="p">::</span><span class="n">ULE</span><span class="p">;</span>
    <span class="c1">// Convert back from the ULE type</span>
    <span class="k">fn</span> <span class="nf">from_unaligned</span><span class="p">(</span><span class="n">unaligned</span><span class="p">:</span> <span class="k">Self</span><span class="p">::</span><span class="n">ULE</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="k">Self</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">pub</span> <span class="k">unsafe</span> <span class="k">trait</span> <span class="n">VarULE</span><span class="p">:</span> <span class="k">'static</span> <span class="p">{</span>
    <span class="c1">// Validate that a byte slice is appropriate to treat as a reference to this type</span>
    <span class="k">fn</span> <span class="nf">validate_byte_slice</span><span class="p">(</span><span class="n">_bytes</span><span class="p">:</span> <span class="o">&amp;</span><span class="p">[</span><span class="nb">u8</span><span class="p">])</span> <span class="k">-&gt;</span> <span class="nb">Result</span><span class="o">&lt;</span><span class="p">(),</span> <span class="n">ZeroVecError</span><span class="o">&gt;</span><span class="p">;</span>

    <span class="c1">// Construct a reference to Self from a known-valid byte slice</span>
    <span class="c1">// This is necessary since VarULE types are dynamically sized and the working of the metadata</span>
    <span class="c1">// of the fat pointer varies between such types</span>
    <span class="k">unsafe</span> <span class="k">fn</span> <span class="nf">from_byte_slice_unchecked</span><span class="p">(</span><span class="n">bytes</span><span class="p">:</span> <span class="o">&amp;</span><span class="p">[</span><span class="nb">u8</span><span class="p">])</span> <span class="k">-&gt;</span> <span class="o">&amp;</span><span class="k">Self</span><span class="p">;</span>

    <span class="c1">// less relevant utility methods omitted</span>
<span class="p">}</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">ZeroVec&lt;T&gt;</code> takes in types that are <code class="language-plaintext highlighter-rouge">AsULE</code> and stores them internally as slices of their ULE types (<code class="language-plaintext highlighter-rouge">&amp;[T::ULE]</code>). Such slices can be freely zero-copy serialized. When you attempt to index a <code class="language-plaintext highlighter-rouge">ZeroVec</code>, it converts the value back to <code class="language-plaintext highlighter-rouge">T</code> on the fly, an operation that’s usually just an unaligned load.</p>
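<p>The following is a simplified, std-only sketch of that idea, not the crate’s actual implementation. <code class="language-plaintext highlighter-rouge">AsUle</code>, <code class="language-plaintext highlighter-rouge">U32Ule</code> and <code class="language-plaintext highlighter-rouge">U32View</code> are local stand-ins invented for this example, and the view works on a raw byte slice (copying four bytes per access) so that the sketch needs no <code class="language-plaintext highlighter-rouge">unsafe</code>.</p>

```rust
/// Local, simplified stand-in for the AsULE idea (not zerovec's trait).
trait AsUle: Copy {
    type Ule: Copy;
    fn to_unaligned(self) -> Self::Ule;
    fn from_unaligned(unaligned: Self::Ule) -> Self;
}

/// Four little-endian bytes: alignment 1, same layout on every machine.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
#[repr(transparent)]
struct U32Ule([u8; 4]);

impl AsUle for u32 {
    type Ule = U32Ule;
    fn to_unaligned(self) -> U32Ule {
        U32Ule(self.to_le_bytes())
    }
    fn from_unaligned(unaligned: U32Ule) -> u32 {
        u32::from_le_bytes(unaligned.0)
    }
}

/// Borrowed view over serialized u32s; "deserializing" is just a length check.
struct U32View<'a> {
    bytes: &'a [u8],
}

impl<'a> U32View<'a> {
    fn parse(bytes: &'a [u8]) -> Option<Self> {
        if bytes.len() % 4 == 0 {
            Some(U32View { bytes })
        } else {
            None
        }
    }

    fn len(&self) -> usize {
        self.bytes.len() / 4
    }

    /// Copies the element out and converts it back to a u32 on the fly.
    fn get(&self, index: usize) -> Option<u32> {
        let start = index.checked_mul(4)?;
        let end = start.checked_add(4)?;
        let mut raw = [0u8; 4];
        raw.copy_from_slice(self.bytes.get(start..end)?);
        Some(<u32 as AsUle>::from_unaligned(U32Ule(raw)))
    }
}
```

<p>Note that <code class="language-plaintext highlighter-rouge">get</code> returns a <code class="language-plaintext highlighter-rouge">u32</code> by value rather than a reference, which mirrors why <code class="language-plaintext highlighter-rouge">ZeroVec</code> copies elements out instead of handing out <code class="language-plaintext highlighter-rouge">&amp;T</code>.</p>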

<p><code class="language-plaintext highlighter-rouge">VarZeroVec&lt;T&gt;</code> is a bit more complicated. The beginning of its memory stores the indices of every element in the vector, followed by the data for all of the elements just splatted one after the other. As long as the dynamically sized data can be represented in a <em>flat</em> fashion (without further internal indirection), it can implement <code class="language-plaintext highlighter-rouge">VarULE</code>, and thus be used in <code class="language-plaintext highlighter-rouge">VarZeroVec&lt;T&gt;</code>. <code class="language-plaintext highlighter-rouge">str</code> implements this, but so do <code class="language-plaintext highlighter-rouge">ZeroSlice&lt;T&gt;</code> and <code class="language-plaintext highlighter-rouge">VarZeroSlice&lt;T&gt;</code>, allowing for infinite nesting of <code class="language-plaintext highlighter-rouge">zerovec</code> types!</p>
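<p>To give a feel for the “indices first, then data” layout, here is a sketch of one possible flat encoding for a list of strings. The exact format is an assumption made for this example (a little-endian <code class="language-plaintext highlighter-rouge">u32</code> count, then one <code class="language-plaintext highlighter-rouge">u32</code> start offset per element, then the concatenated bytes) and may well differ from what <code class="language-plaintext highlighter-rouge">zerovec</code> actually writes; <code class="language-plaintext highlighter-rouge">encode_strs</code> and <code class="language-plaintext highlighter-rouge">get_str</code> are hypothetical names.</p>

```rust
/// Read a little-endian u32 at byte position `at`, with no alignment requirement.
fn read_u32_le(buf: &[u8], at: usize) -> Option<usize> {
    let end = at.checked_add(4)?;
    let mut raw = [0u8; 4];
    raw.copy_from_slice(buf.get(at..end)?);
    Some(u32::from_le_bytes(raw) as usize)
}

/// Assumed layout: [count][start offset of each element][all element bytes].
/// Offsets are relative to the start of the data section.
fn encode_strs(items: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(items.len() as u32).to_le_bytes());
    let mut offset = 0u32;
    for item in items {
        out.extend_from_slice(&offset.to_le_bytes());
        offset += item.len() as u32;
    }
    for item in items {
        out.extend_from_slice(item.as_bytes());
    }
    out
}

/// Borrow element `index` out of the flat buffer without copying it.
fn get_str(buf: &[u8], index: usize) -> Option<&str> {
    let count = read_u32_le(buf, 0)?;
    if index >= count {
        return None;
    }
    let data_start = 4usize.checked_add(count.checked_mul(4)?)?;
    let data = buf.get(data_start..)?;
    let start = read_u32_le(buf, 4 + 4 * index)?;
    // An element ends where the next one starts; the last one runs to the end.
    let end = if index + 1 < count {
        read_u32_le(buf, 4 + 4 * (index + 1))?
    } else {
        data.len()
    };
    std::str::from_utf8(data.get(start..end)?).ok()
}
```

<p>Because every element is just a byte range inside one buffer, an element type that is itself flat (another list encoded the same way, say) can be stored in the data section too, which is what makes nesting possible.</p>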

<p><code class="language-plaintext highlighter-rouge">ZeroMap&lt;K, V&gt;</code> works similarly to the <a href="https://docs.rs/litemap"><code class="language-plaintext highlighter-rouge">litemap</code></a> crate: it’s a map built out of two vectors, using binary search to find keys. This isn’t always as efficient as a hash map but it can work well in a zero-copy way since it can just be backed by <code class="language-plaintext highlighter-rouge">ZeroVec</code> and <code class="language-plaintext highlighter-rouge">VarZeroVec</code>. There’s a bunch of trait infrastructure that allows it to automatically select <code class="language-plaintext highlighter-rouge">ZeroVec</code> or <code class="language-plaintext highlighter-rouge">VarZeroVec</code> for each of the key and value vectors based on the type of the key or value.</p>
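<p>The “two vectors plus binary search” shape can be sketched with plain <code class="language-plaintext highlighter-rouge">Vec</code>s. <code class="language-plaintext highlighter-rouge">SortedVecMap</code> is a made-up type for illustration; the real map is backed by the zero-copy vector types rather than <code class="language-plaintext highlighter-rouge">Vec</code>.</p>

```rust
/// A map stored as two parallel vectors, with keys kept sorted.
struct SortedVecMap<K, V> {
    keys: Vec<K>,
    values: Vec<V>,
}

impl<K: Ord, V> SortedVecMap<K, V> {
    /// Build the map once, up front; this is the slow, write-side step.
    fn from_pairs(mut pairs: Vec<(K, V)>) -> Self {
        // Stable sort by key, then drop later duplicates of the same key.
        pairs.sort_by(|a, b| a.0.cmp(&b.0));
        pairs.dedup_by(|later, earlier| later.0 == earlier.0);
        let (keys, values): (Vec<K>, Vec<V>) = pairs.into_iter().unzip();
        SortedVecMap { keys, values }
    }

    /// Lookup is a binary search over the keys, then an index into the values.
    fn get(&self, key: &K) -> Option<&V> {
        let index = self.keys.binary_search(key).ok()?;
        self.values.get(index)
    }
}
```

<p>Lookups are O(log n) instead of a hash map’s expected O(1), but neither vector needs any pointers or hashing state, so both can be stored as flat buffers.</p>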

<h2 id="what-about-rkyv">What about rkyv?</h2>

<p>An important question when we started down this path was: what about <a href="https://docs.rs/rkyv"><code class="language-plaintext highlighter-rouge">rkyv</code></a>? It had at the time just received a fair amount of attention in the Rust community, and seemed like a pretty cool library targeting the same space.</p>

<p>And in general if you’re looking for zero-copy deserialization, I wholeheartedly recommend looking at it! It’s an impressive library with a lot of thought put into it. When I was refining <a href="https://docs.rs/zerovec"><code class="language-plaintext highlighter-rouge">zerovec</code></a> I learned a lot from <a href="https://docs.rs/rkyv"><code class="language-plaintext highlighter-rouge">rkyv</code></a>, having some insightful discussions with <a href="https://github.com/djkoloski">David</a> and comparing notes on approaches.</p>

<p>The main sticking point, for us, was that <a href="https://docs.rs/rkyv"><code class="language-plaintext highlighter-rouge">rkyv</code></a> works kinda separately from <a href="https://docs.rs/serde"><code class="language-plaintext highlighter-rouge">serde</code></a>: it uses its own traits and own serialization mechanism. We really liked <a href="https://docs.rs/serde"><code class="language-plaintext highlighter-rouge">serde</code></a>’s model and wanted to keep using it, especially since we wanted to support a variety of human-readable and non-human-readable data formats, including <a href="https://docs.rs/postcard"><code class="language-plaintext highlighter-rouge">postcard</code></a>, which is explicitly designed for low-resource environments. This becomes even more important for data interchange; we’d want programs written in other languages to be able to construct and send over data without necessarily being constrained to a particular wire format.</p>

<p>The goal of <a href="https://docs.rs/zerovec/latest/zerovec/enum.ZeroVec.html"><code class="language-plaintext highlighter-rouge">zerovec</code></a> is essentially to bring <a href="https://docs.rs/rkyv"><code class="language-plaintext highlighter-rouge">rkyv</code></a>-like improvements to a <a href="https://docs.rs/serde"><code class="language-plaintext highlighter-rouge">serde</code></a> universe without disrupting that universe too much. <code class="language-plaintext highlighter-rouge">zerovec</code> types, on human-readable formats like JSON, serialize to a normal human-readable representation of the structure, and on binary formats like <a href="https://docs.rs/postcard"><code class="language-plaintext highlighter-rouge">postcard</code></a>, serialize to a compact, zero-copy-friendly representation that Just Works.</p>

<h2 id="how-does-it-perform">How does it perform?</h2>

<p>So off the bat I’ll mention that <a href="https://docs.rs/rkyv"><code class="language-plaintext highlighter-rouge">rkyv</code></a> maintains <a href="https://github.com/djkoloski/rust_serialization_benchmark">a very good benchmark suite</a> that I really need to get around to integrating with zerovec, but haven’t yet.</p>

<div class="discussion discussion-issue">
            <img class="bobblehead" width="60px" height="60px" title="Negative pion" alt="Speech bubble for character Negative pion" src="http://manishearth.github.io/images/pion-minus.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             Why not go do that first? It would make your post better!
            </div>
        </div>

<p>Well, I was delaying working on this post until I had those benchmarks integrated, but that’s not how executive function works, and at this point I’d rather publish with the benchmarks I have than delay further. I might update this post with the Good Benchmarks later!</p>

<div class="discussion discussion-issue">
            <img class="bobblehead" width="60px" height="60px" title="Negative pion" alt="Speech bubble for character Negative pion" src="http://manishearth.github.io/images/pion-minus.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             Hmph.
            </div>
        </div>

<p>The complete benchmark run details can be found <a href="https://gist.github.com/Manishearth/056a0ec12f9c943d71d214713d448ac0">here</a> (run via <code class="language-plaintext highlighter-rouge">cargo bench</code> at <a href="https://github.com/unicode-org/icu4x/tree/1e072b3248b93a974e21f3d01bc6a165eb272554/utils/zerovec"><code class="language-plaintext highlighter-rouge">1e072b32</code></a>). I’m pulling out some specific data points for illustration:</p>

<p><code class="language-plaintext highlighter-rouge">ZeroVec</code>:</p>

<table>
<thead><tr><th>Benchmark</th><th>Slice</th><th>ZeroVec</th></tr></thead>
<tbody>

   <tr><th>Deserialization (with <code>bincode</code>)</th></tr>
   <tr><th>Deserialize a vector of 100 u32s</th><td>141.55 ns</td><td>12.166 ns</td></tr>
   <tr><th>Deserialize a vector of 15 chars</th><td>225.55 ns</td><td>25.668 ns</td></tr>
   <tr><th>Deserialize and then sum a vector of 20 u32s</th><td>47.423 ns</td><td>14.131 ns</td></tr>

   <tr><th>Element fetching performance</th></tr>
   <tr><th>Sum a vector of 75 u32 elements</th><td>4.3091 ns</td><td>5.7108 ns</td></tr>
   <tr><th>Binary search a vector of 1000 u32 elements, 50 times</th><td>428.48 ns</td><td>565.23 ns</td></tr>
   <tr><th>Serialization</th></tr>

   <tr><th>Serialize a vector of 20 u32s</th><td>51.324 ns</td><td>21.582 ns</td></tr>
   <tr><th>Serialize a vector of 15 chars</th><td>195.75 ns</td><td>21.123 ns</td></tr>
</tbody>
</table>

<p><br />
In general we don’t care about serialization performance much; however, serialization is fast here because <code class="language-plaintext highlighter-rouge">ZeroVec</code>s are always stored in memory in the same form they would be serialized in. This can make mutation slower. Fetching operations are a little bit slower on <code class="language-plaintext highlighter-rouge">ZeroVec</code>. The deserialization performance is where we see our real wins, sometimes being more than ten times as fast!</p>

<p><code class="language-plaintext highlighter-rouge">VarZeroVec</code>:</p>

<p>The strings are randomly generated, picked with sizes between 2 and 20 code points, and the same set of strings is used for any given row.</p>

<table>
<thead><th>Benchmark</th><th><code>Vec&lt;String&gt;</code></th><th><code>Vec&lt;&amp;str&gt;</code></th><th>VarZeroVec</th></thead>
<tbody>

   <tr><th>Deserialize (len 100)</th><td>11.274 us</td><td>2.2486 us</td><td>1.9446 us</td></tr>

   <tr><th>Count code points (len 100)</th><td colspan="2">728.99 ns</td><td>1265.0 ns</td></tr>
   <tr><th>Binary search for 1 element (len 500)</th><td colspan="2">57.788 ns</td><td>122.10 ns</td></tr>
   <tr><th>Binary search for 10 elements (len 500)</th><td colspan="2">451.40 ns</td><td>803.67 ns</td></tr>

</tbody>
</table>
<p><br /></p>

<p>Here, fetching operations are a bit slower since they need to read the indexing array, but there’s still a decent win for zero-copy deserialization. The deserialization wins stack up for more complex data; for <code class="language-plaintext highlighter-rouge">Vec&lt;String&gt;</code> you can get <em>most</em> of the wins by using <code class="language-plaintext highlighter-rouge">Vec&lt;&amp;str&gt;</code>, but that’s not necessarily possible for something more complex. We don’t currently have mutation benchmarks for <code class="language-plaintext highlighter-rouge">VarZeroVec</code>, but mutation can be slow and as mentioned before it’s not intended to be used much in client code.</p>
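<p>As a rough illustration of why each fetch costs a little more, here is a sketch of a variable-length string list backed by an index array. The layout is a simplified assumption made up for this example (a separate array of little-endian <code class="language-plaintext highlighter-rouge">u32</code> start offsets plus a data buffer) and is <em>not</em> the actual <code class="language-plaintext highlighter-rouge">VarZeroVec</code> buffer format; <code class="language-plaintext highlighter-rouge">StrList</code> is a hypothetical type.</p>

```rust
/// Hypothetical, simplified string list: `indices` holds little-endian u32
/// start offsets into `data`. Not the real VarZeroVec format.
struct StrList<'a> {
    indices: &'a [u8],
    data: &'a [u8],
}

impl<'a> StrList<'a> {
    fn len(&self) -> usize {
        self.indices.len() / 4
    }

    /// Decode the start offset of element `i` from the index array.
    fn offset(&self, i: usize) -> Option<usize> {
        let c = self.indices.get(i * 4..i * 4 + 4)?;
        Some(u32::from_le_bytes([c[0], c[1], c[2], c[3]]) as usize)
    }

    /// Fetching an element means reading up to two index entries first.
    fn get(&self, i: usize) -> Option<&'a str> {
        let start = self.offset(i)?;
        // The last element runs to the end of the data buffer.
        let end = if i + 1 < self.len() {
            self.offset(i + 1)?
        } else {
            self.data.len()
        };
        // For simplicity this sketch validates UTF-8 on every access; a real
        // format would validate once, up front.
        std::str::from_utf8(self.data.get(start..end)?).ok()
    }
}

fn main() {
    // ["ab", "cde", "f"] with start offsets 0, 2, 5.
    let indices = [0u8, 0, 0, 0, 2, 0, 0, 0, 5, 0, 0, 0];
    let list = StrList { indices: &indices, data: b"abcdef" };
    println!("{:?}", list.get(1));
}
```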

<p>Some of this is still in flux; for example we are in the process of <a href="https://github.com/unicode-org/icu4x/pull/2306">making <code class="language-plaintext highlighter-rouge">VarZeroVec</code>’s buffer format configurable</a> so that users can pick their precise tradeoffs.</p>

<h2 id="try-it-out">Try it out!</h2>

<p>Similar to <a href="https://docs.rs/yoke"><code class="language-plaintext highlighter-rouge">yoke</code></a>, I don’t consider the <a href="https://docs.rs/zerovec/latest/zerovec/enum.ZeroVec.html"><code class="language-plaintext highlighter-rouge">zerovec</code></a> crate “done” yet, but it’s been in use in ICU4X for a year now and I consider it mature enough to recommend to others. Try it out! Let me know what you think!</p>

<p><em>Thanks to <a href="https://twitter.com/plaidfinch">Finch</a>, <a href="https://twitter.com/yaahc_">Jane</a>, and <a href="https://github.com/sffc">Shane</a> for reviewing drafts of this post</em></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>A <em>locale</em> is typically a language and location, though it may contain additional information like the writing system or even things like the calendar system in use. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>Bear in mind, this isn’t just a matter of picking a format like MM-DD-YYYY! Dates in just US English can look like <code class="language-plaintext highlighter-rouge">4/10/22</code> or <code class="language-plaintext highlighter-rouge">4/10/2022</code> or <code class="language-plaintext highlighter-rouge">April 10, 2022</code>, or <code class="language-plaintext highlighter-rouge">Sunday, April 10, 2022 C.E.</code>, or <code class="language-plaintext highlighter-rouge">Sun, Apr 10, 2022</code>, and that’s without even thinking about week numbers, quarters, or time! This quickly adds up to a decent amount of data for each locale. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>As mentioned in the previous post, while zero-copy deserializing, it is typical to use borrowed-or-owned types like <code class="language-plaintext highlighter-rouge">Cow</code> over pure borrowed types because data in a human-readable format is not guaranteed to be zero-copy deserializable. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>Most modern systems are little endian, so this imposes one fewer potential cost on conversion. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Not a Yoking Matter (Zero-Copy #1)]]></title>
    <link href="http://manishearth.github.io/blog/2022/08/03/zero-copy-1-not-a-yoking-matter/"/>
    <updated>2022-08-03T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2022/08/03/zero-copy-1-not-a-yoking-matter</id>
    <content type="html"><![CDATA[<p><em>This is part 1 of a three-part series on interesting abstractions for zero-copy deserialization I’ve been working on over the last year. This part is about making zero-copy deserialization more pleasant to work with. Part 2 is about making it work for more types and can be found <a href="http://manishearth.github.io/blog/2022/08/03/zero-copy-2-zero-copy-all-the-things/">here</a>; while Part 3 is about eliminating the deserialization step entirely and can be found <a href="http://manishearth.github.io/blog/2022/08/03/zero-copy-3-so-zero-its-dot-dot-dot-negative/">here</a>. The posts can be read in any order, though this post contains an explanation of what zero-copy deserialization</em> is.</p>

<h2 id="background">Background</h2>

<p>For the past year and a half I’ve been working full time on <a href="https://github.com/unicode-org/icu4x">ICU4X</a>, a new internationalization library in Rust being built under the Unicode Consortium as a collaboration between various companies.</p>

<p>There’s a lot I can say about ICU4X, but to focus on one core value proposition: we want it to be <em>modular</em> both in data and code. We want ICU4X to be usable on embedded platforms, where memory is at a premium. We want applications constrained by download size to be able to support all languages rather than pick a couple popular ones because they cannot afford to bundle in all that data. As a part of this, we want loading data to be <em>fast</em> and pluggable. Users should be able to design their own data loading strategies for their individual use cases.</p>

<p>See, a key part of performing correct internationalization is the <em>data</em>. Different locales<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> do things differently, and all of the information on this needs to go somewhere, preferably not code. You need data on how a particular locale formats dates<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>, or how plurals work in a particular language, or how to accurately segment languages like Thai which are typically not written with spaces so that you can insert linebreaks in appropriate positions.</p>

<p>Given the focus on data, a <em>very</em> attractive option for us is zero-copy deserialization. In the process of trying to do zero-copy deserialization well, we’ve built some cool new libraries, and this article is about one of them.</p>

<figure class="caption-wrapper center" style="width: 400px"><img class="caption" src="http://manishearth.github.io/images/post/cow-tools.png" width="400" /><figcaption class="caption-text"><p><small>Gary Larson, <a href="https://en.wikipedia.org/wiki/Cow_Tools">“Cow Tools”</a>, <em>The Far Side</em>. October 1982</small></p>
</figcaption></figure>

<h2 id="zero-copy-deserialization-the-basics">Zero-copy deserialization: the basics</h2>

<p><em>This section can be skipped if you’re already familiar with zero-copy deserialization in Rust</em></p>

<p>Deserialization typically involves two tasks, done in concert: validating the data, and constructing an in-memory representation that can be programmatically accessed; i.e., the final deserialized value.</p>

<p>Depending on the format, the former is typically rather fast, but the latter can be super slow, particularly for variable-sized data, which needs a new allocation and often a large copy.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[derive(Serialize,</span> <span class="nd">Deserialize)]</span>
<span class="k">struct</span> <span class="n">Person</span> <span class="p">{</span>
    <span class="c1">// this field is nearly free to construct</span>
    <span class="n">age</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span>
    <span class="c1">// constructing this will involve a small allocation and copy</span>
    <span class="n">name</span><span class="p">:</span> <span class="nb">String</span><span class="p">,</span>
    <span class="c1">// this may take a while</span>
    <span class="n">rust_files_written</span><span class="p">:</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">,</span>
<span class="p">}</span>
</code></pre></div></div>

<p>A typical binary data format will probably store this as a byte for the age, followed by the length of <code class="language-plaintext highlighter-rouge">name</code>, followed by the bytes for <code class="language-plaintext highlighter-rouge">name</code>, followed by another length for the vector, followed by a length and string data for each <code class="language-plaintext highlighter-rouge">String</code> value. Deserializing the <code class="language-plaintext highlighter-rouge">u8</code> age just involves reading it, but the other two fields require allocating sufficient memory and copying each byte over, in addition to any validation the types may need.</p>

<p>A common technique in this scenario is to skip the allocation and copy by simply <em>validating</em> the bytes and storing a <em>reference</em> to the original data. This can only be done for serialization formats where the data is represented identically in the serialized file and in the deserialized value.</p>
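<p>Here is a minimal sketch of the difference, using a toy format invented for this example (a one-byte length prefix followed by UTF-8 bytes) rather than any real serialization format. The hypothetical <code class="language-plaintext highlighter-rouge">parse_borrowed</code> only validates and returns a reference into the input, while <code class="language-plaintext highlighter-rouge">parse_owned</code> additionally allocates and copies.</p>

```rust
/// Parse a one-byte length prefix followed by that many bytes of UTF-8.
/// Returns the string and the remaining input. Toy format for illustration.
fn parse_borrowed(input: &[u8]) -> Option<(&str, &[u8])> {
    let (&len, rest) = input.split_first()?;
    let len = len as usize;
    if rest.len() < len {
        return None;
    }
    let (bytes, rest) = rest.split_at(len);
    // Validation only: no allocation and no copy. The returned `&str`
    // points directly into `input`.
    let s = std::str::from_utf8(bytes).ok()?;
    Some((s, rest))
}

/// The "traditional" version: same validation, plus an allocation and copy.
fn parse_owned(input: &[u8]) -> Option<(String, &[u8])> {
    let (s, rest) = parse_borrowed(input)?;
    Some((s.to_owned(), rest))
}

fn main() {
    let input = [3u8, b'a', b'b', b'c', 9];
    let (borrowed, _) = parse_borrowed(&input).unwrap();
    let (owned, _) = parse_owned(&input).unwrap();
    // The borrowed string lives inside `input`; the owned one does not.
    println!("{}", borrowed.as_ptr() == input[1..].as_ptr());
    println!("{}", owned.as_ptr() == input[1..].as_ptr());
}
```

<p>Note how the borrowed version ties the result to the lifetime of the input buffer; that tradeoff is what the rest of this post is about.</p>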

<p>When using <a href="https://docs.rs/serde"><code class="language-plaintext highlighter-rouge">serde</code></a> in Rust, this is typically done by using a <a href="https://doc.rust-lang.org/stable/std/borrow/struct.Cow.html"><code class="language-plaintext highlighter-rouge">Cow&lt;'a, T&gt;</code></a> with <code class="language-plaintext highlighter-rouge">#[serde(borrow)]</code>:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[derive(Serialize,</span> <span class="nd">Deserialize)]</span>
<span class="k">struct</span> <span class="n">Person</span><span class="o">&lt;</span><span class="nv">'a</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="n">age</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span>
    <span class="nd">#[serde(borrow)]</span>
    <span class="n">name</span><span class="p">:</span> <span class="n">Cow</span><span class="o">&lt;</span><span class="nv">'a</span><span class="p">,</span> <span class="nb">str</span><span class="o">&gt;</span><span class="p">,</span>
<span class="p">}</span>

</code></pre></div></div>

<p>Now, when <code class="language-plaintext highlighter-rouge">name</code> is being deserialized, the deserializer only needs to validate that it is in fact a valid UTF-8 <code class="language-plaintext highlighter-rouge">str</code>, and the final value for <code class="language-plaintext highlighter-rouge">name</code> will be a reference into the very data it is being deserialized from.</p>

<p>An <code class="language-plaintext highlighter-rouge">&amp;'a str</code> can also be used instead of the <code class="language-plaintext highlighter-rouge">Cow</code>, however this makes the <code class="language-plaintext highlighter-rouge">Deserialize</code> impl much less general, since formats that do <em>not</em> store strings identically to their in-memory representation (e.g. JSON with strings that include escapes) will not be able to fall back to an owned value. As a result of this, owned-or-borrowed <a href="https://doc.rust-lang.org/stable/std/borrow/struct.Cow.html"><code class="language-plaintext highlighter-rouge">Cow&lt;'a, T&gt;</code></a> is often a cornerstone of good design when writing Rust code partaking in zero-copy deserialization.</p>
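<p>The owned fallback can be sketched with the standard library alone. The escape scheme below is a toy assumption for this example (only <code class="language-plaintext highlighter-rouge">\n</code> and <code class="language-plaintext highlighter-rouge">\\</code>), not real JSON, and <code class="language-plaintext highlighter-rouge">unescape</code> is a hypothetical function: it borrows when the stored bytes already match the in-memory representation and allocates only when it has to.</p>

```rust
use std::borrow::Cow;

/// Decode a toy escaped string where `\n` means newline and `\\` means a
/// literal backslash. Borrows when there is nothing to decode.
fn unescape(input: &str) -> Cow<'_, str> {
    if !input.contains('\\') {
        // Stored form equals in-memory form: zero-copy is possible.
        return Cow::Borrowed(input);
    }
    // Otherwise we must build a new string, so fall back to an owned value.
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            match chars.next() {
                Some('n') => out.push('\n'),
                Some(other) => out.push(other),
                None => out.push('\\'),
            }
        } else {
            out.push(c);
        }
    }
    Cow::Owned(out)
}

fn main() {
    println!("{:?}", unescape("hello"));
    println!("{:?}", unescape("two\\nlines"));
}
```

<p>A function returning <code class="language-plaintext highlighter-rouge">&amp;str</code> would have no way to express the second case.</p>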

<div class="post-aside post-aside-note">You may notice that <code class="language-plaintext highlighter-rouge">rust_files_written</code> can’t be found in this new struct. This is because <a href="https://docs.rs/serde"><code class="language-plaintext highlighter-rouge">serde</code></a>, out of the box, can’t handle zero-copy deserialization for anything other than <code class="language-plaintext highlighter-rouge">str</code> and <code class="language-plaintext highlighter-rouge">[u8]</code>, for very good reasons. Other frameworks like <a href="https://docs.rs/rkyv"><code class="language-plaintext highlighter-rouge">rkyv</code></a> can, however we’ve also managed to make this possible with <a href="https://docs.rs/serde"><code class="language-plaintext highlighter-rouge">serde</code></a>. I’ll go in more depth about said reasons and our solution in <a href="http://manishearth.github.io/blog/2022/08/03/zero-copy-2-zero-copy-all-the-things/">part 2</a>.</div>

<div class="discussion discussion-example">
            <img class="bobblehead" width="60px" height="60px" title="Confused pion" alt="Speech bubble for character Confused pion" src="http://manishearth.github.io/images/pion-nought.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             Aren’t there still copies occurring here with the <code class="language-plaintext highlighter-rouge">age</code> field?
            </div>
        </div>

<p>Yes, “zero-copy” is somewhat of a misnomer, what it really means is “zero allocations”, or, alternatively, “zero large copies”. Look at it this way: data like <code class="language-plaintext highlighter-rouge">age</code> does get copied, but without, say, allocating a vector of <code class="language-plaintext highlighter-rouge">Person&lt;'a&gt;</code>, you’re only going to see that copy occur a couple times when individually deserializing <code class="language-plaintext highlighter-rouge">Person&lt;'a&gt;</code>s or when deserializing some struct that contains <code class="language-plaintext highlighter-rouge">Person&lt;'a&gt;</code> a couple times. To have a large copy occur <em>without</em> involving allocations, your type would have to be something that is that large on the stack in the first place, which people avoid in general because it means a large copy every time you move the value around even when you’re not deserializing.</p>

<h2 id="when-life-gives-you-lifetimes-">When life gives you lifetimes ….</h2>

<p>Zero-copy deserialization in Rust has one very pesky downside: the lifetimes. Suddenly, all of your deserialized types have lifetimes on them. Of course they would; they’re no longer self-contained, instead containing references to the data they were originally deserialized from!</p>

<p>This isn’t a problem unique to Rust, either: zero-copy deserialization always introduces more complex dependencies between your types, and different frameworks handle this differently, from leaving management of the lifetimes to the user to using reference counting or a GC to ensure the data sticks around. Rust serialization libraries can do stuff like this if they wish, too. In this case, <a href="https://docs.rs/serde"><code class="language-plaintext highlighter-rouge">serde</code></a>, in a very Rusty fashion, wants the library user to have control over the precise memory management here and surfaces this problem as a lifetime.</p>

<p>Unfortunately, lifetimes like these tend to make their way into everything. Every type holding onto your deserialized type needs a lifetime now and it’s likely going to become your users’ problem too.</p>

<p>Furthermore, Rust lifetimes are a purely compile-time construct. If your value is of a type with a lifetime, you need to know at compile time by when it will definitely no longer be in use, and you need to hold on to its source data until then. Rust’s design means that you don’t need to worry about getting this <em>wrong</em>, since the compiler will catch you, but you still need to <em>do it</em>.</p>

<p>All of this isn’t ideal for cases where you want to manage the lifetimes at runtime, e.g. if your data is being deserialized from a larger file and you wish to cache the loaded file as long as data deserialized from it is still around.</p>

<p>Typically in such cases you can use <a href="https://doc.rust-lang.org/stable/std/rc/struct.Rc.html"><code class="language-plaintext highlighter-rouge">Rc&lt;T&gt;</code></a>, which is effectively the “runtime instead of compile time” version of <code class="language-plaintext highlighter-rouge">&amp;'a T</code>’s safe shared reference, but this only works for cases where you’re sharing homogeneous types, whereas in this case we’re attempting to share different types deserialized from one blob of data, which itself is of a different type.</p>

<p>ICU4X would like users to be able to make use of caching and other data management strategies as needed, so this won’t do at all. For a while ICU4X had not one but <em>two</em> pervasive lifetimes threaded throughout most of its types: it was both confusing and not in line with our goals.</p>

<h2 id="-make-life-take-the-lifetimes-back">… make life take the lifetimes back</h2>

<p><em>A lot of the design here can be found explained in the <a href="https://github.com/unicode-org/icu4x/blob/main/utils/yoke/design_doc.md">design doc</a></em></p>

<p>After <a href="https://github.com/unicode-org/icu4x/issues/667#issuecomment-828123099">a bunch of discussion</a> on this, primarily with <a href="https://github.com/sffc">Shane</a>, I designed <a href="https://docs.rs/yoke"><code class="language-plaintext highlighter-rouge">yoke</code></a>, a crate that attempts to provide <em>lifetime erasure</em> in Rust via self-referential types.</p>

<div class="discussion discussion-example">
            <img class="bobblehead" width="60px" height="60px" title="Confused pion" alt="Speech bubble for character Confused pion" src="http://manishearth.github.io/images/pion-nought.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             Wait, <em>lifetime</em> erasure?
            </div>
        </div>

<p>Like type erasure! “Type erasure” (in Rust, done using <code class="language-plaintext highlighter-rouge">dyn Trait</code>) lets you take a compile time concept (the type of a value) and move it into something that can be decided at runtime. Analogously, the core value proposition of <code class="language-plaintext highlighter-rouge">yoke</code> is to take types burdened with the compile time concept of lifetimes and allow those lifetimes to be decided at runtime anyway.</p>

<div class="discussion discussion-example">
            <img class="bobblehead" width="60px" height="60px" title="Confused pion" alt="Speech bubble for character Confused pion" src="http://manishearth.github.io/images/pion-nought.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             Doesn’t <code class="language-plaintext highlighter-rouge">Rc&lt;T&gt;</code> already let you make lifetimes a runtime decision?
            </div>
        </div>

<p>Kind of, <code class="language-plaintext highlighter-rouge">Rc&lt;T&gt;</code> on its own lets you <em>avoid</em> compile-time lifetimes, whereas <code class="language-plaintext highlighter-rouge">Yoke</code> works with situations where there is already a lifetime (e.g. due to zero copy deserialization) that you want to paper over.</p>

<div class="discussion discussion-example">
            <img class="bobblehead" width="60px" height="60px" title="Confused pion" alt="Speech bubble for character Confused pion" src="http://manishearth.github.io/images/pion-nought.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             Cool! What does that look like?
            </div>
        </div>

<p>The general idea is that you can take a zero-copy deserializeable type like a <code class="language-plaintext highlighter-rouge">Cow&lt;'a, str&gt;</code> (or something more complicated) and “yoke” it to the value it was deserialized from, which we call a “cart”.</p>

<div class="discussion discussion-issue">
            <img class="bobblehead" width="60px" height="60px" title="Negative pion" alt="Speech bubble for character Negative pion" src="http://manishearth.github.io/images/pion-minus.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             <em>*groan*</em> not another crate named with a pun, Manish.
            </div>
        </div>

<p>I will never stop.</p>

<p>Anyway, here’s what that looks like.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Some types explicitly mentioned for clarity</span>

<span class="c1">// load a file</span>
<span class="k">let</span> <span class="n">file</span><span class="p">:</span> <span class="nb">Rc</span><span class="o">&lt;</span><span class="p">[</span><span class="nb">u8</span><span class="p">]</span><span class="o">&gt;</span> <span class="o">=</span> <span class="nn">fs</span><span class="p">::</span><span class="nf">read</span><span class="p">(</span><span class="s">"data.postcard"</span><span class="p">)</span><span class="o">?</span><span class="nf">.into</span><span class="p">();</span>

<span class="c1">// create a new Rc reference to the file data by cloning it,</span>
<span class="c1">// then use it as a cart for a Yoke</span>
<span class="k">let</span> <span class="n">y</span><span class="p">:</span> <span class="n">Yoke</span><span class="o">&lt;</span><span class="n">Cow</span><span class="o">&lt;</span><span class="k">'static</span><span class="p">,</span> <span class="nb">str</span><span class="o">&gt;</span><span class="p">,</span> <span class="nb">Rc</span><span class="o">&lt;</span><span class="p">[</span><span class="nb">u8</span><span class="p">]</span><span class="o">&gt;&gt;</span> <span class="o">=</span> <span class="nn">Yoke</span><span class="p">::</span><span class="nf">attach_to_cart</span><span class="p">(</span><span class="n">file</span><span class="nf">.clone</span><span class="p">(),</span> <span class="p">|</span><span class="n">contents</span><span class="p">|</span> <span class="p">{</span>
    <span class="c1">// deserialize from the file; `from_bytes` returns a `Result`,</span>
    <span class="c1">// which is unwrapped here to keep the example short</span>
    <span class="k">let</span> <span class="n">cow</span><span class="p">:</span> <span class="n">Cow</span><span class="o">&lt;</span><span class="nb">str</span><span class="o">&gt;</span> <span class="o">=</span> <span class="nn">postcard</span><span class="p">::</span><span class="nf">from_bytes</span><span class="p">(</span><span class="o">&amp;</span><span class="n">contents</span><span class="p">)</span><span class="nf">.unwrap</span><span class="p">();</span>
    <span class="n">cow</span>
<span class="p">});</span>

<span class="c1">// the string is still accessible with `.get()`</span>
<span class="nd">println!</span><span class="p">(</span><span class="s">"{}"</span><span class="p">,</span> <span class="n">y</span><span class="nf">.get</span><span class="p">());</span>

<span class="nf">drop</span><span class="p">(</span><span class="n">y</span><span class="p">);</span>
<span class="c1">// only now will the reference count on the file be decreased</span>
</code></pre></div></div>

<div class="post-aside post-aside-issue">Some of the APIs here may not quite work due to current compiler bugs. In this blog post I’m using the ideal version of these APIs for illustrative purposes, but it’s worth checking with the Yoke docs to see if you may need to use an alternate workaround API. <em>Most</em> of the bugs have been fixed as of Rust 1.61.</div>

<div class="discussion discussion-note">
            <img class="bobblehead" width="60px" height="60px" title="Positive pion" alt="Speech bubble for character Positive pion" src="http://manishearth.github.io/images/pion-plus.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             The example above uses <a href="https://docs.rs/postcard"><code class="language-plaintext highlighter-rouge">postcard</code></a>: <code class="language-plaintext highlighter-rouge">postcard</code> is a really neat <code class="language-plaintext highlighter-rouge">serde</code>-compatible binary serialization format, designed for use on resource constrained environments. It’s quite fast and has a low codesize, check it out!
            </div>
        </div>

<p>The type <code class="language-plaintext highlighter-rouge">Yoke&lt;Cow&lt;'static, str&gt;, Rc&lt;[u8]&gt;&gt;</code> is “a lifetime-erased <code class="language-plaintext highlighter-rouge">Cow&lt;str&gt;</code> ‘yoked’ to a backing data store ‘cart’ that is an <code class="language-plaintext highlighter-rouge">Rc&lt;[u8]&gt;</code>”. What this means is that the <code class="language-plaintext highlighter-rouge">Cow</code> contains references to data from the cart; however, the <code class="language-plaintext highlighter-rouge">Yoke</code> will hold on to the cart until it is done with it, which ensures the references from the <code class="language-plaintext highlighter-rouge">Cow</code> never dangle.</p>

<p>Most operations on the data within a <code class="language-plaintext highlighter-rouge">Yoke</code> operate via <code class="language-plaintext highlighter-rouge">.get()</code>, which in this case will return a <code class="language-plaintext highlighter-rouge">&amp;'a Cow&lt;'a, str&gt;</code>, where <code class="language-plaintext highlighter-rouge">'a</code> is the lifetime of the borrow taken by <code class="language-plaintext highlighter-rouge">.get()</code>. This keeps things safe: a <code class="language-plaintext highlighter-rouge">Cow&lt;'static, str&gt;</code> is not really safe to distribute in this case since the <code class="language-plaintext highlighter-rouge">Cow</code> is not actually borrowing from static data; however it’s fine as long as we transform the lifetime to something shorter during accesses.</p>

<p>Turns out, the <code class="language-plaintext highlighter-rouge">'static</code> found in <code class="language-plaintext highlighter-rouge">Yoke</code> types is actually a lie! Rust doesn’t really let you work with types with borrowed content without mentioning <em>some</em> lifetime, and here we want to relieve the compiler from its duty of managing lifetimes and manage them ourselves, so we need to give it <em>something</em> so that we can name the type, and <code class="language-plaintext highlighter-rouge">'static</code> is the only preexisting named lifetime in Rust.</p>

<p>The actual signature of <code class="language-plaintext highlighter-rouge">.get()</code> is <a href="https://docs.rs/yoke/latest/yoke/struct.Yoke.html#method.get">a bit weird</a> since it needs to be generic, but if our borrowed type is <code class="language-plaintext highlighter-rouge">Foo&lt;'a&gt;</code>, then the signature of <code class="language-plaintext highlighter-rouge">.get()</code> is something like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">impl</span> <span class="n">Yoke</span><span class="o">&lt;</span><span class="n">Foo</span><span class="o">&lt;</span><span class="k">'static</span><span class="o">&gt;&gt;</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="n">get</span><span class="o">&lt;</span><span class="nv">'a</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="nv">'a</span> <span class="k">self</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="o">&amp;</span><span class="nv">'a</span> <span class="n">Foo</span><span class="o">&lt;</span><span class="nv">'a</span><span class="o">&gt;</span> <span class="p">{</span>
        <span class="o">...</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
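<p>The shape of that signature can be tried out in safe Rust with a simplified stand-in. <code class="language-plaintext highlighter-rouge">Holder</code> below is a hypothetical type for illustration, not how <code class="language-plaintext highlighter-rouge">Yoke</code> is implemented: its <code class="language-plaintext highlighter-rouge">Cow</code> really is <code class="language-plaintext highlighter-rouge">'static</code>, so no unsafe code is needed. The point is only that callers never observe the <code class="language-plaintext highlighter-rouge">'static</code>; they get a value whose lifetime is tied to the borrow of <code class="language-plaintext highlighter-rouge">self</code>.</p>

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for a yoke. Here the value genuinely is `'static`;
/// in a real `Yoke` the `'static` is a placeholder for "borrows from the cart".
struct Holder {
    value: Cow<'static, str>,
}

impl Holder {
    /// The returned `Cow` lifetime is shortened to the borrow of `self`.
    /// This compiles because `Cow<'_, str>` is covariant in its lifetime.
    fn get<'a>(&'a self) -> &'a Cow<'a, str> {
        &self.value
    }
}

fn main() {
    let holder = Holder { value: Cow::Borrowed("hello") };
    let s: &str = holder.get();
    println!("{}", s);
    // Anything obtained through `get()` cannot outlive `holder`; the borrow
    // checker rejects code that tries to keep it around after `holder` is dropped.
}
```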

<p>For a type to be allowed within a <code class="language-plaintext highlighter-rouge">Yoke&lt;Y, C&gt;</code>, it must implement <code class="language-plaintext highlighter-rouge">Yokeable&lt;'a&gt;</code>. This trait is unsafe to manually implement; in most cases you should autoderive it with <code class="language-plaintext highlighter-rouge">#[derive(Yokeable)]</code>:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[derive(Yokeable,</span> <span class="nd">Serialize,</span> <span class="nd">Deserialize)]</span>
<span class="k">struct</span> <span class="n">Person</span><span class="o">&lt;</span><span class="nv">'a</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="n">age</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span>
    <span class="nd">#[serde(borrow)]</span>
    <span class="n">name</span><span class="p">:</span> <span class="n">Cow</span><span class="o">&lt;</span><span class="nv">'a</span><span class="p">,</span> <span class="nb">str</span><span class="o">&gt;</span><span class="p">,</span>
<span class="p">}</span>

<span class="k">let</span> <span class="n">person</span><span class="p">:</span> <span class="n">Yoke</span><span class="o">&lt;</span><span class="n">Person</span><span class="o">&lt;</span><span class="k">'static</span><span class="o">&gt;</span><span class="p">,</span> <span class="nb">Rc</span><span class="o">&lt;</span><span class="p">[</span><span class="nb">u8</span><span class="p">]</span><span class="o">&gt;&gt;</span> <span class="o">=</span> <span class="nn">Yoke</span><span class="p">::</span><span class="nf">attach_to_cart</span><span class="p">(</span><span class="n">file</span><span class="nf">.clone</span><span class="p">(),</span> <span class="p">|</span><span class="n">contents</span><span class="p">|</span> <span class="p">{</span>
    <span class="nn">postcard</span><span class="p">::</span><span class="nf">from_bytes</span><span class="p">(</span><span class="o">&amp;</span><span class="n">contents</span><span class="p">)</span><span class="nf">.unwrap</span><span class="p">()</span>
<span class="p">});</span>
</code></pre></div></div>

<p>Unlike most <code class="language-plaintext highlighter-rouge">#[derive]</code>s, <code class="language-plaintext highlighter-rouge">Yokeable</code> can be derived even if the fields do not already implement <code class="language-plaintext highlighter-rouge">Yokeable</code>, except for cases when fields with lifetimes also have other generic parameters. In such cases it typically suffices to tag the type with <code class="language-plaintext highlighter-rouge">#[yoke(prove_covariance_manually)]</code> and ensure any fields with lifetimes also implement <code class="language-plaintext highlighter-rouge">Yokeable</code>.</p>

<p>There’s a bunch more you can do with <code class="language-plaintext highlighter-rouge">Yoke</code>, for example you can “project” a yoke to get a new yoke with a subset of the data found in the initial one:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">person</span><span class="p">:</span> <span class="n">Yoke</span><span class="o">&lt;</span><span class="n">Person</span><span class="o">&lt;</span><span class="k">'static</span><span class="o">&gt;</span><span class="p">,</span> <span class="nb">Rc</span><span class="o">&lt;</span><span class="p">[</span><span class="nb">u8</span><span class="p">]</span><span class="o">&gt;&gt;</span> <span class="o">=</span> <span class="o">...</span><span class="err">.</span><span class="p">;</span>

<span class="k">let</span> <span class="n">person_name</span><span class="p">:</span> <span class="n">Yoke</span><span class="o">&lt;</span><span class="n">Cow</span><span class="o">&lt;</span><span class="k">'static</span><span class="p">,</span> <span class="nb">str</span><span class="o">&gt;</span><span class="p">,</span> <span class="nb">Rc</span><span class="o">&lt;</span><span class="p">[</span><span class="nb">u8</span><span class="p">]</span><span class="o">&gt;&gt;</span> <span class="o">=</span> <span class="n">person</span><span class="nf">.project</span><span class="p">(|</span><span class="n">p</span><span class="p">,</span> <span class="n">_</span><span class="p">|</span> <span class="n">p</span><span class="py">.name</span><span class="p">);</span>

</code></pre></div></div>

<p>This allows one to mix data coming from disparate Yokes.</p>

<p><code class="language-plaintext highlighter-rouge">Yoke</code>s are, perhaps surprisingly, <em>mutable</em> as well! They are, after all, primarily intended to be used with copy-on-write data, so there are ways to mutate them provided that no <em>additional</em> borrowed data sneaks in:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="k">mut</span> <span class="n">person</span><span class="p">:</span> <span class="n">Yoke</span><span class="o">&lt;</span><span class="n">Person</span><span class="o">&lt;</span><span class="k">'static</span><span class="o">&gt;</span><span class="p">,</span> <span class="nb">Rc</span><span class="o">&lt;</span><span class="p">[</span><span class="nb">u8</span><span class="p">]</span><span class="o">&gt;&gt;</span> <span class="o">=</span> <span class="o">...</span><span class="err">.</span><span class="p">;</span>

<span class="c1">// make the name sound fancier</span>
<span class="n">person</span><span class="nf">.with_mut</span><span class="p">(|</span><span class="n">person</span><span class="p">|</span> <span class="p">{</span>
    <span class="c1">// this will convert the `Cow` into an owned one</span>
    <span class="n">person</span><span class="py">.name</span><span class="nf">.to_mut</span><span class="p">()</span><span class="nf">.push_str</span><span class="p">(</span><span class="s">", Esq."</span><span class="p">)</span>
<span class="p">})</span>
</code></pre></div></div>
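
<p>The <code class="language-plaintext highlighter-rouge">Cow</code> behavior this mutation relies on can be tried with the standard library alone. What follows is a minimal sketch, not code from <code class="language-plaintext highlighter-rouge">yoke</code>; <code class="language-plaintext highlighter-rouge">make_fancy</code> and <code class="language-plaintext highlighter-rouge">is_borrowed</code> are hypothetical helper names made up for illustration:</p>

```rust
use std::borrow::Cow;

/// Appends a suffix in place, cloning the borrowed data only if needed.
/// (Hypothetical helper, not part of `yoke`.)
fn make_fancy(name: &mut Cow<'_, str>) {
    // `to_mut` turns a `Cow::Borrowed` into a `Cow::Owned` (allocating once)
    // and hands back a `&mut String`; an already-owned `Cow` is reused as-is.
    name.to_mut().push_str(", Esq.");
}

/// Reports whether a `Cow` currently borrows its data.
/// (Hypothetical helper, not part of `yoke`.)
fn is_borrowed(name: &Cow<'_, str>) -> bool {
    matches!(name, Cow::Borrowed(_))
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">make_fancy</code> on a <code class="language-plaintext highlighter-rouge">Cow::Borrowed</code> leaves it owned afterwards, so no new borrowed data enters the value.</p>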

<p>Overall <code class="language-plaintext highlighter-rouge">Yoke</code> is a pretty powerful abstraction, useful for a host of situations involving zero-copy deserialization as well as other cases involving heavy borrowing. In ICU4X the abstractions we use to load data always use <code class="language-plaintext highlighter-rouge">Yoke</code>s, allowing various data loading strategies, including caching, to be mixed.</p>

<h3 id="how-it-works">How it works</h3>

<div class="discussion discussion-note">
            <img class="bobblehead" width="60px" height="60px" title="Positive pion" alt="Speech bubble for character Positive pion" src="http://manishearth.github.io/images/pion-plus.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             Manish is about to say the word “covariant” so I’m going to get ahead of him and say: If you have trouble understanding this and the next section, don’t worry! The internal workings of his crate rely on multiple niche concepts that most Rustaceans never need to care about, even those working on otherwise advanced code.
            </div>
        </div>

<p><code class="language-plaintext highlighter-rouge">Yoke</code> works by relying on the concept of a <em>covariant lifetime</em>. The <a href="https://docs.rs/yoke/latest/yoke/trait.Yokeable.html"><code class="language-plaintext highlighter-rouge">Yokeable</code></a> trait looks like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">pub</span> <span class="k">unsafe</span> <span class="k">trait</span> <span class="n">Yokeable</span><span class="o">&lt;</span><span class="nv">'a</span><span class="o">&gt;</span><span class="p">:</span> <span class="k">'static</span> <span class="p">{</span>
    <span class="k">type</span> <span class="n">Output</span><span class="p">:</span> <span class="nv">'a</span><span class="p">;</span>
    <span class="c1">// methods omitted</span>
<span class="p">}</span>
</code></pre></div></div>

<p>and a typical implementation would look something like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">unsafe</span> <span class="k">impl</span><span class="o">&lt;</span><span class="nv">'a</span><span class="o">&gt;</span> <span class="n">Yokeable</span><span class="o">&lt;</span><span class="nv">'a</span><span class="o">&gt;</span> <span class="k">for</span> <span class="n">Cow</span><span class="o">&lt;</span><span class="k">'static</span><span class="p">,</span> <span class="nb">str</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">type</span> <span class="n">Output</span> <span class="o">=</span> <span class="n">Cow</span><span class="o">&lt;</span><span class="nv">'a</span><span class="p">,</span> <span class="nb">str</span><span class="o">&gt;</span><span class="p">;</span>
    <span class="c1">// ...</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This trait is implemented on the <code class="language-plaintext highlighter-rouge">'static</code> version of a type with a lifetime (which I will call <code class="language-plaintext highlighter-rouge">Self&lt;'static&gt;</code><sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup> in this post), and maps that type to a version of it with the lifetime <code class="language-plaintext highlighter-rouge">'a</code> (<code class="language-plaintext highlighter-rouge">Self&lt;'a&gt;</code>). It must only be implemented on types where the lifetime <code class="language-plaintext highlighter-rouge">'a</code> is <em>covariant</em>, i.e., where it’s safe to treat <code class="language-plaintext highlighter-rouge">Self&lt;'a&gt;</code> as <code class="language-plaintext highlighter-rouge">Self&lt;'b&gt;</code> when <code class="language-plaintext highlighter-rouge">'b</code> is a shorter lifetime. Most types with lifetimes fall in this category<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup>, especially in the space of zero-copy deserialization.</p>

<div class="discussion discussion-note">
            <img class="bobblehead" width="60px" height="60px" title="Positive pion" alt="Speech bubble for character Positive pion" src="http://manishearth.github.io/images/pion-plus.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             You can read more about variance in the <a href="https://doc.rust-lang.org/nomicon/subtyping.html">nomicon</a>!
            </div>
        </div>
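
<p>Covariance can be seen directly in plain Rust, without <code class="language-plaintext highlighter-rouge">yoke</code>. The sketch below is only an illustration of the property; <code class="language-plaintext highlighter-rouge">shorten</code> and <code class="language-plaintext highlighter-rouge">shorten_ref</code> are hypothetical names, and both bodies compile only because <code class="language-plaintext highlighter-rouge">Cow&lt;'a, str&gt;</code> is covariant in <code class="language-plaintext highlighter-rouge">'a</code>:</p>

```rust
use std::borrow::Cow;

/// Treats a `Cow<'static, str>` as a `Cow<'short, str>` for any shorter
/// lifetime. No conversion code is needed: the compiler accepts the value
/// as-is because the lifetime parameter is covariant.
fn shorten<'short>(long: Cow<'static, str>) -> Cow<'short, str> {
    long
}

/// The same coercion behind a shared reference.
fn shorten_ref<'short>(long: &'short Cow<'static, str>) -> &'short Cow<'short, str> {
    long
}
```

<p>If the lifetime were not covariant, the compiler would reject both functions, which is the kind of check a derive can lean on.</p>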

<p>For any <code class="language-plaintext highlighter-rouge">Yokeable</code> type <code class="language-plaintext highlighter-rouge">Foo&lt;'static&gt;</code>, you can obtain the version of that type with a lifetime <code class="language-plaintext highlighter-rouge">'a</code> with <code class="language-plaintext highlighter-rouge">&lt;Foo as Yokeable&lt;'a&gt;&gt;::Output</code>. The <code class="language-plaintext highlighter-rouge">Yokeable</code> trait exposes some methods that allow one to safely carry out the various transforms that are allowed on a type with a covariant lifetime.</p>

<p><code class="language-plaintext highlighter-rouge">#[derive(Yokeable)]</code>, in most cases, relies on the compiler’s ability to determine if a lifetime is covariant, and doesn’t actually generate much code! In most cases, the bodies of the various functions on <code class="language-plaintext highlighter-rouge">Yokeable</code> are pure safe code, looking like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">unsafe</span> <span class="k">impl</span><span class="o">&lt;</span><span class="nv">'a</span><span class="o">&gt;</span> <span class="n">Yokeable</span><span class="o">&lt;</span><span class="nv">'a</span><span class="o">&gt;</span> <span class="k">for</span> <span class="n">Foo</span><span class="o">&lt;</span><span class="k">'static</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">type</span> <span class="n">Output</span> <span class="o">=</span> <span class="n">Foo</span><span class="o">&lt;</span><span class="nv">'a</span><span class="o">&gt;</span><span class="p">;</span>
    <span class="k">fn</span> <span class="nf">transform</span><span class="p">(</span><span class="o">&amp;</span><span class="nv">'a</span> <span class="k">self</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="o">&amp;</span><span class="nv">'a</span> <span class="k">Self</span><span class="p">::</span><span class="n">Output</span> <span class="p">{</span>
        <span class="k">self</span>
    <span class="p">}</span>
    <span class="k">fn</span> <span class="nf">transform_owned</span><span class="p">(</span><span class="k">self</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="k">Self</span><span class="p">::</span><span class="n">Output</span> <span class="p">{</span>
        <span class="k">self</span>
    <span class="p">}</span>
    <span class="k">fn</span> <span class="n">transform_mut</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="nv">'a</span> <span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">f</span><span class="p">:</span> <span class="n">F</span><span class="p">)</span>
    <span class="k">where</span>
        <span class="n">F</span><span class="p">:</span> <span class="k">'static</span> <span class="o">+</span> <span class="k">for</span><span class="o">&lt;</span><span class="nv">'b</span><span class="o">&gt;</span> <span class="nf">FnOnce</span><span class="p">(</span><span class="o">&amp;</span><span class="nv">'b</span> <span class="k">mut</span> <span class="k">Self</span><span class="p">::</span><span class="n">Output</span><span class="p">)</span> <span class="p">{</span>
        <span class="nf">f</span><span class="p">(</span><span class="k">self</span><span class="p">)</span>
    <span class="p">}</span>
    <span class="c1">// fn make() omitted since it's not as relevant</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The compiler knows these are safe because it knows that the type is covariant, and the <code class="language-plaintext highlighter-rouge">Yokeable</code> trait allows us to talk about types where these operations are safe, <em>generically</em>.</p>

<div class="discussion discussion-note">
            <img class="bobblehead" width="60px" height="60px" title="Positive pion" alt="Speech bubble for character Positive pion" src="http://manishearth.github.io/images/pion-plus.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             In other words, there’s a certain useful property about lifetime “stretchiness” that the compiler knows about, and we can check that the property applies to a type by generating code that the compiler would refuse to compile if the property did not apply.
            </div>
        </div>

<p>Using this trait, <code class="language-plaintext highlighter-rouge">Yoke</code> then works by storing <code class="language-plaintext highlighter-rouge">Self&lt;'static&gt;</code> and transforming it to a shorter, more local lifetime before handing it out to any consumers, using the methods on <code class="language-plaintext highlighter-rouge">Yokeable</code> in various ways. Knowing that the lifetime is covariant is what makes it safe to do such lifetime “squeezing”. The <code class="language-plaintext highlighter-rouge">'static</code> is a lie, but it’s safe to do that kind of thing as long as the value isn’t actually accessed with the <code class="language-plaintext highlighter-rouge">'static</code> lifetime, and we take great care to ensure it doesn’t leak.</p>

<h2 id="better-conversions-zerofrom">Better conversions: ZeroFrom</h2>

<p>A crate that pairs well with this is <a href="https://docs.rs/zerofrom"><code class="language-plaintext highlighter-rouge">zerofrom</code></a>, primarily designed and written by <a href="https://github.com/sffc">Shane</a>. It comes with the <a href="https://docs.rs/zerofrom/latest/zerofrom/trait.ZeroFrom.html"><code class="language-plaintext highlighter-rouge">ZeroFrom</code></a> trait:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">pub</span> <span class="k">trait</span> <span class="n">ZeroFrom</span><span class="o">&lt;</span><span class="nv">'zf</span><span class="p">,</span> <span class="n">C</span><span class="p">:</span> <span class="o">?</span><span class="nb">Sized</span><span class="o">&gt;</span><span class="p">:</span> <span class="nv">'zf</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="nf">zero_from</span><span class="p">(</span><span class="n">other</span><span class="p">:</span> <span class="o">&amp;</span><span class="nv">'zf</span> <span class="n">C</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="k">Self</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The idea of this trait is to be able to work generically with types convertible to (often zero-copy) borrowed types.</p>

<p>For example, <code class="language-plaintext highlighter-rouge">Cow&lt;'zf, str&gt;</code> implements both <code class="language-plaintext highlighter-rouge">ZeroFrom&lt;'zf, str&gt;</code> and <code class="language-plaintext highlighter-rouge">ZeroFrom&lt;'zf, String&gt;</code>, as well as <code class="language-plaintext highlighter-rouge">ZeroFrom&lt;'zf, Cow&lt;'a, str&gt;&gt;</code>. It’s similar to the <a href="https://doc.rust-lang.org/stable/std/convert/trait.AsRef.html"><code class="language-plaintext highlighter-rouge">AsRef</code></a> trait but it allows for more flexibility in the kinds of borrowing occurring, and implementors are supposed to minimize the amount of copying during such a conversion. For example, when <code class="language-plaintext highlighter-rouge">ZeroFrom</code>-constructing a <code class="language-plaintext highlighter-rouge">Cow&lt;'zf, str&gt;</code> from some other <code class="language-plaintext highlighter-rouge">Cow&lt;'a, str&gt;</code>, it will <em>always</em> construct a <code class="language-plaintext highlighter-rouge">Cow::Borrowed</code>, even if the original <code class="language-plaintext highlighter-rouge">Cow&lt;'a, str&gt;</code> were owned.</p>
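
<p>To make the “always borrowed” behavior concrete, here is a rough sketch using a local stand-in trait with the same shape as the definition shown earlier, so that it compiles without the real crate. The three impls are my assumption of what such implementations could look like, not the actual <code class="language-plaintext highlighter-rouge">zerofrom</code> source:</p>

```rust
use std::borrow::Cow;

// Local stand-in mirroring the trait definition quoted in the post;
// this is NOT the real `zerofrom::ZeroFrom`.
trait ZeroFrom<'zf, C: ?Sized>: 'zf {
    fn zero_from(other: &'zf C) -> Self;
}

impl<'zf> ZeroFrom<'zf, str> for Cow<'zf, str> {
    fn zero_from(other: &'zf str) -> Self {
        Cow::Borrowed(other)
    }
}

impl<'zf> ZeroFrom<'zf, String> for Cow<'zf, str> {
    fn zero_from(other: &'zf String) -> Self {
        Cow::Borrowed(other.as_str())
    }
}

impl<'zf, 'a> ZeroFrom<'zf, Cow<'a, str>> for Cow<'zf, str> {
    fn zero_from(other: &'zf Cow<'a, str>) -> Self {
        // Deref to `&str` whether `other` is borrowed or owned;
        // the string data is never cloned.
        Cow::Borrowed(&**other)
    }
}
```

<p>Even when the source is a <code class="language-plaintext highlighter-rouge">Cow::Owned</code>, the result borrows from it rather than copying the string.</p>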

<p><code class="language-plaintext highlighter-rouge">Yoke</code> has a convenient constructor <a href="https://docs.rs/yoke/latest/yoke/struct.Yoke.html#method.attach_to_zero_copy_cart"><code class="language-plaintext highlighter-rouge">Yoke::attach_to_zero_copy_cart()</code></a> that can create a <code class="language-plaintext highlighter-rouge">Yoke&lt;Y, C&gt;</code> out of a cart type <code class="language-plaintext highlighter-rouge">C</code> if <code class="language-plaintext highlighter-rouge">Y&lt;'zf&gt;</code> implements <code class="language-plaintext highlighter-rouge">ZeroFrom&lt;'zf, C&gt;</code> for all lifetimes <code class="language-plaintext highlighter-rouge">'zf</code>. This is useful for cases where you want to do basic self-referential types but aren’t doing any fancy zero-copy deserialization.</p>

<h2 id="-make-life-rue-the-day-it-thought-it-could-give-you-lifetimes">… make life rue the day it thought it could give you lifetimes</h2>

<p>Life with this crate hasn’t been all peachy. We’ve, uh … <a href="https://github.com/rust-lang/rust/issues/90638">unfortunately</a> <a href="https://github.com/rust-lang/rust/issues/86703">discovered</a> <a href="https://github.com/rust-lang/rust/issues/88446">a</a> <a href="https://github.com/rust-lang/rust/issues/89436">toweringly</a> <a href="https://github.com/rust-lang/rust/issues/89196">large</a> <a href="https://github.com/rust-lang/rust/issues/84937">pile</a> <a href="https://github.com/rust-lang/rust/issues/89418">of</a> <a href="https://github.com/rust-lang/rust/issues/90950">gnarly</a> <a href="https://github.com/rust-lang/rust/issues/96223">compiler</a> <a href="https://github.com/rust-lang/rust/issues/91899">bugs</a>. A lot of this has its root in the fact that <code class="language-plaintext highlighter-rouge">Yokeable&lt;'a&gt;</code> in most cases is bound via <code class="language-plaintext highlighter-rouge">for&lt;'a&gt; Yokeable&lt;'a&gt;</code> (“<code class="language-plaintext highlighter-rouge">Yokeable&lt;'a&gt;</code> for all possible lifetimes <code class="language-plaintext highlighter-rouge">'a</code>”). The <code class="language-plaintext highlighter-rouge">for&lt;'a&gt;</code> is a niche feature known as a higher-ranked lifetime or trait bound (often referred to as “HRTB”), and while it’s always been necessary in some capacity for Rust’s typesystem to be able to reason about function pointers, it’s also always been rather buggy and is often discouraged for usages like this.</p>
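
<p>For readers who haven’t run into <code class="language-plaintext highlighter-rouge">for&lt;'a&gt;</code> before, here is a small sketch of a higher-ranked bound in its most common habitat, a closure bound. It is unrelated to <code class="language-plaintext highlighter-rouge">yoke</code>’s internals, and <code class="language-plaintext highlighter-rouge">picked_len</code> and <code class="language-plaintext highlighter-rouge">first_word</code> are hypothetical names:</p>

```rust
/// `F` must accept a `&str` of *any* lifetime, including one that only
/// exists inside this function body; that is what `for<'a>` expresses.
fn picked_len<F>(text: String, pick: F) -> usize
where
    F: for<'a> Fn(&'a str) -> &'a str,
{
    // The borrow of `text` is local, so the caller has no way to name
    // its lifetime up front.
    pick(text.as_str()).len()
}

/// Returns the first whitespace-separated word, or "" if there is none.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}
```

<p>A bound like <code class="language-plaintext highlighter-rouge">for&lt;'a&gt; Yokeable&lt;'a&gt;</code> applies the same “for every lifetime” idea to a trait instead of a function signature.</p>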

<p>We’re using it so that we can talk about the lifetime of a type in a generic sense. Fortunately, there is a language feature under active development that will be better suited for this: <a href="https://rust-lang.github.io/generic-associated-types-initiative/index.html">Generic Associated Types</a>.</p>

<p>This feature isn’t stable yet, but, fortunately for <em>us</em>, most compiler bugs involving <code class="language-plaintext highlighter-rouge">for&lt;'a&gt;</code> <em>also</em> impact GATs, so we have been benefitting from the GAT work, and a lot of our bug reports have helped shore up the GAT code. Huge shout out to <a href="https://github.com/jackh726">Jack Huey</a> for fixing a lot of these bugs, and <a href="https://github.com/eddyb">eddyb</a> for helping out in the debugging process.</p>

<p>As of Rust 1.61, a lot of the major bugs have been fixed; however, there are still some bugs around trait bounds for which the <code class="language-plaintext highlighter-rouge">yoke</code> crate maintains some <a href="https://docs.rs/yoke/latest/yoke/trait_hack/index.html">workaround helpers</a>. It has been our experience that most compiler bugs here are not <em>restrictive</em> when it comes to what you can do with the crate, but they may leave you with code that looks less than ideal. Overall, we still find it worth it: we’re able to do some really neat zero-copy stuff in a way that’s externally convenient (even if some of the internal code is messy), and we don’t have lifetimes everywhere.</p>

<h2 id="try-it-out">Try it out!</h2>

<p>While I don’t consider the <a href="https://docs.rs/yoke"><code class="language-plaintext highlighter-rouge">yoke</code></a> crate “done” yet, it’s been in use in ICU4X for a year now and I consider it mature enough to recommend to others. Try it out! Let me know what you think!</p>

<p><em>Thanks to <a href="https://twitter.com/plaidfinch">Finch</a>, <a href="https://twitter.com/yaahc_">Jane</a>, and <a href="https://github.com/sffc">Shane</a> for reviewing drafts of this post.</em></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>A <em>locale</em> is typically a language and location, though it may contain additional information like the writing system or even things like the calendar system in use. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>Bear in mind, this isn’t just a matter of picking a format like MM-DD-YYYY! Dates in just US English can look like <code class="language-plaintext highlighter-rouge">4/10/22</code> or <code class="language-plaintext highlighter-rouge">4/10/2022</code> or <code class="language-plaintext highlighter-rouge">April 10, 2022</code>, or <code class="language-plaintext highlighter-rouge">Sunday, April 10, 2022 C.E.</code>, or <code class="language-plaintext highlighter-rouge">Sun, Apr 10, 2022</code>, and that’s without even thinking about week numbers, quarters, or time! This quickly adds up to a decent amount of data for each locale. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>This isn’t real Rust syntax, since <code class="language-plaintext highlighter-rouge">Self</code> is always just <code class="language-plaintext highlighter-rouge">Self</code>, but we need to be able to refer to <code class="language-plaintext highlighter-rouge">Self</code> as a higher-kinded type in this scenario. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>Types that aren’t are ones involving mutability (<code class="language-plaintext highlighter-rouge">&amp;mut</code> or interior mutability) around the lifetime, and ones involving function pointers and trait objects. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Colophon: Waiter, There Are Pions in My Blog Post!]]></title>
    <link href="http://manishearth.github.io/blog/2022/08/03/colophon-waiter-there-are-pions-in-my-blog-post/"/>
    <updated>2022-08-03T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2022/08/03/colophon-waiter-there-are-pions-in-my-blog-post</id>
    <content type="html"><![CDATA[<p>I’ve added a couple of new styling elements to my blog and I make use of them extensively in upcoming posts, so I thought I’d talk a bit about them because I expect people will have questions.</p>

<p>The main thing is that I have nice aside styling.</p>

<div class="post-aside post-aside-example">Asides are things that look like <em>this</em>.</div>

<p>The actual color scheme comes from the asides used in <a href="https://tabatkins.github.io/bikeshed/">bikeshed</a>, the tool used to generate Web specifications and C++ spec proposals. I edit a couple <a href="https://immersive-web.github.io/webxr/">WebXR</a> specs and used to read specs very often when I worked on browsers; I like the styling they use.</p>

<div class="post-aside post-aside-note">I also think it’s kinda funny to get readers to do a double-take when they see familiar styling out of context.</div>

<p>Besides the “example” and “note” ones shown above, there’s also an “issue” one.</p>

<div class="post-aside post-aside-issue">Figure out a way to include an example of an “issue” aside in this post that works in context like the “note” and “example” ones.</div>

<p>These asides are useful for calling out supplemental information, and add to my existing repertoire of footnotes, em dashes, semicolons, and parentheses as a nonlinear writing tool.</p>

<div class="discussion discussion-issue">
            <img class="bobblehead" width="60px" height="60px" title="Negative pion" alt="Speech bubble for character Negative pion" src="http://manishearth.github.io/images/pion-minus.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             Manish, you’re burying the lede.
            </div>
        </div>

<p>Oh, right. Fine. The pions.</p>

<p>I’ve also introduced three similarly styled “character discussion” asides that show a discussion with a <a href="https://en.wikipedia.org/wiki/Pion">pion</a>. There’s a “positive” one that’s generally helpful, a “negative” one that’s grumpy, and a “confused” (neutrally charged) one that asks questions.</p>

<p>Having little characters participate in the blog post works really well; it gives a sense of <em>flow</em> to the articles. I’m also hoping it makes them easier to read, breaking up otherwise dense technical content with lighter conversational content<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">1</a></sup>. Asking questions through them gives the reader an anchor point for themselves if they’re similarly confused, without me making any assumption that the reader did or didn’t understand a part.</p>

<p>They’re also <em>yet another</em> way for me to write nonlinearly, and I <em>love</em> writing nonlinearly.</p>

<p>Adding interlocutors to my blog was extremely inspired by other people: <a href="https://fasterthanli.me/articles">Amos</a> has Cool Bear, and <a href="https://xeiaso.net/blog/">Xe</a> has <a href="https://xeiaso.net/blog/how-mara-works-2020-09-30">Mara</a>, both of which serve similar purposes. While <a href="https://myrrlyn.net/blog">Alex</a> doesn’t quite have interlocutors, their use of <a href="https://en.wikipedia.org/wiki/ISO_7010">ISO 7010</a> icons for asides gave me the idea to use something relevant to my interests while picking characters.</p>

<div class="discussion discussion-example">
            <img class="bobblehead" width="60px" height="60px" title="Confused pion" alt="Speech bubble for character Confused pion" src="http://manishearth.github.io/images/pion-nought.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             Okay but why pions? Scratch that, <em>what</em> is a pion?
            </div>
        </div>

<p>You’re a pion!</p>

<div class="discussion discussion-example">
            <img class="bobblehead" width="60px" height="60px" title="Confused pion" alt="Speech bubble for character Confused pion" src="http://manishearth.github.io/images/pion-nought.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             You know what I meant!
            </div>
        </div>

<p>Okay, okay.</p>

<p>So a <a href="https://en.wikipedia.org/wiki/Pion">pion</a> is a type of subatomic particle, and is part of the mechanism holding the nucleus of an atom together. There are three of them, π<sup>0</sup> , π<sup>+</sup> , and π<sup>−</sup>, with the positive and negative ones being antiparticles of each other.</p>

<p>As for <em>why</em> I’ve chosen that particle in particular, explaining that requires some history first.</p>

<p>Back when we only knew about protons, neutrons, and electrons, physicists were attempting to figure out how atomic nuclei stay together. They’re made of protons and neutrons, which means they’re just a lot of positively and neutrally charged thingies crammed into a tight space. There’s not much reason for that to want to stay together, but there’s plenty of reason for it to come apart. This is rather concerning to beings made out of atoms, like physicists.</p>

<p>To resolve this, physicists theorized the existence of the pion<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">2</a></sup>, a type of particle that is exchanged between protons and neutrons in the nucleus and forms a mechanism for carrying force.</p>

<div class="discussion discussion-note">
            <img class="bobblehead" width="60px" height="60px" title="Positive pion" alt="Speech bubble for character Positive pion" src="http://manishearth.github.io/images/pion-plus.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             <p><em>Why</em> “exchanging particles” works to carry force is a complicated topic (and “exchanging particles” is a very simplistic characterization) that would be very hard to explain here, but you can read up on “force carriers” in quantum field theory if you’re interested.</p>

<p>A useful example is that photons are force carriers for the electromagnetic force, which is why you can make radio waves (made up of photons) by messing with electromagnetic fields.</p>
            </div>
        </div>

<p>And when physicists think there’s a new particle, of course they go looking for it. And they did!</p>

<p>… but they found something else entirely.</p>

<p>Bear in mind, this was not the heyday of particle discovery when there was a new particle being discovered every Tuesday. Physicists knew about protons, neutrons, electrons, and probably pions, and were justifiably surprised when a completely different particle came knocking. Instead of having the properties they expected for the pion, it was basically like a heavier electron.</p>

<p>The physicist I. I. Rabi famously remarked “Who ordered that?” when they figured out what had happened. There was no <em>reason</em> for such a particle to exist: they had this nice, consistent model of the atom that needed four kinds of particle, and they found this fifth one just floating around, not really <em>doing</em> anything.</p>

<p>This particle was the <a href="https://en.wikipedia.org/wiki/Muon">muon</a>, and this story is why I use a Greek μ as my avatar everywhere<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">3</a></sup>. Given this history, the pion feels like a very natural choice as a foil in blog posts I write.</p>

<p>Furthermore, there are three of them, which lets me use them for different purposes! One for “positive” commentary, one for “negative” commentary, and a third “confused” one for questions.</p>

<div class="discussion discussion-note">
            <img class="bobblehead" width="60px" height="60px" title="Positive pion" alt="Speech bubble for character Positive pion" src="http://manishearth.github.io/images/pion-plus.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             The neutral pion being “confused” actually works really well because while π<sup>+</sup> (and π<sup>−</sup>) have straightforward <a href="https://en.wikipedia.org/wiki/Quark">quark</a> representations of an up and antidown quark (and a down and an antiup quark for π<sup>−</sup>), π<sup>0</sup> is a superposition between either an up and antiup quark or a down and antidown quark.
            </div>
        </div>

<p><br /></p>

<p>I can’t wait for people to get to see more of these in my upcoming posts; I really enjoyed writing with them!</p>

<div class="discussion discussion-example">
            <img class="bobblehead" width="60px" height="60px" title="Confused pion" alt="Speech bubble for character Confused pion" src="http://manishearth.github.io/images/pion-nought.png" />
            <div class="discussion-spacer"></div>
            <div class="discussion-text">
             One last question, are these supposed to be three characters or one character with different moods?
            </div>
        </div>

<p><a href="https://en.wikipedia.org/wiki/One-electron_universe">You tell me</a>.</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:5" role="doc-endnote">
      <p>I’m reminded of how a lot of people don’t enjoy reading Tolkien because he spends pages describing, like, one tree, as opposed to most fiction which has plenty of conversational content. Nonfiction books (and blog posts) have the wall-of-description property <em>by default</em> so spending time on improving this makes a lot of sense to me. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:1" role="doc-endnote">
      <p>At the time, they called them “mesons”, which is currently the name of a general class of particles which pions belong to. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>My name starting with an M and my interest in writing systems is a <em>part</em> of it, but the main reason is that I really like this story and this kind of thing is what got me into physics in the first place. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A Tour of Safe Tracing GC Designs in Rust]]></title>
    <link href="http://manishearth.github.io/blog/2021/04/05/a-tour-of-safe-tracing-gc-designs-in-rust/"/>
    <updated>2021-04-05T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2021/04/05/a-tour-of-safe-tracing-gc-designs-in-rust</id>
    <content type="html"><![CDATA[<p>I’ve been thinking about garbage collection in Rust for a long time, ever since I started working on <a href="https://github.com/servo/servo">Servo</a>’s JS layer. I’ve <a href="https://manishearth.github.io/blog/2015/09/01/designing-a-gc-in-rust/">designed a GC library</a>, <a href="https://manishearth.github.io/blog/2016/08/18/gc-support-in-rust-api-design/">worked on GC integration ideas for Rust itself</a>, worked on Servo’s JS GC integration, and helped out with a <a href="https://github.com/asajeffrey/josephine">couple</a> <a href="https://github.com/kyren/gc-arena">other</a> GC projects in Rust.</p>

<p>As a result, I tend to get pulled into GC discussions fairly often. I enjoy talking about GCs – don’t get me wrong – but I often end up going over the same stuff. Being <a href="https://manishearth.github.io/blog/2018/08/26/why-i-enjoy-blogging/#blogging-lets-me-be-lazy">lazy</a>, I’d much prefer to be able to refer people to a single place where they can get up to speed on the general space of GC design, after which it’s possible to have more in-depth discussions about the specific tradeoffs necessary.</p>

<p>I’ll note that some of the GCs in this post are experiments or unmaintained. The goal of this post is to showcase these as examples of <em>design</em>, not necessarily general-purpose crates you may wish to use, though some of them are usable crates as well.</p>

<h3 id="a-note-on-terminology">A note on terminology</h3>

<p>A thing that often muddles discussions about GCs is that according to some definition of “GC”, simple reference counting <em>is</em> a GC. Typically the definition of GC used in academia broadly refers to any kind of automatic memory management. However, most programmers familiar with the term “GC” will usually liken it to “what Java, Go, Haskell, and C# do”, which can be unambiguously referred to as <em>tracing</em> garbage collection.</p>

<p>Tracing garbage collection is the kind which keeps track of which heap objects are directly reachable (“roots”), figures out the whole set of reachable heap objects (“tracing”, also called “marking”), and then cleans up everything that was not reached (“sweeping”).</p>

<p>Throughout this blog post I will use the term “GC” to refer to tracing garbage collection/collectors unless otherwise stated<sup id="fnref:0" role="doc-noteref"><a href="#fn:0" class="footnote" rel="footnote">1</a></sup>.</p>

<h2 id="why-write-gcs-for-rust">Why write GCs for Rust?</h2>

<p>(If you already want to write a GC in Rust and are reading this post to get ideas for <em>how</em>, you can skip this section. You already know why someone would want to write a GC for Rust.)</p>

<p>Every time this topic is brought up someone will inevitably go “I thought the point of Rust was to avoid GCs” or “GCs will ruin Rust” or something. As a general rule it’s good to not give too much weight to the comments section, but I think it’s useful to explain why someone may wish for GC-like semantics in Rust.</p>

<p>There are really two distinct kinds of use cases. Firstly, sometimes you need to manage memory with cycles and <code class="language-plaintext highlighter-rouge">Rc&lt;T&gt;</code> is inadequate for the job since <code class="language-plaintext highlighter-rouge">Rc</code>-cycles get leaked. <a href="https://docs.rs/petgraph/"><code class="language-plaintext highlighter-rouge">petgraph</code></a> or an <a href="https://manishearth.github.io/blog/2021/03/15/arenas-in-rust/">arena</a> are often acceptable solutions for this kind of pattern, but not always, especially if your data is super heterogeneous. This kind of thing crops up often when dealing with concurrent data structures; for example, <a href="https://docs.rs/crossbeam/"><code class="language-plaintext highlighter-rouge">crossbeam</code></a> has <a href="https://docs.rs/crossbeam/0.8.0/crossbeam/epoch/index.html">an epoch-based memory management system</a> which, while not a full tracing GC, has a lot of characteristics in common with GCs.</p>
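
<p>The <code class="language-plaintext highlighter-rouge">Rc</code>-cycle leak mentioned above is easy to reproduce with nothing but the standard library. Here’s a minimal sketch; the <code class="language-plaintext highlighter-rouge">Node</code> type and the <code class="language-plaintext highlighter-rouge">leaked_strong_count</code> function are made up for illustration:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical node type for illustration; not from any crate.
struct Node {
    next: RefCell&lt;Option&lt;Rc&lt;Node&gt;&gt;&gt;,
}

/// Builds a two-node cycle, drops both handles, and reports how many
/// strong references still keep the first node alive.
fn leaked_strong_count() -&gt; usize {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(a.clone())) });
    // Close the cycle: a points to b, and b points back to a.
    *a.next.borrow_mut() = Some(b.clone());

    // Keep a weak handle so `a` can be observed after the handles are gone.
    let observer: Weak&lt;Node&gt; = Rc::downgrade(&amp;a);
    drop(a);
    drop(b);

    // Nothing on the stack owns the nodes any more, but each node still
    // holds a strong reference to the other, so neither is ever freed.
    observer.strong_count()
}
</code></pre></div></div>

<p>The count never reaches zero, so the destructors never run. A tracing collector would notice that neither node is reachable from a root and free both.</p>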

<p>For this use case it’s rarely necessary to design a custom GC; you can look for a reusable crate like <a href="https://docs.rs/gc/"><code class="language-plaintext highlighter-rouge">gc</code></a> <sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">2</a></sup>.</p>

<p>The second case is far more interesting in my experience and, since it cannot be solved by off-the-shelf solutions, tends to crop up more often: integration with (or implementation of) programming languages that <em>do</em> use a garbage collector. <a href="https://github.com/servo/servo">Servo</a> needs to do this for integrating with the SpiderMonkey JS engine and <a href="https://github.com/kyren/luster">luster</a> needed to do this for implementing the GC of its Lua VM. <a href="https://github.com/jasonwilliams/boa/">boa</a>, a pure Rust JS runtime, uses the <a href="https://docs.rs/gc/"><code class="language-plaintext highlighter-rouge">gc</code></a> crate to back its garbage collector.</p>

<p>Sometimes when integrating with a GCd language you can get away with not needing to implement a full garbage collector: JNI does this; while C++ does not have native garbage collection, JNI gets around this by simply “rooting” (we’ll cover what that means in a bit) anything that crosses over to the C++ side<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">3</a></sup>. This is often fine!</p>

<p>The downside of this is that every interaction with objects managed by the GC has to go through an API call; you can’t “embed” efficient Rust/C++ objects in the GC with ease. For example, in browsers most DOM types (e.g. <a href="https://doc.servo.org/script/dom/element/struct.Element.html"><code class="language-plaintext highlighter-rouge">Element</code></a>) are implemented in native code, and need to be able to contain references to other native GC’d types (it should be possible to inspect the <a href="https://doc.servo.org/script/dom/node/struct.Node.html#structfield.child_list">children of a <code class="language-plaintext highlighter-rouge">Node</code></a> without needing to call back into the JavaScript engine).</p>

<p>So sometimes you need to be able to integrate with a GC from a runtime; or even implement your own GC if you are writing a runtime that needs one. In both of these cases you typically want to be able to safely manipulate GC’d objects from Rust code, and even directly put Rust types on the GC heap.</p>

<h2 id="why-are-gcs-in-rust-hard">Why are GCs in Rust hard?</h2>

<p>In one word: Rooting. In a garbage collector, the objects “directly” in use on the stack are the “roots”, and you need to be able to identify them. Here, when I say “directly”, I mean “accessible without having to go through other GC’d objects”, so putting an object inside a <code class="language-plaintext highlighter-rouge">Vec&lt;T&gt;</code> does not make it stop being a root, but putting it inside some other GC’d object does.</p>

<p>Unfortunately, Rust doesn’t really have a concept of “directly on the stack”:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">Foo</span> <span class="p">{</span>
    <span class="n">bar</span><span class="p">:</span> <span class="nb">Option</span><span class="o">&lt;</span><span class="nb">Gc</span><span class="o">&lt;</span><span class="n">Bar</span><span class="o">&gt;&gt;</span>
<span class="p">}</span>
<span class="c1">// this is a root</span>
<span class="k">let</span> <span class="n">bar</span> <span class="o">=</span> <span class="nn">Gc</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">Bar</span><span class="p">::</span><span class="nf">new</span><span class="p">());</span>
<span class="c1">// this is also a root</span>
<span class="k">let</span> <span class="n">foo</span> <span class="o">=</span> <span class="nn">Gc</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">Foo</span><span class="p">::</span><span class="nf">new</span><span class="p">());</span>
<span class="c1">// bar should no longer be a root (but we can't detect that!)</span>
<span class="n">foo</span><span class="py">.bar</span> <span class="o">=</span> <span class="nf">Some</span><span class="p">(</span><span class="n">bar</span><span class="p">);</span>
<span class="c1">// but foo should still be a root here since it's not inside</span>
<span class="c1">// another GC'd object</span>
<span class="k">let</span> <span class="n">v</span> <span class="o">=</span> <span class="nd">vec!</span><span class="p">[</span><span class="n">foo</span><span class="p">];</span>
</code></pre></div></div>

<p>Rust’s ownership system actually makes it easier to have fewer roots since it’s relatively easy to state that taking <code class="language-plaintext highlighter-rouge">&amp;T</code> of a GC’d object doesn’t need to create a new root, and let Rust’s ownership system sort it out, but being able to distinguish between “directly owned” and “indirectly owned” is super tricky.</p>

<p>Another aspect of this is that garbage collection is really a moment of global mutation – the garbage collector reads through the heap and then deletes some of the objects there. This is a moment of the rug being pulled out under your feet. Rust’s entire design is predicated on such rug-pulling being <em>very very bad and not to be allowed</em>, so this can be a bit problematic. This isn’t as bad as it may initially sound because after all the rug-pulling is mostly just cleaning up unreachable objects, but it does crop up a couple times when fitting things together, especially around destructors and finalizers<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">4</a></sup>. Rooting would be far easier if, for example, you were able to declare areas of code where “no GC can happen”<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">5</a></sup> so you can tightly scope the rug-pulling and have to worry less about roots.</p>

<h3 id="destructors-and-finalizers">Destructors and finalizers</h3>

<p>It’s worth calling out destructors in particular. A huge problem with custom destructors on GC’d types is that a custom destructor can stash the object being destroyed away in a long-lived reference during garbage collection, leaving a dangling reference once the object is cleaned up:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">LongLived</span> <span class="p">{</span>
    <span class="n">dangle</span><span class="p">:</span> <span class="n">RefCell</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="nb">Gc</span><span class="o">&lt;</span><span class="n">CantKillMe</span><span class="o">&gt;&gt;&gt;</span>
<span class="p">}</span>

<span class="k">struct</span> <span class="n">CantKillMe</span> <span class="p">{</span>
    <span class="c1">// set up to point to itself during construction</span>
    <span class="n">self_ref</span><span class="p">:</span> <span class="n">RefCell</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="nb">Gc</span><span class="o">&lt;</span><span class="n">CantKillMe</span><span class="o">&gt;&gt;&gt;</span><span class="p">,</span>
    <span class="n">long_lived</span><span class="p">:</span> <span class="nb">Gc</span><span class="o">&lt;</span><span class="n">LongLived</span><span class="o">&gt;</span>
<span class="p">}</span>

<span class="k">impl</span> <span class="nb">Drop</span> <span class="k">for</span> <span class="n">CantKillMe</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="nf">drop</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
        <span class="c1">// attach self to long_lived</span>
        <span class="o">*</span><span class="k">self</span><span class="py">.long_lived.dangle</span><span class="nf">.borrow_mut</span><span class="p">()</span> <span class="o">=</span> <span class="nf">Some</span><span class="p">(</span><span class="k">self</span><span class="py">.self_ref</span><span class="nf">.borrow</span><span class="p">()</span><span class="nf">.clone</span><span class="p">()</span><span class="nf">.unwrap</span><span class="p">());</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="k">let</span> <span class="n">long</span> <span class="o">=</span> <span class="nn">Gc</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">LongLived</span><span class="p">::</span><span class="nf">new</span><span class="p">());</span>
<span class="p">{</span>
    <span class="k">let</span> <span class="n">cant</span> <span class="o">=</span> <span class="nn">Gc</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">CantKillMe</span><span class="p">::</span><span class="nf">new</span><span class="p">());</span>
    <span class="o">*</span><span class="n">cant</span><span class="py">.self_ref</span><span class="nf">.borrow_mut</span><span class="p">()</span> <span class="o">=</span> <span class="nf">Some</span><span class="p">(</span><span class="n">cant</span><span class="nf">.clone</span><span class="p">());</span>
    <span class="c1">// cant goes out of scope, CantKillMe::drop is run</span>
    <span class="c1">// cant is attached to long_lived.dangle but still cleaned up</span>
<span class="p">}</span>

<span class="c1">// Dangling reference!</span>
<span class="k">let</span> <span class="n">dangling</span> <span class="o">=</span> <span class="n">long</span><span class="py">.dangle</span><span class="nf">.borrow</span><span class="p">()</span><span class="nf">.clone</span><span class="p">()</span><span class="nf">.unwrap</span><span class="p">();</span>
</code></pre></div></div>

<p>The most common solution here is to disallow destructors on types that use <code class="language-plaintext highlighter-rouge">#[derive(Trace)]</code>, which can be done by having the custom derive generate a <code class="language-plaintext highlighter-rouge">Drop</code> implementation (so that a user-written one conflicts with it), or by having it generate something else which causes a conflicting type error.</p>

<p>You can additionally provide a <code class="language-plaintext highlighter-rouge">Finalize</code> trait that has different semantics: the GC calls it while cleaning up GC objects, but it may be called multiple times or not at all. This kind of thing is typical in GCs outside of Rust as well.</p>
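
<p>Since a finalizer may run more than once (or never), it has to be written defensively. Here’s a rough sketch of what that can look like. The trait shape, the <code class="language-plaintext highlighter-rouge">TempFile</code> type, and its fields are all assumptions for the sake of the example; real GC crates differ in naming and signature:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::cell::Cell;

// Hypothetical trait; the default implementation does nothing.
trait Finalize {
    fn finalize(&amp;self) {}
}

// Hypothetical resource that wants to clean up exactly once.
struct TempFile {
    closed: Cell&lt;bool&gt;,
    close_calls: Cell&lt;u32&gt;,
}

impl Finalize for TempFile {
    fn finalize(&amp;self) {
        // The GC may call this several times, so only do the work
        // the first time around.
        if !self.closed.replace(true) {
            self.close_calls.set(self.close_calls.get() + 1);
        }
    }
}

/// Simulates a GC that happens to finalize the same object twice and
/// returns how many times the cleanup actually ran.
fn finalize_twice() -&gt; u32 {
    let file = TempFile { closed: Cell::new(false), close_calls: Cell::new(0) };
    file.finalize();
    file.finalize();
    file.close_calls.get()
}
</code></pre></div></div>

<p>Because it may also never be called, a finalizer like this is only suitable for best-effort cleanup, not for anything soundness depends on.</p>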

<h2 id="how-would-you-even-garbage-collect-without-a-runtime">How would you even garbage collect without a runtime?</h2>

<p>In most garbage collected languages, there’s a runtime that controls all execution, knows about every variable in the program, and is able to pause execution to run the GC whenever it likes.</p>

<p>Rust has a minimal runtime and can’t do anything like this, especially not in a pluggable way your library can hook into. For thread-local GCs you basically have to write it such that GC operations (things like mutating a GC field; basically some subset of the APIs exposed by your GC library) are the only things that may trigger the garbage collector.</p>

<p>Concurrent GCs can trigger the GC on a separate thread but will typically need to pause other threads whenever these threads attempt to perform a GC operation that could potentially be invalidated by the running garbage collector.</p>

<p>While this may restrict the flexibility of the garbage collector itself, this is actually pretty good for us from the side of API design: the garbage collection phase can only happen in certain well-known moments of the code, which means we only need to make things safe across <em>those</em> boundaries. Many of the designs we shall look at build off of this observation.</p>
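
<p>One way to picture “collection only happens at well-known moments” is an allocation path that doubles as the trigger. This is a toy sketch: <code class="language-plaintext highlighter-rouge">GcState</code>, its fields, and the fixed threshold are illustrative assumptions rather than any particular crate’s design.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical bookkeeping for a thread-local GC.
struct GcState {
    bytes_allocated: usize,
    threshold: usize,
    collections: u32,
}

impl GcState {
    fn new(threshold: usize) -&gt; Self {
        GcState { bytes_allocated: 0, threshold, collections: 0 }
    }

    /// All GC allocations go through here, so this is one of the few
    /// places where a collection may start.
    fn allocate(&amp;mut self, size: usize) {
        if self.bytes_allocated + size &gt; self.threshold {
            self.collect();
        }
        self.bytes_allocated += size;
    }

    fn collect(&amp;mut self) {
        // A real collector would mark and sweep here; this sketch just
        // pretends that half of the heap turned out to be garbage.
        self.bytes_allocated /= 2;
        self.collections += 1;
    }
}

fn allocation_example() -&gt; (u32, usize) {
    let mut gc = GcState::new(100);
    gc.allocate(40); // 40 bytes live, no collection
    gc.allocate(40); // 80 bytes live, no collection
    gc.allocate(40); // would reach 120, so a collection runs first
    (gc.collections, gc.bytes_allocated)
}
</code></pre></div></div>

<p>Code that never calls into the GC library can never observe a collection, which is exactly the property the API designs below lean on.</p>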

<h2 id="commonalities">Commonalities</h2>

<p>Before getting into the actual examples of GC design, I want to point out some commonalities of design between all of them, especially around how they do tracing:</p>

<h3 id="tracing">Tracing</h3>

<p>“Tracing” is the operation of traversing the graph of GC objects, starting from your roots and perusing their children, and their children’s children, and so on.</p>

<p>In Rust, the easiest way to implement this is via a <a href="https://doc.rust-lang.org/book/ch19-06-macros.html#how-to-write-a-custom-derive-macro">custom derive</a>:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// unsafe to implement by hand since you can get it wrong</span>
<span class="k">unsafe</span> <span class="k">trait</span> <span class="n">Trace</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="nf">trace</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">gc_context</span><span class="p">:</span> <span class="o">&amp;</span><span class="k">mut</span> <span class="n">GcContext</span><span class="p">);</span>
<span class="p">}</span>

<span class="nd">#[derive(Trace)]</span>
<span class="k">struct</span> <span class="n">Foo</span> <span class="p">{</span>
    <span class="n">vec</span><span class="p">:</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">Gc</span><span class="o">&lt;</span><span class="n">Bar</span><span class="o">&gt;&gt;</span><span class="p">,</span>
    <span class="n">extra_thing</span><span class="p">:</span> <span class="nb">Gc</span><span class="o">&lt;</span><span class="n">Baz</span><span class="o">&gt;</span><span class="p">,</span>
    <span class="n">just_a_string</span><span class="p">:</span> <span class="nb">String</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The custom derive of <code class="language-plaintext highlighter-rouge">Trace</code> basically just calls <code class="language-plaintext highlighter-rouge">trace()</code> on all the fields. <code class="language-plaintext highlighter-rouge">Vec</code>’s <code class="language-plaintext highlighter-rouge">Trace</code> implementation will be written to call <code class="language-plaintext highlighter-rouge">trace()</code> on all of its elements, and <code class="language-plaintext highlighter-rouge">String</code>’s <code class="language-plaintext highlighter-rouge">Trace</code> implementation will do nothing. <code class="language-plaintext highlighter-rouge">Gc&lt;T&gt;</code> will likely have a <code class="language-plaintext highlighter-rouge">trace()</code> that marks its reachability in the <code class="language-plaintext highlighter-rouge">GcContext</code>, or something similar.</p>

<p>This is a pretty standard pattern, and while the specifics of the <code class="language-plaintext highlighter-rouge">Trace</code> trait will typically vary, the general idea is roughly the same.</p>
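
<p>To make that less abstract, here’s roughly what the pieces might look like if written out by hand. Everything here is a stand-in: <code class="language-plaintext highlighter-rouge">GcContext</code> just records ids, <code class="language-plaintext highlighter-rouge">Gc&lt;T&gt;</code> is a fake handle that carries an id instead of a pointer, and the <code class="language-plaintext highlighter-rouge">impl Trace for Foo</code> is my guess at what a derive could expand to, not the output of any real macro.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::marker::PhantomData;

// Hypothetical context: records the ids of everything that was reached.
#[derive(Default)]
struct GcContext {
    marked: Vec&lt;usize&gt;,
}

unsafe trait Trace {
    fn trace(&amp;mut self, gc_context: &amp;mut GcContext);
}

// Fake GC handle for illustration: an id instead of a real heap pointer.
struct Gc&lt;T&gt; {
    id: usize,
    _marker: PhantomData&lt;T&gt;,
}

fn handle&lt;T&gt;(id: usize) -&gt; Gc&lt;T&gt; {
    Gc { id, _marker: PhantomData }
}

unsafe impl&lt;T&gt; Trace for Gc&lt;T&gt; {
    fn trace(&amp;mut self, gc_context: &amp;mut GcContext) {
        // Mark this object as reachable.
        gc_context.marked.push(self.id);
    }
}

unsafe impl&lt;T: Trace&gt; Trace for Vec&lt;T&gt; {
    fn trace(&amp;mut self, gc_context: &amp;mut GcContext) {
        for element in self.iter_mut() {
            element.trace(gc_context);
        }
    }
}

unsafe impl Trace for String {
    // Strings contain no GC pointers, so there is nothing to do.
    fn trace(&amp;mut self, _gc_context: &amp;mut GcContext) {}
}

struct Bar;
struct Baz;

struct Foo {
    vec: Vec&lt;Gc&lt;Bar&gt;&gt;,
    extra_thing: Gc&lt;Baz&gt;,
    just_a_string: String,
}

// Roughly what a derive might generate: trace every field in turn.
unsafe impl Trace for Foo {
    fn trace(&amp;mut self, gc_context: &amp;mut GcContext) {
        self.vec.trace(gc_context);
        self.extra_thing.trace(gc_context);
        self.just_a_string.trace(gc_context);
    }
}

fn trace_example() -&gt; Vec&lt;usize&gt; {
    let mut foo = Foo {
        vec: vec![handle(1), handle(2)],
        extra_thing: handle(3),
        just_a_string: String::from("not a GC pointer"),
    };
    let mut cx = GcContext::default();
    foo.trace(&amp;mut cx);
    cx.marked
}
</code></pre></div></div>

<p>Forgetting a field in a hand-written implementation would leave live objects unmarked, which is why the trait is <code class="language-plaintext highlighter-rouge">unsafe</code> to implement and why the derive is the preferred route.</p>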

<p>I’m not going to get into the actual details of how mark-and-sweep algorithms work in this post; there are a lot of potential designs for them and they’re not that interesting from the point of view of designing a safe GC <em>API</em> in Rust. However, the general idea is to keep a queue of found objects, initially populated by the roots, trace them to find new objects, and queue those up if they’ve not already been traced. Then clean up any objects that were <em>not</em> found.</p>
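
<p>For readers who haven’t seen it before, that queue-based idea fits in a few lines. This is a deliberately naive sketch over a made-up heap model (a map from object ids to the ids they point at), not how any of the crates in this post store their objects:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::collections::{HashMap, HashSet, VecDeque};

// Hypothetical heap model: each object id maps to the ids it points at.
type ObjectId = u32;

struct Heap {
    objects: HashMap&lt;ObjectId, Vec&lt;ObjectId&gt;&gt;,
}

impl Heap {
    /// Marks everything reachable from `roots`, sweeps the rest, and
    /// returns how many objects were freed.
    fn collect(&amp;mut self, roots: &amp;[ObjectId]) -&gt; usize {
        let mut marked: HashSet&lt;ObjectId&gt; = HashSet::new();
        let mut queue: VecDeque&lt;ObjectId&gt; = roots.iter().copied().collect();

        // Mark phase: trace each queued object at most once.
        while let Some(id) = queue.pop_front() {
            if !marked.insert(id) {
                continue; // already traced
            }
            if let Some(children) = self.objects.get(&amp;id) {
                queue.extend(children.iter().copied());
            }
        }

        // Sweep phase: drop everything that was not marked.
        let before = self.objects.len();
        self.objects.retain(|id, _| marked.contains(id));
        before - self.objects.len()
    }
}

fn collect_example() -&gt; (usize, usize) {
    let mut heap = Heap { objects: HashMap::new() };
    heap.objects.insert(1, vec![2]);
    heap.objects.insert(2, vec![1]); // cycle with 1, reachable from the root
    heap.objects.insert(3, vec![4]);
    heap.objects.insert(4, vec![3]); // cycle that nothing points into
    let freed = heap.collect(&amp;[1]);
    (freed, heap.objects.len())
}
</code></pre></div></div>

<p>Note that the unreachable cycle between objects 3 and 4 is freed, which is precisely what reference counting alone cannot do.</p>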

<h3 id="immutable-by-default">Immutable-by-default</h3>

<p>Another commonality between these designs is that a <code class="language-plaintext highlighter-rouge">Gc&lt;T&gt;</code> is always potentially shared, and thus will need tight control over mutability to satisfy Rust’s ownership invariants. This is typically achieved by using interior mutability, much like how <code class="language-plaintext highlighter-rouge">Rc&lt;T&gt;</code> is almost always paired with <code class="language-plaintext highlighter-rouge">RefCell&lt;T&gt;</code> for mutation; however, some approaches (like that in <a href="https://github.com/asajeffrey/josephine">josephine</a>) do allow for mutability without runtime checking.</p>
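
<p>As a reminder of what the <code class="language-plaintext highlighter-rouge">Rc&lt;T&gt;</code> plus <code class="language-plaintext highlighter-rouge">RefCell&lt;T&gt;</code> pairing looks like in practice, here’s a small standard-library example (the function name is just for illustration). Most <code class="language-plaintext highlighter-rouge">Gc&lt;T&gt;</code> designs end up with a similar shape:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::cell::RefCell;
use std::rc::Rc;

fn shared_mutation() -&gt; (Vec&lt;i32&gt;, bool) {
    let shared = Rc::new(RefCell::new(vec![1, 2]));
    let alias = Rc::clone(&amp;shared);

    // Mutate through one handle; the change is visible through the other.
    alias.borrow_mut().push(3);

    // Overlapping mutable borrows are detected at runtime, not compile time.
    let first = shared.borrow_mut();
    let second_rejected = alias.try_borrow_mut().is_err();
    drop(first);

    let contents = shared.borrow().clone();
    (contents, second_rejected)
}
</code></pre></div></div>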

<h3 id="threading">Threading</h3>

<p>Some GCs are single-threaded, and some are multi-threaded. The single-threaded ones typically have a <code class="language-plaintext highlighter-rouge">Gc&lt;T&gt;</code> type that is not <code class="language-plaintext highlighter-rouge">Send</code>, so while you can set up multiple graphs of GC types on different threads, they’re essentially independent. Garbage collection only affects the thread it is being performed for; all other threads can continue unhindered.</p>

<p>Multithreaded GCs will have a <code class="language-plaintext highlighter-rouge">Send</code> <code class="language-plaintext highlighter-rouge">Gc&lt;T&gt;</code> type. Garbage collection will typically, but not always, block any thread which attempts to access data managed by the GC during that time. In some languages there are “stop the world” garbage collectors which block all threads at “safepoints” inserted by the compiler; Rust does not have the capability to insert such safepoints and blocking threads on GCs is done at the library level.</p>

<p>Most of the examples below are single-threaded, but their API design is not hard to extend towards a hypothetical multithreaded GC.</p>

<h2 id="rust-gc">rust-gc</h2>

<p>The <a href="https://docs.rs/gc/"><code class="language-plaintext highlighter-rouge">gc</code></a> crate is one I wrote with <a href="https://twitter.com/kneecaw/">Nika Layzell</a> mostly as a fun exercise, to figure out if a safe GC API is <em>possible</em>. I’ve <a href="https://manishearth.github.io/blog/2015/09/01/designing-a-gc-in-rust/">written about the design in depth before</a>, but the essence of the design is that it does something similar to reference counting to keep track of roots, and forces all GC mutations to go through special <code class="language-plaintext highlighter-rouge">GcCell</code> types so that they can update the root count. Basically, a “root count” is updated whenever something becomes a root or stops being a root:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">Foo</span> <span class="p">{</span>
    <span class="n">bar</span><span class="p">:</span> <span class="n">GcCell</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="nb">Gc</span><span class="o">&lt;</span><span class="n">Bar</span><span class="o">&gt;&gt;&gt;</span>
<span class="p">}</span>
<span class="c1">// this is a root (root count = 1)</span>
<span class="k">let</span> <span class="n">bar</span> <span class="o">=</span> <span class="nn">Gc</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">Bar</span><span class="p">::</span><span class="nf">new</span><span class="p">());</span>
<span class="c1">// this is also a root (root count = 1)</span>
<span class="k">let</span> <span class="n">foo</span> <span class="o">=</span> <span class="nn">Gc</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">Foo</span><span class="p">::</span><span class="nf">new</span><span class="p">());</span>
<span class="c1">// .borrow_mut()'s RAII guard unroots bar (sets its root count to 0)</span>
<span class="o">*</span><span class="n">foo</span><span class="py">.bar</span><span class="nf">.borrow_mut</span><span class="p">()</span> <span class="o">=</span> <span class="nf">Some</span><span class="p">(</span><span class="n">bar</span><span class="p">);</span>
<span class="c1">// foo is still a root here, no call to .set()</span>
<span class="k">let</span> <span class="n">v</span> <span class="o">=</span> <span class="nd">vec!</span><span class="p">[</span><span class="n">foo</span><span class="p">];</span>

<span class="c1">// at destruction time, foo's root count is set to 0</span>
</code></pre></div></div>

<p>The actual garbage collection phase will occur when certain GC operations are performed at a time when the heap is considered to have gotten reasonably large according to some heuristics.</p>

<p>While this is essentially “free” on reads, this is a fair amount of reference count traffic on any kind of write, which might not be desired; often the goal of using GCs is to <em>avoid</em> the performance characteristics of reference-counting-like patterns. Ultimately this is a hybrid approach that’s a mix of tracing and reference counting<sup id="fnref:10" role="doc-noteref"><a href="#fn:10" class="footnote" rel="footnote">6</a></sup>.</p>
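
<p>The bookkeeping behind that root count can be modelled in a few lines. To be clear, this is a toy model and not the <code class="language-plaintext highlighter-rouge">gc</code> crate’s actual implementation: <code class="language-plaintext highlighter-rouge">GcBox</code> and <code class="language-plaintext highlighter-rouge">Handle</code> are hypothetical names, and an <code class="language-plaintext highlighter-rouge">Rc</code> stands in for the GC heap so the example stays safe and self-contained.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::cell::Cell;
use std::rc::Rc;

// Hypothetical heap object: only tracks how many roots point at it.
struct GcBox {
    roots: Cell&lt;usize&gt;,
}

// Hypothetical handle that remembers whether it currently counts as a root.
struct Handle {
    target: Rc&lt;GcBox&gt;,
    rooted: Cell&lt;bool&gt;,
}

impl Handle {
    /// New handles start out as roots, like a fresh value on the stack.
    fn new_rooted(target: Rc&lt;GcBox&gt;) -&gt; Handle {
        target.roots.set(target.roots.get() + 1);
        Handle { target, rooted: Cell::new(true) }
    }

    /// Called when the handle is moved inside another GC'd object.
    fn unroot(&amp;self) {
        if self.rooted.replace(false) {
            self.target.roots.set(self.target.roots.get() - 1);
        }
    }

    /// Called when the handle is moved back out onto the stack.
    fn root(&amp;self) {
        if !self.rooted.replace(true) {
            self.target.roots.set(self.target.roots.get() + 1);
        }
    }
}

impl Drop for Handle {
    fn drop(&amp;mut self) {
        // A dropped handle can no longer keep anything alive.
        self.unroot();
    }
}

fn root_count_example() -&gt; Vec&lt;usize&gt; {
    let object = Rc::new(GcBox { roots: Cell::new(0) });
    let mut seen = Vec::new();

    let handle = Handle::new_rooted(Rc::clone(&amp;object));
    seen.push(object.roots.get()); // rooted on creation
    handle.unroot();
    seen.push(object.roots.get()); // stored inside another GC'd object
    handle.root();
    seen.push(object.roots.get()); // taken back out
    drop(handle);
    seen.push(object.roots.get()); // handle destroyed
    seen
}
</code></pre></div></div>

<p>Every write that moves a handle in or out of a GC’d object touches the counter, which is the “reference count traffic” cost described above.</p>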

<p><a href="https://docs.rs/gc/"><code class="language-plaintext highlighter-rouge">gc</code></a> is useful as a general-purpose GC if you just want a couple of things to participate in cycles without having to think about it too much. The general design can apply to a specialized GC integrating with another language runtime since it provides a clear way to keep track of roots; but it may not necessarily have the desired performance characteristics.</p>

<h2 id="servos-dom-integration">Servo’s DOM integration</h2>

<p><a href="https://github.com/servo/servo">Servo</a> is a browser engine in Rust that I used to work on full time. As mentioned earlier, browser engines typically implement a lot of their DOM types in native (i.e. Rust or C++, not JS) code, so for example <a href="https://doc.servo.org/script/dom/node/struct.Node.html"><code class="language-plaintext highlighter-rouge">Node</code></a> is a pure Rust object, and it <a href="https://doc.servo.org/script/dom/node/struct.Node.html#structfield.child_list">contains direct references to its children</a> so Rust code can do things like traverse the tree without having to go back and forth between JS and Rust.</p>

<p>Servo’s model is a little weird: roots are a <em>different type</em>, and lints enforce that unrooted heap references are never placed on the stack:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[dom_struct]</span> <span class="c1">// this is #[derive(JSTraceable)] plus some markers for lints</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">Node</span> <span class="p">{</span>
    <span class="c1">// the parent type, for inheritance</span>
    <span class="n">eventtarget</span><span class="p">:</span> <span class="n">EventTarget</span><span class="p">,</span>
    <span class="c1">// in the actual code this is a different helper type that combines</span>
    <span class="c1">// the RefCell, Option, and Dom, but i've simplified it to use</span>
    <span class="c1">// stdlib types for this example</span>
    <span class="n">prev_sibling</span><span class="p">:</span> <span class="n">RefCell</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">Dom</span><span class="o">&lt;</span><span class="n">Node</span><span class="o">&gt;&gt;&gt;</span><span class="p">,</span>
    <span class="n">next_sibling</span><span class="p">:</span> <span class="n">RefCell</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">Dom</span><span class="o">&lt;</span><span class="n">Node</span><span class="o">&gt;&gt;&gt;</span><span class="p">,</span>
    <span class="c1">// ...</span>
<span class="p">}</span>

<span class="k">impl</span> <span class="n">Node</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="nf">frob_next_sibling</span><span class="p">(</span><span class="o">&amp;</span><span class="k">self</span><span class="p">)</span> <span class="p">{</span>
        <span class="c1">// fields can be accessed as borrows without any rooting</span>
        <span class="k">if</span> <span class="k">let</span> <span class="nf">Some</span><span class="p">(</span><span class="n">next</span><span class="p">)</span> <span class="o">=</span> <span class="k">self</span><span class="py">.next_sibling</span><span class="nf">.borrow</span><span class="p">()</span><span class="nf">.as_ref</span><span class="p">()</span> <span class="p">{</span>
            <span class="n">next</span><span class="nf">.frob</span><span class="p">();</span>
        <span class="p">}</span>
    <span class="p">}</span>

    <span class="k">fn</span> <span class="nf">get_next_sibling</span><span class="p">(</span><span class="o">&amp;</span><span class="k">self</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nb">Option</span><span class="o">&lt;</span><span class="n">DomRoot</span><span class="o">&lt;</span><span class="n">Node</span><span class="o">&gt;&gt;</span> <span class="p">{</span>
        <span class="c1">// but you need to root things for them to escape the borrow</span>
        <span class="c1">// .root() turns Dom&lt;T&gt; into DomRoot&lt;T&gt;</span>
        <span class="k">self</span><span class="py">.next_sibling</span><span class="nf">.borrow</span><span class="p">()</span><span class="nf">.as_ref</span><span class="p">()</span><span class="nf">.map</span><span class="p">(|</span><span class="n">x</span><span class="p">|</span> <span class="n">x</span><span class="nf">.root</span><span class="p">())</span>
    <span class="p">}</span>

    <span class="k">fn</span> <span class="nf">illegal</span><span class="p">(</span><span class="o">&amp;</span><span class="k">self</span><span class="p">)</span> <span class="p">{</span>
        <span class="c1">// this line of code would get linted by a custom lint called unrooted_must_root</span>
        <span class="c1">// (which works somewhat similarly to the must_use stuff that Rust does)</span>
        <span class="k">let</span> <span class="n">ohno</span><span class="p">:</span> <span class="nb">Option</span><span class="o">&lt;</span><span class="n">Dom</span><span class="o">&lt;</span><span class="n">Node</span><span class="o">&gt;&gt;</span> <span class="o">=</span> <span class="k">self</span><span class="py">.next_sibling</span><span class="nf">.borrow_mut</span><span class="p">()</span><span class="nf">.take</span><span class="p">();</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">Dom&lt;T&gt;</code> is basically a smart pointer that behaves like <code class="language-plaintext highlighter-rouge">&amp;T</code> but without a lifetime, whereas <code class="language-plaintext highlighter-rouge">DomRoot&lt;T&gt;</code> has the additional behavior of rooting on creation (and unrooting on <code class="language-plaintext highlighter-rouge">Drop</code>). The custom lint plugin essentially enforces that <code class="language-plaintext highlighter-rouge">Dom&lt;T&gt;</code>, and any DOM structs (tagged with <code class="language-plaintext highlighter-rouge">#[dom_struct]</code>) are never accessible on the stack aside from through <code class="language-plaintext highlighter-rouge">DomRoot&lt;T&gt;</code> or <code class="language-plaintext highlighter-rouge">&amp;T</code>.</p>

<p>I wouldn’t recommend this approach; it works okay but we’ve wanted to move off of it for a while because it relies on custom plugin lints for soundness. But it’s worth mentioning for completeness.</p>

<h2 id="josephine-servos-experimental-gc-plans">Josephine (Servo’s experimental GC plans)</h2>

<p>Given that Servo’s existing GC solution depends on plugging in to the compiler to do additional static analysis, we wanted something better. So <a href="https://github.com/asajeffrey/">Alan</a> designed <a href="https://github.com/asajeffrey/josephine">Josephine</a> (“JS affine”), which uses Rust’s affine types and borrowing in a cleaner way to provide a safe GC system.</p>

<p>Josephine is explicitly designed for Servo’s use case and as such does a lot of neat things around “compartments” and such that are probably irrelevant unless you specifically wish for your GC to integrate with a JS engine.</p>

<p>I mentioned earlier that the fact that the garbage collection phase can only happen at certain well-known points in the code can actually make things easier for GC design, and Josephine is an example of this.</p>

<p>Josephine has a “JS context”, which is to be passed around everywhere and essentially represents the GC itself. When doing operations which may trigger a GC, you have to borrow the context mutably, whereas when accessing heap objects you need to borrow the context immutably. You can root heap objects to remove this requirement:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// cx is a `JSContext`, `node` is a `JSManaged&lt;'a, C, Node&gt;`</span>
<span class="c1">// assuming next_sibling and prev_sibling are not Options for simplicity</span>

<span class="c1">// borrows cx for `'b`</span>
<span class="k">let</span> <span class="n">next_sibling</span><span class="p">:</span> <span class="o">&amp;</span><span class="nv">'b</span> <span class="n">Node</span> <span class="o">=</span> <span class="n">node</span><span class="py">.next_sibling</span><span class="nf">.borrow</span><span class="p">(</span><span class="n">cx</span><span class="p">);</span>
<span class="nd">println!</span><span class="p">(</span><span class="s">"Name: {:?}"</span><span class="p">,</span> <span class="n">next_sibling</span><span class="py">.name</span><span class="p">);</span>
<span class="c1">// illegal, because cx is immutably borrowed by next_sibling</span>
<span class="c1">// node.prev_sibling.borrow_mut(cx).frob();</span>

<span class="c1">// read from next_sibling to ensure it lives this long</span>
<span class="nd">println!</span><span class="p">(</span><span class="s">"{:?}"</span><span class="p">,</span> <span class="n">next_sibling</span><span class="py">.name</span><span class="p">);</span>

<span class="k">let</span> <span class="k">ref</span> <span class="k">mut</span> <span class="n">root</span> <span class="o">=</span> <span class="n">cx</span><span class="nf">.new_root</span><span class="p">();</span>
<span class="c1">// no longer needs to borrow cx, borrows root for 'root instead</span>
<span class="k">let</span> <span class="n">next_sibling</span><span class="p">:</span> <span class="n">JSManaged</span><span class="o">&lt;</span><span class="nv">'root</span><span class="p">,</span> <span class="n">C</span><span class="p">,</span> <span class="n">Node</span><span class="o">&gt;</span> <span class="o">=</span> <span class="n">node</span><span class="py">.next_sibling</span><span class="nf">.in_root</span><span class="p">(</span><span class="n">root</span><span class="p">);</span>
<span class="c1">// now it's fine, no outstanding borrows of `cx`</span>
<span class="n">node</span><span class="py">.prev_sibling</span><span class="nf">.borrow_mut</span><span class="p">(</span><span class="n">cx</span><span class="p">)</span><span class="nf">.frob</span><span class="p">();</span>

<span class="c1">// read from next_sibling to ensure it lives this long</span>
<span class="nd">println!</span><span class="p">(</span><span class="s">"{:?}"</span><span class="p">,</span> <span class="n">next_sibling</span><span class="py">.name</span><span class="p">);</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">new_root()</code> creates a new root, and <code class="language-plaintext highlighter-rouge">in_root</code> ties the lifetime of a JS managed type to the root instead of to the <code class="language-plaintext highlighter-rouge">JSContext</code> borrow, releasing the borrow of the <code class="language-plaintext highlighter-rouge">JSContext</code> and allowing it to be borrowed mutably in future <code class="language-plaintext highlighter-rouge">.borrow_mut()</code> calls.</p>

<p>Note that <code class="language-plaintext highlighter-rouge">.borrow()</code> and <code class="language-plaintext highlighter-rouge">.borrow_mut()</code> here do not have runtime borrow-checking cost despite their similarities to <code class="language-plaintext highlighter-rouge">RefCell::borrow()</code>; instead, they do some lifetime juggling to make things safe. Creating roots typically does have runtime cost. Sometimes you <em>may</em> need to use <code class="language-plaintext highlighter-rouge">RefCell&lt;T&gt;</code> for the same reason it’s used in <code class="language-plaintext highlighter-rouge">Rc</code>, but mostly only for non-GCd fields.</p>

<p>Custom types are typically defined in two parts as so:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[derive(Copy,</span> <span class="nd">Clone,</span> <span class="nd">Debug,</span> <span class="nd">Eq,</span> <span class="nd">PartialEq,</span> <span class="nd">JSTraceable,</span> <span class="nd">JSLifetime,</span> <span class="nd">JSCompartmental)]</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">Element</span><span class="o">&lt;</span><span class="nv">'a</span><span class="p">,</span> <span class="n">C</span><span class="o">&gt;</span> <span class="p">(</span><span class="k">pub</span> <span class="n">JSManaged</span><span class="o">&lt;</span><span class="nv">'a</span><span class="p">,</span> <span class="n">C</span><span class="p">,</span> <span class="n">NativeElement</span><span class="o">&lt;</span><span class="nv">'a</span><span class="p">,</span> <span class="n">C</span><span class="o">&gt;&gt;</span><span class="p">);</span>

<span class="nd">#[derive(JSTraceable,</span> <span class="nd">JSLifetime,</span> <span class="nd">JSCompartmental)]</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">NativeElement</span><span class="o">&lt;</span><span class="nv">'a</span><span class="p">,</span> <span class="n">C</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="n">name</span><span class="p">:</span> <span class="n">JSString</span><span class="o">&lt;</span><span class="nv">'a</span><span class="p">,</span> <span class="n">C</span><span class="o">&gt;</span><span class="p">,</span>
    <span class="n">parent</span><span class="p">:</span> <span class="nb">Option</span><span class="o">&lt;</span><span class="n">Element</span><span class="o">&lt;</span><span class="nv">'a</span><span class="p">,</span> <span class="n">C</span><span class="o">&gt;&gt;</span><span class="p">,</span>
    <span class="n">children</span><span class="p">:</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Element</span><span class="o">&lt;</span><span class="nv">'a</span><span class="p">,</span> <span class="n">C</span><span class="o">&gt;&gt;</span><span class="p">,</span>
<span class="p">}</span>
</code></pre></div></div>

<p>where <code class="language-plaintext highlighter-rouge">Element&lt;'a&gt;</code> is a convenient copyable reference that is to be used inside other GC types, and <code class="language-plaintext highlighter-rouge">NativeElement&lt;'a&gt;</code> is its backing storage. The <code class="language-plaintext highlighter-rouge">C</code> parameter has to do with compartments and can be ignored for now.</p>

<p>A neat thing worth pointing out is that there’s no runtime borrow checking necessary for manipulating other GC references, even though roots let you hold multiple references to the same object!</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="k">ref</span> <span class="k">mut</span> <span class="n">parent_root</span> <span class="o">=</span> <span class="n">cx</span><span class="nf">.new_root</span><span class="p">();</span>
<span class="k">let</span> <span class="n">parent</span> <span class="o">=</span> <span class="n">element</span><span class="nf">.borrow</span><span class="p">(</span><span class="n">cx</span><span class="p">)</span><span class="py">.parent</span><span class="nf">.in_root</span><span class="p">(</span><span class="n">parent_root</span><span class="p">);</span>
<span class="k">let</span> <span class="k">ref</span> <span class="k">mut</span> <span class="n">child_root</span> <span class="o">=</span> <span class="n">cx</span><span class="nf">.new_root</span><span class="p">();</span>

<span class="c1">// could potentially be a second reference to `element` if it was</span>
<span class="c1">// the first child</span>
<span class="k">let</span> <span class="n">first_child</span> <span class="o">=</span> <span class="n">parent</span><span class="py">.children</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="nf">.in_root</span><span class="p">(</span><span class="n">child_root</span><span class="p">);</span>

<span class="c1">// this is okay, even though we hold a reference to `parent`</span>
<span class="c1">// via element.parent, because we have rooted that reference so it's</span>
<span class="c1">// now independent of whether `element.parent` changes!</span>
<span class="n">first_child</span><span class="nf">.borrow_mut</span><span class="p">(</span><span class="n">cx</span><span class="p">)</span><span class="py">.parent</span> <span class="o">=</span> <span class="nb">None</span><span class="p">;</span>
</code></pre></div></div>

<p>Essentially, when mutating a field, you have to obtain mutable access to the context, so no references to the field itself (e.g. <code class="language-plaintext highlighter-rouge">element.borrow(cx).parent</code>) can still be around, only references to the GC’d data within it. This means you can change what a field references without invalidating other references to the <em>contents</em> of what the field references. This is a pretty cool trick that enables GC <em>without runtime-checked interior mutability</em>, which is relatively rare in such designs.</p>
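<p>To make the borrowing discipline a bit more concrete, here is a minimal sketch of the same idea using a toy arena instead of a real GC. Everything in it (<code class="language-plaintext highlighter-rouge">Context</code>, <code class="language-plaintext highlighter-rouge">Handle</code>, <code class="language-plaintext highlighter-rouge">Node</code>, <code class="language-plaintext highlighter-rouge">demo</code>) is a hypothetical stand-in and not Josephine’s actual API: handles are plain indices rather than rooted <code class="language-plaintext highlighter-rouge">JSManaged</code> values, and nothing is ever collected. It only illustrates that reads borrow the context immutably, writes borrow it mutably, and copied handles stay usable across a mutation with no <code class="language-plaintext highlighter-rouge">RefCell</code> involved.</p>

```rust
/// Hypothetical stand-in for a GC context: it owns every node.
struct Context {
    nodes: Vec<Node>,
}

/// Copyable handle to a node. Holding one does not borrow the context.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Handle(usize);

struct Node {
    name: &'static str,
    parent: Option<Handle>,
}

impl Context {
    fn alloc(&mut self, name: &'static str, parent: Option<Handle>) -> Handle {
        self.nodes.push(Node { name, parent });
        Handle(self.nodes.len() - 1)
    }
}

impl Handle {
    /// Reading needs a shared borrow of the context.
    fn borrow(self, cx: &Context) -> &Node {
        &cx.nodes[self.0]
    }

    /// Mutating needs an exclusive borrow of the context, so the compiler
    /// guarantees no `&Node` obtained via `borrow` is still alive.
    fn borrow_mut(self, cx: &mut Context) -> &mut Node {
        &mut cx.nodes[self.0]
    }
}

fn demo() -> (Option<Handle>, &'static str) {
    let mut cx = Context { nodes: Vec::new() };
    let parent = cx.alloc("parent", None);
    let child = cx.alloc("child", Some(parent));

    // Copy the handle out of the field; the shared borrow of `cx` ends here.
    let held_parent = child.borrow(&cx).parent.unwrap();

    // Overwriting the field is fine: only a copy of its contents is held.
    child.borrow_mut(&mut cx).parent = None;

    // The copied handle still refers to the old parent.
    (child.borrow(&cx).parent, held_parent.borrow(&cx).name)
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">demo()</code> shows that the field was cleared while the previously copied handle is still valid. In Josephine the role of “copying the handle out” is played by rooting, which is what keeps the referent alive across a potential GC.</p>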

<h2 id="unfinished-design-for-a-builtin-rust-gc">Unfinished design for a builtin Rust GC</h2>

<p>For a while a couple of us worked on a way to make Rust <em>itself</em> extensible with a pluggable GC, using LLVM stack map support for finding roots. After all, if we know which types are GC-ish, we can include metadata on how to find roots for each function, similar to how Rust functions currently contain unwinding hooks to enable cleanly running destructors during a panic.</p>

<p>We never got around to figuring out a <em>complete</em> design, but you can find more information on what we figured out in <a href="https://manishearth.github.io/blog/2016/08/18/gc-support-in-rust-api-design/">my</a> and <a href="http://blog.pnkfx.org/blog/categories/gc/">Felix’s</a> posts on this subject. Essentially, it involved a <code class="language-plaintext highlighter-rouge">Trace</code> trait with more generic <code class="language-plaintext highlighter-rouge">trace</code> methods, an auto-implemented <code class="language-plaintext highlighter-rouge">Root</code> trait that works similarly to <code class="language-plaintext highlighter-rouge">Send</code>, and compiler machinery to keep track of which <code class="language-plaintext highlighter-rouge">Root</code> types are on the stack.</p>

<p>This is probably not too useful for people attempting to implement a GC, but I’m mentioning it for completeness’ sake.</p>

<p>Note that pre-1.0 Rust did have a builtin GC (<code class="language-plaintext highlighter-rouge">@T</code>, known as “managed pointers”), but IIRC in practice the cycle-management parts were not ever implemented so it behaved exactly like <code class="language-plaintext highlighter-rouge">Rc&lt;T&gt;</code>. I believe it was intended to have a cycle collector (I’ll talk more about that in the next section).</p>

<h2 id="bacon-rajan-cc-and-cycle-collectors-in-general">bacon-rajan-cc (and cycle collectors in general)</h2>

<p><a href="https://fitzgeraldnick.com/">Nick Fitzgerald</a> wrote <a href="https://github.com/fitzgen/bacon-rajan-cc"><code class="language-plaintext highlighter-rouge">bacon-rajan-cc</code></a> to implement <em><a href="https://researcher.watson.ibm.com/researcher/files/us-bacon/Bacon01Concurrent.pdf">“Concurrent Cycle Collection in Reference Counted Systems”</a></em> by David F. Bacon and V.T. Rajan.</p>

<p>This is what is colloquially called a <em>cycle collector</em>; a kind of garbage collector which is essentially “what if we took <code class="language-plaintext highlighter-rouge">Rc&lt;T&gt;</code> but made it detect cycles”. Some people do not consider these to be <em>tracing</em> garbage collectors, but they have a lot of similar characteristics (and they do still “trace” through types). They’re often categorized as “hybrid” approaches, much like <a href="https://docs.rs/gc/"><code class="language-plaintext highlighter-rouge">gc</code></a>.</p>

<p>The idea is that you don’t actually need to <em>know</em> what the roots are if you’re maintaining reference counts: if a heap object has a reference count that is more than the number of heap objects referencing it, it must be a root. In practice it’s pretty inefficient to traverse the entire heap, so optimizations are applied, often by applying different “colors” to nodes, and by only looking at the set of objects that have recently had their reference counts decremented.</p>

<p>A crucial observation here is that if you <em>only focus on potential garbage</em>, you can shift your definition of “root” a bit: when looking for cycles you don’t need to look for references from the stack, you can be satisfied with references from <em>any part of the heap you know for a fact is reachable from things which are not potential garbage</em>.</p>

<p>A neat property of cycle collectors is that while mark and sweep tracing GCs have their performance scale by the size of the heap as a whole, cycle collectors scale by the size of <em>the actual garbage you have</em> <sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">7</a></sup>. There are of course other tradeoffs: deallocation is often cheaper or “free” in tracing GCs (amortizing those costs by doing it during the sweep phase) whereas cycle collectors have the constant allocator traffic involved in cleaning up objects when refcounts reach zero.</p>

<p>The way <a href="https://github.com/fitzgen/bacon-rajan-cc">bacon-rajan-cc</a> works is that every time a reference count is decremented, the object is added to a list of “potential cycle roots”, unless the reference count is decremented to 0 (in which case the object is immediately cleaned up, just like <code class="language-plaintext highlighter-rouge">Rc</code>). It then traces through this list, decrementing refcounts for every reference it follows, and cleaning up any elements that reach refcount 0. It then traverses this list <em>again</em> and reincrements refcounts for each reference it follows, to restore the original refcount. This basically treats any element not reachable from this “potential cycle root” list as “not garbage”, and doesn’t bother to visit it.</p>
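<p>The core of this “trial deletion” idea can be sketched on a toy heap. This is not bacon-rajan-cc’s code: the <code class="language-plaintext highlighter-rouge">Obj</code> type, the index-based heap, and the function names are all assumptions made for illustration, and it takes a shortcut by decrementing a scratch copy of the counts instead of decrementing and then reincrementing the real ones. The logic is otherwise the same: subtract every reference that comes from inside the suspect set, treat anything with a leftover count as externally held, and consider the rest garbage.</p>

```rust
use std::collections::BTreeSet;

/// Toy heap object: a strong count plus the indices of the objects it references.
struct Obj {
    rc: usize,
    edges: Vec<usize>,
}

/// Every index reachable from `starts`, including the starting points.
fn reachable(heap: &[Obj], starts: impl IntoIterator<Item = usize>) -> BTreeSet<usize> {
    let mut seen = BTreeSet::new();
    let mut stack: Vec<usize> = starts.into_iter().collect();
    while let Some(i) = stack.pop() {
        if seen.insert(i) {
            stack.extend(heap[i].edges.iter().copied());
        }
    }
    seen
}

/// Returns the objects that are only kept alive by cycles among the suspects.
/// Assumes `rc` is consistent, i.e. at least the number of incoming heap edges.
fn find_cyclic_garbage(heap: &[Obj], candidates: &[usize]) -> Vec<usize> {
    // Only objects reachable from the potential cycle roots are ever visited.
    let suspects = reachable(heap, candidates.iter().copied());

    // Trial decrement on a scratch copy: remove references held by suspects.
    let mut trial: Vec<usize> = heap.iter().map(|o| o.rc).collect();
    for &i in &suspects {
        for &j in &heap[i].edges {
            trial[j] -= 1;
        }
    }

    // A leftover count means something outside the suspect set holds a
    // reference, so that object and everything it reaches is live.
    let externally_held = suspects.iter().copied().filter(|&i| trial[i] > 0);
    let live = reachable(heap, externally_held);

    suspects.difference(&live).copied().collect()
}
```

<p>For example, given two two-element cycles where only one of them has an extra reference from outside the heap, passing one member of each cycle as a candidate reports just the unreferenced cycle as garbage.</p>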

<p>Cycle collectors require tighter control over the garbage collection algorithm, and have differing performance characteristics, so they may not necessarily be suitable for all use cases for GC integration in Rust, but it’s definitely worth considering!</p>

<h2 id="cell-gc">cell-gc</h2>

<p><a href="https://twitter.com/jorendorff/">Jason Orendorff</a>’s <a href="https://github.com/jorendorff/cell-gc">cell-gc</a> crate is interesting: it has a concept of “heap sessions”. Here’s a modified example from the readme:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">cell_gc</span><span class="p">::</span><span class="n">Heap</span><span class="p">;</span>

<span class="c1">// implements IntoHeap, and also generates an IntListRef type and accessors</span>
<span class="nd">#[derive(cell_gc_derive::IntoHeap)]</span>
<span class="k">struct</span> <span class="n">IntList</span><span class="o">&lt;</span><span class="nv">'h</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="n">head</span><span class="p">:</span> <span class="nb">i64</span><span class="p">,</span>
    <span class="n">tail</span><span class="p">:</span> <span class="nb">Option</span><span class="o">&lt;</span><span class="n">IntListRef</span><span class="o">&lt;</span><span class="nv">'h</span><span class="o">&gt;&gt;</span>
<span class="p">}</span>

<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// Create a heap (you'll only do this once in your whole program)</span>
    <span class="k">let</span> <span class="k">mut</span> <span class="n">heap</span> <span class="o">=</span> <span class="nn">Heap</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>

    <span class="n">heap</span><span class="nf">.enter</span><span class="p">(|</span><span class="n">hs</span><span class="p">|</span> <span class="p">{</span>
        <span class="c1">// Allocate an object (returns an IntListRef)</span>
        <span class="k">let</span> <span class="n">obj1</span> <span class="o">=</span> <span class="n">hs</span><span class="nf">.alloc</span><span class="p">(</span><span class="n">IntList</span> <span class="p">{</span> <span class="n">head</span><span class="p">:</span> <span class="mi">17</span><span class="p">,</span> <span class="n">tail</span><span class="p">:</span> <span class="nb">None</span> <span class="p">});</span>
        <span class="nd">assert_eq!</span><span class="p">(</span><span class="n">obj1</span><span class="nf">.head</span><span class="p">(),</span> <span class="mi">17</span><span class="p">);</span>
        <span class="nd">assert_eq!</span><span class="p">(</span><span class="n">obj1</span><span class="nf">.tail</span><span class="p">(),</span> <span class="nb">None</span><span class="p">);</span>

        <span class="c1">// Allocate another object</span>
        <span class="k">let</span> <span class="n">obj2</span> <span class="o">=</span> <span class="n">hs</span><span class="nf">.alloc</span><span class="p">(</span><span class="n">IntList</span> <span class="p">{</span> <span class="n">head</span><span class="p">:</span> <span class="mi">33</span><span class="p">,</span> <span class="n">tail</span><span class="p">:</span> <span class="nf">Some</span><span class="p">(</span><span class="n">obj1</span><span class="p">)</span> <span class="p">});</span>
        <span class="nd">assert_eq!</span><span class="p">(</span><span class="n">obj2</span><span class="nf">.head</span><span class="p">(),</span> <span class="mi">33</span><span class="p">);</span>
        <span class="nd">assert_eq!</span><span class="p">(</span><span class="n">obj2</span><span class="nf">.tail</span><span class="p">()</span><span class="nf">.unwrap</span><span class="p">()</span><span class="nf">.head</span><span class="p">(),</span> <span class="mi">17</span><span class="p">);</span>

        <span class="c1">// mutate `tail`</span>
        <span class="n">obj2</span><span class="nf">.set_tail</span><span class="p">(</span><span class="nb">None</span><span class="p">);</span>
    <span class="p">});</span>
<span class="p">}</span>
</code></pre></div></div>

<p>All mutation goes through autogenerated accessors, so the crate has a little more control over traffic through the GC. These accessors help track roots via a scheme similar to what <a href="https://docs.rs/gc/"><code class="language-plaintext highlighter-rouge">gc</code></a> does: there’s an <code class="language-plaintext highlighter-rouge">IntoHeap</code> trait used for modifying root refcounts when a reference is put into and taken out of the heap via accessors.</p>

<p>Heap sessions allow for the heap to be moved around, even sent to other threads, and their lifetime prevents heap objects from being mixed between sessions. This uses a concept called <em>generativity</em>; you can read more about generativity in <em><a href="https://raw.githubusercontent.com/Gankra/thesis/master/thesis.pdf">“You Can’t Spell Trust Without Rust”</a></em> ch 6.3, by <a href="https://github.com/Gankra">Aria Beingessner</a>, or by looking at the <a href="https://github.com/bluss/indexing"><code class="language-plaintext highlighter-rouge">indexing</code></a> crate.</p>
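<p>Here is a rough sketch of how generativity can be built from a higher-ranked closure and an invariant lifetime. None of these names (<code class="language-plaintext highlighter-rouge">ToyHeap</code>, <code class="language-plaintext highlighter-rouge">ToySession</code>, <code class="language-plaintext highlighter-rouge">Handle</code>, <code class="language-plaintext highlighter-rouge">Brand</code>) come from cell-gc; they are made up for illustration, and the “heap” is just a vector of integers with no collection at all. The point is only that the closure must work for <em>every</em> <code class="language-plaintext highlighter-rouge">'h</code>, so each session gets a fresh lifetime that handles cannot outlive or share with another session.</p>

```rust
use std::marker::PhantomData;

/// Invariant lifetime marker: a `Handle<'a>` can never be coerced to a `Handle<'b>`.
type Brand<'h> = PhantomData<fn(&'h ()) -> &'h ()>;

/// Hypothetical heap of plain integers (not cell-gc's real `Heap`).
struct ToyHeap {
    slots: Vec<i64>,
}

/// A session borrows the heap and carries the brand `'h`.
struct ToySession<'h> {
    slots: &'h mut Vec<i64>,
    _brand: Brand<'h>,
}

/// A reference into the heap, tied to the session that created it.
#[derive(Clone, Copy)]
struct Handle<'h> {
    index: usize,
    _brand: Brand<'h>,
}

impl ToyHeap {
    fn new() -> Self {
        ToyHeap { slots: Vec::new() }
    }

    /// `for<'h>` forces the closure to accept any brand, so the return type
    /// `R` cannot mention `'h` and no `Handle<'h>` can leak out.
    fn enter<R>(&mut self, f: impl for<'h> FnOnce(&mut ToySession<'h>) -> R) -> R {
        let mut session = ToySession { slots: &mut self.slots, _brand: PhantomData };
        f(&mut session)
    }
}

impl<'h> ToySession<'h> {
    fn alloc(&mut self, value: i64) -> Handle<'h> {
        self.slots.push(value);
        Handle { index: self.slots.len() - 1, _brand: PhantomData }
    }

    fn get(&self, handle: Handle<'h>) -> i64 {
        self.slots[handle.index]
    }

    fn set(&mut self, handle: Handle<'h>, value: i64) {
        self.slots[handle.index] = value;
    }
}

fn demo() -> i64 {
    let mut heap = ToyHeap::new();
    heap.enter(|hs| {
        let a = hs.alloc(17);
        let b = hs.alloc(33);
        let sum = hs.get(a) + hs.get(b);
        hs.set(a, sum);
        // Plain data may leave the session; a `Handle<'h>` may not.
        hs.get(a)
    })
}
```

<p>Trying to return <code class="language-plaintext highlighter-rouge">a</code> from the closure, or to pass a handle from one heap’s session into a nested session of a different heap, should be rejected at compile time, since the two brands are distinct lifetimes that cannot be unified.</p>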

<h2 id="interlude-the-similarities-between-async-and-gcs">Interlude: The similarities between <code class="language-plaintext highlighter-rouge">async</code> and GCs</h2>

<p>The next two examples use machinery from Rust’s <code class="language-plaintext highlighter-rouge">async</code> functionality despite having nothing to do with async I/O, and I think it’s important to talk about why that should make sense. I’ve <a href="https://twitter.com/ManishEarth/status/1073651552768819200">tweeted about this before</a>: <a href="https://github.com/kyren">Catherine West</a> and I figured this out when we were talking about <a href="https://github.com/kyren/gc-arena">her GC idea</a> based on <code class="language-plaintext highlighter-rouge">async</code>.</p>

<p>You can see some of this correspondence in Go: Go is a language that has both garbage collection and async I/O, and both of these use the same “safepoints” for yielding to the garbage collector or the scheduler. In Go, the compiler needs to automatically insert code that checks the “pulse” of the heap every now and then, and potentially runs garbage collection. It also needs to automatically insert code that can tell the scheduler “hey now is a safe time to interrupt me if a different goroutine wishes to run”. These are very similar in principle – they’re both essentially places where the compiler is inserting “it is okay to interrupt me now” checks, sometimes called “interruption points” or “yield points”.</p>

<p>Now, Rust’s compiler does not automatically insert interruption points. However, the design of <code class="language-plaintext highlighter-rouge">async</code> in Rust is essentially a way of adding <em>explicit</em> interruption points to Rust. <code class="language-plaintext highlighter-rouge">foo().await</code> in Rust is a way of running <code class="language-plaintext highlighter-rouge">foo()</code> and expecting that the scheduler <em>may</em> interrupt the code in between. The designs of <a href="https://doc.rust-lang.org/nightly/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> and <a href="https://doc.rust-lang.org/nightly/std/pin/struct.Pin.html"><code class="language-plaintext highlighter-rouge">Pin&lt;P&gt;</code></a> come out of making this safe and pleasant to work with.</p>

<p>As we shall see, this same machinery can be used for creating safe interruption points for GCs in Rust.</p>
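<p>To see what an explicit interruption point looks like in practice, here is a small sketch using only the standard library. The names <code class="language-plaintext highlighter-rouge">YieldPoint</code>, <code class="language-plaintext highlighter-rouge">yield_point</code>, <code class="language-plaintext highlighter-rouge">NoopWaker</code>, <code class="language-plaintext highlighter-rouge">work</code> and <code class="language-plaintext highlighter-rouge">run_with_interruptions</code> are invented for this example. A hand-written driver polls the future in a loop; every time the future returns <code class="language-plaintext highlighter-rouge">Pending</code>, control is back with the driver, which is exactly the kind of moment at which a scheduler could switch tasks or a GC could hypothetically run a collection.</p>

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A future that suspends exactly once: an explicit interruption point.
struct YieldPoint {
    yielded: bool,
}

impl Future for YieldPoint {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            Poll::Pending
        }
    }
}

fn yield_point() -> YieldPoint {
    YieldPoint { yielded: false }
}

/// Waker that does nothing; the driver below simply polls in a loop.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// The "program": straight-line code with two explicit interruption points.
async fn work(log: &RefCell<Vec<&'static str>>) {
    log.borrow_mut().push("step 1");
    yield_point().await;
    log.borrow_mut().push("step 2");
    yield_point().await;
    log.borrow_mut().push("step 3");
}

/// The "scheduler": regains control at every `.await` that returns `Pending`.
fn run_with_interruptions() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    let mut fut = Box::pin(work(&log));
    while fut.as_mut().poll(&mut cx).is_pending() {
        // The task is suspended here; a GC could safely do its work now.
        log.borrow_mut().push("interrupted");
    }
    drop(fut);

    log.into_inner()
}
```

<p>The resulting log interleaves the three steps with two <code class="language-plaintext highlighter-rouge">"interrupted"</code> entries, one per <code class="language-plaintext highlighter-rouge">.await</code>. Nothing outside of those two points can be interrupted by the driver, which is the property the GC designs below lean on.</p>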

<h2 id="shifgrethor">Shifgrethor</h2>

<p><a href="https://github.com/withoutboats/shifgrethor">shifgrethor</a> is an experiment by <a href="https://github.com/withoutboats/">Saoirse</a> to try and build a GC that uses <a href="https://doc.rust-lang.org/nightly/std/pin/struct.Pin.html"><code class="language-plaintext highlighter-rouge">Pin&lt;P&gt;</code></a> for managing roots. They’ve written extensively on the design of <a href="https://github.com/withoutboats/shifgrethor">shifgrethor</a> <a href="https://without.boats/tags/shifgrethor/">on their blog</a>. In particular, the <a href="https://without.boats/blog/shifgrethor-iii/">post on rooting</a> goes through how rooting works.</p>

<p>The basic design is that there’s a <code class="language-plaintext highlighter-rouge">Root&lt;'root&gt;</code> type that contains a <code class="language-plaintext highlighter-rouge">Pin&lt;P&gt;</code>, which can be <em>immovably</em> tied to a stack frame using the same idea behind <code class="language-plaintext highlighter-rouge">pin-utils</code>’ <a href="https://docs.rs/pin-utils/0.1.0/pin_utils/macro.pin_mut.html"><code class="language-plaintext highlighter-rouge">pin_mut!()</code> macro</a>:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">letroot!</span><span class="p">(</span><span class="n">root</span><span class="p">);</span>
<span class="k">let</span> <span class="n">gc</span><span class="p">:</span> <span class="nb">Gc</span><span class="o">&lt;</span><span class="nv">'root</span><span class="p">,</span> <span class="n">Foo</span><span class="o">&gt;</span> <span class="o">=</span> <span class="n">root</span><span class="nf">.gc</span><span class="p">(</span><span class="nn">Foo</span><span class="p">::</span><span class="nf">new</span><span class="p">());</span>
</code></pre></div></div>

<p>The fact that <code class="language-plaintext highlighter-rouge">root</code> is immovable allows for it to be treated as a true marker for the <em>stack frame</em> over anything else. The list of rooted types can be neatly stored in an ordered stack-like vector in the GC implementation, popping when individual roots go out of scope.</p>

<p>If you wish to return a rooted object from a function, the function needs to accept a <code class="language-plaintext highlighter-rouge">Root&lt;'root&gt;</code>:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="n">new</span><span class="o">&lt;</span><span class="nv">'root</span><span class="o">&gt;</span><span class="p">(</span><span class="n">root</span><span class="p">:</span> <span class="n">Root</span><span class="o">&lt;</span><span class="nv">'root</span><span class="o">&gt;</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nb">Gc</span><span class="o">&lt;</span><span class="nv">'root</span><span class="p">,</span> <span class="k">Self</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="n">root</span><span class="nf">.gc</span><span class="p">(</span><span class="k">Self</span> <span class="p">{</span>
        <span class="c1">// ...</span>
    <span class="p">})</span>
<span class="p">}</span>
</code></pre></div></div>

<p>All GC’d types have a <code class="language-plaintext highlighter-rouge">'root</code> lifetime of the root they trace back to, and are declared with a custom derive:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[derive(GC)]</span>
<span class="k">struct</span> <span class="n">Foo</span><span class="o">&lt;</span><span class="nv">'root</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="nd">#[gc]</span> <span class="n">bar</span><span class="p">:</span> <span class="n">GcStore</span><span class="o">&lt;</span><span class="nv">'root</span><span class="p">,</span> <span class="n">Bar</span><span class="o">&gt;</span><span class="p">,</span>
<span class="p">}</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">GcStore</code> is a way to have fields use the rooting of their parent. Normally, if you wanted to put <code class="language-plaintext highlighter-rouge">Gc&lt;'root2, Bar&lt;'root2&gt;&gt;</code> inside <code class="language-plaintext highlighter-rouge">Foo&lt;'root1&gt;</code> you would not be able to because the lifetimes derive from different roots. <code class="language-plaintext highlighter-rouge">GcStore</code>, along with autogenerated accessors from <code class="language-plaintext highlighter-rouge">#[derive(GC)]</code>, will set <code class="language-plaintext highlighter-rouge">Bar</code>’s lifetime to be the same as that of <code class="language-plaintext highlighter-rouge">Foo</code> when you attempt to stick it inside <code class="language-plaintext highlighter-rouge">Foo</code>.</p>

<p>This design is somewhat similar to that of Servo where there’s a pair of types, one that lets us refer to GC types on the stack, and one that lets GC types refer to each other on the heap, but it uses <code class="language-plaintext highlighter-rouge">Pin&lt;P&gt;</code> instead of a lint to enforce this safely, which is way nicer. <code class="language-plaintext highlighter-rouge">Root&lt;'root&gt;</code> and <code class="language-plaintext highlighter-rouge">GcStore</code> do a bunch of lifetime tweaking that’s reminiscent of Josephine’s rooting system; however, there’s no need for an <code class="language-plaintext highlighter-rouge">&amp;mut JSContext</code> type that needs to be passed around everywhere.</p>

<h2 id="gc-arena">gc-arena</h2>

<p><a href="https://github.com/kyren/gc-arena"><code class="language-plaintext highlighter-rouge">gc-arena</code></a> is <a href="https://github.com/kyren">Catherine West</a>’s experimental GC design for her Lua VM, <a href="https://github.com/kyren/luster"><code class="language-plaintext highlighter-rouge">luster</code></a>.</p>

<p>The <code class="language-plaintext highlighter-rouge">gc-arena</code> crate forces all GC-manipulating code to go within <code class="language-plaintext highlighter-rouge">arena.mutate()</code> calls, between which garbage collection may occur.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[derive(Collect)]</span>
<span class="nd">#[collect(no_drop)]</span>
<span class="k">struct</span> <span class="n">TestRoot</span><span class="o">&lt;</span><span class="nv">'gc</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="n">number</span><span class="p">:</span> <span class="nb">Gc</span><span class="o">&lt;</span><span class="nv">'gc</span><span class="p">,</span> <span class="nb">i32</span><span class="o">&gt;</span><span class="p">,</span>
    <span class="n">many_numbers</span><span class="p">:</span> <span class="n">GcCell</span><span class="o">&lt;</span><span class="nv">'gc</span><span class="p">,</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">Gc</span><span class="o">&lt;</span><span class="nv">'gc</span><span class="p">,</span> <span class="nb">i32</span><span class="o">&gt;&gt;&gt;</span><span class="p">,</span>
<span class="p">}</span>

<span class="nd">make_arena!</span><span class="p">(</span><span class="n">TestArena</span><span class="p">,</span> <span class="n">TestRoot</span><span class="p">);</span>

<span class="k">let</span> <span class="k">mut</span> <span class="n">arena</span> <span class="o">=</span> <span class="nn">TestArena</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">ArenaParameters</span><span class="p">::</span><span class="nf">default</span><span class="p">(),</span> <span class="p">|</span><span class="n">mc</span><span class="p">|</span> <span class="n">TestRoot</span> <span class="p">{</span>
    <span class="n">number</span><span class="p">:</span> <span class="nn">Gc</span><span class="p">::</span><span class="nf">allocate</span><span class="p">(</span><span class="n">mc</span><span class="p">,</span> <span class="mi">42</span><span class="p">),</span>
    <span class="n">many_numbers</span><span class="p">:</span> <span class="nn">GcCell</span><span class="p">::</span><span class="nf">allocate</span><span class="p">(</span><span class="n">mc</span><span class="p">,</span> <span class="nn">Vec</span><span class="p">::</span><span class="nf">new</span><span class="p">()),</span>
<span class="p">});</span>

<span class="n">arena</span><span class="nf">.mutate</span><span class="p">(|</span><span class="n">mc</span><span class="p">,</span> <span class="n">root</span><span class="p">|</span> <span class="p">{</span>
    <span class="nd">assert_eq!</span><span class="p">(</span><span class="o">*</span><span class="p">((</span><span class="o">*</span><span class="n">root</span><span class="p">)</span><span class="py">.number</span><span class="p">),</span> <span class="mi">42</span><span class="p">);</span>
    <span class="n">root</span><span class="py">.many_numbers</span><span class="nf">.write</span><span class="p">(</span><span class="n">mc</span><span class="p">)</span><span class="nf">.push</span><span class="p">(</span><span class="nn">Gc</span><span class="p">::</span><span class="nf">allocate</span><span class="p">(</span><span class="n">mc</span><span class="p">,</span> <span class="mi">0</span><span class="p">));</span>
<span class="p">});</span>
</code></pre></div></div>

<p>Mutation is done with <code class="language-plaintext highlighter-rouge">GcCell</code>, basically a fancier version of <code class="language-plaintext highlighter-rouge">Gc&lt;RefCell&lt;T&gt;&gt;</code>. All GC operations require a <code class="language-plaintext highlighter-rouge">MutationContext</code> (<code class="language-plaintext highlighter-rouge">mc</code>), which is only available within <code class="language-plaintext highlighter-rouge">arena.mutate()</code>.</p>

<p>Only the arena root may survive between <code class="language-plaintext highlighter-rouge">mutate()</code> calls, and garbage collection does not happen during <code class="language-plaintext highlighter-rouge">.mutate()</code>, so rooting is easy – just follow the arena root. This crate allows for multiple GCs to coexist with separate heaps, and, similarly to <a href="https://github.com/jorendorff/cell-gc">cell-gc</a>, it uses generativity to enforce that the heaps do not get mixed.</p>

<p>So far this is mostly like other arena-based systems, but with a GC.</p>

<p>The <em>really cool</em> part of the design is the <code class="language-plaintext highlighter-rouge">gc-sequence</code> crate, which essentially builds a <code class="language-plaintext highlighter-rouge">Future</code>-like API (using a <code class="language-plaintext highlighter-rouge">Sequence</code> trait) on top of <code class="language-plaintext highlighter-rouge">gc-arena</code> that can potentially make this very pleasant to use. Here’s a modified example from a test:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[derive(Collect)]</span>
<span class="nd">#[collect(no_drop)]</span>
<span class="k">struct</span> <span class="n">TestRoot</span><span class="o">&lt;</span><span class="nv">'gc</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="n">test</span><span class="p">:</span> <span class="nb">Gc</span><span class="o">&lt;</span><span class="nv">'gc</span><span class="p">,</span> <span class="nb">i32</span><span class="o">&gt;</span><span class="p">,</span>
<span class="p">}</span>

<span class="nd">make_sequencable_arena!</span><span class="p">(</span><span class="n">test_sequencer</span><span class="p">,</span> <span class="n">TestRoot</span><span class="p">);</span>
<span class="k">use</span> <span class="nn">test_sequencer</span><span class="p">::</span><span class="n">Arena</span> <span class="k">as</span> <span class="n">TestArena</span><span class="p">;</span>

<span class="k">let</span> <span class="n">arena</span> <span class="o">=</span> <span class="nn">TestArena</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">ArenaParameters</span><span class="p">::</span><span class="nf">default</span><span class="p">(),</span> <span class="p">|</span><span class="n">mc</span><span class="p">|</span> <span class="n">TestRoot</span> <span class="p">{</span>
    <span class="n">test</span><span class="p">:</span> <span class="nn">Gc</span><span class="p">::</span><span class="nf">allocate</span><span class="p">(</span><span class="n">mc</span><span class="p">,</span> <span class="mi">42</span><span class="p">),</span>
<span class="p">});</span>

<span class="k">let</span> <span class="k">mut</span> <span class="n">sequence</span> <span class="o">=</span> <span class="n">arena</span><span class="nf">.sequence</span><span class="p">(|</span><span class="n">root</span><span class="p">|</span> <span class="p">{</span>
    <span class="nn">sequence</span><span class="p">::</span><span class="nf">from_fn_with</span><span class="p">(</span><span class="n">root</span><span class="py">.test</span><span class="p">,</span> <span class="p">|</span><span class="n">_</span><span class="p">,</span> <span class="n">test</span><span class="p">|</span> <span class="p">{</span>
        <span class="k">if</span> <span class="o">*</span><span class="n">test</span> <span class="o">==</span> <span class="mi">42</span> <span class="p">{</span>
            <span class="nf">Ok</span><span class="p">(</span><span class="o">*</span><span class="n">test</span> <span class="o">+</span> <span class="mi">10</span><span class="p">)</span>
        <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
            <span class="nf">Err</span><span class="p">(</span><span class="s">"will not be generated"</span><span class="p">)</span>
        <span class="p">}</span>
    <span class="p">})</span>
    <span class="nf">.and_then</span><span class="p">(|</span><span class="n">_</span><span class="p">,</span> <span class="n">r</span><span class="p">|</span> <span class="nf">Ok</span><span class="p">(</span><span class="n">r</span> <span class="o">+</span> <span class="mi">12</span><span class="p">))</span>
    <span class="nf">.and_chain</span><span class="p">(|</span><span class="n">_</span><span class="p">,</span> <span class="n">r</span><span class="p">|</span> <span class="nf">Ok</span><span class="p">(</span><span class="nn">sequence</span><span class="p">::</span><span class="nf">ok</span><span class="p">(</span><span class="n">r</span> <span class="o">-</span> <span class="mi">10</span><span class="p">)))</span>
    <span class="nf">.then</span><span class="p">(|</span><span class="n">_</span><span class="p">,</span> <span class="n">res</span><span class="p">|</span> <span class="n">res</span><span class="nf">.expect</span><span class="p">(</span><span class="s">"should not be error"</span><span class="p">))</span>
    <span class="nf">.chain</span><span class="p">(|</span><span class="n">_</span><span class="p">,</span> <span class="n">r</span><span class="p">|</span> <span class="nn">sequence</span><span class="p">::</span><span class="nf">done</span><span class="p">(</span><span class="n">r</span> <span class="o">+</span> <span class="mi">10</span><span class="p">))</span>
    <span class="nf">.map</span><span class="p">(|</span><span class="n">r</span><span class="p">|</span> <span class="nn">sequence</span><span class="p">::</span><span class="nf">done</span><span class="p">(</span><span class="n">r</span> <span class="o">-</span> <span class="mi">60</span><span class="p">))</span>
    <span class="nf">.flatten</span><span class="p">()</span>
    <span class="nf">.boxed</span><span class="p">()</span>
<span class="p">});</span>

<span class="k">loop</span> <span class="p">{</span>
    <span class="k">match</span> <span class="n">sequence</span><span class="nf">.step</span><span class="p">()</span> <span class="p">{</span>
        <span class="nf">Ok</span><span class="p">((</span><span class="n">_</span><span class="p">,</span> <span class="n">output</span><span class="p">))</span> <span class="k">=&gt;</span> <span class="p">{</span>
            <span class="nd">assert_eq!</span><span class="p">(</span><span class="n">output</span><span class="p">,</span> <span class="mi">4</span><span class="p">);</span>
            <span class="k">return</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="nf">Err</span><span class="p">(</span><span class="n">s</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="n">sequence</span> <span class="o">=</span> <span class="n">s</span><span class="p">,</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This is <em>very</em> similar to chained-callback futures code. If it could use the <code class="language-plaintext highlighter-rouge">Future</code> trait, it could use <code class="language-plaintext highlighter-rouge">async</code> to turn this callback-heavy code into sequential code with interrupt points at each <code class="language-plaintext highlighter-rouge">await</code>. There were design constraints that made <code class="language-plaintext highlighter-rouge">Future</code> unworkable for this use case, though if Rust ever gets generators this would work well, and it’s quite possible that another GC with a similar design could be written using <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> and <code class="language-plaintext highlighter-rouge">Future</code>.</p>

<p>Essentially, this paints a picture of an entire space of Rust GC design where GC mutations are performed using <code class="language-plaintext highlighter-rouge">await</code> (or <code class="language-plaintext highlighter-rouge">yield</code> if we ever get generators), and garbage collection can occur during those yield points, in a way that’s highly reminiscent of Go’s design.</p>

<h2 id="moving-forward">Moving forward</h2>

<p>As is hopefully obvious, the space of safe GC design in Rust is quite rich and has a lot of interesting ideas. I’m really excited to see what folks come up with here!</p>

<p>If you’re interested in reading more about GCs in general, <em><a href="https://courses.cs.washington.edu/courses/cse590p/05au/p50-bacon.pdf">“A Unified Theory of Garbage Collection”</a></em> by Bacon et al. and the <a href="http://gchandbook.org/">GC Handbook</a> are great reads.</p>

<p><em>Thanks to <a href="https://mermaid.industries/">Andi McClure</a>, <a href="https://twitter.com/jorendorff/">Jason Orendorff</a>, <a href="https://fitzgeraldnick.com/">Nick Fitzgerald</a>, and <a href="https://twitter.com/kneecaw/">Nika Layzell</a> for providing feedback on drafts of this blog post.</em></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:0" role="doc-endnote">
      <p>I’m also going to completely ignore the field of <em>conservative</em> stack-scanning tracing GCs where you figure out your roots by looking at all the stack memory and considering anything with a remotely heap-object-like bit pattern to be a root. These are interesting, but can’t really be made 100% safe in the way Rust wants them to be unless you scan the heap as well. <a href="#fnref:0" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:1" role="doc-endnote">
      <p>Which currently does not have support for concurrent garbage collection, but it could be added. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>Some JNI-using APIs are also forced to have <a href="https://developer.android.com/ndk/reference/group/bitmap#androidbitmap_lockpixels">explicit rooting APIs</a> to give access to things like raw buffers. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>In general, finalizers in GCs are hard to implement soundly in any language, not just Rust, but Rust can sometimes be a bit more annoying about it. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>Spoiler: This is actually possible in Rust, and we’ll get into it further in this post! <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:10" role="doc-endnote">
      <p>Such hybrid approaches are common in high performance GCs; <em><a href="https://courses.cs.washington.edu/courses/cse590p/05au/p50-bacon.pdf">“A Unified Theory of Garbage Collection”</a></em> by Bacon et al. covers a lot of the breadth of these approaches. <a href="#fnref:10" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:5" role="doc-endnote">
      <p>Firefox’s DOM actually uses a mark &amp; sweep tracing GC <em>mixed with</em> a cycle collector for this reason. The DOM types themselves are cycle collected, but JavaScript objects are managed by the Spidermonkey GC. Since some DOM types may contain references to arbitrary JS types (e.g. ones that store callbacks) there’s a fair amount of work required to break cycles manually in some cases, but it has performance benefits since the vast majority of DOM objects either never become garbage or become garbage by having a couple non-cycle-participating references get released. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Arenas in Rust]]></title>
    <link href="http://manishearth.github.io/blog/2021/03/15/arenas-in-rust/"/>
    <updated>2021-03-15T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2021/03/15/arenas-in-rust</id>
    <content type="html"><![CDATA[<p>There’s been some discussion about arenas in Rust recently, and I thought I’d write about them.</p>

<p>Arenas aren’t something you would typically reach for in Rust so fewer people know about them; you only really see them in applications for various niche use cases. Usually you can use an arena by pulling in a crate and not using additional <code class="language-plaintext highlighter-rouge">unsafe</code>, so there’s no need to be particularly skittish around them in Rust, and it seems like it would be useful knowledge, especially for people coming to Rust from fields where arenas are more common.</p>

<p>Furthermore, there’s a set of <em>really cool</em> lifetime effects involved when implementing self-referential arenas, that I don’t think have been written about before.</p>

<p>I’m mostly writing this to talk about the cool lifetime effects, but I figured it’s worth writing a general introduction that has something for all Rustaceans. If you know what arenas are and just want the cool lifetimes you can skip directly to <a href="#implementing-a-self-referential-arena">the section on implementing self-referential arenas</a>. Otherwise, read on.</p>

<h2 id="whats-an-arena">What’s an arena?</h2>

<p>An arena is essentially a way to group up allocations that are expected to have the same lifetime. Sometimes you need to allocate a bunch of objects for the lifetime of an event, after which they can all be thrown away wholesale. It’s inefficient to call into the system allocator each time, and far preferable to <em>preallocate</em> a bunch of memory for your objects, cleaning it all up at once when you’re done with them.</p>

<p>Broadly speaking, there are two reasons you might wish to use an arena:</p>

<p>Firstly, your primary goal may be to reduce allocation pressure, as mentioned above. For example, in a game or application, there may be a large mishmash of per-frame-tick objects that need to get allocated each frame, and then thrown away. This is <em>extremely</em> common in game development in particular, and allocator pressure is something gamedevs tend to care about. With arenas, it’s easy enough to allocate an arena, fill it up during each frame and clear it out once the frame is over. This has additional benefits of cache locality: you can ensure that most of the per-frame objects (which are likely used more often than other objects) are usually in cache during the frame, since they’ve been allocated adjacently.</p>
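
<p>As a rough illustration of the per-frame pattern, here is a minimal standard-library-only sketch. <code class="language-plaintext highlighter-rouge">FrameArena</code> and its methods are hypothetical names made up for this example, and it is deliberately simplistic: it only stores a single type and hands out indices rather than references, so it is not how a real bump allocator works. It only shows the “preallocate once, fill each frame, clear wholesale” shape.</p>

```rust
/// A hypothetical per-frame scratch arena: objects are referred to by index
/// and all of them are discarded together at the end of the frame.
pub struct FrameArena<T> {
    items: Vec<T>,
}

impl<T> FrameArena<T> {
    /// Preallocate room for at least `capacity` objects up front.
    pub fn with_capacity(capacity: usize) -> Self {
        FrameArena { items: Vec::with_capacity(capacity) }
    }

    /// Store a value and return its index within the current frame.
    pub fn alloc(&mut self, value: T) -> usize {
        self.items.push(value);
        self.items.len() - 1
    }

    /// Look up a value allocated during the current frame.
    pub fn get(&self, index: usize) -> Option<&T> {
        self.items.get(index)
    }

    /// Number of objects allocated during the current frame.
    pub fn len(&self) -> usize {
        self.items.len()
    }

    /// Size of the backing storage, in objects.
    pub fn capacity(&self) -> usize {
        self.items.capacity()
    }

    /// End of frame: drop every object at once but keep the backing memory,
    /// so the next frame does not need to go back to the system allocator.
    pub fn reset(&mut self) {
        self.items.clear();
    }
}
```

<p>A game loop would call <code class="language-plaintext highlighter-rouge">alloc()</code> freely during a frame and <code class="language-plaintext highlighter-rouge">reset()</code> once at the end of it; as long as a frame stays within the preallocated capacity, no further allocation happens.</p>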

<p>Another goal might be that you want to write self-referential data, like a complex graph with cycles, that can get cleaned up all at once. For example, when writing compilers, type information will likely need to reference other types and other such data, leading to a complex, potentially cyclic graph of types. Once you’ve computed a type you probably don’t need to throw it away individually, so you can use an arena to store all your computed type information, cleaning the whole thing up at once when you’re at a stage where the types don’t matter anymore. Using this pattern means your code doesn’t have to worry about whether the self-referential bits get deallocated “early”: you can assume that if you have a <code class="language-plaintext highlighter-rouge">Ty</code> it lives as long as all the other <code class="language-plaintext highlighter-rouge">Ty</code>s and can reference them directly.</p>

<p>These two goals are not necessarily disjoint: You may wish to use an arena to achieve both goals simultaneously. But you can also just have an arena that disallows self-referential types (but has other nice properties). Later in this post I’m going to implement an arena that allows self-referential types but is not great on allocation pressure, mostly for ease of implementation. <em>Typically</em> if you’re writing an arena for self-referential types you can make it simultaneously reduce allocator pressure, but there can be tradeoffs.</p>

<h2 id="how-can-i-use-an-arena-in-rust">How can I use an arena in Rust?</h2>

<p>Typically to <em>use</em> an arena you can just pull in a crate that implements the right kind of arena. There are two that I know of that I’ll talk about below, though <a href="https://crates.io/search?q=arena">a cursory search of “arena” on crates.io</a> turns up many other promising candidates.</p>

<p>I’ll note that if you just need cyclic graph structures, you don’t <em>have</em> to use an arena; the excellent <a href="https://docs.rs/petgraph/"><code class="language-plaintext highlighter-rouge">petgraph</code></a> crate is often sufficient. <a href="https://docs.rs/slotmap/"><code class="language-plaintext highlighter-rouge">slotmap</code></a> is also useful; it’s a map-like data structure useful for self-referential data, based on generational indexing.</p>
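
<p>If “generational indexing” is unfamiliar, the idea can be sketched in a few lines of plain Rust. To be clear, this is <em>not</em> the <code class="language-plaintext highlighter-rouge">slotmap</code> API: <code class="language-plaintext highlighter-rouge">GenMap</code>, <code class="language-plaintext highlighter-rouge">Key</code> and <code class="language-plaintext highlighter-rouge">Slot</code> are hypothetical names for this example only. Each key remembers the generation of the slot it was issued for, so a stale key can be detected after its slot has been reused.</p>

```rust
/// A handle into the map: a slot index plus the generation it was issued for.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Key {
    index: usize,
    generation: u32,
}

struct Slot<T> {
    generation: u32,
    value: Option<T>,
}

/// A hypothetical, minimal generational map.
pub struct GenMap<T> {
    slots: Vec<Slot<T>>,
    free: Vec<usize>,
}

impl<T> GenMap<T> {
    pub fn new() -> Self {
        GenMap { slots: Vec::new(), free: Vec::new() }
    }

    /// Insert a value, reusing a freed slot if one is available.
    pub fn insert(&mut self, value: T) -> Key {
        if let Some(index) = self.free.pop() {
            let slot = &mut self.slots[index];
            slot.value = Some(value);
            Key { index, generation: slot.generation }
        } else {
            self.slots.push(Slot { generation: 0, value: Some(value) });
            Key { index: self.slots.len() - 1, generation: 0 }
        }
    }

    /// Returns `None` if the key is stale, i.e. its slot has been freed.
    pub fn get(&self, key: Key) -> Option<&T> {
        let slot = self.slots.get(key.index)?;
        if slot.generation == key.generation {
            slot.value.as_ref()
        } else {
            None
        }
    }

    /// Remove a value and bump the slot's generation so old keys stop matching.
    pub fn remove(&mut self, key: Key) -> Option<T> {
        let slot = self.slots.get_mut(key.index)?;
        if slot.generation != key.generation {
            return None;
        }
        let value = slot.value.take()?;
        slot.generation = slot.generation.wrapping_add(1);
        self.free.push(key.index);
        Some(value)
    }
}
```

<p>Because keys are plain <code class="language-plaintext highlighter-rouge">Copy</code> data rather than references, entries can point at each other (including in cycles) without involving the borrow checker; the cost is that every access is a checked lookup.</p>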

<h3 id="bumpalo">Bumpalo</h3>

<p><a href="https://docs.rs/bumpalo"><code class="language-plaintext highlighter-rouge">Bumpalo</code></a> is a fast “bump allocator”, which allows heterogeneous contents, and only allows cycles if you do not care about destructors getting run.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">bumpalo</span><span class="p">::</span><span class="n">Bump</span><span class="p">;</span>

<span class="c1">// (example slightly modified from `bumpalo` docs)</span>

<span class="c1">// Create a new arena to bump allocate into.</span>
<span class="k">let</span> <span class="n">bump</span> <span class="o">=</span> <span class="nn">Bump</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>

<span class="c1">// Allocate values into the arena.</span>
<span class="k">let</span> <span class="n">scooter</span> <span class="o">=</span> <span class="n">bump</span><span class="nf">.alloc</span><span class="p">(</span><span class="n">Doggo</span> <span class="p">{</span>
    <span class="n">cuteness</span><span class="p">:</span> <span class="nn">u64</span><span class="p">::</span><span class="nf">max_value</span><span class="p">(),</span>
    <span class="n">age</span><span class="p">:</span> <span class="mi">8</span><span class="p">,</span>
    <span class="n">scritches_required</span><span class="p">:</span> <span class="k">true</span><span class="p">,</span>
<span class="p">});</span>

<span class="c1">// Happy birthday, Scooter!</span>
<span class="n">scooter</span><span class="py">.age</span> <span class="o">+=</span> <span class="mi">1</span><span class="p">;</span>
</code></pre></div></div>

<p>Every call to <a href="https://docs.rs/bumpalo/3.6.1/bumpalo/struct.Bump.html#method.alloc"><code class="language-plaintext highlighter-rouge">Bump::alloc()</code></a> returns a mutable reference to the allocated object. You can allocate different objects, and they can even reference each other<sup id="fnref:0" role="doc-noteref"><a href="#fn:0" class="footnote" rel="footnote">1</a></sup>. By default it does not call destructors on its contents; however you can use <a href="https://docs.rs/bumpalo/3.6.1/bumpalo/boxed/index.html"><code class="language-plaintext highlighter-rouge">bumpalo::boxed</code></a> (or custom allocators on Nightly) to get this behavior. You can similarly use <a href="https://docs.rs/bumpalo/3.6.1/bumpalo/collections/index.html"><code class="language-plaintext highlighter-rouge">bumpalo::collections</code></a> to get <a href="https://docs.rs/bumpalo"><code class="language-plaintext highlighter-rouge">bumpalo</code></a>-backed vectors and strings. <a href="https://docs.rs/bumpalo/3.6.1/bumpalo/boxed/index.html"><code class="language-plaintext highlighter-rouge">bumpalo::boxed</code></a> will not be allowed to participate in cycles.</p>

<h3 id="typed-arena"><code class="language-plaintext highlighter-rouge">typed-arena</code></h3>

<p><a href="https://docs.rs/typed-arena/"><code class="language-plaintext highlighter-rouge">typed-arena</code></a> is an arena allocator that can only store objects of a single type, but it does allow for setting up cyclic references:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Example from typed-arena docs</span>

<span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">cell</span><span class="p">::</span><span class="n">Cell</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">typed_arena</span><span class="p">::</span><span class="n">Arena</span><span class="p">;</span>

<span class="k">struct</span> <span class="n">CycleParticipant</span><span class="o">&lt;</span><span class="nv">'a</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="n">other</span><span class="p">:</span> <span class="n">Cell</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;&amp;</span><span class="nv">'a</span> <span class="n">CycleParticipant</span><span class="o">&lt;</span><span class="nv">'a</span><span class="o">&gt;&gt;&gt;</span><span class="p">,</span>
<span class="p">}</span>

<span class="k">let</span> <span class="n">arena</span> <span class="o">=</span> <span class="nn">Arena</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>

<span class="k">let</span> <span class="n">a</span> <span class="o">=</span> <span class="n">arena</span><span class="nf">.alloc</span><span class="p">(</span><span class="n">CycleParticipant</span> <span class="p">{</span> <span class="n">other</span><span class="p">:</span> <span class="nn">Cell</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nb">None</span><span class="p">)</span> <span class="p">});</span>
<span class="k">let</span> <span class="n">b</span> <span class="o">=</span> <span class="n">arena</span><span class="nf">.alloc</span><span class="p">(</span><span class="n">CycleParticipant</span> <span class="p">{</span> <span class="n">other</span><span class="p">:</span> <span class="nn">Cell</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nb">None</span><span class="p">)</span> <span class="p">});</span>

<span class="c1">// mutate them after the fact to set up a cycle</span>
<span class="n">a</span><span class="py">.other</span><span class="nf">.set</span><span class="p">(</span><span class="nf">Some</span><span class="p">(</span><span class="n">b</span><span class="p">));</span>
<span class="n">b</span><span class="py">.other</span><span class="nf">.set</span><span class="p">(</span><span class="nf">Some</span><span class="p">(</span><span class="n">a</span><span class="p">));</span>
</code></pre></div></div>

<p>Unlike <a href="https://docs.rs/bumpalo"><code class="language-plaintext highlighter-rouge">bumpalo</code></a>, <a href="https://docs.rs/typed-arena/"><code class="language-plaintext highlighter-rouge">typed-arena</code></a> will always run destructors on its contents when the arena itself goes out of scope<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">2</a></sup>.</p>

<h2 id="implementing-a-self-referential-arena">Implementing a self-referential arena</h2>

<p>Self-referential arenas are interesting because, typically, Rust is very wary of self-referential data. But arenas let you clearly separate the steps of “I don’t care about this object” and “this object can be deleted” in a way that is sufficient to allow self-referential and cyclic types.</p>

<p>It’s pretty rare to need to implement your own arena – <a href="https://docs.rs/bumpalo"><code class="language-plaintext highlighter-rouge">bumpalo</code></a> and <a href="https://docs.rs/typed-arena/"><code class="language-plaintext highlighter-rouge">typed-arena</code></a> cover most of the use cases, and if they don’t cover yours you probably can find something that does on <a href="https://crates.io/search?q=arena">crates.io</a>. But if you really need to, or if you’re interested in the nitty-gritty lifetime details, this section is for you.</p>

<div class="post-aside post-aside-note">For people less familiar with lifetimes: the lifetimes in the syntaxes <code class="language-plaintext highlighter-rouge">&amp;'a Foo</code> and <code class="language-plaintext highlighter-rouge">Foo&lt;'b&gt;</code> mean different things. <code class="language-plaintext highlighter-rouge">'a</code> in <code class="language-plaintext highlighter-rouge">&amp;'a Foo</code> is the lifetime <em>of</em> <code class="language-plaintext highlighter-rouge">Foo</code>, or, at least the lifetime of <em>this</em> reference to <code class="language-plaintext highlighter-rouge">Foo</code>. <code class="language-plaintext highlighter-rouge">'b</code> in <code class="language-plaintext highlighter-rouge">Foo&lt;'b&gt;</code> is a lifetime <em>parameter</em> of <code class="language-plaintext highlighter-rouge">Foo</code>, and typically means something like “the lifetime of data <code class="language-plaintext highlighter-rouge">Foo</code> is allowed to reference”.</div>
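
<p>A small sketch of that distinction, using a hypothetical <code class="language-plaintext highlighter-rouge">Foo</code> that borrows a string (the helper functions <code class="language-plaintext highlighter-rouge">text_of</code> and <code class="language-plaintext highlighter-rouge">demo</code> are likewise made up for illustration). The data <code class="language-plaintext highlighter-rouge">Foo</code> points at can outlive both the <code class="language-plaintext highlighter-rouge">Foo</code> itself and any reference to it:</p>

```rust
/// `'b` is a lifetime parameter of `Foo`: it bounds the data `Foo` borrows.
pub struct Foo<'b> {
    pub text: &'b str,
}

/// `'a` is the lifetime of this particular reference to a `Foo`.
/// The returned `&'b str` is tied to `'b`, not to `'a`, so it may outlive
/// the `&'a Foo<'b>` it was read through.
pub fn text_of<'a, 'b>(foo: &'a Foo<'b>) -> &'b str {
    foo.text
}

pub fn demo() -> usize {
    let owned = String::from("hello");
    let text: &str;
    {
        let foo = Foo { text: &owned };
        text = text_of(&foo);
    } // `foo` and the reference to it are gone here, but `'b` is still live
    text.len()
}
```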

<p>The key to implementing an arena <code class="language-plaintext highlighter-rouge">Arena</code> with entries typed as <code class="language-plaintext highlighter-rouge">Entry</code> is in the following rules:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">Arena</code> and <code class="language-plaintext highlighter-rouge">Entry</code> should both have a lifetime parameter: <code class="language-plaintext highlighter-rouge">Arena&lt;'arena&gt;</code> and <code class="language-plaintext highlighter-rouge">Entry&lt;'arena&gt;</code></li>
  <li><code class="language-plaintext highlighter-rouge">Arena</code> methods should all receive <code class="language-plaintext highlighter-rouge">Arena&lt;'arena&gt;</code> as <code class="language-plaintext highlighter-rouge">&amp;'arena self</code>, i.e. their <code class="language-plaintext highlighter-rouge">self</code> type is <code class="language-plaintext highlighter-rouge">&amp;'arena Arena&lt;'arena&gt;</code></li>
  <li><code class="language-plaintext highlighter-rouge">Entry</code> should almost always be passed around as <code class="language-plaintext highlighter-rouge">&amp;'arena Entry&lt;'arena&gt;</code> (it’s useful to make an alias for this)</li>
  <li>Use interior mutability; <code class="language-plaintext highlighter-rouge">&amp;mut self</code> on <code class="language-plaintext highlighter-rouge">Arena</code> will make everything stop compiling. If using <code class="language-plaintext highlighter-rouge">unsafe</code> for mutability, make sure you have a <code class="language-plaintext highlighter-rouge">PhantomData</code> for <code class="language-plaintext highlighter-rouge">RefCell&lt;Entry&lt;'arena&gt;&gt;</code> somewhere.</li>
</ul>

<p>That’s basically it from the lifetime side; the rest is all in figuring out what API you want and implementing the backing storage. Armed with the above rules you should be able to make your custom arena work with the guarantees you need without having to understand what’s going on with the underlying lifetimes.</p>

<p>Let’s go through an implementation example, and then dissect <em>why</em> it works.</p>

<h3 id="implementation">Implementation</h3>

<p>My crate <a href="https://docs.rs/elsa"><code class="language-plaintext highlighter-rouge">elsa</code></a> implements an arena in 100% safe code <a href="https://github.com/Manishearth/elsa/blob/915d26008d8bae069927c551da506dba05d2755b/examples/mutable_arena.rs">in one of its examples</a>. This arena does <em>not</em> save on allocations since <a href="https://docs.rs/elsa/1.4.0/elsa/vec/struct.FrozenVec.html"><code class="language-plaintext highlighter-rouge">elsa::FrozenVec</code></a> requires its contents be behind some indirection, and it’s not generic, but it’s a reasonable way to illustrate how the lifetimes work without getting into the weeds of implementing a <em>really good</em> arena with <code class="language-plaintext highlighter-rouge">unsafe</code>.</p>

<p>The example implements an arena of <code class="language-plaintext highlighter-rouge">Person&lt;'arena&gt;</code> types, <code class="language-plaintext highlighter-rouge">Arena&lt;'arena&gt;</code>. The goal is to implement some kind of directed social graph, which may have cycles.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">elsa</span><span class="p">::</span><span class="n">FrozenVec</span><span class="p">;</span>

<span class="k">struct</span> <span class="n">Arena</span><span class="o">&lt;</span><span class="nv">'arena</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="n">people</span><span class="p">:</span> <span class="n">FrozenVec</span><span class="o">&lt;</span><span class="nb">Box</span><span class="o">&lt;</span><span class="n">Person</span><span class="o">&lt;</span><span class="nv">'arena</span><span class="o">&gt;&gt;&gt;</span><span class="p">,</span>
<span class="p">}</span>
</code></pre></div></div>

<p><a href="https://docs.rs/elsa/1.4.0/elsa/vec/struct.FrozenVec.html"><code class="language-plaintext highlighter-rouge">elsa::FrozenVec</code></a> is an append-only <code class="language-plaintext highlighter-rouge">Vec</code>-like abstraction that allows you to call <code class="language-plaintext highlighter-rouge">.push()</code> without needing a mutable reference, and is how we’ll be able to implement this arena in safe code.</p>

<p>Each <code class="language-plaintext highlighter-rouge">Person&lt;'arena&gt;</code> has a list of people they follow but also keeps track of people who follow them:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">Person</span><span class="o">&lt;</span><span class="nv">'arena</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">pub</span> <span class="n">follows</span><span class="p">:</span> <span class="n">FrozenVec</span><span class="o">&lt;</span><span class="n">PersonRef</span><span class="o">&lt;</span><span class="nv">'arena</span><span class="o">&gt;&gt;</span><span class="p">,</span>
    <span class="k">pub</span> <span class="n">reverse_follows</span><span class="p">:</span> <span class="n">FrozenVec</span><span class="o">&lt;</span><span class="n">PersonRef</span><span class="o">&lt;</span><span class="nv">'arena</span><span class="o">&gt;&gt;</span><span class="p">,</span>
    <span class="k">pub</span> <span class="n">name</span><span class="p">:</span> <span class="o">&amp;</span><span class="k">'static</span> <span class="nb">str</span><span class="p">,</span>
<span class="p">}</span>

<span class="c1">// following the rule above about references to entry types</span>
<span class="k">type</span> <span class="n">PersonRef</span><span class="o">&lt;</span><span class="nv">'arena</span><span class="o">&gt;</span> <span class="o">=</span> <span class="o">&amp;</span><span class="nv">'arena</span> <span class="n">Person</span><span class="o">&lt;</span><span class="nv">'arena</span><span class="o">&gt;</span><span class="p">;</span>
</code></pre></div></div>

<p>The lifetime <code class="language-plaintext highlighter-rouge">'arena</code> is essentially “the lifetime of the arena itself”. This is where it starts getting weird: typically if your type has a lifetime <em>parameter</em>, the caller gets to pick what goes in there. You don’t get to just say “this is the lifetime of the object itself”; the caller would typically be able to instantiate an <code class="language-plaintext highlighter-rouge">Arena&lt;'static&gt;</code> if they wish, or an <code class="language-plaintext highlighter-rouge">Arena&lt;'a&gt;</code> for some <code class="language-plaintext highlighter-rouge">'a</code>. But here we’re declaring that <code class="language-plaintext highlighter-rouge">'arena</code> is the lifetime of the arena itself; clearly something fishy is happening here.</p>

<p>Here’s where we actually implement the arena:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">impl</span><span class="o">&lt;</span><span class="nv">'arena</span><span class="o">&gt;</span> <span class="n">Arena</span><span class="o">&lt;</span><span class="nv">'arena</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="nf">new</span><span class="p">()</span> <span class="k">-&gt;</span> <span class="n">Arena</span><span class="o">&lt;</span><span class="nv">'arena</span><span class="o">&gt;</span> <span class="p">{</span>
        <span class="n">Arena</span> <span class="p">{</span>
            <span class="n">people</span><span class="p">:</span> <span class="nn">FrozenVec</span><span class="p">::</span><span class="nf">new</span><span class="p">(),</span>
        <span class="p">}</span>
    <span class="p">}</span>
    
    <span class="k">fn</span> <span class="nf">add_person</span><span class="p">(</span><span class="o">&amp;</span><span class="nv">'arena</span> <span class="k">self</span><span class="p">,</span> <span class="n">name</span><span class="p">:</span> <span class="o">&amp;</span><span class="k">'static</span> <span class="nb">str</span><span class="p">,</span>
                  <span class="n">follows</span><span class="p">:</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">PersonRef</span><span class="o">&lt;</span><span class="nv">'arena</span><span class="o">&gt;&gt;</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="n">PersonRef</span><span class="o">&lt;</span><span class="nv">'arena</span><span class="o">&gt;</span> <span class="p">{</span>
        <span class="k">let</span> <span class="n">idx</span> <span class="o">=</span> <span class="k">self</span><span class="py">.people</span><span class="nf">.len</span><span class="p">();</span>
        <span class="k">self</span><span class="py">.people</span><span class="nf">.push</span><span class="p">(</span><span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">Person</span> <span class="p">{</span>
            <span class="n">name</span><span class="p">,</span>
            <span class="n">follows</span><span class="p">:</span> <span class="n">follows</span><span class="nf">.into</span><span class="p">(),</span>
            <span class="n">reverse_follows</span><span class="p">:</span> <span class="nn">Default</span><span class="p">::</span><span class="nf">default</span><span class="p">(),</span>
        <span class="p">}));</span>
        <span class="k">let</span> <span class="n">me</span> <span class="o">=</span> <span class="o">&amp;</span><span class="k">self</span><span class="py">.people</span><span class="p">[</span><span class="n">idx</span><span class="p">];</span>
        <span class="k">for</span> <span class="n">friend</span> <span class="k">in</span> <span class="o">&amp;</span><span class="n">me</span><span class="py">.follows</span> <span class="p">{</span>
            <span class="c1">// We're mutating existing arena entries to add references,</span>
            <span class="c1">// potentially creating cycles!</span>
            <span class="n">friend</span><span class="py">.reverse_follows</span><span class="nf">.push</span><span class="p">(</span><span class="n">me</span><span class="p">)</span>
        <span class="p">}</span>
        <span class="n">me</span>
    <span class="p">}</span>

    <span class="k">fn</span> <span class="nf">dump</span><span class="p">(</span><span class="o">&amp;</span><span class="nv">'arena</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
        <span class="c1">// code to print out every Person, the people they follow, and the people who follow them</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Note the <code class="language-plaintext highlighter-rouge">&amp;'arena self</code> in <code class="language-plaintext highlighter-rouge">add_person</code>.</p>

<p>A <em>good</em> implementation here would typically separate out code handling the higher level invariant of “if A <code class="language-plaintext highlighter-rouge">follows</code> B then B <code class="language-plaintext highlighter-rouge">reverse_follows</code> A”, but this is just an example.</p>

<p>And finally, we can use the arena like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">arena</span> <span class="o">=</span> <span class="nn">Arena</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
    <span class="k">let</span> <span class="n">lonely</span> <span class="o">=</span> <span class="n">arena</span><span class="nf">.add_person</span><span class="p">(</span><span class="s">"lonely"</span><span class="p">,</span> <span class="nd">vec!</span><span class="p">[]);</span>
    <span class="k">let</span> <span class="n">best_friend</span> <span class="o">=</span> <span class="n">arena</span><span class="nf">.add_person</span><span class="p">(</span><span class="s">"best friend"</span><span class="p">,</span> <span class="nd">vec!</span><span class="p">[</span><span class="n">lonely</span><span class="p">]);</span>
    <span class="k">let</span> <span class="n">threes_a_crowd</span> <span class="o">=</span> <span class="n">arena</span><span class="nf">.add_person</span><span class="p">(</span><span class="s">"threes a crowd"</span><span class="p">,</span> <span class="nd">vec!</span><span class="p">[</span><span class="n">lonely</span><span class="p">,</span> <span class="n">best_friend</span><span class="p">]);</span>
    <span class="k">let</span> <span class="n">rando</span> <span class="o">=</span> <span class="n">arena</span><span class="nf">.add_person</span><span class="p">(</span><span class="s">"rando"</span><span class="p">,</span> <span class="nd">vec!</span><span class="p">[]);</span>
    <span class="k">let</span> <span class="n">_everyone</span> <span class="o">=</span> <span class="n">arena</span><span class="nf">.add_person</span><span class="p">(</span><span class="s">"follows everyone"</span><span class="p">,</span> <span class="nd">vec!</span><span class="p">[</span><span class="n">rando</span><span class="p">,</span> <span class="n">threes_a_crowd</span><span class="p">,</span> <span class="n">lonely</span><span class="p">,</span> <span class="n">best_friend</span><span class="p">]);</span>
    <span class="n">arena</span><span class="nf">.dump</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<p>In this case all of the “mutability” happens in the implementation of the arena itself, but it would be possible for this code to add entries directly to the <code class="language-plaintext highlighter-rouge">follows</code>/<code class="language-plaintext highlighter-rouge">reverse_follows</code> lists, or <code class="language-plaintext highlighter-rouge">Person</code> could have <code class="language-plaintext highlighter-rouge">RefCell</code>s for other kinds of links, or whatever.</p>

<h3 id="how-the-lifetimes-work">How the lifetimes work</h3>

<p>So how does this work? As I said earlier, with such abstractions in Rust, the caller typically has freedom to set the lifetime based on what they do with it. For example, if you have a <code class="language-plaintext highlighter-rouge">HashMap&lt;K, &amp;'a str&gt;</code>, the <code class="language-plaintext highlighter-rouge">'a</code> will get set based on the lifetime of what you try to insert.</p>

<p>When you construct the <code class="language-plaintext highlighter-rouge">Arena</code> its lifetime parameter is indeed still unconstrained, and we can test this by checking that the following code, which forcibly constrains the lifetime, still compiles.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">arena</span><span class="p">:</span> <span class="n">Arena</span><span class="o">&lt;</span><span class="k">'static</span><span class="o">&gt;</span> <span class="o">=</span> <span class="nn">Arena</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
</code></pre></div></div>

<p>But the moment you try to do anything with the arena, this stops working:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">arena</span><span class="p">:</span> <span class="n">Arena</span><span class="o">&lt;</span><span class="k">'static</span><span class="o">&gt;</span> <span class="o">=</span> <span class="nn">Arena</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="k">let</span> <span class="n">lonely</span> <span class="o">=</span> <span class="n">arena</span><span class="nf">.add_person</span><span class="p">(</span><span class="s">"lonely"</span><span class="p">,</span> <span class="nd">vec!</span><span class="p">[]);</span>
</code></pre></div></div>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error[E0597]: `arena` does not live long enough
  --&gt; examples/mutable_arena.rs:5:18
   |
4  |     let arena: Arena&lt;'static&gt; = Arena::new();
   |                -------------- type annotation requires that `arena` is borrowed for `'static`
5  |     let lonely = arena.add_person("lonely", vec![]);
   |                  ^^^^^ borrowed value does not live long enough
...
11 | }
   | - `arena` dropped here while still borrowed
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">add_person</code> method is somehow suddenly forcing the <code class="language-plaintext highlighter-rouge">'arena</code> parameter of <code class="language-plaintext highlighter-rouge">Arena</code> to be set to its <em>own</em> lifetime, constraining it (and making it impossible to force-constrain it to be anything else with type annotations).</p>

<p>What’s going on here is a neat interaction with the <code class="language-plaintext highlighter-rouge">&amp;'arena self</code> signature of <code class="language-plaintext highlighter-rouge">add_person</code> (i.e. <code class="language-plaintext highlighter-rouge">self</code> is <code class="language-plaintext highlighter-rouge">&amp;'arena Arena&lt;'arena&gt;</code>), and the fact that <code class="language-plaintext highlighter-rouge">'arena</code> in <code class="language-plaintext highlighter-rouge">Arena&lt;'arena&gt;</code> is an <a href="https://doc.rust-lang.org/nomicon/subtyping.html#variance"><em>invariant lifetime</em></a>.</p>

<p>Usually in your Rust programs, lifetimes are a little bit stretchy-squeezy. The following code compiles just fine:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// ask for two strings *with the same lifetime*</span>
<span class="k">fn</span> <span class="n">take_strings</span><span class="o">&lt;</span><span class="nv">'a</span><span class="o">&gt;</span><span class="p">(</span><span class="n">x</span><span class="p">:</span> <span class="o">&amp;</span><span class="nv">'a</span> <span class="nb">str</span><span class="p">,</span> <span class="n">y</span><span class="p">:</span> <span class="o">&amp;</span><span class="nv">'a</span> <span class="nb">str</span><span class="p">)</span> <span class="p">{}</span>

<span class="c1">// string literal with lifetime 'static</span>
<span class="k">let</span> <span class="n">lives_forever</span> <span class="o">=</span> <span class="s">"foo"</span><span class="p">;</span>
<span class="c1">// owned string with shorter, local lifetime</span>
<span class="k">let</span> <span class="n">short_lived</span> <span class="o">=</span> <span class="nn">String</span><span class="p">::</span><span class="nf">from</span><span class="p">(</span><span class="s">"bar"</span><span class="p">);</span>

<span class="c1">// still works!</span>
<span class="nf">take_strings</span><span class="p">(</span><span class="n">lives_forever</span><span class="p">,</span> <span class="o">&amp;*</span><span class="n">short_lived</span><span class="p">);</span>
</code></pre></div></div>

<p>In this code, Rust is happy to notice that while <code class="language-plaintext highlighter-rouge">lives_forever</code> and <code class="language-plaintext highlighter-rouge">&amp;*short_lived</code> have different lifetimes, it’s totally acceptable to <em>pretend</em> <code class="language-plaintext highlighter-rouge">lives_forever</code> has a shorter lifetime for the duration of the <code class="language-plaintext highlighter-rouge">take_strings</code> function. It’s just a reference, and a reference valid for a long lifetime is <em>also</em> valid for a shorter lifetime.</p>

<p>The thing is, this stretchy-squeeziness is not the same for all lifetimes! The <a href="https://doc.rust-lang.org/nomicon/subtyping.html">nomicon chapter on subtyping and variance</a> goes into detail on <em>why</em> this is the case, but a general rule of thumb is that most lifetimes are “squeezy”<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">3</a></sup> like the one in <code class="language-plaintext highlighter-rouge">&amp;'a str</code> above, but if some form of mutability is involved, they are rigid, also known as “invariant”. You can also have “stretchy”<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">4</a></sup> lifetimes if you’re using function types, but they’re rare.</p>

<p>Our <code class="language-plaintext highlighter-rouge">Arena&lt;'arena&gt;</code> is using interior mutability (via the <code class="language-plaintext highlighter-rouge">FrozenVec</code>) in a way that makes <code class="language-plaintext highlighter-rouge">'arena</code> invariant.</p>
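
<p>To make the three flavours concrete, here’s a small standard-library-only sketch. The function names (<code class="language-plaintext highlighter-rouge">squeeze</code>, <code class="language-plaintext highlighter-rouge">stretch</code>, <code class="language-plaintext highlighter-rouge">store</code>, <code class="language-plaintext highlighter-rouge">str_len</code>) are made up for illustration, and I’m using <code class="language-plaintext highlighter-rouge">Cell</code> as a stand-in for “some form of interior mutability” rather than <code class="language-plaintext highlighter-rouge">FrozenVec</code>:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::cell::Cell;

// "Squeezy" (covariant): a `&amp;'static str` can be used wherever a
// shorter-lived `&amp;'a str` is expected.
fn squeeze&lt;'a&gt;(long: &amp;'static str, _witness: &amp;'a str) -&gt; &amp;'a str {
    long
}

// "Stretchy" (contravariant): a function that accepts short-lived strings
// can stand in for one that only ever receives `'static` strings.
fn stretch&lt;'a&gt;(f: fn(&amp;'a str) -&gt; usize) -&gt; fn(&amp;'static str) -&gt; usize {
    f
}

// Rigid (invariant): the analogous function for `Cell` is rejected, because
// a `Cell&lt;&amp;'a str&gt;` can be written to through a shared reference.
//
// fn squeeze_cell&lt;'a&gt;(long: Cell&lt;&amp;'static str&gt;, _witness: &amp;'a str) -&gt; Cell&lt;&amp;'a str&gt; {
//     long // error: lifetime may not live long enough
// }

// Writing into the cell is why it has to be rigid: `slot` and `value`
// must agree on exactly one lifetime.
fn store&lt;'a&gt;(slot: &amp;Cell&lt;&amp;'a str&gt;, value: &amp;'a str) {
    slot.set(value);
}

// Hypothetical helper to pass to `stretch`.
fn str_len(s: &amp;str) -&gt; usize {
    s.len()
}
</code></pre></div></div>

<p>If <code class="language-plaintext highlighter-rouge">squeeze_cell</code> were allowed, <code class="language-plaintext highlighter-rouge">store</code> could be used to smuggle a short-lived reference into a cell that someone else still believes holds a <code class="language-plaintext highlighter-rouge">'static</code> one, which is the kind of bug invariance exists to rule out.</p>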

<p>Let’s look at our two lines of code again. When the compiler sees the first line of the code below, it constructs <code class="language-plaintext highlighter-rouge">arena</code>, whose lifetime we’ll call <code class="language-plaintext highlighter-rouge">'a</code>. At this point the type of <code class="language-plaintext highlighter-rouge">arena</code> is <code class="language-plaintext highlighter-rouge">Arena&lt;'?&gt;</code>, where <code class="language-plaintext highlighter-rouge">'?</code> is made-up notation for a yet-unconstrained lifetime.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">arena</span> <span class="o">=</span> <span class="nn">Arena</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span> 
<span class="k">let</span> <span class="n">lonely</span> <span class="o">=</span> <span class="n">arena</span><span class="nf">.add_person</span><span class="p">(</span><span class="s">"lonely"</span><span class="p">,</span> <span class="nd">vec!</span><span class="p">[]);</span>
</code></pre></div></div>

<p>Let’s actually rewrite this to be clearer on what the lifetimes are.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">arena</span> <span class="o">=</span> <span class="nn">Arena</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span> <span class="c1">// type Arena&lt;'?&gt;, lives for 'a</span>

<span class="c1">// explicitly write the `self` that gets constructed when you call add_person</span>
<span class="k">let</span> <span class="n">ref_to_arena</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">arena</span><span class="p">;</span> <span class="c1">// type &amp;'a Arena&lt;'?&gt;</span>
<span class="k">let</span> <span class="n">lonely</span> <span class="o">=</span> <span class="nn">Arena</span><span class="p">::</span><span class="nf">add_person</span><span class="p">(</span><span class="n">ref_to_arena</span><span class="p">,</span> <span class="s">"lonely"</span><span class="p">,</span> <span class="nd">vec!</span><span class="p">[]);</span>

</code></pre></div></div>

<p>Remember the second rule I listed earlier?</p>

<blockquote>
  <p><code class="language-plaintext highlighter-rouge">Arena</code> methods should all receive <code class="language-plaintext highlighter-rouge">Arena&lt;'arena&gt;</code> as <code class="language-plaintext highlighter-rouge">&amp;'arena self</code>, i.e. their <code class="language-plaintext highlighter-rouge">self</code> type is <code class="language-plaintext highlighter-rouge">&amp;'arena Arena&lt;'arena&gt;</code></p>
</blockquote>

<p>We followed this rule; the signature of <code class="language-plaintext highlighter-rouge">add_person</code> is <code class="language-plaintext highlighter-rouge">fn add_person(&amp;'arena self)</code>. This means that <code class="language-plaintext highlighter-rouge">ref_to_arena</code> is <em>forced</em> to have a type that matches the pattern <code class="language-plaintext highlighter-rouge">&amp;'arena Arena&lt;'arena&gt;</code>. Currently its type is <code class="language-plaintext highlighter-rouge">&amp;'a Arena&lt;'?&gt;</code>, which means that <code class="language-plaintext highlighter-rouge">'?</code> is <em>forced</em> to be the same as <code class="language-plaintext highlighter-rouge">'a</code>, i.e. the lifetime of the <code class="language-plaintext highlighter-rouge">arena</code> variable itself. If the lifetime weren’t invariant, the compiler would be able to squeeze other lifetimes to fit, but it is invariant, and the unconstrained lifetime is forced to be exactly one lifetime.</p>

<p>And by this rather subtle sleight of hand we’re able to force the compiler to set the lifetime <em>parameter</em> of <code class="language-plaintext highlighter-rouge">Arena&lt;'arena&gt;</code> to the lifetime of its <em>instance</em>.</p>

<p>After this, the rest is pretty straightforward. <code class="language-plaintext highlighter-rouge">Arena&lt;'arena&gt;</code> holds entries of type <code class="language-plaintext highlighter-rouge">Person&lt;'arena&gt;</code>, which is basically a way of saying “a <code class="language-plaintext highlighter-rouge">Person</code> that is allowed to reference items of lifetime <code class="language-plaintext highlighter-rouge">'arena</code>, i.e. items in <code class="language-plaintext highlighter-rouge">Arena</code>”. <code class="language-plaintext highlighter-rouge">type PersonRef&lt;'arena&gt; = &amp;'arena Person&lt;'arena&gt;</code> is a convenient shorthand for “a reference to a <code class="language-plaintext highlighter-rouge">Person</code> that lives in <code class="language-plaintext highlighter-rouge">Arena</code> and is allowed to reference objects from it”.</p>
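
<p>If you want to poke at the same trick without pulling in <code class="language-plaintext highlighter-rouge">elsa</code>, here’s a stripped-down sketch that uses only the standard library. It is not the arena from above: <code class="language-plaintext highlighter-rouge">Node</code>, <code class="language-plaintext highlighter-rouge">point_at</code>, <code class="language-plaintext highlighter-rouge">next_name</code> and <code class="language-plaintext highlighter-rouge">cycle_names</code> are hypothetical names, the nodes live on the stack, and a single <code class="language-plaintext highlighter-rouge">Cell</code> link replaces the <code class="language-plaintext highlighter-rouge">FrozenVec</code>s. The lifetime story should be the same, though:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::cell::Cell;

// Hypothetical stand-in for `Person&lt;'arena&gt;`, with one optional link.
struct Node&lt;'a&gt; {
    name: &amp;'static str,
    next: Cell&lt;Option&lt;&amp;'a Node&lt;'a&gt;&gt;&gt;,
}

impl&lt;'a&gt; Node&lt;'a&gt; {
    fn new(name: &amp;'static str) -&gt; Node&lt;'a&gt; {
        Node { name, next: Cell::new(None) }
    }

    // `&amp;'a self` plays the role of `&amp;'arena self`: calling this pins the
    // lifetime parameter to the region for which `self` is borrowed.
    fn point_at(&amp;'a self, other: &amp;'a Node&lt;'a&gt;) {
        self.next.set(Some(other));
    }

    fn next_name(&amp;self) -&gt; Option&lt;&amp;'static str&gt; {
        self.next.get().map(|node| node.name)
    }
}

// Builds a two-node cycle and reports who each node points at.
fn cycle_names() -&gt; (Option&lt;&amp;'static str&gt;, Option&lt;&amp;'static str&gt;) {
    let a = Node::new("a");
    let b = Node::new("b");
    a.point_at(&amp;b);
    b.point_at(&amp;a); // a -&gt; b -&gt; a: a cycle, in safe code
    (a.next_name(), b.next_name())
}
</code></pre></div></div>

<p>Because <code class="language-plaintext highlighter-rouge">Cell</code> makes <code class="language-plaintext highlighter-rouge">'a</code> invariant, both nodes end up with exactly the same lifetime parameter, just like every <code class="language-plaintext highlighter-rouge">Person</code> in the arena shares <code class="language-plaintext highlighter-rouge">'arena</code>.</p>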

<h3 id="what-about-destructors">What about destructors?</h3>

<p>So a thing I’ve not covered so far is how this can be safe in the presence of destructors. If your arena is allowed to have cyclic references, and you write a destructor reading from those cyclic references, whichever participant in the cycle is destroyed later will have dangling references.</p>

<p>This gets to a <em>really</em> obscure part of Rust, even more obscure than variance. You almost never need to really understand this, beyond “explicit destructors subtly change borrow check behavior”. But it’s useful to know to get a better mental model of what’s going on here.</p>

<p>If we add the following code to our arena example:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">impl</span><span class="o">&lt;</span><span class="nv">'arena</span><span class="o">&gt;</span> <span class="nb">Drop</span> <span class="k">for</span> <span class="n">Person</span><span class="o">&lt;</span><span class="nv">'arena</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="nf">drop</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
        <span class="nd">println!</span><span class="p">(</span><span class="s">"goodbye {:?}"</span><span class="p">,</span> <span class="k">self</span><span class="py">.name</span><span class="p">);</span>
        <span class="k">for</span> <span class="n">friend</span> <span class="k">in</span> <span class="o">&amp;</span><span class="k">self</span><span class="py">.reverse_follows</span> <span class="p">{</span>
            <span class="c1">// potentially dangling!</span>
            <span class="nd">println!</span><span class="p">(</span><span class="s">"</span><span class="se">\t\t</span><span class="s">{}"</span><span class="p">,</span> <span class="n">friend</span><span class="py">.name</span><span class="p">);</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>we actually get this error:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error[E0597]: `arena` does not live long enough
  --&gt; examples/mutable_arena.rs:5:18
   |
5  |     let lonely = arena.add_person("lonely", vec![]);
   |                  ^^^^^ borrowed value does not live long enough
...
11 | }
   | -
   | |
   | `arena` dropped here while still borrowed
   | borrow might be used here, when `arena` is dropped and runs the destructor for type `Arena&lt;'_&gt;`
</code></pre></div></div>

<p>The presence of destructors subtly changes the behavior of the borrow checker around self-referential lifetimes. The exact rules are tricky and <a href="https://doc.rust-lang.org/nomicon/dropck.html">explained in the nomicon</a>, but <em>essentially</em> what happened was that the existence of a custom destructor on <code class="language-plaintext highlighter-rouge">Person&lt;'arena&gt;</code> made <code class="language-plaintext highlighter-rouge">'arena</code> in <code class="language-plaintext highlighter-rouge">Person</code> (and thus <code class="language-plaintext highlighter-rouge">Arena</code>) a lifetime which is “observed during destruction”. This is then taken into account during borrow checking: the implicit <code class="language-plaintext highlighter-rouge">drop()</code> at the end of the scope is now known to be able to read <code class="language-plaintext highlighter-rouge">'arena</code> data, and Rust draws the appropriate conclusion that <code class="language-plaintext highlighter-rouge">drop()</code> could read things after they’ve been cleaned up, since destruction is itself a mutating operation and <code class="language-plaintext highlighter-rouge">drop()</code> runs interspersed with it.</p>

<p>Of course, a reasonable question to ask is how we can store things like <code class="language-plaintext highlighter-rouge">Box</code> and <code class="language-plaintext highlighter-rouge">FrozenVec</code> in this arena if destructors aren’t allowed to “wrap” types with <code class="language-plaintext highlighter-rouge">'arena</code>. The reason is that Rust knows that <code class="language-plaintext highlighter-rouge">Drop</code> on <code class="language-plaintext highlighter-rouge">Box</code> <em>cannot</em> inspect <code class="language-plaintext highlighter-rouge">person.follows</code> because <code class="language-plaintext highlighter-rouge">Box</code> does not even know what <code class="language-plaintext highlighter-rouge">Person</code> is, and has promised to never try and find out. This wouldn’t necessarily be true if we had a random generic type since the destructor can call trait methods (or specialized blanket methods) which <em>do</em> know how to read the contents of <code class="language-plaintext highlighter-rouge">Person</code>, but in such a case the subtly changed borrow checker rules would kick in again. The stdlib types and other custom data structures achieve this with an escape hatch, <a href="https://doc.rust-lang.org/nomicon/dropck.html#an-escape-hatch"><code class="language-plaintext highlighter-rouge">#[may_dangle]</code></a> (also known as “the eyepatch”<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">5</a></sup>), which allows you to pinky swear that you won’t be reading from a lifetime or generic parameter in a custom destructor.</p>
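
<p>A related case that doesn’t need the eyepatch at all: a destructor on a field whose type has no lifetime or generic parameters. As far as I understand the drop check rules, such a destructor can’t possibly observe <code class="language-plaintext highlighter-rouge">'arena</code> data, so cycles stay legal. Here’s a standard-library-only sketch; <code class="language-plaintext highlighter-rouge">Farewell</code>, <code class="language-plaintext highlighter-rouge">Node</code>, <code class="language-plaintext highlighter-rouge">GOODBYES</code> and <code class="language-plaintext highlighter-rouge">drop_a_cycle</code> are hypothetical names, and a <code class="language-plaintext highlighter-rouge">Cell</code> link again stands in for the <code class="language-plaintext highlighter-rouge">FrozenVec</code>s:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::cell::Cell;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many destructors have run, so the effect is observable.
static GOODBYES: AtomicUsize = AtomicUsize::new(0);

// A type with a destructor but no lifetime parameter: its `drop` has no
// way of reaching any `'a` data.
struct Farewell(&amp;'static str);

impl Drop for Farewell {
    fn drop(&amp;mut self) {
        println!("goodbye {}", self.0);
        GOODBYES.fetch_add(1, Ordering::SeqCst);
    }
}

// `Node` itself has no `Drop` impl; it only *contains* something that does.
struct Node&lt;'a&gt; {
    farewell: Farewell,
    next: Cell&lt;Option&lt;&amp;'a Node&lt;'a&gt;&gt;&gt;,
}

// Builds a two-node cycle, drops it, and returns how many destructors ran.
fn drop_a_cycle() -&gt; usize {
    let before = GOODBYES.load(Ordering::SeqCst);
    {
        let a = Node { farewell: Farewell("a"), next: Cell::new(None) };
        let b = Node { farewell: Farewell("b"), next: Cell::new(None) };
        a.next.set(Some(&amp;b));
        b.next.set(Some(&amp;a));
        assert_eq!(a.next.get().map(|n| n.farewell.0), Some("b"));
    } // `b` is dropped first, then `a`; neither destructor follows `next`
    GOODBYES.load(Ordering::SeqCst) - before
}
</code></pre></div></div>

<p>Moving the destructor onto <code class="language-plaintext highlighter-rouge">Node&lt;'a&gt;</code> itself (i.e. writing <code class="language-plaintext highlighter-rouge">impl&lt;'a&gt; Drop for Node&lt;'a&gt;</code>) brings back the “does not live long enough” error, even if the body never touches <code class="language-plaintext highlighter-rouge">next</code>.</p>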

<p>This applies to crates like <a href="https://docs.rs/typed-arena/"><code class="language-plaintext highlighter-rouge">typed-arena</code></a> as well; if you are creating cycles you will not be able to write custom destructors on the types you put in the arena. You <em>can</em> write custom destructors with <a href="https://docs.rs/typed-arena/"><code class="language-plaintext highlighter-rouge">typed-arena</code></a> as long as you refrain from mutating things in ways that can create cycles; so you will not be able to use interior mutability to have one arena entry point to another.</p>

<p><em>Thanks to <a href="https://mpc.sh">Mark Cohen</a> and <a href="https://twitter.com/kneecaw/">Nika Layzell</a> for reviewing drafts of this post.</em></p>
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:0" role="doc-endnote">
      <p>But not in a cyclic way; the borrow checker will enforce this! <a href="#fnref:0" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:1" role="doc-endnote">
      <p>You may wonder how it is safe for destructors to run on cyclic references – after all, the destructor of whichever entry gets destroyed second will be able to read a dangling reference. We’ll cover this later in the post but it has to do with drop check, and specifically that if you attempt to set up cycles, the only explicit destructors allowed on the arena entries themselves will be ones on appropriately marked types. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>The technical term for this is “covariant lifetime” <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>The technical term for this is “contravariant lifetime” <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>Because you’re claiming the destructor “can’t see” the type or lifetime, see? <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Integrating Rust and C++ in Firefox]]></title>
    <link href="http://manishearth.github.io/blog/2021/02/22/integrating-rust-and-c-plus-plus-in-firefox/"/>
    <updated>2021-02-22T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2021/02/22/integrating-rust-and-c-plus-plus-in-firefox</id>
    <content type="html"><![CDATA[<p><em>This post was originally drafted in August 2018, but I never got around to finishing it. As such, parts of its framing (e.g. the focus on bindgen) are outdated, given the relative infancy of the interop space at the time. I was recently told that the post is still useful in this form so I decided to finish and publish it anyway, while attempting to mark outdated things as such when I notice them. Everything after the allocators section was written near the time of publication.</em></p>

<p>In 2017 I worked on the <a href="https://hacks.mozilla.org/2017/08/inside-a-super-fast-css-engine-quantum-css-aka-stylo/">Stylo</a> project, uplifting Servo’s CSS engine (“style system”) into Firefox’s browser engine
(“Gecko”). This involved a <em>lot</em> of gnarly FFI between Servo’s Rust codebase and Firefox’s C++ codebase. There were a
lot of challenges in doing this, and I feel like it’s worth sharing things from our experiences.</p>

<p>If you’re interested in Rust integrations, you may find <a href="https://www.youtube.com/watch?v=x9acx2zgx4Q">this talk by Katharina on Rust - C++ FFI</a>, and <a href="https://hsivonen.fi/modern-cpp-in-rust/">this blog post by Henri on integrating encoding-rs into Firefox</a> useful as well.</p>

<h2 id="who-is-this-post-for">Who is this post for?</h2>

<p>So, right off the bat, I’ll mention that when integrating Rust into a C++ codebase, you
want to <em>avoid</em> having integrations as tight as Stylo. Don’t do what we did; make your Rust
component mostly self-contained so that you just have to maintain something like ten FFI functions
for interacting with it. If this is possible to do, you should do it and your life will be <em>much</em> easier. Pick a clean API boundary, define a straightforward API, use cbindgen or bindgen if necessary without any tricks, and you should be good to go.</p>

<p>That said, sometimes you <em>have</em> to have gnarly integrations, and this blog post is for those use cases.
These techniques mostly use bindgen in their examples, however you can potentially use them with hand-rolled bindings or another tool as well. If you’re at this level of complexity, however, the potential for mistakes in the hand-rolled bindings is probably not worth it.</p>

<p><em>Note from 2021: <a href="https://github.com/dtolnay/cxx">cxx</a> is probably a better tool for many of the use cases here, though many of the techniques still transfer.</em></p>

<h2 id="what-was-involved-in-stylos-ffi">What was involved in Stylo’s FFI?</h2>

<p>So, what made Stylo’s FFI so complicated?</p>

<p>It turns out that browsers are quite monolithic. You can split them into vaguely-defined components, but
these components are still tightly integrated. If you intend to replace a component, you may need to
make a jagged edge of an integration surface.</p>

<p>The style system is more self-contained than other parts, but it’s still quite tightly integrated.</p>

<p>The main job of a “style system” is to take the CSS rules and DOM tree, and run them through “the cascade”<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>
with an output of “computed styles” tagged on each node in the tree. So, for example, it will take a document like
the following:</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;style </span><span class="na">type=</span><span class="s">"text/css"</span><span class="nt">&gt;</span>
    <span class="nt">body</span> <span class="p">{</span>
        <span class="nl">font-size</span><span class="p">:</span> <span class="m">12px</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="nt">div</span> <span class="p">{</span>
        <span class="nl">height</span><span class="p">:</span> <span class="m">2em</span><span class="p">;</span>
    <span class="p">}</span>
<span class="nt">&lt;/style&gt;</span>
<span class="nt">&lt;body&gt;</span>
    <span class="nt">&lt;div</span> <span class="na">id=</span><span class="s">"foo"</span><span class="nt">&gt;&lt;/div&gt;</span>

<span class="nt">&lt;/body&gt;</span>
</code></pre></div></div>

<p>and turn it into something like:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">&lt;body&gt;</code> has a <code class="language-plaintext highlighter-rouge">font-size</code> of <code class="language-plaintext highlighter-rouge">12px</code>, everything else is the default</li>
  <li>the <code class="language-plaintext highlighter-rouge">div</code> <code class="language-plaintext highlighter-rouge">#foo</code> has a computed <code class="language-plaintext highlighter-rouge">height</code> of <code class="language-plaintext highlighter-rouge">24px</code> <sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">2</a></sup>, everything else is the default. It “inherits” the <code class="language-plaintext highlighter-rouge">font-size</code> from <code class="language-plaintext highlighter-rouge">&lt;body&gt;</code> as <code class="language-plaintext highlighter-rouge">12px</code></li>
</ul>

<p>From a code point of view, this means that Stylo takes in Gecko’s C++ DOM tree. It parses all the CSS,
and then runs the cascade on the tree. It stores computed styles on each element in a way that Gecko can read
very cheaply.</p>

<p>Style computation can involve some complex steps that require calling back into C++ code. Servo’s style system
is multithreaded, but Gecko is mostly designed to work off of a single main thread per process, so we need to
deal with this impedance mismatch.</p>

<p>Since the output of Stylo is C++-readable structs, Stylo needs to be able to read and write nontrivial C++
abstractions. Typical FFI involves passing values over a boundary, never to be seen again, however here we’re
dealing with persistent state that is accessed by both sides.</p>

<p>To sum up, we have:</p>

<ul>
  <li>Lots and lots of back-and-forth FFI</li>
  <li>Thread safety concerns</li>
  <li>Rust code regularly dealing with nontrivial C++ abstractions</li>
  <li>A need for nontrivial abstractions to be passed over FFI</li>
</ul>

<p>All of this conspires to make for some really complicated FFI code.</p>

<h1 id="the-actual-techniques">The actual techniques</h1>

<p>I’ll try to structure this so that the more broadly useful (and/or less gnarly) techniques come earlier in the post.</p>

<h2 id="the-basics-of-bindgen">The basics of bindgen</h2>

<p><a href="https://github.com/rust-lang-nursery/rust-bindgen/">Bindgen</a> is a tool that generates Rust bindings for structs and functions from the provided C or C++ header files. It’s often used for writing Rust bindings to existing C/C++ libraries, however it’s useful for integrations as well.</p>

<p>To use it for an integration, write a header file containing the functions your Rust code needs (referencing structs from other header files if necessary), and <a href="https://rust-lang-nursery.github.io/rust-bindgen/command-line-usage.html">run bindgen on it</a>. For some codebases, doing this once and
checking in the generated file suffices, but if your C++ code is going to change a lot, <a href="https://rust-lang-nursery.github.io/rust-bindgen/tutorial-1.html">run it as a build dependency instead</a>. Beware that this can adversely impact build times, since your Rust build now has a partial
C++ compilation step.</p>

<p>For large C++ codebases, pulling in a single header will likely pull in a <em>lot</em> of stuff. You should <a href="https://rust-lang.github.io/rust-bindgen/allowlisting.html">allowlist</a>, <a href="https://rust-lang.github.io/rust-bindgen/blocklisting.html">blocklist</a>, and/or mark things as <a href="https://rust-lang.github.io/rust-bindgen/opaque.html">opaque</a> to reduce the number of bindings generated. It’s best to go the allowlisting route: give bindgen an allowlist of functions / structs to generate bindings for, and it will transitively generate bindings for any dependencies they may have. Sometimes even this will end up generating a lot, so it’s worth finding structs you’re not using and marking them as opaque so that their bindings aren’t necessary. Marking something as opaque replaces it with an array of the appropriate size and alignment, so from the Rust side it’s just some bits you don’t care about and can’t introspect further.</p>

<p>Bindgen <a href="https://rust-lang-nursery.github.io/rust-bindgen/cpp.html"><em>does</em> support some C++ features</a> (you may need to pass <code class="language-plaintext highlighter-rouge">-x c++</code>). This is pretty good for generating bindings to e.g. templated structs. However, it’s not possible to support <em>all</em> C++ features here, so you may need to blocklist, opaqueify, or use intermediate types if you have some complicated C++ abstractions in the deps. You’ll typically get an error when generating bindings or when compiling the generated bindings, so don’t worry about this unless that happens.</p>

<p>Bindgen is <em>quite</em> configurable. Stylo has a <a href="https://searchfox.org/mozilla-central/rev/819cd31a93fd50b7167979607371878c4d6f18e8/servo/components/style/build_gecko.rs">script</a> that consumes a <a href="https://searchfox.org/mozilla-central/source/layout/style/ServoBindings.toml">large toml file</a> containing all of the configuration.</p>

<h2 id="cbindgen">cbindgen</h2>

<p>We don’t use <a href="https://github.com/eqrion/cbindgen">cbindgen</a> in Stylo, but it’s used for Webrender. It does the inverse of what bindgen does: given a Rust crate, it generates C headers for its public <code class="language-plaintext highlighter-rouge">extern "C"</code> API. It’s also quite configurable.</p>

<h2 id="cxx">cxx</h2>

<p><a href="https://github.com/dtolnay/cxx">cxx</a> is the cool new hotness in 2021, which kind of approaches the problem from both sides, enabling you to write Rust bindings for C++ and C++ bindings for Rust. It’s definitely worth checking out: a lot of the things that are hard to make work with bindgen are trivial in cxx. For example, it automatically figures out what types need to be opaque, it automatically converts between <code class="language-plaintext highlighter-rouge">&amp;T</code> and <code class="language-plaintext highlighter-rouge">T*</code> across FFI, and it is overall more targeted for the use case of an FFI layer where Rust and C++ both call each other.</p>

<h2 id="bindgen-aided-c-calling-rust">Bindgen-aided C++ calling Rust</h2>

<p>So bindgen helps with creating things for Rust to call and manipulate, but not in the opposite direction. cbindgen can help here, but I’m not sure if it’s advisable to have <em>both</em> bindgen and cbindgen operating near each other on the same codebase.</p>

<p>In Stylo we use a bit of a hack for this. Firstly, all FFI functions defined in C++ that Rust calls are declared in <a href="https://searchfox.org/mozilla-central/rev/819cd31a93fd50b7167979607371878c4d6f18e8/layout/style/ServoBindingList.h">one file</a>, and are all named <code class="language-plaintext highlighter-rouge">Gecko_*</code>. Bindgen supports regexes for things like allowlisting, so this naming scheme makes it easy to deal with.</p>

<p>We also declare the FFI functions defined in Rust that C++ calls in <a href="https://searchfox.org/mozilla-central/rev/819cd31a93fd50b7167979607371878c4d6f18e8/layout/style/ServoBindingList.h">another file</a>, named <code class="language-plaintext highlighter-rouge">Servo_*</code>. They’re also all <a href="https://searchfox.org/mozilla-central/rev/819cd31a93fd50b7167979607371878c4d6f18e8/servo/ports/geckolib/glue.rs">defined in one place</a>.</p>

<p>However, there’s nothing ensuring that the signatures match! If we’re not careful, there may be mismatches, causing bad things to happen at link time or runtime. We use a small <a href="https://searchfox.org/mozilla-central/rev/819cd31a93fd50b7167979607371878c4d6f18e8/servo/ports/geckolib/tests/build.rs">autogenerated</a> <a href="https://searchfox.org/mozilla-central/rev/819cd31a93fd50b7167979607371878c4d6f18e8/servo/ports/geckolib/tests/servo_function_signatures.rs">unit test</a> to ensure the validity of the signatures.</p>

<p>This is especially important as we do things like type replacement, and we need tests to ensure that the rug isn’t pulled out from underneath us.</p>

<h2 id="type-replacing-for-fun-and-profit">Type replacing for fun and profit</h2>

<p>Using <a href="https://rust-lang.github.io/rust-bindgen/blocklisting.html">blocklisting</a> in conjunction with the <code class="language-plaintext highlighter-rouge">--raw-line</code>/<code class="language-plaintext highlighter-rouge">raw_line()</code> flag, one can effectively ask bindgen to “replace” types. Blocklisting asks bindgen not to generate bindings for a type, however bindgen will continue to generate bindings <em>referring</em> to that type if necessary (unlike opaque types, where bindgen generates an opaque binding for the type and uses it everywhere). <code class="language-plaintext highlighter-rouge">--raw-line</code> lets you request bindgen to add a line of raw Rust code to the file, and such a line can potentially define or import a new version of the type you blocklisted. Effectively, this lets you replace types.</p>

<p>Bindgen generates unit tests ensuring that the layout of your structs is correct (run them!), so if you accidentally replace a type with something incompatible, you will get warnings at the struct level (functions may not warn).</p>
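<p>To give a feel for what such a layout check guards against, here is a minimal sketch. It is not bindgen’s actual output: <code class="language-plaintext highlighter-rouge">GeneratedString</code>, <code class="language-plaintext highlighter-rouge">NiceString</code>, their fields, and <code class="language-plaintext highlighter-rouge">layouts_match</code> are all hypothetical names, and it compares two Rust types with each other rather than a Rust type against the C++ compiler’s view of the struct.</p>

```rust
use std::mem::{align_of, size_of};

/// Stand-in for the struct bindgen would have generated (hypothetical fields).
#[repr(C)]
pub struct GeneratedString {
    pub data: *mut u16,
    pub length: u32,
    pub flags: u32,
}

/// Hypothetical hand-written replacement that adds a nicer API
/// but must keep exactly the same layout.
#[repr(C)]
pub struct NiceString {
    pub data: *mut u16,
    pub length: u32,
    pub flags: u32,
}

impl NiceString {
    /// Length in code units, as stored in the header.
    pub fn length(&self) -> usize {
        self.length as usize
    }
}

/// True if `A` and `B` have the same size and alignment.
/// This is a necessary condition for a replacement, not a sufficient one:
/// field order and field types still have to agree.
pub fn layouts_match<A, B>() -> bool {
    size_of::<A>() == size_of::<B>() && align_of::<A>() == align_of::<B>()
}

fn main() {
    assert!(layouts_match::<GeneratedString, NiceString>());
    // A type with a different size is rejected.
    assert!(!layouts_match::<GeneratedString, u8>());
}
```

<p>A check like this only catches size and alignment drift, which is why the signature tests mentioned earlier are still worth having alongside it.</p>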

<p>There are various ways this can be used:</p>

<h3 id="safe-references-across-ffi">Safe references across FFI</h3>

<p><em>Note from 2021: <a href="https://github.com/dtolnay/cxx">cxx</a> does this automatically</em></p>

<p>Calling into C++ (and accepting data from C++) is unsafe. However, there’s no reason to make this more error-prone than it needs to be. For example, it would be nice if accessor FFI functions – functions which take a foreign object and return something from inside it –  could use lifetimes. It would be even nicer if nullability were represented on the FFI boundary so that you don’t miss null checks, and can assume non-nullness when the C++ API is okay with it.</p>

<p>In Stylo, we have lots of functions like the following:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">RawGeckoNodeBorrowedOrNull</span> <span class="nf">Gecko_GetLastChild</span><span class="p">(</span><span class="n">RawGeckoNodeBorrowed</span> <span class="n">node</span><span class="p">);</span>
</code></pre></div></div>

<p>which bindgen translates to:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">extern</span> <span class="s">"C"</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="nf">Gecko_GetLastChild</span><span class="p">(</span><span class="n">x</span><span class="p">:</span> <span class="o">&amp;</span><span class="n">RawGeckoNode</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nb">Option</span><span class="o">&lt;&amp;</span><span class="n">RawGeckoNode</span><span class="o">&gt;</span><span class="p">;</span>   
<span class="p">}</span>
</code></pre></div></div>

<p>Using the <a href="https://searchfox.org/mozilla-central/rev/819cd31a93fd50b7167979607371878c4d6f18e8/servo/components/style/build_gecko.rs">bindgen build script</a> on a provided <a href="https://searchfox.org/mozilla-central/rev/819cd31a93fd50b7167979607371878c4d6f18e8/layout/style/ServoBindings.toml#648-671">list of borrow-able types</a>, we’ve told bindgen that:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">FooBorrowedOrNull</code> is actually <code class="language-plaintext highlighter-rouge">Option&lt;&amp;Foo&gt;</code></li>
  <li><code class="language-plaintext highlighter-rouge">FooBorrowed</code> is actually <code class="language-plaintext highlighter-rouge">&amp;Foo</code></li>
</ul>

<p><code class="language-plaintext highlighter-rouge">Option&lt;&amp;Foo&gt;</code> <a href="https://doc.rust-lang.org/nomicon/repr-rust.html">is represented as a single nullable pointer in Rust</a>, so this is a clean translation. 
We’re forced to null-check it, but once we do we can safely assume that the reference is valid. Furthermore, due to lifetime elision the actual signature of the FFI function is <code class="language-plaintext highlighter-rouge">fn Gecko_GetLastChild&lt;'a&gt;(x: &amp;'a RawGeckoNode) -&gt; Option&lt;&amp;'a RawGeckoNode&gt;</code>, which ensures we won’t let the returned reference outlive the passed reference. Lifetime elision means that we can call C++ functions “safely” with the appropriate lifetime requirements, even though C++ has no such concept!</p>

<p>Note that this is shifting some of the safety invariants to the C++ side: We rely on the C++ to give us valid references, and we rely on it to not have nulls when the type is not marked as nullable. Most C++ codebases internally rely on such invariants for safety anyway, so this isn’t much of a stretch.</p>

<p>We do this on both sides, actually: Many of our Rust-defined <code class="language-plaintext highlighter-rouge">extern "C"</code> functions that C++ calls get to be internally-safe because the types let us assume the validity of the pointers obtained from C++.</p>
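<p>The following is a small, self-contained sketch of why this translation is sound. It assumes a made-up <code class="language-plaintext highlighter-rouge">Node</code> struct and a made-up accessor <code class="language-plaintext highlighter-rouge">node_get_last_child</code>; the accessor is written in Rust here so the example runs on its own, whereas in Stylo the equivalent function lives in C++.</p>

```rust
use std::mem::size_of;

/// Hypothetical stand-in for a bindgen-generated struct.
#[repr(C)]
pub struct Node {
    pub value: i32,
    pub last_child: *const Node,
}

/// Stand-in for a C++-defined accessor. The elided lifetime ties the
/// returned reference to `node`, just like in the bindgen output above.
pub extern "C" fn node_get_last_child(node: &Node) -> Option<&Node> {
    // SAFETY: in this sketch `last_child` is either null or points at a
    // live `Node`; across real FFI this is the invariant C++ must uphold.
    unsafe { node.last_child.as_ref() }
}

fn main() {
    // `Option<&T>` has the same size as a raw pointer: `None` is the null pointer.
    assert_eq!(size_of::<Option<&Node>>(), size_of::<*const Node>());

    let child = Node { value: 2, last_child: std::ptr::null() };
    let parent = Node { value: 1, last_child: &child };

    // The type forces a null check before the reference can be used.
    match node_get_last_child(&parent) {
        Some(c) => assert_eq!(c.value, 2),
        None => unreachable!("parent has a child in this sketch"),
    }
    assert!(node_get_last_child(&child).is_none());
}
```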

<h3 id="making-c-abstractions-rust-accessible">Making C++ abstractions Rust-accessible</h3>

<p>A very useful thing to do here is to replace various C++ abstractions with Rust versions of them that share semantics. In Gecko, most strings are stored in <code class="language-plaintext highlighter-rouge">nsString</code>/<code class="language-plaintext highlighter-rouge">nsAString</code>/etc.</p>

<p>We’ve written an <a href="https://searchfox.org/mozilla-central/rev/6ddb5fb144993fb5de044e2e8d900d7643b98a4d/servo/support/gecko/nsstring/src/lib.rs">nsstring</a> crate that represents layout-compatible <code class="language-plaintext highlighter-rouge">nsString</code>s in a more Rusty way, with Rusty APIs. We then ask bindgen to replace Gecko <code class="language-plaintext highlighter-rouge">nsString</code>s with these.</p>

<p>Usually it’s easier to just write an impl for the bindgen-generated abstraction, however sometimes you must replace it:</p>

<ul>
  <li>When the abstraction internally does a lot of template stuff not supported by bindgen</li>
  <li>When you want the code for the abstraction to be in a separate crate</li>
</ul>

<h2 id="potential-pitfall-passing-c-classes-by-value-over-ffi">Potential pitfall: Passing C++ classes by-value over FFI</h2>

<p>It’s quite tempting to do stuff like</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">RefPtr</span><span class="o">&lt;</span><span class="n">Foo</span><span class="o">&gt;</span> <span class="n">Servo_Gimme</span><span class="p">(...);</span>
</code></pre></div></div>

<p>where you pass complicated classes by-value over FFI (<code class="language-plaintext highlighter-rouge">RefPtr</code> is Gecko’s variant of <code class="language-plaintext highlighter-rouge">Rc&lt;T&gt;</code>/<code class="language-plaintext highlighter-rouge">Arc&lt;T&gt;</code>).</p>

<p>This works on some systems, but is broken on MSVC:
<a href="https://github.com/rust-lang/rust/issues/38258">The ABI for passing non-POD types through functions is different</a>. The linker usually notices this and complains, but it’s worth avoiding this entirely.</p>

<p>In Stylo we handle this by using some macro-generated intermediate types which are basically the same thing as the original class but without any constructors/destructors/operators. We convert to/from these types immediately before/after the FFI call, and on the Rust side we do similar conversions to Rust-compatible abstractions.</p>
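<p>On the Rust side, the idea can be sketched roughly as follows. This is not Stylo’s macro-generated code: <code class="language-plaintext highlighter-rouge">Foo</code>, <code class="language-plaintext highlighter-rouge">FooStrong</code> and <code class="language-plaintext highlighter-rouge">into_box</code> are hypothetical names, and a <code class="language-plaintext highlighter-rouge">Box</code> stands in for the refcounted pointer to keep the example short.</p>

```rust
use std::mem::size_of;

/// Hypothetical payload type.
pub struct Foo {
    pub value: u32,
}

/// Plain-old-data carrier: a single pointer with no constructor, destructor
/// or operators, so it is passed across FFI like a raw pointer.
#[repr(C)]
pub struct FooStrong {
    ptr: *mut Foo,
}

/// Hands ownership out through the carrier type instead of by-value `Box<Foo>`.
#[allow(non_snake_case)]
pub extern "C" fn Servo_Gimme(value: u32) -> FooStrong {
    FooStrong { ptr: Box::into_raw(Box::new(Foo { value })) }
}

impl FooStrong {
    /// Converts back into an owning type right after the FFI call.
    ///
    /// # Safety
    /// `self` must come from `Servo_Gimme` and must not have been converted before.
    pub unsafe fn into_box(self) -> Box<Foo> {
        unsafe { Box::from_raw(self.ptr) }
    }
}

fn main() {
    // The carrier is exactly pointer-sized.
    assert_eq!(size_of::<FooStrong>(), size_of::<*mut Foo>());

    let strong = Servo_Gimme(7);
    // Convert immediately; the carrier itself never frees anything.
    let foo = unsafe { strong.into_box() };
    assert_eq!(foo.value, 7);
}
```

<p>Because the carrier has no <code class="language-plaintext highlighter-rouge">Drop</code>, forgetting the conversion leaks rather than double-frees, which is the safer failure mode for a type that only exists for the duration of a call.</p>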

<h2 id="sharing-abstractions-with-destructors">Sharing abstractions with destructors</h2>

<p>If you’re passing ownership of collections or other templated types across FFI, you probably want Rust code to be able to destroy C++ objects, and vice versa.</p>

<p>One way of doing this is to implement <code class="language-plaintext highlighter-rouge">Drop</code> on the generated struct. If you have <code class="language-plaintext highlighter-rouge">class MyString</code>, you can do:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">MyString</span> <span class="p">{</span>
<span class="nl">public:</span>
    <span class="c1">// ...</span>
    <span class="o">~</span><span class="n">MyString</span><span class="p">();</span>
<span class="p">};</span>

<span class="c1">// Runs the destructor in place; does not free the object's own storage</span>
<span class="kt">void</span> <span class="nf">MyString_Destroy</span><span class="p">(</span><span class="n">MyString</span><span class="o">*</span> <span class="n">x</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">x</span><span class="o">-&gt;~</span><span class="n">MyString</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">impl</span> <span class="nb">Drop</span> <span class="k">for</span> <span class="nn">bindings</span><span class="p">::</span><span class="n">MyString</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="nf">drop</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
        <span class="c1">// Both functions are FFI calls, so they need an unsafe block</span>
        <span class="k">unsafe</span> <span class="p">{</span>
            <span class="c1">// (bindgen only)</span>
            <span class="nn">bindings</span><span class="p">::</span><span class="nn">MyString</span><span class="p">::</span><span class="nf">destruct</span><span class="p">(</span><span class="k">self</span><span class="p">)</span>
            <span class="c1">// OR</span>
            <span class="c1">// bindings::MyString_Destroy(self)</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">MyString_Destroy</code> isn’t necessary with bindgen – bindgen will generate a <code class="language-plaintext highlighter-rouge">MyString::destruct()</code> function for you – but be careful: this makes your generated bindings very platform-specific, so only do this if you generate the bindings at build time. In general, when bindgen generates C++ <em>methods</em>, your bindings become platform specific and are best regenerated at build time<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">3</a></sup>.</p>

<p>In Stylo we went down the route of manually defining <code class="language-plaintext highlighter-rouge">_Destroy()</code> functions since we started off with checked-in platform-agnostic bindings, however we could probably switch to using <code class="language-plaintext highlighter-rouge">destruct()</code> if we want to now.</p>

<p>When it comes to generic types, it’s a bit trickier, since <code class="language-plaintext highlighter-rouge">Drop</code> can’t be implemented piecewise on a generic type (you cannot <code class="language-plaintext highlighter-rouge">impl Drop for MyVector&lt;Foo&gt;</code>). You have to do something like:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="k">class</span> <span class="nc">MyVector</span> <span class="p">{</span>
    <span class="c1">// ...</span>
<span class="p">};</span>

<span class="c1">// Deallocate buffer, but do not call destructors on elements</span>
<span class="kt">void</span> <span class="nf">MyVector_Deallocate_Buffer</span><span class="p">(</span><span class="n">MyVector</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">&gt;*</span> <span class="n">x</span><span class="p">);</span>
</code></pre></div></div>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// assume we have an implementation of Iterator for MyVector&lt;T&gt; somewhere</span>

<span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="nb">Drop</span> <span class="k">for</span> <span class="nn">bindings</span><span class="p">::</span><span class="n">MyVector</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="nf">drop</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>
        <span class="c1">// drop_in_place and the FFI call are both unsafe</span>
        <span class="k">unsafe</span> <span class="p">{</span>
            <span class="k">for</span> <span class="n">v</span> <span class="k">in</span> <span class="k">self</span><span class="nf">.iter_mut</span><span class="p">()</span> <span class="p">{</span>
                <span class="c1">// calls the destructor for `v`, if any</span>
                <span class="nn">std</span><span class="p">::</span><span class="nn">ptr</span><span class="p">::</span><span class="nf">drop_in_place</span><span class="p">(</span><span class="n">v</span><span class="p">)</span>
            <span class="p">}</span>
            <span class="nn">bindings</span><span class="p">::</span><span class="nf">MyVector_Deallocate_Buffer</span><span class="p">(</span><span class="k">self</span> <span class="k">as</span> <span class="o">*</span><span class="k">mut</span> <span class="n">MyVector</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="k">as</span> <span class="o">*</span><span class="k">mut</span> <span class="n">MyVector</span><span class="o">&lt;</span><span class="nb">c_void</span><span class="o">&gt;</span><span class="p">)</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>

</code></pre></div></div>

<p>Note that if you forget to add a <code class="language-plaintext highlighter-rouge">Drop</code> implementation for <code class="language-plaintext highlighter-rouge">T</code>, this will silently forget to clean up the contents of the vector. See <a href="#mirror-types">the next section</a> for some ways to handle this by creating a “safe” mirror type.</p>

<h2 id="mirror-types">Mirror types</h2>

<p>C++ libraries often have useful templated abstractions, and it’s nice to be able to manipulate them from Rust. Sometimes, it’s possible to just tack on semantics on the Rust side (either by adding an implementation or by doing type replacement), but in some cases this is tricky.</p>

<p>For example, Gecko has <code class="language-plaintext highlighter-rouge">RefPtr&lt;T&gt;</code>, which is similar to <code class="language-plaintext highlighter-rouge">Rc&lt;T&gt;</code>, except the actual refcounting logic is up to <code class="language-plaintext highlighter-rouge">T</code> to implement (it can choose between threadsafe, non-threadsafe, etc), which it does by writing <code class="language-plaintext highlighter-rouge">AddRef()</code> and <code class="language-plaintext highlighter-rouge">Release()</code> methods.</p>

<p>We mirror this in Rust by having a trait:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cd">/// Trait for all objects that have Addref() and Release</span>
<span class="cd">/// methods and can be placed inside RefPtr&lt;T&gt;</span>
<span class="k">pub</span> <span class="k">unsafe</span> <span class="k">trait</span> <span class="n">RefCounted</span> <span class="p">{</span>
    <span class="cd">/// Bump the reference count.</span>
    <span class="k">fn</span> <span class="nf">addref</span><span class="p">(</span><span class="o">&amp;</span><span class="k">self</span><span class="p">);</span>
    <span class="cd">/// Decrease the reference count.</span>
    <span class="k">unsafe</span> <span class="k">fn</span> <span class="nf">release</span><span class="p">(</span><span class="o">&amp;</span><span class="k">self</span><span class="p">);</span>
<span class="p">}</span>

<span class="cd">/// A custom RefPtr implementation to take into account Drop semantics and</span>
<span class="cd">/// a bit less-painful memory management.</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">RefPtr</span><span class="o">&lt;</span><span class="n">T</span><span class="p">:</span> <span class="n">RefCounted</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="n">ptr</span><span class="p">:</span> <span class="o">*</span><span class="k">mut</span> <span class="n">T</span><span class="p">,</span>
    <span class="n">_marker</span><span class="p">:</span> <span class="n">PhantomData</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">,</span>
<span class="p">}</span>
</code></pre></div></div>

<p>We implement the <code class="language-plaintext highlighter-rouge">RefCounted</code> trait for C++ types that are wrapped in <code class="language-plaintext highlighter-rouge">RefPtr</code> which we wish to access through Rust. We have <a href="https://searchfox.org/mozilla-central/rev/cfaa5a1d48d6bc6552199e73004ecb05d0a9c921/servo/components/style/gecko_bindings/sugar/refptr.rs#258-315">some</a> <a href="https://searchfox.org/mozilla-central/rev/cfaa5a1d48d6bc6552199e73004ecb05d0a9c921/layout/style/GeckoBindings.h#52-60">macros</a> that make this easier to do. We have to have such a trait, because otherwise Rust code wouldn’t know how to manage various C++ types.</p>

<p>However, <code class="language-plaintext highlighter-rouge">RefPtr&lt;T&gt;</code> here can’t be the type that ends up being used in bindgen. Rust doesn’t let us do things like <code class="language-plaintext highlighter-rouge">impl&lt;T: RefCounted&gt; Drop for RefPtr&lt;T&gt;</code> <sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">4</a></sup>, so we can’t effectively make this work with the bindgen generated type unless we write a <code class="language-plaintext highlighter-rouge">RefCounted</code> implementation for every refcounted type that shows up in the bindgen output at all – which would be a lot of work.</p>

<p>Instead, we let bindgen generate its own <code class="language-plaintext highlighter-rouge">RefPtr&lt;T&gt;</code>, called <code class="language-plaintext highlighter-rouge">structs::RefPtr&lt;T&gt;</code> (all the structs that bindgen generates for Gecko go in a <code class="language-plaintext highlighter-rouge">structs::</code> module). <code class="language-plaintext highlighter-rouge">structs::RefPtr&lt;T&gt;</code> itself doesn’t have enough semantics to be something we can pass around willy-nilly in Rust code without causing leaks. However, it has <a href="https://searchfox.org/mozilla-central/rev/cfaa5a1d48d6bc6552199e73004ecb05d0a9c921/servo/components/style/gecko_bindings/sugar/refptr.rs#150-234">some methods</a> that allow for conversion into the “safe” mirror <code class="language-plaintext highlighter-rouge">RefPtr&lt;T&gt;</code> (but only if <code class="language-plaintext highlighter-rouge">T: RefCounted</code>). So if you need to manipulate a <code class="language-plaintext highlighter-rouge">RefPtr&lt;T&gt;</code> in a C++ struct somewhere, you immediately use one of the conversion methods to get a safe version of it first, and <em>then</em> do things to it. Refcounted types that don’t have the <code class="language-plaintext highlighter-rouge">RefCounted</code> implementation won’t have conversion methods: they may exist in the data you’re manipulating, however you won’t be able to work with them.</p>

<p>In general, whenever attaching extra semantics to generic bindgen types doesn’t work, create a mirror type that’s completely safe to use from Rust, with a trait that gates conversion to the mirror type.</p>
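<p>Here is a rough, runnable sketch of the mirror-type pattern. It is a simplification and not Stylo’s actual code: the <code class="language-plaintext highlighter-rouge">to_safe</code> conversion method and the <code class="language-plaintext highlighter-rouge">Counted</code> type are hypothetical, and the refcounted object is a plain Rust struct instead of a C++ class, so <code class="language-plaintext highlighter-rouge">release</code> never actually frees anything.</p>

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::ops::Deref;

/// Same shape as the trait shown above.
pub unsafe trait RefCounted {
    fn addref(&self);
    unsafe fn release(&self);
}

/// Stand-in for the bindgen output: layout only, no semantics, no bound on `T`.
pub mod structs {
    #[repr(C)]
    pub struct RefPtr<T> {
        pub raw: *mut T,
    }
}

/// The "safe" mirror: holds one reference and gives it back on drop.
pub struct RefPtr<T: RefCounted> {
    ptr: *mut T,
    _marker: PhantomData<T>,
}

// The conversion only exists when `T: RefCounted`.
impl<T: RefCounted> structs::RefPtr<T> {
    /// Takes a new reference. Caller must guarantee `raw` is null or valid.
    pub unsafe fn to_safe(&self) -> Option<RefPtr<T>> {
        if self.raw.is_null() {
            return None;
        }
        unsafe { (*self.raw).addref() };
        Some(RefPtr { ptr: self.raw, _marker: PhantomData })
    }
}

// Allowed because the bound matches the one on the mirror struct itself.
impl<T: RefCounted> Drop for RefPtr<T> {
    fn drop(&mut self) {
        unsafe { (*self.ptr).release() }
    }
}

impl<T: RefCounted> Deref for RefPtr<T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { &*self.ptr }
    }
}

/// Hypothetical refcounted object; in Stylo this would be a C++ type.
pub struct Counted {
    pub refcnt: Cell<usize>,
}

unsafe impl RefCounted for Counted {
    fn addref(&self) {
        self.refcnt.set(self.refcnt.get() + 1);
    }
    unsafe fn release(&self) {
        // A real implementation would destroy the object when this hits zero.
        self.refcnt.set(self.refcnt.get() - 1);
    }
}

fn main() {
    let mut obj = Counted { refcnt: Cell::new(1) };
    let raw = structs::RefPtr { raw: &mut obj as *mut Counted };
    {
        let safe = unsafe { raw.to_safe() }.expect("pointer is non-null");
        assert_eq!(safe.refcnt.get(), 2);
    } // `safe` dropped here, releasing its reference
    assert_eq!(obj.refcnt.get(), 1);

    let null = structs::RefPtr::<Counted> { raw: std::ptr::null_mut() };
    assert!(unsafe { null.to_safe() }.is_none());
}
```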

<h2 id="potential-pitfall-allocators">Potential pitfall: Allocators</h2>

<p>If you’re passing heap-managed abstractions across FFI, be careful about which code frees which objects. If your Rust
and C++ code don’t share allocators, deallocating memory allocated on the other side can have disastrous consequences.</p>

<p>If you’re building a cdylib or staticlib with Rust (this is likely if you’re linking it with a C++ application), the compiler will by default pick the system allocator (<code class="language-plaintext highlighter-rouge">malloc</code>), so if your C++ application also uses the system allocator, you’re all set.</p>

<p>On some platforms when building rlibs and binaries, Rust may choose jemalloc instead. It’s also possible that your C++ code uses a different allocator (lots of applications use allocators like jemalloc or tcmalloc, some have their own custom allocators like <code class="language-plaintext highlighter-rouge">tor_malloc</code> in Tor).</p>

<p>In such cases you have one of three options:</p>

<ul>
  <li>Avoid transferring ownership of heap-allocated items, only share things as borrowed references</li>
  <li>Call destructors over FFI, as detailed in <a href="#sharing-abstractions-with-destructors">the section on destructors above</a></li>
  <li>Set Rust’s allocator to be the same as the one C++ uses, as documented <a href="https://doc.rust-lang.org/nightly/std/alloc/#the-global_allocator-attribute">in the <code class="language-plaintext highlighter-rouge">std::alloc</code> module</a>. Basically, you can use the <code class="language-plaintext highlighter-rouge">#[global_allocator]</code> attribute to select which allocator you wish to use, and if necessary you can implement the <code class="language-plaintext highlighter-rouge">GlobalAlloc</code> trait on a custom allocator type that calls into whatever custom allocator C++ is using.</li>
</ul>
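<p>The third option might look roughly like the sketch below. The names <code class="language-plaintext highlighter-rouge">app_alloc</code> and <code class="language-plaintext highlighter-rouge">app_free</code> are hypothetical: in a real integration they would be <code class="language-plaintext highlighter-rouge">extern "C"</code> declarations of functions exported by the C++ application. Here they are defined in Rust on top of <code class="language-plaintext highlighter-rouge">System</code> purely so that the example builds on its own, and the call counter only exists to show that the allocator is actually in use.</p>

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts allocations so the example can show the allocator is really used.
pub static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);

// Stand-ins for the C++ application's allocator entry points (hypothetical
// names). In a real build these would be declarations in an `extern "C"` block.
unsafe extern "C" fn app_alloc(size: usize, align: usize) -> *mut u8 {
    ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
    // Safety: size and align come straight from a valid `Layout`.
    unsafe { System.alloc(Layout::from_size_align_unchecked(size, align)) }
}

unsafe extern "C" fn app_free(ptr: *mut u8, size: usize, align: usize) {
    // Safety: `ptr` was returned by `app_alloc` with the same size and align.
    unsafe { System.dealloc(ptr, Layout::from_size_align_unchecked(size, align)) }
}

/// Zero-sized type that forwards every Rust allocation to the application.
pub struct AppAllocator;

unsafe impl GlobalAlloc for AppAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        unsafe { app_alloc(layout.size(), layout.align()) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { app_free(ptr, layout.size(), layout.align()) }
    }
}

// From here on, `Box`, `Vec`, `String`, etc. all go through `AppAllocator`.
#[global_allocator]
static GLOBAL: AppAllocator = AppAllocator;
```

<p>Note that <code class="language-plaintext highlighter-rouge">GlobalAlloc</code> passes an alignment along with the size, so if the C++ allocator only exposes a <code class="language-plaintext highlighter-rouge">malloc</code>-style interface you need to decide how over-aligned requests are handled.</p>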

<p><em>Note from 2021: Most stdlib collections (<a href="https://doc.rust-lang.org/nightly/std/vec/struct.Vec.html"><code class="language-plaintext highlighter-rouge">Vec</code></a>, for example) now have an optional “custom allocator” parameter that can be used to swap in a different allocator for a specific use site.</em></p>

<h2 id="arcs-over-ffi-triomphe">Arcs over FFI: Triomphe</h2>

<p>This isn’t really a generalizable technique, but it’s pretty cool and generally instructive, so I’m including it here.</p>

<p>Stylo uses a lot of <code class="language-plaintext highlighter-rouge">Arc</code>s. A <em>lot</em> of them. The entire computation of styles makes heavy use of <code class="language-plaintext highlighter-rouge">Arc::make_mut</code>’s copy-on-write semantics so that we can build up the style tree in parallel but not have to make unnecessary copies of duplicated/defaulted styles for each element.</p>

<p>Many of these <code class="language-plaintext highlighter-rouge">Arc</code>s need to be readable from C++. Rust’s <code class="language-plaintext highlighter-rouge">Arc</code>, however, consists of a pointer to an allocation containing a refcount and the data, so if C++ needs to get access to the data it needs to know the layout of the <code class="language-plaintext highlighter-rouge">Arc</code> allocation, which we’d rather not do<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">5</a></sup>.</p>

<p>We picked a different route: We created a crate duplicating <code class="language-plaintext highlighter-rouge">Arc&lt;T&gt;</code> which behaves almost exactly the same as <code class="language-plaintext highlighter-rouge">Arc&lt;T&gt;</code>, but it can be converted to <code class="language-plaintext highlighter-rouge">OffsetArc&lt;T&gt;</code> which has its pointer point to the <em>middle</em> of the allocation, where the <code class="language-plaintext highlighter-rouge">T</code> begins. To C++, this just looks like a <code class="language-plaintext highlighter-rouge">*const T</code>! We were then able to make it work with <code class="language-plaintext highlighter-rouge">RefPtr&lt;T&gt;</code> on the C++ side so that C++ can transparently read from the <code class="language-plaintext highlighter-rouge">OffsetArc&lt;T&gt;</code>, and only needs to call into Rust if it wishes to clone or drop it.</p>
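<p>The offset-pointer idea can be sketched in a few dozen lines. This is a toy under stated assumptions, not triomphe’s implementation: <code class="language-plaintext highlighter-rouge">OffsetArcSketch</code> and <code class="language-plaintext highlighter-rouge">Inner</code> are made-up names, there is no <code class="language-plaintext highlighter-rouge">Send</code>/<code class="language-plaintext highlighter-rouge">Sync</code> support or overflow checking, and it relies on <code class="language-plaintext highlighter-rouge">std::mem::offset_of!</code>, which needs a reasonably recent Rust.</p>

```rust
use std::mem::offset_of;
use std::sync::atomic::{fence, AtomicUsize, Ordering};

/// Heap layout: refcount first, then the payload. `#[repr(C)]` pins the order.
#[repr(C)]
struct Inner<T> {
    count: AtomicUsize,
    data: T,
}

/// Handle whose pointer targets `data`, not the start of the allocation.
pub struct OffsetArcSketch<T> {
    data: *const T,
}

impl<T> OffsetArcSketch<T> {
    pub fn new(data: T) -> Self {
        let inner = Box::into_raw(Box::new(Inner { count: AtomicUsize::new(1), data }));
        // Safety: `inner` came from a live Box, so projecting to a field is fine.
        OffsetArcSketch { data: unsafe { std::ptr::addr_of!((*inner).data) } }
    }

    /// What C++ would see: a plain pointer to `T`.
    pub fn as_ptr(&self) -> *const T {
        self.data
    }

    /// Step back from the payload to the start of the allocation.
    fn inner_ptr(&self) -> *mut Inner<T> {
        let bytes = self.data as *mut u8;
        // Safety: `data` sits `offset_of!(Inner<T>, data)` bytes into the allocation.
        unsafe { bytes.sub(offset_of!(Inner<T>, data)) as *mut Inner<T> }
    }

    fn count(&self) -> &AtomicUsize {
        // Safety: the allocation stays alive as long as any handle exists.
        unsafe { &(*self.inner_ptr()).count }
    }

    pub fn strong_count(&self) -> usize {
        self.count().load(Ordering::Acquire)
    }
}

impl<T> Clone for OffsetArcSketch<T> {
    fn clone(&self) -> Self {
        self.count().fetch_add(1, Ordering::Relaxed);
        OffsetArcSketch { data: self.data }
    }
}

impl<T> Drop for OffsetArcSketch<T> {
    fn drop(&mut self) {
        if self.count().fetch_sub(1, Ordering::Release) == 1 {
            fence(Ordering::Acquire);
            // Safety: this was the last handle; rebuild the Box to free everything.
            unsafe { drop(Box::from_raw(self.inner_ptr())) };
        }
    }
}
```

<p>Reading the value only needs <code class="language-plaintext highlighter-rouge">as_ptr()</code>; the refcount is touched solely in <code class="language-plaintext highlighter-rouge">clone</code> and <code class="language-plaintext highlighter-rouge">drop</code>, which are the two operations C++ would have to call back into Rust for.</p>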

<p>The external version of this crate can be found in <a href="https://docs.rs/triomphe">triomphe</a>. It contains a bunch of other goodies that are additionally useful outside of the FFI world, like <code class="language-plaintext highlighter-rouge">ArcBorrow</code> which is essentially “<code class="language-plaintext highlighter-rouge">&amp;Arc&lt;T&gt;</code> without double indirection”, <code class="language-plaintext highlighter-rouge">UniqueArc&lt;T&gt;</code>, a mutable <code class="language-plaintext highlighter-rouge">Arc&lt;T&gt;</code> known to be uniquely owned, and <code class="language-plaintext highlighter-rouge">ArcUnion&lt;T, U&gt;</code>, which is a space-efficient union of <code class="language-plaintext highlighter-rouge">Arc&lt;T&gt;</code> and <code class="language-plaintext highlighter-rouge">Arc&lt;U&gt;</code>.</p>

<h2 id="other-pitfalls">Other pitfalls</h2>

<h3 id="transparent">Transparent</h3>

<p>It’s <em>very</em> tempting to wrap C++ types in tuple structs and pass them over FFI. For example, one might imagine that the following is okay:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nf">Wrapper</span><span class="p">(</span><span class="nn">bindings</span><span class="p">::</span><span class="n">SomeCppType</span><span class="p">);</span>

<span class="k">extern</span> <span class="s">"C"</span> <span class="p">{</span>
    <span class="c1">// C++ signature: `SomeCppType get_cpp_type();`</span>
    <span class="k">fn</span> <span class="nf">get_cpp_type</span><span class="p">()</span> <span class="k">-&gt;</span> <span class="n">Wrapper</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This kind of thing is quite useful to get around coherence, or for adding additional semantics to a type.</p>

<p>While there’s basically one obvious way <code class="language-plaintext highlighter-rouge">Wrapper</code> can be represented, ABI stuff can be tricky, and Rust’s layout isn’t defined. It is safer to use <code class="language-plaintext highlighter-rouge">#[repr(transparent)]</code>, which guarantees that <code class="language-plaintext highlighter-rouge">Wrapper</code> will have the same representation as the type it contains.</p>
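<p>As a small sketch of the fix: the fields of <code class="language-plaintext highlighter-rouge">SomeCppType</code> below are invented, since the real type would come from bindgen, and <code class="language-plaintext highlighter-rouge">layouts_match</code> is just a helper for checking the guarantee.</p>

```rust
use std::mem::{align_of, size_of};

/// Stand-in for a bindgen-generated type; the fields are hypothetical.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct SomeCppType {
    pub id: u32,
    pub weight: f32,
}

/// Guaranteed to have the same size, alignment, and ABI as `SomeCppType`,
/// while still being a distinct type you can hang your own impls on.
#[repr(transparent)]
pub struct Wrapper(pub SomeCppType);

/// Helper for this sketch: confirms the layout guarantee holds.
pub fn layouts_match() -> bool {
    size_of::<Wrapper>() == size_of::<SomeCppType>()
        && align_of::<Wrapper>() == align_of::<SomeCppType>()
}
```

<p>With the attribute in place, declaring the foreign function as returning <code class="language-plaintext highlighter-rouge">Wrapper</code> no longer depends on unspecified layout.</p>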

<h3 id="c-enums">C enums</h3>

<p>Rust supports C-like enums, but there’s a crucial difference between them. In C, it is not undefined behavior for an enum to have an unlisted value. In fact, the following pattern is not uncommon:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">enum</span> <span class="n">Flags</span> <span class="p">{</span>
    <span class="n">Flag1</span> <span class="o">=</span> <span class="mi">0</span><span class="n">b0001</span><span class="p">,</span>
    <span class="n">Flag2</span> <span class="o">=</span> <span class="mi">0</span><span class="n">b0010</span><span class="p">,</span>
    <span class="n">Flag3</span> <span class="o">=</span> <span class="mi">0</span><span class="n">b0100</span><span class="p">,</span>
    <span class="n">Flag4</span> <span class="o">=</span> <span class="mi">0</span><span class="n">b1000</span><span class="p">,</span>
<span class="p">};</span>
</code></pre></div></div>

<p>where the enum is actually used for bitflags, and <code class="language-plaintext highlighter-rouge">Flag1 | Flag2</code> and <code class="language-plaintext highlighter-rouge">0</code> are both valid values for <code class="language-plaintext highlighter-rouge">Flags</code>.</p>

<p>This is not the case in Rust. If you are type-replacing C enums with Rust ones, make sure they are <code class="language-plaintext highlighter-rouge">#[repr(C)]</code>. The Rust compiler uses invalid enum values as space for packing other information while optimizing types; for example, Rust is able to represent <code class="language-plaintext highlighter-rouge">Option&lt;Option&lt;... 255 times .. Option&lt;bool&gt;&gt;</code> as a single byte.</p>

<p>If you are working with a C enum that is used for bitflags like above, please use an integer type instead. <code class="language-plaintext highlighter-rouge">#[repr(C)]</code> on enums in Rust guarantees layout, but it is <a href="https://doc.rust-lang.org/stable/nomicon/other-reprs.html">still undefined behavior for any enum to take on invalid values</a>.</p>
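<p>One way to do that is a transparent newtype over an integer, sketched below. It assumes the C side represents <code class="language-plaintext highlighter-rouge">Flags</code> as an <code class="language-plaintext highlighter-rouge">unsigned int</code>; the underlying type of a C enum is implementation-defined, so check what your compiler (or bindgen) actually picks. The <code class="language-plaintext highlighter-rouge">contains</code> helper is an addition for illustration.</p>

```rust
use std::ops::BitOr;
use std::os::raw::c_uint;

/// Every bit pattern is a valid `Flags`, unlike a Rust `enum`.
/// Assumption: the C enum is represented as `unsigned int`.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Flags(pub c_uint);

impl Flags {
    pub const NONE: Flags = Flags(0);
    pub const FLAG1: Flags = Flags(0b0001);
    pub const FLAG2: Flags = Flags(0b0010);
    pub const FLAG3: Flags = Flags(0b0100);
    pub const FLAG4: Flags = Flags(0b1000);

    /// True if every bit set in `other` is also set in `self`.
    pub fn contains(self, other: Flags) -> bool {
        self.0 & other.0 == other.0
    }
}

impl BitOr for Flags {
    type Output = Flags;

    fn bitor(self, rhs: Flags) -> Flags {
        Flags(self.0 | rhs.0)
    }
}
```

<p>Values like <code class="language-plaintext highlighter-rouge">Flags::FLAG1 | Flags::FLAG2</code> or <code class="language-plaintext highlighter-rouge">Flags::NONE</code> are then ordinary integers as far as the compiler is concerned, so nothing C++ sends across can be an invalid value.</p>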

<h3 id="abi-concerns">ABI concerns</h3>

<p>ABIs can be tricky. If you <em>just</em> use bindgen with no special flags, you can be pretty much guaranteed to have an okay ABI, but as you start doing type replacements, stuff can get murkier.</p>

<p>Firstly, make sure you’re not passing owned C++ classes with destructors/etc across FFI boundaries. See <a href="#potential-pitfall-passing-c-classes-by-value-over-ffi">above</a> for why. There’s a bunch of subtle stuff here, but you can avoid most of it if you just don’t pass these things across FFI in an owned way.</p>

<p>Also, try to make sure everything is <code class="language-plaintext highlighter-rouge">#[repr(C)]</code> across the boundary. Rust’s <code class="language-plaintext highlighter-rouge">improper-ctypes</code> lints will help here.</p>

<h2 id="should-c-apis-be-unconditionally-unsafe">Should C++ APIs be unconditionally <code class="language-plaintext highlighter-rouge">unsafe</code>?</h2>

<p>Before I get into this, I want to reiterate that most of the recommendations in this post are for <em>complex</em> C++-Rust integrations, which are likely to only crop up when attempting to rewrite parts of a large C++ codebase in Rust. Such codebases have unique needs and it’s important to calibrate for that when judging what’s right for them.</p>

<p>I recall when <a href="https://www.chromium.org/Home/chromium-security/memory-safety/rust-and-c-interoperability">this Chromium post</a> and <a href="https://steveklabnik.com/writing/the-cxx-debate">Steve’s <code class="language-plaintext highlighter-rouge">cxx</code> post</a> came out, there was a bunch of discussion about C++ functions not being universally marked <code class="language-plaintext highlighter-rouge">unsafe</code>. Essentially, a lot of people are of the opinion that all FFI into C++ (or C) should be unconditionally marked <code class="language-plaintext highlighter-rouge">unsafe</code> (and that tools like <code class="language-plaintext highlighter-rouge">cxx</code> should follow these rules).</p>

<p>Back then I wrote <a href="https://www.reddit.com/r/rust/comments/ielvxu/the_cxx_debate/g2jurb3/?context=3">a Reddit comment</a> about my thoughts on this. It’s a comment that’s the length of a blog post in and of itself so I’m not going to reproduce all of it here, but I’ll try to get the gist. I highly suggest you read it instead of this section.</p>

<p>In short, I would recommend people in large, complex codebases doing heavy C++ interop to generally be okay with marking functions calling into C++ as “safe” provided that function would be considered “safe to call without thinking too much about it” on the C++ side, whatever that means for your codebase.</p>

<p>From <a href="https://manishearth.github.io/blog/2017/12/24/undefined-vs-unsafe-in-rust/">my post on “undefined” vs “unsafe”</a>, for Rust I define “safe” as</p>

<blockquote>
  <p>Basically, in Rust a bit of code is “safe” if it cannot exhibit undefined behavior under all circumstances of that code being used.</p>
</blockquote>

<p>C++ doesn’t have a rigid language-level concept of safety that can be applied the same way. Instead, most C++ code follows a similar heuristic:</p>

<blockquote>
  <p>a bit of code is “safe” if it cannot exhibit undefined behavior under all <strong>expected</strong> circumstances of that code being used.</p>
</blockquote>

<p>This is, perhaps, not as good or useful a heuristic as the one we have for Rust, but it’s still a heuristic that gets used in deciding how careful one needs to be when using various APIs. After all, there are <em>plenty</em> of giant C++ codebases out there, they have got to be able to reason about safety <em>somehow</em>.</p>

<p>When you decide to meld together a C++ and Rust codebase, or start rewriting parts of a C++ codebase in Rust, you have already in essence decided for a large part of the codebase to not exactly follow Rust’s safety rules (but hopefully still be safe). There is little to be gained by making that an explicit part of your FFI boundary. Rather, it is more useful to save <code class="language-plaintext highlighter-rouge">unsafe</code> on the FFI boundary for truly unsafe functions which you actually do need to be careful to call.</p>

<p><code class="language-plaintext highlighter-rouge">unsafe</code> is useful for finding potential sources of badness in your codebase. For a tightly-integrated Rust/C++ codebase it’s already well known that the C++-side is introducing badness; marking every simple C++ getter as <code class="language-plaintext highlighter-rouge">unsafe</code> will lead to alarm fatigue and make it <em>harder</em> to find the real problems.</p>

<p>It’s worth figuring out where this boundary lies for you. Tools like <code class="language-plaintext highlighter-rouge">cxx</code> make it straightforward to call C++ functions through a safe interface, and it’s valuable to make use of that support.</p>
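<p>Without any tooling, drawing that line by hand might look like the sketch below. Everything in it is hypothetical: <code class="language-plaintext highlighter-rouge">Widget</code> and the two functions in <code class="language-plaintext highlighter-rouge">ffi</code> stand in for C++ code and are written in Rust only so the example builds on its own. The simple getter gets a safe wrapper, while the unchecked indexing function keeps its <code class="language-plaintext highlighter-rouge">unsafe</code>.</p>

```rust
/// Stand-in for a C++ class exposed through bindings (hypothetical).
#[repr(C)]
pub struct Widget {
    pub width: u32,
}

/// Stand-ins for C++ functions; in a real build these would be declarations
/// in an `extern "C"` block rather than Rust definitions.
mod ffi {
    use super::Widget;

    pub unsafe extern "C" fn widget_width(w: *const Widget) -> u32 {
        unsafe { (*w).width }
    }

    pub unsafe extern "C" fn widget_at(base: *const Widget, index: usize) -> *const Widget {
        unsafe { base.add(index) }
    }
}

impl Widget {
    /// A plain getter: "safe to call without thinking too much about it" on
    /// the C++ side, so it is exposed as a safe method here too.
    pub fn width(&self) -> u32 {
        // Safety: `self` is a valid reference, which is all the getter needs.
        unsafe { ffi::widget_width(self) }
    }
}

/// No bounds check on the C++ side, so this one stays `unsafe`.
///
/// Safety: `index` must be less than `widgets.len()`.
pub unsafe fn widget_at(widgets: &[Widget], index: usize) -> &Widget {
    unsafe { &*ffi::widget_at(widgets.as_ptr(), index) }
}
```

<p>Grepping for <code class="language-plaintext highlighter-rouge">unsafe</code> at call sites now surfaces <code class="language-plaintext highlighter-rouge">widget_at</code>, which is the call that actually deserves a second look.</p>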

<h2 id="closing-comments">Closing comments</h2>

<p>Again, before going down this route it’s worth wondering if you <em>really</em> need tight Rust-C++ integration. When possible, it’s always better to pick a small, well-defined API boundary, rather than Stylo-esque tight integration with shared objects and a highly crisscrossed callgraph.</p>

<p>These days <a href="https://github.com/dtolnay/cxx">cxx</a> is probably the most complete tool for such integrations. <a href="https://github.com/rust-lang-nursery/rust-bindgen/">bindgen</a> and <a href="https://github.com/eqrion/cbindgen">cbindgen</a> are still quite good, but cxx is C++-first, with a lot more magic, and generally seems to Just Work without too much configuration.</p>

<p><a href="https://github.com/google/autocxx">autocxx</a> is a cool concept by Adrian Taylor which melds bindgen and cxx to make something even <em>more</em> magical. It’s currently experimental, but I’m going to be watching it with interest.</p>

<p>Overall the field of Rust and C++ integration is at a stage where it’s mature enough for integrations to be <em>possible</em> without too much effort, but there are still tons of ways things could be improved and I’m super excited to see that happen as more people work on such integrations!</p>

<p><em>Thanks to Adam Perry, Adrian Taylor, katie martin, Nika Layzell, and Tyler Mandry for reviewing drafts of this post.</em></p>
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>The <em>cascade</em> in “Cascading Style Sheets” is the process used to take all the potential rules which could apply to an element and find the “most applicable” one that gets actually used. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>The <code class="language-plaintext highlighter-rouge">em</code> unit is font-size-relative, so <code class="language-plaintext highlighter-rouge">2em</code> with a <code class="language-plaintext highlighter-rouge">font-size</code> of <code class="language-plaintext highlighter-rouge">12px</code> is computed to <code class="language-plaintext highlighter-rouge">2 * 12 = 24px</code>. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>C++ name mangling <a href="https://en.wikipedia.org/wiki/Name_mangling#How_different_compilers_mangle_the_same_functions">is not standardized</a>, so any function with the C++ ABI will generate a <code class="language-plaintext highlighter-rouge">#[link_name = "_Z1foobarbaz"]</code> attribute on the Rust side, and the exact string used here will differ across compiler implementations and platforms. Since GCC and Clang follow the same scheme, most people will encounter this problem when their code doesn’t work on Windows due to MSVC using a different scheme. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p><code class="language-plaintext highlighter-rouge">Drop</code> impls are restricted in a bunch of ways for safety, in particular you cannot write <code class="language-plaintext highlighter-rouge">impl&lt;T: RefCounted&gt; Drop for RefPtr&lt;T&gt;</code> unless <code class="language-plaintext highlighter-rouge">RefPtr</code> is defined as <code class="language-plaintext highlighter-rouge">RefPtr&lt;T: RefCounted&gt;</code>. It’s not possible to have a generic type that has an impl of <code class="language-plaintext highlighter-rouge">Drop</code> for only <em>some</em> possible instantiations of its generics. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:5" role="doc-endnote">
      <p>Rust’s standard library does not typically guarantee anything about the layout of its types, and furthermore, Rust does not make many guarantees about the stability of most types without a <code class="language-plaintext highlighter-rouge">#[repr]</code> attribute. This would <em>work</em>, but it would be brittle and prone to breakage. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[On Voting Systems]]></title>
    <link href="http://manishearth.github.io/blog/2019/10/09/on-voting-systems/"/>
    <updated>2019-10-09T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2019/10/09/on-voting-systems</id>
    <content type="html"><![CDATA[<p>Election season is starting up again, and as with many other topics I’m seeing a lot of overconfident takes from people in tech wanting to “solve” how voting works with naïve techy solutions. Hell, <a href="https://cointelegraph.com/news/andrew-yang-wants-to-make-us-elections-fraud-proof-using-blockchain">even a presidential candidate seems to have proposed an extremely uninformed plan for “fixing” voting using blockchain technology</a>.</p>

<p>Last year I wrote <a href="https://twitter.com/ManishEarth/status/1056255900095340545">a thread on Twitter</a> covering some of the essential properties good voting systems uphold as well as how they prevent fraud. It was through the lens of Alameda County’s voting system, where I’ve volunteered as a poll worker in the past (and intend to do again). I’ve been meaning to write down the contents of that thread in blog form for a while, and now seemed like a good opportunity to do it.</p>

<p>I’ll be explaining more about most of these properties later, but ideally, a good voting system should uphold:</p>

<ul>
  <li>Secret ballot: Nobody, not even you, can verify who you voted for after you’re out of the polling place, to prevent vote-buying and coercion.</li>
  <li>Auditable paper trail: We should be able to audit the election. Paper trails are usually the most robust way to enable effective audits.</li>
  <li>Obviousness: It should be relatively obvious what individuals should be doing when they need to mark their ballots. A system that you can easily “mess up” with is a bad system.</li>
  <li>Accessibility: It should not exclude individuals with disabilities from being able to vote.</li>
</ul>

<h2 id="how-voting-works-in-alameda-county">How voting works in Alameda County</h2>

<p>I’ll first go over how voting in my county works. The system isn’t perfect, but it’s pretty good, and it’s a good springboard for understanding how voting systems in general can work. There’s a <a href="https://www.acvote.org/acvote-assets/04_resources/PDFs/pwmanuals/06042019/Guide-FINAL-june.pdf">poll worker guide</a> you can refer to if you’re really interested in all the specifics.</p>

<p>Broadly speaking, there are four ways to vote:</p>

<ul>
  <li>By mail</li>
  <li>In person at certain government offices, before election day (“early voting”)</li>
  <li>In person on election day at a polling place</li>
  <li>Provisionally, in person on election day at a polling place</li>
</ul>

<p>Voting by mail is pretty straightforward: When you register you can choose to vote by mail (or you can choose to do so online after the fact). You get a ballot in the mail, along with a special envelope. You fill in the ballot at your leisure, stick it in the envelope, write your name/address on the envelope, sign it, and mail it back. There are also convenient ballot dropboxes all over the place in case you’re a millennial like me and don’t want to figure out how to buy stamps<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>.</p>

<p>If you’re voting by mail you can also show up at any polling place on the day of the election and drop off your ballots in a sealed bin. At the polling place I helped run roughly half of the people coming in were just there to drop off their vote by mail ballots!</p>

<p>Voting by mail is by far the easiest option here. Sadly not all counties support it<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>. In some states <a href="https://en.wikipedia.org/wiki/Vote-by-mail_in_Oregon">this is even the <em>default</em> option</a>.</p>

<p>As I understand it, voting in person at designated government offices<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup> is pretty much the same as voting in person at a polling place, it’s just run by government employees instead of volunteers and open for a few weeks before election day.</p>

<figure class="caption-wrapper center" style="width: 400px"><img class="caption" src="http://manishearth.github.io/images/post/polls/bling.jpeg" width="400" /><figcaption class="caption-text"><p>Poll workers are given some neat bling to wear</p>
</figcaption></figure>

<h3 id="in-person-voting">In person voting</h3>

<p>If you’ve chosen to vote in person, you are supposed to turn up at your assigned polling place (you get your assignment in the mail along with other voter info booklets).</p>

<p>There’s a copy of the list of people assigned to the polling place posted outside, and another with the poll workers inside. When you tell your name to the poll workers, they cross your name off the list, and you have to sign your name next to it<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup>.</p>

<ul>
  <li>If your name isn’t on the list, the poll workers will try and find your assigned precinct and inform you that you can go there instead, but you can still choose to vote provisionally at the existing precinct.</li>
  <li>If your name isn’t on the list of all voters (perhaps you registered very late, or were unable to register), you can also vote provisionally.</li>
  <li>If your name is on the list but marked as voting-by-mail (and you want to vote in person), you can vote normally only if you surrender your mail ballot (which poll workers will mark as spoiled and put in a separate pouch).</li>
  <li>If you lost/didn’t receive your ballot, you can always vote provisionally.</li>
</ul>

<p>When you are voting normally, signing your name on the list fraudulently is illegal.</p>

<p>If it is your first time voting, you need to show some form of ID, but it doesn’t need to be photo ID and <a href="https://en.wikipedia.org/wiki/Help_America_Vote_Act#Voter_identification">even a utility bill is fine</a>.</p>

<p>Once you’re done signing, you’ll be given your ballot cards and a privacy sleeve folder so you can carry your filled ballots around. Because this is California and there are tons of local and state measures, we had 4 (!!) ballot cards, six sides to fill in<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">5</a></sup>. Usually a poll worker will also detach the ballot stubs in front of you and hand them to you to keep. You can use these to check the status (but not the contents!) of your ballot online.</p>

<p>You take your cards to a voting booth, fill them in, and come back. A poll worker will then help you feed your ballot cards into a scanner machine. This machine will reject cards with any problems — which you can fix, rerequesting new ballot cards if necessary, but you then have to spoil and return the old ballot card.</p>

<p>The machine keeps an externally-visible tally of the number of ballots submitted, and an internal tally of all the votes made, ignoring write-ins. It also internally stores ballot cards in one of two bins (depending on write-ins). These bins are verified to be empty when polls open, and are inaccessible till polls close.</p>

<p>It’s important to note that the scanner is not a load-bearing component of the system: It could be replaced with a locked bin with a slot, and the system would still work. The scanner enables one to get <em>preliminary</em> results for the precinct, and provides a way to double-check results.</p>

<p>And that’s it! You’ll be given an I Voted sticker, and you can go home!</p>

<figure class="caption-wrapper center" style="width: 400px"><img class="caption" src="http://manishearth.github.io/images/post/polls/stickers.png" width="400" /><figcaption class="caption-text"><p>Some “I Voted!” stickers in Spanish</p>
</figcaption></figure>

<h3 id="using-a-voting-machine">Using a voting machine</h3>

<p>In case you think you will have trouble filling out a ballot card in pen (e.g. if your vision is heavily impaired), there’s an alternate way to vote that doesn’t involve a pen. Instead, we have a machine which has a touchscreen and an audio unit, which prompts the voter for their selection for each ballot item on the touchscreen or audio unit. When they’re done, the machine will print out a “receipt” listing their choices inside a sealed box with a glass window, so they can verify that their vote was recorded correctly<sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">6</a></sup>. Once they’re done the sealed box will scroll the “receipt” out of view so that the next voter can’t see it.</p>

<p>The sealed box is called a <a href="https://en.wikipedia.org/wiki/Voter-verified_paper_audit_trail">Voter-Verified Paper Trail</a> box: the election runners no longer need to trust the machine’s internal memory, they can trust the paper trail inside the box (which, while produced by a potentially-untrustworthy machine, was verified by the voters), and the machine’s internal memory is simply a way to double-check (and get fast preliminary results).</p>

<h3 id="provisional-voting">Provisional voting</h3>

<p>There are many, many situations in which you may not be able to vote normally. Perhaps you showed up at the wrong precinct but don’t have time to go to the right one. Perhaps you were signed up for vote-by-mail but didn’t receive (or lost) your ballot. Perhaps you recently moved into the county and weren’t able to register in time. Perhaps you were a first-time in-person voter and didn’t have some form of ID.</p>

<p>In such a case you can always vote provisionally. The beauty of this system is that it removes most liability from poll workers: we don’t have any reason to turn people away from the polls, all we can do is refuse to let people vote normally (and instead vote provisionally) in the case of any inconsistencies. This is not to say that malicious poll workers <em>can’t</em> turn people away; it’s illegal but it happens. But well-meaning poll workers cannot, by accident, disenfranchise a voter because we are always allowed to give them a provisional ballot, and that’s an easy rule to follow.</p>

<p>With provisional voting, the voters are given the same ballot cards, but they’re also given an envelope with a form on it. This envelope is equivalent to a voter registration form, (re)registering them in their appropriate county/district<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">7</a></sup>. They vote on the ballot cards normally, but instead of submitting the ballots to the scanner, they put them in the envelope, which goes into a sealed bin<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote" rel="footnote">8</a></sup>. You’re also given a number you can call to check the status of your ballot.</p>

<p>When you vote provisionally, the registrar of voters will manually process your envelope, remaking your ballot on the right set of cards if necessary, and feeding them into a scanner machine.</p>

<h3 id="integrity-checks">Integrity checks</h3>

<p>Underlying this system is a bevy of integrity checks. There’s an intricate seal system, with numbered seals of varying colors. Some are to be added and never removed, some are to be removed after checking the number, some are never supposed to be touched, some are added at the beginning of the day and removed at the end of the day.</p>

<p>For example, during setup we check that the bins in the scanner are empty, and seal it with a numbered seal. This number is noted down on a form, along with some numbers from the scanner/touchscreen display. The first person to vote is asked to verify all this, and signs the form along with the poll workers.</p>

<p>Election officials drop in multiple times during the day, and may check these numbers. At the end of the day, the numbers of all seals used, and any physical seals that were removed are sent back along with all the ballots.</p>

<p>Various ballot counts are also kept track of. We keep track of the number of provisional ballots, the number of submitted regular ballots (also kept track by the scanner), the number of ballot cards used, and the number of unused ballots left over. Everything needs to match up at the end of the day, and all unused ballots are sent back. These counts are also noted down.</p>

<p>Poll watchers are allowed to be around for most of this, though I think they’re not allowed to <em>touch</em> anything. I think poll watchers are also allowed to be around when the actual ballots are being counted by election officials.</p>

<h3 id="immediate-local-results">Immediate local results</h3>

<p>As mentioned before, the scanner isn’t a crucial part of the system, but if it happens to be working it can be used to get immediate local results. At the end of the day, the scanner prints out a bunch of stuff, including vote totals for races which got more than N votes (N=20, IIRC), so you get immediate results for your precinct. This printout is supposed to be taped to the polling place doors for everyone to see, and presumably the registrar of voters uses the copy submitted to them to publish quick preliminary results.</p>

<p>Using paper ballots doesn’t mean that we have to give up all the benefits of computers doing some of the work for us! We can still use computers to get fast results, without relying on them for the integrity of the system.</p>

<figure class="caption-wrapper center" style="width: 400px"><img class="caption" src="http://manishearth.github.io/images/post/polls/totals.jpeg" width="400" /><figcaption class="caption-text"><p>Vote totals posted outside. Our ballots are big and have lots of races on them; so the list of vote totals is absolutely ginormous.</p>
</figcaption></figure>

<h2 id="properties-of-this-voting-system">Properties of this voting system</h2>

<p>This system has some crucial properties.</p>

<h3 id="secret-ballot">Secret ballot</h3>

<p>It’s well known that nobody is supposed to be able to see who you voted for. But a crucial part of this system is that, once you submit your ballots, <em>you</em> can’t see who you voted for either. Of course, you probably can <em>remember</em>, but you have no <em>proof</em>. On the face of it this sounds like a bad property — wouldn’t it be nicer if people could verify that their vote was counted correctly?</p>

<p>The problem is that if <em>I</em> can verify that my vote was counted correctly, someone else can coerce me into doing this in front of them to ensure I voted a certain way. Any system that gives me the ability to verify my vote gives people who have power over me (or just people who want to buy my vote) the same ability.</p>

<p>Provisional voting doesn’t quite have this property, but it’s supposed to be for edge cases. Vote by mail trades off some of this property for convenience; people can now see who you voted for while you’re voting (and the people you live with can fraudulently vote on your behalf, too).</p>

<h3 id="conservation-of-ballots-auditable-paper-trail">Conservation of ballots (Auditable paper trail)</h3>

<p>The total number of ballots in the system is roughly conserved and kept track of. If you’re registered to vote by mail, you cannot request a normal ballot without surrendering your vote by mail ballot and envelope (which we mark as spoiled and put in a separate pouch). If you re-request a ballot card because you made a mistake, the old card needs to be similarly spoiled and put away separately. It’s one set of ballot cards per voter, and almost all potential aberrations in this property result in a provisional vote<sup id="fnref:9" role="doc-noteref"><a href="#fn:9" class="footnote" rel="footnote">9</a></sup>. Even provisional votes are converted to normal ballot cards in the end.</p>

<p>Eventually, there will be a giant roomful of ballots that cannot be traced back to their individual voters, but it can still be traced back to the <em>entirety</em> of the voters — it’s hard to put a ballot into the system without a corresponding voter. This is perfect — the ballots can be hand-counted, but they can’t be individually correlated with their respective voters.</p>

<p>You don’t even need to recount the entire set of ballots to perform an audit; <a href="https://risklimitingaudits.org/">risk-limiting audits</a> are quite effective and much more efficient to carry out.</p>

<h3 id="paper-ballots">Paper ballots</h3>

<p>The fact that they can (and should) be hand counted is itself an important property. Hand counting of ballots can be independently verified in ways that can’t be done for software. Despite not being able to trace a ballot back to its voter, there still is a paper trail of integrity for the ballots as a bulk unit.</p>

<p>This property leads to software independence: while we may use software in the process, it’s not possible for a bug in the software to cause an undetectable error in the final vote counts.</p>

<figure class="caption-wrapper center" style="width: 500px"><img class="caption" src="http://manishearth.github.io/images/post/polls/totals-zoom.png" width="500" /><figcaption class="caption-text"><p>Specific vote totals for the top races</p>
</figcaption></figure>

<h3 id="obviousness">Obviousness</h3>

<p>Figuring out what to do in the voting booth isn’t hard. You’re allowed to request assistance, but you’ll rarely have to. There are systems (like the scanner’s error checking) that are designed to ensure you don’t mess up, but the system is quite sound even without them; they just provide an additional buffer.</p>

<p>Compare this with <a href="https://www.texastribune.org/2018/11/01/texas-straight-ticket-voting-problems-old-machines/">the problems some Texas voting machines had last midterm</a>. The machines were somewhat buggy, but, crucially, there was an opaque right and wrong way to use them, and some voters accidentally used them the wrong way and then didn’t check the final page before submitting. This kind of thing should never happen in a good voting system.</p>

<p>It’s really important that the system is intuitive and hard to make mistakes in.</p>

<h2 id="fraud-prevention">Fraud prevention</h2>

<p>So, how is this robust against fraud?</p>

<p>Firstly, voter fraud isn’t a major problem in the US, and it’s often used as an excuse to propagate voter suppression tactics, which <em>are</em> a major problem here.</p>

<p>But even then, we probably want our system to be robust against fraud.</p>

<p>Let’s see how an individual might thwart this system. They could vote multiple times, under assumed identities. This doesn’t scale well and isn’t really worth it: to influence an election you’d need to do this many times, or have many individuals do it a couple times, and the chance of getting caught (e.g., the people who you are voting as may come by and try to vote later, throwing up a flag) and investigated scales exponentially with the number of votes. That’s not worth it at all.</p>

<p>Maybe poll workers could do something malicious. Poll worker manipulation would largely exist in the form of submitting extra ballots. But that’s hard because the ballot counts need to match the list of voters. So you have the same problem as individual voters committing fraud: if the actual voter shows up, they’ll notice. Poll workers <em>could</em> wait till the end of the day to do this, but then to make any kind of difference you’d have to do a bulk scan of ballots, and that’s very noticeable. Poll workers would have to collude to make anything like this work, and poll watchers (and government staff) may be present.</p>

<p>Poll workers can also <em>discard</em> ballots to influence an election. But you can’t do this in front of the voters, and the receptacles with non-defaced ballots are all sealed so you can’t do this when nobody is watching without having to re-seal (which means you need a new numbered seal, which the election office will notice). The scanner’s inner receptacle is opened at the end of the day but you can’t tamper with that without messing up the various counts.</p>

<p>Election officials have access to giant piles of ballots and could mess with things there, but I suspect poll watchers are present during the ballot sorting and counting process, and again, it’s hard to tamper with anything without messing up the various counts.</p>

<p>Overall, this system is pretty robust. It’s important to note that fraud prevention is achieved by more social means, not technical means: there are seals, counts, and various properties of the system, but no computers involved in any crucial roles.</p>

<h2 id="techy-solutions-for-voting">Techy solutions for voting</h2>

<p>In general, amongst the three properties of “secret ballot”, “obviousness”, and “auditable paper trail”, computer-based voting systems almost always fail at one, and usually fail at two.</p>

<p>A lot of naïve tech solutions for voting are explicitly designed to not have the secret ballot property: they are instead designed specifically to let voters check what their vote was counted as after the election. As mentioned earlier, this opens the door to vote-buying and coercion.</p>

<p>It’s theoretically possible to have a system where you can ensure your ballot, specifically, was correctly counted after the election, without losing the secret ballot property: <a href="https://en.wikipedia.org/wiki/ThreeBallot">ThreeBallot</a> is a cool example of such a system, though it fails the “obviousness” property.</p>

<p>Most systems end up not having an auditable paper trail since they rely on machines to record votes. This is vulnerable to bugs in the machine: you end up having to trust the output of the machine. Buggy/vulnerable voting machines are so common that every year at DEFCON <a href="https://media.defcon.org/DEF%20CON%2027/voting-village-report-defcon27.pdf">people get together to hack the latest voting machines, and typically succeed</a>.</p>

<p>Voting machines can still produce a paper trail: Voter-Verified Paper Trail systems partially succeed in doing this. They’re not as good with maintaining the “conservation of ballots” property that makes tampering much harder, and they’re not as good on the “obviousness” part since people need to check the VVPAT box for what their vote was recorded as.</p>

<p>Ballot-Marking devices are a bit better at this: These still produce paper ballots, it’s just that the ballot is marked by the machine on your behalf. There’s still a bit of an “obviousness” fail in that people may not double check the marked ballot, but at least there’s a nice paper trail with ballot conservation! Of course, these only work if the produced ballot is easily human-readable.</p>

<p>It’s not <em>impossible</em> to design good voting systems that rely on technology, but it’s hard to maintain the same properties you can with paper ballots. If you want to try, please keep the properties listed above in mind.</p>

<h3 id="blockchain">Blockchain?</h3>

<p>Every now and then people will suggest using blockchains for voting. This is a pretty large design space, but… most of these proposals are <em>extremely</em> naïve and achieve very little.</p>

<p>For one, most of them are of the category that lose the “secret ballot” property, instead producing some kind of identifier you’re able to check in some published blockchain. This lets you see what your vote was after the election, and as I’ve covered already that’s not a good thing.</p>

<p>Even if this process only lets you verify that your vote was counted (but not what it was), it typically involves some understanding of cryptography to spot-check the validity of the machine output (e.g. you need to verify that some hash is the hash of your actual vote or something). This fails the obviousness property.</p>

<p>Blockchains don’t really bring much to the table here. They’re decent for byzantine fault tolerance in a space without a central authority, but elections <em>do</em> have some form of central authority and we’re not getting rid of that. The anonymity properties of blockchains can usually be achieved without blockchains for things like elections.</p>

<p>There are some kinds of cryptography that can be useful for auditability — zero knowledge proofs and homomorphic encryption come to mind — but you don’t need blockchains to use these, and using these still requires some form of technology as a key part of the voting system and this makes other properties of the system harder to achieve.</p>

<h2 id="become-a-poll-worker">Become a poll worker!</h2>

<p>It’s still a bit early for the next election, but I highly recommend you volunteer to be a poll worker for your county if you can!</p>

<p>It’s really fun, you get to learn about the inner workings of voting systems, and you get to meet a lot of people!</p>

<figure class="caption-wrapper center" style="width: 700px"><img class="caption" src="http://manishearth.github.io/images/post/polls/nancy.jpeg" width="700" /><figcaption class="caption-text">
<p>We had a cool kid come in and <a href="https://twitter.com/ManishEarth/status/1060052694772011008">more or less do this</a> at one point</p>

</figcaption></figure>

<p><em>Thanks to Nika Layzell, Sunjay Varma, Jane Lusby, and Arshia Mufti for providing feedback on drafts of this blog post.</em></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>Last year they required postage, but they’ve changed that with <a href="https://www.sos.ca.gov/administration/news-releases-and-advisories/2019/no-stamp-no-problem-all-vote-mail-ballots-now-come-prepaid-postage-return-envelopes/">a law</a> this year. Yay! <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>Ostensibly because of fears of voter fraud, but they’re largely unfounded — in practice this just reduces turnout <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>I think for Alameda county the only such office is the Registrar of Voters in Oakland <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>The crossing-off and signing lists are different, but this isn’t too important. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:5" role="doc-endnote">
      <p>I remember one particularly curmudgeonly voter loudly grumbling about all the propositions as they were voting. One doesn’t “vote” in California, one fills out social studies homework. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:6" role="doc-endnote">
      <p>I don’t quite recall how the verifiability works for people using the audio unit, they may be allowed to ask someone else to verify for them? <a href="#fnref:6" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:7" role="doc-endnote">
      <p>If you vote in a different precinct, or worse, a different county, the ballot cards may not contain all the same races, so voting provisionally from the wrong district means that you only get to vote for the races common to both ballot cards. <a href="#fnref:7" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:8" role="doc-endnote">
      <p>It’s imperative that these do not go into the scanner (since that operation cannot be undone), and to prevent this poll workers are instructed to not give provisional voters a secrecy sleeve as the envelope acts as a secrecy sleeve. Whoever is supervising the scanner will only allow people with secrecy sleeves to slip their ballots into the scanner. <a href="#fnref:8" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:9" role="doc-endnote">
      <p>The exception is using the touchscreen machine, where you get to vote without using up a ballot card on voting day. However, tallies for the machine are kept separately, and I think these too are eventually turned into normal ballot cards. <a href="#fnref:9" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Rust Governance: Scaling Empathy]]></title>
    <link href="http://manishearth.github.io/blog/2019/02/04/rust-governance-scaling-empathy/"/>
    <updated>2019-02-04T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2019/02/04/rust-governance-scaling-empathy</id>
    <content type="html"><![CDATA[<p>There’s been a lot of talk about improving Rust’s governance model lately. As we decompress from last year’s hectic edition work, we’re slowly starting to look at all the bits of <a href="https://twitter.com/ManishEarth/status/1073088515041198080">debt</a> we accumulated, and <a href="https://boats.gitlab.io/blog/post/rust-2019/">organizational debt</a> is high on that list.</p>

<p>I’ve been talking in private with people about a bunch of these things for quite a while now, and I felt it worthwhile to write down as much of my thoughts as I can before the Rust All Hands in Berlin this week.</p>

<p>In the interest of brevity<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> I’m going to assume the reader is roughly familiar with most of the stuff that’s happened with the Rust community in the past few years. I’m probably going to omit concrete examples of incidents, both to avoid mischaracterizing individual actions (as with most such analyses, I wish to talk in more general terms about trends), and also just because it would take me forever to write this if I were to supply all the layers of context. If you feel something is inaccurate, please let me know.</p>

<p>This blog post is probably going to reach the eyes of non-Rust-community members. You’re welcome to read it, but please accept my apologies in advance if it doesn’t make any sense. This is something that I initially planned to circulate as a private post (writing for a general audience is <em>hard</em>), but I felt this would be more widely useful. However, due to time constraints, I haven’t been able to edit it to make it acceptable to a wider audience.</p>

<h2 id="the-symptoms">The symptoms</h2>

<p>Before I actually get into it, I’d like to carefully delineate <em>what</em> the problem is that I’m talking about. Or more accurately, the <em>symptoms</em> I am talking about — as I’ll explain soon I feel like these are not the actual problem but symptoms of a more general problem.</p>

<p>Basically, as time has gone by our decisionmaking process has become more and more arduous, both for community members and the teams. Folks have to deal with:</p>

<ul>
  <li>The same arguments getting brought up over and over again</li>
  <li>Accusations of bad faith</li>
  <li>Derailing</li>
  <li>Not feeling heard</li>
  <li>Just getting exhausted by all the stuff that’s going on</li>
</ul>

<p>The RFC process is the primary exhibitor of these symptoms, but semi-official consensus-building threads on <a href="https://internals.rust-lang.org">internals.rust-lang.org</a> have similar problems.</p>

<p>Aaron <a href="http://aturon.github.io/2018/05/25/listening-part-1/">has written some extremely empathetic blog posts</a> about a bunch of these problems, starting with concrete examples and ending with a takeaway of a bunch of values for us to apply as well as thoughts on what our next steps can be. I highly recommend you read them if you haven’t already.</p>

<p>Fundamentally I consider our problems to be social problems, not technical ones. In my opinion, technical solutions like changing the discussion forum format may be necessary but are not sufficient for fixing this.</p>

<h2 id="the-scaling-problem">The scaling problem</h2>

<p>I contend that all of these issues are symptoms of an underlying <em>scaling issue</em>, but also a failure of how our moderation works.</p>

<p>The scaling issue is somewhat straightforward. Such forum discussions are inherently N-to-N discussions. When you leave a comment, you’re leaving a comment for <em>everyone</em> to read and interpret, and this is hard to get right. It’s <em>much</em> easier to have one-on-one discussions because it’s easy to build a shared vocabulary and avoid misunderstandings; any misunderstandings can often be quickly detected and corrected.</p>

<p>I find that most unpleasant technical arguments stem from an unenumerated mismatch of assumptions, or sometimes what I call a mismatch of axioms (i.e. when there is a fundamental conflict between core beliefs). A mismatch of assumptions, if identified, can be resolved, leading to an amicable conclusion. Mismatches of axioms are harder to resolve; however, recognizing them can take most of the vitriol out of an argument, because both parties will <em>understand</em> each other, even if they don’t <em>agree</em>. In such situations the end result may leave one or both parties <em>unhappy</em>, but rarely <em>angry</em>. (Axiom mismatches don’t have to leave people unhappy, either: embracing <a href="http://aturon.github.io/2018/06/02/listening-part-2/#pluralism-and-positive-sums">positive sum thinking</a> helps us come to mutually beneficial conclusions.)</p>

<p>All of these mismatches are easy to identify in one-on-one discussions, because it’s easy to switch gears to the meta discussion for a bit.</p>

<p>One-on-one discussions are pleasant. They foster empathy.</p>

<p>N-to-N discussions are <em>not</em>. It’s harder to converge on this shared vocabulary amongst N other people. It’s harder to identify these mismatches, partly because it’s hard to switch into the meta-mode of a discussion at all, but also because there’s a lot going on. It’s harder to build empathy.</p>

<p>As we’ve grown, discussion complexity has grown quadratically, and we’re not really attempting to relinearize it.</p>

<h3 id="hanabi-and-parallel-universes">Hanabi and parallel universes</h3>

<p>I quite enjoy the game of <a href="https://en.wikipedia.org/wiki/Hanabi_(card_game)">Hanabi</a>. It’s a game of information and trust, and I find it extremely fun, especially with the right group.</p>

<p>Hanabi is a cooperative game. You can see everyone’s cards (or tiles) but your own, and information-sharing is severely restricted. The goal is to play the right cards in the right order to collectively win. The gimmick is to share additional information through the side-channel of <em>the choice of move you make</em>.</p>

<p>A very common occurrence in this game is that people start making plans in their mind. You typically have a decent understanding of what information everyone has, and you can use this to make predictions as to what everyone’s moves will be. With this in mind, you can attempt to “set up” situations where the game progresses rapidly in a short period of time. This is somewhat necessary for the game to work, but a common pitfall is for these plans to be <em>extremely</em> elaborate, leading to frustration as the game doesn’t actually play out as planned.</p>

<p>The core issue behind this is forgetting that you actually <em>can’t</em> see the entire game state, since your own cards are hidden. It’s not just <em>you</em> who has plans — everyone does! And each of those plans is incomplete since they’re missing a piece of the picture, just as you are.</p>

<p>In Hanabi it’s very easy to forget that you’re missing a piece of the picture — in competitive card games you mostly can’t see the game state since everyone else’s cards are hidden. But in Hanabi you can see <em>most</em> of the cards and it’s easy to forget that your own four cards are hidden from you.</p>

<p>So what ends up happening is that due to incomplete information, everyone is operating in their own little parallel universe, occasionally getting frustrated when it becomes clear that other players are not operating in the same universe. As long as you recognize the existence of these parallel universes beforehand you’re fine, but if you don’t you will be frustrated.</p>

<p>This is largely true of N-to-N discussions as well. Because most of what’s being said makes sense to an individual in a particular way, it’s very easy for them to forget that other people may not share their assumptions and thus may be on a different page. Every time someone leaves a comment, different people may interpret it differently, “forking” the common understanding of the state of the discussion into multiple parallel universes. Eventually there are enough parallel universes that everyone’s talking past each other.</p>

<p>One thing I often prefer doing in such cases is to have a one on one discussion with people who disagree with me — typically the shared understanding that is the end result of such discussions is super useful and can be brought back to the discussion as something that all participants interpret the same way. I’m not consistent in doing this — in the midst of a heated argument it’s easy to get too wrapped up in the argument to think about getting results and I’ve certainly had my time arguing instead of resolving — but overall whenever I’ve chosen to do this it’s been a useful policy.</p>

<p>This is a good example of how relinearization and communication can help move N-to-N discussions along. Operating in different parallel universes is kind of the <em>point</em> of Hanabi, but it’s not the point of having a technical discussion.</p>

<h2 id="the-moderation-problem">The moderation problem</h2>

<p>In a technical discussion, broadly speaking, I find that there are three kinds of comments disagreeing with you:</p>

<ul>
  <li>Constructive: Comments which disagree with you constructively. We’re glad these exist, disagreement can hurt but is necessary for us to collaboratively reach the best outcomes.</li>
  <li>Disruptive: Comments which may be written in good faith but end up being disruptive. For example, this includes people who don’t read enough of the discussion and end up rehashing the same points. It also includes taking discussions off topic. These kinds of things are problematic but not covered by the code of conduct.</li>
  <li>Abrasive: Comments which are rude/abrasive. These are covered by the code of conduct. The mod team tries to handle these.</li>
</ul>

<p>(For a long time <a href="http://twitter.com/aaron_turon/">Aaron</a> and I had a shared vocabulary of “Type A, B, C” for these, mostly because I’m often unimaginative when it comes to such things. Thanks to <a href="https://github.com/mark-simulacrum">Mark</a> for coming up with better, more descriptive titles.)</p>

<p>Note that while I’m talking about “disruptive” comments it’s not a judgement on the <em>intent</em> of the participants, but rather a judgement on the harm it has caused.</p>

<p>The second category – disruptive comments – is the thing we’re currently unable to handle well. They snowball pretty badly too — as more and more of these collect, more and more people get frustrated and in turn leave comments that cause further disruption. As the discussion progresses into more and more “parallel universes” it also just becomes <em>easier</em> for a comment to be disruptive.</p>

<p>The Rust moderation team operates mostly passively; we simply don’t have the scale<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup> to watch for and nip these things in the bud. Active moderation requires a degree of involvement we cannot provide. So while the best response would be to work with participants and resolve issues early as we see them crop up, we typically get pulled in at a point where some participants are already causing harm, and our response has to be more severe. It’s a bit of a catch-22: it’s not exactly our job to deal with this stuff<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup>, but by the time it <em>becomes</em> our job (or even, by the time we <em>notice</em>), most acceptable actions for us to take are extremely suboptimal. The problem with passive moderation is that it’s largely reactive — it’s harder to proactively nudge the discussion in the right direction when you don’t even <em>notice</em> what’s going on until it’s too late. This is largely okay for dealing with bad-faith actors (the main goal of the mod team); it’s hard to <em>prevent</em> someone from deciding to harass someone else. But for dealing with disruptive buildups, we kind of need something different.</p>

<h2 id="participation-guidelines">Participation guidelines</h2>

<p>Part of the solution here is recognizing that spaces for official discussion are <em>different</em> from community hangout spaces. Our code of conduct attempts to handle abrasive behavior, which can disrupt discussions anywhere, but the comments that can disrupt consensus building in official discussions aren’t really covered. Nor are the repercussions of code of conduct violations really <em>appropriate</em> for such disruptive comments anyway.</p>

<p>A proposal I’ve circulated in the past is to have a notion of participation guidelines. Discussions in team spaces (RFCs, pre-RFCs, discord/zulip/IRC channels during team meetings) follow a set of rules set forth by the individual teams. It might be worth having a base set of participation guidelines defined by the core team. Something like the following is a very rough strawman:</p>

<ul>
  <li>Don’t have irrelevant discussions during team meetings on Discord/IRC/Zulip</li>
  <li>Don’t take threads off topic</li>
  <li>Don’t rehash discussions</li>
</ul>

<p>We ask people to read these before participating, but breaking these rules isn’t considered serious; it just triggers a conversation (and maybe the hiding/deletion of a comment). If someone repeatedly breaks these rules they may be asked to not participate in a given thread anymore. The primary goal here is to empower team members to better deal with disruptive comments by giving them a formalized framework. Having codified rules helps team members confidently deal with such situations without having to worry as much about drawing direct ire from affected community members.</p>

<p>A base participation guidelines document can also be a value statement: not just a set of rules but also a set of values. These values can be things like:</p>

<ul>
  <li>“We explicitly value high empathy interactions”</li>
  <li>“How everyone is feeling is everyone’s business”</li>
</ul>

<p>(h/t <a href="http://twitter.com/adam_n_p/">Adam</a> for the articulate wording here)</p>

<p>Having such words written somewhere — both the high level values we expect people to hold, and the individual behaviors we expect people to exhibit (or not exhibit) — is really valuable in and of itself, even if not enforced. The value of such documents is not that everyone reads them before participating — most don’t — but that they serve as a good starting point for people interested in learning how to best conduct themselves, as well as an easy place to point people to when they’re having trouble doing so.</p>

<p>On its own, I find that this is a powerful framework but may not achieve the goal of improving the situation. I recently realized that this actually couples really well with a <em>different</em> idea I’ve been talking about for quite a while now, the idea of having facilitators:</p>

<h2 id="facilitators">Facilitators</h2>

<p>A common conflict I see occurring is that in many cases it’s a team’s job to think about and opine on a technical decision, but it’s also the team’s job to shepherd the discussion for that decision. This often works out great, but it also leads to people just feeling unheard. It kinda hurts when someone who has just strongly disagreed with you goes on to summarize the state of the discussion in a way that you feel represents you unfairly. The natural response to that for most people isn’t to work with that person and try to be properly represented, it’s to just get angry, leading to less empathy over time.</p>

<p>By design, Rust team members are <em>partisan</em>. The teams exist to build well-informed, carefully crafted opinions, and present them to the community. They also exist to make final decisions based on the results of a consensus-building discussion, which can involve picking sides. This is fine; there is always going to be some degree of partisanship amongst decisionmakers, or decisions would not get made.</p>

<p>Having team members also facilitate discussions is somewhat at odds with all of this. Furthermore, I feel like they don’t have enough bandwidth to do this well anyway. Some teams do have a concept of “sheriffs”, but this is more of an onramp to full team membership and the role of a sheriff is largely the same as the role of a team member, just without a binding vote.</p>

<p>I feel like it would be useful to have a group of (per-team?) <em>facilitators</em> to help with this. Facilitators are people who are interested in seeing progress happening, and largely don’t have <em>much</em> of an opinion on a given discussion, or are able to set aside this opinion in the interest of moving a discussion forward. They operate largely at the meta level of the discussion. Actions they may take are:</p>

<ul>
  <li>Summarizing the discussion every now and then</li>
  <li>Calling out one sided discussions</li>
  <li>Encouraging one-on-one tangents to be discussed elsewhere (perhaps creating a space for them, like an issue)</li>
  <li>Calling out specific people to do a thing that helps move the discussion forward. For example, something like “hey @Manishearth, I noticed you’ve been vocal in <a href="https://github.com/mystor/slag">arguing that Rust should switch to whitespace-sensitive syntax</a>, could you summarize all the arguments made by people on your side?” would help.</li>
  <li>Reinforcing positive behavior</li>
  <li>Occasionally pinging participants privately to help them improve their comments</li>
  <li>Attempting to identify the root cause of a disagreement, or empowering people to work together to identify this. This one is important but tricky. I’ve often enjoyed doing it — noticing the core axiomatic disagreement at play and spelling it out is a great feeling. But I’ve also found that it’s incredibly hard to do when you’re emotionally involved, and I’ve often needed a nudge from someone else to get there.</li>
</ul>

<p>At a high level, the job of the facilitators is to:</p>

<ul>
  <li>help foster empathy between participants</li>
  <li>help linearize complex discussions</li>
  <li>nudge towards cooperative behavior, away from adversarial behavior. Get people playing not to win, but to win-win.</li>
</ul>

<p>It’s important to note that facilitators don’t make decisions — the team does. In fact, they almost completely avoid making technical points, they instead keep their comments largely at the meta level, perhaps occasionally making factual corrections.</p>

<p>The teams <em>could</em> do most of this themselves<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">4</a></sup>, but as I’ve mentioned before it’s harder for others to not perceive all of your actions as partisan when some of them are. Furthermore, it can come off as patronizing at times.</p>

<p>This is also something the moderation team could do, however it’s <em>much</em> harder to scale the moderation team this way. Given that the moderation team deals with harassment and stuff like that, we need to be careful about how we build it up. On the other hand facilitating discussions is largely a public task, and the stakes aren’t as high: screwups can get noticed, and they don’t cause much harm. As a fundamentally <em>proactive</em> moderation effort, most actions taken will be to nudge things in a positive direction; getting this wrong usually just means that the status quo is maintained, not that harm is caused. Also, from talking to people it seems that while very few people want to be involved in moderating Rust, this notion of <em>facilitating</em> sounds much more fun and rewarding (I’d love to hear from people who would like to help).</p>

<p>And to me, this pairs really well with the idea of participation guidelines: teams can write down how they want discussions to take place on their venues, and facilitators can help ensure this works out. It’s good to look at the participation guidelines less as a set of rules and more as an aspiration for how we conduct ourselves, with the facilitators as a means to achieving that goal.</p>

<p>There are a lot of specifics we can twiddle with this proposal. For example, we can have a per-team group of appointed facilitators (with no overlap with the team), and for a given discussion one facilitator is picked (if they don’t have time or feel like they have strong opinions, try someone else). But there’s also no strong need for there to be such a group, facilitators can be picked as a discussion is starting, too. I don’t expect <em>most</em> discussions to need facilitators, so this is mostly reserved for discussions we expect will get heated, or discussions that have started to get heated. I’m not really going to spend time analysing these specifics; I have opinions but I’d rather have us figure out if we want to do something like this and how before getting into the weeds.</p>

<h2 id="prospective-outcomes">Prospective outcomes</h2>

<p>The real goal here is to bootstrap better empathy within the community. In an ideal world we don’t need facilitators, instead everyone is able to facilitate well. The explicitly non-partisan nature of facilitators is <em>useful</em>, but if everyone was able to operate in this manner it would largely be unnecessary. But as with any organization, being able to horizontally scale specific skills is really tricky without specialization.</p>

<p>I suspect that in the process of building up such a team of facilitators, we will also end up building a set of resources that can help others learn to act the same way, and eventually overall improve how empathetic our community is.</p>

<p>The concept of facilitators directly addresses the moderation problem, but it also handles the scaling problem pretty well! Facilitators are key in re-linearizing the n-to-n discussions, bringing the “parallel universes” together again. This should overall help people (especially team members) who are feeling overwhelmed by all the things that are going on.</p>

<p>This also helps with concerns people have that they’re not getting heard, as facilitators are basically posed as allies on all sides of the argument; people whose primary goal is to <em>help communication happen</em>.</p>

<hr />

<p>Overall what I’ve proposed here isn’t a fully-formed idea; but it’s the seed of one. There are a lot of interesting bits to discuss and build upon. I’m hoping through this post we might push forward some of the discussions about governance — both by providing a strawman idea, as well as by providing a perspective on the problem that I hope is useful.</p>

<p>I’m really interested to hear what people think!</p>

<p><em>Thanks to <a href="http://twitter.com/aaron_turon/">Aaron</a>, <a href="https://twitter.com/ag_dubs">Ashley</a>, <a href="http://twitter.com/adam_n_p/">Adam</a>, <a href="https://twitter.com/ember_arlynx">Ember</a>, <a href="http://twitter.com/arshia__">Arshia</a>, <a href="https://twitter.com/mgattozzi">Michael</a>, <a href="https://twitter.com/sunjay03">Sunjay</a>, <a href="http://twitter.com/fitzgen/">Nick</a> and other people I’ve probably forgotten for having been part of these discussions with me over the last few years, helping me refine my thoughts</em></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>I am way too verbose for “brief” to be an accurate description of anything I write, but might as well <em>try</em>. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>Scaling the moderation team properly is another piece of this puzzle that I’m working on; we’ve made some progress recently. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>I helped draft <a href="https://www.rust-lang.org/policies/code-of-conduct#moderation">our moderation policy</a>, so this is somewhat a lack of foresight on my part, but as I’ll explain later it’s suboptimal for the mod team to be dealing with this anyway. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:5" role="doc-endnote">
      <p>In particular, I feel like Aaron has done an <em>excellent</em> and consistent job of facilitating discussions this way in many cases. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Converting a WebGL Application to WebVR]]></title>
    <link href="http://manishearth.github.io/blog/2018/09/11/converting-a-webgl-application-to-webvr/"/>
    <updated>2018-09-11T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2018/09/11/converting-a-webgl-application-to-webvr</id>
    <content type="html"><![CDATA[<p>I wrote a post for Mozilla Hacks on converting WebGL applications to WebVR,
<a href="https://hacks.mozilla.org/2018/09/converting-a-webgl-application-to-webvr/">you can read it there</a></p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Why I Enjoy Blogging]]></title>
    <link href="http://manishearth.github.io/blog/2018/08/26/why-i-enjoy-blogging/"/>
    <updated>2018-08-26T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2018/08/26/why-i-enjoy-blogging</id>
    <content type="html"><![CDATA[<p><em>See also: <a href="https://myrrlyn.net/blog/misc/to-all-the-posts-ive-blogged-before">Alex’s version of this blog post</a></em></p>

<p>I started this blog three years ago, moving from my <a href="http://inpursuitoflaziness.blogspot.com/">older blog</a>, hoping to write about programming, math, physics, books, and miscellanea. I’ve not quite written about everything I wanted to, but I’ve been very very happy with the experience of blogging. <code class="language-plaintext highlighter-rouge">wc</code> says I’ve written almost 75k words, which is mind-boggling to me!</p>

<p>I often get asked by others — usually trying to decide if they should start blogging — what it’s like. I also often try to convince friends to blog by enumerating why I think it’s awesome. Might as well write it down so that it’s generally useful for everyone! 😃</p>

<h2 id="blogging-helps-cement-my-understanding-of-things">Blogging helps cement my understanding of things!</h2>

<p>I’ve often noticed that I’ll start blogging about something I <em>think</em> I understand, and it turns out that my understanding of the subject was somewhat nebulous. Turns out it’s pretty easy to convince ourselves that we understand something.</p>

<p>The act of writing stuff down helps cement my own understanding — words are usually not as nebulous as thoughts so I’m forced to figure out little details.</p>

<p>I recall when I wrote my post on <a href="https://manishearth.github.io/blog/2015/05/30/how-rust-achieves-thread-safety/">how Rust’s thread safety guarantees work</a>, I <em>thought</em> I understood <code class="language-plaintext highlighter-rouge">Send</code> and <code class="language-plaintext highlighter-rouge">Sync</code> in Rust. I understood what they did, but I didn’t have a clear mental model for them. I obtained this mental model through the process of writing the post; to be able to explain it to others I had to first explain it to myself.</p>

<p>I point out this post in particular because this was both one of the first posts for me where I’d noticed this, and, more importantly, my more concrete mental model led to me <a href="https://github.com/rust-lang/rust/issues/25894">finding a soundness bug in Rust’s standard library</a>. When I was thinking about my mental model I realized “an impl that looks like this should never exist”,
so I grepped the source code and found one<sup id="fnref:11" role="doc-noteref"><a href="#fn:11" class="footnote" rel="footnote">1</a></sup>.</p>

<p>I’ve even noticed a difference between one-on-one explaining and explaining things through blog posts. I <em>love</em> explaining things one-on-one, it’s much easier to tailor the explanation to the other person’s background,
as well as what they’re actually asking for help with. Plus, it’s interactive. A <em>lot</em> of my posts are of the “okay I get this question a lot I’m going to write down the answer so I don’t have to repeat myself” kind and I’ve found that I’ve often learned things from these despite having talked about the thing in the article contents multiple times.</p>

<p>I guess it’s basically that blogging is inherently one-many — you’re trying to explain to a whole group of people with varied backgrounds — which means you need to cover all your bases<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">2</a></sup> and explain everything together instead of the minimum necessary.</p>

<h2 id="its-really-fun-to-revisit-old-posts">It’s really fun to revisit old posts!</h2>

<p>Okay, I’ll admit that I never really write blog posts with this in mind. But when I <em>do</em> reread them, I’m usually quite thankful I wrote them!</p>

<p>I’m a fan of rereading in general, I’ve reread most of my favorite books tens of times; I make a yearly pilgrimage to <a href="https://mickens.seas.harvard.edu/wisdom-james-mickens">James Mickens’ website</a>; I reread many of my favorite posts and articles on the internet; and I often reread my <em>own</em> posts from the past.</p>

<p>Sometimes I’ll do it because I want a refresher in a topic. Sometimes I’ll do it because I’m bored. Whatever the reason, it’s always been a useful and fun thing to do.</p>

<p>Rereading old posts is a great way to transport myself back to my mindset from when I wrote the post. It’s easy to see progress in my understanding of things as well as in my writing. It’s interesting to note what I thought super important to include in the post <em>then</em> that I consider totally obvious <em>now</em><sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">3</a></sup>. It’s interesting to relearn what I’ve forgotten. It’s reassuring to realize that my terrible jokes were just as terrible as they are now.</p>

<p>One of my favorite posts to reread is <a href="https://manishearth.github.io/blog/2016/03/05/exploring-zero-knowledge-proofs/">this really long one on generalized zero knowledge proofs</a>. It’s the longest post I’ve written so far<sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">4</a></sup>, and it’s on a topic I don’t deal with often — cryptography. Not only does it help put me back in a mindset for thinking about cryptography, it’s about something super interesting but also complicated enough that rereading the post is like learning it all over again.</p>

<h2 id="it-lets-me-exercise-a-different-headspace">It lets me exercise a different headspace!</h2>

<p>I like programming a lot, but if programming was <em>all</em> I did, I’d get tired pretty quickly. When I was a student learning physics I’d often contribute to open source in my spare time, but now I write code full time so I’m less inclined to do it in my free time<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote" rel="footnote">5</a></sup>.</p>

<p>But I still sometimes feel like doing programmery things in my spare time just … not programming.</p>

<p>Turns out that blogging doesn’t tire me out the same way! I’m sure that if I spent the whole day writing I’d not want to write when I go home, but I don’t spend the whole day writing, so it’s all good. It’s refreshing to sit down to write a blog post and discover a fresh reserve of energy. I’m not sure if this is the right term, but I usually call this “using a different headspace”.</p>

<p>I’ve also started using this to plan my work, I mix up the kinds of headspace I’m employing for various tasks so that I feel energetic throughout the day.</p>

<p>This is also why I really enjoy mentoring — mentoring often requires the same effort from me as fixing it myself, but it’s a different headspace I’m employing so it’s less tiring.</p>

<h2 id="blogging-lets-me-be-lazy">Blogging lets me be lazy!</h2>

<p>I find myself explaining the same things often. I like helping folks and explaining things, but I’m also lazy<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">6</a></sup>, so writing stuff down really makes stuff easy for me! If folks ask me a question I can give a quick answer and then go “if you want to learn more, I’ve written about it here!”. If folks are asking a question a lot, there’s probably something missing in the documentation or learning materials about it. Some things can be fixed upstream in documentation, but other things, like <a href="https://manishearth.github.io/blog/2017/05/14/mentally-modelling-modules/">“how should I reason about modules in Rust?”</a>, deserve to be tackled as a separate problem and addressed with their own post.</p>

<p>(Yes, this post is in this category!)</p>

<h2 id="its-okay-if-folks-have-written-about-it-before">It’s okay if folks have written about it before!</h2>

<p>A common question I’ve gotten is “Well I can write about X but … there are a lot of other posts out there about it, should I still?”</p>

<p>Yes!!</p>

<p>People think differently, people learn differently, and people come from different backgrounds. Existing posts may be useful for some folks but less useful for others.</p>

<p>My personal rule of thumb is that if it took <em>me</em> some effort to understand something after reading about it, that’s something worth writing about, so it’s easier to understand for others like me encountering the subject.</p>

<p>One of my favorite bloggers, <a href="https://jvns.ca/">Julia Evans</a>, very often writes posts explaining computer concepts. Most of the time these have been explained before in other blog posts or manuals. But that doesn’t matter: her posts are <em>different</em>, and they’re <em>amazing</em>. They’re upbeat, fun to read, and often get me excited to learn more about things I knew about but never really looked at closely before.</p>

<h2 id="i-kinda-feel-its-my-duty-to">I kinda feel it’s my duty to?</h2>

<p>There’s a quote by Toni Morrison I quite enjoy:</p>

<blockquote>
  <p>I tell my students, ‘When you get these jobs that you have been so brilliantly trained for, just remember that your real job is that if you are free, you need to free somebody else. If you have some power, then your job is to empower somebody else. This is not just a grab-bag candy game.’</p>
</blockquote>

<p>I enjoy it so much I <a href="https://manishearth.github.io/rustfest-slides/#/13">concluded my talk at RustFest Kyiv with it</a>!</p>

<p>I have the privilege of having time to do things like blogging and mentoring. Given that, I feel that it really is my duty to share what I know as much as possible; to help others attempting to tread the path I’m treading; and to battle against tribal knowledge.</p>

<p>When it comes to programming I’m mostly “self-taught”. But when I say that, I really mean that I wasn’t taught in a traditional way by other humans — I learned things by trying stuff out and <em>reading what others had written</em>. I didn’t learn Rust by taking <code class="language-plaintext highlighter-rouge">rustc</code> and pretending to be a fuzzer and just trying random nonsense till stuff made sense, I went through the tutorial (and <em>then</em> started exploring by trying random stuff). I didn’t figure out cool algorithms by discovering them from first principles, I picked them up from books and blog posts. I’m “self-taught” because I’ve been in charge of my learning process, but I’ve definitely relied on the work of other people throughout this process.</p>

<p>This means that for me, personally, knowledge-sharing is especially important. If I had to spend time figuring something out, I should make it easier for the next people to try<sup id="fnref:10" role="doc-noteref"><a href="#fn:10" class="footnote" rel="footnote">7</a></sup>.</p>

<p>(All this said, I probably don’t blog as much as I <em>should</em>)</p>

<h2 id="you-should-blog-too">You should blog too!</h2>

<p>I wish everyone wrote more. I know not everyone has the time/privilege to do this, but if you do, I urge you to start!</p>

<p>I feel like tips on <em>how</em> to blog would fill up an entire other blog post, but Julia Evans has <a href="https://jvns.ca/blog/2016/05/22/how-do-you-write-blog-posts//">multiple</a> <a href="https://jvns.ca/blog/2017/03/20/blogging-principles/">posts</a> on this that I strongly recommend. Feel free to ask me for review on posts!</p>

<p>As for the technicalities of setting up a blog, my colleague Emily recently <a href="https://www.emilykager.com/writing/2018/07/27/myo-website.html">wrote a great post about doing this with Jekyll</a>. This blog uses <a href="http://octopress.org">Octopress</a> which is similar to set up.</p>

<p><em>Thanks to <a href="https://twitter.com/arshia__">Arshia</a>, <a href="https://twitter.com/QuietMisdreavus">QuietMisdreavus</a>, and <a href="https://twitter.com/myrrlyn">Alex</a> for reviewing drafts of this blog post.</em></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:11" role="doc-endnote">
      <p>Who needs to <a href="https://www.ralfj.de/blog/2017/06/09/mutexguard-sync.html">look for unsoundness with rigorous formal verification</a> when you have <code class="language-plaintext highlighter-rouge">grep</code>? <a href="#fnref:11" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>Incidentally, I find there’s a similar dynamic when it comes to forum discussions vs hashing things out one-on-one, it’s way harder to get anywhere with forum discussions because they’re one-many and you have to put in that much more work to empathize with everyone else and also phrase things in a way that is resilient to accidental misinterpretation. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:5" role="doc-endnote">
      <p>This is especially important as I get more and more “used” to subjects I’m familiar with – it’s easy to lose the ability to explain things when I think half of it is obvious. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:6" role="doc-endnote">
      <p>This is probably the <em>real</em> reason I love rereading it — I like being verbose and would nest parentheses and footnotes if society let me <a href="#fnref:6" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:8" role="doc-endnote">
      <p>I also am in general less inclined to do technical things in my free time and have a better work-life balance, glad that worked out! <a href="#fnref:8" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:1" role="doc-endnote">
      <p>See blog title <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:10" role="doc-endnote">
      <p>One of my former title ideas for this post was “Knowledge is Theft”, riffing off of this concept, but I felt that was a bit too tongue-in-cheek. <a href="#fnref:10" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The Future of Clippy]]></title>
    <link href="http://manishearth.github.io/blog/2018/06/05/the-future-of-clippy-the-rust-linter/"/>
    <updated>2018-06-05T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2018/06/05/the-future-of-clippy-the-rust-linter</id>
    <content type="html"><![CDATA[<p>We’ve recently been making lots of progress on future plans for <a href="https://github.com/rust-lang-nursery/rust-clippy">clippy</a> and I
thought I’d post an update.</p>

<p>For some background, Clippy is the linter for Rust. We have more than 250 lints, and
are steadily growing.</p>

<h2 id="clippy-and-nightly">Clippy and Nightly</h2>

<p>Sadly, Clippy has been nightly-only for a very long time. The reason behind this is
that to perform its analyses it hooks into the compiler so that it doesn’t have to
reimplement half the compiler to get things like type information. But
these are internal APIs and as such will never stabilize, so Clippy needs to be
used with nightly Rust.</p>

<p>We’re hoping this will change soon! The plan is that Clippy will eventually
be distributed by Rustup, so something like <code class="language-plaintext highlighter-rouge">rustup component add clippy</code> will
get you the clippy binary.</p>

<p>The first steps are <a href="https://github.com/rust-lang/rust/pull/51122">happening</a>, we’re planning on setting it up so that when it compiles
Rustup will be able to fetch a clippy component (however this won’t be the recommended way
to use clippy until we figure out the workflow here, so sit tight!)</p>

<p>Eventually, clippy will probably block nightlies<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>; and after a bunch of cycles of letting that
work itself out, hopefully clippy will be available with the stable compiler. There’s a lot of
stuff that needs to be figured out, and we want to do this in a way that minimally impacts
compiler development, so this may move in fits and starts.</p>

<h2 id="lint-audit">Lint audit</h2>

<p>A couple months ago <a href="https://github.com/oli-obk">Oliver</a> and I<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup> did a <a href="https://github.com/rust-lang-nursery/rust-clippy/pull/2579">lint audit</a> in Clippy. Previously,
clippy lints were classified as simply “clippy”, “clippy_pedantic”, and “restriction”.
“restriction” was for allow-by-default lints for things which are generally not a problem but may
be something you specifically want to forbid based on the situation, and “pedantic”
was for all the lints which were allow-by-default for other reasons.</p>

<p>Usually these reasons included stuff like “somewhat controversial lint” or “lint is very buggy”,
or the lint was actually exceedingly pedantic and might only be wanted by folks
who very seriously prefer their code to be <em>perfect</em>.</p>

<p>We had a lot of buggy lints, and these categories weren’t as helpful. People use clippy
for different reasons. Some folks only care about clippy catching bugs, whereas others want
its help enforcing the general “Rust Style”.</p>

<p>So we came up with a better division of lints:</p>

<ul>
  <li>Correctness (Deny): Probable bugs, e.g. calling <code class="language-plaintext highlighter-rouge">.clone()</code> on <code class="language-plaintext highlighter-rouge">&amp;&amp;T</code>, which clones the (<code class="language-plaintext highlighter-rouge">Copy</code>) reference and not the actual type</li>
  <li>Style (Warn): Style issues; where the fix usually doesn’t semantically change the code. For example, having a method named <code class="language-plaintext highlighter-rouge">into_foo()</code> that doesn’t take <code class="language-plaintext highlighter-rouge">self</code> by-move</li>
  <li>Complexity (Warn): For detecting unnecessary code complexities and helping simplify them. For example, replacing <code class="language-plaintext highlighter-rouge">.filter(..).next()</code> with <code class="language-plaintext highlighter-rouge">.find(..)</code></li>
  <li>Perf (Warn): Detecting potential performance footguns, like using <code class="language-plaintext highlighter-rouge">Box&lt;Vec&lt;T&gt;&gt;</code> or calling <code class="language-plaintext highlighter-rouge">.or(foo())</code> instead of <code class="language-plaintext highlighter-rouge">.or_else(foo)</code>.</li>
  <li>Pedantic (Allow): Controversial or exceedingly pedantic lints</li>
  <li>Nursery (Allow): For lints which are buggy or need more work</li>
  <li>Cargo (Allow): Lints about your Cargo setup</li>
  <li>Restriction (Allow): Lints for things which are not usually a problem, but may be something specific situations may dictate disallowing.</li>
</ul>

<p>and applied it to the codebase. You can see the results on our <a href="https://rust-lang-nursery.github.io/rust-clippy/master/index.html">lint list</a>.</p>

<p>Some lints could belong in more than one group, and we picked the best one in that case. Feedback welcome!</p>
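<p>To make a couple of these categories concrete, here is a small sketch of the Complexity and Perf examples mentioned in the list. The function names and the <code class="language-plaintext highlighter-rouge">fallback</code> helper are made up for illustration; only the <code class="language-plaintext highlighter-rouge">filter</code>/<code class="language-plaintext highlighter-rouge">find</code> and <code class="language-plaintext highlighter-rouge">or</code>/<code class="language-plaintext highlighter-rouge">or_else</code> pairs come from the lint descriptions.</p>

```rust
// Complexity: `.filter(..).next()` stops at the first match...
fn first_large_verbose(values: &[u32]) -> Option<u32> {
    values.iter().copied().filter(|v| *v > 3).next()
}

// ...which is what `.find(..)` expresses directly.
fn first_large_simple(values: &[u32]) -> Option<u32> {
    values.iter().copied().find(|v| *v > 3)
}

// Hypothetical stand-in for a costly fallback computation.
fn fallback() -> Option<u32> {
    Some(0)
}

// Perf: the argument to `.or(..)` is evaluated even when `cached` is `Some`.
fn eager(cached: Option<u32>) -> Option<u32> {
    cached.or(fallback())
}

// `.or_else(..)` only calls `fallback` when `cached` is `None`.
fn lazy(cached: Option<u32>) -> Option<u32> {
    cached.or_else(fallback)
}
```

<p>Both versions in each pair return the same value; the lints are about readability in the first case and about avoiding needless work in the second.</p>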

<h2 id="clippy-10">Clippy 1.0</h2>

<p>In the run up to making Clippy a rustup component we’d like to do a 1.0 release of Clippy. This involves an RFC,
and pinning down an idea of stability.</p>

<p>The general plan we have right now is to have the same idea of lint stability as rustc; essentially
we do not guarantee stability under <code class="language-plaintext highlighter-rouge">#[deny(lintname)]</code>. This is mostly fine since <code class="language-plaintext highlighter-rouge">deny</code> only affects
the current crate (dependencies have their lints capped) so at most you’ll be forced to slap on an <code class="language-plaintext highlighter-rouge">allow</code>
somewhere after a rustup.</p>

<p>Specifically, this means that we’ll never remove lints. We may recategorize them, or “deprecate” them
(which makes the lint do nothing, but keeps the name around so that <code class="language-plaintext highlighter-rouge">#[allow(lintname)]</code> doesn’t break the build
aside from emitting a warning).</p>
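<p>As a rough sketch of what this looks like in a crate: the function names here are hypothetical, and the <code class="language-plaintext highlighter-rouge">clippy::</code> prefix is the tool-lint spelling used by current toolchains, which is an assumption on top of the generic <code class="language-plaintext highlighter-rouge">lintname</code> notation above.</p>

```rust
// `deny` turns a lint into a hard error for the item it is attached to.
// This one is a rustc lint; nothing here triggers it.
#[deny(unused_variables)]
fn strict_sum(a: u32, b: u32) -> u32 {
    a + b
}

// If a toolchain update makes a lint start firing on existing code,
// the escape hatch is a local `allow` (clippy lint name assumed current).
#[allow(clippy::needless_return)]
fn relaxed_answer() -> u32 {
    return 42;
}
```

<p>Since lint levels only apply to the crate being compiled, the worst case after an update is adding an attribute like the second one.</p>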

<p>We’ll also not change what individual lints do fundamentally. The kinds of changes you can expect are:</p>

<ul>
  <li>Entirely new lints</li>
  <li>Fixing false positives (a lint may no longer lint in a buggy case)</li>
  <li>Fixing false negatives (A case where the lint <em>should</em> be linting but doesn’t is fixed)</li>
  <li>Bugfixes (When the lint panics or does something otherwise totally broken)</li>
</ul>

<p>When we fix false negatives, the newly linted cases will usually be ones that sit comfortably within the
scope of the lint as documented/named.</p>

<p>I’ll be posting an RFC soonish that both contains this general plan of stability, as well as a list of the current
lint categorization for folks to discuss.</p>

<hr />

<p>Anyway, thought I’d just post a general update on everything, since stuff’s changing quickly.</p>

<p>It will still take time for stable or even just reliably rustuppable nightly clippy to happen, but the path to it is pretty clear now!</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>As in, if clippy is broken there will not be a nightly that day. Rustfmt and RLS work this way right now AIUI. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>Okay, mostly Oliver <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Down a Rusty Rabbit Hole]]></title>
    <link href="http://manishearth.github.io/blog/2018/04/12/down-a-rusty-rabbit-hole/"/>
    <updated>2018-04-12T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2018/04/12/down-a-rusty-rabbit-hole</id>
    <content type="html"><![CDATA[<p>Last week I fell down a rather interesting rabbit hole in Rust, which was basically
me discovering a series of quirks of the Rust compiler/language, each one leading to the
next when I asked “why?”.</p>

<p>It started when someone asked why autogenerated <code class="language-plaintext highlighter-rouge">Debug</code> impls use argument names like <code class="language-plaintext highlighter-rouge">__arg_0</code>
which start with a double underscore.</p>

<p>This happened to be <a href="https://github.com/rust-lang/rust/pull/32294">my fault</a>. The reason <a href="https://github.com/rust-lang/rust/pull/32251#issuecomment-197481726">we used a double underscore</a> was that
while a single underscore tells rustc not to warn about a possibly-unused variable, there’s an
off-by-default clippy lint that warns about variables that start with a single underscore that are used,
which can be silenced with a double underscore. Now, the correct fix here is to make the lint ignore
derive/macros (which I believe we did as well), but at the time we needed to add an underscore
anyway so a double underscore didn’t seem worse.</p>
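<p>For reference, the underscore convention looks roughly like this in hand-written code. The function and argument names are invented for the example, not taken from the derive output.</p>

```rust
// A single leading underscore tells rustc not to warn that `_hint` is unused.
fn describe(_hint: u8, value: u8) -> String {
    format!("value = {}", value)
}

// Actually *using* an underscore-prefixed name is legal Rust; this is the
// situation the off-by-default clippy lint complains about.
fn double(_value: u8) -> u8 {
    _value * 2
}
```

<p>Generated code often can’t know ahead of time whether an argument will end up used, which is why derives lean on prefixes like this.</p>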

<p>Except of course, this double underscore appears in the docs. Oops.</p>

<p>Ideally the rustc derive infrastructure would have a way of specifying the argument name to use so
that we can at least have descriptive things here, but that’s a bit more work (I’m willing to mentor
this work though!). So I thought I’d fix this by at least removing the double underscore, and making
the unused lint ignore <code class="language-plaintext highlighter-rouge">#[derive()]</code> output.</p>

<p>While going through the code to look for underscores I also discovered a hygiene issue. The following code
throws a bunch of very weird type errors:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">pub</span> <span class="k">const</span> <span class="n">__cmp</span><span class="p">:</span> <span class="nb">u8</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>

<span class="nd">#[derive(PartialOrd,</span> <span class="nd">PartialEq)]</span>
<span class="k">pub</span> <span class="k">enum</span> <span class="n">Foo</span> <span class="p">{</span>
    <span class="nf">A</span><span class="p">(</span><span class="nb">u8</span><span class="p">),</span> <span class="nf">B</span><span class="p">(</span><span class="nb">u8</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>(<a href="https://play.rust-lang.org/?gist=2352b6a2192f38caba70bc2b1fa889e7&amp;version=stable">playpen</a>)</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error[E0308]: mismatched types
 --&gt; src/main.rs:6:7
  |
6 |     A(u8), B(u8)
  |       ^^^ expected enum `std::option::Option`, found u8
  |
  = note: expected type `std::option::Option&lt;std::cmp::Ordering&gt;`
             found type `u8`
.....
</code></pre></div></div>

<p>This is because the generated code for PartialOrd contains the following:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">match</span> <span class="n">foo</span><span class="nf">.cmp</span><span class="p">(</span><span class="n">bar</span><span class="p">)</span> <span class="p">{</span>
    <span class="nf">Some</span><span class="p">(</span><span class="nn">Ordering</span><span class="p">::</span><span class="n">Equal</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="o">.....</span><span class="p">,</span>
    <span class="n">__cmp</span> <span class="k">=&gt;</span> <span class="n">__cmp</span><span class="p">,</span>
<span class="p">}</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">__cmp</code> can both be a binding to a wildcard pattern match as well as a match against a constant
named <code class="language-plaintext highlighter-rouge">__cmp</code>, and in the presence of such a constant it resolves to the constant, causing
type errors.</p>
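<p>A minimal sketch of the resolution rule at play here, using a hypothetical <code class="language-plaintext highlighter-rouge">LIMIT</code> constant and <code class="language-plaintext highlighter-rouge">classify</code> function that are not part of the derive output: an identifier in pattern position is treated as a constant if one with that name is in scope, and as a fresh binding otherwise.</p>

```rust
const LIMIT: u8 = 1;

fn classify(x: u8) -> &'static str {
    match x {
        // `LIMIT` resolves to the constant above, so this arm only matches 1.
        LIMIT => "equal to LIMIT",
        // No constant named `other` is in scope, so this is a fresh binding
        // that can match any remaining value.
        other if other > LIMIT => "above LIMIT",
        _ => "below LIMIT",
    }
}

fn main() {
    println!("{}", classify(1));
    println!("{}", classify(5));
}
```

<p>If <code class="language-plaintext highlighter-rouge">LIMIT</code> were a binding instead, the first arm would swallow every value; the derive output hits the opposite surprise, where an intended binding silently turns into a constant match.</p>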

<p>One way to fix this is to bind <code class="language-plaintext highlighter-rouge">foo.cmp(bar)</code> to some temporary variable <code class="language-plaintext highlighter-rouge">x</code> and use that directly in
a <code class="language-plaintext highlighter-rouge">_ =&gt; x</code> branch.</p>
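<p>Roughly, the temporary-variable shape looks like the following hand-written sketch. The function <code class="language-plaintext highlighter-rouge">compare_pairs</code> and the variable <code class="language-plaintext highlighter-rouge">first</code> are made up for illustration and are not what the derive actually emits.</p>

```rust
use std::cmp::Ordering;

// Hypothetical stand-in for derive output: compare two pairs field by field.
fn compare_pairs(a: (u8, u8), b: (u8, u8)) -> Option<Ordering> {
    // Bind the first comparison to a temporary instead of naming it in a pattern.
    let first = a.0.partial_cmp(&b.0);
    match first {
        // The first fields tie, so the second fields decide.
        Some(Ordering::Equal) => a.1.partial_cmp(&b.1),
        // Anything else is the answer; `_` can never be mistaken for a constant.
        _ => first,
    }
}

fn main() {
    println!("{:?}", compare_pairs((1, 2), (1, 3)));
}
```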

<p>I thought I could be clever and try <code class="language-plaintext highlighter-rouge">cmp @ _ =&gt; cmp</code> instead. <code class="language-plaintext highlighter-rouge">match</code> supports syntax where you can
do <code class="language-plaintext highlighter-rouge">foo @ &lt;pattern&gt;</code>, where <code class="language-plaintext highlighter-rouge">foo</code> is bound to the entire matched variable. The <code class="language-plaintext highlighter-rouge">cmp</code> here is unambiguously
a binding; it cannot be a pattern. So no conflicting with the <code class="language-plaintext highlighter-rouge">const</code>, problem solved!</p>
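<p>For readers who haven’t met <code class="language-plaintext highlighter-rouge">@</code> patterns, here is a small sketch of the syntax on its own. The names <code class="language-plaintext highlighter-rouge">describe</code>, <code class="language-plaintext highlighter-rouge">digit</code> and <code class="language-plaintext highlighter-rouge">rest</code> are illustrative, and the example assumes no constants with those names are in scope.</p>

```rust
fn describe(n: u8) -> String {
    match n {
        // `digit` is bound to the whole matched value when it falls in 0..=9.
        digit @ 0..=9 => format!("digit {}", digit),
        // `rest @ _` binds the value while matching anything at all.
        rest @ _ => format!("non-digit {}", rest),
    }
}

fn main() {
    println!("{}", describe(7));
    println!("{}", describe(42));
}
```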

<p>So I made <a href="https://github.com/rust-lang/rust/pull/49676">a PR for both removing the underscores and also fixing this</a>. The change for <code class="language-plaintext highlighter-rouge">__cmp</code>
is no longer in that PR, but you can find it <a href="https://github.com/Manishearth/rust/commit/partial-cmp-hygiene">here</a>.</p>

<p>Except I hit a problem. With that PR, the following still breaks:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">pub</span> <span class="k">const</span> <span class="n">cmp</span><span class="p">:</span> <span class="nb">u8</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>

<span class="nd">#[derive(PartialOrd,</span> <span class="nd">PartialEq)]</span>
<span class="k">pub</span> <span class="k">enum</span> <span class="n">Foo</span> <span class="p">{</span>
    <span class="nf">A</span><span class="p">(</span><span class="nb">u8</span><span class="p">),</span> <span class="nf">B</span><span class="p">(</span><span class="nb">u8</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>throwing a slightly cryptic error:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error[E0530]: match bindings cannot shadow constants
 --&gt; test.rs:9:7
  |
4 | pub const cmp: u8 = 1;
  | ---------------------- a constant `cmp` is defined here
...
9 |     B(u8)
  |       ^^^ cannot be named the same as a constant
</code></pre></div></div>

<p>You can see a reduced version of this error in the following code:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">pub</span> <span class="k">const</span> <span class="n">cmp</span> <span class="p">:</span> <span class="nb">u8</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>

<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">match</span> <span class="mi">1</span> <span class="p">{</span>
        <span class="n">cmp</span> <span class="o">@</span> <span class="n">_</span> <span class="k">=&gt;</span> <span class="p">()</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>(<a href="https://play.rust-lang.org/?gist=feebbc048b47c286d5720b9926c6925e&amp;version=stable">playpen</a>)</p>

<p>Huh. Wat. Why? <code class="language-plaintext highlighter-rouge">cmp @ _</code> seems to be pretty unambiguous, what’s wrong with it shadowing a constant?</p>

<p>Turns out bindings cannot shadow constants at all, for a <a href="https://github.com/rust-lang/rust/issues/33118#issuecomment-233962221">rather subtle reason</a>:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">const</span> <span class="n">A</span><span class="p">:</span> <span class="nb">u8</span> <span class="o">=</span> <span class="o">...</span><span class="p">;</span> <span class="c1">// A_const</span>
<span class="k">let</span> <span class="n">A</span> <span class="o">@</span> <span class="n">_</span> <span class="o">=</span> <span class="o">...</span><span class="p">;</span> <span class="c1">// A_let</span>
<span class="k">match</span> <span class="o">..</span> <span class="p">{</span>
    <span class="n">A</span> <span class="k">=&gt;</span> <span class="o">...</span><span class="p">;</span> <span class="c1">// A_match</span>
<span class="p">}</span>
</code></pre></div></div>

<p>What happens here is that constants and variables occupy the same namespace. So <code class="language-plaintext highlighter-rouge">A_let</code> shadows
<code class="language-plaintext highlighter-rouge">A_const</code> here, and when we attempt to <code class="language-plaintext highlighter-rouge">match</code>, <code class="language-plaintext highlighter-rouge">A_match</code> is resolved to <code class="language-plaintext highlighter-rouge">A_let</code> and rejected (since
you can’t match against a variable), and <code class="language-plaintext highlighter-rouge">A_match</code> falls back to resolving as a fresh binding
pattern, instead of resolving to a pattern that matches against <code class="language-plaintext highlighter-rouge">A_const</code>.</p>

<p>This is kinda weird, so we disallow shadowing constants with variables. This is rarely a problem
because variables are lowercase and constants are uppercase. We could <em>technically</em> allow this
language-wise, but it’s hard on the implementation (and irrelevant in practice) so we don’t.</p>

<hr />

<p>So I dropped that fix. The temporary local variable approach is broken as well since
you can also name a constant the same as the local variable and have a clash (so again, you
need the underscores to avoid surprises).</p>

<p>But then I realized that we had an issue with removing the underscores from <code class="language-plaintext highlighter-rouge">__arg_0</code> as well.</p>

<p>The following code is also broken:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">pub</span> <span class="k">const</span> <span class="n">__arg_0</span><span class="p">:</span> <span class="nb">u8</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>

<span class="nd">#[derive(Debug)]</span>
<span class="k">struct</span> <span class="nf">Foo</span><span class="p">(</span><span class="nb">u8</span><span class="p">);</span>
</code></pre></div></div>

<p>(<a href="https://play.rust-lang.org/?gist=6e10fd8de1123c6f6f695c891e879f70&amp;version=stable">playpen</a>)</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error[E0308]: mismatched types
 --&gt; src/main.rs:3:10
  |
3 | #[derive(Debug)]
  |          ^^^^^ expected mutable reference, found u8
  |
  = note: expected type `&amp;mut std::fmt::Formatter&lt;'_&gt;`
             found type `u8`
</code></pre></div></div>

<p>You can see a reduced version of this error in the following code:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">pub</span> <span class="k">const</span> <span class="n">__arg_0</span><span class="p">:</span> <span class="nb">u8</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>

<span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">__arg_0</span><span class="p">:</span> <span class="nb">bool</span><span class="p">)</span> <span class="p">{}</span>
</code></pre></div></div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error[E0308]: mismatched types
 --&gt; src/main.rs:3:8
  |
3 | fn foo(__arg_0: bool) {}
  |        ^^^^^^^ expected bool, found u8
</code></pre></div></div>

<p>(<a href="https://play.rust-lang.org/?gist=2cf2c8b3520d5b343de1b76f80ea3fe7&amp;version=stable">playpen</a>)</p>

<p>This breakage is not an issue with the current code because of the double underscores – there’s a
very low chance someone will create a constant that is both lowercase and starts with a double
underscore. But it’s a problem when I remove the underscores since that chance shoots up.</p>

<p>Anyway, this failure is even weirder. Why are we attempting to match against the constant in the
first place? <code class="language-plaintext highlighter-rouge">fn</code> argument patterns<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> are irrefutable, i.e. all possible values of the type should match
the argument. For example, <code class="language-plaintext highlighter-rouge">fn foo(Some(foo): Option&lt;u8&gt;) {}</code> will fail to compile with
“refutable pattern in function argument: <code class="language-plaintext highlighter-rouge">None</code> not covered”.</p>
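<p>By contrast, patterns that can never fail are accepted in argument position. A short sketch, with <code class="language-plaintext highlighter-rouge">Point</code>, <code class="language-plaintext highlighter-rouge">add</code> and <code class="language-plaintext highlighter-rouge">manhattan</code> as made-up names:</p>

```rust
struct Point {
    x: i32,
    y: i32,
}

// A tuple pattern always matches a tuple of the same shape, so it is irrefutable.
fn add((a, b): (u8, u8)) -> u8 {
    a + b
}

// A struct pattern is irrefutable too: every `Point` has an `x` and a `y`.
fn manhattan(Point { x, y }: Point) -> i32 {
    x.abs() + y.abs()
}

fn main() {
    println!("{}", add((2, 3)));
    println!("{}", manhattan(Point { x: -3, y: 4 }));
}
```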

<p>There’s no point trying to match against constants here; because even if we find a constant it will be rejected
later. Instead, we can unambiguously resolve identifiers as new bindings, yes?</p>

<p>Right?</p>

<p>Firm in my belief, <a href="https://github.com/rust-lang/rust/issues/49680">I filed an issue</a>.</p>

<p>I was wrong, it’s <a href="https://github.com/rust-lang/rust/issues/49680#issuecomment-379029404">not going to always be rejected later</a>. With zero-sized types this
can totally still work:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">S</span><span class="p">;</span>

<span class="k">const</span> <span class="n">C</span><span class="p">:</span> <span class="n">S</span> <span class="o">=</span> <span class="n">S</span><span class="p">;</span>

<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">C</span> <span class="o">=</span> <span class="n">S</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Here because <code class="language-plaintext highlighter-rouge">S</code> has only one state, matching against a constant of the type is still irrefutable.</p>

<p>I argued that this doesn’t matter – since the type has a single value, it doesn’t matter whether we resolved to
a new binding or the constant; the value and semantics are the same.</p>

<p>This is true.</p>

<p>Except.</p>

<p><a href="https://github.com/rust-lang/rust/issues/49680#issuecomment-379032842">Except for when destructors come in</a>.</p>

<p>It was at this point that my table found itself in the perplexing state of being upside-down.</p>

<p>This is still really fine, zero-sized-constants-with-destructors is a pretty rare thing in Rust
and I don’t really see folks <em>relying</em> on this behavior.</p>

<p>However I later realized that this entire detour was pointless because even if we fix this, we end up
with a way for bindings to shadow constants. Which … which we already realized isn’t allowed by the
compiler till we fix some bugs.</p>

<p>Damn.</p>

<hr />

<p>The <em>actual</em> fix to the macro stuff is to use hygienic generated variable names, which the current
infrastructure supports. I plan to make a PR for this eventually.</p>

<p>But it was a very interesting dive into the nuances of pattern matching in Rust.</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>Yes, function arguments in Rust are patterns. You can totally do things like <code class="language-plaintext highlighter-rouge">(a, b): (u8, u8)</code> in function arguments (like you can do in <code class="language-plaintext highlighter-rouge">let</code>) <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Picking Apart the Crashing iOS String]]></title>
    <link href="http://manishearth.github.io/blog/2018/02/15/picking-apart-the-crashing-ios-string/"/>
    <updated>2018-02-15T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2018/02/15/picking-apart-the-crashing-ios-string</id>
    <content type="html"><![CDATA[<p>So there’s <a href="https://www.theverge.com/2018/2/15/17015654/apple-iphone-crash-ios-11-bug-imessage">yet another iOS text crash</a>, where just looking at a particular string crashes
iOS. Basically, if you put this string in any system text box (and other places), it crashes that
process. I’ve been testing it by copy-pasting characters into Spotlight so I don’t end up crashing
my browser.</p>

<p>The original sequence is U+0C1C U+0C4D U+0C1E U+200C U+0C3E, which is a sequence of Telugu
characters: the consonant ja (జ), a virama ( ్ ), the consonant nya (ఞ), a zero-width non-joiner, and
the vowel aa ( ా).</p>
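<p>If you want to poke at the sequence without pasting the literal text anywhere, a small sketch like this lists the code points from escapes. The helper name <code class="language-plaintext highlighter-rouge">code_points</code> is made up for this example.</p>

```rust
// Format each Unicode scalar value in `s` as U+XXXX.
fn code_points(s: &str) -> Vec<String> {
    s.chars().map(|c| format!("U+{:04X}", c as u32)).collect()
}

fn main() {
    // Escapes are used instead of the literal characters so the source stays inert.
    let telugu = "\u{0C1C}\u{0C4D}\u{0C1E}\u{200C}\u{0C3E}";
    println!("{}", code_points(telugu).join(" "));
    // Five scalar values, each taking three bytes in UTF-8.
    println!("{} chars, {} bytes", telugu.chars().count(), telugu.len());
}
```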

<p>I was pretty interested in what made this sequence “special”, and started investigating.</p>

<p>So first when looking into this, I thought that the &lt;ja, virama, nya&gt; sequence was the culprit.
That sequence forms a special ligature in many Indic scripts (ज्ञ in Devanagari) which is often
considered a letter of its own. However, the ligature for Telugu doesn’t seem very “special”.</p>

<p>Also, from some experimentation, this bug seemed to occur for <em>any</em> pair of Telugu consonants with
a vowel, as long as the vowel is not   ై (ai). Huh.</p>

<p>The ZWNJ must be doing something weird, then. &lt;consonant, virama, consonant, vowel&gt; is a
pretty common sequence in any Indic script; but ZWNJ before a vowel isn’t very useful for most
scripts (except for Bengali and Oriya, but I’ll get to that).</p>

<p>And then I saw that <a href="https://twitter.com/FakeUnicode/status/963300865762254848">there was a sequence in Bengali</a> that also crashed.</p>

<p>The sequence is U+09B8 U+09CD U+09B0 U+200C U+09C1, which is the consonant “so” (স), a virama ( ্ ),
the consonant “ro” (র), a ZWNJ, and vowel u (  ু).</p>

<p>Before we get too into this, let’s first take a little detour to learn how Indic scripts work:</p>

<h2 id="indic-scripts-and-consonant-clusters">Indic scripts and consonant clusters</h2>

<p>Indic scripts are <em>abugidas</em>, which means that their “letters” are consonants, to which you
can attach diacritics to change the vowel. By default, consonants have a base vowel.
So, for example, क is “kuh” (kə, often transcribed as “ka”), but I can change the vowel to make it के
(the “ka” in “okay”) or का (“kaa”, like “car”).</p>

<p>Usually, the default vowel is the ə sound, though not always (in Bengali it’s more of an o sound).</p>

<p>Because of the “default” vowel, you need a way to combine consonants. For example, if you wished to
write the word “ski”, you can’t write it as स + की (sa + ki = “saki”), you must write it as स्की.
What’s happened here is that the स got its vowel “killed”, and got tacked on to the की to form a
consonant cluster ligature.</p>

<p>You can <em>also</em> write this as स्‌की . That little tail you see on the स is known as a “virama”;
it basically means “remove this vowel”. Explicit viramas are sometimes used when there’s no easy way
to form a ligature, e.g. in ङ्‌ठ because there is no simple way to ligatureify ङ into ठ. Some scripts
also <em>prefer</em> explicit viramas, e.g. “ski” in Malayalam is written as സ്കീ, where the little crescent
is the explicit virama.</p>

<p>In Unicode, the virama character is always used to form a consonant cluster. So स्की is encoded as
&lt;स,  ्, क,  ी&gt;, or &lt;sa, virama, ka, i&gt;. If the font supports the cluster, it will show up
as a ligature; otherwise it will use an explicit virama.</p>
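<p>As a sketch of that encoding for Devanagari, the following builds a cluster by joining consonants with U+094D. The function <code class="language-plaintext highlighter-rouge">cluster</code> and its <code class="language-plaintext highlighter-rouge">explicit_virama</code> flag are hypothetical; the flag inserts a ZWNJ after each virama, which is how this post forces the explicit-virama form.</p>

```rust
const VIRAMA: char = '\u{094D}'; // Devanagari sign virama
const ZWNJ: char = '\u{200C}'; // zero-width non-joiner

// Join `consonants` with viramas and append `vowel_sign`.
// With `explicit_virama`, a ZWNJ follows each virama to request the visible form.
fn cluster(consonants: &[char], vowel_sign: char, explicit_virama: bool) -> String {
    let mut out = String::new();
    for (i, &c) in consonants.iter().enumerate() {
        if i > 0 {
            out.push(VIRAMA);
            if explicit_virama {
                out.push(ZWNJ);
            }
        }
        out.push(c);
    }
    out.push(vowel_sign);
    out
}

fn main() {
    // <sa, virama, ka, i>, i.e. "ski"
    println!("{}", cluster(&['\u{0938}', '\u{0915}'], '\u{0940}', false));
}
```

<p>Whether the result draws as a ligature or with a visible virama is still up to the font and the shaping engine; the string only records the request.</p>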

<p>For Devanagari and Bengali, <em>usually</em>, in a consonant cluster the first consonant is munged a bit and the second consonant stays intact.
There are exceptions – sometimes they’ll form an entirely new glyph (क + ष = क्ष), and sometimes both
glyphs will change (ड + ड = ड्ड, द + म = द्म, द + ब = द्ब). Those last ones should look like this in conjunct form:</p>

<p><img class="center" src="http://manishearth.github.io/images/post/unicode-crash/conjuncts.png" width="200" /></p>

<h2 id="investigating-the-bengali-case">Investigating the Bengali case</h2>

<p>Now, interestingly, unlike the Telugu crash, the Bengali crash seemed to only occur when the second
consonant is র (“ro”). However, I can trigger it for any choice of the first consonant or vowel, except
when the vowel is  ো (o) or  ৌ (au).</p>

<p>Now, র is an interesting consonant in some Indic scripts, including Devanagari. In Devanagari,
it looks like र (“ra”). However, it does all kinds of things when forming a cluster. If you’re having it
precede another consonant in a cluster, it forms a little feather-like stroke, like in र्क (rka). In Marathi,
that stroke can also look like a tusk, as in र्‍क. As a suffix consonant, it can provide a little
“extra leg”, as in क्र (kra). For letters without a vertical stroke, like ठ (tha), it does this caret-like thing,
ठ्र (thra).</p>

<p>Basically, while most consonants retain some of their form when put inside a cluster, र does not. And
a more special thing about र is that this happens even when र is the <em>second</em> consonant in a cluster – as I mentioned
before, for most consonant clusters the second consonant stays intact. While there are exceptions, they are usually
specific to the cluster; it is only र for which this happens for all clusters.</p>

<p>It’s similar in Bengali, র as the second consonant adds a tentacle-like thing on the existing consonant. For example,
প + র (po + ro) gives প্র (pro).</p>

<p>But it’s not just র that does this in Bengali, the consonant “jo” does as well. প + য (po + jo) forms প্য (pjo),
and the য is transformed into a wavy line called a “jophola”.</p>

<p>So I tried it with য, and it turns out that the Bengali crash occurs for য as well!
So the general Bengali case is &lt;consonant, virama, র OR য, ZWNJ, vowel&gt;, where the vowel is not   ো or  ৌ.</p>

<h2 id="suffix-joining-consonants">Suffix-joining consonants</h2>

<p>So we’re getting close, here. At least for Bengali, it occurs when the second consonant is such that it often
combines with the first consonant without modifying its form much.</p>

<p>In fact, this is the case for Telugu as well! Consonant clusters in Telugu are usually formed by preserving the
original consonant, and tacking the second consonant on below!</p>

<p>For example, the original crashy string contains the cluster జ + ఞ, which looks like జ్ఞ. The first letter isn’t
really modified, but the second is.</p>

<p>From this, we can guess that it will also occur for Devanagari with र. Indeed it does! U+0915 U+094D U+0930 U+200C U+093E, that is,
&lt;क,  ्, र, zwnj,  ा&gt; (&lt; ka, virama, ra, zwnj, aa &gt;) is one such crashing sequence.</p>

<p>But this isn’t really the whole story, is it? For example, the crash does occur for “kro” + zwnj + vowel in Bengali,
and in “kro” (ক্র = ক + র = ko + ro) the resultant cluster involves the munging of both the prefix and suffix. But
the crash doesn’t occur for द्ब or ड्ड. It seems to be specific to the letter, not the nature of the cluster.</p>

<p>Digging deeper, the reason is that for many fonts (presumably the ones in use), these consonants
form “suffix joining consonants”<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> (a term I made up) when preceded by a virama. This seems to
correspond to the <a href="https://docs.microsoft.com/en-us/typography/opentype/spec/features_pt#tag-pstf"><code class="language-plaintext highlighter-rouge">pstf</code> OpenType feature</a>, as well as <a href="https://docs.microsoft.com/en-us/typography/opentype/spec/features_uz#vatu"><code class="language-plaintext highlighter-rouge">vatu</code></a>.</p>

<p>For example, the sequence virama + क gives   ्क, i.e. it renders a virama with a placeholder followed by a क.</p>

<p>But, for र, virama + र renders  ्र, which for me looks like this:</p>

<p><img class="center" src="http://manishearth.github.io/images/post/unicode-crash/virama-ra.png" width="200" /></p>

<p>In fact, this is the case for the other consonants as well. For me,  ्र  ্র  ্য  ్ఞ  ్క
(Devanagari virama-ra, Bengali virama-ro, Bengali virama-jo, Telugu virama-nya, Telugu virama-ka)
all render as “suffix joining consonants”:</p>

<p><img class="center" src="http://manishearth.github.io/images/post/unicode-crash/virama-consonant.png" width="200" /></p>

<p>(This is true for all Telugu consonants, not just the ones listed).</p>

<p>An interesting bit is that the crash does not occur for &lt;र, virama, र, zwnj, vowel&gt;, because र-virama-र
uses the prefix-joining form of the first र (र्र). The same occurs for র with itself or ৰ or য. Because the virama
is “stickier” to the left in these cases, it doesn’t cause a crash. (h/t <a href="https://github.com/hackbunny">hackbunny</a> for discovering this
using a <a href="https://github.com/hackbunny/viramarama">script</a> to enumerate all cases).</p>

<p>Kannada <em>also</em> has “suffix joining consonants”, but for some reason I cannot trigger the crash with it. Ya in Gurmukhi
is also suffix-joining.</p>

<h2 id="the-zwnj">The ZWNJ</h2>

<p>The ZWNJ is curious. The crash doesn’t happen without it, but as I mentioned before a ZWNJ before a vowel
doesn’t really <em>do</em> anything for most Indic scripts. In Indic scripts, a ZWNJ can be used to explicitly force a
virama if used after the virama (I used it to write स्‌की in this post), however that’s not how it’s being used here.</p>

<p>In Bengali and Oriya specifically, a ZWNJ can be used to force a different vowel form when used before a vowel
(e.g. রু vs র‌ু), however this bug seems to apply to vowels for which there is only one form, and this bug
also applies to other scripts where this isn’t the case anyway.</p>

<p>The exception vowels are interesting. They’re basically all vowels that are made up of <em>two</em> glyph components. Philippe Verdy
points out:</p>

<blockquote>
  <p>And why this bug does not occur with some vowels is because these are vowels in two parts,
that are first decomposed into two separate glyphs reordered in the buffer of glyphs, while
other vowels do not need this prior mapping and keep their initial direct mapping from their
codepoints in fonts, which means that this has to do to the way the ZWNJ looks for the glyphs
of the vowels in the glyphs buffer and not in the initial codepoints buffer: there’s some desynchronization,
and more probably an uninitialized data field (for the lookup made in handling ZWNJ) if no vowel decomposition was done
(the same data field is correctly initialized when it is the first consonnant which takes an alternate form before
a virama, like in most Indic consonnant clusters, because the a glyph buffer is created.</p>
</blockquote>

<h2 id="generalizing">Generalizing</h2>

<p>So, ultimately, the full set of cases that cause the crash is:</p>

<p>Any sequence <code class="language-plaintext highlighter-rouge">&lt;consonant1, virama, consonant2, ZWNJ, vowel&gt;</code> in Devanagari, Bengali, and Telugu, where:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">consonant2</code> is suffix-joining (<code class="language-plaintext highlighter-rouge">pstf</code>/<code class="language-plaintext highlighter-rouge">vatu</code>) – i.e. र, র, য, ৰ, and all Telugu consonants</li>
  <li><code class="language-plaintext highlighter-rouge">consonant1</code> is not a reph-forming letter like र/র (or a variant, like ৰ)</li>
  <li><code class="language-plaintext highlighter-rouge">vowel</code> does not have two glyph components, i.e. it is not   ై,   ো, or   ৌ</li>
</ul>
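<p>The conditions above can be written down as a rough predicate. This is only a sketch of the pattern as described in this post, not of whatever iOS actually does: the name <code class="language-plaintext highlighter-rouge">matches_crash_pattern</code> and the <code class="language-plaintext highlighter-rouge">Script</code> table are made up, and the code point ranges are coarse block ranges I am assuming are close enough (they include a few unassigned code points).</p>

```rust
const ZWNJ: char = '\u{200C}';

// Per-script data for the three conditions. Ranges are inclusive and approximate.
struct Script {
    consonants: (char, char),
    virama: char,
    vowel_signs: (char, char),
    // `None` means every consonant in the script is suffix-joining.
    suffix_joining: Option<&'static [char]>,
    reph_forming: &'static [char],
    two_part_vowels: &'static [char],
}

const SCRIPTS: &[Script] = &[
    // Devanagari: only ra is suffix-joining.
    Script {
        consonants: ('\u{0915}', '\u{0939}'),
        virama: '\u{094D}',
        vowel_signs: ('\u{093E}', '\u{094C}'),
        suffix_joining: Some(&['\u{0930}']),
        reph_forming: &['\u{0930}'],
        two_part_vowels: &[],
    },
    // Bengali: ro, jo and the Assamese ra variant.
    Script {
        consonants: ('\u{0995}', '\u{09B9}'),
        virama: '\u{09CD}',
        vowel_signs: ('\u{09BE}', '\u{09CC}'),
        suffix_joining: Some(&['\u{09B0}', '\u{09AF}', '\u{09F0}']),
        reph_forming: &['\u{09B0}', '\u{09F0}'],
        two_part_vowels: &['\u{09CB}', '\u{09CC}'],
    },
    // Telugu: all consonants are suffix-joining.
    Script {
        consonants: ('\u{0C15}', '\u{0C39}'),
        virama: '\u{0C4D}',
        vowel_signs: ('\u{0C3E}', '\u{0C4C}'),
        suffix_joining: None,
        reph_forming: &[],
        two_part_vowels: &['\u{0C48}'],
    },
];

fn in_range(c: char, (lo, hi): (char, char)) -> bool {
    lo <= c && c <= hi
}

// True if `s` is exactly <consonant1, virama, consonant2, ZWNJ, vowel>
// and satisfies the three listed conditions for one of the scripts.
fn matches_crash_pattern(s: &str) -> bool {
    let chars: Vec<char> = s.chars().collect();
    let (c1, virama, c2, zwnj, vowel) = match chars.as_slice() {
        [a, b, c, d, e] => (*a, *b, *c, *d, *e),
        _ => return false,
    };
    zwnj == ZWNJ
        && SCRIPTS.iter().any(|sc| {
            let c2_joins = match sc.suffix_joining {
                Some(list) => list.contains(&c2),
                None => in_range(c2, sc.consonants),
            };
            in_range(c1, sc.consonants)
                && !sc.reph_forming.contains(&c1)
                && virama == sc.virama
                && c2_joins
                && in_range(vowel, sc.vowel_signs)
                && !sc.two_part_vowels.contains(&vowel)
        })
}

fn main() {
    // The Devanagari example from this post: <ka, virama, ra, ZWNJ, aa>.
    println!("{}", matches_crash_pattern("\u{0915}\u{094D}\u{0930}\u{200C}\u{093E}"));
}
```

<p>A check like this only looks at an isolated five-character string; spotting the pattern inside longer text would need a sliding window over the characters.</p>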

<p>This leaves one question open:</p>

<p>Why doesn’t it apply to Kannada? Or, for that matter, Khmer, which has a similar virama-like thing called a “coeng”?</p>

<h2 id="are-these-valid-strings">Are these valid strings?</h2>

<p>A recurring question I’m getting is whether these strings are valid in the language, or Unicode gibberish
like Zalgo text. Breaking it down:</p>

<ul>
  <li>All of the <em>rendered</em> glyphs are valid. The original Telugu one is the root of the word for
“knowledge” (and I’ve taken to calling this bug “forbidden knowledge” for that reason).</li>
  <li>In Telugu and Devanagari, there is no functional use of the ZWNJ as used before a vowel. It
should not be there, and one would not expect it in typical text.</li>
  <li>In Bengali (also Oriya), putting a ZWNJ before some vowels prevents them from ligatureifying, and this is
mentioned in the Unicode spec. However, it seems rare for native speakers to use this.</li>
  <li>In all of these scripts, putting a ZWNJ after viramas can be used to force an explicit virama
over a ligature. That is not the position in which the ZWNJ is used here, but it gives a hint that this
might have been a mistype. Doing this is <em>also</em> rare at least for Devanagari (and I believe
for the other two scripts as well).</li>
  <li>Android has an explicit key for ZWNJ on its keyboards for these languages<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>, right next to the spacebar. iOS has this as
well on the long-press of the virama key. <em>Very</em> easy to mistype, at least for Android.</li>
</ul>

<p>So while the crashing strings are usually invalid, and when not, very rare, they are easy enough to mistype.</p>

<p>An example by <a href="https://twitter.com/FakeUnicode">@FakeUnicode</a> was the string “For/k” (or “Foŕk”, if accents were easier to type). A
slash isn’t something you’d normally type there, and the produced string is gibberish, but it’s easy enough to type
by accident.</p>

<p>Except of course that the mistake in “For/k”/“Foŕk” is visually obvious and would be fixed; this
isn’t the case for most of the crashing strings.</p>

<h2 id="conclusion">Conclusion</h2>

<p>I don’t really have <em>one</em> guess as to what’s going on here – I’d love to see what people think – but my current
guess is that the “affinity” of the virama to the left instead of the right confuses the algorithm that handles ZWNJs after
viramas into thinking the ZWNJ applies to the virama (it doesn’t, there’s a consonant in between), and this leads to some numbers
not matching up and causing a buffer overflow or something. Philippe’s diagnosis of the vowel situation matches up with this.</p>

<p>An interesting thing is that I can cause this crash to happen more reliably in browsers by clicking on the string.</p>

<p>Additionally, <em>sometimes</em> it actually renders in Spotlight for a split second before crashing, which
means that either the crash isn’t deterministic, or it occurs in some process <em>after</em> rendering. I’m
not sure what to think of either. Looking at the backtraces, the crash seems to occur in different
places, so it’s likely that it’s memory corruption that gets uncovered later.</p>

<p>I’d love to hear if folks have further insight into this.</p>

<p>Update: Philippe on the Unicode mailing list has <a href="https://www.unicode.org/mail-arch/unicode-ml/y2018-m02/0103.html">an interesting theory</a></p>

<p><small>Yes, I could attach a debugger to the crashing process and investigate that instead, but that’s no fun 😂</small></p>
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>Philippe Verdy points out that these may be called “phala forms” at least for Bengali <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>I don’t think the Android keyboard <em>needs</em> this key; the keyboard seems very much a dump of “what does this unicode block let us do”, and includes things like Sindhi-specific or Kashmiri-specific characters for the Marathi keyboard as well as <em>extremely</em> archaic characters, whilst neglecting more common things like the eyelash reph (which doesn’t have its own code point but is a special unicode sequence; native speakers should not be expected to be aware of this sequence). <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A Rough Proposal for Sum Types in Go]]></title>
    <link href="http://manishearth.github.io/blog/2018/02/01/a-rough-proposal-for-sum-types-in-go/"/>
    <updated>2018-02-01T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2018/02/01/a-rough-proposal-for-sum-types-in-go</id>
    <content type="html"><![CDATA[<p>Sum types are pretty cool. Just like how a struct is basically “This contains one of these <em>and</em> one of these”,
a sum type is “This contains one of these <em>or</em> one of these”.</p>

<p>So for example, the following sum type in Rust:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">enum</span> <span class="n">Foo</span> <span class="p">{</span>
    <span class="nf">Stringy</span><span class="p">(</span><span class="nb">String</span><span class="p">),</span>
    <span class="nf">Numerical</span><span class="p">(</span><span class="nb">u32</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>or Swift:</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">enum</span> <span class="kt">Foo</span> <span class="p">{</span>
    <span class="k">case</span> <span class="nf">stringy</span><span class="p">(</span><span class="kt">String</span><span class="p">)</span>
    <span class="k">case</span> <span class="nf">numerical</span><span class="p">(</span><span class="kt">Int</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>would be one where it’s either <code class="language-plaintext highlighter-rouge">Foo::Stringy</code> (<code class="language-plaintext highlighter-rouge">Foo.stringy</code> in Swift), containing a <code class="language-plaintext highlighter-rouge">String</code>,
<em>or</em> <code class="language-plaintext highlighter-rouge">Foo::Numerical</code>, containing an integer.</p>

<p>This can be pretty useful. For example, messages between threads are often of a “this or that or that or that”
form.</p>

<p>The nice thing is, matching (switching) on these enums is usually <em>exhaustive</em> – you must list all
the cases (or include a default arm) for your code to compile. This leads to a useful component
of type safety – if you add a message to your message passing system, you’ll know where to update it.</p>

<p>Go doesn’t have these. Go <em>does</em> have interfaces, which are dynamically dispatched. The drawback here
is that you do not get the exhaustiveness condition, and consumers of your library can even add further
cases. (And, of course, dynamic dispatch can be slow). You <em>can</em> get exhaustiveness in Go with <a href="https://github.com/haya14busa/gosum">external tools</a>,
but it’s preferable to have such things in the language IMO.</p>
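<p>For comparison, the closest approximation in today’s Go that I know of is a “sealed” interface: give the interface an unexported marker method so that other packages can’t add cases by declaring that method themselves. The sketch below is a workaround, not part of the proposal; <code class="language-plaintext highlighter-rouge">isFoo</code>, <code class="language-plaintext highlighter-rouge">Stringy</code>, <code class="language-plaintext highlighter-rouge">Numerical</code> and <code class="language-plaintext highlighter-rouge">Describe</code> are made-up names, and struct embedding can still get around the seal.</p>

```go
package main

import "fmt"

// Foo is "closed" only by convention: the unexported marker method
// means other packages cannot declare isFoo on their own types.
type Foo interface {
	isFoo()
}

// The two variants, mirroring the Rust enum above.
type Stringy string
type Numerical uint32

func (Stringy) isFoo()   {}
func (Numerical) isFoo() {}

// Describe switches over the known variants. The compiler does not
// verify that every variant is listed, so a default arm is still needed.
func Describe(f Foo) string {
	switch v := f.(type) {
	case Stringy:
		return "stringy: " + string(v)
	case Numerical:
		return fmt.Sprintf("numerical: %d", v)
	default:
		return "unknown variant"
	}
}

func main() {
	fmt.Println(Describe(Stringy("hello")))
	fmt.Println(Describe(Numerical(42)))
}
```

<p>If a third variant is added later, this switch keeps compiling and silently falls into the <code class="language-plaintext highlighter-rouge">default</code> arm, which is exactly the gap that exhaustiveness checking would close.</p>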

<p>Many years ago when I was learning Go I wrote a <a href="http://inpursuitoflaziness.blogspot.in/2015/02/thoughts-of-rustacean-learning-go.html">blog post</a> about what I liked and disliked
as a Rustacean learning Go. Since then, I’ve spent a lot more time with Go, and I’ve learned to like each Go design decision that I initially
disliked, <em>except</em> for the lack of sum types. Most of my issues arose from “trying to program Rust in Go”,
i.e. using idioms that are natural to Rust (or other languages I’d used previously). Once I got used to the
programming style, I realized that aside from the lack of sum types I really didn’t find much missing
from the language. Perhaps improvements to error handling.</p>

<p>Now, my intention here isn’t really to sell sum types. They’re somewhat controversial for Go, and
there are good arguments on both sides. You can see one discussion on this topic <a href="https://github.com/golang/go/issues/19412">here</a>.
If I were to make a more concrete proposal I’d probably try to motivate this in much more depth. But even
I’m not very <em>strongly</em> of the opinion that Go needs sum types; I only have a slight preference for them.</p>

<p>Instead, I’m going to try and sketch this proposal for sum types that has been floating around my
mind for a while. I end up mentioning it often and it’s nice to have something to link to. Overall,
I think this “fits well” with the existing Go language design.</p>

<h2 id="the-proposal">The proposal</h2>

<p>The essence is pretty straightforward: Extend interfaces to allow for “closed interfaces”. These are
interfaces that are only implemented for a small list of types.</p>

<p>Writing the <code class="language-plaintext highlighter-rouge">Foo</code> sum type above would be:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">type</span> <span class="n">Foo</span> <span class="k">interface</span> <span class="p">{</span>
    <span class="n">SomeFunction</span><span class="p">()</span>
    <span class="n">OtherFunction</span><span class="p">()</span>
    <span class="k">for</span> <span class="kt">string</span><span class="p">,</span> <span class="kt">int</span>
<span class="p">}</span>
</code></pre></div></div>

<p>It doesn’t even need to have functions defined on it.</p>

<p>The interface functions can only be called if you have an interface object; they are not directly available
on variant types without explicitly casting (<code class="language-plaintext highlighter-rouge">Foo("...").SomeFunction()</code>).</p>

<p>(I’m not strongly for the <code class="language-plaintext highlighter-rouge">for</code> keyword syntax, it’s just a suggestion. The core idea is that
you define an interface and you define the types it closes over. Somehow.)</p>

<p>A better example would be an interface for a message-passing system for Raft:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">type</span> <span class="n">VoteRequest</span> <span class="k">struct</span> <span class="p">{</span>
    <span class="n">CandidateId</span> <span class="kt">uint</span>
    <span class="n">Term</span> <span class="kt">uint</span>
    <span class="c">// ...</span>
<span class="p">}</span>

<span class="k">type</span> <span class="n">VoteResponse</span> <span class="k">struct</span> <span class="p">{</span>
    <span class="n">Term</span> <span class="kt">uint</span>
    <span class="n">VoteGranted</span> <span class="kt">bool</span>
    <span class="n">VoterId</span> <span class="kt">uint</span>
<span class="p">}</span>

<span class="k">type</span> <span class="n">AppendRequest</span> <span class="k">struct</span> <span class="p">{</span>
    <span class="c">//...</span>
<span class="p">}</span>

<span class="k">type</span> <span class="n">AppendResponse</span> <span class="k">struct</span> <span class="p">{</span>
    <span class="c">//...</span>
<span class="p">}</span>
<span class="c">// ...</span>
<span class="k">type</span> <span class="n">RaftMessage</span> <span class="k">interface</span> <span class="p">{</span>
    <span class="k">for</span> <span class="n">VoteRequest</span><span class="p">,</span> <span class="n">VoteResponse</span><span class="p">,</span> <span class="n">AppendRequest</span><span class="p">,</span> <span class="n">AppendResponse</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Now, you use type switches for dealing with these:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">switch</span> <span class="n">value</span> <span class="o">:=</span> <span class="n">msg</span><span class="o">.</span><span class="p">(</span><span class="k">type</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">case</span> <span class="n">VoteRequest</span><span class="o">:</span>
        <span class="k">if</span> <span class="n">value</span><span class="o">.</span><span class="n">Term</span> <span class="o">&lt;=</span> <span class="n">me</span><span class="o">.</span><span class="n">Term</span> <span class="p">{</span>
            <span class="n">me</span><span class="o">.</span><span class="n">reject_vote</span><span class="p">(</span><span class="n">value</span><span class="o">.</span><span class="n">CandidateId</span><span class="p">)</span>
        <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
            <span class="n">me</span><span class="o">.</span><span class="n">accept_vote</span><span class="p">(</span><span class="n">value</span><span class="o">.</span><span class="n">CandidateId</span><span class="p">,</span> <span class="n">value</span><span class="o">.</span><span class="n">Term</span><span class="p">)</span>
        <span class="p">}</span>
    <span class="k">case</span> <span class="n">VoteResponse</span><span class="o">:</span> <span class="c">// ...</span>
    <span class="k">case</span> <span class="n">AppendRequest</span><span class="o">:</span> <span class="c">// ...</span>
    <span class="k">case</span> <span class="n">AppendResponse</span><span class="o">:</span> <span class="c">// ...</span>
<span class="p">}</span>
</code></pre></div></div>

<p>There is no need for the default case, unless you wish to leave one or more of the cases out.</p>

<p>Ideally, these could be implemented as inline structs instead of using dynamic dispatch. I’m not sure
what this entails for the GC design, but I’d love to hear thoughts on this.</p>

<p>We also make it possible to add methods to closed interfaces. This is in the spirit of
<a href="https://github.com/golang/go/issues/16254">this proposal</a>, where you allow</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">func</span> <span class="p">(</span><span class="n">message</span> <span class="n">RaftMessage</span><span class="p">)</span> <span class="n">Process</span><span class="p">(</span><span class="n">me</span> <span class="n">Me</span><span class="p">)</span> <span class="kt">error</span> <span class="p">{</span>
    <span class="c">// message handling logic</span>
<span class="p">}</span>
</code></pre></div></div>

<p>for closed interfaces.</p>

<p>This aligns more with how sum types are written and used in other languages; instead of assuming
that each method will be a <code class="language-plaintext highlighter-rouge">switch</code> on the variant, you can write arbitrary code that <em>may</em> <code class="language-plaintext highlighter-rouge">switch</code>
on the type but it can also just call other methods. This is really nice because you can write
methods in <em>both</em> ways – if it’s a “responsibility of the inner type” kind of method, require it in
the interface and delegate it to the individual types. If it’s a “responsibility of the interface”
method, write it as a method on the interface as a whole. I kind of wish Rust had this, because in Rust
you sometimes end up writing things like:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">match</span> <span class="n">foo</span> <span class="p">{</span>
    <span class="nn">Foo</span><span class="p">::</span><span class="nf">Stringy</span><span class="p">(</span><span class="n">s</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="n">s</span><span class="nf">.process</span><span class="p">(),</span>
    <span class="nn">Foo</span><span class="p">::</span><span class="nf">Numerical</span><span class="p">(</span><span class="n">n</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="n">n</span><span class="nf">.process</span><span class="p">(),</span>
    <span class="c1">// ...</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Yes, this would work better as a trait, but then you lose some niceties of Rust enums. With this
proposal Go can have it both ways.</p>

<hr />

<p>Anyway, thoughts? This is a really rough proposal, and I’m not sure how receptive other Gophers will be
to this, nor how complex its implementation would be. I don’t really intend to submit this as a formal proposal,
but if someone else wants to they are more than welcome to build on this idea.</p>

]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[What Are Tokio and Async IO All About?]]></title>
    <link href="http://manishearth.github.io/blog/2018/01/10/whats-tokio-and-async-io-all-about/"/>
    <updated>2018-01-10T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2018/01/10/whats-tokio-and-async-io-all-about</id>
    <content type="html"><![CDATA[<p>The Rust community lately has been focusing a lot on “async I/O” through the <a href="https://github.com/tokio-rs/">tokio</a>
project. This is pretty great!</p>

<p>But for many in the community who haven’t worked with web servers and related things it’s pretty
confusing as to what we’re trying to achieve there. When this stuff was being discussed around 1.0,
I was pretty lost as well, having never worked with this stuff before.</p>

<p>What’s all this Async I/O business about? What are coroutines? Lightweight threads? Futures? How
does this all fit together?</p>

<h2 id="what-problem-are-we-trying-to-solve">What problem are we trying to solve?</h2>

<p>One of Rust’s key features is “fearless concurrency”. But the kind of concurrency required for handling a
large amount of I/O bound tasks – the kind of concurrency found in Go, Elixir, Erlang – is absent
from Rust.</p>

<p>Let’s say you want to build something like a web service. It’s going to be handling thousands of
requests at any point in time (known as the “<a href="https://en.wikipedia.org/wiki/C10k_problem">c10k</a> problem”). In general, the problem we’re
considering is having a huge number of I/O bound (usually network I/O) tasks.</p>

<p>“Handling N things at once” is best done by using threads. But … <em>thousands</em> of threads? That
sounds a bit much. Threads can be pretty expensive: Each thread needs to allocate a large stack,
setting up a thread involves a bunch of syscalls, and context switching is expensive.</p>

<p>Of course, thousands of threads <em>all doing work</em> at once is not going to work anyway. You only
have a fixed number of cores, and at any one time only one thread will be running on a core.</p>

<p>But for cases like web servers, most of these threads won’t be doing work. They’ll be waiting on the
network. Most of these threads will either be listening for a request, or waiting for their response
to get sent.</p>

<p>With regular threads, when you perform a blocking I/O operation, the syscall hands control
to the kernel, which won’t yield control back right away, because the I/O operation is probably not finished.
Instead, it will use this as an opportunity to swap in a different thread, and will swap the original
thread back when its I/O operation is finished (i.e. it’s “unblocked”). Without Tokio and friends,
this is how you would handle such things in Rust. Spawn a million threads; let the OS deal with
scheduling based on I/O.</p>

<p>But, as we already discovered, threads don’t scale well for things like this<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>.</p>

<p>We need “lighter” threads.</p>

<h2 id="lightweight-threading">Lightweight threading</h2>

<p>I think the best way to understand lightweight threading is to forget about Rust for a moment
and look at a language that does this well, Go.</p>

<p>Instead of using OS threads, Go has lightweight threads, called “goroutines”. You spawn these with the <code class="language-plaintext highlighter-rouge">go</code>
keyword. A web server might do something like this:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">listener</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">net</span><span class="o">.</span><span class="n">Listen</span><span class="p">(</span><span class="o">...</span><span class="p">)</span>
<span class="c">// handle err</span>
<span class="k">for</span> <span class="p">{</span>
    <span class="n">conn</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">listener</span><span class="o">.</span><span class="n">Accept</span><span class="p">()</span>
    <span class="c">// handle err</span>

    <span class="c">// spawn goroutine:</span>
    <span class="k">go</span> <span class="n">handler</span><span class="p">(</span><span class="n">conn</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This is a loop which waits for new TCP connections, and spawns a goroutine with the connection
and the function <code class="language-plaintext highlighter-rouge">handler</code>. Each connection will be a new goroutine, and the goroutine will shut down
when <code class="language-plaintext highlighter-rouge">handler</code> finishes. In the meantime, the main loop continues executing, because it’s running in
a different goroutine.</p>

<p>So if these aren’t “real” (operating system) threads, what’s going on?</p>

<p>A goroutine is an example of a “lightweight” thread. The operating system doesn’t know about these;
it sees N threads owned by the Go runtime, and the Go runtime maps M goroutines onto them<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>, swapping
goroutines in and out much like the operating system scheduler. It’s able to do this because
Go code is already interruptible for the GC to be able to run, so the scheduler can always ask goroutines
to stop. The scheduler is also aware of I/O, so when a goroutine is waiting on I/O it yields to the scheduler.</p>

<p>Essentially, a compiled Go function will have a bunch of points scattered throughout it where it
tells the scheduler and GC “take over if you want” (and also “I’m waiting on stuff, please take
over”).</p>

<p>When a goroutine is swapped out on an OS thread, some registers will be saved, and
the program counter will switch to the new goroutine.</p>

<p>But what about its stack? OS threads have a large stack with them, and you kinda need a stack for functions
and stuff to work.</p>

<p>What Go used to do was segmented stacks. The reason a thread needs a large stack is that most
programming languages, including C, expect the stack to be contiguous, and stacks can’t just be
“reallocated” like we do with growable buffers since we expect stack data to stay put so that
pointers to stack data continue to work. So we reserve all the stack we think we’ll ever need
(~8MB), and hope we don’t need more.</p>

<p>But the expectation of stacks being contiguous isn’t strictly necessary. With segmented stacks, a stack is made of tiny
chunks. When a function is called, it checks if there’s enough space on the stack for it to run, and if not,
allocates a new chunk of stack and runs on it. So if you have thousands of threads doing a small amount of work,
they’ll all get thousands of tiny stacks and it will be fine.</p>

<p>These days, Go actually does something different; it <a href="https://blog.cloudflare.com/how-stacks-are-handled-in-go/">copies stacks</a>. I mentioned that stacks can’t
just be “reallocated” because we expect stack data to stay put. But that’s not necessarily true:
because Go has a GC it knows what all the pointers are <em>anyway</em>, and it can rewrite pointers to
stack data on demand.</p>

<p>Either way, Go’s rich runtime lets it handle this stuff well. Goroutines are super cheap, and you can spawn
thousands without your computer having problems.</p>

<p>Rust <em>used</em> to support lightweight/“green” threads (I believe it used segmented stacks). However, Rust cares
a lot about not paying for things you don’t use, and this imposes a penalty on all your code even if you
aren’t using green threads, so it was removed pre-1.0.</p>

<h2 id="async-io">Async I/O</h2>

<p>A core building block of this is Async I/O. As mentioned in the previous section,
with regular blocking I/O, the moment you request I/O your thread will not be allowed to run
(“blocked”) until the operation is done. This is perfect when working with OS threads (the OS
scheduler does all the work for you!), but if you have lightweight threads you instead want to
replace the lightweight thread running on the OS thread with a different one.</p>

<p>Instead, you use non-blocking I/O, where the thread queues a request for I/O with the OS and continues
execution. The I/O request is executed at some later point by the kernel. The thread then needs to ask the
OS “Is this I/O request ready yet?” before looking at the result of the I/O.</p>

<p>Of course, repeatedly asking the OS if it’s done can be tedious and consume resources. This is why
there are system calls like <a href="https://en.wikipedia.org/wiki/Epoll"><code class="language-plaintext highlighter-rouge">epoll</code></a>. Here, you can bundle together a bunch of unfinished I/O requests,
and then ask the OS to wake up your thread when <em>any</em> of these completes. So you can have a scheduler
thread (a real thread) that swaps out lightweight threads that are waiting on I/O, and when there’s nothing
else happening it can itself go to sleep with an <code class="language-plaintext highlighter-rouge">epoll</code> call until the OS wakes it up (when one of the I/O
requests completes).</p>

<p>(The exact mechanism involved here is probably more complex.)</p>

<p>So, bringing this to Rust, Rust has the <a href="https://github.com/carllerche/mio">mio</a> library, which is a platform-agnostic
wrapper around non-blocking I/O and tools like epoll/kqueue/etc. It’s a building block; and while
those used to directly using <code class="language-plaintext highlighter-rouge">epoll</code> in C may find it helpful, it doesn’t provide a nice programming
model like Go does. But we can get there.</p>

<h2 id="futures">Futures</h2>

<p>These are another building block. A <a href="https://docs.rs/futures/0.1.17/futures/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> is the promise of eventually having a value
(in fact, in JavaScript these are called <code class="language-plaintext highlighter-rouge">Promise</code>s).</p>

<p>So for example, you can ask to listen on a network socket, and get a <code class="language-plaintext highlighter-rouge">Future</code> back (actually, a
<code class="language-plaintext highlighter-rouge">Stream</code>, which is like a future but for a sequence of values). This <code class="language-plaintext highlighter-rouge">Future</code> won’t contain the
response <em>yet</em>, but will know when it’s ready. You can <code class="language-plaintext highlighter-rouge">wait()</code> on a <code class="language-plaintext highlighter-rouge">Future</code>, which will block
until you have a result, and you can also <code class="language-plaintext highlighter-rouge">poll()</code> it, asking it if it’s done yet (it will give you
the result if it is).</p>

<p>Futures can also be chained, so you can do stuff like <code class="language-plaintext highlighter-rouge">future.then(|result| process(result))</code>.
The closure passed to <code class="language-plaintext highlighter-rouge">then</code> itself can produce another future, so you can chain together
things like I/O operations. With chained futures, <code class="language-plaintext highlighter-rouge">poll()</code> is how you make progress; each time
you call it, it will move on to the next future, provided the current one is ready.</p>

<p>This is a pretty good abstraction over things like non-blocking I/O.</p>

<p>Chaining futures works much like chaining iterators. Each <code class="language-plaintext highlighter-rouge">and_then</code> (or whatever combinator)
call returns a struct wrapping around the inner future, which may contain an additional closure.
Closures themselves carry their references and data with them, so this really ends up being
very similar to a tiny stack!</p>

<h2 id="-tokio-">🗼 Tokio 🗼</h2>

<p>Tokio’s essentially a nice wrapper around mio that uses futures. Tokio has a core
event loop, and you feed it closures that return futures. What it will do is
run all the closures you feed it, use mio to efficiently figure out which futures
are ready to make a step<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup>, and make progress on them (by calling <code class="language-plaintext highlighter-rouge">poll()</code>).</p>

<p>This actually is already pretty similar to what Go was doing, at a conceptual level.
You have to manually set up the Tokio event loop (the “scheduler”), but once you do
you can feed it tasks which intermittently do I/O, and the event loop takes
care of swapping over to a new task when one is blocked on I/O. A crucial difference is
that Tokio is single threaded, whereas the Go scheduler can use multiple OS threads
for execution. However, you can offload CPU-critical tasks onto other OS threads and
use channels to coordinate so this isn’t that big a deal.</p>

<p>While at a conceptual level this is beginning to shape up to be similar to what we had for Go, code-wise this doesn’t look so pretty. For the following Go code:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">// error handling ignored for simplicity</span>

<span class="k">func</span> <span class="n">foo</span><span class="p">(</span><span class="o">...</span><span class="p">)</span> <span class="n">ReturnType</span> <span class="p">{</span>
    <span class="n">data</span> <span class="o">:=</span> <span class="n">doIo</span><span class="p">()</span>
    <span class="n">result</span> <span class="o">:=</span> <span class="n">compute</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
    <span class="n">moreData</span> <span class="o">:=</span> <span class="n">doMoreIo</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>
    <span class="n">moreResult</span> <span class="o">:=</span> <span class="n">moreCompute</span><span class="p">(</span><span class="n">moreData</span><span class="p">)</span>
    <span class="c">// ...</span>
    <span class="k">return</span> <span class="n">someFinalResult</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The Rust code will look something like</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// error handling ignored for simplicity</span>

<span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="o">...</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="n">Future</span><span class="o">&lt;</span><span class="n">ReturnType</span><span class="p">,</span> <span class="n">ErrorType</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="nf">do_io</span><span class="p">()</span><span class="nf">.and_then</span><span class="p">(|</span><span class="n">data</span><span class="p">|</span> <span class="nf">do_more_io</span><span class="p">(</span><span class="nf">compute</span><span class="p">(</span><span class="n">data</span><span class="p">)))</span>
          <span class="nf">.and_then</span><span class="p">(|</span><span class="n">more_data</span><span class="p">|</span> <span class="nf">do_even_more_io</span><span class="p">(</span><span class="nf">more_compute</span><span class="p">(</span><span class="n">more_data</span><span class="p">)))</span>
    <span class="c1">// ......</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Not pretty. <a href="https://docs.rs/futures/0.1.25/futures/future/fn.loop_fn.html#examples">The code gets worse if you introduce branches and loops</a>. The problem is that in Go we
got the interruption points for free, but in Rust we have to encode this by chaining up combinators
into a kind of state machine. Ew.</p>

<h2 id="generators-and-asyncawait">Generators and async/await</h2>

<p>This is where generators (also called coroutines) come in.</p>

<p><a href="https://doc.rust-lang.org/nightly/unstable-book/language-features/generators.html">Generators</a> are an experimental feature in Rust. Here’s an example:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="k">mut</span> <span class="n">generator</span> <span class="o">=</span> <span class="p">||</span> <span class="p">{</span>
    <span class="k">let</span> <span class="k">mut</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">loop</span> <span class="p">{</span>
        <span class="k">yield</span> <span class="n">i</span><span class="p">;</span>
        <span class="n">i</span> <span class="o">+=</span> <span class="mi">1</span><span class="p">;</span>
    <span class="p">}</span>
<span class="p">};</span>
<span class="nd">assert_eq!</span><span class="p">(</span><span class="n">generator</span><span class="nf">.resume</span><span class="p">(),</span> <span class="nn">GeneratorState</span><span class="p">::</span><span class="nf">Yielded</span><span class="p">(</span><span class="mi">0</span><span class="p">));</span>
<span class="nd">assert_eq!</span><span class="p">(</span><span class="n">generator</span><span class="nf">.resume</span><span class="p">(),</span> <span class="nn">GeneratorState</span><span class="p">::</span><span class="nf">Yielded</span><span class="p">(</span><span class="mi">1</span><span class="p">));</span>
<span class="nd">assert_eq!</span><span class="p">(</span><span class="n">generator</span><span class="nf">.resume</span><span class="p">(),</span> <span class="nn">GeneratorState</span><span class="p">::</span><span class="nf">Yielded</span><span class="p">(</span><span class="mi">2</span><span class="p">));</span>
</code></pre></div></div>

<p>Functions are things which execute a task and return once. On the other hand, generators
return multiple times; they pause execution to “yield” some data, and can be resumed
at which point they will run until the next yield. While my example doesn’t show this, generators
can also finish executing like regular functions.</p>

<p>Closures in Rust are
<a href="http://huonw.github.io/blog/2015/05/finding-closure-in-rust/">sugar for a struct containing captured data, plus an implementation of one of the <code class="language-plaintext highlighter-rouge">Fn</code> traits to make it callable</a>.</p>

<p>Generators are similar, except they implement the <code class="language-plaintext highlighter-rouge">Generator</code> trait<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">4</a></sup>, and usually store an enum representing various states.</p>

<p>The <a href="https://doc.rust-lang.org/nightly/unstable-book/language-features/generators.html#generators-as-state-machines">unstable book</a> has some examples on what the generator state machine enum will look like.</p>

<p>This is much closer to what we were looking for! Now our code can look like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="o">...</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="n">Future</span><span class="o">&lt;</span><span class="n">ReturnType</span><span class="p">,</span> <span class="n">ErrorType</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">generator</span> <span class="o">=</span> <span class="p">||</span> <span class="p">{</span>
        <span class="k">let</span> <span class="k">mut</span> <span class="n">future</span> <span class="o">=</span> <span class="nf">do_io</span><span class="p">();</span>
        <span class="k">let</span> <span class="n">data</span><span class="p">;</span>
        <span class="k">loop</span> <span class="p">{</span>
            <span class="c1">// poll the future, yielding each time it isn't ready yet,</span>
            <span class="c1">// but if it is ready then move on</span>
            <span class="k">match</span> <span class="n">future</span><span class="nf">.poll</span><span class="p">()</span> <span class="p">{</span>
                <span class="nf">Ok</span><span class="p">(</span><span class="nn">Async</span><span class="p">::</span><span class="nf">Ready</span><span class="p">(</span><span class="n">d</span><span class="p">))</span> <span class="k">=&gt;</span> <span class="p">{</span> <span class="n">data</span> <span class="o">=</span> <span class="n">d</span><span class="p">;</span> <span class="k">break</span> <span class="p">},</span>
                <span class="nf">Ok</span><span class="p">(</span><span class="nn">Async</span><span class="p">::</span><span class="n">NotReady</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="p">(),</span>
                <span class="nf">Err</span><span class="p">(</span><span class="o">..</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="o">...</span>
            <span class="p">};</span>
            <span class="k">yield</span> <span class="n">future</span><span class="nf">.polling_info</span><span class="p">();</span>
        <span class="p">}</span>
        <span class="k">let</span> <span class="n">result</span> <span class="o">=</span> <span class="nf">compute</span><span class="p">(</span><span class="n">data</span><span class="p">);</span>
        <span class="c1">// do the same thing for `do_more_io()`, etc</span>
    <span class="p">};</span>

    <span class="nf">futurify</span><span class="p">(</span><span class="n">generator</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>where <code class="language-plaintext highlighter-rouge">futurify</code> is a function that takes a generator and returns a future which on
each <code class="language-plaintext highlighter-rouge">poll</code> call will <code class="language-plaintext highlighter-rouge">resume()</code> the generator, and return <code class="language-plaintext highlighter-rouge">NotReady</code> until the generator
finishes executing.</p>
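
<p>To make that concrete, here is a minimal sketch of what such a wrapper could look like. Everything in it is a hypothetical stand-in: <code class="language-plaintext highlighter-rouge">Gen</code> and <code class="language-plaintext highlighter-rouge">GenState</code> imitate the unstable generator API, <code class="language-plaintext highlighter-rouge">Async</code> imitates the futures 0.1 enum of the same name, and error handling is left out entirely. It is not how futures-await is actually implemented.</p>

```rust
// Hypothetical stand-ins for the unstable generator API and futures 0.1's `Async`.
enum GenState<Y, R> {
    Yielded(Y),
    Complete(R),
}

trait Gen {
    type Yield;
    type Return;
    fn resume(&mut self) -> GenState<Self::Yield, Self::Return>;
}

#[derive(Debug, PartialEq)]
enum Async<T> {
    Ready(T),
    NotReady,
}

// The "future" handed back by our hypothetical `futurify`: it just wraps a generator.
struct Futurified<G>(G);

fn futurify<G: Gen>(generator: G) -> Futurified<G> {
    Futurified(generator)
}

impl<G: Gen> Futurified<G> {
    // Each poll resumes the generator once. A yield means "still waiting",
    // so we report NotReady; completion hands back the final value.
    fn poll(&mut self) -> Async<G::Return> {
        match self.0.resume() {
            GenState::Yielded(_) => Async::NotReady,
            GenState::Complete(value) => Async::Ready(value),
        }
    }
}

// A toy generator that yields `remaining` times and then completes with 42.
struct Countdown {
    remaining: u32,
}

impl Gen for Countdown {
    type Yield = ();
    type Return = u32;

    fn resume(&mut self) -> GenState<(), u32> {
        if self.remaining == 0 {
            GenState::Complete(42)
        } else {
            self.remaining -= 1;
            GenState::Yielded(())
        }
    }
}
```

<p>Polling <code class="language-plaintext highlighter-rouge">futurify(Countdown { remaining: 2 })</code> reports <code class="language-plaintext highlighter-rouge">NotReady</code> twice and then <code class="language-plaintext highlighter-rouge">Ready(42)</code>. A real version would also pass the yielded polling info along rather than discarding it.</p>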

<p>But wait, this is even <em>more</em> ugly! What was the point of converting our relatively
clean callback-chaining code into this mess?</p>

<p>Well, if you look at it, this code now looks <em>linear</em>. We’ve converted our callback
code to the same linear flow as the Go code. However, it has this weird loop-yield boilerplate
and the <code class="language-plaintext highlighter-rouge">futurify</code> function, and is overall not very neat.</p>

<p>And that’s where <a href="https://github.com/alexcrichton/futures-await">futures-await</a> comes in. <code class="language-plaintext highlighter-rouge">futures-await</code> is a procedural macro that
does the last-mile work of packaging away this boilerplate. It essentially lets you write
the above function as</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[async]</span>
<span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="o">...</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nb">Result</span><span class="o">&lt;</span><span class="n">ReturnType</span><span class="p">,</span> <span class="n">ErrorType</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">data</span> <span class="o">=</span> <span class="k">await</span><span class="o">!</span><span class="p">(</span><span class="nf">do_io</span><span class="p">());</span>
    <span class="k">let</span> <span class="n">result</span> <span class="o">=</span> <span class="nf">compute</span><span class="p">(</span><span class="n">data</span><span class="p">);</span>
    <span class="k">let</span> <span class="n">more_data</span> <span class="o">=</span> <span class="k">await</span><span class="o">!</span><span class="p">(</span><span class="nf">do_more_io</span><span class="p">());</span>
    <span class="c1">// ....</span>
</code></pre></div></div>

<p>Nice and clean. Almost as clean as the Go code, just that we have explicit <code class="language-plaintext highlighter-rouge">await!()</code> calls. These
await calls are basically providing the same function as the interruption points that Go code
gets implicitly.</p>

<p>And, of course, since it’s using a generator under the hood, you can loop and branch and do whatever
else you want as normal, and the code will still be clean.</p>

<h2 id="tying-it-together">Tying it together</h2>

<p>So, in Rust, futures can be chained together to provide a lightweight stack-like system. With async/await,
you can neatly write these future chains, and <code class="language-plaintext highlighter-rouge">await</code> provides explicit interruption points on each I/O operation.
Tokio provides an event loop “scheduler” abstraction, which you can feed async functions to, and under the hood it
uses mio to abstract over low level non-blocking I/O primitives.</p>

<p>These are components which can be used independently — you can use tokio with futures without
using async/await. You can use async/await without using Tokio. For example, I think this would be
useful for Servo’s networking stack. It doesn’t need to do <em>much</em> parallel I/O (not on the order
of thousands of threads), so it can just use multiplexed OS threads. However, we’d still want
to pool threads and pipeline data well, and async/await would help here.</p>

<p>Put together, all these components get something almost as clean as the Go stuff, with a little more
explicit boilerplate. Because generators (and thus async/await) play nice with the borrow checker
(they’re just enum state machines under the hood), Rust’s safety guarantees are all still in play,
and we get to have “fearless concurrency” for programs having a huge quantity of I/O bound tasks!</p>

<p><em>Thanks to Arshia Mufti, Steve Klabnik, Zaki Manian, and Kyle Huey for reviewing drafts of this post</em></p>
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>Note that this isn’t necessarily true for <em>all</em> network server applications. For example, Apache uses OS threads. OS threads are often the best tool for the job. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>Lightweight threading is also often called M:N threading (also “green threading”) <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>In general future combinators aren’t really aware of tokio or even I/O, so there’s no easy way to ask a combinator “hey, what I/O operation are you waiting for?”. Instead, with Tokio you use special I/O primitives that still provide futures but also register themselves with the scheduler in thread local state. This way when a future is waiting for I/O, Tokio can check what the most recent I/O operation was, and associate it with that future so that it can wake up that future again when <code class="language-plaintext highlighter-rouge">epoll</code> tells it that that I/O operation is ready. (<em>Edit Dec 2018: This has changed, futures now have a built in <code class="language-plaintext highlighter-rouge">Waker</code> concept that handles passing things up the stack</em>) <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:5" role="doc-endnote">
      <p>The <code class="language-plaintext highlighter-rouge">Generator</code> trait has a <code class="language-plaintext highlighter-rouge">resume()</code> function which you can call multiple times, and each time it will return any yielded data or tell you that the generator has finished running. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Rust in 2018]]></title>
    <link href="http://manishearth.github.io/blog/2018/01/10/rust-in-2018/"/>
    <updated>2018-01-10T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2018/01/10/rust-in-2018</id>
    <content type="html"><![CDATA[<p>A week ago <a href="https://blog.rust-lang.org/2018/01/03/new-years-rust-a-call-for-community-blogposts.html">we put out a call for blog posts for what folks think Rust should do in 2018</a>.</p>

<p>This is mine.</p>

<h2 id="overall-focus">Overall focus</h2>

<p>I think 2017 was a great year for Rust. Near the beginning of the year, after custom derive
and a bunch of things stabilized, I had a strong feeling that Rust was “complete”. Not really “finished”,
there’s still tons of stuff to improve, but this was the first time stable Rust was the language
I wanted it to be, and was something I could recommend for most kinds of work without reservations.</p>

<p>I think this is a good signal to wind down the frightening pace of new features Rust has been getting.
And that happened! We had the impl period, which took some time to focus on <em>getting things done</em> before
proposing new things. And Rust is feeling more polished than ever.</p>

<p>Like <a href="https://www.ncameron.org/blog/rust-2018/">Nick</a>, I feel like 2018 should be boring. I feel like we should focus on polishing what
we have, implementing all the things, and improving our approachability as a language.</p>

<p>Basically, I want to see this as an extended impl period.</p>

<p>This doesn’t mean I’m looking for a moratorium on RFCs, really. Hell, in the past few days I’ve posted
one pre-pre-RFC<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>, one pre-RFC, and one RFC (from the pre-RFC). I’m mostly looking for <em>prioritizing</em> impl
work over designing new things, but still having <em>some</em> focus on design.</p>

<h2 id="language">Language</h2>

<p>I think Rust still has some “missing bits” which make it hard to justify for some use cases. Rust’s
async story is being fleshed out. We don’t yet have stable SIMD or stable inline ASM. The microcontroller
story is kinda iffy. RLS/clippy need nightly. I’d like to see these crystallize and stabilize this year.</p>

<p>I think this year we need to continue to take a critical look at Rust’s ergonomics. Last year the
ergonomics initiative was really good for Rust, and I’d like to see more of that. This is kind of at
odds with my “focus on polishing Rust” statement, but fixing ergonomics is not just new features. It’s
also about figuring out barriers in Rust, polishing mental models, improving docs/diagnostics, and in
general figuring out how to best present Rust’s features. Starting dialogues about confusing bits of
the language and figuring out the best mental model to present them with is something we should
continue doing. Sometimes this may need new features, indeed, but not always. We must continue
to take a critical look at how our language presents itself to newcomers.</p>

<h2 id="community">Community</h2>

<p>I’d like to see a stronger focus on mentoring. Mentoring on rustc, mentoring on major libraries, mentoring on
Rust tooling, mentoring everywhere. This includes not just the mentors, but the associated infrastructure –
contribution docs, sites like <a href="http://starters.servo.org/">servo-starters</a> and <a href="https://www.rustaceans.org/findwork">findwork</a>, and similar tooling.</p>

<p>I’m also hoping for more companies to invest back into Rust. This year <a href="http://buoyant.io/">Buoyant</a> became pretty well
known within the community, and many of their employees are paid to work on various important parts
of the Rust ecosystem. There are also multiple consulting groups that contribute to the ecosystem.
It’s nice to see that “paid to work on Rust” is no longer limited to Mozilla, and this is crucial
for the health of the language. I hope this trend continues.</p>

<p>Finally, I want to see more companies <em>talk</em> about Rust. Success stories are really nice to hear.
I’ve heard many amazing success stories this year, but a lot of them are things which can’t be shared.</p>

<h2 id="governance">Governance</h2>

<p>Last year we started seeing the limits of the RFC process. Large RFCs were stressful for both the RFC authors
and participating community members, and rather opaque for newer community members wishing to participate.
Alternative models have been discussed; I’d like to see more movement on this front.</p>

<p>I’d also like to grow the moderation team; it is currently rather small and doesn’t have the capacity to handle
incidents in a timely fashion.</p>

<h2 id="docs--learning">Docs / Learning</h2>

<p>I’d like to see a focus on improving Rust for folks who learn the language by <em>trying things</em> over reading books <sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup> <sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup>.</p>

<p>This means better diagnostics, better alternative resources like rustbyexample, etc. Improving mentorship helps here
as well.</p>

<p>Of course, I’d like to see our normal docs work continue to happen.</p>

<hr />

<p>I’m overall really excited for 2018. I think we’re doing great on most fronts so far, and if we
maintain the momentum we’ll have an even-more-awesome Rust by the end of this year!</p>
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>This isn’t a “pre rfc” because I’ve written it as a much looser sketch of the problem and a solution <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>There is literally no programming language I’ve personally learned through a book or formal teaching. I’ve often read books after I know a language because it’s fun and instructive, but it’s always started out as “learn extreme basics” followed by “look at existing code, tweak stuff, and write your own code”. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>Back in <em>my</em> day Rust didn’t have a book, just this tiny thing called “The Tutorial”. <em>grouches incessantly</em> <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Undefined vs Unsafe in Rust]]></title>
    <link href="http://manishearth.github.io/blog/2017/12/24/undefined-vs-unsafe-in-rust/"/>
    <updated>2017-12-24T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2017/12/24/undefined-vs-unsafe-in-rust</id>
    <content type="html"><![CDATA[<p>Recently Julia Evans wrote an <a href="https://jvns.ca/blog/2017/12/23/segfault-debugging/">excellent post</a> about debugging a segfault in Rust. (Go read it, it’s good)</p>

<p>One thing it mentioned was</p>

<blockquote>
  <p>I think “undefined” and “unsafe” are considered to be synonyms.</p>
</blockquote>

<p>This is … incorrect. However, we in the Rust community have never really explicitly outlined the
distinction, so that confusion is on us! This blog post is an attempt to clarify the difference of
terminology as used within the Rust community. It’s a very useful but subtle distinction and I feel we’d be
able to talk about safety more expressively if this was well known.</p>

<h2 id="unsafe-means-two-things-in-rust-yay">Unsafe means two things in Rust, yay</h2>

<p>So, first off, the waters are a bit muddied by the fact that Rust uses <code class="language-plaintext highlighter-rouge">unsafe</code> to both mean “within
an <code class="language-plaintext highlighter-rouge">unsafe {}</code> block” and “something Bad is happening here”. It’s possible to have safe code
within an <code class="language-plaintext highlighter-rouge">unsafe</code> block; indeed this is the <em>primary function</em> of an <code class="language-plaintext highlighter-rouge">unsafe</code> block. Somewhat
counterintuitively, the <code class="language-plaintext highlighter-rouge">unsafe</code> block’s purpose is to actually tell the compiler “I know you don’t
like this code but trust me, it’s safe!” (where “safe” is the negation of the <em>second</em> meaning of “unsafe”,
i.e. “something Bad is not happening here”).</p>

<p>Similarly, we use “safe code” to mean “code not using <code class="language-plaintext highlighter-rouge">unsafe{}</code> blocks” but also “code that is not unsafe”,
i.e. “code where nothing bad happens”.</p>

<p>This blog post is primarily about the “something bad is happening here” meaning of “unsafe”. When referring
to the other kind I’ll specifically say “code within <code class="language-plaintext highlighter-rouge">unsafe</code> blocks” or something like that.</p>

<h2 id="undefined-behavior">Undefined behavior</h2>

<p>In languages like C, C++, and Rust, undefined behavior is when you reach a point where
the compiler is allowed to do anything with your code. This is distinct from implementation-defined
behavior, where usually a given compiler/library will do a deterministic thing, however they have some
freedom from the spec in deciding what that thing is.</p>

<p>Undefined behavior can be pretty scary. This is usually because in practice it causes problems when
the compiler assumes “X won’t happen because it is undefined behavior”, and X ends up happening,
breaking the assumptions. In some cases this does nothing dangerous, but often the compiler will
end up doing wacky things to your code. Dereferencing a null pointer will <em>sometimes</em> cause segfaults
(which is the compiler generating code that actually dereferences the pointer, making the kernel
complain), but sometimes it will be optimized in a way that assumes it won’t and moves around code
such that you have major problems.</p>

<p>Undefined behavior is a global property, based on how your code is <em>used</em>. The following function
in C++ or Rust may or may not exhibit undefined behavior, based on how it gets used:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">deref</span><span class="p">(</span><span class="kt">int</span><span class="o">*</span> <span class="n">x</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">return</span> <span class="o">*</span><span class="n">x</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// do not try this at home</span>
<span class="k">fn</span> <span class="nf">deref</span><span class="p">(</span><span class="n">x</span><span class="p">:</span> <span class="o">*</span><span class="k">mut</span> <span class="nb">u32</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nb">u32</span> <span class="p">{</span>
    <span class="k">unsafe</span> <span class="p">{</span> <span class="o">*</span><span class="n">x</span> <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>As long as you always call it with a valid pointer to an integer, there is no undefined behavior
involved.</p>

<p>But in either language, if you use it with some pointer conjured out of thin air (like <code class="language-plaintext highlighter-rouge">0x01</code>), that’s
probably undefined behavior.</p>

<p>As it stands, UB is a property of the entire program and its execution. Sometimes you may have snippets of code
that will always exhibit undefined behavior regardless of how they are called, but in general UB
is a global property.</p>

<h2 id="unsafe-behavior">Unsafe behavior</h2>

<p>Rust’s concept of “unsafe behavior” (I’m coining this term because “unsafety” and “unsafe code” can
be a bit confusing) is far more scoped. Here, <code class="language-plaintext highlighter-rouge">fn deref</code> <em>is</em> “unsafe”<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>, even if you <em>always</em>
call it with a valid pointer. The reason it is still unsafe is because it’s possible to trigger UB by only
changing the “safe” caller code. I.e. “changes to code outside unsafe blocks can trigger UB if they include
calls to this function”.</p>

<p>Basically, in Rust a bit of code is “safe” if it cannot exhibit undefined behavior under any circumstances of
that code being used. The following code exhibits “safe behavior”:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">unsafe</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">x</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
    <span class="k">let</span> <span class="n">raw</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">x</span> <span class="k">as</span> <span class="o">*</span><span class="k">const</span> <span class="nb">u32</span><span class="p">;</span>
    <span class="nd">println!</span><span class="p">(</span><span class="s">"{}"</span><span class="p">,</span> <span class="o">*</span><span class="n">raw</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>We dereferenced a raw pointer, but we knew it was valid. Of course, actual <code class="language-plaintext highlighter-rouge">unsafe</code> blocks will
usually be “actually totally safe” for less obvious reasons, and part of this is because
<a href="https://doc.rust-lang.org/nomicon/working-with-unsafe.html#working-with-unsafe"><code class="language-plaintext highlighter-rouge">unsafe</code> blocks sometimes can pollute the entire module</a>.</p>

<p>Basically, “safe” in Rust is a more local property. Code isn’t safe just because you only use it in
a way that doesn’t trigger UB, it is safe because there is literally <em>no way to use it such that it
will do so</em>. No way to do so without using <code class="language-plaintext highlighter-rouge">unsafe</code> blocks, that is<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>.</p>
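
<p>As a rough illustration of that locality, here is a small sketch with three variants of the earlier function. The names are made up for this example. The first is “unsafe” in the sense used here, the second is “safe” despite containing an <code class="language-plaintext highlighter-rouge">unsafe</code> block, and the third shows the usual way to push the obligation onto the caller.</p>

```rust
// "Unsafe" in the post's sense: a caller written entirely in safe code
// could pass a bogus pointer and trigger undefined behavior.
fn deref_raw(x: *const u32) -> u32 {
    unsafe { *x }
}

// "Safe": a reference is always valid to read from, so no safe caller can
// misuse this, even though the body contains an `unsafe` block.
fn deref_ref(x: &u32) -> u32 {
    let raw = x as *const u32;
    unsafe { *raw }
}

// The conventional fix for the first function: mark it `unsafe fn`, so that
// callers have to opt in with their own `unsafe` block.
unsafe fn deref_raw_marked(x: *const u32) -> u32 {
    unsafe { *x }
}
```

<p>All three behave identically when handed a valid pointer. The difference is only in which code has to be audited to be sure that is always the case.</p>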

<p>This is a distinction that’s <em>possible</em> to draw in Rust because it gives us the ability
to compartmentalize safety. Trying to apply this definition to C++ is problematic; you can
ask “is <code class="language-plaintext highlighter-rouge">std::unique_ptr&lt;T&gt;</code> safe?”, but you can <em>always</em> use it within code in a way that you trigger
undefined behavior, because C++ does not have the tools for compartmentalizing safety. The distinction
between “code which doesn’t need to worry about safety” and “code which does need to worry about safety”
exists in Rust in the form of “code outside of <code class="language-plaintext highlighter-rouge">unsafe {}</code>” and “code within <code class="language-plaintext highlighter-rouge">unsafe {}</code>”, whereas in
C++ it’s a lot fuzzier and based on expectations (and documentation/the spec).</p>

<p>So C++’s <code class="language-plaintext highlighter-rouge">std::unique_ptr&lt;T&gt;</code> is “safe” in the sense that it does what you expect but
if you use it in a way counter to how it’s <em>supposed</em> to be used (constructing one from an invalid pointer, for example)
it can blow up. This is still a useful sense of safety, and is how one regularly reasons about safety in C++. However it’s not
the same sense of the term as used in Rust, which can be a bit more formal about what the expectations
actually are.</p>

<p>So <code class="language-plaintext highlighter-rouge">unsafe</code> in Rust is a strictly more general concept – all code exhibiting undefined behavior in Rust is also “unsafe”,
however not all “unsafe” code in Rust exhibits undefined behavior as written in the current program.</p>

<p>Rust furthermore attempts to guarantee that you will not trigger undefined behavior if you do not use <code class="language-plaintext highlighter-rouge">unsafe {}</code> blocks.
This of course depends on the correctness of the compiler (it has bugs) and of the libraries you use (they may also have bugs)
but this compartmentalization gets you most of the way there in having UB-free programs.</p>
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>Once again we have a slight difference between an “<code class="language-plaintext highlighter-rouge">unsafe fn</code>”, i.e. a function that needs an <code class="language-plaintext highlighter-rouge">unsafe</code> block to call and probably is unsafe, and an “unsafe function”, a function that exhibits unsafe behavior. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>This caveat and the confusing dual-usage of the term “safe” lead to the rather tautological-sounding sentence “Safe Rust code is Rust code that cannot cause undefined behavior when used in safe Rust code” <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Font-size: An Unexpectedly Complex CSS Property]]></title>
    <link href="http://manishearth.github.io/blog/2017/08/10/font-size-an-unexpectedly-complex-css-property/"/>
    <updated>2017-08-10T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2017/08/10/font-size-an-unexpectedly-complex-css-property</id>
    <content type="html"><![CDATA[<p><a href="https://developer.mozilla.org/en/docs/Web/CSS/font-size"><code class="language-plaintext highlighter-rouge">font-size</code></a> is the worst.</p>

<p>It’s a CSS property probably everyone who writes CSS has used at some point. It’s pretty ubiquitous.</p>

<p>And it’s <em>super</em> complicated.</p>

<p>“But it’s just a number”, you say. “How can that be complicated?”</p>

<p>I too felt that way one time. And then I worked on implementing it for <a href="https://wiki.mozilla.org/Quantum/Stylo">stylo</a>.</p>

<p>Stylo is the project to integrate <a href="http://github.com/servo/servo/">Servo</a>’s styling system into Firefox. The styling system handles
parsing CSS, determining which rules apply to which elements, running this through the cascade,
and eventually computing and assigning styles to individual elements in the tree. This happens not
only on page load, but also whenever various kinds of events (including DOM manipulation) occur,
and is a nontrivial portion of pageload and interaction times.</p>

<p>Servo is in <a href="https://rust-lang.org">Rust</a>, and makes use of Rust’s safe parallelism in many places, one of them being
styling. Stylo has the potential to bring these speedups into Firefox, along with the added safety
of the code being in a safer systems language.</p>

<p>Anyway, as far as the styling system is concerned, I believe that font-size is the most complex
property it has to handle. Some properties may be more complicated when it comes to layout or
rendering, but font-size is probably the most complex one in the department of styling.</p>

<p>I’m hoping this post can give an idea of how complex the Web can <em>get</em>, and also serve as documentation
for some of these complexities. I’ll also try to give an idea of how the styling system works throughout this post.</p>

<p>Alright. Let’s see what is so complex about font-size.</p>

<h2 id="the-basics">The basics</h2>

<p>The syntax of the property is pretty straightforward. You can specify it as:</p>

<ul>
  <li>A length (<code class="language-plaintext highlighter-rouge">12px</code>, <code class="language-plaintext highlighter-rouge">15pt</code>, <code class="language-plaintext highlighter-rouge">13em</code>, <code class="language-plaintext highlighter-rouge">4in</code>, <code class="language-plaintext highlighter-rouge">8rem</code>)</li>
  <li>A percentage (<code class="language-plaintext highlighter-rouge">50%</code>)</li>
  <li>A compound of the above, via a calc (<code class="language-plaintext highlighter-rouge">calc(12px + 4em + 20%)</code>)</li>
  <li>An absolute keyword (<code class="language-plaintext highlighter-rouge">medium</code>, <code class="language-plaintext highlighter-rouge">small</code>, <code class="language-plaintext highlighter-rouge">large</code>, <code class="language-plaintext highlighter-rouge">x-large</code>, etc)</li>
  <li>A relative keyword (<code class="language-plaintext highlighter-rouge">larger</code>, <code class="language-plaintext highlighter-rouge">smaller</code>)</li>
</ul>

<p>The first three are common amongst quite a few length-related properties. Nothing abnormal in the syntax.</p>

<p>The next two are interesting. Essentially, the absolute keywords map to various pixel values, and match
the result of <code class="language-plaintext highlighter-rouge">&lt;font size=foo&gt;</code> (e.g. <code class="language-plaintext highlighter-rouge">size=3</code> is the same as <code class="language-plaintext highlighter-rouge">font-size: medium</code>). The <em>actual</em> value they map to
is not straightforward, and I’ll get to that later in this post.</p>

<p>The relative keywords basically scale the size up or down. The mechanism of the scaling was also complex; however,
this has changed. I’ll get to that too.</p>

<h2 id="em-and-rem-units">em and rem units</h2>

<p>First up: <code class="language-plaintext highlighter-rouge">em</code> units. One of the things you can specify in <em>any</em> length-based CSS property is a value with an <code class="language-plaintext highlighter-rouge">em</code> or <code class="language-plaintext highlighter-rouge">rem</code>
unit.</p>

<p><code class="language-plaintext highlighter-rouge">5em</code> means “5 times the <code class="language-plaintext highlighter-rouge">font-size</code> of the element this is applied to”. <code class="language-plaintext highlighter-rouge">5rem</code> means “5 times the font-size of the root element”.</p>

<p>The implications of this are that font-size needs to be computed before all the other properties (well, not quite, but we’ll get to that!)
so that it is available during that time.</p>

<p>You can also use <code class="language-plaintext highlighter-rouge">em</code> units within <code class="language-plaintext highlighter-rouge">font-size</code> itself. In this case, it is computed relative to the font-size of the <em>parent</em> element, since
you can’t use the font-size of the element to compute itself. (This is identical to using a percentage unit)</p>
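
<p>Here is a minimal sketch of those two rules, assuming everything has already been reduced to pixels. The <code class="language-plaintext highlighter-rouge">Length</code> and <code class="language-plaintext highlighter-rouge">FontContext</code> types are invented for illustration and are not Servo’s actual types; the special case of <code class="language-plaintext highlighter-rouge">rem</code> on the root element itself is ignored.</p>

```rust
// Hypothetical, heavily simplified model of a specified length.
#[derive(Clone, Copy)]
enum Length {
    Px(f32),
    Em(f32),
    Rem(f32),
}

// Font sizes (in px) needed to resolve font-relative units.
struct FontContext {
    parent_font_size: f32,
    root_font_size: f32,
}

// `font-size` itself: `em` is relative to the *parent's* font size.
fn compute_font_size(specified: Length, ctx: &FontContext) -> f32 {
    match specified {
        Length::Px(px) => px,
        Length::Em(em) => em * ctx.parent_font_size,
        Length::Rem(rem) => rem * ctx.root_font_size,
    }
}

// Any other length property: `em` is relative to the element's *own*
// computed font size, which therefore has to be known already.
fn compute_length(specified: Length, own_font_size: f32, ctx: &FontContext) -> f32 {
    match specified {
        Length::Px(px) => px,
        Length::Em(em) => em * own_font_size,
        Length::Rem(rem) => rem * ctx.root_font_size,
    }
}
```

<p>With a parent font size of 20px, <code class="language-plaintext highlighter-rouge">font-size: 2em</code> computes to 40px, and a <code class="language-plaintext highlighter-rouge">width: 5em</code> on the same element then computes to 200px. The second function taking the result of the first as an argument is the ordering constraint in code form.</p>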

<h2 id="minimum-font-size">Minimum font size</h2>

<p>Browsers let you set a “minimum” font size in their preferences, and text will not be scaled below it. It’s useful for those with
trouble seeing small text.</p>

<p>However, this doesn’t affect properties which depend on font-size via <code class="language-plaintext highlighter-rouge">em</code> units. So if you’re using a minimum font size,
<code class="language-plaintext highlighter-rouge">&lt;div style="font-size: 1px; height: 1em; background-color: red"&gt;</code> will have a very tiny height (which you’ll notice from the color),
but the text will be clamped to the minimum size.</p>

<p>What this effectively means is that you need to keep track of <em>two</em> separate computed font size values. There’s one value that
is used to actually determine the font size used for the text, and one value that is used whenever the style system needs to
know the font-size (e.g. to compute an <code class="language-plaintext highlighter-rouge">em</code> unit.)</p>
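
<p>One way to picture the two values is the sketch below. The struct and field names are hypothetical, sizes are plain pixels, and this is not how Gecko or Servo actually store it.</p>

```rust
// Hypothetical pair of values kept for one element (all sizes in px).
#[derive(Debug, PartialEq)]
struct ComputedFontSize {
    // What the style system uses, e.g. to resolve `em` units.
    for_style: f32,
    // What the text is actually rendered at, after clamping.
    for_text: f32,
}

fn apply_min_font_size(computed: f32, min_font_size: f32) -> ComputedFontSize {
    ComputedFontSize {
        for_style: computed,
        // Only the rendered size is clamped to the user's minimum.
        for_text: computed.max(min_font_size),
    }
}
```

<p>For the <code class="language-plaintext highlighter-rouge">font-size: 1px</code> example with a 12px minimum, this gives a style value of 1px (so <code class="language-plaintext highlighter-rouge">height: 1em</code> stays tiny) and a text value of 12px.</p>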

<p>This gets slightly more complicated when <a href="https://en.wikipedia.org/wiki/Ruby_character">ruby</a> is involved. In ideographic scripts (usually, Han
and Han-based scripts like Kanji or Hanja) it’s sometimes useful to have the pronunciation
of each character above it in a phonetic script, for the aid of readers without proficiency in that
script, and this is known as “ruby” (“furigana” in Japanese). Because these scripts are ideographic,
it’s not uncommon for learners to know the pronunciation of a word but have no idea how to write it.
An example would be <ruby><rb>日</rb><rt>に</rt><rb>本</rb><rt>ほん</rt></ruby>, which is 日本 (“nihon”,
i.e. “Japan”) in Kanji with ruby にほん in the phonetic Hiragana script above it.</p>

<p>As you can probably see, the phonetic ruby text is in a smaller font size (usually 50% of the font
size of the main text<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">1</a></sup>). The minimum font-size support <em>respects</em> this, and ensures that if the ruby
is supposed to be <code class="language-plaintext highlighter-rouge">50%</code> of the size of the text, the minimum font size for the ruby is <code class="language-plaintext highlighter-rouge">50%</code> of the
original minimum font size. This avoids clamped text from looking like <ruby><rb>日</rb><rt style="font-size: 1em">に</rt><rb>本</rb><rt style="font-size: 1em">ほん</rt></ruby> (where both get set to
the same size), which is pretty ugly.</p>

<h2 id="text-zoom">Text zoom</h2>

<p>Firefox additionally lets you zoom only the text when zooming. If you have trouble reading small things, it’s great to
be able to just blow up the text on the page without having the whole page get zoomed (which means you need to scroll
around a lot).</p>

<p>In this case, <code class="language-plaintext highlighter-rouge">em</code> units of other properties <em>do</em> get zoomed as well. After all, they’re supposed to be relative to the text’s font
size (and may have some relation to the text), so if that size has changed so should they.</p>

<p>(Of course, that argument could also apply to the min font size stuff. I don’t have an answer for why it doesn’t.)</p>

<p>This is actually pretty straightforward to implement. When computing absolute font sizes (including
keywords), zoom them if text zoom is on. For everything else continue as normal.</p>
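<p>As a rough sketch of the rule above (the type and function names are made up for this example and are not Servo’s or Gecko’s), absolute sizes get multiplied by the zoom factor when they are computed, while relative sizes simply build on a parent size that has already been zoomed:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical, simplified model of a specified font-size value.
#[derive(Clone, Copy, Debug, PartialEq)]
enum SpecifiedFontSize {
    // An absolute size in CSS pixels, e.g. `font-size: 16px`.
    Px(f32),
    // A size relative to the parent font size, e.g. `font-size: 0.5em`.
    Em(f32),
}

// Compute a font size in pixels. `text_zoom` is 1.0 when text zoom is off.
fn compute_font_size(specified: SpecifiedFontSize, parent_size_px: f32, text_zoom: f32) -> f32 {
    match specified {
        // Absolute sizes are zoomed at the point where they are computed.
        SpecifiedFontSize::Px(px) => px * text_zoom,
        // Relative sizes build on a parent size that was already zoomed,
        // so applying the zoom again here would zoom twice.
        SpecifiedFontSize::Em(em) => em * parent_size_px,
    }
}
</code></pre></div></div>

<p>With a zoom factor of 1.5, a specified <code class="language-plaintext highlighter-rouge">16px</code> computes to 24px, and a child with <code class="language-plaintext highlighter-rouge">0.5em</code> then computes to 12px without any further zooming.</p>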

<p>Text zoom is also disabled within <code class="language-plaintext highlighter-rouge">&lt;svg:text&gt;</code> elements, which leads to some trickiness here.</p>

<h2 id="interlude-how-the-style-system-works">Interlude: How the style system works</h2>

<p>Before I go ahead it’s probably worth giving a quick overview of how everything works.</p>

<p>The responsibility of a style system is to take in CSS code and a DOM tree, and assign computed styles to each element.</p>

<p>There’s a distinction between “specified” and “computed” here. “Specified” styles are in the format
you specify in CSS, whereas computed styles are those that get attached to the elements, sent to
layout, and inherited from. A given specified style may compute to different values when applied to
different elements.</p>

<p>So while you can <em>specify</em> <code class="language-plaintext highlighter-rouge">width: 5em</code>, it will compute to something like <code class="language-plaintext highlighter-rouge">width: 80px</code>. Computed values are usually a
cleaned up form of the specified value.</p>
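<p>To make the distinction concrete, here is a minimal sketch of computing a specified length. The names are hypothetical and only two units are modeled; a real style system handles many more:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical, simplified specified length: what the author wrote in CSS.
#[derive(Clone, Copy, Debug, PartialEq)]
enum SpecifiedLength {
    Px(f32),
    Em(f32),
}

// Compute a specified length to pixels against the element's own
// computed font size. The same specified value can therefore compute
// to different results on different elements.
fn compute_length(specified: SpecifiedLength, font_size_px: f32) -> f32 {
    match specified {
        SpecifiedLength::Px(px) => px,
        SpecifiedLength::Em(em) => em * font_size_px,
    }
}
</code></pre></div></div>

<p>On an element whose font size is 16px, <code class="language-plaintext highlighter-rouge">5em</code> computes to 80px; on an element whose font size is 13px, the same specified value computes to 65px.</p>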

<p>The style system will first parse the CSS, producing a bunch of rules, usually containing declarations (a declaration is something like <code class="language-plaintext highlighter-rouge">width: 20%;</code>, i.e. a property name and a specified value).</p>

<p>It then goes through the tree in top-down order (this is parallelized in Stylo), figuring out which declarations <em>apply</em> to each element
and in which order – some declarations have precedence over others. Then it will compute each relevant declaration against the element’s style (and parent style, among other bits of info),
and store this value in the element’s “computed style”.</p>

<p>There are a bunch of optimizations that Gecko and Servo do here to avoid duplicated work<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">2</a></sup>. There’s a
bloom filter for quickly checking if deep descendant selectors apply to a subtree. There’s a “rule
tree” that helps cache effort from determining applicable declarations. Computed styles are
reference counted and shared very often (since the default state is to inherit from the parent or
from the default style).</p>

<p>But ultimately, this is the gist of what happens.</p>

<h2 id="keyword-values">Keyword values</h2>

<p>Alright, this is where it gets complicated.</p>

<p>Remember when I said <code class="language-plaintext highlighter-rouge">font-size: medium</code> was a thing that mapped to a value?</p>

<p>So what does it map to?</p>

<p>Well, it turns out, it depends on the font family. For the following HTML:</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;span</span> <span class="na">style=</span><span class="s">"font: medium monospace"</span><span class="nt">&gt;</span>text<span class="nt">&lt;/span&gt;</span>
<span class="nt">&lt;span</span> <span class="na">style=</span><span class="s">"font: medium sans-serif"</span><span class="nt">&gt;</span>text<span class="nt">&lt;/span&gt;</span>
</code></pre></div></div>

<p>you get (<a href="https://codepen.io/anon/pen/RZgxjw">codepen</a>)</p>

<div style="border: 1px solid black; display: inline-block; padding: 15px;">
<span style="font: medium monospace">text</span>
<span style="font: medium sans-serif">text</span>
</div>

<p>where the first one computes to a font-size of 13px, and the second one computes to a font-size of
16px. You can check this in the computed style pane of your devtools, or by using
<code class="language-plaintext highlighter-rouge">getComputedStyle()</code>.</p>

<p>I <em>think</em> the reason behind this is that monospace fonts tend to be wider, so the default font size (medium)
is scaled so that they have similar widths, and all other keyword font sizes get shifted as well. The final result is something like this:</p>

<p><img class="center" src="http://manishearth.github.io/images/post/font-size-table.png" width="600" /></p>

<p>Firefox and Servo have <a href="https://github.com/servo/servo/blob/d415617a5bbe65a73bd805808a7ac76f38a1861c/components/style/properties/longhand/font.mako.rs#L763-L774">a matrix</a> that helps derive the values for all the absolute
font-size keywords based on the “base size” (i.e. the computed value of <code class="language-plaintext highlighter-rouge">font-size: medium</code>). Actually,
Firefox has <a href="http://searchfox.org/mozilla-central/rev/c329d562fb6c6218bdb79290faaf015467ef89e2/layout/style/nsRuleNode.cpp#3272-3341">three tables</a> to support some legacy use cases like quirks mode (Servo has
yet to add support for these tables). We query other parts of the browser for what the “base size”
is based on the language and font family.</p>
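<p>As an illustration of deriving keyword sizes from a base size, the sketch below uses the scaling factors that the CSS Fonts Level 3 spec suggests for the absolute keywords. This is an assumption made for the example: as mentioned above, Firefox and Servo use lookup tables rather than plain ratios, and the names here are hypothetical.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// The absolute font-size keywords, from xx-small to xx-large.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Keyword {
    XXSmall,
    XSmall,
    Small,
    Medium,
    Large,
    XLarge,
    XXLarge,
}

// Derive a keyword's size from the base size (the size of `medium`),
// using the ratios suggested by the CSS Fonts Level 3 spec.
// Real engines use tables instead; this is only a sketch.
fn keyword_to_px(keyword: Keyword, base_size_px: f32) -> f32 {
    let ratio = match keyword {
        Keyword::XXSmall => 3.0 / 5.0,
        Keyword::XSmall => 3.0 / 4.0,
        Keyword::Small => 8.0 / 9.0,
        Keyword::Medium => 1.0,
        Keyword::Large => 6.0 / 5.0,
        Keyword::XLarge => 3.0 / 2.0,
        Keyword::XXLarge => 2.0,
    };
    base_size_px * ratio
}
</code></pre></div></div>

<p>The important part is the second parameter: the same keyword gives 16px against a sans-serif base size of 16 and 13px against a monospace base size of 13.</p>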

<p>Wait, but what does the language have to do with this anyway? How does the language impact font-size?</p>

<p>It turns out that the base size depends on the font family <em>and</em> the language, and you can configure this.</p>

<p>Both Firefox and Chrome (using an extension) actually let you tweak which fonts get used on a per-language basis,
<em>as well as the default (base) font-size</em>.</p>

<p>This is not as obscure as one might think. Default system fonts are often really ugly for scripts
other than Latin. I have a separate font installed that produces better-looking Devanagari ligatures.</p>

<p>Similarly, some scripts are just more intricate than Latin. My default font size for Devanagari is
set to 18 instead of 16. I’ve started learning Mandarin and I’ve set that font size to 18 as well. Hanzi glyphs
can get pretty complicated and I still struggle to learn (and later recognize) them. A larger font size is great for this.</p>

<p>Anyway, this doesn’t complicate things too much.  This does mean that the font family needs to be
computed before font-size, which already needs to be computed before most other properties. The
language, which can be set using a <code class="language-plaintext highlighter-rouge">lang</code> HTML attribute, is internally treated as a CSS property by
Firefox since it inherits, and it must be computed earlier as well.</p>

<p>Not too bad. So far.</p>

<p>Now here’s the kicker. This <em>dependence</em> on the language and family <em>inherits</em>.</p>

<p>Quick, what’s the font-size of the inner <code class="language-plaintext highlighter-rouge">div</code>?</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;div</span> <span class="na">style=</span><span class="s">"font-size: medium; font-family: sans-serif;"</span><span class="nt">&gt;</span> <span class="c">&lt;!-- base size 16 --&gt;</span>
    font size is 16px
    <span class="nt">&lt;div</span> <span class="na">style=</span><span class="s">"font-family: monospace"</span><span class="nt">&gt;</span> <span class="c">&lt;!-- base size 13 --&gt;</span>
        font size is ??
    <span class="nt">&lt;/div&gt;</span>
<span class="nt">&lt;/div&gt;</span>
</code></pre></div></div>

<p>For a normal inherited CSS property<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">3</a></sup>, if the parent has a computed value of <code class="language-plaintext highlighter-rouge">16px</code>,
and the child has no additional values specified, the child will inherit a value of <code class="language-plaintext highlighter-rouge">16px</code>.
<em>Where</em> the parent got that computed value from doesn’t matter.</p>

<p>Here, <code class="language-plaintext highlighter-rouge">font-size</code> “inherits” a value of <code class="language-plaintext highlighter-rouge">13px</code>. You can see this below (<a href="https://codepen.io/anon/pen/MvorQQ">codepen</a>):</p>

<div style="border: 1px solid black; display: inline-block; padding: 15px;">
<div style="font-size: medium; font-family: sans-serif;"> <!-- base size 16 -->
    font size is 16px
    <div style="font-family: monospace"> <!-- base size 13 -->
        font size is ??
    </div>
</div>
</div>

<p>Basically, if the computed value originated from a keyword, whenever the font family or language
change, font-size is recomputed from the original keyword with the new font family and language.</p>

<p>The reason this exists is because otherwise the differing font sizes wouldn’t work anyway! The default font size
is <code class="language-plaintext highlighter-rouge">medium</code>, so basically the root element gets a <code class="language-plaintext highlighter-rouge">font-size: medium</code> and all elements inherit from it. If you change
to monospace or a different language in the document you need the font-size recomputed.</p>

<p>But it doesn’t stop here. This even inherits <em>through relative units</em> (Not in IE).</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;div</span> <span class="na">style=</span><span class="s">"font-size: medium; font-family: sans-serif;"</span><span class="nt">&gt;</span> <span class="c">&lt;!-- base size 16 --&gt;</span>
    font size is 16px
    <span class="nt">&lt;div</span> <span class="na">style=</span><span class="s">"font-size: 0.9em"</span><span class="nt">&gt;</span> <span class="c">&lt;!-- could also be font-size: 50%--&gt;</span>
        font size is 14.4px (16 * 0.9)
        <span class="nt">&lt;div</span> <span class="na">style=</span><span class="s">"font-family: monospace"</span><span class="nt">&gt;</span> <span class="c">&lt;!-- base size 13 --&gt;</span>
            font size is 11.7px! (13 * 0.9)
        <span class="nt">&lt;/div&gt;</span>
    <span class="nt">&lt;/div&gt;</span>
<span class="nt">&lt;/div&gt;</span>
</code></pre></div></div>

<p>(<a href="https://codepen.io/anon/pen/oewpER">codepen</a>)</p>

<div style="border: 1px solid black; display: inline-block; padding: 15px;">
<div style="font-size: medium; font-family: sans-serif;"> <!-- base size 16 -->
    font size is 16px
    <div style="font-size: 0.9em"> <!-- could also be font-size: 90%-->
        font size is 14.4px (16 * 0.9)
        <div style="font-family: monospace"> <!-- base size 13 -->
            font size is 11.7px! (13 * 0.9)
        </div>
    </div>
</div>
</div>

<p>So we’re actually inheriting a font-size of <code class="language-plaintext highlighter-rouge">0.9*medium</code> when we inherit from the second div, not <code class="language-plaintext highlighter-rouge">14.4px</code>.</p>

<p>Another way of looking at it is whenever the font family or language changes, you should recompute the font-size as if the language and family <em>were always that way</em> up the tree.</p>

<p>Firefox code uses both of these strategies. The original Gecko style system handles this by actually
going back to the top of the tree and recalculating the font size as if the language/family were
different. I suspect this is inefficient, but the rule tree seems to be involved in making this slightly
more efficient.</p>

<p>Servo, on the other hand, stores some extra data on the side when computing stuff, data which gets copied over to the child element. It basically
stores the equivalent of saying “Yes, this font was computed from a keyword. The keyword was <code class="language-plaintext highlighter-rouge">medium</code>, and after that we applied a factor of 0.9 to it.”<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">4</a></sup></p>
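<p>A minimal sketch of that “extra data on the side” idea follows. It is not Servo’s actual data structure: the types, method names, and the two hard-coded base sizes (16 for sans-serif, 13 for monospace, as in the examples above) are assumptions for illustration.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#[derive(Clone, Copy, Debug, PartialEq)]
enum Family {
    SansSerif,
    Monospace,
}

// Base size for `medium`, using the default sizes quoted in this post.
fn base_size_px(family: Family) -> f32 {
    match family {
        Family::SansSerif => 16.0,
        Family::Monospace => 13.0,
    }
}

// Where the computed font size came from.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Origin {
    // Derived from `medium`, then scaled by `factor` through relative units.
    Keyword { factor: f32 },
    // Derived from an absolute length; nothing to recompute.
    Absolute,
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct FontStyle {
    family: Family,
    size_px: f32,
    origin: Origin,
}

impl FontStyle {
    // Root style: `font-size: medium` in the given family.
    fn root(family: Family) -> FontStyle {
        FontStyle { family, size_px: base_size_px(family), origin: Origin::Keyword { factor: 1.0 } }
    }

    // Child with a font-size in em: scale the size and remember the factor.
    fn with_em(self, em: f32) -> FontStyle {
        let origin = match self.origin {
            Origin::Keyword { factor } => Origin::Keyword { factor: factor * em },
            Origin::Absolute => Origin::Absolute,
        };
        FontStyle { family: self.family, size_px: self.size_px * em, origin }
    }

    // Child with a font-size in px: the keyword information is dropped.
    fn with_px(self, px: f32) -> FontStyle {
        FontStyle { family: self.family, size_px: px, origin: Origin::Absolute }
    }

    // Child that only changes the family: recompute from the keyword, if any.
    fn with_family(self, family: Family) -> FontStyle {
        let size_px = match self.origin {
            Origin::Keyword { factor } => base_size_px(family) * factor,
            Origin::Absolute => self.size_px,
        };
        FontStyle { family, size_px, origin: self.origin }
    }
}
</code></pre></div></div>

<p>Walking the nested example above through this model, <code class="language-plaintext highlighter-rouge">FontStyle::root(Family::SansSerif)</code> is 16px, <code class="language-plaintext highlighter-rouge">.with_em(0.9)</code> gives roughly 14.4px, and <code class="language-plaintext highlighter-rouge">.with_family(Family::Monospace)</code> gives roughly 11.7px, because the stored factor of 0.9 is reapplied to the monospace base size of 13.</p>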

<p>In both cases, this adds complexity to all the <em>other</em> font-size features, since their behavior needs to be carefully preserved through this recomputation.</p>

<p>In Servo, <em>most</em> of this gets handled <a href="https://github.com/servo/servo/blob/53c6f8ea8bf1002d0c99c067601fe070dcd6bcf1/components/style/properties/longhand/font.mako.rs#L964-L1061">via custom cascading functions for font-size</a>.</p>

<h2 id="largersmaller">Larger/smaller</h2>

<p>So I mentioned that <code class="language-plaintext highlighter-rouge">font-size: larger</code> and <code class="language-plaintext highlighter-rouge">smaller</code> scale the size, but didn’t mention by what fraction.</p>

<p>According <a href="https://drafts.csswg.org/css-fonts-3/#relative-size-value">to the spec</a>, if the font-size currently matches the value of an absolute keyword size (medium/large/etc),
you should pick the value of the next/previous keyword sizes respectively.</p>

<p>If it is <em>between</em> two, find the same point between the next/previous two sizes.</p>

<p>This, of course, must play well with the weird inheritance of keyword font sizes mentioned before. In Gecko’s model this isn’t too hard,
since Gecko recalculates things anyway. In Servo’s model we’d have to store a sequence of applications of <code class="language-plaintext highlighter-rouge">larger</code>/<code class="language-plaintext highlighter-rouge">smaller</code> and relative
units, instead of storing just a relative unit.</p>

<p>Additionally, when computing this during text-zoom, you have to unzoom before looking it up in the table, and then rezoom.</p>

<p>Overall, a bunch of complexity for not much gain — turns out only Gecko actually followed the spec here! All other browser engines
used simple ratios here.</p>

<p>So my fix here <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1361550">was simply to remove this behavior from Gecko</a>. That simplified things.</p>

<h2 id="mathml">MathML</h2>

<p>Firefox and Safari support MathML, a markup language for math. It doesn’t get used much on the Web these days, but it exists.</p>

<p>MathML has its own complexities when it comes to font-size. Specifically, <code class="language-plaintext highlighter-rouge">scriptminsize</code>, <code class="language-plaintext highlighter-rouge">scriptlevel</code>, and <code class="language-plaintext highlighter-rouge">scriptsizemultiplier</code>.</p>

<p>For example, in MathML, the text in the numerator or denominator of a fraction or the text of a superscript is 0.71 times the size of the text outside of it. This is because
the default <code class="language-plaintext highlighter-rouge">scriptsizemultiplier</code> for MathML elements is 0.71, and these specific elements all get a default scriptlevel of <code class="language-plaintext highlighter-rouge">+1</code>.</p>

<p>Basically, <code class="language-plaintext highlighter-rouge">scriptlevel=+1</code> means “multiply the font size by <code class="language-plaintext highlighter-rouge">scriptsizemultiplier</code>”, and
<code class="language-plaintext highlighter-rouge">scriptlevel=-1</code> is for dividing. This can be specified via a <code class="language-plaintext highlighter-rouge">scriptlevel</code> HTML attribute on an <code class="language-plaintext highlighter-rouge">mstyle</code> element. You can
similarly tweak the (inherited) multiplier via the <code class="language-plaintext highlighter-rouge">scriptsizemultiplier</code> HTML attribute, and the minimum size via <code class="language-plaintext highlighter-rouge">scriptminsize</code>.</p>
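<p>Ignoring the minimum size for now, the scaling rule can be sketched as a one-liner. The function name is hypothetical, and <code class="language-plaintext highlighter-rouge">delta</code> is assumed to be the child’s script level minus the parent’s:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Scale a parent font size for a change in script level.
// A positive `delta` multiplies by the multiplier that many times;
// a negative `delta` divides by it instead.
fn scale_for_script_level(parent_size_px: f32, multiplier: f32, delta: i32) -> f32 {
    parent_size_px * multiplier.powi(delta)
}
</code></pre></div></div>

<p>With the default multiplier of 0.71, a 16px parent and a delta of +1 give about 11.36px, and applying a delta of -1 to that result gets back to about 16px.</p>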

<p>So, for example:</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;math&gt;&lt;msup&gt;</span>
    <span class="nt">&lt;mi&gt;</span>text<span class="nt">&lt;/mi&gt;</span>
    <span class="nt">&lt;mn&gt;</span>small superscript<span class="nt">&lt;/mn&gt;</span>
<span class="nt">&lt;/msup&gt;&lt;/math&gt;&lt;br&gt;</span>
<span class="nt">&lt;math&gt;</span>
    text
    <span class="nt">&lt;mstyle</span> <span class="na">scriptlevel=</span><span class="s">+1</span><span class="nt">&gt;</span>
        small
        <span class="nt">&lt;mstyle</span> <span class="na">scriptlevel=</span><span class="s">+1</span><span class="nt">&gt;</span>
            smaller
            <span class="nt">&lt;mstyle</span> <span class="na">scriptlevel=</span><span class="s">-1</span><span class="nt">&gt;</span>
                small again
            <span class="nt">&lt;/mstyle&gt;</span>
        <span class="nt">&lt;/mstyle&gt;</span>
    <span class="nt">&lt;/mstyle&gt;</span>
<span class="nt">&lt;/math&gt;</span>
</code></pre></div></div>

<p>will show as (you will need Firefox to see the rendered version, Safari supports MathML too but the support isn’t as good):</p>

<div style="border: 1px solid black; display: inline-block; padding: 15px;">
<math><msup><mi>text</mi><mn>small superscript</mn></msup></math><br />
<math>text<mstyle scriptlevel="+1"> small <mstyle scriptlevel="+1"> smaller <mstyle scriptlevel="-1"> small again </mstyle></mstyle></mstyle></math>
</div>

<p>(<a href="https://codepen.io/anon/pen/BdZJgR">codepen</a>)</p>

<p>So this isn’t as bad. It’s as if <code class="language-plaintext highlighter-rouge">scriptlevel</code> is a weird <code class="language-plaintext highlighter-rouge">em</code> unit. No biggie, we know how to deal with those already.</p>

<p>Except you also have <code class="language-plaintext highlighter-rouge">scriptminsize</code>. This lets you set the minimum font size <em>for changes caused by <code class="language-plaintext highlighter-rouge">scriptlevel</code></em>.</p>

<p>This means that <code class="language-plaintext highlighter-rouge">scriptminsize</code> will make sure <code class="language-plaintext highlighter-rouge">scriptlevel</code> never causes changes that make the font smaller than the min size,
but it will ignore cases where you deliberately specify an <code class="language-plaintext highlighter-rouge">em</code> unit or a pixel value.</p>

<p>There’s already a subtle bit of complexity introduced here: <code class="language-plaintext highlighter-rouge">scriptlevel</code> now becomes another thing
that tweaks how <code class="language-plaintext highlighter-rouge">font-size</code> inherits. Fortunately, Firefox and Servo internally handle <code class="language-plaintext highlighter-rouge">scriptlevel</code> (along with
<code class="language-plaintext highlighter-rouge">scriptminsize</code> and <code class="language-plaintext highlighter-rouge">scriptsizemultiplier</code>) as a CSS property, which means that we
can use the same framework we used for font-family and language here – compute the script
properties before font-size, and if <code class="language-plaintext highlighter-rouge">scriptlevel</code> is set, force-recalculate the font size even if
font-size itself was not set.</p>

<h3 id="interlude-early-and-late-computed-properties">Interlude: early and late computed properties</h3>

<p>In Servo the way we handle dependencies in properties is to have a set of “early” properties and a
set of “late” properties (which are allowed to depend on early properties). We iterate the
declarations twice, once looking for early properties, and once for late. However, now we have a
pretty intricate set of dependencies, where font-size must be calculated after language, font-family,
and the script properties, but before everything else that involves lengths. Additionally, font-family
has to be calculated after all the other early properties due to another font complexity I’m not covering here.</p>

<p>The way we handle this is to <a href="https://github.com/servo/servo/blob/53c6f8ea8bf1002d0c99c067601fe070dcd6bcf1/components/style/properties/properties.mako.rs#L3195-L3204">pull font-size and font-family</a> out during the early computation,
but not deal with them until <a href="https://github.com/servo/servo/blob/53c6f8ea8bf1002d0c99c067601fe070dcd6bcf1/components/style/properties/properties.mako.rs#L3211-L3327">after the early computation is done</a>.</p>

<p>At that stage we first <a href="https://github.com/servo/servo/blob/53c6f8ea8bf1002d0c99c067601fe070dcd6bcf1/components/style/properties/properties.mako.rs#L3219-L3233">handle the disabling of text-zoom</a>, and then handle <a href="https://github.com/servo/servo/blob/53c6f8ea8bf1002d0c99c067601fe070dcd6bcf1/components/style/properties/properties.mako.rs#L3235-L3277">the complexities of font-family</a>.</p>

<p>We then <a href="https://github.com/servo/servo/blob/53c6f8ea8bf1002d0c99c067601fe070dcd6bcf1/components/style/properties/properties.mako.rs#L3280-L3303">compute the font family</a>. If a font size was specified, we <a href="https://github.com/servo/servo/blob/53c6f8ea8bf1002d0c99c067601fe070dcd6bcf1/components/style/properties/properties.mako.rs#L3305-L3309">just compute that</a>. If it
was not, but a font family, lang, or scriptlevel was specified, we <a href="https://github.com/servo/servo/blob/53c6f8ea8bf1002d0c99c067601fe070dcd6bcf1/components/style/properties/properties.mako.rs#L3310-L3324">force compute as inherited</a>, which handles all the constraints.</p>
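<p>The decision in that last step can be sketched roughly as follows. This is a simplification with hypothetical names, not the code linked above; it only models which declarations were seen during the early pass.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Which of the relevant declarations applied to this element.
#[derive(Clone, Copy, Debug, Default)]
struct EarlyDeclarations {
    font_size: bool,
    font_family: bool,
    lang: bool,
    script_level: bool,
}

// What to do about font-size once the other early properties are computed.
#[derive(Clone, Copy, Debug, PartialEq)]
enum FontSizeAction {
    // A font-size declaration applies: compute it as usual.
    ComputeSpecified,
    // No font-size declaration, but something it depends on changed:
    // run the inheritance logic again so keyword and script handling apply.
    RecomputeAsInherited,
    // Nothing relevant changed: plain inheritance is enough.
    Inherit,
}

fn font_size_action(seen: EarlyDeclarations) -> FontSizeAction {
    if seen.font_size {
        FontSizeAction::ComputeSpecified
    } else if seen.font_family || seen.lang || seen.script_level {
        FontSizeAction::RecomputeAsInherited
    } else {
        FontSizeAction::Inherit
    }
}
</code></pre></div></div>

<p>So an element that only sets <code class="language-plaintext highlighter-rouge">font-family: monospace</code> ends up in the <code class="language-plaintext highlighter-rouge">RecomputeAsInherited</code> case, which is what lets the keyword-based recomputation described earlier kick in.</p>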

<h3 id="why-scriptminsize-gets-complicated">Why scriptminsize gets complicated</h3>

<p>Unlike with the other “minimum font size”, using an <code class="language-plaintext highlighter-rouge">em</code> unit in any property will calculate the
length with the clamped value, not the “if nothing had been clamped” value, when the font size has
been clamped with scriptminsize. So at first glance handling this seems straightforward; only
consider the script min size when deciding to scale because of scriptlevel.</p>

<p>As always, it’s not that simple 😀:</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;math&gt;</span>
<span class="nt">&lt;mstyle</span> <span class="na">scriptminsize=</span><span class="s">"10px"</span> <span class="na">scriptsizemultiplier=</span><span class="s">"0.75"</span> <span class="na">style=</span><span class="s">"font-size:20px"</span><span class="nt">&gt;</span>
    20px
    <span class="nt">&lt;mstyle</span> <span class="na">scriptlevel=</span><span class="s">"+1"</span><span class="nt">&gt;</span>
        15px
        <span class="nt">&lt;mstyle</span> <span class="na">scriptlevel=</span><span class="s">"+1"</span><span class="nt">&gt;</span>
            11.25px
                <span class="nt">&lt;mstyle</span> <span class="na">scriptlevel=</span><span class="s">"+1"</span><span class="nt">&gt;</span>
                    would be 8.4375, but is clamped at 10px
                        <span class="nt">&lt;mstyle</span> <span class="na">scriptlevel=</span><span class="s">"+1"</span><span class="nt">&gt;</span>
                            would be 6.328125, but is clamped at 10px
                                <span class="nt">&lt;mstyle</span> <span class="na">scriptlevel=</span><span class="s">"-1"</span><span class="nt">&gt;</span>
                                    This is not 10px/0.75=13.3, rather it is still clamped at 10px
                                        <span class="nt">&lt;mstyle</span> <span class="na">scriptlevel=</span><span class="s">"-1"</span><span class="nt">&gt;</span>
                                            This is 11.25px again
                                            <span class="nt">&lt;mstyle</span> <span class="na">scriptlevel=</span><span class="s">"-1"</span><span class="nt">&gt;</span>
                                                This is 15px again
                                                    <span class="nt">&lt;mstyle</span> <span class="na">scriptlevel=</span><span class="s">"-1"</span><span class="nt">&gt;</span>
                                                        This is 20px again
                                                    <span class="nt">&lt;/mstyle&gt;</span>
                                            <span class="nt">&lt;/mstyle&gt;</span>
                                        <span class="nt">&lt;/mstyle&gt;</span>
                                <span class="nt">&lt;/mstyle&gt;</span>
                        <span class="nt">&lt;/mstyle&gt;</span>
                <span class="nt">&lt;/mstyle&gt;</span>
        <span class="nt">&lt;/mstyle&gt;</span>
    <span class="nt">&lt;/mstyle&gt;</span>
<span class="nt">&lt;/mstyle&gt;</span>
<span class="nt">&lt;/math&gt;</span>
</code></pre></div></div>

<p>(<a href="https://codepen.io/anon/pen/wqepjo">codepen</a>)</p>

<p>Basically, if you increase the level a bunch of times after hitting the min size, decreasing it by one should not immediately
compute <code class="language-plaintext highlighter-rouge">min size / multiplier</code>. That would make things asymmetric; something with a net script level of <code class="language-plaintext highlighter-rouge">+5</code> should
have the same size as something with a net script level of <code class="language-plaintext highlighter-rouge">+6 -1</code>, provided the multiplier hasn’t changed.</p>

<p>So what happens is that the script level is calculated against the font size <em>as if scriptminsize had never applied</em>,
and we only use that size if it is greater than the min size.</p>

<p>It’s not just a matter of keeping track of the script level at which clamping happened – the multiplier could change
in the process and you need to keep track of that too. So this ends up in creating <em>yet another font-size value to inherit</em>.</p>

<p>To recap, we are now at <em>four</em> different notions of font size being inherited:</p>

<ul>
  <li>The main font size used by styling</li>
  <li>The “actual” font size, i.e. the main font size but clamped by the min size</li>
  <li>(In Servo only) The “keyword” size; i.e. the size stored as a keyword and ratio, if it was derived from a keyword</li>
  <li>The “script unconstrained” size; the font size as if scriptminsize never existed.</li>
</ul>

<p>Another complexity here is that the following should still work:</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;math&gt;</span>
<span class="nt">&lt;mstyle</span> <span class="na">scriptminsize=</span><span class="s">"10px"</span> <span class="na">scriptsizemultiplier=</span><span class="s">"0.75"</span> <span class="na">style=</span><span class="s">"font-size: 5px"</span><span class="nt">&gt;</span>
    5px
    <span class="nt">&lt;mstyle</span> <span class="na">scriptlevel=</span><span class="s">"-1"</span><span class="nt">&gt;</span>
        6.666px
    <span class="nt">&lt;/mstyle&gt;</span>
<span class="nt">&lt;/mstyle&gt;</span>
<span class="nt">&lt;/math&gt;</span>
</code></pre></div></div>

<p>(<a href="https://codepen.io/anon/pen/prwpVd">codepen</a>)</p>

<p>Basically, if you were already below the scriptminsize, reducing the script level (to increase the font size) should not get clamped, since then you’d get something too large.</p>

<p>This basically means you only apply scriptminsize if you are applying the script level to a value <em>greater than</em> the script min size.</p>
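<p>Putting the rules from this section together, here is a simplified sketch of the script level computation. It is modeled on the behavior described above rather than copied from Servo or Gecko, the names are hypothetical, and it assumes that shrinking from a size already below the minimum leaves the size untouched.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// The two font sizes that need to be inherited for script levels.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ScriptSizes {
    // The size actually used for the element.
    size_px: f32,
    // The size as if scriptminsize had never applied.
    unconstrained_px: f32,
}

// Apply a script level change of `delta` (child level minus parent level).
fn apply_script_level(parent: ScriptSizes, delta: i32, multiplier: f32, min_px: f32) -> ScriptSizes {
    if delta == 0 {
        return parent;
    }
    let scale = multiplier.powi(delta);
    let scaled_px = parent.size_px * scale;
    // The unconstrained size is always scaled, never clamped.
    let unconstrained_px = parent.unconstrained_px * scale;
    let size_px = if 1.0 > scale {
        if min_px > parent.size_px {
            // Already below the minimum (e.g. an explicit small font-size):
            // clamping would make the text larger, so leave the size alone.
            // (Assumption of this sketch.)
            parent.size_px
        } else {
            // Shrinking: never go below the minimum size.
            scaled_px.max(min_px)
        }
    } else {
        // Growing: stay clamped until the unconstrained size climbs back
        // above the minimum, and never exceed plain scaling of the parent.
        scaled_px.min(unconstrained_px.max(min_px))
    };
    ScriptSizes { size_px, unconstrained_px }
}
</code></pre></div></div>

<p>Starting from 20px with a multiplier of 0.75 and a minimum of 10px, four steps of +1 give 15, 11.25, 10 and 10, and the first -1 after that is still 10 because the unconstrained size has only climbed back to 8.4375. Starting from 5px with the same settings, a single -1 gives about 6.67px, since plain scaling wins over the clamp.</p>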

<p>In Servo, all of the MathML handling culminates in <a href="https://github.com/servo/servo/blob/53c6f8ea8bf1002d0c99c067601fe070dcd6bcf1/components/style/properties/gecko.mako.rs#L2304-L2403">this wonderful function that is more comment than code</a>, and
some code in the functions near it.</p>

<hr />

<p>So there you have it. <code class="language-plaintext highlighter-rouge">font-size</code> is actually pretty complicated. A lot of the web platform has hidden complexities like this, and it’s always fun to encounter more of them.</p>

<p>(Perhaps less fun when I have to implement them 😂)</p>

<p><em>Thanks to mystor, mgattozzi, bstrie, and projektir for reviewing drafts of this post</em></p>
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:5" role="doc-endnote">
      <p>Interestingly, in Firefox, this number is 50% for all ruby <em>except</em> for when the language is Taiwanese Mandarin (where it is 30%). This is because Taiwan uses a phonetic script called Bopomofo, and each Han glyph can be represented as a maximum of 3 Bopomofo letters. So it is possible to choose a reasonable minimum size such that the ruby never extends the size of the glyph below it. On the other hand, pinyin can be up to six letters, and Hiragana up to (I think) 5, and the corresponding “no overflow” scaling will be too tiny. So fitting them on top of the glyph is not a consideration and instead we elect to have a larger font size for better readability. Additionally, Bopomofo ruby is often set on the side of the glyph instead of on top, and 30% works better there. (h/t @upsuper for pointing this out) <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:1" role="doc-endnote">
      <p>Other browser engines have other optimizations, I’m just less familiar with them <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>Some properties are inherited, some are “reset”. For example, <code class="language-plaintext highlighter-rouge">font-family</code> is inherited: child elements inherit the font family from the parent unless otherwise specified. However, <code class="language-plaintext highlighter-rouge">transform</code> is not: transforming an element does not apply a further transform to its children. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>This won’t handle <code class="language-plaintext highlighter-rouge">calc</code>s, which is something I need to fix. Fixing this is trivial: you store an absolute offset in addition to the ratio. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Teaching Programming: Proactive vs Reactive]]></title>
    <link href="http://manishearth.github.io/blog/2017/05/19/teaching-programming-proactive-vs-reactive/"/>
    <updated>2017-05-19T00:00:00+00:00</updated>
    <id>http://manishearth.github.io/blog/2017/05/19/teaching-programming-proactive-vs-reactive</id>
    <content type="html"><![CDATA[<p>I’ve been thinking about this a lot these days. In part because of <a href="https://github.com/Manishearth/rust-clippy/issues/1737">an idea I had</a>
but also due to <a href="https://twitter.com/sehurlburt/status/863829482645340160">this twitter discussion</a>.</p>

<p>When teaching most things, there are two non-mutually-exclusive ways of approaching the problem. One
is “proactive”<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>, which is where the teacher decides a learning path beforehand, and executes it. The
other is “reactive”, where the teacher reacts to the student trying things out and dynamically
tailors the teaching experience.</p>

<p>Most in-person teaching experiences are a mix of both. Planning beforehand is very important whilst teaching,
but tailoring the experience to the student’s reception of the things being taught is important too.</p>

<p>In person, you <em>can</em> mix these two, and in doing so you get a “best of both worlds” situation. Yay!</p>

<p>But … we don’t really learn much programming in a classroom setup.
Sure, some folks learn the basics in college for a few years, but everything
they learn after that isn’t in a classroom situation where this can work<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>.
I’m an autodidact,
and while I have taken a few programming courses for random interesting things, I’ve taught myself most of what I know
using various sources. I care a lot about improving the situation here.</p>

<p>With self-driven learning we have a similar divide. The “proactive” model corresponds to reading books
and docs. Various people have proactively put forward a path for learning in the form of a book
or tutorial. It’s up to you to pick one, and follow it.</p>

<p>The “reactive” model is not so well-developed. In the context of self-driven learning in programming,
it’s basically “do things, make mistakes, hope that Google/Stack Overflow help”. It’s how
a lot of people learn programming, and it’s how I prefer to learn programming.</p>

<p>It’s very nice to be able to “learn along the way”. While this is a long and arduous process,
involving many false starts and a lack of a sense of progress, it can be worth it in terms of
the kind of experience this gets you.</p>

<p>But as I mentioned, this isn’t as well-developed. With the proactive approach, there still
is a teacher – the author of the book! That teacher may not be able to respond in real time,
but they’re able to set forth a path for you to work through.</p>

<p>On the other hand, with the “reactive” approach, there is no teacher. Sure, there are
Random Answers on the Internet, which are great, but they don’t form a coherent story.
Neither can you really be your own teacher for a topic you do not understand.</p>

<p>Yet plenty of folks do this. Plenty of folks approach things like learning a new language by reading
at most two pages of docs and then just diving straight in and trying stuff out. The only language I
have not done this for is the first language I learned<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup> <sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup>.</p>

<p>I think it’s unfortunate that folks who prefer this approach don’t get the benefit of a teacher.
In the reactive approach, teachers can still tell you what you’re doing wrong and steer you away from
tarpits of misunderstanding. They can get you immediate answers and guidance. When we look
for answers on Stack Overflow, we get some of this, but it also involves a lot of pattern-matching
on the part of the student, and we end up with a bad facsimile of what a teacher can do for you.</p>

<p>But it’s possible to construct a better teacher for this!</p>

<p>In fact, examples of this exist in the wild already!</p>

<p>The Elm compiler is my favorite example of this. <a href="http://elm-lang.org/blog/compilers-as-assistants">It has amazing error messages</a>.</p>

<p><img class="center" src="http://manishearth.github.io/images/post/elm-error.png" />
<img class="center" src="http://manishearth.github.io/images/post/elm-error2.png" /></p>

<p>The error messages tell you what you did wrong, sometimes suggest fixes, and help
correct potential misunderstandings.</p>

<p>Rust does this too. Many compilers do. (Elm is exceptionally good at it.)</p>

<p><img class="center" src="http://manishearth.github.io/images/post/rust-error.png" width="700" /></p>

<p>One thing I particularly like about Rust is that from that error you can
try <code class="language-plaintext highlighter-rouge">rustc --explain E0373</code> and get a terminal-friendly version
of <a href="https://doc.rust-lang.org/nightly/error-index.html#E0373">this help text</a>.</p>

<p>Anyway, diagnostics basically provide a reactive component to learning programming. I’ve cared about
diagnostics in Rust for a long time, and I often remind folks that many things taught through the
docs can/should be taught through diagnostics too. Especially because diagnostics are a kind of soapbox
for compiler writers — you can’t guarantee that your docs will be read, but you can guarantee
that your error messages will. These days, while I don’t have much time to work on stuff myself, I’m
very happy to mentor others working on improving diagnostics in Rust.</p>

<p>Only recently did I realize <em>why</em> I care about them so much – they cater exactly to my approach
to learning programming languages! If I’m not going to read the docs when I get started and try the
reactive approach, having help from the compiler is invaluable.</p>

<p>I think this space is relatively unexplored. Elm might have the best diagnostics out there,
and as diagnostics (helping all users of a language – new and experienced), they’re great,
but as a teaching tool for newcomers, they still have a long way to go. Of course, compilers
like Rust are even further behind.</p>

<p>One thing I’d like to experiment with is a first-class tool for reactive teaching. In a sense,
<a href="https://github.com/Manishearth/rust-clippy">clippy</a> is already something like this. Clippy looks out for antipatterns, and tries to help
teach. But it also does many other things, and not all teaching moments are antipatterns.</p>

<p>For example, in C, this isn’t necessarily an antipattern:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">thingy</span> <span class="o">*</span><span class="n">result</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">result</span> <span class="o">=</span> <span class="n">do_the_thing</span><span class="p">())</span> <span class="p">{</span>
    <span class="n">frob</span><span class="p">(</span><span class="o">*</span><span class="n">result</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Many C codebases use <code class="language-plaintext highlighter-rouge">if (foo = bar())</code>. It is a potential footgun if you confuse it with <code class="language-plaintext highlighter-rouge">==</code>,
but there’s no way to be sure. Many compilers now have a warning for this that you can silence by
doubling the parentheses, though.</p>

<p>In Rust, this isn’t an antipattern either:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="nf">add_one</span><span class="p">(</span><span class="k">mut</span> <span class="n">x</span><span class="p">:</span> <span class="nb">u8</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">x</span> <span class="o">+=</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">let</span> <span class="n">num</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="nf">add_one</span><span class="p">(</span><span class="n">num</span><span class="p">);</span>
<span class="c1">// num is still 0</span>
</code></pre></div></div>

<p>Someone new to Rust may feel that the way to have a function mutate arguments (like <code class="language-plaintext highlighter-rouge">num</code>) passed to it
is to use something like <code class="language-plaintext highlighter-rouge">mut x: u8</code>. What this actually does is copy <code class="language-plaintext highlighter-rouge">num</code> (because <code class="language-plaintext highlighter-rouge">u8</code> is a <code class="language-plaintext highlighter-rouge">Copy</code> type),
and allows you to mutate the copy within the scope of the function. The right way to make a function that
mutates arguments passed to it by-reference would be to do something like <code class="language-plaintext highlighter-rouge">fn add_one(x: &amp;mut u8)</code>.
If you try the <code class="language-plaintext highlighter-rouge">mut x</code> thing for non-Copy values, you’d get a “use of moved value” error
when you try to access <code class="language-plaintext highlighter-rouge">num</code> after calling <code class="language-plaintext highlighter-rouge">add_one</code>. This would help you figure out what you did wrong,
and potentially that error could detect this situation and provide more specific help.</p>
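<p>To make the difference concrete, here is a minimal sketch contrasting the two signatures. The name <code class="language-plaintext highlighter-rouge">add_one_by_value</code> is purely illustrative, and unlike the snippet above it returns the copy so that the effect is observable:</p>

```rust
/// Takes `x` by value. For a `Copy` type like `u8` this mutates a local copy only.
/// (Illustrative name; returns the copy so the change can be observed.)
fn add_one_by_value(mut x: u8) -> u8 {
    x += 1;
    x
}

/// Takes `x` by mutable reference, so the caller's variable is updated.
fn add_one(x: &mut u8) {
    *x += 1;
}

fn main() {
    let mut num: u8 = 0;

    // The callee increments its own copy; `num` is untouched.
    let copied = add_one_by_value(num);
    println!("by value: num = {}, returned copy = {}", num, copied);

    // The callee increments `num` itself through the reference.
    add_one(&mut num);
    println!("by reference: num = {}", num);
}
```

<p>Note that the by-reference version also forces the caller to write <code class="language-plaintext highlighter-rouge">&amp;mut num</code> and to declare <code class="language-plaintext highlighter-rouge">num</code> as <code class="language-plaintext highlighter-rouge">mut</code>, so the mutation is visible at the call site.</p>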

<p>But for <code class="language-plaintext highlighter-rouge">Copy</code> types, this will just compile. And it’s not an antipattern – the way this works
makes complete sense in the context of how Rust variables work, and is something that you do need
to use at times.</p>
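<p>As one hedged example of a legitimate use, a function can treat its by-value parameters as scratch space. The <code class="language-plaintext highlighter-rouge">gcd</code> function below is a hypothetical illustration, not something from clippy or the standard library:</p>

```rust
/// Euclid's algorithm. `a` and `b` are local copies of the arguments,
/// so mutating them here has no effect on the caller's variables.
fn gcd(mut a: u64, mut b: u64) -> u64 {
    while b != 0 {
        let r = a % b;
        a = b;
        b = r;
    }
    a
}

fn main() {
    let x = 48;
    let y = 18;
    println!("gcd = {}", gcd(x, y));
    // `x` and `y` are unchanged and still usable after the call.
    println!("x = {}, y = {}", x, y);
}
```

<p>Here the <code class="language-plaintext highlighter-rouge">mut</code> on the parameters is exactly what is wanted, which is why a blanket warning on this pattern would be noise.</p>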

<p>So we can’t even warn on this. Perhaps in “pedantic clippy” mode, but really, it’s not
a pattern we want to discourage. (At least in the C example that pattern is one
that many people prefer to forbid from their codebase.)</p>

<p>But it would be nice if we could tell a learning programmer “hey, btw, this is what this syntax
means, are you sure you want to do this?”. With explanations and the ability to dismiss the error.</p>

<p>In fact, you don’t even need to restrict this to potential footguns!</p>

<p>You can detect various things the learner is trying to do. Are they probably mixing up <code class="language-plaintext highlighter-rouge">String</code>
and <code class="language-plaintext highlighter-rouge">&amp;str</code>? Help them! Are they writing a trait? Give a little tooltip explaining the feature.</p>

<p>This is beginning to remind me of the original “office assistant” <a href="https://en.wikipedia.org/wiki/Office_Assistant">Clippy</a>, which was super annoying.
But an opt-in tool or IDE feature which gives helpful suggestions could still be nice, especially
if you can strike a balance between being so dense it is annoying and so sparse it is useless.</p>

<p>It also reminds me of well-designed tutorial modes in games. Some games have a tutorial mode that guides you
through a set path of doing things. Other games, however, have a tutorial mode that will give you hints even
if you stray off the beaten path. <a href="https://twitter.com/mgattozzi">Michael</a> tells me that <a href="http://store.steampowered.com/app/480490/Prey/">Prey</a> is
a recent example of such a game.</p>

<p>This really feels like it fits the “reactive” model I prefer. The student gets to mold their own
journey, but gets enough helpful hints and nudges from the “teacher” (the tool) so that they
don’t end up wasting too much time and can make informed decisions on how to proceed learning.</p>

<p>Now, rust-clippy isn’t exactly the place for this kind of tool. This tool needs the ability to globally
“silence” a hint once you’ve learned it. rust-clippy is a linter, and while you can silence lints in
your code, you can’t silence them globally for the current user. Nor does that really make sense.</p>

<p>But rust-clippy does have the infrastructure for writing stuff like this, so it’s an ideal prototyping
point. I’ve filed <a href="https://github.com/Manishearth/rust-clippy/issues/1737">this issue</a> to discuss this topic.</p>

<p>Ultimately, I’d love to see this as an IDE feature.</p>

<p>I’d also like to see more experimentation in the department of “reactive” teaching — not just tools like this.</p>

<p>Thoughts? Ideas? Let me know!</p>

<p><em>thanks to Andre (llogiq) and Michael Gattozzi for reviewing this</em></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>This is how I’m using these terms. There seems to be precedent in pedagogy for the proactive/reactive classification, but it might not be exactly the same as the way I’m using it. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>This is true for everything, but I’m focusing on programming (in particular programming <em>languages</em>) here. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>And when I learned Rust, it only <em>had</em> two pages of docs, aka “The Tutorial”. Good times. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>I do eventually get around to doing a full read of the docs or a book but this is after I’m already able to write nontrivial things in the language, and it takes a lot of time to get there. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
</feed>
