Wrapper Types in Rust: Choosing Your Guarantees

This post is now a part of the official rust book

In my previous post I talked a bit about why the RWlock pattern is important for accessing data, which is why Rust enforces this pattern either at compile time or runtime depending on the abstractions used.

It occurred to me that there are many such abstractions in Rust, each with their unique guarantees. The programmer once again has the choice between runtime and compile time enforcement. It occurred to me that this plethora of “wrapper types”1 could be daunting to newcomers; in this post I intend to give a thorough explanation of what some prominent ones do and when they should be used.

I’m assuming the reader knows about ownership and borrowing in Rust. Nevertheless, I will attempt to keep the majority of this post accessible to those not yet familiar with these concepts. Aside from the two links into the book above, these two blog posts cover the topic in depth.

Basic pointer types

Box<T>

Box<T> is an “owned pointer” or a “box”. While it can hand out borrowed references to the data, it is the only owner of the data. In particular, when something like the following occurs:

Here, the box was moved into y. As x no longer owns it, the compiler will no longer allow the programmer to use x after this. A box can similarly be moved out of a function by returning, and when a box (one which hasn’t been moved) goes out of scope, destructors are run, deallocating the inner data.

This abstraction is a low cost abstraction for dynamic allocation. If you want to allocate some memory on the heap and safely pass a pointer to that memory around, this is ideal. Note that you will only be allowed to share borrowed references to this by the regular borrowing rules, checked at compile time.

Interlude: Copy

Move/ownership semantics are not special to Box<T>; it is a feature of all types which are not Copy.

A Copy type is one where all the data it logically encompasses (usually, owns) is part of its stack representation2. Most types containing pointers to other data are not Copy, since there is additional data elsewhere, and simply copying the stack representation may accidentally share ownership of that data in an unsafe manner.

Types like Vec<T> and String which also have data on the heap are also not Copy. Types like the integer/boolean types are Copy

&T and raw pointers are Copy. Even though they do point to further data, they do not “own” that data. Whereas Box<T> can be thought of as “some data which happens to be dynamically allocated”, &T is thought of as “a borrowing reference to some data”. Even though both are pointers, only the first is considered to be “data”. Hence, a copy of the first should involve a copy of the data (which is not part of its stack representation), but a copy of the second only needs a copy of the reference. &mut T is not Copy because mutable aliases cannot be shared, and &mut T “owns” the data it points to somewhat since it can mutate.

Practically speaking, a type can be Copy if a copy of its stack representation doesn’t violate memory safety.

&T and &mut T

These are immutable and mutable references respectively. They follow the “read-write lock” pattern described in my previous post, such that one may either have only one mutable reference to some data, or any number of immutable ones, but not both. This guarantee is enforced at compile time, and has no visible cost at runtime. In most cases such pointers suffice for sharing cheap references between sections of code.

These pointers cannot be copied in such a way that they outlive the lifetime associated with them.

*const T and *mut T

These are C-like raw pointers with no lifetime or ownership attached to them. They just point to some location in memory with no other restrictions. The only guarantee that these provide is that they cannot be dereferenced except in code marked unsafe.

These are useful when building safe, low cost abstractions like Vec<T>, but should be avoided in safe code.

Rc<T>

This is the first wrapper we will cover that has a runtime cost.

Rc<T> is a reference counted pointer. In other words, this lets us have multiple “owning” pointers to the same data, and the data will be freed (destructors will be run) when all pointers are out of scope.

Internally, it contains a shared “reference count”, which is incremented each time the Rc is cloned, and decremented each time one of the Rcs goes out of scope. The main responsibility of Rc<T> is to ensure that destructors are called for shared data.

The internal data here is immutable, and if a cycle of references is created, the data will be leaked. If we want data that doesn’t leak when there are cycles, we need a garbage collector. I do not know of any existing GCs in Rust, but I am working on one with Michael Layzell and there’s another cycle collecting one being written by Nick Fitzgerald.

Guarantees

The main guarantee provided here is that the data will not be destroyed until all references to it are out of scope.

This should be used when you wish to dynamically allocate and share some data (read-only) between various portions of your program, where it is not certain which portion will finish using the pointer last. It’s a viable alternative to &T when &T is either impossible to statically check for correctness, or creates extremely unergonomic code where the programmer does not wish to spend the development cost of working with.

This pointer is not thread safe, and Rust will not let it be sent or shared with other threads. This lets one avoid the cost of atomics in situations where they are unnecessary.

There is a sister smart pointer to this one, Weak<T>. This is a non-owning, but also non-borrowed, smart pointer. It is also similar to &T, but it is not restricted in lifetime — a Weak<T> can be held on to forever. However, it is possible that an attempt to access the inner data may fail and return None, since this can outlive the owned Rcs. This is useful for when one wants cyclic data structures and other things.

Cost

As far as memory goes, Rc<T> is a single allocation, though it will allocate two extra words as compared to a regular Box<T> (for “strong” and “weak” refcounts).

Rc<T> has the computational cost of incrementing/decrementing the refcount whenever it is cloned or goes out of scope respectively. Note that a clone will not do a deep copy, rather it will simply increment the inner reference count and return a copy of the Rc<T>

Cell types

“Cells” provide interior mutability. In other words, they contain data which can be manipulated even if the type cannot be obtained in a mutable form (for example, when it is behind an &-ptr or Rc<T>).

These types are generally found in struct fields, but they may be found elsewhere too.

Cell<T>

Cell<T> is a type that provides zero-cost interior mutability, but only for Copy types. Since the compiler knows that all the data owned by the contained value is on the stack, there’s no worry of leaking any data behind references (or worse!) by simply replacing the data.

It is still possible to violate your own invariants using this wrapper, so be careful when using it. If a field is wrapped in Cell, it’s a nice indicator that the chunk of data is mutable and may not stay the same between the time you first read it and when you intend to use it.

Note that here we were able to mutate the same value from various immutable references.

This has the same runtime cost as the following:

but it has the added benefit of actually compiling successfully.

Guarantees

This relaxes the “no aliasing with mutability” restriction in places where it’s unnecessary. However, this also relaxes the guarantees that the restriction provides; so if one’s invariants depend on data stored within Cell, one should be careful.

This is useful for mutating primitives and other Copy types when there is no easy way of doing it in line with the static rules of & and &mut.

Gábor Lehel summed up the guarantees provided by Cell in a rather succinct manner:

The basic guarantee we need to ensure is that interior references can’t be invalidated (left dangling) by mutation of the outer structure. (Think about references to the interiors of types like Option, Box, Vec, etc.) &, &mut, and Cell each make a different tradeoff here. & allows shared interior references but forbids mutation; &mut allows mutation xor interior references but not sharing; Cell allows shared mutability but not interior references.

Ultimately, while shared mutability can cause many logical errors (as outlined in my previous post ), it can only cause memory safety errors when coupled with “interior references”. This is for types who have an “interior” whose type/size can itself be changed. One example of this is a Rust enum; where by changing the variant you can change what type is contained. If you have an alias to the inner type whilst the variant is changed, pointers within that alias may be invalidated. Similarly, if you change the length of a vector while you have an alias to one of its elements, that alias may be invalidated.

Since Cell doesn’t allow references to the insides of a type (you can only copy out and copy back in), enums and structs alike are safe to be aliased mutably within this.

This comment by Eddy also touches on the guarantees of Cell and the alternatives

Cost

There is no runtime cost to using Cell<T>, however if one is using it to wrap larger (Copy) structs, it might be worthwhile to instead wrap individual fields in Cell<T> since each write is a full copy of the struct.

RefCell<T>

RefCell<T> also provides interior mutability, but isn’t restricted to Copy types.

Instead, it has a runtime cost. RefCell<T> enforces the RWLock pattern at runtime (it’s like a single-threaded mutex), unlike &T/&mut T which do so at compile time. This is done by the borrow() and borrow_mut() functions, which modify an internal reference count and return smart pointers which can be dereferenced immutably and mutably respectively. The refcount is restored when the smart pointers go out of scope. With this system, we can dynamically ensure that there are never any other borrows active when a mutable borrow is active. If the programmer attempts to make such a borrow, the thread will panic.

Similar to Cell, this is mainly useful for situations where it’s hard or impossible to satisfy the borrow checker. Generally one knows that such mutations won’t happen in a nested form, but it’s good to check.

For large, complicated programs, it becomes useful to put some things in RefCells to make things simpler. For example, a lot of the maps in the ctxt struct in the rust compiler internals are inside this wrapper. These are only modified once (during creation, which is not right after initialization) or a couple of times in well-separated places. However, since this struct is pervasively used everywhere, juggling mutable and immutable pointers would be hard (perhaps impossible) and probably form a soup of &-ptrs which would be hard to extend. On the other hand, the RefCell provides a cheap (not zero-cost) way of safely accessing these. In the future, if someone adds some code that attempts to modify the cell when it’s already borrowed, it will cause a (usually deterministic) panic which can be traced back to the offending borrow.

Similarly, in Servo’s DOM we have a lot of mutation, most of which is local to a DOM type, but some of which crisscrosses the DOM and modifies various things. Using RefCell and Cell to guard all mutation lets us avoid worrying about mutability everywhere, and it simultaneously highlights the places where mutation is actually happening.

Note that RefCell should be avoided if a mostly simple solution is possible with & pointers.

Guarantees

RefCell relaxes the static restrictions preventing aliased mutation, and replaces them with dynamic ones. As such the guarantees have not changed.

Cost

RefCell does not allocate, but it contains an additional “borrow state” indicator (one word in size) along with the data.

At runtime each borrow causes a modification/check of the refcount.

Synchronous types

Many of the types above cannot be used in a threadsafe manner. Particularly, Rc<T> and RefCell<T>, which both use non-atomic ref counts, cannot be used this way. This makes them cheaper to use, but one needs thread safe versions of these too. They exist, in the form of Arc<T> and Mutex<T>/RWLock<T>

Note that the non-threadsafe types cannot be sent between threads, and this is checked at compile time. I’ll touch on how this is done in a later blog post.

There are many useful wrappers for concurrent programming in the sync module, but I’m only going to cover the major ones.

Arc<T>

Arc<T> is just a version of Rc<T> that uses an atomic reference count (hence, “Arc”). This can be sent freely between threads.

C++’s shared_ptr is similar to Arc, however in C++s case the inner data is always mutable. For semantics similar to that from C++, we should use Arc<Mutex<T>>, Arc<RwLock<T>>, or Arc<UnsafeCell<T>>3 (UnsafeCell<T> is a cell type that can be used to hold any data and has no runtime cost, but accessing it requires unsafe blocks). The last one should only be used if one is certain that the usage won’t cause any memory unsafety. Remember that writing to a struct is not an atomic operation, and many functions like vec.push() can reallocate internally and cause unsafe behavior (so even monotonicity4 may not be enough to justify UnsafeCell)

Guarantees

Like Rc, this provides the (thread safe) guarantee that the destructor for the internal data will be run when the last Arc goes out of scope (barring any cycles).

Cost

This has the added cost of using atomics for changing the refcount (which will happen whenever it is cloned or goes out of scope). When sharing data from an Arc in a single thread, it is preferable to share & pointers whenever possible.

Mutex<T> and RwLock<T>

Mutex<T> and RwLock<T> provide mutual-exclusion via RAII guards. For both of these, the mutex is opaque until one calls lock() on it, at which point the thread will block until a lock can be acquired, and then a guard will be returned. This guard can be used to access the inner data (mutably), and the lock will be released when the guard goes out of scope.

RwLock has the added benefit of being efficient for multiple reads. It is always safe to have multiple readers to shared data as long as there are no writers; and RwLock lets readers acquire a “read lock”. Such locks can be acquired concurrently and are kept track of via a reference count. Writers must obtain a “write lock” which can only be obtained when all readers have gone out of scope.

Guarantees

Both of these provide safe shared mutability across threads, however they are prone to deadlocks. Some level of additional protocol safety can be obtained via the type system. An example of this is rust-sessions, an experimental library which uses session types for protocol safety.

Costs

These use internal atomic-like types to maintain the locks, and these are similar pretty costly (they can block all memory reads across processors till they’re done). Waiting on these locks can also be slow when there’s a lot of concurrent access happening.

Composition

A common gripe when reading Rust code is with stuff like Rc<RefCell<Vec<T>>> and more complicated compositions of such types.

Usually, it’s a case of composing together the guarantees that one needs, without paying for stuff that is unnecessary.

For example, Rc<RefCell<T>> is one such composition. Rc itself can’t be dereferenced mutably; because Rc provides sharing and shared mutability isn’t good, so we put RefCell inside to get dynamically verified shared mutability. Now we have shared mutable data, but it’s shared in a way that there can only be one mutator (and no readers) or multiple readers.

Now, we can take this a step further, and have Rc<RefCell<Vec<T>>> or Rc<Vec<RefCell<T>>>. These are both shareable, mutable vectors, but they’re not the same.

With the former, the RefCell is wrapping the Vec, so the Vec in its entirety is mutable. At the same time, there can only be one mutable borrow of the whole Vec at a given time. This means that your code cannot simultaneously work on different elements of the vector from different Rc handles. However, we are able to push and pop from the Vec at will. This is similar to an &mut Vec<T> with the borrow checking done at runtime.

With the latter, the borrowing is of individual elements, but the overall vector is immutable. Thus, we can independently borrow separate elements, but we cannot push or pop from the vector. This is similar to an &mut [T]5, but, again, the borrow checking is at runtime.

In concurrent programs, we have a similar situation with Arc<Mutex<T>>, which provides shared mutability and ownership.

When reading code that uses these, go in step by step and look at the guarantees/costs provided.

When choosing a composed type, we must do the reverse; figure out which guarantees we want, and at which point of the composition we need them. For example, if there is a choice between Vec<RefCell<T>> and RefCell<Vec<T>>, we should figure out the tradeoffs as done above and pick one.

Discuss: HN, Reddit

1. I’m not sure if this is the technical term for them, but I’ll be calling them that throughout this post.

2. By “stack representation” I mean the data on the stack when a value of this type is held on the stack. For example, a Vec<T> has a stack representation of a pointer and two integers (length, capacity). While there is more data behind the indirection of the pointer, it is not part of the stack-held portion of the Vec. Looking at this a different way, a type is Copy if a memcopy of the data copies all the data owned by it.

3. Arc<UnsafeCell<T>> actually won’t compile since UnsafeCell<T> isn’t Send or Sync, but we can wrap it in a type and implement Send/Sync for it manually to get Arc<Wrapper<T>> where Wrapper is struct Wrapper<T>(UnsafeCell<T>).

4. By this I mean a piece of data that has a monotonic consistency requirement; i.e. a counter or a monotonically growing stack

5. &[T] and &mut [T] are slices; they consist of a pointer and a length and can refer to a portion of a vector or array. &mut [T] can have its elements mutated, however its length cannot be touched.

The Problem With Single-threaded Shared Mutability

Edit (Jan 2017): I re-discovered Niko’s post which touches on this and reaches for the same realization. I suspect I subconsciously got the idea for this from that post, at least in part.

This is a post that I’ve been meaning to write for a while now; and the release of Rust 1.0 gives me the perfect impetus to go ahead and do it.

Whilst this post discusses a choice made in the design of Rust; and uses examples in Rust; the principles discussed here apply to other languages for the most part. I’ll also try to make the post easy to understand for those without a Rust background; please let me know if some code or terminology needs to be explained.

What I’m going to discuss here is the choice made in Rust to disallow having multiple mutable aliases to the same data (or a mutable alias when there are active immutable aliases), even from the same thread. In essence, it disallows one from doing things like:

This is essentially the “Read-Write lock” (RWLock) pattern, except it’s not being used in a threaded context, and the “locks” are done via static analysis (compile time “borrow checking”).

Newcomers to the language have the recurring question as to why this exists. Ownership semantics and immutable borrows can be grasped because there are concrete examples from languages like C++ of problems that these concepts prevent. It makes sense that having only one “owner” and then multiple “borrowers” who are statically guaranteed to not stick around longer than the owner will prevent things like use-after-free.

But what could possibly be wrong with having multiple handles for mutating an object? Why do we need an RWLock pattern? 1

It causes memory unsafety

This issue is specific to Rust, and I promise that this will be the only Rust-specific answer.

Rust enums provide a form of algebraic data types. A Rust enum is allowed to “contain” data, for example you can have the enum

which gives us a type that can either be a variant Str, with an associated string, or a variant Int2, with an associated integer.

With such an enum, we could cause a segfault like so:

Here, we invalidated the insides reference because setting x to Int(1) meant that there is no longer a string inside it. However, insides is still a reference to a String, and the generated assembly would try to dereference the memory location where the pointer to the allocated string was, and probably end up trying to dereference 1 or some nearby data instead, and cause a segfault.

Okay, so far so good. We know that for Rust-style enums to work safely in Rust, we need the RWLock pattern. But are there any other reasons we need the RWLock pattern? Not many languages have such enums, so this shouldn’t really be a problem for them.

Iterator invalidation

Ah, the example that is brought up almost every time the question above is asked. While I’ve been quite guilty of using this example often myself (and feel that it is a very appropriate example that can be quickly explained), I also find it to be a bit of a cop-out, for reasons which I will explain below. This is partly why I’m writing this post in the first place; a better idea of the answer to The Question should be available for those who want to dig deeper.

Iterator invalidation involves using tools like iterators whilst modifying the underlying dataset somehow.

For example,

Firstly, this will loop infinitely (if it compiled, which it doesn’t, because Rust prevents this). The equivalent C++ example would be this one, which I use at every opportunity.

What’s happening in both code snippets is that the iterator is really just a pointer to the vector and an index. It doesn’t contain a snapshot of the original vector; so pushing to the original vector will make the iterator iterate for longer. Pushing once per iteration will obviously make it iterate forever.

The infinite loop isn’t even the real problem here. The real problem is that after a while, we could get a segmentation fault. Internally, vectors have a certain amount of allocated space to work with. If the vector is grown past this space, a new, larger allocation may need to be done (freeing the old one), since vectors must use contiguous memory.

This means that when the vector overflows its capacity, it will reallocate, invalidating the reference stored in the iterator, and causing use-after-free.

Of course, there is a trivial solution in this case — store a reference to the Vec/vector object inside the iterator instead of just the pointer to the vector on the heap. This leads to some extra indirection or a larger stack size for the iterator (depending on how you implement it), but overall will prevent the memory unsafety.

This would still cause problems with more complex situations involving multidimensional vectors, however.

Aliasing with mutability in a sufficiently complex, single-threaded program is effectively the same thing as accessing data shared across multiple threads without a lock

(The above is my paraphrasing of someone else’s quote; but I can’t find the original or remember who made it)

Edit (Jan 2017): I found the original, it’s a comment by kmc:

My intuition is that code far away from my code might as well be in another thread, for all I can reason about what it will do to shared mutable state.

Let’s step back a bit and figure out why we need locks in multithreaded programs. The way caches and memory work; we’ll never need to worry about two processes writing to the same memory location simultaneously and coming up with a hybrid value, or a read happening halfway through a write.

What we do need to worry about is the rug being pulled out underneath our feet. A bunch of related reads/writes would have been written with some invariants in mind, and arbitrary reads/writes possibly happening between them would invalidate those invariants. For example, a bit of code might first read the length of a vector, and then go ahead and iterate through it with a regular for loop bounded on the length. The invariant assumed here is the length of the vector. If pop() was called on the vector in some other thread, this invariant could be invalidated after the read to length but before the reads elsewhere, possibly causing a segfault or use-after-free in the last iteration.

However, we can have a situation similar to this (in spirit) in single threaded code. Consider the following:

We have the same invariant here; but can we be sure that x.do_something_complicated() doesn’t modify x.some_vec for some reason? In a complicated codebase, where do_something_complicated() itself calls a lot of other functions which may also modify x, this can be hard to audit.

Of course, the above example is a simplification and contrived; but it doesn’t seem unreasonable to assume that such bugs can happen in large codebases — where many methods being called have side effects which may not always be evident.

Which means that in large codebases we have almost the same problem as threaded ones. It’s very hard to maintain invariants when one is not completely sure of what each line of code is doing. It’s possible to become sure of this by reading through the code (which takes a while), but further modifications may also have to do the same. It’s impractical to do this all the time and eventually bugs will start cropping up.

On the other hand, having a static guarantee that this can’t happen is great. And when the code is too convoluted for a static guarantee (or you just want to avoid the borrow checker), a single-threaded RWlock-esque type called RefCell is available in Rust. It’s a type providing interior mutability and behaves like a runtime version of the borrow checker. Similar wrappers can be written in other languages.

Edit: In case of many primitives like simple integers, the problems with shared mutability turn out to not be a major issue. For these, we have a type called Cell which lets these be mutated and shared simultaenously. This works on all Copy types; i.e. types which only need to be copied on the stack to be copied. (Unlike types involving pointers or other indirection)

This sort of bug is a good source of reentrancy problems too.

Safe abstractions

In particular, the issue in the previous section makes it hard to write safe abstractions, especially with generic code. While this problem is clearer in the case of Rust (where abstractions are expected to be safe and preferably low-cost), this isn’t unique to any language.

Every method you expose has a contract that is expected to be followed. Many times, a contract is handled by type safety itself, or you may have some error-based model to throw out uncontractual data (for example, division by zero).

But, as an API (can be either internal or exposed) gets more complicated, so does the contract. It’s not always possible to verify that the contract is being violated at runtime either, for example many cases of iterator invalidation are hard to prevent in nontrivial code even with asserts.

It’s easy to create a method and add documentation “the first two arguments should not point to the same memory”. But if this method is used by other methods, the contract can change to much more complicated things that are harder to express or check. When generics get involved, it only gets worse; you sometimes have no way of forcing that there are no shared mutable aliases, or of expressing what isn’t allowed in the documentation. Nor will it be easy for an API consumer to enforce this.

This makes it harder and harder to write safe, generic abstractions. Such abstractions rely on invariants, and these invariants can often be broken by the problems in the previous section. It’s not always easy to enforce these invariants, and such abstractions will either be misused or not written in the first place, opting for a heavier option. Generally one sees that such abstractions or patterns are avoided altogether, even though they may provide a performance boost, because they are risky and hard to maintain. Even if the present version of the code is correct, someone may change something in the future breaking the invariants again.

My previous post outlines a situation where Rust was able to choose the lighter path in a situation where getting the same guarantees would be hard in C++.

Note that this is a wider problem than just with mutable aliasing. Rust has this problem too, but not when it comes to mutable aliasing. Mutable aliasing is important to fix however, because we can make a lot of assumptions about our program when there are no mutable aliases. Namely, by looking at a line of code we can know what happened wrt the locals. If there is the possibility of mutable aliasing out there; there’s the possibility that other locals were modified too. A very simple example is:

In both cases, when the two references are the same3, instead of swapping, the two variables get set to zero. A user (internal to your library, or an API consumer) would expect swap() to not change anything when fed equal references, but this is doing something totally different. This assumption could get used in a program; for example instead of skipping the passes in an array sort where the slot is being compared with itself, one might just go ahead with it because swap() won’t change anything there anyway; but it does, and suddenly your sort function fills everything with zeroes. This could be solved by documenting the precondition and using asserts, but the documentation gets harder and harder as swap() is used in the guts of other methods.

Of course, the example above was contrived. It’s well known that those swap() implementations have that precondition, and shouldn’t be used in such cases. Also, in most swap algorithms it’s trivial to ignore cases when you’re comparing an element with itself, generally done by bounds checking.

But the example is a simplified sketch of the problem at hand.

In Rust, since this is statically checked, one doesn’t worry much about these problems, and robust APIs can be designed since knowing when something won’t be mutated can help simplify invariants.

Wrapping up

Aliasing that doesn’t fit the RWLock pattern is dangerous. If you’re using a language like Rust, you don’t need to worry. If you’re using a language like C++, it can cause memory unsafety, so be very careful. If you’re using a language like Java or Go, while it can’t cause memory unsafety, it will cause problems in complex bits of code.

This doesn’t mean that this problem should force you to switch to Rust, either. If you feel that you can avoid writing APIs where this happens, that is a valid way to go around it. This problem is much rarer in languages with a GC, so you might be able to avoid it altogether without much effort. It’s also okay to use runtime checks and asserts to maintain your invariants; performance isn’t everything.

But this is an issue in programming; and make sure you think of it when designing your code.

Discuss: HN, Reddit

1. Hereafter referred to as “The Question”

2. Note: Str and Int are variant names which I chose; they are not keywords. Additionally, I’m using “associated foo” loosely here; Rust does have a distinct concept of “associated data” but it’s not relevant to this post.

3. Note that this isn’t possible in Rust due to the borrow checker.

Where Rust Really Shines

Yesterday I was working on a small feature for the Rust compiler, and came across a situation which really showcased Rust’s awesomeness as a language.

There was a struct which was exposed to an API, and I wished to give it access to a list of things known as “attributes”, where the list was a heap-allocated vector.

Now, I have two ways of actually giving the struct access to a vector. I can either clone it (i.e. make a copy of its contents), or use a reference (pointer) to it or its contents.

In a language like C++ there’s only once choice in this situation; that is to clone the vector1. In a large C++ codebase if I wished to use a pointer I would need to be sure that the vector isn’t deallocated by the time I’m done with it, and more importantly, to be sure that no other code pushes to the vector (when a vector overflows its capacity it will be reallocated, invalidating any other pointers to its contents).

For a smaller codebase this might be possible, but in this specific case it could have taken me a while to become sure of this. The code was related to the “expansion” portion of compilation, where the AST is expanded to a bigger AST. A lot of things change and get moved around, so it is reasonable to assume that it might not be possible to safely use it. I would have had to find out where the vector is originally stored; all the entry points for the code I was modifying, and make sure it isn’t being mutated (not as hard in Rust, but I would still need to muck around a large codebase). And then I would have to somehow make sure that nobody tries to mutate it in the future. This is a task which I would not even consider trying in C++.

However, I had another option here, because this was Rust. In Rust I can store a reference to the contents of the vector without fear of invalidation, since the compiler will prevent me from using the vector in a way that could cause unsafety. Such a reference is known as a slice.

Whilst in C++ I would have to manually go through a lot of code to be sure of safety (and even after all that be left with code that would be brittle to changes elsewhere the codebase), in Rust the compiler can do this for me!

Being able to do this was important — this code is called quite often for a regular compile, and all those extra allocations could be heavy, especially given that this was a feature that would be used by very few.

So first I started off by adding a field to the FieldInfo struct which was a slice of attributes. Notice that I added a lifetime specifier, the 'a to the struct definition.

For those of you new to Rust, a lifetime is part of the type of a reference. It’s related to the scope of the reference, and generally can be treated as a generic parameter. So, for example, here, I have a FieldInfo with a lifetime parameter of 'a where 'a is the lifetime of the inner slice of attributes. If I construct this struct with slices from different scopes, its type will be different each time. Lifetimes can get automatically cast depending on their context however, and quite often they get elided away, so one doesn’t need to specify them that much (aside from struct/enum definitions). You can find more information in the Rust book

I then updated code everywhere to pass the attributes from their source to their destination through the chained methods.

An important thing to note here is that none of the lifetime specifiers you see now in the commit were added when I did this. For example, the return value of create_struct_pattern was (P<ast::Pat>, Vec<(Span, Option<Ident>, P<Expr>, &[ast::Attribute])>) at this point, not (P<ast::Pat>, Vec<(Span, Option<Ident>, P<Expr>, &'a [ast::Attribute])>). You can ignore the complicated types being passed around, for now just pretend that a slice of attributes was returned.

Now comes the magic. After these small changes necessary for the feature, I basically let the compiler do the rest of the work. See, at this point the code was wrong. I had forgotten lifetime specifiers in places where they were important, and still wasn’t sure if storing a reference would in fact be possible in the first place. However, the compiler was smart enough to figure things out for me. It would tell me to add lifetime specifiers, and I would add them.

First, the compiler asked me to add a lifetime to the FieldInfo parts of SubstructureFields. So, the following:

became

This needed to happen because elision doesn’t work for structs and enums, and besides, the compiler would need to know if the &ast::Variant was supposed to be the same lifetime as the parameter of the FieldInfos. I decided to just use the existing 'a parameter, which meant that yes, the &ast::Variant was supposed to live just as long. I could also have opted to give the FieldInfos a different lifetime by adding a 'b parameter, but I guessed that it would work this way too (knowing the origin of the fieldinfo and variant, and that implicit lifetime casting would fix most issues that cropped up). I didn’t need to think this out much, though — the compiler gave me a suggestion and I could simply copy it.

The next error was in create_enum_variant_pattern() and create_struct_pattern() as well as some other places.

Here, the method had a signature of

and I changed it to

In this case, the code was uncomfortable with taking a slice of attributes out of an arbitrary StructDef reference and returning it. What if the StructDef doesn’t live long enough? Generally the compiler internally figures out the lifetimes necessary and uses them here, but if you have too many references there’s no single way to make the fix. In this case, the compiler suggested I add a 'a to &StructDef and the returned &[Attribute], and I did so. The 'a lifetime was declared at the top of the impl, so it was the lifetime parameter of self2. This meant that the returned attribute of the function will have a lifetime tied to self and the input StructDef, and due to this it cannot outlive the inputs, which is what we wanted in the first place. In essence, I took a bit of code that was doing:

and changed it to

Again, I didn’t need to think this out much (I’m only thinking it through now for this blog post). I followed the suggestion given to me by the compiler:

There were a couple of similar errors elsewhere that were caused by tying these two lifetimes together. Since these methods were chained, updating the lifetimes of a child method would mean that I would have to now update the parent method which passes its arguments down to the children and returns a modification of its return value (and thus must now impose the same restrictions on its own signature). All of this was done by just listening to the suggestions of the compiler (which all contain a function signature to try out). In some cases I introduced a 'b lifetime, because tying it to 'a (the self lifetime parameter) was possibly too restrictive. All of this at the suggestion of the compiler.

While this all seems long and complicated, in reality it wasn’t. I simply added the field to the initial struct, tried compiling a couple of times to figure out which code needed updating to pass around the attributes, and then went through 3-4 more compilation attempts to fix the lifetimes. It didn’t take long, and I didn’t need to put much mental effort into it. I just listened to the compiler, and it worked.

And now I trust completely that that code will not cause any segfaults due to attempted access of a destroyed or moved vector. And this is despite the fact that I still don’t know where that particular vector is modified or destroyed — I didn’t explore that far because I didn’t need to! (or want to :P)

And this is one place Rust really shines. It lets you do optimizations which you wouldn’t dream of doing in C++. In fact, while the C++ way of looking at this problem would probably be to just clone and move on, most Rust programmers would think of using slices as the default, and not even consider it an “optimization”. And again, this wasn’t with much cognitive overhead; I could just follow the compiler and it fixed everything for me.

1. Some people have pointed out that a shared pointer to the vector itself would work here too. This is correct, but a shared pointer also has a runtime overhead, and more importantly doesn’t prevent iterator invalidation. I had no idea how the vector was being used elsewhere, so this was a risk I didn’t want to take. Additionally, whilst a shared pointer to the vector itself is immune to the issue of the vector being moved, since this was an API, someone consuming the API might take a reference of an attribute and hold on to it long enough for it to become invalidated. This is something we can’t have either – an API consumer should not have to worry about where the pointers will invalidate.

2. Note: This is not the lifetime of the reference &self, which is the lifetime of the pointer (&'b self), but the lifetime parameter of self, a TraitDef<'a>, which has a lifetime parameter for its child fields.

New Blog!

I’ll be moving from my old Blogger-powered blog to this new one powered by github pages and Octopress. I never enjoyed writing a blog in WYSIWYG or HTML (I would constantly switch between both and still get the formatting wrong); Markdown is my cup of tea.

I may “uplift” some of my favorite posts to this blog later.

I’m quite excited!