“Wait”, you ask, “why does Rust need a garbage collector”? Rust is supposed to work without a GC, that’s one of its main selling points!
True. Rust does work pretty well without a GC. It’s managed to do without one so far, and we still have all sorts of well-written crates out there (none of which use a GC).
But Rust is not just about low-cost memory safety. It’s also about choosing your costs and
Box<T> and stack allocation are not always sufficient, sometimes one needs to
reach for something like
Rc<T> (reference counting). But even
Rc is not perfect; it can’t handle
cycles between pointers. There are solutions to that issue like using
Weak<T>, but that only works
in limited cases (when you know what the points-to graph looks like at compile time), and isn’t very
Cases where one needs to maintain a complicated, dynamic graph are where a GC becomes useful. Similarly, if one is writing an interpreter for a GCd language, having a GC in Rust would simplify things a lot.
Not to say that one should pervasively use a GC in Rust. Similar to
Rc<T>, it’s best to use
regular ownership-based memory management as much as possible, and sprinkle
Gc in places
where your code needs it.
This isn’t the first GC in Rust. Automatic memory management has existed before in various forms, but all were limited.
Besides the ones listed below, Nick Fitzgerald’s cycle collector based on this paper exists and is something that you should look into if you’re interested. There’s also an RFC by Peter Liniker which sketches out a design for an immutable GC.
Core Rust GC(s)
Rust itself had a garbage collector until a bit more than a year ago. These “managed pointers”
@T) were part of the language. They were removed later with a plan to make GC a library feature.
I believe these were basically reference counted (cycle collected?) pointers with some language integration, but I’m not sure.
Nowadays, the only form of automatic memory management in Rust are via
which are nonatomic and atomic reference counted pointers respectively. In other words, they keep
track of the number of shared references via a reference count (incremented when it is cloned,
decremented when destructors run). If the reference count reaches zero, the contents are cleaned up.
This is a pretty useful abstraction, however, as mentioned above, it doesn’t let you create cycles without leaking them.
You can read more about Servo’s Spidermonkey bindings in this blog post (somewhat outdated, but still relevant)
Of course, this sort of GC isn’t really useful for generic use since it comes bundled with a JS runtime. However, the Rust side of the GC is of a design that could be used in an independent library.
The Rust side of the Spidermonkey GC is done through a bunch of smart pointers, and a trait called
JSTraceable is a trait which can “trace” recursively down some data, finding and
marking all GC-managed objects inside it. This is autoderived using Rust’s plugin infrastructure, so
#[jstraceable] annotation will generate trace hooks for the struct it is on.
Now, we have various smart pointers. The first is
JS<T>. This is opaque, but can be held by other
GC-managed structs. To use this on the stack, this must be explicitly rooted, via
Root<T>, which can be dereferenced to get the inner object. When the
Root is created,
the contained object is listed in a collection of “roots” in a global. A root indicates that the
value is being used on the stack somewhere, and the GC starts tracing usage from these roots. When
Root<T> is destroyed, the root is removed.
The problem with this is that
JS<T> doesn’t work on the stack. There is no way for the GC to know
that we are holding on to
JS<T> on the stack. So, if I copy a
JS<T> to the stack, remove all
references to it from objects in the GC heap, and trigger a collection, the
JS<T> will still be
around on the stack after collection since the GC can’t trace to it. If I attempt to root it, I may
get a panic or a segfault depending on the implementation.
To protect against this, we have a bunch of lints. The relevant one here protects
JS<T> from being carried around on the stack; but like most lints, it’s not perfect.
To summarize: Spidermonkey gives us a good GC. However using it for a generic Rust program is ill advised. Additionally, Servo’s wrappers around the GC are cheap, but need lints for safety. While it would probably be possible to write safer wrappers for general usage, it’s pretty impractical to carry around a JS runtime when you don’t need one.
However, Spidermonkey’s GC did inspire me to think more into the matter.
Brainstorming a design
For quite a while I’d had various ideas about GCs. Most were simplifications of Servo’s wrappers (there’s some complexity brought in there by Spidermonkey that’s not necessary for a general GC). Most were tracing/rooting with mark-and-sweep collection. All of them used lints. Being rather busy, I didn’t really work on it past that, but planned to work on it if I could find someone to work with.
One day, Nika pinged me on IRC and asked me about GCs. Lots of people knew that I was interested in writing a GC for Rust, and one of them directed her to me when she expressed a similar interest.
So we started discussing GCs. We settled on a tracing mark-and-sweep GC. In other words, the GC runs regular “sweeps” where it first “traces” the usage of all objects and marks them and their children as used, and then sweeps up all unused objects.
This model on its own has a flaw. It doesn’t know about GC pointers held on the stack as local variables (“stack roots”). There are multiple methods for solving this. We’ve already seen one above in the Spidermonkey design – maintain two types of pointers (one for the stack, one for the heap), and try very hard using static analysis to ensure that they don’t cross over.
A common model (used by GCs like Boehm, called “conservative GCs”) is to do something called “stack scanning”. In such a system, the GC goes down the stack looking for things which may perhaps be GC pointers. Generally the GC allocates objects in known regions of the memory, so a GC pointer is any value on the stack which belongs to one of these regions.
Of course, this makes garbage collection rather inefficient, and will miss cases like
where the GCd pointer is accessible, but through a non-GC pointer.
We decided rather early on that we didn’t want a GC based on lints or stack scanning. Both are rather suboptimal solutions in my opinion, and very hard to make sound1. We were also hoping that Rust’s type system and ownership semantics could help us in designing a good, safe, API.
So, we needed a way to keep track of roots, and we needed a way to trace objects.
The latter part was easy. We wrote a compiler plugin (well, we stole Servo’s tracing plugin which
I’d written earlier) which autoderives an implementation of the
Trace trait on any
given struct or enum, using the same internal infrastructure that
#[derive(PartialEq)] and the
rest use. So, with just the following code, it’s easy to make a struct or enum gc-friendly:
1 2 3 4 5 6 7 8 9 10
foo of type
foo.trace(), will expand to a call of
bar.trace() will check which variant it is and call
trace() on the
if it’s a
Baz. For most structs this turns out to be a no-op and is often optimized away by
inlining, but if a struct contains a
Gc<T>, the special implementation of
“mark” the traceability of the
Gc<T>. Types without
Trace implemented cannot be used in types
Trace or in a
Gc, which is enforced with a
T: Trace bound on
So, we have a way of walking the fields of a given object and finding inner
Gc<T>s. Splendid. This
lets us write the mark&sweep phase easily: Take the list of known reachable
Gc<T>s, walk their
contents until you find more
Gc<T>s (marking all you find), and clean up any which aren’t
Of course, now we have to solve the problem of keeping track of the known reachable
the roots. This is a hard problem to solve without language support, and I hope that eventually we
might be able to get the language hooks necessary to solve it. LLVM has support for tracking
GCthings on the stack, and some day we may be able to leverage that in Rust.
As noted above, Spidermonkey’s solution was to have non-rooted (non-dereferencable) heap pointers, which can be explicitly converted to rooted pointers and then read.
We went the other way. All
Gc<T> pointers, when created, are considered “rooted”. The instance of
Gc<T> has a “rooted” bit set to true, and the underlying shared box (
GcBox, though this is not a
public interface) has its “root count” set to one.
Gc<T> is cloned, an identical
Gc<T> (with rooted bit set to true) is returned, and the
underlying root count is incremented. Cloning a
Gc does not perform a deep copy.
1 2 3 4 5 6
This is rather similar to how
Rc works, however there is no
root field, and the
is called a “reference counter”.
For regular local sharing, it is recommended to just use a borrowed reference to the inner variable (borrowing works fine with rust-gc!) since there is no cost to creating this reference.
When a GC thing is put inside another GC thing, the first thing no longer can remain a root. This is handled by “unrooting” the first GC thing:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Of course, we need a way to traverse the object passed to the
Gc<T>, in this case
Foo, and look
for any contained
Gc<T>s to unroot. Sound familiar? This needs the same mechanism that
needed! We add struct-walking
unroot() methods to the
Trace trait which are auto-
derived exactly the same way, and continue. (We don’t need
root() right now, but we will need it
Now, during collection, we can just traverse the list of
GcBoxs and use the ones with a nonzero
root count as roots for our mark traversal.
So far, so good. We have a pretty sound design for a GC that works … for immutable data.
Gc<T> is by default immutable. Rust abhors aliasable mutability, even in single
threaded contexts, and both these smart pointers allow aliasing.
Mutation poses a problem for our GC, beyond the regular problems of aliasable mutability: It’s possible to move rooted things into heap objects and vice versa:
1 2 3 4 5 6 7 8 9 10 11 12
Gc<T> doesn’t implement
DerefMut, none of this is possible — one cannot mutate the
inner data. This is one of the places where Rust’s ownership/mutability system works out awesomely
in our favor.
Of course, an immutable GC isn’t very useful. We can’t even create cycles in an immutable GC, so why would anyone need this in the first place2?
So of course, we needed to make it somehow mutable. People using
Rc<T> solve this problem by using
RefCell<T>, which maintains something similar to the borrow semantics at runtime and is internally
RefCell<T> itself can’t be used by us since it doesn’t guard against the problem
illustrated above (and hence won’t implement
Trace, but a similar cell type would work).
So we created
GcCell<T>. This behaves just like
RefCell<T>, except that it will
beginning a mutable borrow, and
unroot() before ending it (well, only if it itself is not rooted,
which is tracked by an internal field similar to
Gc<T>). Now, everything is safe:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
So now, mutation works too! We have a working garbage collector!
I believe this can be solved without lints, but it may require some upcoming features of Rust to be implemented first (like specialization).
In essence, destructors implemented on a value inside
Gc<T> can be unsafe. This will only happen
if they try to access values within a
Gc<T> — if they do, they may come across a box that
has already been collected, or they may lengthen the lifetime of a box scheduled to be collected.
The basic solution to this is to use “finalizers” instead of destructors. Finalizers, like in Java, are not guaranteed to run. However, we may need further drop hooks or trait specialization to make an airtight interface for this. I don’t have a concrete design for this yet, though.
Our model mostly just works in a concurrent situation (with thread safety tweaks, of course); in
fact it’s possible to make it so that the concurrent GC will not “stop the world” unless someone
tries to do a write to a
GcCell. We have an experimental concurrent GC in this pull
request. We still need to figure out how to make interop between both GCs safe, though we may
just end up making them such that an object using one GC cannot be fed to an object using the other.
So far we haven’t really focused on performance, and worked on ensuring safety. Our collection triggering algorithm, for example, was horribly inefficient, though we planned on improving it. The wonderful Huon fixed this, though.
Similarly, we haven’t yet optimized storage. We have some ideas which we may work on later. (If you want to help, contributions welcome!)
Currently, an object deriving
Trace should have
Traceable children. This isn’t always possible
when members from another crate (which does not depend on rust-gc) are involved. At the moment, we
#[unsafe_ignore_trace] annotation on fields which are of this type (which excludes it
from being traced – if that crate doesn’t transitively depend on rust-gc, its members cannot
contain GCthings anyway unless generics are involved). It should be possible to detect whether or
not this is safe, and/or autoderive
Trace using the opt-in builtin traits framework (needs
specialization to work), but at the moment we don’t do anything other than expose that annotation.
Stdlib support for a global
Trace trait that everyone derives would be awesome.
Designing a GC was a wonderful experience! I didn’t get to write much code (I was busy and Nika was able to implement most of it overnight because she’s totally awesome), but the long design discussions followed by trying to figure out holes in the GC design in every idle moment of the day were quite enjoyable. GCs are very hard to get right, but it’s very satisfying when you come up with a design that works! I’m also quite happy at how well Rust helped in making a safe interface.
I encourage everyone to try it out and/or find holes in our design. Contributions of all kind welcome, we’d especially love performance improvements and testcases.