I came across the paper Ownership You Can Count On (by Adam Dingle and David F. Bacon, seemingly written in 2007) some years ago and it stuck with me as being an interesting variant on traditional reference counting. Since then I've come across references to it multiple times in the programming language design and implementation community and thought it might be worth jotting down some notes on the paper itself and the various attempts to adopt its ideas.
The basic idea is very straight-forward. Introduce the idea of an owning
pointer type (not dissimilar to std::unique_ptr
in C++) where each object
may have only a single owning pointer and the object is freed when that
pointer goes out of scope. In addition to that allow an arbitrary number of
non-owning pointers to the object, but require that all non-owning pointers
have been destroyed by the time the object is freed. This requirement
prevents use-after-free and is implemented using run-time checks.
A reference count is incremented or decremented whenever a non-owning pointer
is created or destroyed (referred to as alias counting in the paper). That
count is checked when the object is freed and if it still has non-owning
references (i.e. the count is non-zero), then the program exits with an error.
Contrast this with a conventional reference counting system, where objects are freed when their refcount reaches zero. In the alias counting scheme, the refcount stored on the object is simply decremented when a non-owning reference goes out of scope, and this refcount needs to be compared to 0 only upon the owning pointer going out of scope rather than upon every decrement (as an object is never freed as a result of the non-owning pointer count reaching zero). Additionally, refcount manipulation is never needed when passing around the owning pointer. The paper also describes an analysis that allows many refcount operations to be elided in regions where there aren't modifications to owned pointers of the owned object's subclass/class/superclass (which guarantees the pointed-to object can't be destructed in this region). The paper also claims support for destroying data structures with pointer cycles that can't be automatically destroyed with traditional reference counting. The authors suggest for cases where you might otherwise reach for multiple ownership, (e.g. an arbitrary graph) to allocate an array of owning pointers to hold your nodes, then use non-owning pointers between them.
The paper describes a C# derived language called Gel which only requires two
additional syntactic constructs to support the alias counting model: owned
pointers indicated by a ^
(e.g. Foo ^f = new Foo();
and a take
operator
to take ownership of a value from an owning field. Non-owning pointers are
written just as Foo f
. They also achieve a rather neat erasure
property, whereby if you take a Gel program and remove all ^
and take
you'll have a C# program that is valid as long as the original Gel program was
valid.
That all sounds pretty great, right? Does this mean we can have it all: full memory safety, deterministic destruction, low runtime and annotation overhead? As you'd expect, there are some challenges:
So in summary, an interesting idea that is meaningfully different to traditional reference counting, but the largest program written using this scheme is the Gel compiler itself and many of the obvious questions require larger scale usage to judge the practicality of the scheme.
Ownership You Can Count On was written around 2007 and as far as I can tell never published in the proceedings of a conference or workshop, or in a journal. Flicking through the Gel language repository and applying some world-class logical deduction based on the directory name holding a draft version of the paper leads to me suspect it was submitted to PLDI though. Surprisingly it has no academic citations, despite being shared publicly on David F. Bacon's site (and Bacon has a range of widely cited papers related to reference counting / garbage collection). Yet, the work has been used as the basis for memory management in one language (Inko) and was seriously evaluated (partially implemented?) for both Nim and Vale.
Inko started out with a garbage collector, but its creator Yorick Peterse announced in 2021 a plan to adopt the scheme from Ownership You Can Count on, who then left his job to work on Inko full-time and successfully transitioned to the new memory management scheme in the Inko 0.10.0 release about a year later. Inko as a language is more ambitious than Gel - featuring parametric polymorphism, lightweight processes for concurrency, and more. Yet it's still early days and it's not yet, for instance, good for drawing conclusions about performance on modern systems as optimisations aren't currently applied. Dusty Phillips wrote a blog post earlier this year explaining Inko's memory management through some example data structures, which also includes some thoughts on the usability of the system and some drawbacks. Some of the issues may be more of a result of the language being young, e.g. the author notes it took a lot of trial and error to figure out some of the described techniques (perhaps this will be easier once common patterns are better documented and potentially supported by library functions or syntax?), or that debuggability is poor when the program exits with a dangling reference error.
Nim was at one point going to move to Gel's scheme (see the blog post and (RFC). I haven't been able to find a detailed explanation for the reasons why it was rejected, though Andreas Rumpf (Nim language creator) commented on a Dlang forum discussion thread about the paper that "Nim tried to use this in production before moving to ORC and I'm not looking back, 'ownership you can count on' was actually quite a pain to work with...". Nim has since adopted a more conventional non-atomic reference counted scheme (ARC), with an optional cycle collector (ORC).
Vale was previously adopting the Gel / Ownership You Can Count On scheme (calling the unowned references "constraint references"), but has since changed path slightly and now uses "generational references" Rather than a reference count, each allocation includes a generation counter which is incremented each time it is reused. Fat pointers that include the expected value of the generation counter are used, and checked before dereferencing. If they don't match, that indicates the memory location was since reallocated and the program will fault. This also puts restrictions on the allocator's ability to reuse freed allocations without compromising safety. Vale's memory management scheme retains similarities to Gel: the language is still based around single ownership and this is exploited to elide checks on owning references/pointers.
Some tangentially related things that didn't make it into the main body of text above:
And, that's about it. If you're hoping for a definitive answer on whether alias counting is a winning idea or not I'm sorry to disappoint, but I've at least succeeded in collecting together the various places I've seen it explored and am looking forward to seeing how Inko's adoption of it evolves. I'd be very interested to hear any experience reports of adopting alias counting or using a language like Inko that tries to use it.