On Garbage Collection

So, this article is trending on the top of Hacker News. And while I'm pretty ambivalent about it, I've decided to not comment on the parts I think are short-sighted. Instead, its given me motivation to finally write up my feelings on the issue. I've been a systems programmer for a long-time and at this point, and I'm quite sick of the Manual Memory Management vs Garbage Collection flamewars. Let's elevate the discourse.

Ubiquity

Here's the reality: we all use garbage-collection. And any sufficiently sophisticated system grows it's own form of garbage collection.

Don't believe me? Let's take a tour:

Filesystems garbage-collect dead-files / blocks to accelerate user response time
When a linux process terminates, it's process state is collected.
Dead-code elimination in compiler back-ends is literally the mark-and-sweep algorithm (example)
SSDs internally collect old blocks and migrate data out of band
Linux kernel RCU mechanism is a garbage-collector (as the article points out)
...etc etc etc...

But the most damning example of all is reference-counted pointers:

System	Name
Rust	`Rc<>` / `Arc<>`
C++	`shared_ptr<>`
C	`<hand-rolled>`

But, those aren't true garbage collection you may quibble?

Well, this paper argues that refcounting and garbage-collecting are two-sided to the same coin: A Unified Theory of Garbage Collection

The similarities are obvious to anyone who is willing to reflect for a moment. In a traditional garbage-collected programming language, it's quite easy to be sloppy about objects and not consider what owns what because the "garbage collector is magic". But, when one tries to write code in a language such as Rust or C++, they have to unlearn bad habits. Or do they? Can't they just slap an Arc<> or shared_ptr<> on everything and go back to "normal"? Sure they can. And it's called garbage collection.

Ref-counting is garbage collection.

But why?

It turns out we absolutely need the garbage-collection concept whenever an object has >1 independent owners. Each of these owners can operate independently without coordination and use the underlying object in any way they wish without restriction. Each may terminate at any time without coordination or consideration of any other owner. How do we ensure correctness? Garbage collection.

To eschew garbage collection is to assert that every object can be owned by Exactly One owner. This restriction lets one get quite far, but has its limits. If you're writing pure C code and hit such an issue, you immediately start introducing an adhoc refcnt'ing solution. And, BAM, your C program now has a garbage collector!

The Distraction

The standard discussion is a big distraction from the actually interesting issues. Saying "garbage collectors suck" or "just use a garbage collector" is poor engineering. Instead we should consider: "what tools do we need in or belts to do good engineering?" And in particular: "Where should we put the garbage collectors when we need them?"

To be or not to be

A sensible reason to avoid garbage collectors in 2024 is simply.. simplicity. A high-quality, enterprise-grade garbage-collector is a complicated and sophisticated system that will fail in complicated and surprising ways. If 99.5% of your objects are owned by exactly one owner, is it worthwhile to add the burden of understanding and tuning a general-purpose garbage collector? Consider also that manual memory management has never been easier with modern languages (e.g. Rust).

It's fair that many interesting systems-applications have been written in garbage-collected languages. I think Golang alone is proof of this assertion. In fact, golang does a clever thing to detect "exactly one owner" cases at compile-time using escape-analysis, so it can eliminate runtime garbage-collector work.

Another example is the NASDAQ (ISLAND) matching-engine being written in Java. Though, I'm not sure that one is the best example, as they do a lot of pre-allocation and object-pooling to avoid the collector.

Conversations I want to have

The conversations I actually want to have are about what a sensible opt-in garbage-collector could look like in a systems programming language. To date I only know one popular language that has made an attempt at doing a hybrid: The D Programming Language. They draw a distinction between class objects being garbage-collected and struct objects being manually-managed. But, if you use the metric "how many times has an idea been copied?", then this idea hasn't been particularly successful.

I would love to see more exploration in this area. What systems programmers tend to dislike is large complicated runtimes / frameworks that they can't control. Is it possible to have opt-in garbage-collection that doesn't need to control the entire runtime? What trade-offs does that require? Is it still usable or does it just drown in its own complexity? How far can one go in a pure "library" form-factor? How composable can it be with the rest of the language? Etc.. etc.. etc.

But until then, we'll all just keep rolling our own adhoc collectors whenever we're faced with yet another num_owners > 1 problem.