This is a series of blog posts where we talk about garbage collection: why do it, what is it, and how to implement a simple one.
Last time I've discussed about reference counting and escape analysis as simple forms of automatic memory management. The two forms of management are suitable for simple acyclic data structures, or cyclic data structures where one can determine the parent–child relation.
For complex graphs with cyclic references, it's really hard to use reference counting without encountering memory leaks, this is where garbage collection comes in.
Where as in a reference counted model where we track all ownership using a reference count, in garbage collected model, we don't track ownership at all, at least not statically. We can not determine if an object is being owned without runtime information.
While the garbage collector does not care about who owns what, it does care about which unused objects it can deallocate. How does it determine this?
We first give it a set of known roots, objects that we are certain are reachable, we are currently using it, and we don't want it to be deallocated. The roots might be objects referenced by local variables (on the stack) or by global variables.
Next, we wait. Until the heap is x percent full, we can proceed to allocate memory as usual. Only when it reaches that threshold do we start doing garbage collection.
We reached that point when the heap is at its limit, the garbage collection algorithm triggers.
Let's see how it performs on this hypothetical graph of
nodes. We will represent known reachable nodes with
o
and unreachable nodes with x
.
Nodes are connected with a -
. 2 nodes
connected with a =
means they have a cyclic
reference.
x
/ \
x x
/ \ /
x x
x=x
x - x
The garbage collection algorithm triggers, and a GC cycle starts. In every GC cycle we do:
x
/ \
x x
/ \ /
x x
x=x
x - x
o
/ \
x x
/ \ /
x x
x=x
x - x
o
/ \
o o
/ \ /
o o
x=x
x - x
o
/ \
o o
/ \ /
o o
What you've seen there is a precise mark and sweep garbage collector in action. Precise, meaning we know the exact location of its child. In contrast to this is a conservative style garbage collector, where it has to literally scan the memory of the nodes in order to find its children.
This and derivative garbage collectors stop the world, you can not do anything else when a GC cycle occurs, or else memory corruption would happen.
Although garbage collection, implemented correctly, takes out the pain of managing object relationships, each GC cycle is costly, leading to infamous garbage collection pauses found in many interpreted languages using this paradigm of management. Not that suitable for high latency applications.
One solution we could do to lower GC pause times, is to add another state to a GC node:
x
o
?
: we
know that the node is reachable, however we don't know
whether its children are or notFor the garbage collector to be stable and not free out memory that is in use, we'll have to uphold the following invariants:
o
must only contain either: reachable nodes
o
or intermediate nodes ?
?
must only contain unreachable nodes
x
We must maintain this invariance because the garbage collector is incremental, a GC cycle doesn't fully walk through all of the nodes.
The algorithm looks like this:
x
/ \
x x
/ \ /
x x
x=x
x - x
o
/ \
? ?
/ \ /
x x
x=x
x - x
o
/ \
o o
/ \ /
? ?
x=x
x - x
o
/ \
o o
/ \ /
o o
x=x
x - x
o
/ \
o o
/ \ /
o o
What if you want to allocate in the middle of the marking phase? You can allocate nodes as normally, however you must maintain the invariance by either:
In either case, the invariance is namintained and the GC can continue its business.
Many high performance language interpreters like v8, the JVM,... provide much better garbage collectors than one discussed here.
There are concurrent garbage collectors which performs each GC cycle on a separate thread, only stopping the world when the roots are scanned.
There are parallel garbage collectors which splits their marking/sweeping workloads across multiple cores.
There are even collectors that physically move objects, defragmenting allocated memory so as to increase memory usage efficiency.
You should check out luajit's wiki and v8's development blog to see how those developers implement much more complex GC algorithms than discussed here.
Next time, I'll show you how to implement a simple mark and sweep garbage collector in C++. See you then.