You use a hash, but you only save the object_id in the WeakHash. Every time you
need the hash, you get it back from the heap by _id2ref. You use a finalizer
to be notified when the hash is garbage-collected.
Sometimes this concept works, sometimes I get a RangeError. I log the
finalization, and my theory (without having read any reference material about
Ruby's garbage collection) is this: the hash is cleared from the heap, but
the finalizer is not immediately called. In this gap it's possible to get a
RangeError.
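To make the scheme concrete, here is a minimal sketch of the approach as described above. IdWeakHash and its remover helper are names invented for this illustration, not the classes discussed in this post, and the lookup is deliberately left unprotected so you can see where the RangeError would surface.

```ruby
# Hypothetical sketch of the id-based scheme described above.
# IdWeakHash and remover are illustrative names, not the author's code.
class IdWeakHash
  def initialize
    @ids = {} # key => object_id of the value; no strong reference is kept
  end

  def []=(key, value)
    @ids[key] = value.object_id
    # The finalizer proc is built in a class method so that it does not
    # capture `value` itself, which would keep the object alive forever.
    ObjectSpace.define_finalizer(value, self.class.remover(@ids, key))
  end

  def [](key)
    id = @ids[key]
    return nil if id.nil?
    # Vulnerable spot: if the object has been collected but its finalizer
    # has not run yet, the id is stale and this call raises RangeError.
    # (Recent Rubies may also print a deprecation warning for _id2ref.)
    ObjectSpace._id2ref(id)
  end

  def size
    @ids.size
  end

  # Drops the entry only if it still points at the finalized object.
  def self.remover(ids, key)
    proc { |id| ids.delete(key) if ids[key] == id }
  end
end
```

As long as the value is still referenced elsewhere, `table[:k]` hands back the very same object; the trouble only starts once the collector has reclaimed it and the finalizer is still pending.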
I took another look at gc.c, and Jens is right on.
You can find corrected versions of the SimpleWeakHash and WeakHash classes
(two weak hash tables with slightly different semantics) below, but I'll first
expand on why GC and object finalization can be fairly distant in time.
The GC
Ruby uses a very simple mark & sweep GC, which, as its name implies, works by
first marking all live*1 objects, and then reclaiming all those that
remained unmarked.
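If you have never seen the idea in code, the two phases can be sketched in a few lines of Ruby. This is a toy model only: Cell and ToyHeap are made-up names, and the real collector in gc.c works on the interpreter's own object slots rather than on Ruby structs.

```ruby
# Toy mark & sweep; Cell and ToyHeap are invented for this sketch.
Cell = Struct.new(:name, :refs, :marked)

class ToyHeap
  attr_reader :cells, :roots

  def initialize
    @cells = [] # every allocated cell
    @roots = [] # cells that are reachable by definition
  end

  def alloc(name)
    cell = Cell.new(name, [], false)
    @cells << cell
    cell
  end

  # Mark phase: flag everything reachable from the roots.
  def mark
    stack = @roots.dup
    until stack.empty?
      cell = stack.pop
      next if cell.marked
      cell.marked = true
      stack.concat(cell.refs)
    end
  end

  # Sweep phase: reclaim unmarked cells and reset the marks on survivors.
  # Returns the names of the reclaimed cells.
  def sweep
    dead, live = @cells.partition { |c| !c.marked }
    live.each { |c| c.marked = false }
    @cells = live
    dead.map(&:name)
  end

  def collect
    mark
    sweep
  end
end
```

Allocate three cells, make the first one a root that references the second, and `collect` reclaims only the third, since nothing reachable points at it.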
gc.c is actually one of the easiest core components; it's easier to follow
than eval.c, since it doesn't require remembering lots of things (node
types and undocumented conventions regarding the way they nest) and it's leaps
and bounds simpler than parse.y. So it only took me a few minutes to verify
Jens' intuition.
Finalizers
How do finalizers fit in the picture? Like the rest of gc.c, this is also very
simple. The interpreter just adds the objects being swept for which a
finalizer has been defined to a list of... well, objects whose finalizers need to
be called. (If you want to see where this happens, read gc_sweep and look for
uses of the deferred_final_list and final_list variables.) At some later
point, Ruby walks that list and executes the finalizers one after the other.
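The gap between sweeping and finalizing is easy to reproduce in a toy model. In the sketch below, ToyGC and all of its methods are invented for illustration; the assumption, following the description above, is that a swept object can no longer be looked up by id while its finalizer is still sitting in the list.

```ruby
# Toy model of deferred finalization; ToyGC is a made-up class, and the
# real logic lives in gc.c.
class ToyGC
  def initialize
    @heap = {}       # id => object
    @finalizers = {} # id => proc
    @deferred = []   # [id, proc] pairs whose finalizers still need to run
  end

  def alloc(id, obj, &finalizer)
    @heap[id] = obj
    @finalizers[id] = finalizer if finalizer
    id
  end

  # Stand-in for an id lookup: fails once the object has been swept.
  def id2ref(id)
    @heap.fetch(id) { raise RangeError, "#{id} is not a live object" }
  end

  # Sweep: the dead objects go away now; their finalizers are only queued.
  def sweep(dead_ids)
    dead_ids.each do |id|
      @heap.delete(id)
      finalizer = @finalizers.delete(id)
      @deferred << [id, finalizer] if finalizer
    end
  end

  # Called at some later point: run the queued finalizers in order.
  def run_finalizers
    until @deferred.empty?
      id, finalizer = @deferred.shift
      finalizer.call(id)
    end
  end
end
```

Any table that is cleaned up by such a finalizer still holds the stale id between `sweep` and `run_finalizers`, so calling `id2ref` in that window raises RangeError, which matches Jens' observation.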
This explains how finalizers are executed, but not when. The code could hardly
be more expressive: