Sometimes your code uses much more RAM than it should. You've made sure
that you are not keeping too many large objects around, you've dropped the
references to them and you are sure they were GCed. Yet ps still says your
process takes up too much memory. Memory fragmentation can be nasty at times,
but you can avoid it by allocating carefully.
Making sure our objects are being collected
Before you blame fragmentation, you've got to make sure the GC is doing its
job properly (and you're helping it do it). The obvious way is iterating over
the ObjectSpace and counting how many objects of each class there are.
I've seen that rewritten countless times :)
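For reference, a minimal sketch of that per-class census could look like the following. The helper name object_counts is made up for illustration, and calling GC.start first is an assumption: it makes it less likely that already-unreachable objects inflate the numbers, though a conservative GC may still leave a few behind.

```ruby
# Count live objects per class (hypothetical helper, not taken from FastRI).
def object_counts
  GC.start                      # assumption: collect garbage before counting
  counts = Hash.new(0)
  # Restricting to Object skips BasicObject instances, which lack #class.
  ObjectSpace.each_object(Object) { |obj| counts[obj.class] += 1 }
  counts
end

# Print the ten most populous classes.
object_counts.sort_by { |_klass, n| -n }.first(10).each do |klass, n|
  puts "%-20s %7d" % [klass, n]
end
```

Running it twice, before and after the suspicious part of your program, and comparing the two tables is usually enough to spot a class whose population keeps growing.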
Estimates of the amount of memory used for each kind of object are a bit less
common. I don't mean, say, somestring.size, but something a bit more accurate
that takes into account the overhead introduced by Ruby's internal structures.
This is taken from FastRI; I was mostly concerned about Arrays there, but it
can easily be completed with the information from the page referenced in the
code comment:
    if $DEBUG
      # trying to see where our memory is going
      population = Hash.new{|h,k| h[k] = [0,0]}
      array_sizes = Hash.new{|h,k| h[k] = 0}
      ObjectSpace.each_object do |object|
        # rough estimates, see http://eigenclass.org/hiki.rb?ruby+space+overhead
        size = case object
               when Array
                 array_sizes[object.size / 10] += 1
                 case object.size
                 when 0..16
                   20 + 64
                 else
                   20 + 4 * object.size * 1.5
                 end
               when Hash;   40 + 4 * [object.size / 5, 11].max + 16 * object.size
               when String; 30 + object.size
               else 120 # the iv_tbl, etc
               end
        count, tsize = population[object.class]
        population[object.class] = [count + 1, tsize + size]
      end

      population.sort_by{|k,(c,s)| s}.reverse[0..10].each do |klass, (count, bytes)|
        puts "%-20s %7d %9d" % [klass, count, bytes]
      end

      puts "Array sizes:"
      array_sizes.sort.each{|k,v| puts "%5d %6d" % [k * 10, v]}
    end
This will tell you roughly how much memory your arrays/hashes/strings/other
objects are taking. It could be modified to proceed recursively, so the space
needed for objects held in instance variables is attributed to the parent
object, but that breaks down when an object is referenced from several places:
you either count it only once, possibly under the wrong parent, or count it
repeatedly.
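To make that trade-off concrete, here is a rough sketch of the "count it only once" variant. None of this is FastRI code: shallow_size, children and deep_size are hypothetical helpers, the byte figures are the same ballpark estimates as above, and treating immediates (nil, true, false, Integers, Symbols, Floats) as free is a simplifying assumption.

```ruby
# Rough per-object estimate; the numbers are ballpark figures, not measurements.
def shallow_size(obj)
  case obj
  when nil, true, false, Integer, Symbol, Float then 0   # assumed to cost nothing
  when Array  then obj.size <= 16 ? 20 + 64 : 20 + 4 * obj.size * 1.5
  when Hash   then 40 + 4 * [obj.size / 5, 11].max + 16 * obj.size
  when String then 30 + obj.size
  else 120
  end
end

# Objects directly referenced by obj: instance variables plus container contents.
def children(obj)
  kids = obj.instance_variables.map { |iv| obj.instance_variable_get(iv) }
  case obj
  when Array then kids.concat(obj)
  when Hash  then obj.each { |k, v| kids << k << v }
  end
  kids
end

# Deep estimate. `seen` tracks object identity, so a shared object is charged
# only once, to whichever parent happens to reach it first.
def deep_size(root, seen = {}.compare_by_identity)
  return 0 if seen.key?(root)
  seen[root] = true
  shallow_size(root) + children(root).sum { |child| deep_size(child, seen) }
end

shared = "x" * 100
a = [shared]
b = [shared]
seen = {}.compare_by_identity
puts deep_size(a, seen)   # 214: the array (84) plus the shared string (130)
puts deep_size(b, seen)   # 84: the string was already charged to a
```

Both arrays hold the same string, yet only a is billed for it, which is exactly the "possibly under the wrong parent" problem. Dropping the shared seen table gives the other failure mode: the string is counted twice.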
Fighting memory fragmentation
Here's a simplification of the indexing loop in the first version of the
FTSearch full-text search engine (of course, it didn't quite look like this;
the work was divided more cleanly among the fulltext store, the document map
and the suffix array writer):