[isabelle-dev] isabelle build

David Matthews dm at prolingua.co.uk
Wed Aug 8 10:54:36 CEST 2012


The garbage collector now includes a pass that looks for cells that have 
the same contents and merges them using the principle that equality in 
ML cannot distinguish between two pointers to the same cell or two cells 
with the same contents.  There has been a PolyML.shareCommonData 
function for a long time that does this but that has always had to be 
called explicitly by the user.  It has also required quite a lot of 
extra memory.  Currently, I understand Isabelle uses it just before it 
saves the state to reduce the sizes of saved state files.

Some time ago Makarius wondered if it was possible to have this invoked 
as part of the GC.  I couldn't see how to do this especially without 
adding any extra memory overhead but gradually I've been improving it. 
The heap sizing code now fires off this pass when memory is particularly 
short since it costs several times the cost of a normal GC.  The latest 
improvement was to find a way to merge deep structures (lists and trees) 
something that PolyML.shareCommonData has always done but at the cost of 
the extra memory.  This made a dramatic difference to JinjaThreads.  It 
appears that it creates many equal data structures and merging these can 
reduce the live data size by some 80%.

Actually, one has to be careful when interpreting the parallelism 
figures.  Much of the time an ML application is memory-bound which means 
that using more processors can make an improvement in overall run-time 
but the overall CPU usage will grow.  As far as I can tell, the time 
that a processor spends waiting for memory counts as its CPU time unlike 
when it is waiting for IO.

David

On 08/08/2012 09:00, Lawrence Paulson wrote:
> I understand about the parallelism, but what has cut back on the memory consumption?
> Larry



More information about the isabelle-dev mailing list