[isabelle-dev] Using PolyML's memory consumption profiling on Isabelle
Makarius
makarius at sketis.net
Fri Oct 14 13:37:07 CEST 2011
On Fri, 14 Oct 2011, Andreas Schropp wrote:
>> Isabelle is generating a *lot* of copies of types& terms, particularly
>> via Term.map_atyps. Since shareCommonData eliminates them, many are
>> duplicates. It's possible that further use of the "same" variants from
>> Term_Subst might help. It's also possible that the repeated
>> reconstruction is necessary (e.g. repeated generalization/abstraction
>> of type variables) and further use of the new Term_Sharing mechanism
>> might be the answer.
>>
>
> The way I learned Isabelle programming one views term-traversal as being
> cheap, which seems to be true most of the time especially when they are
> freshly allocated with nice locality properties. Sharing lots of
> subterms might interfere with this. Isn't this what GC was made for? Why
> introduce artificial sharing?
There is a spectrum of things that can be done, and the various tendencies
in optimization are often in conflict. Andreas is right about his
intuition how operations in Isabelle are usually done. GC works pretty
well and with an increasing tendency of multiple cores and distributed
memory excessive sharing could become counter-productive.
We used to have global sharing of term structure for results for the sake
of SML/NJ, which lacks the convenience of share_common_data of Poly/ML.
When the parallelization of Poly/ML and Isabelle/ML started around 2007, I
removed that global resource for performance reasons. As a rule of thumb
it is better to let parallel threads run freely and independently, even if
they allocate redundant data -- in different regions of the heap.
I reintroduced some bits of Term_Sharing recently for different reasons,
to ensure that terms entering the system by the new protocol layers are
"interned" in the way one normally expects. (The concrete syntax layer
did this implicitly due to lookup of formaly entities.)
The details of parallel / distributed memory management are a black art,
at all levels (hardware, runtime-system, application code). David
Matthews is the one who understands that best :-)
> BTW: the Coq kernel does huge amounts of sharing IIRC. Should we be
> concerned or is this just a thing because of proof terms?
I can't say what Coq does exactly -- the general tendency there (and
OCaml) seems to target hardware from the mid 1990-ies. They also have
serious problems catching up with the multicore trend, but I am actually
collaborating with some Coq experts on the parallel prover theme so some
improvements in the general understanding can be anticipated.
> Makarius, please comment on this, because now I feel like a wasteful
> programmer. ;D
I see no particular problems here. Just continue with a purely functional
mindset, with a bit of a sense for the general workings of the
runtime-system and underlying hardware.
Makarius
More information about the isabelle-dev
mailing list