[isabelle-dev] [158c513a39f5] JVM crash

Manuel Eberl eberlm at in.tum.de
Wed Nov 8 09:21:29 CET 2017


I was not able to find out what exactly crashes and why, but while
trying to do that, I found out that the problem is actually not present
in the development version of Isabelle anymore.

After a lengthy bisection, I found that the first revision where no
crashes occur is this one:

changeset:   66920:aefaaef29c58
user:        wenzelm
date:        Thu Oct 26 13:44:41 2017 +0200
summary:     use Poly/ML 5.7.1 test version as default;

So apparently, something changed in Poly/ML 5.7.1 that made the crashes
go away. Older versions of Isabelle, i.e. Isabelle2016-1, Isabelle2017,
and isabelle-dev up to 66919:1f93e376aeb6 /do/ exhibit the crash. It
thus seems likely that whatever caused it was apparently present in
multiple older Poly/ML versions.

Note that the crash appears to be highly sensitive to the environment:
Having two builds run in parallel with different Isabelle versions seems
to make it considerably less frequent; however, only in Poly/ML 5.7.1
does it appear to not happen at all no matter how I run it.

Does Poly/ML 5.7.1 contain any changes that could plausibly cause this
bad behaviour to go away?

Manuel


On 2017-11-07 09:51, Manuel Eberl wrote:
> It seems like this thread is not dead yet.
>
> I had my CPU replaced by a new version that supposedly does not have the
> SMD problem on Linux. The problem with Isabelle did not go away.
>
> I still get reproducible crashes of "isabelle build -c Pure", but only
> with SMT switched on. However, it is worth noting that the crashes
> always seem to happen at the end of the build process. (A successful
> build takes about 9 seconds of elapsed and CPU time on my machine and
> unsuccessful ones always crash at at least that time) However, it is
> worth noting that some of the failed builds look like this:
>
>     0:00:30 elapsed time, 0:00:08 cpu time, factor 0.29
>
> Also, the last line that is printed is always
>
>     ### theory "ML_Bootstrap"
>
> so perhaps that also points to some specific problem. However, I don't
> think it's anything specific to Pure, since the crash is also
> reproducible with other sessions if I remember correctly (I just use
> Pure because it can be built quickly).
>
> To summarise:
> - crash happens randomly in about 6% of builds of the "Pure" session
> - crash seems specific to my Ryzen or at least my system
> - disabling SMT makes problem go away
> - crash reproducible on my Arch installation and a fresh install of
> Arch, but not on Ubuntu
> - RAM passes memtest without errors
> - I did not experience instability with any other software
> - My conjecture would be that it is somehow related to PolyML and
> multithreading
>
> I attached both the console output and the build log.
>
> Manuel
>
>
> On 04/09/17 15:12, Manuel Eberl wrote:
>> Alas, it would appear I have spoken too soon!
>>
>> I experienced a strange build failure with RC1 yesterday and, fearing
>> the worst, did my experiment from a few weeks ago again, with the
>> following result:
>>
>> – building "Pure" fails in around 6 % of the cases
>> – this does not change even after a cold reboot
>> – switching SMT off seems to make the problem go away entirely
>> – switching SMT on makes it reappear
>>
>> Sounds very much like this might well be the Ryzen bug. AMD has started
>> replacing affected CPUs, so I shall enquire about that and see what happens.
>>
>> Cheers,
>>
>> Manuel




More information about the isabelle-dev mailing list