[isabelle-dev] Safe approach to hypothesis substitution

Wed Aug 25 07:57:08 CEST 2010

It's interesting that my innocent thread on hypothesis substitution has been hijacked into a discussion about context-aware coding guidelines. I'm trying to steer clear of contexts - it would seem more elegant to me if there was a purely tactical solution to this problem.

To answer Andreas' question about blast_hyp_subst_tac:

I made some attempts, but it seemed that making blast_hyp_subst_tac share behaviour or even code with the other versions led to incompatibilities I didn't really understand. The patch I sent keeps blast_hyp_subst_tac strictly in compatibility mode with previous behaviour. I think you've outlined why there was a problem - blast is guessing incorrectly what the tactic will do. Perhaps in the future the Hypsubst ML structure could export its eq_var and triv_asm term inspection functions so blast would have a better idea what it would do (assuming blast can construct accurately what the subgoal passed would be).

To respond to Makarius' comment that "the real trouble only starts when trying the main
portion of existing applications, and also doing some performance measurements ...":

I've run the simplest performance measurement I can: rebuilding HOL and HOL-ex.

On my desktop machine at home:

Pre-patch: Finished HOL (0:11:32 elapsed time, 0:11:30 cpu time, factor 0.99)
With patch: Finished HOL (0:11:37 elapsed time, 0:11:33 cpu time, factor 0.99)

On my desktop machine at work:

Pre-patch:
Finished HOL (0:07:24 elapsed time, 0:07:22 cpu time, factor 0.99)
Finished HOL-ex (0:15:25 elapsed time, 0:15:21 cpu time, factor 0.99)
With patch:
Finished HOL (0:07:41 elapsed time, 0:07:41 cpu time, factor 1.00)
Finished HOL-ex (0:15:30 elapsed time, 0:15:23 cpu time, factor 0.99)

In both cases parallel features are off (ISABELLE_USEDIR_OPTIONS="-q 0 -M 1").

Obviously this is only the beginning of the story. As I mentioned before there are some obvious performance improvements that can be made to the tactic itself, which I avoided for the sake of clarity. Whether the additional assumptions are slowing down other tactics is hard to tell.

With regard to running a broader set of existing applications: I was hoping to get some agreement as to whether this is a desirable approach or not before investing much more time in repairing proof scripts. I've set the standard "isabelle make -k test" test running, which may produce interesting results but will more likely tell us that many proofs are broken in trivial ways.

Yours,
    Thomas.

The information in this e-mail may be confidential and subject to legal professional privilege and/or copyright. National ICT Australia Limited accepts no liability for any damage caused by this email or its attachments.