[isabelle-dev] (Re-)introducing set as a type constructor rather than as mere abbreviation

Sat Aug 20 01:18:46 CEST 2011

Hi again,

> Indeed, I think a general point can be made here, and not just about code generation. In the last couple of years, we've run into situations in Quickcheck, Nitpick, the code generator, and perhaps more, where we felt the need for sets as different from functions. However, there is no guarantee that all the people who talked about "sets vs. functions" meant, or needed, the same thing!

> I want to emphasize that the limitation of the code generator mentioned here not only applies to sets-as-predicates but also to maps-as-functions and other abstract types that are often specified in terms of functions (finite maps, almost constant maps, etc.). Thus, having good old 'a set back is does not fully solve this problem as a whole, just one (important) instance of it.
> My view on this whole topic is outlined in the report I recently sent around (http://www4.in.tum.de/~krauss/papers/refinement.pdf, not updated since last time). In the long run, I would prefer to see flexible transport machinery to move stuff between isomorphic types. 

It is important to note that the motivation to have set back again as a
type constructor is not code generation itself, but fundamental flaws in
the current situation.  Indeed I fully agree that the driving force
behind the development should always be specification and proof, with
derived application at a secondary level.  Also in our case, code
generation is a side effect, although many technical points of this
discussion here are related to code generation, I guess mainly since a
couple of people involved in this story actually want to see progress here.

What is a different – and in itself valuable – thing is the matter of
»transport« mentioned by Alex, something which also I am eager to see
one day.

But now back to sets.  In answer to some statements made, I have
collected evidence why the current situation indeed is problematic:

> Like any definition of a primitive operator, mem_def is not really intended
> to be unfolded by the user, so don't be surprised if your proof is a mess
> after unfolding this definition. You wouldn't unfold the definitions of
> logical operators like "conj" or "disj" either, except for proving
> introduction- or elimination rules for these operators.

> Can this claim be made more concrete (or even quantitative)? Or is it merely a conjecture?
> From what Jasmin wrote, this does not seem to hold for sledgehammer, and simp/auto/blast should not run into the "wrong world" when configured properly.
> I understand that this is true for naive users... but automated tools??? 

> So your point is actually a different one from Florian's: Users unfold mem_def in order to gain automation, which may work in that moment, but is usually a bad idea later on. I understand this point.
> (Aside: It would be interesting to explore why users actually do this. Is there something missing in the automation for sets that works better for predicates?)
> My understanding of Florian's point above was that sets-as-predicates actually hinder automation by enlarging the search space. But it seems to me that this is only a theoretical consideration, and that we do not actually observe this effect in practice.

A typical example for what I was alluding to can be found at:

http://www4.in.tum.de/~haftmann/cgi-bin/hgweb.cgi/isabelle_set/rev/6e3e9c6c5062#l1.37

Observe the following proof:

lemma part_equivpI:
  "(\<exists>x. R x x) \<Longrightarrow> symp R \<Longrightarrow> transp
R \<Longrightarrow> part_equivp R"
  apply (simp add: part_equivp_def)
  -- {* (I) "sane" proof state *}
  apply auto
  -- {* (II) "insane" proof state *}
  apply (auto elim: sympE transpE)
  -- {* (III) does not even terminate; adding "simp add: mem_def"
succeeds! *}

The second »auto« imposes a »\<in>« on predicates, although mem_def,
Collect_def or similar do not appear in the proof text (push to the
»wrong world«).  Worse, the final auto does not even terminate, this is
what I had in mind when referring to an enlarged search space.  Here,
the way the system is build forces me to use »simp add: mem_def«.

(Well, joking you could also argue that there are three thing which
LISP, Java and Isabelle have in common: strange syntax, not writable
without special tool support, choking as soon as higher-order things
come in.)

>> * The logical identification of sets and predicates prevents some
>> > reasonable simp rules on predicates: e.g. you cannot have (A |inf| B) x
>> > = A x & B x
>> > because then expressions like "set xs |inter| set ys" behave strangely
>> > if the are eta-expanded (e.g. due to induction).
> This sounds more like a problem with the underlying infrastructure that should
> be fixed, rather than working around the problem by introducing new type constructors.
> Can you give an example of a proof by induction, where eager eta expansion leads
> to an unnecessarily complicated proof?

theory Scratch
imports Main "~~/src/HOL/Library/Lattice_Syntax"
begin

declare List.set_append [simp del]

thm sup_apply
declare sup_apply [simp]

lemma set_append: "set (xs @ ys) = (set xs \<union> set ys)"
  apply (induct xs)
  apply simp_all
  apply auto -- {* non-termination! *}

The declaration of »sup_apply« as »[simp]« smashes, seemingly natural, a
proof of a fundamental property on sets.  The bitter irony is that what
once was envisaged as a follow-up to the abolishment of the set type
constructor, the unifying of set and lattice operations, has been
complicated by the abolishment rather than facilitated.  Also observe
the annoying non-termination of »auto« again.

> We use map functions to lift coercions between base types to coercions between type constructors. E.g. "nat list" is subtype of "int list" with the coercion "map int", provided that "int" is declared as a coercion and "map" as map function for "'a list". This is a typical example of a covariant map function (i.e. the lifting does not invert the direction of the subtype relation).
> On the other hand, the generic map function for "'a => 'b" ("% f g h x. g (h (f x)) :: ('a => 'b) => ('c => 'd) => ('b => 'c) => ('a => 'd)") is contravariant in the first argument. Concretely that means that "int => bool" is a subtype of "nat => bool", provided the above map function and the coercion "int". In contrast, the corresponding map function for sets as predicates ("image :: ('a => 'b) => ('a => bool) => ('b => bool)") is, as one would expect, covariant in the first argument.
> The current implementation of subtyping allows to declare only one map function for a type constructor. Thus, we can have either the contravariant or the covariant map for the function type, but not both at the same time. The introduction of set as an own datatype would resolve this issue. 

This is an interesting oberservation which also applies to the
instantiation of sets for type classes: in 'a => bool, you have to think
about 'a in a contravariant fashion, whereas in 'a set it is covariant.

Rgds.,
	Florian

-- 

Home:
http://www.in.tum.de/~haftmann

PGP available:
http://home.informatik.tu-muenchen.de/haftmann/pgp/florian_haftmann_at_informatik_tu_muenchen_de

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 262 bytes
Desc: OpenPGP digital signature
URL: <https://mailman46.in.tum.de/pipermail/isabelle-dev/attachments/20110820/382cdd9d/attachment.sig>