types: simple

logic: higher-order

proof: machine-checked

software: formally verified

The dos and don’ts of Isabelle proofs: guidelines for making your proofs more readable, maintainable and enjoyable.

With examples and rationale.

This is Part 1 of my personal style guide for Isabelle/HOL. Part 1 is about content and how to keep proofs maintainable. More requirements than style, really. Part 2 is about good style and advanced rules, and part 3 will be about layout and looks.

Most of these guidelines encode what is followed in the Isabelle distribution. Such rules used to be less enforced in the Archive Of Formal Proofs, but will be more so in the future to keep it maintainable over time. These guidelines are also used to various degrees in some of the large proofs of the L4.verified project, at least on good days. You can observe there how messy it can get when they are not.

This first set of rules, you should always follow. At least, if you ever intend to show your proofs to somebody else. Obviously, they are Ok to break in interactive use and proof search, but proofs that break these rules would not be accepted at the Archive Of Formal Proofs.

**Do not use**`sorry`

Should go without saying that your proofs should be finished before you call them

*proofs*, but it’s worth checking if you forgot one somewhere.You should not leave old experiments sitting around as

`sorry`

lemmas. Use`oops`

to make sure they are not used accidentally, or make them explicit assumption of a locale if you need them, but do not intend to prove them.**Do not use**`back`

This command single-steps Isabelle’s backtracking mechanism. It is occasionally nice for debugging low-level rule applications, but it can be a real headache for maintenance, because backtracking occurs over many different places in Isabelle and its internal order can change. When it does change, it can be very hard to figure out what the proof script was trying to achieve. It should always be eliminated in final proofs, either by using implicit backtracking or by using explicit instantiations.

Bad:

apply (erule allE) back back apply (erule impE) back apply assumption apply clarsimp ...

Good:

apply (erule allE, erule (1) impE) apply clarsimp ...

Also Ok, but neater if you don’t need to instantiate:

apply (erule allE [where x="37"]) apply (erule (1) impE) apply clarsimp ...

**Indent**`apply`

scripts by subgoal.If an apply command is applied to a state with n+1 subgoals, it must be indented by n spaces relative to the first

`apply`

in the sequence (see the Isabelle distribution for many examples). This indicates textually which line solved a subgoal and how many new subgoals where generated. Invaluable information when something goes wrong because something changed somewhere. Also, it gives a nice visual indication of the nesting depth of your proof.The usual indentation for 0 subgoals is 2 spaces for a top-level proof.

There is an experimental bean shell macro available for jEdit that will indent apply scripts by subgoal.

Bad:

apply (erule impE) apply (clarsimp split: split_if_asm) apply assumption apply fastforce apply simp apply clarsimp apply fastforce ...

Good:

apply (erule impE) apply (clarsimp split: split_if_asm) apply assumption apply fastforce apply simp apply clarsimp apply fastforce ...

In the bad version, it is impossible to tell which of the

`simp`

and`clarsimp`

tactics were supposed to solve a goal, and how many new subgoals the`(clarsimp split: split_if_asm)`

produced. In the good version, it is easy to see.From Isabelle2015 it will be easy to indicate that a tactic should solve its goal. This is not mandatory, but often a good idea. It will be covered in Part 2 of the style guide, but it does not absolve you from indentation.

**Never give an unnamed lemma a global attribute for automated methods**Bad:

lemma [simp]: "(A ∧ B) = (B ∧ A)"

It is much too hard to remove such lemmas from the (in this case) simp set again. Instead, give the lemma a name.

This second set of rules will make your life easier down the track, in maintenance and readability. There can be good reasons for exceptions, but you should know what you are doing when you are breaking them. The AFP makes some of these mandatory.

**Instantiations must not use Isabelle-generated names such as**`xa`

.These names change as Isabelle releases go by (in Isabelle2015, for instance, names generated by the datatype command changed drastically). When that has happened, trying to figure out what

`ba17`

meant can drive the strongest mind insane. Use Isar or`rename_tac`

to avoid such names.Bad:

apply (cases n) apply clarsimp apply (rule_tac x="ba17" in exI)

Good:

apply (cases n) apply (rename_tac x y) apply clarsimp apply (rule_tac x="x" in exI)

`rename_tac`

works from right to left, so if the goal shows the parameters`⋀a b xa.`

,`apply (rename_tac x y)`

will change them to`⋀a x y.`

. The idea is that new names are introduced on the right and that it is more likely that you will want to rename something new.Better:

proof (cases n) fix x y assume "P x y" have "Q x" by clarsimp thus ?thesis .. qed

**Do not use**`axiomatization`

This command can introduce inconsistency into the logic. While you may not knowingly make use of any introduced inconsistency, automated tools may do so on your behalf. Unless you are bootstrapping a new logic in Isabelle or are extending the basic axioms of Higher-Order Logic, you should use safer commands such as

`locale`

instead, or use explicit assumptions.If the

`axiomatization`

command is not used to state any properties, but to merely introduce the existence of a constant, it is safe to use.Bad:

axiomatization P :: "'a ⇒ bool" where all_true: "∀x. P x" and all_false: "∀x. ¬P x"

Ok:

axiomatization P :: "'a ⇒ bool"

The second example only states that a

`P`

of type`'a => bool`

exists, which is safe in HOL, because all types are inhabited.**Do not use**`smt`

in proofs you need to maintain over longer time.The outcome of the

`smt`

command depends on tools external to Isabelle, so it can be hard to predict if they will prove the same things in the future or if they will even still be available in an Isabelle-compatible form in a number of years.Try to replace these proofs by other methods. This is mostly important if you need your proofs to still work in a few years or if want to submit to the AFP.

It’s worth trying to just run

`metis`

instead of`smt`

, or to manually re-run`sledgehammer`

. From Isabelle2015 onward,`sledgehammer`

tries harder to give you non-`smt`

proofs back (this does not seem to work as well yet for`autosledgehammer`

mode).**If there are proof steps that take significant time, i.e. longer than roughly 1 min, add a short comment to that step.**If you maintain these proofs in the future you will know what to expect instead of wondering whether something broke. Proof maintainers are impatient people.

Good:

by (auto simp: some_rule split: split_if_asm) (* takes long *)

In Isar proofs, it is better to use

`by`

for long-running terminal proof methods, because`by`

is run asynchronously (so you can continue working), whereas`apply`

is not.**Use unrestricted**`auto`

only as a terminal proof method`auto`

works on all subgoals simultaneously and when its internal composition changes, it is very hard to tell what the resulting proof state used to be. It’s perfectly fine if the resulting proof state is documented, for instance in a structured proof, or when`auto`

solves its goals completely, e.g. as terminal proof method.Bad:

apply (auto simp: some_rules) apply (clarsimp simp: x) apply (rule_tac x="xa" in exI, simp) ...

Good:

apply (auto simp: some_rules x) done

Also Good:

apply ... apply (auto simp: some_rules)[2] apply ... done

In the example above,

`auto`

is restricted to 2 goals, and solves them.In general, most proof methods in Isabelle work on a single goal, and don’t have that potentially globally devastating effect of

`auto`

. Consider replacing an initial`auto`

with a more predictable`clarsimp`

or similar tactics. Then potentially use`auto`

in goal-restricted form, e.g.`apply (auto: ..)[n]`

to solve subgoals that the unconstrained initial`auto`

would have solved, but`clarsimp`

didn’t.Larry Paulson points out that the initial purpose of

`auto`

was to be used at the start of proofs to solve the easy parts of the problem and leave over the difficult ones. He also has a solution:*use*`auto`

to start the proof, but regard the resulting subgoals as separate lemmas that need to be proved first. Combining these proved lemmas with “auto” should solve the problem outright, and you have created a structured proof.**Do not switch from**`apply`

into`proof`

in the middle of a proofIn proofs such as the following

Bad:

apply clarsimp apply (rule my_rule) apply (fastforce simp: foo) proof safe fix new assume "something surpising" show "unforeseen" ...

it is very hard for the reader to tell what is going on without running Isabelle, and the proof will break if the output of the initial apply script changes even slightly.

It is easier on the eye to use

`apply`

inside a structured proof, i.e. to turn the proof around.(Somewhat) better:

proof - { fix new assume "something surprising" have "unforeseen" ... } note 1 = this show ?thesis apply clarsimp apply (rule my_rule) apply (fastforce simp: foo) apply safe apply (rule 1) done

This at least has some chance against small changes in the outcome of the apply script, and you may be able to compress the final apply sequence into a single method invocation such as

`by (auto simp: foo intro: my_rule 1)`

.Even better would be a fully structured proof that breaks up the apply part into readable fragments, or at least those initial parts of the proof.

**Keep initial**`proof`

methods simple in structured proofsBad:

proof (auto simp: complex_rule intro!: more_complex_rule) fix x y assume "Q x" show "P x y" ...

This makes it hard for the reader to figure out how the rest of the proof connects. There can be exceptions to this rule, especially when the outcome of the complex method is simple.

Instead, keep the initial proof method simple. Do less there, and shift work into the

`show ?thesis`

part of the proof.Better:

proof - { fix x y assume "Q x" have "P x y" } thus ?thesis by (auto simp: complex_rule intro!: more_complex_rule) qed

Or potentially possible:

proof (rule more_complex_rule) fix x y assume "Q' x" have "P' x y" thus ?thesis by (auto simp: complex_rule) qed

**Do not use the implicit**`apply rule`

Bad

apply rule apply rule apply rule apply clarsimp

When you’re writing it, the rule Isabelle picks will make perfect sense to you, and when that rule changes in two years, you will have no idea which one it was and what it was called.

Good

apply (rule conjI) apply (rule ballI) apply (rule exI) apply clarsimp

Ok to use in an Isar proof block where the rest of the text statically makes clear what is going on. For example:

Good

lemma "A ∧ B" proof show "A" ... next show "B" ....

The

`proof`

without arguments in this example is equivalent to`proof rule`

, which does the same thing as`apply rule`

.

This concludes part 1 of my style guide for Isabelle/HOL.

If you have feedback, strong opinions, or just liked what you read, get in touch or if you think it is of wider interest, even on the mailing list (subscribe here). I might even update this style guide based on feedback.

Read on for Part 2 with more advanced material.

[24 May 2015: updated with feedback from Daniel Matichuk and Lars Noschinski]

[25 May 2015: updated with feedback from Peter Gammie]

[29 May 2015: updated with feedback from Larry Paulson]

Posted by Gerwin Klein on May 23, 2015.