types: simple logic: higher-order proof: machine-checked software: formally verified
The dos and don’ts of Isabelle proofs - Part 2: Good Style. Advanced material for nicer proofs.
With examples and rationale.
This is Part 2 of my personal style guide for Isabelle/HOL. Part 1 was about the essentials. Part 2 is about good style and advanced rules that may not always apply, but will be beneficial when they do. Part 3 will be about layout and looks.
Thanks for all the positive feedback on Part 1! I hope you will as much fun with Part 2.
This third set of rules in the Isabelle style guide is about things that will make your proofs nicer. If you can follow them you should, but it may not always be feasible to do so. Some of these are for advanced uses of Isabelle.
Use structured Isar proofs instead of apply scripts.
Not always feasible, but most often easier to read and maintain.
Prove your assumptions are not vacuous.
It has happened to many of us: we have proudly proved a glorious lemma, only for someone to point out that its assumptions can never be satisfied. Few things are more embarrassing in proofs, and it is often easy to avoid:
If you prove an invariant, show that it is true for at least one state. If you model a mathematical structure by its axioms, show that there is at least one such structure. If you have a top-level theorem, show that its assumptions are satisfiable.
Use Eisbach instead of ML code for custom tactics.
New in Isabelle2015, Eisbach extends Isar with a proof method language. Proof methods in Eisbach are easier to understand and maintain for non-experts.
Compress single-step low-level proofs into automated search.
Use this judiciously, it can easily be overdone. The idea is to make apply scripts shorter and thereby more readable, and to make them more robust against small changes by using search instead of single steps.
However, if you overdo it, you’ll end up with a large, obscure search tactic invocation that will opaquely tell you Failed to apply proof method
when it does not work any more. You will also want to avoid increasing processing time too much. Sometimes a few low-level apply
steps are siginificantly faster than a search tactic.
Original
apply (rule conjI)
apply (clarsimp simp: X)
apply (rule impI)
apply (erule (1) impE)
apply (rule my_rule)
apply simp
apply clarsimp
done
Good, compressed:
by (fastforce simp: X intro!: my_rule)
This is shorter, forces the subgoal to be solved, and uses search, so will be more robust against small changes. It can also be processed asynchronously, because it can use by
. Caution: only use if my_rule
is safe and the search doesn’t take too long.
Use context
instead of global simp/attribute changes
Unless you are setting up a library or are deliberately designing a globally useful and complete set of rules for proof automation, keep changes to contexts such as [simp]
and [intro]
local by either using notes
or using
inside a single proof, or by using the context
command to keep the change local to the group of lemmas you are currently interested in.
Bad:
declare if_cong [cong]
declare word_neq_0_conv [simp]
lemma ...
lemma ...
declare if_weak_cong [cong]
declare word_neq_0_conv [simp del]
This bad form assumes a certain state of [cong]
and [simp]
attributes in the beginning and tries to restore them. If this assumption changes, it will restore the wrong state. If you merge this theory with another one that has different assumptions, confusion is likely.
Better:
context
notes if_cong [cong]
notes word_neq_0_conv [simp]
begin
lemma ...
lemma ...
end
This form always restores the original context, whatever it was.
Good:
lemma ...
lemma
notes if_cong [cong] word_neq_0_conv [simp]
shows ...
This form uses the changed [simp]
and [cong]
context only for the lemmas that actually need them. This makes it easier to move such lemmas into other theories.
If you have multiple such declarations, it’s worth giving them a name using bundle
:
Good:
bundle cong_word = if_cong [cong] word_neq_0_conv [simp]
lemma ...
lemma
includes cong_word
shows ...
Avoid unnecessary quantifiers and type annotations
Humans and Isabelle are good at type inference, so it is often more readable to leave out type annotations. Similarly, free variables are interpreted as universally quantified in lemmas after they are completed, so no need to add extra quantifiers.
Bad:
lemma "⋀x (y::32 word). ctes_of s ⊢ x → y ⟹ ctes' ⊢ x → y"
Good:
lemma "ctes_of s ⊢ x → y ⟹ ctes' ⊢ x → y"
There are exceptions to this rule, for instance when complex types are hard to infer for humans so that an annotation adds clarity.
Decide on good normal forms and use them consistently.
When you design simpsets and other setups for automated tactics, think about what normal forms your terms will have, and use these normal forms in your lemma statements.
Try to have most of your lemmas in clarsimp
normal form, if that doesn’t make them too unwieldy.
Readable, but bad for automation:
lemma "f (x + 1) = f x + (1::nat)"
This is Ok for top-level theorems and lemmas you want to present to people. If the goal is to use them in further proofs, the following is more useful.
Better:
lemma "f (Suc x) = Suc (f x)"
The second form is better, because simp
will rewrite x + 1
into Suc x
, so the form f (x + 1)
is unlikely to appear in goals.
A simple test whether your lemma is in clarsimp
normal form is to see whether a pure apply clarsimp
changes anything right after stating the lemma. Anything that changes should at least not be in the part of the lemma that needs to match in the goal when it is used.
Use text
and section
commands to document your proofs.
Isabelle supports literate proofs. Make use of that facility, especially for basic definitions, fundamental assumptions, and top-level theorems, but also for any peculiarities, workarounds, and tricks you used in your formalisation.
Use the text
command to document, use chapter
, section
, and subsection
to give structure to theory files, even if you don’t intend to produce a polished PDF proof document.
Adhere to library naming schemes where they exist:
Lists are called xs
, ys
, and so on (s for sequence), sets are capital letters such as A
and S
. Elements of lists and sets are x
, y
, and a
, b
. Predicates are P
, Q
, R
.
Types and type constructors such as nat
, 'a list
, 'a set
are lower case and with under_score_names
, data type constructors such as Nil
and Cons
are upper case and with CamelCaseNames
.
Theory and file names are upper case and in Camel_Underscore_Case
.
Many of the standard libraries have an established naming scheme. Don’t be afraid to look at the library source and pick up the style used there. When in doubt, take what is used in the Isabelle distribution. Always a good idea to browse the contents of src/HOL
for inspiration.
Think of a naming scheme for your application, and use it consistently.
Consistent naming is not only good for maintenance, but also for distributed proof development: person A should be able to predict the name person B is going to give to a lemma or definition.
Avoid one-letter names for constants and types.
Bad:
definition f where "f = ..."
This will reserve the name f
and make it unusable for variables, or lead to confusion in formulas such as ∀f. P f
.
Prefer longer names for functions, and shorter names for parameters and variables.
Avoid unnecessary parentheses, unless precedence is exotic or especially unclear.
Isabelle uses Haskell- and ML-like functional syntax that minimises parentheses for better readability. If unsure about operator precedence, you can enter a term with parentheses and see what Isabelle echoes back: it will leave out the unnecessary parentheses.
Bad:
definition
"my_fun f xs = (map (f)) xs @ [(f (hd xs))]
Good:
definition
"my_fun f xs = map f xs @ [f (hd xs)]
Use judiciously: sometimes, parentheses add clarity. In general it is a good idea to use parentheses sparingly and orient yourself on what is used in the Isabelle distribution.
Consider forcing failure when a proof method does not solve its goal
When something like apply simp
solves a goal, consider replacing it with apply (simp; fail)
so that it will fail early when after a change or in a new Isabelle release simp
does not solve its goal any more. (The ;
operator runs fail
on all new goals after simp
, so the only way this will not fail is if there are no new goals produced by simp
)
If you are using Eisbach, you can also write apply (solves ‹simp›)
. ‹
and ›
can be typed as \open
and \close
in jEdit.
Both forms are a bit verbose for widespread use, but forcing early failure will help in long obscure proofs. Also well worth considering when you’re writing a new proof method in Eisbach.
Avoid theorem-transforming attributes such as [simplified]
and [rule_format]
in lemma statements.
Instead, write out the transformed form.
[rule_format]
was popular for a while when the induct
method did not yet exist and the now deprecated induct_tac
could not handle theorem statements with meta-implication. rule_format
should only be necessary very rarely now.
[simplified]
is unpredictable for the reader and its outcome is dependent on which simplification rules are declared. If that global context changes, or the lemma is moved somewhere else, the result can be subtly different and it will not be easy to figure out what happened.
Bad:
lemma P [simplified]: "P (x+1)"
...
Good:
lemma P: "P (Suc x)"
apply simp
...
Or, if simp
for more fine-grained control:
Good:
lemma P: "P (Suc x)"
proof -
have "P (x+1)"
...
thus ?thesis by simp
qed
If P
is a large term, Isar’s term abbreviations can help, for example:
lemma P: "P (Suc x)" (is "?P (Suc x)")
proof -
have "?P (x+1)"
...
thus ?thesis by simp
qed
Avoid apply (auto; ...)
This has behaviour that is hard to predict for readers: the ;
operator refers only to the goals produced from the first subgoal, not to those produced by all subgoals that auto
may work on. For example, if you have two subgoals before the call to auto
, and auto
produces two new subgoals from the first and two new subgoals form the second, then
apply (auto; x)
will apply x
only to the first two of the 4 total subgoals after auto
. The same applies for other tactics that work on multiple goals simultaneously, such as simp_all
.
This concludes Part 2 of my style guide for Isabelle/HOL.
As with Part 1, if you have feedback, strong opinions, or just liked what you read, get in touch by email or if you think it is of wider interest, on the isabelle-users mailing list (subscribe here).
Part 1 has seen some updates based on feedback. I might do the same for Part 2.
[30 May 2015: udpated with bundle
suggestion by Lars Noschinski]
[30 May 2015: thanks to Larry Paulson for catching a mistake in the context
section]
[Jan 2016: updated with the auto; x
suggestion by Matthew Fernandez]
Posted by Gerwin Klein on May 30, 2015.