Our experiments has highlighted limitations of lemma speculation. It is not as widely applicable as believed at the start of the project. The evaluation set thus contains the 18 theorems below.
Lemma speculation has problems when no ripple-steps are possible between application of the lemma and fertilisation, which will cause the technique to fail with an underspecified lemma.

* Note: After finding the lemma for theorem 16, weak fertilisation can be applied to the original goal. Although the speculated lemma can be proved automatically, there is however also a sub-goal remaining after fertilisation: Suc(b + b) = Suc b + b, which IsaPlanner cannot prove as it lacks capabilities to generalise variables apart.

Webpage maintained by Moa Johansson

**Theorems 1 - 7:**

These theorems come from the evaluation set for lemma speculation in CLAM 3.

*Ireland, A. and Bundy, A. Productive use of failure in inductive proof. Journal of Automated Reasoning, 1996. *

Theorem | Speculated lemma(s) | Proved lemma ? | Time (sec.) | |
---|---|---|---|---|

1 | x + (Suc x) = Suc(x + x) | Suc(x + y) = Suc(y) + x | yes | 6 |

2 | even(x + x) | x + (Suc x) = Suc(x + x) | no | 4 |

3 | even(len(l @ l)) | len(l @ (h # l)) = Suc(len(l @ l)) | no | 13 |

len(l @ (h # l)) = len(h # (l @ l)) | ||||

4 | rev((rev l) @ m) = (rev m) @ l | rev(h # l) @ m = rev l @ (h # m) | yes | 8 |

5 | rev(rev(l @ m)) = rev(rev l) @ rev(rev m) | rev(l @ [h]) = h # rev l | yes | < 1 |

6 | rev(rev l) @ m = rev(rev(l @ m)) | rev(l @ [h]) = h # rev l | yes | < 1 |

7 | rotate (len l) (l @ m) = m @ l | fail | - | 7 |

**Theorems 8 - 11:**

Higher-order theorems about lists. Theorem 8 comes from Isabelle's list library, while the other theorems have been constructed by hand.

Theorem | Speculated lemma(s) | Proved lemma ? | Time (sec.) | |
---|---|---|---|---|

8 | rev(concat l) = concat(map rev (rev l)) | fail | - | 8 |

9 | len(concat (map f l)) = len(maps f l) | fail | - | 1 |

10 | foldl (λ x y. y + x) n ((rev l) @ m) =
foldl (λ x y. y + x) n (m @ l) |
foldl (λ x y. y + x) n (l @ (h # m)) =
foldl (λ x y. y + x) h (n # (l @ m) |
no | 137 |

11 | foldl (λ x y. x + len y) n ((rev l) @ m) =
foldl (λ x y. x + len y) n (m @ l) |
foldl (λ x y. x + len y) n (l @ (h # m)) =
foldl (λ x y. x + len y) h (n # (l @ m) |
no | 34 |

**Theorems 12 - 13:**

Theorems about inequalities.

Theorem | Speculated lemma(s) | Proved lemma ? | Time (sec.) | |
---|---|---|---|---|

12 | x ≤ (y + x) | Suc z ≤ a + (Suc z) = z ≤ Suc(a + z) | yes | 44 |

13 | count n l ≤ count n (l @ m) | fail | - | 23 |

**Theorems 14 - 18:**

Theorems from alternative formalisations of Peano arithmetic. These are seemingly the same theorems, but with different definitions of addition, multiplication and exponentiation. The different theories will force different proofs for the theorems.

The variants of addition considered are:

a1: |
0 + y = y Suc x + y = Suc(x + y) |
a2: |
x + 0 = x x + Suc y = Suc(x + y) |

a3: |
0 + y = y Suc x + y = x + Suc y |
a4: |
x + 0 = x x + Suc y = Suc x + y |

The test-set in alternative variants of Peano arithmetic comes from: *Dixon, L. A proof-planning framework for Isabelle. PhD Thesis, University of Edinburgh, 2005.*

Theorem | Def. of + | Speculated lemma(s) | Proved lemma ? | Time (sec.) | |
---|---|---|---|---|---|

14 | a + (a + b) = b + (a + a) | a1 | fail | - | < 1 |

15 | a + (0 + a) = a + a | a1 | fail | - | < 1 |

16 | a + (0 + a) = a + a | a2 | Suc x + y = x + Suc y | yes* | < 1 |

17 | a + (0 + a) = a + a | a3 | fail | - | < 1 |

18 | a + (0 + a) = a + a | a4 | fail | - | < 1 |

Webpage maintained by Moa Johansson