Lemmas & Induction

Sometimes Z3 can't prove a property automatically. The solver is powerful, but it works within fixed decision procedures — it doesn't invent creative proof strategies on its own. When Z3 gets stuck, you need to help by writing lemmas: small helper proofs that guide the verification.

In Deppy, lemmas are ordinary Python functions decorated with @lemma that return proof terms. Combined with induction, lemmas let you prove complex properties about recursive data structures — lists, trees, natural numbers — all without modifying the code you're verifying.

Today's surface API: the @induction decorator currently takes a single positional argument — the target identifier (@induction("xs")) — and stores fn._deppy_induction_target for the kernel to read. The chapter examples below show @induction("xs", over="list") and similar; the over= keyword is a documentary convention used here to signal which induction principle the proof is meant to invoke (natural-number induction, list induction, tree induction). Inside the kernel the actual selection happens by binding the proof body to one of NatInduction, ListInduction, or structural Cases on the relevant ADT.

Lemmas as Python Functions

A lemma is a function that takes inputs and returns a proof term. The @lemma decorator marks it as a proof obligation — not executable code. The function body constructs the proof using proof combinators that Deppy checks for validity.

Here's the simplest possible lemma: proving that x + 0 == x for any integer.

@lemma
def add_zero_right(x: int) -> Proof[x + 0 == x]:
    # x + 0 is definitionally equal to x
    return Refl(x)

The return type Proof[x + 0 == x] declares exactly what this lemma proves. The body must construct a valid proof term of that type. Here, Refl(x) works because x + 0 reduces to x by the definition of addition.

A slightly more involved lemma — commutativity of addition for a specific case:

@lemma
def add_comm_zero(x: int) -> Proof[0 + x == x + 0]:
    # Both sides equal x, so transitivity through Refl
    p1 = add_zero_right(x)        # x + 0 == x
    p2 = Refl(x)                  # x == x
    return Trans(Symm(p2), p1)   # 0 + x == x == x + 0
Ghost code. Lemmas are "ghost code" — they exist only for verification and are completely erased at runtime. They produce no overhead, no side effects, and no executable instructions. Your production code stays exactly as it was.

Natural Number Induction with @induction

Many properties of natural numbers require induction. The @induction("n") decorator tells Deppy to apply the standard induction schema on the parameter n:

Deppy generates the induction schema automatically and checks that both cases are covered by your proof terms.

Example: Sum of First n Natural Numbers

Let's prove the classic identity: sum(range(n+1)) == n * (n+1) // 2. First, here's the ordinary Python function we're verifying:

def triangle(n: int) -> int:
    """Sum of 0 + 1 + 2 + ... + n."""
    return sum(range(n + 1))

Now the lemma that proves correctness:

@lemma
@induction("n")
def triangle_formula(n: int) -> Proof[triangle(n) == n * (n + 1) // 2]:
    if n == 0:
        # Base case: triangle(0) = sum(range(1)) = 0 = 0*1//2 ✓
        return Refl(0)
    else:
        # Inductive step: assume triangle(n-1) == (n-1)*n//2
        ih = triangle_formula(n - 1)  # induction hypothesis

        # triangle(n) = triangle(n-1) + n
        #             = (n-1)*n//2 + n       (by ih)
        #             = (n-1)*n//2 + 2n//2
        #             = ((n-1)*n + 2n) // 2
        #             = n*(n+1) // 2         ✓
        step = Cong(lambda s: s + n, ih)
        return Trans(step, Refl(n * (n + 1) // 2))

The recursive call triangle_formula(n - 1) is not a runtime call — it's the induction hypothesis provided by the @induction schema. Deppy verifies that the recursion is well-founded (always decreasing toward the base case).

Structural Induction on Lists

Lists have two structural cases: the empty list [] and a cons cell [x] + xs. The decorator @induction("xs", over="list") triggers list induction:

Example: Reverse Is Involutive

We prove that reversing a list twice gives back the original: reverse(reverse(xs)) == xs.

def reverse(xs: list) -> list:
    if not xs:
        return []
    return reverse(xs[1:]) + [xs[0]]
# Helper lemma: reverse distributes over append
@lemma
@induction("xs", over="list")
def reverse_append(xs: list, ys: list) -> Proof[
    reverse(xs + ys) == reverse(ys) + reverse(xs)
]:
    if not xs:
        # reverse([] + ys) == reverse(ys) == reverse(ys) + []
        return Symm(append_nil(reverse(ys)))
    else:
        x, xt = xs[0], xs[1:]
        ih = reverse_append(xt, ys)  # induction hypothesis
        # reverse((x:xt) + ys) = reverse(xt + ys) + [x]
        #                      = (reverse(ys) + reverse(xt)) + [x]  (by ih)
        #                      = reverse(ys) + (reverse(xt) + [x])
        #                      = reverse(ys) + reverse(x:xt)
        step = Cong(lambda r: r + [x], ih)
        return Trans(step, append_assoc(reverse(ys), reverse(xt), [x]))

@lemma
@induction("xs", over="list")
def reverse_involutive(xs: list) -> Proof[
    reverse(reverse(xs)) == xs
]:
    if not xs:
        # reverse(reverse([])) == reverse([]) == [] ✓
        return Refl([])
    else:
        x, xt = xs[0], xs[1:]
        ih = reverse_involutive(xt)  # reverse(reverse(xt)) == xt

        # reverse(reverse([x] + xt))
        #   = reverse(reverse(xt) + [x])         (def of reverse)
        #   = reverse([x]) + reverse(reverse(xt)) (reverse_append)
        #   = [x] + reverse(reverse(xt))          (reverse singleton)
        #   = [x] + xt                            (by ih)
        p1 = reverse_append(reverse(xt), [x])
        p2 = Cong(lambda r: [x] + r, ih)
        return Trans(p1, p2)

Example: Append Preserves Length

Prove that len(xs + ys) == len(xs) + len(ys):

@lemma
@induction("xs", over="list")
def append_length(xs: list, ys: list) -> Proof[
    len(xs + ys) == len(xs) + len(ys)
]:
    if not xs:
        # len([] + ys) == len(ys) == 0 + len(ys) ✓
        return Refl(len(ys))
    else:
        x, xt = xs[0], xs[1:]
        ih = append_length(xt, ys)  # len(xt + ys) == len(xt) + len(ys)

        # len([x] + xt + ys) = 1 + len(xt + ys)        (def of len)
        #                   = 1 + len(xt) + len(ys)     (by ih)
        #                   = len([x] + xt) + len(ys)   (def of len)
        return Cong(lambda k: 1 + k, ih)

Structural Induction on Trees

Trees have two constructors: Leaf and Node(left, value, right). The decorator @induction("t", over="tree") gives you:

Example: Mirror Is Involutive

def mirror(t):
    if t is None:      # Leaf
        return None
    return Node(mirror(t.right), t.value, mirror(t.left))

@lemma
@induction("t", over="tree")
def mirror_involutive(t) -> Proof[mirror(mirror(t)) == t]:
    if t is None:
        # mirror(mirror(Leaf)) == mirror(Leaf) == Leaf ✓
        return Refl(None)
    else:
        # Induction hypotheses for subtrees
        ih_left = mirror_involutive(t.left)
        ih_right = mirror_involutive(t.right)

        # mirror(mirror(Node(L, v, R)))
        #   = mirror(Node(mirror(R), v, mirror(L)))    (def of mirror)
        #   = Node(mirror(mirror(L)), v, mirror(mirror(R)))  (def of mirror)
        #   = Node(L, v, R)                             (by ih_left, ih_right)
        p_left = Cong(lambda l: Node(l, t.value, t.right), ih_left)
        p_right = Cong(lambda r: Node(t.left, t.value, r), ih_right)
        return Trans(p_left, p_right)

Proof by Contradiction

Sometimes the most natural proof assumes the negation and derives False. The @proof_by_contradiction decorator handles this pattern:

@lemma
@proof_by_contradiction
def no_largest_int(n: int) -> Proof[Exists(int, lambda m: m > n)]:
    # Assume for contradiction: ∀ m, m ≤ n
    # Then n + 1 ≤ n, which implies 1 ≤ 0 — contradiction.
    witness = n + 1
    # n + 1 > n is trivially true
    return Witness(witness, Refl(True))

Deppy checks that the contradiction is genuine — you can't just assert False. The proof must derive it from the negated hypothesis through valid logical steps.

Proof Combinators

Proof combinators are the building blocks for constructing proof terms. Each combinator has a precise type that Deppy verifies:

Combinator Type Description
Refl(x) Proof[x == x] Reflexivity — every value equals itself
Symm(p) Proof[a == b] → Proof[b == a] Symmetry — flip an equality
Trans(p, q) Proof[a == b] → Proof[b == c] → Proof[a == c] Transitivity — chain two equalities
Cong(f, p) (a → b) → Proof[x == y] → Proof[f(x) == f(y)] Congruence — apply a function to both sides
Subst(p, ctx) Proof[a == b] → (a → Prop) → Proof[ctx(a)] → Proof[ctx(b)] Substitution — rewrite in a context
Witness(w, p) a → Proof[P(a)] → Proof[Exists(T, P)] Existential introduction
Inductive(base, step) Proof[P(0)] → (Proof[P(k)] → Proof[P(k+1)]) → Proof[∀n. P(n)] Induction principle

You can compose these freely. For instance, to prove f(a) == f(c) given a == b and b == c:

p_ab: Proof[a == b] = ...
p_bc: Proof[b == c] = ...
p_ac = Trans(p_ab, p_bc)       # Proof[a == c]
result = Cong(f, p_ac)          # Proof[f(a) == f(c)]

Complete Example: Sum of Odd Numbers

Let's prove a classic identity: the sum of the first n odd numbers equals n². That is, 1 + 3 + 5 + ⋯ + (2n − 1) == n².

def sum_odds(n: int) -> int:
    """Sum of first n odd numbers: 1 + 3 + 5 + ... + (2n-1)."""
    total = 0
    for i in range(n):
        total += 2 * i + 1
    return total

@lemma
@induction("n")
def sum_odds_square(n: int) -> Proof[sum_odds(n) == n ** 2]:
    if n == 0:
        # Base case: sum_odds(0) = 0 = 0² ✓
        return Refl(0)
    else:
        # Inductive step:
        # Assume sum_odds(n-1) == (n-1)²
        ih = sum_odds_square(n - 1)

        # sum_odds(n) = sum_odds(n-1) + (2n - 1)      (loop body)
        #            = (n-1)² + (2n - 1)               (by ih)
        #            = n² - 2n + 1 + 2n - 1
        #            = n²                              ✓
        step = Cong(lambda s: s + (2 * n - 1), ih)
        # Now step : sum_odds(n) == (n-1)² + (2n - 1)
        # Z3 can close the arithmetic gap: (n-1)² + (2n-1) == n²
        return Trans(step, Refl(n ** 2))

Notice how the final arithmetic simplification is left to Z3. You only need to write proof steps for the parts Z3 can't figure out on its own (the inductive structure). The algebraic manipulation is automatic.

How Lemmas Compose with @guarantee

Lemmas on their own prove facts. To use a lemma inside a function's verification, add @uses(lemma_name). This tells Deppy to apply the lemma when discharging the function's @guarantee.

@guarantee("result has same total length as inputs combined")
@uses(append_length)
def interleave(xs: list, ys: list) -> list:
    """Interleave two lists: [1,2],[a,b] -> [1,a,2,b]."""
    if not xs:
        return ys
    if not ys:
        return xs
    return [xs[0], ys[0]] + interleave(xs[1:], ys[1:])

The verification pipeline works like this:

  1. Deppy sees @guarantee and generates a proof obligation about len(result).
  2. Z3 tries to discharge it automatically — but the recursive structure stumps it.
  3. Deppy sees @uses(append_length) and instantiates the lemma at each recursive call.
  4. With the lemma's fact available, Z3 can now close the proof. ✓
Start with Z3. Always try verification without any lemmas first. Z3 handles a surprising amount — arithmetic, simple list operations, boolean logic. Only write a lemma when you see a verification failure that requires inductive reasoning or a non-obvious rewrite.
Lemma termination. Lemma proofs must terminate, just like any other function under Deppy's termination checker. If your lemma has an infinite loop, the proof is unsound — Deppy will reject it. Always ensure recursive lemma calls decrease on a well-founded measure (the @induction decorator handles this automatically for structural recursion).
Exercise. Prove that map distributes over concatenation. That is, for any function f and lists xs, ys:
map(f, xs + ys) == map(f, xs) + map(f, ys)

Write a complete lemma using @induction("xs", over="list"). You'll need two cases:

  • Base: map(f, [] + ys) == map(f, ys) == [] + map(f, ys) == map(f, []) + map(f, ys)
  • Inductive: Unfold one step of map on the cons case and apply the induction hypothesis.
Show solution
@lemma
@induction("xs", over="list")
def map_append(f, xs: list, ys: list) -> Proof[
    map(f, xs + ys) == map(f, xs) + map(f, ys)
]:
    if not xs:
        # map(f, [] + ys) = map(f, ys)
        # map(f, []) + map(f, ys) = [] + map(f, ys) = map(f, ys)
        return Refl(map(f, ys))
    else:
        x, xt = xs[0], xs[1:]
        ih = map_append(f, xt, ys)

        # map(f, [x]+xt + ys) = [f(x)] + map(f, xt + ys)
        #                    = [f(x)] + map(f, xt) + map(f, ys)  (by ih)
        #                    = map(f, [x]+xt) + map(f, ys)
        return Cong(lambda r: [f(x)] + r, ih)