Lean Export & CIC Interpretation

Machine-checked certificates for Python programs

Deppy can construct proofs about your Python code — but how do you know Deppy itself isn't wrong? The answer is independent verification: export the proof to Lean 4, whose tiny trusted kernel re-checks everything from scratch. If Lean accepts the proof, correctness depends only on Lean's kernel — not on Deppy, not on Z3, not on any LLM.

This chapter explains how Deppy proofs map to Lean 4's Calculus of Inductive Constructions (CIC), what the generated files look like, and what remains out of reach.

Why Lean?

Deppy's trust hierarchy has three levels, in ascending order of assurance:

  🟠 LLM_CHECKED   — oracle says correct (confidence ≥ threshold)
  🟡 Z3_VERIFIED    — proved by Z3 SMT solver
  🟢 LEAN_KERNEL_VERIFIED — machine-checked by Lean's kernel

Why is Lean at the top? Because Lean 4's kernel is a small, carefully audited type checker implementing the Calculus of Inductive Constructions. It is roughly 5,000 lines of C++ — small enough to be meaningfully reviewed. Every proof accepted by the kernel has been independently re-derived from axioms; the kernel trusts nothing outside itself.

By contrast, Deppy's proof engine is tens of thousands of lines of Python, wrapping Z3 bindings, homotopy search, and LLM calls. Any bug in that code could cause a false positive. Exporting to Lean gives an independent second opinion: if Lean accepts the proof, correctness doesn't depend on Deppy being bug-free.

This is the same strategy used by proof-carrying code, CompCert, and other high-assurance systems: generate the proof however you like, then let a small trusted checker verify it.

Deppy in CIC

Deppy's proof terms are built from homotopy-flavored constructors — Refl, Trans, Cong, and so on. Each constructor has a natural interpretation in Lean 4's type theory. The following table shows the mapping:

Deppy	CIC / Lean 4	Notes
`@guarantee("P")`	`theorem` `fn_P` : `P`	Postcondition becomes a theorem statement
`@requires("Q")`	`(h : Q) →`	Precondition becomes a hypothesis parameter
`Refl`	`rfl` / `Eq.refl`	Reflexivity of equality
`Trans`(`p`, `q`)	`Eq.trans` `p` `q`	Transitivity: `a = b → b = c → a = c`
`Sym`(`p`)	`Eq.symm` `p`	Symmetry: `a = b → b = a`
`Cong`(`f`, `p`)	`congrArg` `f` `p`	Congruence: `a = b → f a = f b`
`NatInduction`	`Nat.rec` / `by induction`	Structural induction on natural numbers
`ListInduction`	`List.rec` / `by induction`	Structural induction on lists
`Z3Proof`	`by omega` / `by decide`	Arithmetic decided by Lean's `omega` tactic
`FunExt`	`funext`	Function extensionality
`Cases`	`by cases` / `match`	Case analysis on constructors
`DuckPath`	(axiom)	Structural subtyping — modeled as a postulated axiom
`CechGlue`	`sorry` -- Čech gluing	No CIC analog — exported with explanation comment
`PathType`	`Eq`	Propositional equality
`Transport`	`Eq.subst` / `▸`	Substitution along an equality proof

Mapping Examples

A Deppy reflexivity proof:

# Deppy proof term
proof = Refl(term=square(x))

becomes in Lean:

-- Lean 4
theorem square_eq : square x = square x := rfl

A transitive chain:

# Deppy
proof = Trans(
    Cong(f, p),   # f(a) = f(b)  from  a = b
    Sym(q)         # f(b) = c     from  c = f(b)
)

-- Lean 4
theorem chain (p : a = b) (q : c = f b) :
    f a = c :=
  Eq.trans (congrArg f p) (Eq.symm q)

A Z3 leaf proof is translated to Lean's omega tactic, which handles linear arithmetic natively:

# Deppy
proof = Z3Proof("x + 1 > x")

-- Lean 4
theorem succ_gt (x : Int) : x + 1 > x := by omega

Type Translation

Python types are mapped to Lean types following a simple, conservative scheme. The goal is not to model every quirk of CPython semantics but to provide a faithful mathematical model of what the Python code intends.

Python Type	Lean 4 Type	Notes
`int`	`Int`	Arbitrary-precision integers (matches Python semantics)
`bool`	`Bool`	`True` → `true`, `False` → `false`
`str`	`String`	Unicode strings
`list[int]`	`List Int`	Immutable in proofs (functional model)
`Optional[T]`	`Option T`	`None` → `Option.none`
`dict[K, V]`	`K → V`	Partial function model (total with default)
`set[T]`	`Finset T`	Requires `import` Mathlib
`tuple[A, B]`	`A × B`	Product type (Prod)

Example translation of a function signature:

def lookup(d: dict[str, int], key: str) -> Optional[int]:
    ...

def lookup (d : String → Int) (key : String) : Option Int :=
  sorry

What a Certificate Looks Like

Let's walk through a complete example. Here is a Python function with two guarantees:

from deppy import guarantee

@guarantee("result >= 0")
@guarantee("result * result <= x")
def isqrt(x: int) -> int:
    if x <= 0:
        return 0
    r = 0
    while (r + 1) * (r + 1) <= x:
        r += 1
    return r

Deppy generates the following Lean 4 file:

/-
  Generated by Deppy v0.1.0
  Source: math_utils.py
  Guarantees: 2 exported, 1 fully proved, 1 sorry
-/
import Mathlib.Tactic.Omega
import Mathlib.Data.Int.Basic

/-- Integer square root -/
def isqrt (x : Int) : Int :=
  if x ≤ 0 then 0
  else
    -- body translated from Python
    sorry  -- loop translation not yet supported

theorem isqrt_nonneg (x : Int) : 0 ≤ isqrt x := by
  unfold isqrt
  split
  · omega
  · sorry  -- requires loop invariant

theorem isqrt_squared (x : Int) (h : 0 ≤ x) :
    isqrt x * isqrt x ≤ x := by
  sorry  -- requires loop invariant reasoning

Anatomy of the Certificate

Header comment — records the Deppy version, source file, and a summary of how many guarantees were exported and how many were fully proved. This is metadata for human readers; Lean ignores it.
Imports — Mathlib.Tactic.Omega provides the omega tactic for linear arithmetic. Mathlib.Data.Int.Basic gives basic integer lemmas.
Function definition — the Python function is translated to a Lean def. The if branch translates directly. The while loop body is replaced with sorry because Lean requires termination proofs for recursive definitions, and arbitrary while loops don't map cleanly to structural recursion.
Theorem statements — each @guarantee becomes a theorem. The name is derived from the function name and guarantee text. The statement is a direct formalization of the guarantee's meaning.
Proof bodies — where Deppy has a complete proof, the tactic script is emitted. Where the proof depends on parts Deppy can't translate (like loop invariants), sorry is used with a comment explaining what's missing.

A Fully Verified Example

Not every export contains sorry. For functions with simple structure, Deppy can produce a complete Lean proof. Here is the simplest case:

from deppy import guarantee

@guarantee("result >= 0")
def square(x: int) -> int:
    return x * x

The generated Lean file — with zero sorry:

def square (x : Int) : Int := x * x

theorem square_nonneg (x : Int) : 0 ≤ square x := by
  unfold square
  exact mul_self_nonneg x

When Lean processes this file with lake build, it type-checks every definition and theorem. The mul_self_nonneg lemma is provided by Mathlib and states exactly what we need: for any integer x, x * x ≥ 0. No trust in Deppy is required — Lean has independently verified the claim.

The key insight: Deppy doesn't need to be correct — it only needs to produce output that Lean can check. This is why proof generation (hard, complex, potentially buggy) is separated from proof verification (small, trusted, audited).

A slightly more involved example with a precondition:

@guarantee("result > 0")
def abs_plus_one(x: int) -> int:
    return abs(x) + 1

def abs_plus_one (x : Int) : Int := Int.natAbs x + 1

theorem abs_plus_one_pos (x : Int) : 0 < abs_plus_one x := by
  unfold abs_plus_one
  omega

Trust Levels in Detail

Every Lean export carries a trust level that summarizes how much of the proof Lean can independently verify:

Trust Level	Meaning	sorry?	Custom Axioms?	Lean ran?
LEAN_KERNEL_VERIFIED	Lean kernel fully checked every step (set by `cert.verify_with_lean()`)	0	0	✓
LEAN_SYNTAX_COMPLETE	No `sorry`, no raw axioms — but Lean has NOT been invoked. Default after a successful `compile_to_lean`.	0	0	✗
LEAN_EXPORTED	Generated but contains `sorry` gaps; Lean not invoked or sorry remains.	≥ 1	0	✗
ASSUMPTION_DEPENDENT	Uses raw `axiom` declarations (e.g., for library calls or duck-typing). Lean accepts axioms by definition, so a passing `verify_with_lean()` still inherits their assumptions.	Any	≥ 1	—
LEAN_REJECTED	The `lean` binary was invoked and rejected the file. See `cert.lean_check_stderr` for the diagnostic.	Any	Any	✓ (failed)
LEAN_UNAVAILABLE	The `lean` binary was not found on PATH. No kernel check was attempted; install Lean to enable verification.	Any	Any	—

LEAN_KERNEL_VERIFIED is the gold standard. Zero sorry, zero custom axioms. The proof is fully machine-checked. If Lean's kernel has no bugs, the theorem holds.
LEAN_EXPORTED means the proof structure is documented in Lean, but some steps are marked sorry. This is still useful: the theorem statements are precise, and you can fill in the sorrys manually in Lean if desired.
ASSUMPTION_DEPENDENT means the proof relies on axioms that model Python-specific features (like duck typing paths) that have no native CIC equivalent. The proof is correct if the axioms hold.

You can inspect the trust level programmatically. Note: compile_to_lean does NOT invoke Lean — it only emits Lean source. To actually run Lean (and obtain the LEAN_KERNEL_VERIFIED label) call cert.verify_with_lean():

from deppy.lean import compile_to_lean

# Step 1: emit Lean source.  Trust starts at LEAN_EXPORTED or
# LEAN_SYNTAX_COMPLETE depending on whether any `sorry` remains.
cert = compile_to_lean(my_function)
print(cert.trust_level)
# → 'LEAN_SYNTAX_COMPLETE'  (no sorry, but Lean hasn't run)

# Step 2: invoke Lean.  Requires `lean` on PATH.
cert = cert.verify_with_lean()
print(cert.trust_level)
# → 'LEAN_KERNEL_VERIFIED'  (Lean accepted)
# or 'LEAN_REJECTED'       (Lean rejected; see cert.lean_check_stderr)
# or 'LEAN_UNAVAILABLE'    (lean binary not on PATH)

print(cert.summary())
# LeanCertificate 'my_function'
#   declarations: 1
#   theorems:     3 (2 proved, 1 sorry)
#   trust level:  LEAN_EXPORTED

Retired label: older docs referred to a single LEAN_VERIFIED trust level that was issued whenever the emitted file had no sorry — even if Lean never ran. That label is retired. The current labels are LEAN_EXPORTED, LEAN_SYNTAX_COMPLETE, LEAN_KERNEL_VERIFIED, LEAN_REJECTED, LEAN_UNAVAILABLE, and ASSUMPTION_DEPENDENT — see Part 4 · Trust Levels for the full taxonomy.

Using the Export

CLI Workflow

The simplest way to export and check proofs:

$ python -m deppy export my_module.py --output my_module.lean
Exported 5 theorems (3 fully proved, 2 sorry)

# Or: render the Lean text directly via the sidecar runner.
# It also runs ``lean`` against the output and writes the JSON sidecar:
$ python -m deppy.pipeline.run_verify my_module.deppy --out my_module.lean
  Lean export round-trip: OK
  ✓ CERTIFIED — top-level certify_or_die: PASS

Setting Up a Lean 4 Project

To check the exported proofs, you need a Lean 4 project with Mathlib. Here is the minimal setup:

-- lakefile.lean
import Lake
open Lake DSL

require mathlib from git
  "https://github.com/leanprover-community/mathlib4"

package myProject

lean_lib MyModule

Then:

$ lake update   # fetch Mathlib (takes a few minutes the first time)
$ lake build    # type-check everything, including your exported file

Interpreting Results

When lake build succeeds, every theorem without sorry has been fully verified by Lean's kernel. Lean will emit warnings for any remaining sorry:

warning: declaration uses 'sorry' [linter.unreachableTactic]

This is expected for LEAN_EXPORTED certificates — it means Lean acknowledged the proof structure but couldn't independently verify the sorry steps.

When Lean Rejects a Proof

If lake build fails with a type error, it means Deppy generated an invalid proof term. This can happen when:

A Z3 proof term doesn't translate cleanly to omega (non-linear arithmetic, for example).
The type translation introduced a mismatch (e.g., Python's int division vs. Lean's Int division).
A homotopy tactic used a rule that doesn't hold in CIC.

What to do:

Read the error message — Lean's errors are precise and point to exactly which proof step failed.
Try adding hints — sometimes the proof is correct but Lean needs intermediate steps. Add check() annotations in the Python source to give Deppy more stepping stones.
Fall back to Z3 — if a particular guarantee can't be exported to Lean, the Z3_VERIFIED trust level is still meaningful. Not every proof needs the Lean gold standard.
Fill in manually — open the .lean file in VS Code with the Lean 4 extension and fill in the sorrys with manual tactic proofs. This is a valid workflow for critical code.

Never modify the exported .lean file and then claim Deppy verified it. If you manually edit the Lean file, the verification is human-assisted, not automated. Track this in your trust metadata.

Limitations and Honest Assessment

We believe in being forthright about what the Lean export can and cannot do.

What We CAN Export

Simple arithmetic proofs — equalities and inequalities over integers that fall within omega's scope.
Structural induction — proofs by induction on natural numbers or lists.
Type-based proofs — refinement type properties that follow from the function's structure.
Equality chains — proofs built from Refl, Trans, Sym, Cong, FunExt.
Case analysis — pattern matching on constructors, if/else branches.

What We CAN'T Export

Python's full dynamic semantics — duck typing, monkey-patching, eval(), metaclasses. These don't have types in CIC.
Duck typing paths — Deppy's DuckPath proofs establish structural subtyping equivalences. These are modeled as axioms in Lean, not proved from first principles.
Čech cohomology proofs — the sheaf-theoretic gluing used by Deppy has no analog in CIC. These are exported as sorry with an explanatory comment.
Complex loop invariants — while loops require termination proofs and loop invariants that Deppy doesn't always compute automatically.

The Semantic Gap

Lean's Int ≠ Python's int in every respect. While both are arbitrary-precision, the operations differ: Python's // rounds toward negative infinity; Lean's integer division rounds toward zero. Python's % follows the sign of the divisor; Lean follows the sign of the dividend. We model Python ints as mathematical integers and document any semantic gap in the export header.

Sorry Is Not Failure

A partially-proved certificate with sorry is not useless — it documents the proof structure. The theorem statements are precise. The proof steps that are filled in have been checked by Lean. Only the sorry steps remain unverified. You can:

Fill them in manually in Lean (for critical code).
Accept Z3_VERIFIED trust level for those steps.
Use the sorry count as a metric: fewer sorry = higher assurance.

From Homotopy to CIC

This section is for readers who want to understand the deep theory behind the translation. If you're a practitioner who just wants to use the export, the earlier sections suffice.

Two Type Theories

Deppy is built on dependent type theory (HoTT). Its proof terms manipulate paths (equalities as first-class objects), transport (moving data along equalities), and Čech cohomology (decomposing complex properties into locally-provable pieces that are glued together).

Lean 4 is built on the Calculus of Inductive Constructions (CIC). Its equality is propositional (Eq in Prop), proofs are proof-irrelevant (any two proofs of the same proposition are considered equal), and there are no interval variables or cubical primitives.

These are fundamentally different type theories. The translation is therefore lossy — some Deppy proofs cannot be faithfully represented in CIC. The translation preserves as much structure as possible and uses sorry for the genuinely cubical parts.

The Bridge

The key correspondences that make the translation possible:

HoTT Concept	CIC / Lean Equivalent	Faithful?
`PathType` `a` `b`	`Eq` `a` `b` (propositional equality)	✓ for ground types
`Transport` `P` `p` `x`	`Eq.subst` `p` `x` / `▸`	✓
`Refl` / `Sym` / `Trans`	`rfl` / `Eq.symm` / `Eq.trans`	✓
`FunExt`	`funext`	✓ (Lean has funext)
Interval variable `𝕀`	no analog	✗
`hcomp` (homogeneous composition)	no analog	✗ — `sorry`
`Glue` types	no analog	✗ — `sorry`
`CechGlue`	no analog	✗ — `sorry`
Univalence axiom	inconsistent with CIC's UIP	✗ — not exported

What Translates Faithfully

The good news: most proofs Deppy generates for practical Python code don't use the deeply cubical features. They use:

Reflexivity, symmetry, transitivity, congruence — all map directly.
Z3 leaf proofs — map to omega / decide.
Induction on natural numbers or lists — maps to Nat.rec / List.rec.
Function extensionality — Lean has funext.
Transport along equalities — maps to Eq.subst.

These cover the vast majority of refinement-type proofs.

What Doesn't Translate

The cubical features — hcomp, Glue, interval variables — are used in Deppy's more advanced proof strategies, particularly the Čech decomposition. When these appear in a proof:

# Deppy proof term containing cubical operations
proof = CechGlue(
    patches=[p1, p2, p3],
    overlaps=[o12, o23],
    cocycle=c
)

The Lean export produces:

/-
  CechGlue: This proof step uses Čech cohomology gluing,
  which has no analog in CIC. The Deppy proof decomposes
  the property into 3 local patches verified independently,
  then glues them via the sheaf condition.
  Patches proved: p1 ✓, p2 ✓, p3 ✓
  Overlaps checked: o12 ✓, o23 ✓
  Cocycle condition: satisfied
-/
sorry  -- Čech gluing (no CIC analog)

This is honest: the comment documents exactly what was proved using the homotopy methods, even though Lean can't check it. A reader of the Lean file can see the proof structure and decide whether to trust the homotopy reasoning or attempt a CIC-native proof of the same property.

Why not use Cubical Agda? Cubical Agda natively supports HoTT and could check the cubical parts. We chose Lean because of Mathlib's extensive library of mathematical lemmas, which covers far more of the arithmetic and data-structure properties that arise in practice. A future version of Deppy may support a Cubical Agda backend for the homotopy-heavy proofs.

← Previous How Deppy Uses Z3 Next → Under the Hood: Overview