Lean Export & CIC Interpretation

Machine-checked certificates for Python programs

Deppy can construct proofs about your Python code — but how do you know Deppy itself isn't wrong? The answer is independent verification: export the proof to Lean 4, whose tiny trusted kernel re-checks everything from scratch. If Lean accepts the proof, correctness depends only on Lean's kernel — not on Deppy, not on Z3, not on any LLM.

This chapter explains how Deppy proofs map to Lean 4's Calculus of Inductive Constructions (CIC), what the generated files look like, and what remains out of reach.


Why Lean?

Deppy's trust hierarchy has three levels, in ascending order of assurance:

  🟠 LLM_CHECKED   — oracle says correct (confidence ≥ threshold)
  🟡 Z3_VERIFIED    — proved by Z3 SMT solver
  🟢 LEAN_KERNEL_VERIFIED — machine-checked by Lean's kernel

Why is Lean at the top? Because Lean 4's kernel is a small, carefully audited type checker implementing the Calculus of Inductive Constructions. It is roughly 5,000 lines of C++ — small enough to be meaningfully reviewed. Every proof accepted by the kernel has been independently re-derived from axioms; the kernel trusts nothing outside itself.

By contrast, Deppy's proof engine is tens of thousands of lines of Python, wrapping Z3 bindings, homotopy search, and LLM calls. Any bug in that code could cause a false positive. Exporting to Lean gives an independent second opinion: if Lean accepts the proof, correctness doesn't depend on Deppy being bug-free.

This is the same strategy used by proof-carrying code, CompCert, and other high-assurance systems: generate the proof however you like, then let a small trusted checker verify it.

Deppy in CIC

Deppy's proof terms are built from homotopy-flavored constructors — Refl, Trans, Cong, and so on. Each constructor has a natural interpretation in Lean 4's type theory. The following table shows the mapping:

DeppyCIC / Lean 4Notes
@guarantee("P") theorem fn_P : P Postcondition becomes a theorem statement
@requires("Q") (h : Q) → Precondition becomes a hypothesis parameter
Refl rfl / Eq.refl Reflexivity of equality
Trans(p, q) Eq.trans p q Transitivity: a = b → b = c → a = c
Sym(p) Eq.symm p Symmetry: a = b → b = a
Cong(f, p) congrArg f p Congruence: a = b → f a = f b
NatInduction Nat.rec / by induction Structural induction on natural numbers
ListInduction List.rec / by induction Structural induction on lists
Z3Proof by omega / by decide Arithmetic decided by Lean's omega tactic
FunExt funext Function extensionality
Cases by cases / match Case analysis on constructors
DuckPath (axiom) Structural subtyping — modeled as a postulated axiom
CechGlue sorry -- Čech gluing No CIC analog — exported with explanation comment
PathType Eq Propositional equality
Transport Eq.subst / Substitution along an equality proof

Mapping Examples

A Deppy reflexivity proof:

# Deppy proof term
proof = Refl(term=square(x))

becomes in Lean:

-- Lean 4
theorem square_eq : square x = square x := rfl

A transitive chain:

# Deppy
proof = Trans(
    Cong(f, p),   # f(a) = f(b)  from  a = b
    Sym(q)         # f(b) = c     from  c = f(b)
)
-- Lean 4
theorem chain (p : a = b) (q : c = f b) :
    f a = c :=
  Eq.trans (congrArg f p) (Eq.symm q)

A Z3 leaf proof is translated to Lean's omega tactic, which handles linear arithmetic natively:

# Deppy
proof = Z3Proof("x + 1 > x")
-- Lean 4
theorem succ_gt (x : Int) : x + 1 > x := by omega

Type Translation

Python types are mapped to Lean types following a simple, conservative scheme. The goal is not to model every quirk of CPython semantics but to provide a faithful mathematical model of what the Python code intends.

Python TypeLean 4 TypeNotes
int Int Arbitrary-precision integers (matches Python semantics)
bool Bool Truetrue, Falsefalse
str String Unicode strings
list[int] List Int Immutable in proofs (functional model)
Optional[T] Option T NoneOption.none
dict[K, V] K → V Partial function model (total with default)
set[T] Finset T Requires import Mathlib
tuple[A, B] A × B Product type (Prod)

Example translation of a function signature:

def lookup(d: dict[str, int], key: str) -> Optional[int]:
    ...
def lookup (d : StringInt) (key : String) : Option Int :=
  sorry

What a Certificate Looks Like

Let's walk through a complete example. Here is a Python function with two guarantees:

from deppy import guarantee

@guarantee("result >= 0")
@guarantee("result * result <= x")
def isqrt(x: int) -> int:
    if x <= 0:
        return 0
    r = 0
    while (r + 1) * (r + 1) <= x:
        r += 1
    return r

Deppy generates the following Lean 4 file:

/-
  Generated by Deppy v0.1.0
  Source: math_utils.py
  Guarantees: 2 exported, 1 fully proved, 1 sorry
-/
import Mathlib.Tactic.Omega
import Mathlib.Data.Int.Basic

/-- Integer square root -/
def isqrt (x : Int) : Int :=
  if x ≤ 0 then 0
  else
    -- body translated from Python
    sorry  -- loop translation not yet supported

theorem isqrt_nonneg (x : Int) : 0 ≤ isqrt x := by
  unfold isqrt
  split
  · omega
  · sorry  -- requires loop invariant

theorem isqrt_squared (x : Int) (h : 0 ≤ x) :
    isqrt x * isqrt xx := by
  sorry  -- requires loop invariant reasoning

Anatomy of the Certificate

  1. Header comment — records the Deppy version, source file, and a summary of how many guarantees were exported and how many were fully proved. This is metadata for human readers; Lean ignores it.
  2. ImportsMathlib.Tactic.Omega provides the omega tactic for linear arithmetic. Mathlib.Data.Int.Basic gives basic integer lemmas.
  3. Function definition — the Python function is translated to a Lean def. The if branch translates directly. The while loop body is replaced with sorry because Lean requires termination proofs for recursive definitions, and arbitrary while loops don't map cleanly to structural recursion.
  4. Theorem statements — each @guarantee becomes a theorem. The name is derived from the function name and guarantee text. The statement is a direct formalization of the guarantee's meaning.
  5. Proof bodies — where Deppy has a complete proof, the tactic script is emitted. Where the proof depends on parts Deppy can't translate (like loop invariants), sorry is used with a comment explaining what's missing.

A Fully Verified Example

Not every export contains sorry. For functions with simple structure, Deppy can produce a complete Lean proof. Here is the simplest case:

from deppy import guarantee

@guarantee("result >= 0")
def square(x: int) -> int:
    return x * x

The generated Lean file — with zero sorry:

def square (x : Int) : Int := x * x

theorem square_nonneg (x : Int) : 0 ≤ square x := by
  unfold square
  exact mul_self_nonneg x

When Lean processes this file with lake build, it type-checks every definition and theorem. The mul_self_nonneg lemma is provided by Mathlib and states exactly what we need: for any integer x, x * x ≥ 0. No trust in Deppy is required — Lean has independently verified the claim.

The key insight: Deppy doesn't need to be correct — it only needs to produce output that Lean can check. This is why proof generation (hard, complex, potentially buggy) is separated from proof verification (small, trusted, audited).

A slightly more involved example with a precondition:

@guarantee("result > 0")
def abs_plus_one(x: int) -> int:
    return abs(x) + 1
def abs_plus_one (x : Int) : Int := Int.natAbs x + 1

theorem abs_plus_one_pos (x : Int) : 0 < abs_plus_one x := by
  unfold abs_plus_one
  omega

Trust Levels in Detail

Every Lean export carries a trust level that summarizes how much of the proof Lean can independently verify:

Trust LevelMeaningsorry?Custom Axioms?Lean ran?
LEAN_KERNEL_VERIFIED Lean kernel fully checked every step (set by cert.verify_with_lean()) 0 0
LEAN_SYNTAX_COMPLETE No sorry, no raw axioms — but Lean has NOT been invoked. Default after a successful compile_to_lean. 0 0
LEAN_EXPORTED Generated but contains sorry gaps; Lean not invoked or sorry remains. ≥ 1 0
ASSUMPTION_DEPENDENT Uses raw axiom declarations (e.g., for library calls or duck-typing). Lean accepts axioms by definition, so a passing verify_with_lean() still inherits their assumptions. Any ≥ 1
LEAN_REJECTED The lean binary was invoked and rejected the file. See cert.lean_check_stderr for the diagnostic. Any Any ✓ (failed)
LEAN_UNAVAILABLE The lean binary was not found on PATH. No kernel check was attempted; install Lean to enable verification. Any Any

You can inspect the trust level programmatically. Note: compile_to_lean does NOT invoke Lean — it only emits Lean source. To actually run Lean (and obtain the LEAN_KERNEL_VERIFIED label) call cert.verify_with_lean():

from deppy.lean import compile_to_lean

# Step 1: emit Lean source.  Trust starts at LEAN_EXPORTED or
# LEAN_SYNTAX_COMPLETE depending on whether any `sorry` remains.
cert = compile_to_lean(my_function)
print(cert.trust_level)
# → 'LEAN_SYNTAX_COMPLETE'  (no sorry, but Lean hasn't run)

# Step 2: invoke Lean.  Requires `lean` on PATH.
cert = cert.verify_with_lean()
print(cert.trust_level)
# → 'LEAN_KERNEL_VERIFIED'  (Lean accepted)
# or 'LEAN_REJECTED'       (Lean rejected; see cert.lean_check_stderr)
# or 'LEAN_UNAVAILABLE'    (lean binary not on PATH)

print(cert.summary())
# LeanCertificate 'my_function'
#   declarations: 1
#   theorems:     3 (2 proved, 1 sorry)
#   trust level:  LEAN_EXPORTED
Retired label: older docs referred to a single LEAN_VERIFIED trust level that was issued whenever the emitted file had no sorry — even if Lean never ran. That label is retired. The current labels are LEAN_EXPORTED, LEAN_SYNTAX_COMPLETE, LEAN_KERNEL_VERIFIED, LEAN_REJECTED, LEAN_UNAVAILABLE, and ASSUMPTION_DEPENDENT — see Part 4 · Trust Levels for the full taxonomy.

Using the Export

CLI Workflow

The simplest way to export and check proofs:

$ python -m deppy export my_module.py --output my_module.lean
Exported 5 theorems (3 fully proved, 2 sorry)

# Or: render the Lean text directly via the sidecar runner.
# It also runs ``lean`` against the output and writes the JSON sidecar:
$ python -m deppy.pipeline.run_verify my_module.deppy --out my_module.lean
  Lean export round-trip: OK
  ✓ CERTIFIED — top-level certify_or_die: PASS

Setting Up a Lean 4 Project

To check the exported proofs, you need a Lean 4 project with Mathlib. Here is the minimal setup:

-- lakefile.lean
import Lake
open Lake DSL

require mathlib from git
  "https://github.com/leanprover-community/mathlib4"

package myProject

lean_lib MyModule

Then:

$ lake update   # fetch Mathlib (takes a few minutes the first time)
$ lake build    # type-check everything, including your exported file

Interpreting Results

When lake build succeeds, every theorem without sorry has been fully verified by Lean's kernel. Lean will emit warnings for any remaining sorry:

warning: declaration uses 'sorry' [linter.unreachableTactic]

This is expected for LEAN_EXPORTED certificates — it means Lean acknowledged the proof structure but couldn't independently verify the sorry steps.

When Lean Rejects a Proof

If lake build fails with a type error, it means Deppy generated an invalid proof term. This can happen when:

What to do:

  1. Read the error message — Lean's errors are precise and point to exactly which proof step failed.
  2. Try adding hints — sometimes the proof is correct but Lean needs intermediate steps. Add check() annotations in the Python source to give Deppy more stepping stones.
  3. Fall back to Z3 — if a particular guarantee can't be exported to Lean, the Z3_VERIFIED trust level is still meaningful. Not every proof needs the Lean gold standard.
  4. Fill in manually — open the .lean file in VS Code with the Lean 4 extension and fill in the sorrys with manual tactic proofs. This is a valid workflow for critical code.
Never modify the exported .lean file and then claim Deppy verified it. If you manually edit the Lean file, the verification is human-assisted, not automated. Track this in your trust metadata.

Limitations and Honest Assessment

We believe in being forthright about what the Lean export can and cannot do.

What We CAN Export

What We CAN'T Export

The Semantic Gap

Lean's Int ≠ Python's int in every respect. While both are arbitrary-precision, the operations differ: Python's // rounds toward negative infinity; Lean's integer division rounds toward zero. Python's % follows the sign of the divisor; Lean follows the sign of the dividend. We model Python ints as mathematical integers and document any semantic gap in the export header.

Sorry Is Not Failure

A partially-proved certificate with sorry is not useless — it documents the proof structure. The theorem statements are precise. The proof steps that are filled in have been checked by Lean. Only the sorry steps remain unverified. You can:


From Homotopy to CIC

This section is for readers who want to understand the deep theory behind the translation. If you're a practitioner who just wants to use the export, the earlier sections suffice.

Two Type Theories

Deppy is built on dependent type theory (HoTT). Its proof terms manipulate paths (equalities as first-class objects), transport (moving data along equalities), and Čech cohomology (decomposing complex properties into locally-provable pieces that are glued together).

Lean 4 is built on the Calculus of Inductive Constructions (CIC). Its equality is propositional (Eq in Prop), proofs are proof-irrelevant (any two proofs of the same proposition are considered equal), and there are no interval variables or cubical primitives.

These are fundamentally different type theories. The translation is therefore lossy — some Deppy proofs cannot be faithfully represented in CIC. The translation preserves as much structure as possible and uses sorry for the genuinely cubical parts.

The Bridge

The key correspondences that make the translation possible:

HoTT ConceptCIC / Lean EquivalentFaithful?
PathType a b Eq a b (propositional equality) ✓ for ground types
Transport P p x Eq.subst p x /
Refl / Sym / Trans rfl / Eq.symm / Eq.trans
FunExt funext ✓ (Lean has funext)
Interval variable 𝕀 no analog
hcomp (homogeneous composition) no analog ✗ — sorry
Glue types no analog ✗ — sorry
CechGlue no analog ✗ — sorry
Univalence axiom inconsistent with CIC's UIP ✗ — not exported

What Translates Faithfully

The good news: most proofs Deppy generates for practical Python code don't use the deeply cubical features. They use:

These cover the vast majority of refinement-type proofs.

What Doesn't Translate

The cubical features — hcomp, Glue, interval variables — are used in Deppy's more advanced proof strategies, particularly the Čech decomposition. When these appear in a proof:

# Deppy proof term containing cubical operations
proof = CechGlue(
    patches=[p1, p2, p3],
    overlaps=[o12, o23],
    cocycle=c
)

The Lean export produces:

/-
  CechGlue: This proof step uses Čech cohomology gluing,
  which has no analog in CIC. The Deppy proof decomposes
  the property into 3 local patches verified independently,
  then glues them via the sheaf condition.
  Patches proved: p1 ✓, p2 ✓, p3 ✓
  Overlaps checked: o12 ✓, o23 ✓
  Cocycle condition: satisfied
-/
sorry  -- Čech gluing (no CIC analog)

This is honest: the comment documents exactly what was proved using the homotopy methods, even though Lean can't check it. A reader of the Lean file can see the proof structure and decide whether to trust the homotopy reasoning or attempt a CIC-native proof of the same property.

Why not use Cubical Agda? Cubical Agda natively supports HoTT and could check the cubical parts. We chose Lean because of Mathlib's extensive library of mathematical lemmas, which covers far more of the arithmetic and data-structure properties that arise in practice. A future version of Deppy may support a Cubical Agda backend for the homotopy-heavy proofs.