Under the Hood: Overview

How Deppy verifies your Python code — from source text to machine-checked proof term.

Throughout this book we have used @guarantee, assume(), check(), and the homotopy tactics as if they were magic incantations. In this section we pull back the curtain and explain exactly what happens when Deppy processes a decorated Python module. If you have read the F* tutorial's "Under the Hood" chapters, the structure will feel familiar — but the engine is fundamentally different: where F* elaborates to a core calculus and hands everything to Z3, Deppy uses cubical homotopy type theory as its primary proof search strategy and only drops down to Z3 for decidable leaf goals.


Architecture at a Glance

Deppy is structured as a pipeline of four main stages. Each stage is implemented as a pure function that consumes the output of the previous stage, making the whole system easy to test and reason about:

┌─────────────┐     ┌──────────────────┐     ┌─────────────────────┐     ┌──────────────┐
│  Python Src │────▶│  AST + Annots    │────▶│  Proof Obligations  │────▶│  Proof Terms │
│  (.py file) │     │  (typed AST)     │     │  (goals to prove)   │     │  (verified)  │
└─────────────┘     └──────────────────┘     └─────────────────────┘     └──────────────┘
       │                     │                        │                       │
    parse()          elaborate()           discharge()             check_proof()
       │                     │                        │                       │
   Python AST       Bidirectional         Homotopy + Z3           Kernel
   module           type checker          proof search            verification

Let's walk through each stage in detail.

Stage 1: Parsing & Annotation Extraction

Deppy uses Python's built-in ast module to parse source files. This means every valid Python 3.10+ file is accepted — we never require a custom parser. During this stage the system walks the AST, collects @guarantee decorators, assume()/check() calls, and PEP 484 type annotations, and records them as structured metadata attached to each function:

# What the user writes
@guarantee("returns a sorted list with no duplicates")
def unique_sorted(lst: list[int]) -> list[int]:
    return sorted(set(lst))

# What Deppy extracts as metadata
# AnnotatedFunction(
#   name       = "unique_sorted",
#   params     = [Param("lst", ListType(IntType))],
#   ret_type   = ListType(IntType),
#   guarantees = [NLSpec("returns a sorted list with no duplicates")],
#   assumes    = [],
#   checks     = [],
# )
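The extraction step needs nothing beyond the standard library. Here is a minimal, hypothetical sketch of the decorator scan using ast (the extract_guarantees helper is illustrative, not Deppy's actual API):

```python
import ast

SRC = '''
@guarantee("returns a sorted list with no duplicates")
def unique_sorted(lst: list[int]) -> list[int]:
    return sorted(set(lst))
'''

def extract_guarantees(source: str) -> dict[str, list[str]]:
    """Map each function name to the string arguments of its @guarantee decorators."""
    tree = ast.parse(source)
    out: dict[str, list[str]] = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            out[node.name] = [
                arg.value
                for dec in node.decorator_list
                if isinstance(dec, ast.Call)
                and isinstance(dec.func, ast.Name)
                and dec.func.id == "guarantee"
                for arg in dec.args
                if isinstance(arg, ast.Constant) and isinstance(arg.value, str)
            ]
    return out

print(extract_guarantees(SRC))
# → {'unique_sorted': ['returns a sorted list with no duplicates']}
```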

Stage 2: Elaboration & Type Checking

The elaborator transforms the annotated AST into a fully typed core term. Deppy uses bidirectional type checking: every sub-expression is processed in either check mode (we know the expected type and push it down) or synth mode (we infer the type from the expression and push it up).

# Bidirectional rules (simplified)
#
#   Γ ⊢ e ⇐ A      "check e against type A"
#   Γ ⊢ e ⇒ A      "synthesize type A from e"
#
# Key rules:
#   CHECK-LAM:   Γ, x:A ⊢ body ⇐ B
#                ──────────────────────
#                Γ ⊢ λx.body ⇐ (A → B)
#
#   SYNTH-APP:   Γ ⊢ f ⇒ (A → B)    Γ ⊢ arg ⇐ A
#                ────────────────────────────────
#                Γ ⊢ f(arg) ⇒ B
#
#   CHECK-PATH:  Γ, i:𝕀 ⊢ body ⇐ A
#                body[0/i] ≡ a    body[1/i] ≡ b
#                ────────────────────────────────
#                Γ ⊢ <i>body ⇐ Path A a b

The cubical extension is where Deppy diverges from conventional type checkers. The interval variable i : 𝕀 ranges over the abstract interval [0, 1] and is used to construct paths — proofs of equality. During elaboration, interval substitution (body[0/i] and body[1/i]) is performed to verify that a path's endpoints match the claimed equality.

Stage 3: Proof Obligation Discharge

After elaboration, Deppy has a list of proof obligations — goals that must be proved for the program to be considered correct. Each obligation is a ProofGoal:

@dataclass
class ProofGoal:
    context: list[Hypothesis]   # known facts (Γ)
    goal_type: SynType          # what we need to prove
    source: SourceLocation      # where in the .py file
    strategy: Strategy          # HomotopyFirst | Z3Only | Manual

The discharge engine tries strategies in order:

  1. HomotopyStrategy — the primary strategy. Attempts to build a proof term using path algebra, transport, Čech decomposition, or fibration reasoning. If a sub-goal is a decidable arithmetic/boolean formula, it delegates to Z3Strategy for that leaf.
  2. Z3Strategy — encodes the goal as an SMT query and calls Z3. Used directly for goals marked Z3Only or as a leaf solver within the homotopy strategy.
  3. ManualStrategy — the user supplies a proof term in a sidecar file. Deppy type-checks the term against the goal.

This layered approach is the key difference from F*. In F*, all proof obligations go to Z3 (with optional tactic pre-processing). In Deppy, Z3 is a leaf solver — the top-level proof structure is always a homotopy proof term.

Stage 4: Kernel Verification

Every proof term produced by Stage 3 is independently verified by the proof kernel. The kernel is the trusted computing base (TCB) of the entire system — if you trust the kernel, you can trust every verification result. The kernel is described in detail below.


The Trusted Computing Base

The proof kernel is deliberately small: approximately 2,500 lines of Python in deppy/core/kernel.py, plus deppy/core/formal_ops.py and deppy/core/types.py (the syntactic term / type representation). Everything outside those three files — the elaborator, the proof search strategies, the Z3 bridge, the Lean translator, the caching layer, the whole deppy/pipeline/ subsystem — is untrusted. If a bug outside the TCB produces a bogus proof term, the kernel rejects it; if a bug inside the TCB accepts a bogus proof, that's a soundness bug.

The kernel accepts a ProofTerm and a Judgment (the goal) and returns a VerificationResult: either Verified, or Rejected together with the reason the term failed to check.

The kernel operates by structural recursion on the proof term, checking each constructor against its corresponding typing rule. The full proof-term constructor roster (defined in deppy/core/kernel.py) is tabulated in the next section. Each constructor has a corresponding _verify_* method on ProofKernel; several were tightened in the recent honesty pass and now fail rather than silently downgrade.

Two kernels, not one. Deppy's own kernel (trust level KERNEL) is independent of Lean's kernel (trust label LEAN_KERNEL_VERIFIED). A proof can be KERNEL-verified by deppy — meaning every step was checked by ProofKernel.verify — without Lean ever running. To get Lean's assurance, call cert.verify_with_lean(). See Lean export.

Proof Term Constructors

Deppy's proof terms form a small but expressive language. Each constructor corresponds to a rule of cubical homotopy type theory, extended with domain-specific constructs for Python's type system:

Proof Term | Typing Rule | Intuition
---------- | ----------- | ---------
Refl(a) | Γ ⊢ Refl(a) : Path A a a | Every value is equal to itself
Sym(p) | p : Path A a b ⊢ Sym(p) : Path A b a | Equality is symmetric
Trans(p, q) | p : Path A a b, q : Path A b c ⊢ Trans(p,q) : Path A a c | Equality is transitive
Cong(f, p) | f : A→B, p : Path A a b ⊢ Cong(f,p) : Path B (f a) (f b) | Functions respect equality
Ap(f, a) | f : (x:A)→B(x) ⊢ Ap(f,a) : B(a) | Function application
PathComp(p, q) | p : Path A a b, q : Path A b c ⊢ PathComp(p,q) : Path A a c | Path composition (cubical)
FunExt(h) | h : (x:A) → Path B (f x) (g x) ⊢ FunExt(h) : Path (A→B) f g | Functions equal pointwise are equal
CechGlue(patches, gluing) | patches : [Patch_i], gluing : Agreement ⊢ CechGlue(patches,gluing) : Path A a b | Glue local proofs into a global proof
TransportProof(p, x) | p : Path Type A B, x : A ⊢ TransportProof(p,x) : B | Move a value along a type equality
Fiber(f, b, a, p) | f : A→B, p : Path B (f a) b ⊢ Fiber(f,b,a,p) : Fib(f,b) | Witness that a is in the fiber of f over b
Univalence(equiv) | equiv : A ≃ B ⊢ Univalence(equiv) : Path Type A B | Equivalent types are equal
DuckPath(proto, cls, wit) | wit : Implements(cls, proto) ⊢ DuckPath(proto,cls,wit) : Path Type cls proto | A class implementing a Protocol is "equal" to it

How the Kernel Verifies Each Proof Term

The kernel's core function, verify_proof, is a recursive match on the proof term constructor. Here is a simplified sketch (the real implementation handles more edge cases):

def verify_proof(
    ctx: Context,
    proof: ProofTerm,
    goal: SynType
) -> VerificationResult:
    match proof:
        case Refl(a):
            # goal must be Path(A, a, a) for some A
            if not is_path_type(goal):
                return Rejected("Refl requires Path type")
            A, lhs, rhs = decompose_path(goal)
            if not definitionally_equal(ctx, lhs, rhs):
                return Rejected(f"endpoints not equal: {lhs} ≠ {rhs}")
            if not definitionally_equal(ctx, a, lhs):
                return Rejected(f"Refl witness {a} ≠ endpoint {lhs}")
            return Verified()

        case Trans(p, q):
            # goal must be Path(A, a, c)
            # p must prove Path(A, a, b) for some b
            # q must prove Path(A, b, c)
            A, a, c = decompose_path(goal)
            b = fresh_meta()  # infer the midpoint
            r1 = verify_proof(ctx, p, PathType(A, a, b))
            r2 = verify_proof(ctx, q, PathType(A, b, c))
            return combine_results(r1, r2)

        case CechGlue(patches, gluing):
            # Each patch proves the goal on a subdomain
            # Gluing data proves patches agree on overlaps
            for patch in patches:
                r = verify_proof(ctx, patch.proof, patch.local_goal)
                if r.is_rejected:
                    return r
            for overlap in gluing.overlaps:
                r = verify_agreement(ctx, overlap)
                if r.is_rejected:
                    return r
            return Verified()

        case DuckPath(proto, cls, witness):
            # Verify that cls implements every method of proto
            for method in proto.required_methods:
                if not has_compatible_method(cls, method):
                    return Rejected(
                        f"{cls.name} lacks method {method.name}"
                    )
            return Verified()

        # ... remaining constructors ...

The kernel never calls Z3. It only does structural/definitional equality checks. If a proof term contains a Z3Leaf node, the kernel trusts that node only if it was produced by Deppy's own Z3 integration and carries a valid Z3Certificate (a hash of the query + result). This is the single trust boundary between the kernel and the SMT solver.
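The chapter describes a Z3Certificate only as "a hash of the query + result". One plausible sketch of that check (the class shape and helper names here are assumptions, not Deppy's actual code):

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Z3Certificate:
    digest: str

def make_certificate(smtlib_query: str, result: str) -> Z3Certificate:
    """Bind a solver verdict to the exact query text that produced it."""
    payload = smtlib_query.encode() + b"\x00" + result.encode()
    return Z3Certificate(hashlib.sha256(payload).hexdigest())

def certificate_valid(cert: Z3Certificate, smtlib_query: str, result: str) -> bool:
    """The kernel's side of the trust boundary: recompute the hash and compare."""
    return make_certificate(smtlib_query, result) == cert

cert = make_certificate("(assert (>= x 0))", "unsat")
print(certificate_valid(cert, "(assert (>= x 0))", "unsat"))  # → True
print(certificate_valid(cert, "(assert (>= x 1))", "unsat"))  # → False
```

Tampering with either the query or the recorded result changes the digest, so a stale or forged Z3Leaf fails the kernel's check.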

The Type Checker in Depth

Bidirectional Checking

Bidirectional type checking has two mutually recursive judgments:

  1. Check (Γ ⊢ e ⇐ A) — the expected type is known and is pushed down into the expression.
  2. Synth (Γ ⊢ e ⇒ A) — the type is inferred from the expression and pushed up.

The key advantage is that type annotations on function parameters flow downward into the body, eliminating many of the intermediate annotations that a purely inference-based system would require.

def check(ctx: Context, expr: Expr, expected: SynType) -> CheckResult:
    """Check mode: verify expr has type `expected`."""
    match expr:
        case Lambda(param, body):
            # CHECK-LAM rule
            dom, cod = decompose_arrow(expected)
            new_ctx = ctx.extend(param, dom)
            return check(new_ctx, body, cod)

        case PathAbstraction(ivar, body):
            # CHECK-PATH rule
            A, a, b = decompose_path(expected)
            new_ctx = ctx.extend_interval(ivar)
            result = check(new_ctx, body, A)
            # Verify endpoints
            left  = subst_interval(body, ivar, 0)
            right = subst_interval(body, ivar, 1)
            if not definitionally_equal(ctx, left, a):
                return CheckFailed(f"left endpoint: {left} ≠ {a}")
            if not definitionally_equal(ctx, right, b):
                return CheckFailed(f"right endpoint: {right} ≠ {b}")
            return result

        case _:
            # Fall through to synth + subsumption
            inferred = synth(ctx, expr)
            return subtype_check(ctx, inferred, expected)

Cubical Typing Rules

The cubical layer adds three key constructs to the standard type theory: the interval 𝕀 with endpoints 0 and 1; path types Path A a b, together with path abstraction <i>body and path application p @ r; and transport along type-level paths. Interval substitution ties them together:

# Interval substitution example
#
# Given:  path p = <i> (2 * i + 3)
#
#   p @ 0  =  2 * 0 + 3  =  3
#   p @ 1  =  2 * 1 + 3  =  5
#
# So p : Path Int 3 5  (a proof that 3 can be connected to 5
#                       via the continuous family 2*i+3)

Interval Substitution

Interval substitution is the core operation of the cubical type checker. When we encounter p @ r, we substitute r for the path's interval variable throughout the body:

def subst_interval(
    body: Expr,
    ivar: IntervalVar,
    value: IntervalExpr
) -> Expr:
    """Replace every occurrence of `ivar` in `body` with `value`."""
    match body:
        case IVar(name) if name == ivar:
            return interval_to_expr(value)
        case BinOp(left, op, right):
            return BinOp(
                subst_interval(left, ivar, value),
                op,
                subst_interval(right, ivar, value)
            )
        case PathApp(path, arg):
            return PathApp(
                subst_interval(path, ivar, value),
                subst_interval_expr(arg, ivar, value)
            )
        # ... structural recursion on all other cases
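The endpoint computation from the earlier comment block can be made concrete with a toy model (tuple-based expression trees; subst and ev are hypothetical stand-ins for Deppy's real machinery):

```python
def subst(expr, ivar, r):
    """Substitute interval value r for ivar in a tiny tuple-based expr tree."""
    if expr == ivar:
        return r
    if isinstance(expr, tuple):                  # shape: ("op", left, right)
        op, lhs, rhs = expr
        return (op, subst(lhs, ivar, r), subst(rhs, ivar, r))
    return expr                                  # literal: unchanged

def ev(expr):
    """Evaluate a fully substituted expression."""
    if isinstance(expr, tuple):
        op, lhs, rhs = expr
        return ev(lhs) + ev(rhs) if op == "+" else ev(lhs) * ev(rhs)
    return expr

body = ("+", ("*", 2, "i"), 3)                   # path p = <i> (2 * i + 3)
print(ev(subst(body, "i", 0)), ev(subst(body, "i", 1)))  # → 3 5
```

This is exactly the check behind CHECK-PATH: the two substitutions must reduce to the claimed endpoints.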

The Verification Pipeline

Strategy Dispatch

When a proof goal is generated, Deppy's StrategyDispatcher classifies it and selects the appropriate proof search strategy:

class StrategyDispatcher:
    def dispatch(self, goal: ProofGoal) -> Strategy:
        # 1. If the goal has a sidecar proof → Manual
        #    (checked first so a user-supplied proof always takes precedence)
        if has_sidecar(goal):
            return ManualStrategy(sidecar=goal.sidecar_path)

        # 2. If the goal is purely arithmetic/boolean → Z3
        if is_decidable_fragment(goal):
            return Z3Strategy(timeout=5000)

        # 3. If the goal involves equality/paths → Homotopy
        if involves_paths(goal) or involves_transport(goal):
            return HomotopyStrategy(
                leaf_solver=Z3Strategy(timeout=2000),
                max_depth=20
            )

        # 4. Default: Homotopy with Z3 leaves
        return HomotopyStrategy(
            leaf_solver=Z3Strategy(timeout=5000),
            max_depth=30
        )

The HomotopyStrategy searches for a proof term by exploring the space of possible proof constructions. At each step, it tries the following tactics (in order):

  1. Reflexivity — if the two sides are definitionally equal, emit Refl.
  2. Congruence — if both sides are applications f(a) and f(b), reduce to proving a = b.
  3. Transport — if there's a known type equality Path Type A B in context, try transporting the left side along it.
  4. Čech decomposition — split the domain into patches, prove on each patch, then glue with CechGlue.
  5. Function extensionality — if both sides are functions, try FunExt and prove pointwise equality.
  6. Z3 leaf — if the goal is arithmetic/boolean, delegate to Z3 and wrap the result in a Z3Leaf proof term.

class HomotopyStrategy:
    def search(self, ctx: Context, goal: ProofGoal, depth: int) -> Optional[ProofTerm]:
        if depth > self.max_depth:
            return None

        # Try tactics in order of likelihood
        for tactic in [
            self.try_refl,
            self.try_congruence,
            self.try_transport,
            self.try_cech,
            self.try_funext,
            self.try_z3_leaf,
        ]:
            result = tactic(ctx, goal, depth)
            if result is not None:
                return result

        return None  # exhausted all tactics

ProofTransportCache

Many programs generate similar or identical proof obligations. Deppy memoizes proof search results in a ProofTransportCache keyed by the canonical form of the proof goal (context + goal type, modulo alpha equivalence and definitional equality):

class ProofTransportCache:
    def __init__(self):
        self._cache: dict[int, ProofTerm] = {}

    def lookup(self, goal: ProofGoal) -> Optional[ProofTerm]:
        key = canonical_hash(goal)
        return self._cache.get(key)

    def store(self, goal: ProofGoal, proof: ProofTerm):
        key = canonical_hash(goal)
        self._cache[key] = proof

    def transport(
        self, old_goal: ProofGoal, new_goal: ProofGoal
    ) -> Optional[ProofTerm]:
        """Try to adapt a cached proof to a new goal via transport."""
        cached = self.lookup(old_goal)
        if cached is None:
            return None
        # Build a transport path between the goals
        path = find_goal_path(old_goal, new_goal)
        if path is None:
            return None
        return TransportProof(path, cached)

The transport method is what makes the cache especially powerful. If you've proved a goal for list[int] and now need the same proof for list[float], Deppy can often build a transport path between the two and reuse the original proof — no re-searching required.
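canonical_hash is not defined in this chapter; a toy version that captures the "modulo alpha equivalence" part might rename variables in first-use order before hashing (flat token lists stand in for real goal terms, so this helper is an assumption):

```python
import hashlib

def canonicalize(tokens: list[str]) -> list[str]:
    """Alpha-normalize: rename identifiers to x0, x1, ... in first-use order.
    (Toy model: a goal is a flat token list; real terms need scope tracking.)"""
    names: dict[str, str] = {}
    out = []
    for tok in tokens:
        if tok.isidentifier():
            names.setdefault(tok, f"x{len(names)}")
            out.append(names[tok])
        else:
            out.append(tok)
    return out

def canonical_hash(tokens: list[str]) -> str:
    return hashlib.sha256(" ".join(canonicalize(tokens)).encode()).hexdigest()

# Alpha-equivalent goals collide on purpose:
print(canonical_hash(["a", "=", "a"]) == canonical_hash(["b", "=", "b"]))  # → True
```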

Čech Decomposition Engine

The Čech decomposition engine splits a complex proof goal into simpler "patches" that can be proved independently, then glues the results together. This directly mirrors the mathematical concept of a Čech cover in algebraic topology:

class CechEngine:
    def decompose(self, goal: ProofGoal) -> Optional[CechCover]:
        """Split a goal into patches + overlap conditions."""
        match goal.goal_type:
            case ForAll(var, domain, body):
                # Partition the domain into subdomains
                patches = self.partition_domain(domain)
                overlaps = self.compute_overlaps(patches)
                return CechCover(
                    patches=[
                        Patch(
                            subdomain=p,
                            local_goal=restrict_goal(goal, var, p)
                        )
                        for p in patches
                    ],
                    overlaps=overlaps
                )
            case Conjunction(left, right):
                # Two patches, one for each conjunct
                return CechCover(
                    patches=[
                        Patch(subdomain="left", local_goal=left),
                        Patch(subdomain="right", local_goal=right),
                    ],
                    overlaps=[]
                )
            case _:
                return None  # cannot decompose
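As a concrete instance of what partition_domain and compute_overlaps might do for integer-interval domains (both implementations below are hypothetical), patches are built with one-point overlaps so the agreement checks have something to compare:

```python
def partition_domain(lo: int, hi: int, k: int = 2) -> list[tuple[int, int]]:
    """Split the half-open interval [lo, hi) into k overlapping patches."""
    step = (hi - lo) // k
    patches = []
    for j in range(k):
        start = lo + j * step
        end = hi if j == k - 1 else start + step + 1   # +1 → one-point overlap
        patches.append((start, end))
    return patches

def compute_overlaps(patches: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Intersections of consecutive patches: where gluing agreement is checked."""
    return [
        (max(a1, a2), min(b1, b2))
        for (a1, b1), (a2, b2) in zip(patches, patches[1:])
    ]

print(partition_domain(0, 10))                    # → [(0, 6), (5, 10)]
print(compute_overlaps(partition_domain(0, 10)))  # → [(5, 6)]
```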

Performance Considerations

Verification Time vs. Proof Complexity

The time Deppy takes to verify a function depends on several factors:

Factor | Impact | Typical Range
------ | ------ | -------------
Number of proof obligations | Linear | 1–50 per function
Homotopy search depth | Exponential (but pruned) | 1–30 steps
Z3 query complexity | Variable (decidability) | 1 ms – 5 s per query
Čech patch count | Quadratic (overlap checks) | 2–8 patches
Cache hit rate | Dramatic speedup | 40–90% in practice

Caching Strategies

Deppy uses three levels of caching:

  1. ProofTransportCache — memoizes proof terms keyed by canonical goal form. Supports transport-based reuse across structurally similar goals.
  2. Z3ResultCache — caches Z3 solver results keyed by the SMT-LIB query string. Avoids re-running the solver for identical queries.
  3. DiskCache — persists verification results to .deppy_cache/. On subsequent runs, only re-verifies functions whose source code (or dependencies) has changed.
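The DiskCache level can be sketched with a source-fingerprint key (the file layout, class shape, and key scheme below are assumptions; only the .deppy_cache/ directory name comes from the text):

```python
import hashlib
import json
from pathlib import Path

class DiskCache:
    """Toy sketch of .deppy_cache/ persistence keyed by source fingerprints."""

    def __init__(self, root: str = ".deppy_cache"):
        self.path = Path(root) / "results.json"
        self.data = (
            json.loads(self.path.read_text()) if self.path.exists() else {}
        )

    @staticmethod
    def key(func_source: str, dep_sources: list[str]) -> str:
        """Fingerprint a function by its own source plus its dependencies'."""
        h = hashlib.sha256()
        for src in [func_source, *dep_sources]:
            h.update(src.encode())
            h.update(b"\x00")          # separator: avoid boundary collisions
        return h.hexdigest()

    def get(self, key: str):
        return self.data.get(key)

    def put(self, key: str, result: str) -> None:
        self.data[key] = result
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(self.data))
```

Changing a function's source, or any dependency's, changes the key, so stale results simply miss the cache rather than needing explicit invalidation.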

Parallel Verification

Deppy can verify independent functions in parallel using Python's concurrent.futures. Since each function's verification is essentially a pure computation (given the cache state), this parallelism is safe:

from concurrent.futures import ProcessPoolExecutor, TimeoutError, as_completed

def verify_module_parallel(
    module: AnnotatedModule,
    workers: int = 4
) -> list[VerificationResult]:
    goals = generate_obligations(module)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        futures = {
            pool.submit(discharge_goal, g): g
            for g in goals
        }
        results = []
        for future in as_completed(futures):
            goal = futures[future]
            try:
                result = future.result(timeout=60)
                results.append(result)
            except TimeoutError:
                results.append(Timeout(goal))
    return results

Comparison with F*'s Architecture

It is instructive to compare Deppy's pipeline with F*'s:

F*

Source (.fst)
     │
     ▼
Desugaring
     │
     ▼
Elaboration → Core F*
     │
     ▼
VC Generation
     │
     ▼
Z3 (all goals)
     │
     ▼
Result

Deppy

Source (.py)
     │
     ▼
AST + Annotation
     │
     ▼
Bidir. Elaboration
     │
     ▼
Homotopy Search ◀──▶ Z3 (leaves)
     │
     ▼
Proof Term
     │
     ▼
Kernel
     │
     ▼
Result

Key differences:

Aspect | F* | Deppy
------ | -- | -----
Primary solver | Z3 (SMT) | Cubical homotopy search
Z3's role | Sole decision procedure | Leaf solver for decidable fragments
Proof terms | Optional (tactics generate them) | Always produced, always kernel-checked
Equality theory | Propositional (with ==) | Cubical paths (with transport, ua)
TCB size | ~30k LOC (OCaml) + Z3 | ~2.5k LOC (Python) + Z3
Source language | F* (custom syntax) | Python 3.10+ (standard syntax)
Duck typing | Not applicable | First-class via DuckPath
Topology | None | Čech covers, fibrations, univalence

F*'s approach — sending everything to Z3 — works extremely well for the kinds of software verification F* targets (cryptographic protocols, low-level systems code). Deppy's approach is designed for a different niche: Python codebases where duck typing, dynamic dispatch, and structural equivalences are common, and where a topological proof strategy can exploit the inherent structure.

Elaboration: F* vs. Deppy

In F*, elaboration transforms surface syntax into a core calculus with explicit universe levels, effect annotations, and implicit argument resolution. The elaborated term is then sent through VC generation, which produces first-order logic formulas for Z3.

In Deppy, elaboration is simpler because we piggyback on Python's existing type system (PEP 484 annotations, typing module). The elaborator's main job is to:

  1. Resolve decorators (@guarantee, @spec) to their specification semantics.
  2. Infer refinement types from assume() / check() calls.
  3. Construct the interval context for cubical operations.
  4. Generate proof obligations (the "VCs" of Deppy).

The key difference is that Deppy's proof obligations are not first-order formulas — they are typing judgments in cubical type theory, which are then solved by homotopy search.
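The refinement-inference step (item 2 in the list above) starts with a scan much like Stage 1's decorator extraction. A hypothetical sketch of the scanning half (collect_assumes is illustrative, not Deppy's API):

```python
import ast

SRC = '''
def inv(x: float) -> float:
    assume(x != 0)
    return 1 / x
'''

def collect_assumes(source: str) -> list[str]:
    """Collect the conditions passed to assume() calls, as source text."""
    out = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "assume"):
            out.append(ast.unparse(node.args[0]))
    return out

print(collect_assumes(SRC))  # → ['x != 0']
```

Each collected condition then becomes part of a parameter's refinement, e.g. x : {v: float | v != 0}.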

Tactics: F* Meta-F* vs. Deppy Homotopy Tactics

F*'s tactic system (Meta-F*) lets users write custom proof automation in F* itself. These tactics manipulate proof state imperatively and can call Z3 as a sub-step.

Deppy's "tactics" are the proof search strategies described above. They are not user-programmable in the same way (though users can supply manual proofs via sidecar files). The key difference is that Deppy's tactics always produce a proof term that the kernel can independently verify, whereas F*'s tactics produce proof scripts that are replayed.


Putting It All Together

Let's trace through a complete verification of a simple function:

@guarantee("result >= 0")
def abs_val(x: int) -> int:
    if x >= 0:
        return x
    else:
        return -x

Verification Trace

  1. Parse — Extract @guarantee("result >= 0") and attach it as a postcondition annotation to abs_val.
  2. Elaborate — The NL spec parser converts "result >= 0" to the refinement type {v: int | v >= 0}. The return type is refined: abs_val : (x: int) → {v: int | v >= 0}.
  3. Generate obligations — Two proof goals:
       Goal A: x >= 0 ⊢ x >= 0   (then-branch)
       Goal B: x < 0 ⊢ -x >= 0   (else-branch)
  4. Dispatch — Both goals are decidable arithmetic. StrategyDispatcher selects Z3Strategy directly.
  5. Z3 discharge — Goal A: Z3 proves (x ≥ 0) ⇒ (x ≥ 0) trivially. Goal B: Z3 proves (x < 0) ⇒ (-x ≥ 0). Both return Z3Leaf proof terms with certificates.
  6. Kernel check — The kernel verifies each Z3Leaf by checking the certificate hash. Result: Verified.
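The exact SMT-LIB encoding Deppy emits is not shown in this chapter, but for Goal B the query would be in this spirit: assert the negation of the implication and expect unsat, which establishes validity.

```
(declare-const x Int)
(assert (not (=> (< x 0) (>= (- x) 0))))
(check-sat)
; unsat  =>  (x < 0) ⇒ (-x >= 0) is valid, so the else-branch guarantee holds
```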

For a more complex example involving homotopy reasoning, see the next chapter on Z3 integration, where we trace through a function that requires both Z3 leaves and homotopy gluing.