Under the Hood: Overview

How Deppy verifies your Python code — from source text to machine-checked proof term.

Throughout this book we have used @guarantee, assume(), check(), and the homotopy tactics as if they were magic incantations. In this section we pull back the curtain and explain exactly what happens when Deppy processes a decorated Python module. If you have read the F* tutorial's "Under the Hood" chapters, the structure will feel familiar — but the engine is fundamentally different: where F* elaborates to a core calculus and hands everything to Z3, Deppy uses cubical homotopy type theory as its primary proof search strategy and only drops down to Z3 for decidable leaf goals.


Architecture at a Glance

Deppy is structured as a pipeline of four main stages. Each stage is implemented as a pure function that consumes the output of the previous stage, making the whole system easy to test and reason about:

┌─────────────┐     ┌──────────────────┐     ┌─────────────────────┐     ┌──────────────┐
│  Python Src │────▶│  AST + Annots    │────▶│  Proof Obligations  │────▶│  Proof Terms │
│  (.py file) │     │  (typed AST)     │     │  (goals to prove)   │     │  (verified)  │
└─────────────┘     └──────────────────┘     └─────────────────────┘     └──────────────┘
       │                     │                        │                       │
    parse()          elaborate()           discharge()             check_proof()
       │                     │                        │                       │
   Python AST       Bidirectional         Homotopy + Z3           Kernel
   module           type checker          proof search            verification

Let's walk through each stage in detail.

Stage 1: Parsing & Annotation Extraction

Deppy uses Python's built-in ast module to parse source files. This means every valid Python 3.10+ file is accepted — we never require a custom parser. During this stage the system walks the AST, collects @guarantee decorators, assume()/check() calls, and PEP 484 type annotations, and records them as structured metadata attached to each function:

# What the user writes
@guarantee("returns a sorted list with no duplicates")
def unique_sorted(lst: list[int]) -> list[int]:
    return sorted(set(lst))

# What Deppy extracts as metadata
# AnnotatedFunction(
#   name       = "unique_sorted",
#   params     = [Param("lst", ListType(IntType))],
#   ret_type   = ListType(IntType),
#   guarantees = [NLSpec("returns a sorted list with no duplicates")],
#   assumes    = [],
#   checks     = [],
# )
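The extraction step needs nothing beyond the standard library. Here is a minimal, hypothetical sketch of the decorator scan using ast (the extract_guarantees helper is illustrative, not Deppy's actual API):

```python
import ast

SRC = '''
@guarantee("returns a sorted list with no duplicates")
def unique_sorted(lst: list[int]) -> list[int]:
    return sorted(set(lst))
'''

def extract_guarantees(source: str) -> dict[str, list[str]]:
    """Map each function name to the string arguments of its @guarantee decorators."""
    tree = ast.parse(source)
    out: dict[str, list[str]] = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            out[node.name] = [
                arg.value
                for dec in node.decorator_list
                if isinstance(dec, ast.Call)
                and isinstance(dec.func, ast.Name)
                and dec.func.id == "guarantee"
                for arg in dec.args
                if isinstance(arg, ast.Constant) and isinstance(arg.value, str)
            ]
    return out

print(extract_guarantees(SRC))
# → {'unique_sorted': ['returns a sorted list with no duplicates']}
```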

Stage 2: Elaboration & Type Checking

The elaborator transforms the annotated AST into a fully typed core term. Deppy uses bidirectional type checking: every sub-expression is processed in either check mode (we know the expected type and push it down) or synth mode (we infer the type from the expression and push it up).

# Bidirectional rules (simplified)
#
#   Γ ⊢ e ⇐ A      "check e against type A"
#   Γ ⊢ e ⇒ A      "synthesize type A from e"
#
# Key rules:
#   CHECK-LAM:   Γ, x:A ⊢ body ⇐ B
#                ──────────────────────
#                Γ ⊢ λx.body ⇐ (A → B)
#
#   SYNTH-APP:   Γ ⊢ f ⇒ (A → B)    Γ ⊢ arg ⇐ A
#                ────────────────────────────────
#                Γ ⊢ f(arg) ⇒ B
#
#   CHECK-PATH:  Γ, i:𝕀 ⊢ body ⇐ A
#                body[0/i] ≡ a    body[1/i] ≡ b
#                ────────────────────────────────
#                Γ ⊢ <i>body ⇐ Path A a b

The cubical extension is where Deppy diverges from conventional type checkers. The interval variable i : 𝕀 ranges over the abstract interval [0, 1] and is used to construct paths — proofs of equality. During elaboration, interval substitution (body[0/i] and body[1/i]) is performed to verify that a path's endpoints match the claimed equality.

Stage 3: Proof Obligation Discharge

After elaboration, Deppy has a list of proof obligations — goals that must be proved for the program to be considered correct. Each obligation is a ProofGoal:

@dataclass
class ProofGoal:
    context: list[Hypothesis]   # known facts (Γ)
    goal_type: SynType          # what we need to prove
    source: SourceLocation      # where in the .py file
    strategy: Strategy          # HomotopyFirst | Z3Only | Manual

The discharge engine tries strategies in order:

  1. HomotopyStrategy — the primary strategy. Attempts to build a proof term using path algebra, transport, Čech decomposition, or fibration reasoning. If a sub-goal is a decidable arithmetic/boolean formula, it delegates to Z3Strategy for that leaf.
  2. Z3Strategy — encodes the goal as an SMT query and calls Z3. Used directly for goals marked Z3Only or as a leaf solver within the homotopy strategy.
  3. ManualStrategy — the user supplies a proof term in a sidecar file. Deppy type-checks the term against the goal.

This layered approach is the key difference from F*. In F*, all proof obligations go to Z3 (with optional tactic pre-processing). In Deppy, Z3 is a leaf solver — the top-level proof structure is always a homotopy proof term.

Stage 4: Kernel Verification

Every proof term produced by Stage 3 is independently verified by the proof kernel. The kernel is the trusted computing base (TCB) of the entire system — if you trust the kernel, you can trust every verification result. The kernel is described in detail below.


The Trusted Computing Base

The proof kernel is deliberately small: approximately 2,500 lines of Python in deppy/core/kernel.py, plus deppy/core/formal_ops.py and deppy/core/types.py (the syntactic term / type representation). Everything outside those three files — the elaborator, the proof search strategies, the Z3 bridge, the Lean translator, the caching layer, the whole deppy/pipeline/ subsystem — is untrusted. If a bug outside the TCB produces a bogus proof term, the kernel rejects it; if a bug inside the TCB accepts a bogus proof, that's a soundness bug.

The kernel accepts a ProofTerm and a Judgment (the goal) and returns a VerificationResult: either Verified, or Rejected together with the reason the term failed to check.

The kernel operates by structural recursion on the proof term, checking each constructor against its corresponding typing rule. The full proof-term constructor roster (defined in deppy/core/kernel.py) is tabulated in the next section. Each constructor has a corresponding _verify_* method on ProofKernel; several were tightened in the recent honesty pass and now fail rather than silently downgrade.

Two kernels, not one. Deppy's own kernel (trust level KERNEL) is independent of Lean's kernel (trust label LEAN_KERNEL_VERIFIED). A proof can be KERNEL-verified by deppy — meaning every step was checked by ProofKernel.verify — without Lean ever running. To get Lean's assurance, call cert.verify_with_lean(). See Lean export.

Proof Term Constructors

Deppy's proof terms form a small but expressive language. Each constructor corresponds to a rule of cubical homotopy type theory, extended with domain-specific constructs for Python's type system:

Proof Term | Typing Rule | Intuition
---------- | ----------- | ---------
Refl(a) | Γ ⊢ Refl(a) : Path A a a | Every value is equal to itself
Sym(p) | p : Path A a b ⊢ Sym(p) : Path A b a | Equality is symmetric
Trans(p, q) | p : Path A a b, q : Path A b c ⊢ Trans(p,q) : Path A a c | Equality is transitive
Cong(f, p) | f : A→B, p : Path A a b ⊢ Cong(f,p) : Path B (f a) (f b) | Functions respect equality
Ap(f, a) | f : (x:A)→B(x) ⊢ Ap(f,a) : B(a) | Function application
PathComp(p, q) | p : Path A a b, q : Path A b c ⊢ PathComp(p,q) : Path A a c | Path composition (cubical)
FunExt(h) | h : (x:A) → Path B (f x) (g x) ⊢ FunExt(h) : Path (A→B) f g | Functions equal pointwise are equal
CechGlue(patches, gluing) | patches : [Patch_i], gluing : Agreement ⊢ CechGlue(patches,gluing) : Path A a b | Glue local proofs into a global proof
TransportProof(p, x) | p : Path Type A B, x : A ⊢ TransportProof(p,x) : B | Move a value along a type equality
Fiber(f, b, a, p) | f : A→B, p : Path B (f a) b ⊢ Fiber(f,b,a,p) : Fib(f,b) | Witness that a is in the fiber of f over b
Univalence(equiv) | equiv : A ≃ B ⊢ Univalence(equiv) : Path Type A B | Equivalent types are equal
DuckPath(proto, cls, wit) | wit : Implements(cls, proto) ⊢ DuckPath(proto,cls,wit) : Path Type cls proto | A class implementing a Protocol is "equal" to it

How the Kernel Verifies Each Proof Term

The kernel's core function, verify_proof, is a recursive match on the proof term constructor. Here is a simplified sketch (the real implementation handles more edge cases):

def verify_proof(
    ctx: Context,
    proof: ProofTerm,
    goal: SynType
) -> VerificationResult:
    match proof:
        case Refl(a):
            # goal must be Path(A, a, a) for some A
            if not is_path_type(goal):
                return Rejected("Refl requires Path type")
            A, lhs, rhs = decompose_path(goal)
            if not definitionally_equal(ctx, lhs, rhs):
                return Rejected(f"endpoints not equal: {lhs} ≠ {rhs}")
            if not definitionally_equal(ctx, a, lhs):
                return Rejected(f"Refl witness {a} ≠ endpoint {lhs}")
            return Verified()

        case Trans(p, q):
            # goal must be Path(A, a, c)
            # p must prove Path(A, a, b) for some b
            # q must prove Path(A, b, c)
            A, a, c = decompose_path(goal)
            b = fresh_meta()  # infer the midpoint
            r1 = verify_proof(ctx, p, PathType(A, a, b))
            r2 = verify_proof(ctx, q, PathType(A, b, c))
            return combine_results(r1, r2)

        case CechGlue(patches, gluing):
            # Each patch proves the goal on a subdomain
            # Gluing data proves patches agree on overlaps
            for patch in patches:
                r = verify_proof(ctx, patch.proof, patch.local_goal)
                if r.is_rejected:
                    return r
            for overlap in gluing.overlaps:
                r = verify_agreement(ctx, overlap)
                if r.is_rejected:
                    return r
            return Verified()

        case DuckPath(proto, cls, witness):
            # Verify that cls implements every method of proto
            for method in proto.required_methods:
                if not has_compatible_method(cls, method):
                    return Rejected(
                        f"{cls.name} lacks method {method.name}"
                    )
            return Verified()

        # ... remaining constructors ...

The kernel never calls Z3. It only does structural/definitional equality checks. If a proof term contains a Z3Leaf node, the kernel trusts that node only if it was produced by Deppy's own Z3 integration and carries a valid Z3Certificate (a hash of the query + result). This is the single trust boundary between the kernel and the SMT solver.
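The chapter describes a Z3Certificate only as "a hash of the query + result". One plausible sketch of that check (the class shape and helper names here are assumptions, not Deppy's actual code):

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Z3Certificate:
    digest: str

def make_certificate(smtlib_query: str, result: str) -> Z3Certificate:
    """Bind a solver verdict to the exact query text that produced it."""
    payload = smtlib_query.encode() + b"\x00" + result.encode()
    return Z3Certificate(hashlib.sha256(payload).hexdigest())

def certificate_valid(cert: Z3Certificate, smtlib_query: str, result: str) -> bool:
    """The kernel's side of the trust boundary: recompute the hash and compare."""
    return make_certificate(smtlib_query, result) == cert

cert = make_certificate("(assert (>= x 0))", "unsat")
print(certificate_valid(cert, "(assert (>= x 0))", "unsat"))  # → True
print(certificate_valid(cert, "(assert (>= x 1))", "unsat"))  # → False
```

Tampering with either the query or the recorded result changes the digest, so a stale or forged Z3Leaf fails the kernel's check.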

The Type Checker in Depth

Bidirectional Checking

Bidirectional type checking has two mutually recursive judgments:

  1. Check (Γ ⊢ e ⇐ A) — the expected type is known and is pushed down into the expression.
  2. Synth (Γ ⊢ e ⇒ A) — the type is inferred from the expression and pushed up.

The key advantage is that type annotations on function parameters flow downward into the body, eliminating many of the intermediate annotations that a purely inference-based system would require.

def check(ctx: Context, expr: Expr, expected: SynType) -> CheckResult:
    """Check mode: verify expr has type `expected`."""
    match expr:
        case Lambda(param, body):
            # CHECK-LAM rule
            dom, cod = decompose_arrow(expected)
            new_ctx = ctx.extend(param, dom)
            return check(new_ctx, body, cod)

        case PathAbstraction(ivar, body):
            # CHECK-PATH rule
            A, a, b = decompose_path(expected)
            new_ctx = ctx.extend_interval(ivar)
            result = check(new_ctx, body, A)
            # Verify endpoints
            left  = subst_interval(body, ivar, 0)
            right = subst_interval(body, ivar, 1)
            if not definitionally_equal(ctx, left, a):
                return CheckFailed(f"left endpoint: {left} ≠ {a}")
            if not definitionally_equal(ctx, right, b):
                return CheckFailed(f"right endpoint: {right} ≠ {b}")
            return result

        case _:
            # Fall through to synth + subsumption
            inferred = synth(ctx, expr)
            return subtype_check(ctx, inferred, expected)

Cubical Typing Rules

The cubical layer adds three key constructs to the standard type theory: the interval 𝕀 with endpoints 0 and 1; path types Path A a b, together with path abstraction <i>body and path application p @ r; and transport along type-level paths. Interval substitution ties them together:

# Interval substitution example
#
# Given:  path p = <i> (2 * i + 3)
#
#   p @ 0  =  2 * 0 + 3  =  3
#   p @ 1  =  2 * 1 + 3  =  5
#
# So p : Path Int 3 5  (a proof that 3 can be connected to 5
#                       via the continuous family 2*i+3)

Interval Substitution

Interval substitution is the core operation of the cubical type checker. When we encounter p @ r, we substitute r for the path's interval variable throughout the body:

def subst_interval(
    body: Expr,
    ivar: IntervalVar,
    value: IntervalExpr
) -> Expr:
    """Replace every occurrence of `ivar` in `body` with `value`."""
    match body:
        case IVar(name) if name == ivar:
            return interval_to_expr(value)
        case BinOp(left, op, right):
            return BinOp(
                subst_interval(left, ivar, value),
                op,
                subst_interval(right, ivar, value)
            )
        case PathApp(path, arg):
            return PathApp(
                subst_interval(path, ivar, value),
                subst_interval_expr(arg, ivar, value)
            )
        # ... structural recursion on all other cases
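The endpoint computation from the earlier comment block can be made concrete with a toy model (tuple-based expression trees; subst and ev are hypothetical stand-ins for Deppy's real machinery):

```python
def subst(expr, ivar, r):
    """Substitute interval value r for ivar in a tiny tuple-based expr tree."""
    if expr == ivar:
        return r
    if isinstance(expr, tuple):                  # shape: ("op", left, right)
        op, lhs, rhs = expr
        return (op, subst(lhs, ivar, r), subst(rhs, ivar, r))
    return expr                                  # literal: unchanged

def ev(expr):
    """Evaluate a fully substituted expression."""
    if isinstance(expr, tuple):
        op, lhs, rhs = expr
        return ev(lhs) + ev(rhs) if op == "+" else ev(lhs) * ev(rhs)
    return expr

body = ("+", ("*", 2, "i"), 3)                   # path p = <i> (2 * i + 3)
print(ev(subst(body, "i", 0)), ev(subst(body, "i", 1)))  # → 3 5
```

This is exactly the check behind CHECK-PATH: the two substitutions must reduce to the claimed endpoints.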

The Verification Pipeline

Strategy Dispatch

When a proof goal is generated, Deppy's StrategyDispatcher classifies it and selects the appropriate proof search strategy:

class StrategyDispatcher:
    def dispatch(self, goal: ProofGoal) -> Strategy:
        # 1. If the goal has a sidecar proof → Manual
        #    (checked first so a user-supplied proof always takes precedence)
        if has_sidecar(goal):
            return ManualStrategy(sidecar=goal.sidecar_path)

        # 2. If the goal is purely arithmetic/boolean → Z3
        if is_decidable_fragment(goal):
            return Z3Strategy(timeout=5000)

        # 3. If the goal involves equality/paths → Homotopy
        if involves_paths(goal) or involves_transport(goal):
            return HomotopyStrategy(
                leaf_solver=Z3Strategy(timeout=2000),
                max_depth=20
            )

        # 4. Default: Homotopy with Z3 leaves
        return HomotopyStrategy(
            leaf_solver=Z3Strategy(timeout=5000),
            max_depth=30
        )

The HomotopyStrategy searches for a proof term by exploring the space of possible proof constructions. At each step, it tries the following tactics (in order):

  1. Reflexivity — if the two sides are definitionally equal, emit Refl.
  2. Congruence — if both sides are applications f(a) and f(b), reduce to proving a = b.
  3. Transport — if there's a known type equality Path Type A B in context, try transporting the left side along it.
  4. Čech decomposition — split the domain into patches, prove on each patch, then glue with CechGlue.
  5. Function extensionality — if both sides are functions, try FunExt and prove pointwise equality.
  6. Z3 leaf — if the goal is arithmetic/boolean, delegate to Z3 and wrap the result in a Z3Leaf proof term.

class HomotopyStrategy:
    def search(self, ctx: Context, goal: ProofGoal, depth: int) -> Optional[ProofTerm]:
        if depth > self.max_depth:
            return None

        # Try tactics in order of likelihood
        for tactic in [
            self.try_refl,
            self.try_congruence,
            self.try_transport,
            self.try_cech,
            self.try_funext,
            self.try_z3_leaf,
        ]:
            result = tactic(ctx, goal, depth)
            if result is not None:
                return result

        return None  # exhausted all tactics

ProofTransportCache

Many programs generate similar or identical proof obligations. Deppy memoizes proof search results in a ProofTransportCache keyed by the canonical form of the proof goal (context + goal type, modulo alpha equivalence and definitional equality):

class ProofTransportCache:
    def __init__(self):
        self._cache: dict[int, ProofTerm] = {}

    def lookup(self, goal: ProofGoal) -> Optional[ProofTerm]:
        key = canonical_hash(goal)
        return self._cache.get(key)

    def store(self, goal: ProofGoal, proof: ProofTerm):
        key = canonical_hash(goal)
        self._cache[key] = proof

    def transport(
        self, old_goal: ProofGoal, new_goal: ProofGoal
    ) -> Optional[ProofTerm]:
        """Try to adapt a cached proof to a new goal via transport."""
        cached = self.lookup(old_goal)
        if cached is None:
            return None
        # Build a transport path between the goals
        path = find_goal_path(old_goal, new_goal)
        if path is None:
            return None
        return TransportProof(path, cached)

The transport method is what makes the cache especially powerful. If you've proved a goal for list[int] and now need the same proof for list[float], Deppy can often build a transport path between the two and reuse the original proof — no re-searching required.
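canonical_hash is not defined in this chapter; a toy version that captures the "modulo alpha equivalence" part might rename variables in first-use order before hashing (flat token lists stand in for real goal terms, so this helper is an assumption):

```python
import hashlib

def canonicalize(tokens: list[str]) -> list[str]:
    """Alpha-normalize: rename identifiers to x0, x1, ... in first-use order.
    (Toy model: a goal is a flat token list; real terms need scope tracking.)"""
    names: dict[str, str] = {}
    out = []
    for tok in tokens:
        if tok.isidentifier():
            names.setdefault(tok, f"x{len(names)}")
            out.append(names[tok])
        else:
            out.append(tok)
    return out

def canonical_hash(tokens: list[str]) -> str:
    return hashlib.sha256(" ".join(canonicalize(tokens)).encode()).hexdigest()

# Alpha-equivalent goals collide on purpose:
print(canonical_hash(["a", "=", "a"]) == canonical_hash(["b", "=", "b"]))  # → True
```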

Čech Decomposition Engine

The Čech decomposition engine splits a complex proof goal into simpler "patches" that can be proved independently, then glues the results together. This directly mirrors the mathematical concept of a Čech cover in algebraic topology:

class CechEngine:
    def decompose(self, goal: ProofGoal) -> Optional[CechCover]:
        """Split a goal into patches + overlap conditions."""
        match goal.goal_type:
            case ForAll(var, domain, body):
                # Partition the domain into subdomains
                patches = self.partition_domain(domain)
                overlaps = self.compute_overlaps(patches)
                return CechCover(
                    patches=[
                        Patch(
                            subdomain=p,
                            local_goal=restrict_goal(goal, var, p)
                        )
                        for p in patches
                    ],
                    overlaps=overlaps
                )
            case Conjunction(left, right):
                # Two patches, one for each conjunct
                return CechCover(
                    patches=[
                        Patch(subdomain="left", local_goal=left),
                        Patch(subdomain="right", local_goal=right),
                    ],
                    overlaps=[]
                )
            case _:
                return None  # cannot decompose
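As a concrete instance of what partition_domain and compute_overlaps might do for integer-interval domains (both implementations below are hypothetical), patches are built with one-point overlaps so the agreement checks have something to compare:

```python
def partition_domain(lo: int, hi: int, k: int = 2) -> list[tuple[int, int]]:
    """Split the half-open interval [lo, hi) into k overlapping patches."""
    step = (hi - lo) // k
    patches = []
    for j in range(k):
        start = lo + j * step
        end = hi if j == k - 1 else start + step + 1   # +1 → one-point overlap
        patches.append((start, end))
    return patches

def compute_overlaps(patches: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Intersections of consecutive patches: where gluing agreement is checked."""
    return [
        (max(a1, a2), min(b1, b2))
        for (a1, b1), (a2, b2) in zip(patches, patches[1:])
    ]

print(partition_domain(0, 10))                    # → [(0, 6), (5, 10)]
print(compute_overlaps(partition_domain(0, 10)))  # → [(5, 6)]
```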

Performance Considerations

Verification Time vs. Proof Complexity

The time Deppy takes to verify a function depends on several factors:

Factor | Impact | Typical Range
------ | ------ | -------------
Number of proof obligations | Linear | 1–50 per function
Homotopy search depth | Exponential (but pruned) | 1–30 steps
Z3 query complexity | Variable (decidability) | 1 ms – 5 s per query
Čech patch count | Quadratic (overlap checks) | 2–8 patches
Cache hit rate | Dramatic speedup | 40–90% in practice

Caching Strategies

Deppy uses three levels of caching:

  1. ProofTransportCache — memoizes proof terms keyed by canonical goal form. Supports transport-based reuse across structurally similar goals.
  2. Z3ResultCache — caches Z3 solver results keyed by the SMT-LIB query string. Avoids re-running the solver for identical queries.
  3. DiskCache — persists verification results to .deppy_cache/. On subsequent runs, only re-verifies functions whose source code (or dependencies) has changed.
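The DiskCache level can be sketched with a source-fingerprint key (the file layout, class shape, and key scheme below are assumptions; only the .deppy_cache/ directory name comes from the text):

```python
import hashlib
import json
from pathlib import Path

class DiskCache:
    """Toy sketch of .deppy_cache/ persistence keyed by source fingerprints."""

    def __init__(self, root: str = ".deppy_cache"):
        self.path = Path(root) / "results.json"
        self.data = (
            json.loads(self.path.read_text()) if self.path.exists() else {}
        )

    @staticmethod
    def key(func_source: str, dep_sources: list[str]) -> str:
        """Fingerprint a function by its own source plus its dependencies'."""
        h = hashlib.sha256()
        for src in [func_source, *dep_sources]:
            h.update(src.encode())
            h.update(b"\x00")          # separator: avoid boundary collisions
        return h.hexdigest()

    def get(self, key: str):
        return self.data.get(key)

    def put(self, key: str, result: str) -> None:
        self.data[key] = result
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(self.data))
```

Changing a function's source, or any dependency's, changes the key, so stale results simply miss the cache rather than needing explicit invalidation.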

Parallel Verification

Deppy can verify independent functions in parallel using Python's concurrent.futures. Since each function's verification is essentially a pure computation (given the cache state), this parallelism is safe:

from concurrent.futures import ProcessPoolExecutor, TimeoutError, as_completed

def verify_module_parallel(
    module: AnnotatedModule,
    workers: int = 4
) -> list[VerificationResult]:
    goals = generate_obligations(module)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        futures = {
            pool.submit(discharge_goal, g): g
            for g in goals
        }
        results = []
        for future in as_completed(futures):
            goal = futures[future]
            try:
                result = future.result(timeout=60)
                results.append(result)
            except TimeoutError:
                results.append(Timeout(goal))
    return results

Comparison with F*'s Architecture

It is instructive to compare Deppy's pipeline with F*'s:

F*

Source (.fst)
     │
     ▼
Desugaring
     │
     ▼
Elaboration → Core F*
     │
     ▼
VC Generation
     │
     ▼
Z3 (all goals)
     │
     ▼
Result

Deppy

Source (.py)
     │
     ▼
AST + Annotation
     │
     ▼
Bidir. Elaboration
     │
     ▼
Homotopy Search ◀──▶ Z3 (leaves)
     │
     ▼
Proof Term
     │
     ▼
Kernel
     │
     ▼
Result

Key differences:

Aspect | F* | Deppy
------ | -- | -----
Primary solver | Z3 (SMT) | Cubical homotopy search
Z3's role | Sole decision procedure | Leaf solver for decidable fragments
Proof terms | Optional (tactics generate them) | Always produced, always kernel-checked
Equality theory | Propositional (with ==) | Cubical paths (with transport, ua)
TCB size | ~30k LOC (OCaml) + Z3 | ~2.5k LOC (Python) + Z3
Source language | F* (custom syntax) | Python 3.10+ (standard syntax)
Duck typing | Not applicable | First-class via DuckPath
Topology | None | Čech covers, fibrations, univalence

F*'s approach — sending everything to Z3 — works extremely well for the kinds of software verification F* targets (cryptographic protocols, low-level systems code). Deppy's approach is designed for a different niche: Python codebases where duck typing, dynamic dispatch, and structural equivalences are common, and where a topological proof strategy can exploit the inherent structure.

Elaboration: F* vs. Deppy

In F*, elaboration transforms surface syntax into a core calculus with explicit universe levels, effect annotations, and implicit argument resolution. The elaborated term is then sent through VC generation, which produces first-order logic formulas for Z3.

In Deppy, elaboration is simpler because we piggyback on Python's existing type system (PEP 484 annotations, typing module). The elaborator's main job is to:

  1. Resolve decorators (@guarantee, @spec) to their specification semantics.
  2. Infer refinement types from assume() / check() calls.
  3. Construct the interval context for cubical operations.
  4. Generate proof obligations (the "VCs" of Deppy).

The key difference is that Deppy's proof obligations are not first-order formulas — they are typing judgments in cubical type theory, which are then solved by homotopy search.
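The refinement-inference step (item 2 in the list above) starts with a scan much like Stage 1's decorator extraction. A hypothetical sketch of the scanning half (collect_assumes is illustrative, not Deppy's API):

```python
import ast

SRC = '''
def inv(x: float) -> float:
    assume(x != 0)
    return 1 / x
'''

def collect_assumes(source: str) -> list[str]:
    """Collect the conditions passed to assume() calls, as source text."""
    out = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "assume"):
            out.append(ast.unparse(node.args[0]))
    return out

print(collect_assumes(SRC))  # → ['x != 0']
```

Each collected condition then becomes part of a parameter's refinement, e.g. x : {v: float | v != 0}.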

Tactics: F* Meta-F* vs. Deppy Homotopy Tactics

F*'s tactic system (Meta-F*) lets users write custom proof automation in F* itself. These tactics manipulate proof state imperatively and can call Z3 as a sub-step.

Deppy's "tactics" are the proof search strategies described above. They are not user-programmable in the same way (though users can supply manual proofs via sidecar files). The key difference is that Deppy's tactics always produce a proof term that the kernel can independently verify, whereas F*'s tactics produce proof scripts that are replayed.


Putting It All Together

Let's trace through a complete verification of a simple function:

@guarantee("result >= 0")
def abs_val(x: int) -> int:
    if x >= 0:
        return x
    else:
        return -x

Verification Trace

  1. Parse — Extract @guarantee("result >= 0") and attach it as a postcondition annotation to abs_val.
  2. Elaborate — The NL spec parser converts "result >= 0" to the refinement type {v: int | v >= 0}. The return type is refined: abs_val : (x: int) → {v: int | v >= 0}.
  3. Generate obligations — Two proof goals:
       Goal A: x >= 0 ⊢ x >= 0   (then-branch)
       Goal B: x < 0 ⊢ -x >= 0   (else-branch)
  4. Dispatch — Both goals are decidable arithmetic. StrategyDispatcher selects Z3Strategy directly.
  5. Z3 discharge — Goal A: Z3 proves (x ≥ 0) ⇒ (x ≥ 0) trivially. Goal B: Z3 proves (x < 0) ⇒ (-x ≥ 0). Both return Z3Leaf proof terms with certificates.
  6. Kernel check — The kernel verifies each Z3Leaf by checking the certificate hash. Result: Verified.
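The exact SMT-LIB encoding Deppy emits is not shown in this chapter, but for Goal B the query would be in this spirit: assert the negation of the implication and expect unsat, which establishes validity.

```
(declare-const x Int)
(assert (not (=> (< x 0) (>= (- x) 0))))
(check-sat)
; unsat  =>  (x < 0) ⇒ (-x >= 0) is valid, so the else-branch guarantee holds
```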

For a more complex example involving homotopy reasoning, see the next chapter on Z3 integration, where we trace through a function that requires both Z3 leaves and homotopy gluing.