Introduction

Python is the world's most popular programming language. It powers scientific computing, machine learning, web backends, financial systems, and critical infrastructure. Yet Python has essentially no formal verification story. Type checkers like mypy catch some bugs, but they cannot prove that your sorting function actually sorts, that your financial calculation never loses a cent, or that your concurrent server is free of data races.

Deppy changes this. It is a proof-oriented verification system for Python that lets you state and prove rich properties about existing code — without modifying that code.

What Is Python with Dependent Types?

Proof-oriented programming (POP), as pioneered by F*, is a paradigm where programs and proofs coexist in the same language. The compiler checks both your code and your proofs, rejecting programs that don't satisfy their specifications.

But F* requires you to write in F*. Dafny requires you to write in Dafny. For the vast majority of working programmers, rewriting production code in a new language is not feasible.

Deppy's answer is proof-oriented verification: you keep your Python code exactly as it is, and write specifications and proofs beside it in .deppy sidecar files. The Deppy checker reads both your Python source and the sidecar proofs, and verifies that the code satisfies the specification.

Traditional POP (F*)

(* You must write in F* *)
val factorial: nat -> Tot nat
let rec factorial n =
  if n = 0 then 1
  else n * factorial (n - 1)

Deppy (Verify Existing Python)

# Your existing Python — untouched
def factorial(n: int) -> int:
    if n <= 0:
        return 1
    return n * factorial(n - 1)

A Capsule Summary of Deppy

Deppy combines six key ingredients:

Refinement types — types refined by logical predicates, like refined(PyIntType(), "x >= 0", "x") (or just Nat()) for natural numbers. These are the bread-and-butter of specification.
Z3 SMT solver — automatically discharges arithmetic, boolean, and data-structure obligations. Most simple properties verify with zero manual proof effort.
Cubical homotopy type theory — path types model type equivalences, transport moves proofs between equivalent types, and Čech decomposition breaks complex proofs into composable local patches.
The cubical heap — instead of separation logic's ★, Deppy models mutation as paths in a world category. Mutations bump an epoch counter, aliases are DuckPaths at the cell level, and reads after mutation require explicit TransportProof. See Part 6.1.
PSDL — Python-syntax tactics — the proof body is ordinary Python. Walrus binding, postfix .symm/.trans, operator overloading p @ q / ~p / p ** k, comprehensions as induction, and assert P, "qed" as proof closer. See Part 5.5.
Sidecar metadata in two equivalent forms — .deppy files (best for verifying third-party libraries) or proof decorators (@foundation, @axiom, @verify, @psdl_proof) inline with Python (best when you're writing the module yourself). Both feed the same kernel pipeline. See Part 4.2.

You don't need to understand homotopy type theory to be productive with Deppy. Parts 1–4 of this book use only refinement types and Z3. The homotopy theory in Part 5 is for advanced users who need compositional reasoning or want to verify dynamic-typing patterns.

How Deppy Differs from Other Systems

vs. F*

F* is a full programming language with dependent types that extracts to C, OCaml, or Rust. You write programs in F*. Deppy does not replace Python — it verifies Python. F*'s effect system (Tot, Div, ST, etc.) maps to Deppy's verification levels. F*'s SMT-backed automation is mirrored by Deppy's Z3 integration. But Deppy adds homotopy methods that F* does not have, because Python's dynamic typing demands them.

vs. Dafny

Dafny is closest in spirit: it uses Z3 and has a familiar syntax. But Dafny is its own language (compiling to C#, Go, Java, etc.), and it has no concept of duck typing, dynamic types, or decorators. Deppy handles all of these. Dafny also has no sidecar proof mechanism — you must write in Dafny.

vs. Liquid Haskell

Liquid Haskell adds refinement types to Haskell via annotations in comments. This is the closest precedent to Deppy's non-invasive approach. But Liquid Haskell is limited to Haskell (a statically typed, purely functional language), and its annotations must still go inside the source file. Deppy's sidecar proofs are fully external, and its homotopy-theoretic foundations handle Python's dynamic features that have no Haskell equivalent.

vs. mypy / pyright

Type checkers like mypy verify type correctness — that you pass an int where an int is expected. Deppy verifies functional correctness — that your function returns the right value, maintains invariants, terminates, and is safe for concurrent use. Type checking is a subset of what Deppy can express.

The Key Innovation: Homotopy Type Theory for Dynamic Languages

Why homotopy type theory? Consider Python's duck typing. A function that calls x.read() works with any object that has a read method — files, StringIO, BytesIO, network sockets, mock objects. In traditional type theory, these are unrelated types. In homotopy type theory, they are connected by paths:

# These types are "path-equivalent" in Deppy:
# They all support .read() → str

path : TextIOWrapper ≃ StringIO     # via {read, write, close}
path : StringIO       ≃ MockFile    # via {read, write}

# Transport: a proof about TextIOWrapper.read()
# can be *transported* to StringIO.read()
transport(path, proof_about_textio) : Proof[StringIO]

This is not just an analogy. Deppy implements cubical path types where:

Two types are path-equivalent if they share the same structural interface (duck typing)
Transport carries proofs from one type to an equivalent type
Čech decomposition breaks a complex module into overlapping "patches" that can be verified independently and glued together
Fibrations model dependent structures like decorated functions, where the output type depends on the decorator applied

The cubical structure means Deppy can reason about multiple paths between the same types (e.g., different subsets of a protocol) and compose them coherently. This goes beyond simple structural subtyping.

Worked Example: Factorial with Full Verification

Let's walk through a complete verification from start to finish. We'll verify a simple factorial function, proving it correct, terminating, and non-negative.

Step 1: The Python Code

This is existing code that we will not modify:

# math_utils.py — existing production code
from __future__ import annotations

def factorial(n: int) -> int:
    """Compute n! for non-negative integers."""
    if n <= 0:
        return 1
    return n * factorial(n - 1)

Step 2: The Sidecar Specification

We create a sidecar file with three proof obligations:

# math_utils.deppy — sidecar proof file (YAML-shaped)
module: math_utils

foundation Nat_zero_succ: "factorial(0) == 1; factorial(n) == n * factorial(n-1)"

spec math_utils.factorial:
  guarantee: "factorial(n) >= 1 when n >= 0"
  pure: true

verify factorial_nonneg:
  function: math_utils.factorial
  property: "factorial(n) >= 1"
  via: "foundation Nat_zero_succ"
  binders: { n: Nat }
  psdl_proof: |
    if n <= 0:
        assert 1 >= 1, "z3"
    else:
        induction(n)
        assert n * factorial(n-1) >= 1, "z3"

Equivalently, in decorator form if you wrote math_utils.py yourself:

from deppy.proofs.sidecar_decorators import (
    foundation, verify, psdl_proof,
)

@foundation
def Nat_zero_succ():
    """factorial(0) == 1; factorial(n) == n * factorial(n-1)"""


@verify(
    property="factorial(n) >= 1",
    via="foundation Nat_zero_succ",
    binders={"n": "Nat"},
)
@psdl_proof("""
if n <= 0:
    assert 1 >= 1, "z3"
else:
    induction(n)
    assert n * factorial(n-1) >= 1, "z3"
""")
def factorial(n: int) -> int:
    if n <= 0:
        return 1
    return n * factorial(n - 1)

Step 3: Run Verification

$ python -m deppy.pipeline.run_verify math_utils.deppy --out math_utils.lean

  …
  ProofTerm subclasses (C1–C7 audit batch):
    constructed / kernel-accepted:         N / N
  ``verify`` blocks (.deppy → kernel.verify):
    accepted by ProofKernel.verify:        N
  Lean export round-trip:                  OK
  ────────────────────────────────────────────────
  ✓ CERTIFIED — top-level certify_or_die: PASS

The pipeline emits a Lean certificate at the --out path and a JSON sidecar next to it. Both are reproducible — running again produces byte-identical output assuming the same inputs.

Step 4: Understanding the Proof

How Deppy Verified factorial

Parse & bind. Deppy reads math_utils.py, extracts the AST for factorial, and binds it to the spec in math_utils.deppy.

Termination. The decreases(n) annotation tells Deppy that n strictly decreases on each recursive call. Since Nat is bounded below by 0, this is a well-founded measure. Z3 checks: n > 0 ⟹ n - 1 < n ∧ n - 1 ≥ 0. ✓

Postcondition (base case). When n ≤ 0, the function returns 1. Z3 checks: 1 ≥ 1. ✓

Postcondition (inductive case). When n > 0, the function returns n * factorial(n - 1). By the inductive hypothesis, factorial(n - 1) ≥ 1. Z3 checks: n ≥ 1 ∧ factorial(n-1) ≥ 1 ⟹ n * factorial(n-1) ≥ 1. ✓

Refinement. The result type is Pos, i.e., int with r ≥ 1. This follows directly from the postcondition. ✓

Verification Levels

Not all proofs are created equal. Deppy assigns every verified obligation a trust level indicating how it was checked:

Level	Badge	Meaning	Trust
`LEAN_KERNEL_VERIFIED`	🟢	Proof exported to Lean 4 and checked by its kernel	Highest — machine-checked by an independent proof assistant
`Z3_VERIFIED`	🟡	Discharged by Z3 SMT solver	High — Z3 is trusted and widely used, but its kernel is not independently verified
`LLM_CHECKED`	🟠	An LLM oracle judged the property correct with confidence ≥ threshold	Medium — useful for rapid prototyping, but should be upgraded to Z3 or Lean
`UNTRUSTED`	🔴	No verification performed yet	None — specification exists but hasn't been checked

The verification levels form a lattice: LEAN_KERNEL_VERIFIED ⊒ Z3_VERIFIED ⊒ LLM_CHECKED ⊒ UNTRUSTED. You can always upgrade a proof to a higher trust level without changing the specification.

# For a .deppy sidecar, the run_verify pipeline always exports
# Lean and runs ``lean`` against it on every invocation.
# When the round-trip succeeds, every verify block reaches
# LEAN_KERNEL_VERIFIED automatically:
$ python -m deppy.pipeline.run_verify math_utils.deppy --out math_utils.lean
  Lean export round-trip:                  OK
  ✓ CERTIFIED — top-level certify_or_die: PASS

# For a Python module that uses inline @guarantee/@ensures decorators,
# the ``deppy upgrade`` subcommand re-runs the safety pipeline aiming
# at a stricter level:
$ python -m deppy upgrade math_utils.py --to Z3_VERIFIED --json
  {"target_level": "Z3_VERIFIED", "upgraded": 5, "failed": 0, ...}

LLM_CHECKED proofs are not formal proofs. They are high-confidence heuristic checks useful during development. Production code should target Z3_VERIFIED or LEAN_KERNEL_VERIFIED.

The Sidecar Approach

The sidecar approach is Deppy's defining architectural choice. Instead of embedding specifications in your Python source (like type hints or docstrings), proofs live in separate .deppy files:

# Project layout with sidecar proofs
myproject/
├── myproject/
│   ├── __init__.py
│   ├── math_utils.py          # your code — untouched
│   ├── math_utils.deppy     # sidecar proofs
│   ├── data_pipeline.py       # your code — untouched
│   └── data_pipeline.deppy  # sidecar proofs
├── tests/
│   └── test_math.py
└── pyproject.toml

This design has several advantages:

Zero modification. Your Python code is never changed. This means no risk of introducing bugs through annotation, no merge conflicts with verification artifacts, and no dependency on Deppy at runtime.
Library verification. You can write sidecar proofs for code you don't own — pip-installed packages, standard library modules, even C extensions (via their Python interface).
Incremental adoption. Add sidecar proofs one function at a time. Unverified code coexists with verified code. No big-bang rewrite needed.
Separation of concerns. Domain experts write Python; verification engineers write proofs. The two can work in parallel on different files.

How Sidecar Binding Works

A sidecar file uses @verify("module.function") to bind a specification to a target function. Deppy resolves the target by:

Parsing the Python source to extract the function's AST
Matching the spec's parameter names and types against the target
Generating verification conditions from the spec's postconditions, refinements, and proof terms
Discharging the conditions using Z3 (or Lean, or the LLM oracle)

# Sidecar file can verify multiple functions from the same module
from deppy import verify, refined

@verify("data_pipeline.clean_data")
def clean_data_spec(
    df: DataFrame
) -> refined(DataFrame, lambda r: not r.isnull().any().any()):
    """No nulls in the output."""
    z3.auto()

@verify("data_pipeline.normalize")
def normalize_spec(
    values: list[float]
) -> refined(list[float], lambda r: all(0 <= v <= 1 for v in r)):
    """All values in [0, 1]."""
    assume(len(values) > 0)
    z3.auto()

What Comes Next

With the big picture in mind, you're ready to start verifying Python code. The next chapter, Getting Started, walks you through installing Deppy, setting up your first project, and verifying your first function.

Here's a roadmap of what you'll learn:

Part 1 teaches refinement types and Z3 — enough to verify pure functions, data structures, and algorithms like quicksort.
Part 2 adds advanced type-level reasoning: generics, dependent types, and metatheory proofs.
Part 3 handles Python's unique features: duck typing as path equivalence, decorators as fibrations, and dynamic typing as transport.
Part 4 develops the sidecar proof methodology and demonstrates it on a real library (SymPy).
Part 5 dives deep into homotopy type theory: path types, transport, Čech decomposition, and fibrations.
Part 6 tackles mutable state: separation logic, references, arrays, GIL-aware threading, and async/await.

Before moving on, try to answer these questions (you'll find the answers throughout the book):

Why can't mypy verify that a sorting function actually sorts?
What is a refinement type, and how does it differ from a type hint?
Why does duck typing require homotopy type theory rather than simple subtyping?
What is the advantage of sidecar proofs over inline annotations?