Part 4 — Sidecar Proofs

Verify existing Python libraries without modifying a single line of their source code.

Sidecar proofs are unique to Deppy. No other Python verification tool lets you write formal specifications for libraries you don't control — NumPy, SymPy, torch, requests — and verify that your usage of them is correct.

Two interchangeable forms. Deppy offers two equivalent shapes for sidecar metadata: separate .deppy files (this chapter, ideal for verifying libraries you don't control) and decorator-based annotations (next chapter, ideal when you write the module yourself). Both produce the same SidecarModule and feed the identical kernel pipeline.

The Problem

Most real Python code depends on third-party libraries. When you import numpy, sympy, or torch, you're importing hundreds of thousands of lines of code that you cannot modify. Traditional verification systems face a hard choice:

Ignore libraries — treat every external call as a black box, losing the ability to reason about most of your program.
Rewrite everything — port the library into the verification language (impractical for anything beyond toy examples).
Require source annotations — like F*, which needs val specifications inside the library source. You can't do this for pip install numpy.

Deppy takes a fourth path: sidecar proofs.

The Sidecar Philosophy

A sidecar is a separate file that rides alongside the library it describes — just like a motorcycle sidecar. The library runs unmodified; the sidecar provides the formal specifications.

How It Works

For any module foo, you create a companion file foo.deppy. This file contains:

Type specifications — refined types for function parameters and return values.
Preconditions — what must be true before calling a function.
Postconditions — what the function guarantees.
Algebraic laws — properties like commutativity, associativity, or idempotence that the library's operations satisfy.
Trust level — how much you trust each specification (from LEAN_KERNEL_VERIFIED to UNTRUSTED).

# math.deppy — sidecar specs for Python's math module
module: math

foundation Real_sqrt_nonneg: "sqrt(x) >= 0 when x >= 0"
foundation Real_sqrt_sq: "sqrt(x)**2 == x when x >= 0"

spec math.sqrt:
  guarantee: "math.sqrt(x) is defined and non-negative when x >= 0"
  pure: true
  axiom sqrt_nonneg: "math.sqrt(x) >= 0 when x >= 0"
  axiom sqrt_inverse: "math.sqrt(x)**2 == x when x >= 0"

Equivalently, when you own the calling code, you can use the decorator form (see next chapter):

from deppy.sidecar import about, ExternalSpec, law

@about("math.sqrt")
class SqrtSpec(ExternalSpec):
    @law
    def nonneg(self, x: float) -> bool:
        """math.sqrt(x) >= 0 when x >= 0."""
        if x < 0: return True
        return self.call(x) >= 0

The `@about` Decorator

The @about decorator is the key mechanism. It binds a specification class to an external function or class by its fully qualified name — the library code is never touched.

@about("collections.OrderedDict.popitem")
class PopItemSpec(ExternalSpec):
    """popitem(last=True) returns the last-inserted key-value pair."""

    @law
    def returns_pair(self, d: OrderedDict) -> bool:
        # Each ``@law`` is a runtime-checkable claim about the live
        # target.  It also becomes a sidecar axiom in the kernel.
        if len(d) == 0: return True
        result = self.call(d)
        return isinstance(result, tuple) and len(result) == 2

Deppy resolves the dotted name at verification time, matches it to the installed library, and checks every call site in your code against the spec.

Trust Levels (Preview)

Not all specifications carry the same weight. Deppy tracks trust along two parallel lattices: a 7-element deppy-kernel lattice (TrustLevel) and a 6-element Lean-export lattice (LeanTrustLevel). This preview shows the most commonly-cited rungs of each — see Trust & Verification Levels for the full list.

Level	Badge	Lattice	Meaning
🟢 `LEAN_KERNEL_VERIFIED`	LEAN_KERNEL_VERIFIED	Lean export	Machine-checked by the Lean 4 kernel after export.
🟢 `KERNEL`	KERNEL	deppy kernel	The deppy kernel accepted a closed proof term.
🟡 `Z3_VERIFIED`	Z3_VERIFIED	deppy kernel	Proved by the Z3 SMT solver with no admits.
🟠 `LLM_CHECKED`	🟠 LLM_CHECKED	deppy kernel	Oracle says correct with confidence ≥ threshold.
🔴 `UNTRUSTED`	UNTRUSTED	deppy kernel	No verification. Specification is assumed on faith.

Trust propagates through dependencies: if function A calls function B, then A's trust level can never exceed B's. We'll explore this in detail in Trust & Verification Levels.

Gradual Verification

Sidecar proofs enable gradual verification — you don't have to specify everything at once. Start with the functions that matter most:

Gradual Verification Workflow

Identify critical call sites — which library functions does your safety-critical code depend on?

Write sidecar specs — create .deppy files with preconditions, postconditions, and algebraic laws.

Run verification — python -m deppy.pipeline.run_verify myproject.deppy (or --from-module myproject for the decorator form) checks every foundation, axiom, and verify block against the live module.

Add more specs over time — coverage grows incrementally. CI tracks "verification coverage" like code coverage.

Comparison with Other Systems

F* / Dafny

// F*: must add specs INSIDE the module
val sqrt: x:float{x >= 0.0}
         -> y:float{y >= 0.0
                /\ y *. y ≈ x}
let sqrt x = // implementation here
  // ← Can't do this for pip packages!

Deppy Sidecar

# math.deppy — lives BESIDE the library
module: math

foundation Real_sqrt_nonneg: "sqrt(x) >= 0 when x >= 0"

spec math.sqrt:
  guarantee: "non-negative output when input >= 0"
  pure: true
  axiom sqrt_nonneg: "math.sqrt(x) >= 0 when x >= 0"
  # ← Library stays untouched!

The sidecar approach means you can start verifying your code today, even if you depend on dozens of unverified libraries. You specify the contract you rely on, and Deppy checks that your code respects it.

Anatomy of a `.deppy` File

Every sidecar file follows a consistent structure. Here is the complete anatomy:

# ─── Header: identify the target module ───
# filename: numpy_linalg.deppy
module: numpy
version: ">=1.24,<3"

# ─── Foundations: generic mathematical facts ───
foundation Linear_solve_unique: "A·x == b has at most one solution when det(A) != 0"

# ─── Spec block: one per target function/method ───
spec numpy.linalg.solve:
  guarantee: "A @ solve(A, b) ≈ b when A is square and non-singular"
  pure: true
  axiom solve_inverts: "numpy.allclose(A @ numpy.linalg.solve(A, b), b)"
  axiom solve_unique: "numpy.linalg.solve(A, b) is independent of run"

verify hypotenuse_solve_safe:
  function: my_module.hypotenuse_solve
  property: "my_module.hypotenuse_solve(A, b) is non-empty"
  via: "foundation Linear_solve_unique"

The key design principles:

Foundations are generic mathematical facts that you trust as axioms (the system can also discharge many of them via Z3).
Axioms inside a spec are formal claims about the live target — Deppy parses each statement as a Python expression and resolves its referenced attributes against the actual class or module.
Verify blocks are theorems that have to be discharged from foundations + axioms via the proof kernel (PSDL bodies, citation, chain, rewrite-then-cite).
Version pinning — sidecars declare version: … so API changes don't silently break your specs.

How Sidecar Verification Works

When you run python -m deppy.pipeline.run_verify, the pipeline (in deppy.lean.sidecar_runner) performs these steps:

Verification Pipeline

Load sidecars — parse all .deppy and .deppy-sig files from the configured search paths. Build a registry of specs keyed by fully qualified function name.

Scan source — analyze your Python code via AST to find every call to an external function. Match each call to a sidecar spec (if one exists).

Generate proof obligations — for each call site, extract the precondition from the sidecar and the context from the surrounding code (types, assertions, control flow guards).

Discharge proofs — send each obligation to the appropriate solver. Z3 handles arithmetic and logical properties. The LLM oracle handles semantic properties. Lean handles constructive proofs.

Propagate postconditions — once a precondition is verified, the postcondition becomes available as an assumption for downstream code. This lets proofs compose across function boundaries.

Compute trust levels — traverse the dependency graph and compute each function's effective trust level via the propagation rule.

Step 5 is crucial: postconditions compose. If math.sqrt guarantees result ≥ 0, then code that computes x = math.sqrt(n); y = math.sqrt(x) can automatically prove that x ≥ 0 — the precondition for the second call — using the postcondition of the first.

Why Not Just Modify the Source?

Some verification systems require annotations inside the source code. This doesn't work for the Python ecosystem:

Approach	Pro	Con
Modify source (F*-style)	Specs live with implementation	Can't do this for `pip install numpy`
Stub files (`.pyi`)	Python has native support	Only types, no pre/post/laws
Wrapper libraries	Full control	Runtime overhead, breaks introspection
Sidecar files	Zero modification, full specs, composable	Must keep in sync with library updates

The sidecar approach extends Python's existing .pyi stub file convention — but adds preconditions, postconditions, algebraic laws, trust levels, and verification support.

What You'll Learn in Part 4

Sidecar Basics — Creating .deppy files, the @about decorator, ExternalSpec, and verifying specs for math and collections.
Case Study: SymPy — A full sidecar verification of SymPy's algebraic, matrix, and calculus modules. Using homotopy to transport properties between symbolic and numerical computation.
Trust & Verification Levels — The four trust levels, trust propagation, gradual verification, CI integration, and the trust dashboard.

Quick Example: Verifying Your Code Against a Sidecar

Suppose your project uses math.sqrt. Here's the full workflow:

# my_geometry.py — YOUR code
import math

def circle_area(radius: float) -> float:
    """Compute area of a circle. Radius must be non-negative."""
    assert radius >= 0
    return math.pi * radius * radius

def hypotenuse(a: float, b: float) -> float:
    """Compute hypotenuse. a, b must be non-negative."""
    assert a >= 0 and b >= 0
    return math.sqrt(a * a + b * b)
    # ↑ Deppy checks: a*a + b*b >= 0 (sqrt precondition)
    #   This is automatically discharged by Z3!

Running verification:

$ python -m deppy.pipeline.run_verify math.deppy --out math.lean

════════════════════════════════════════════════════════════════════════
  Deppy sidecar verification report
  Sidecar:  math.deppy
  Library:  math (built-in) — Python ≥3.10
  Output:   math.lean
────────────────────────────────────────────────────────────────────────
  foundations:                       (axioms about real arithmetic)
  spec(s) bound:        math.sqrt
  axiom(s):             sqrt_nonneg, sqrt_inverse_when_nonneg
  verify block(s):      hypotenuse_nonneg
    PSDL → kernel ProofTerm:         OK
    Lean export round-trip:          OK
────────────────────────────────────────────────────────────────────────
  ✓ CERTIFIED — top-level certify_or_die: PASS

Counts and trust badges in the report come from the section_counters dict written to math.json alongside the Lean output — see Lean Export for the full schema.

Before You Begin: Install Deppy and create a simple Python file that uses math.gcd. Think about what preconditions and postconditions you would want to specify. What properties does gcd satisfy? (Hint: gcd(a, b) divides both a and b, and is the greatest such divisor.)

Part 4 — Sidecar Proofs

The Problem

The Sidecar Philosophy

How It Works

The @about Decorator

Trust Levels (Preview)

Gradual Verification

Gradual Verification Workflow

Comparison with Other Systems

F* / Dafny

Deppy Sidecar

Anatomy of a .deppy File

How Sidecar Verification Works

Verification Pipeline

Why Not Just Modify the Source?

What You'll Learn in Part 4

Quick Example: Verifying Your Code Against a Sidecar

The `@about` Decorator

Anatomy of a `.deppy` File