Sidecar Proof Basics
This chapter teaches you how to create sidecar specifications — formal contracts
for libraries you don't own. By the end, you'll be able to write .deppy
files for any Python module and verify your code against them.
Creating a .deppy File
A sidecar file is just a free-standing
<something>.deppy text file. The pipeline is
invoked by name:
python -m deppy.pipeline.run_verify path/to/file.deppy.
Conventionally the file lives next to the Python source it
describes (my_app.py ↔ my_app.deppy) or
under a project-level sidecars/ directory.
# Suggested project layout
myproject/
├── src/
│ └── my_app.py # your code — imports math, collections
├── sidecars/
│ ├── math.deppy # sidecar for Python's math module
│ └── collections.deppy # sidecar for Python's collections
├── pyproject.toml # normal Python packaging
└── tests/
--out /
--from-module to run_verify, or wire a
shell loop / Makefile / GitHub Actions step to enumerate the
.deppy files in your tree. Per-run knobs
(Z3 timeout, parallelism) live on
deppy.config.DeppyConfig.
The @about Decorator
Every specification in a sidecar file is bound to an external function or class
via the @about decorator. The argument is the fully qualified
name of the target.
from deppy.sidecar import about, ExternalSpec
@about("math.sqrt")
class SqrtSpec(ExternalSpec):
"""Specification for math.sqrt."""
...
@about("math.factorial")
class FactorialSpec(ExternalSpec):
"""Specification for math.factorial."""
...
The name resolution rules are:
"math.sqrt"— top-level function in a module"collections.OrderedDict"— a class"collections.OrderedDict.popitem"— a method on a class"numpy.linalg.solve"— nested module function
math.sqrt doesn't resolve, you get a clear error — not a
silent failure.
The ExternalSpec Class
ExternalSpec is the base class for all sidecar specifications. It
provides a structured way to declare preconditions, postconditions, algebraic
laws, and exception behavior.
from deppy.sidecar import about, ExternalSpec, law
@about("math.sqrt")
class SqrtSpec(ExternalSpec):
"""math.sqrt(x) → float"""
@law
def nonneg(self, x: float) -> bool:
"""math.sqrt(x) is non-negative when x >= 0."""
if x < 0: return True
return self.call(x) >= 0
@law
def sqrt_of_square(self, x: float) -> bool:
"""sqrt(x²) == |x| for all x."""
return abs(self.call(x * x) - abs(x)) < 1e-10
An ExternalSpec subclass is a runtime-checkable bundle:
each @law method takes the same arguments the target
takes, calls self.call(*args) to invoke the live
target, and returns a boolean. In a sidecar pipeline, every
@law also becomes a recorded sidecar axiom that the
kernel can cite.
In the .deppy sidecar form, the same content is
written declaratively as foundations, axioms, and verify blocks
— see Verifying
sympy.geometry for a worked example.
ExternalSpec Surface
| Member | Purpose | Required? |
|---|---|---|
@about(target) |
Bind the spec class to the dotted-path target | Required |
@law methods |
Runtime-checkable claims about the target; each becomes a sidecar axiom in the kernel | Recommended |
self.call(*args) |
Resolves and invokes the live target inside an
@law body |
— |
guarantees /
preconditions /
postconditions lists |
String-form pre/post/guarantees consumed by the
SidecarModule registry |
Optional |
trust_level |
TrustLevel attached to the bundle (default
UNTRUSTED until promoted by a successful
proof) |
Optional |
Assume vs. Prove
In the .deppy file syntax, sidecar metadata
distinguishes between things you assume about
the library (its observable axioms) and things you
prove about your code's usage of it (verify
blocks). The shape on disk is:
foundation / axiom
# You TRUST the library / arithmetic does this
foundation Real_sqrt_nonneg:
"sqrt(x) >= 0 when x >= 0"
spec math.sqrt:
guarantee: "non-negative output"
axiom sqrt_nonneg:
"math.sqrt(x) >= 0 when x >= 0"
# ↑ Z3 may *try* to discharge it; if it can't, it
# becomes an admitted axiom in the Lean output.
verify block
# Deppy CHECKS this against the foundations + axioms
verify hypotenuse_nonneg:
function: my_module.hypotenuse
property: "hypotenuse(a, b) >= 0"
via: "foundation Real_sqrt_nonneg"
# ↑ The kernel constructs a real ProofTerm.
The rule is simple: foundations and axioms are assumed (taken on faith, optionally Z3-checked), verify blocks are proved (the kernel must construct a closed proof term that is also kernel-verified by Lean on round-trip).
certified == false
in the JSON sidecar). This is how the system guards against
false foundations slipping in.
Example: Sidecar for Python's math Module
Let's build a complete sidecar for the most commonly used functions in
math.
# sidecars/math.deppy
module: math
# ── Foundations: arithmetic facts the kernel cites ───────────
foundation Real_sqrt_nonneg: "sqrt(x) >= 0 when x >= 0"
foundation Real_sqrt_sq: "sqrt(x)**2 == x when x >= 0"
foundation Int_factorial_pos: "n! >= 1 when n >= 0"
foundation Int_gcd_divides: "gcd(a,b) divides both a and b"
# ── sqrt ──────────────────────────────────────────────
spec math.sqrt:
guarantee: "non-negative output when input is non-negative"
pure: true
axiom sqrt_nonneg: "math.sqrt(x) >= 0 when x >= 0"
axiom sqrt_inverse: "math.sqrt(x)**2 == x when x >= 0"
# ── factorial ─────────────────────────────────────────
spec math.factorial:
guarantee: "factorial is positive on non-negative integers"
pure: true
axiom factorial_pos: "math.factorial(n) >= 1 when n >= 0"
axiom factorial_step: "math.factorial(n) == n * math.factorial(n-1) when n >= 1"
# ── gcd ───────────────────────────────────────────────
spec math.gcd:
guarantee: "GCD divides both inputs and is non-negative"
pure: true
axiom gcd_nonneg: "math.gcd(a, b) >= 0"
axiom gcd_commutative: "math.gcd(a, b) == math.gcd(b, a)"
axiom gcd_identity: "math.gcd(a, 0) == abs(a)"
Example: Sidecar for collections.OrderedDict
# sidecars/collections.deppy
module: collections
foundation OrderedDict_setitem_appends:
"setitem on a fresh key appends to insertion order"
foundation OrderedDict_popitem_returns_last:
"popitem(last=True) returns the most-recently-inserted pair"
spec collections.OrderedDict.popitem:
guarantee: "raises KeyError exactly when the dict is empty"
axiom popitem_last_when_nonempty:
"len(d) > 0 implies popitem(d, last=True)[0] == list(d.keys())[-1]"
axiom popitem_raises_on_empty:
"len(d) == 0 implies popitem(d) raises KeyError"
spec collections.OrderedDict.move_to_end:
guarantee: "key ends up at end (or front when last=False)"
axiom move_to_end_last:
"move_to_end(d, k, last=True); list(d.keys())[-1] == k"
Loading Sidecars: SidecarModule
Deppy provides SidecarModule to programmatically load and query
sidecar specifications.
from deppy.proofs.sidecar import SidecarModule
# Parse a .deppy file into a SidecarModule object
math_sidecar = SidecarModule.from_file("sidecars/math.deppy")
# Query specs
sqrt_spec = math_sidecar.get_spec("math.sqrt")
print(sqrt_spec.precondition) # first precondition string (or "")
print(sqrt_spec.postcondition) # first postcondition string (or "")
print(sqrt_spec.laws) # list of axiom names declared in this spec
print(sqrt_spec.trust_level) # TrustLevel enum value
# List all specs in this sidecar
for name, spec in math_sidecar.specs.items():
print(f" {name}: {len(spec.laws)} axioms, trust={spec.trust_level.name}")
Verification output:
$ python -m deppy.pipeline.run_verify math.deppy --out math.lean
════════════════════════════════════════════════════════════════════════
Deppy sidecar verification report
Sidecar: math.deppy
Library: math (built-in) — Python 3.12
Output: math.lean
────────────────────────────────────────────────────────────────────────
Foundations: 3
discharged by Z3: 3 / 3 attempted
Spec blocks: 3 (math.sqrt, math.factorial, math.gcd)
Sidecar axioms: 7
Verify blocks: 4 / 4 kernel-verified
Counterexample search: 3 attempted, 0 found
────────────────────────────────────────────────────────────────
Lean export round-trip: OK
────────────────────────────────────────────────────────────────
✓ CERTIFIED — top-level certify_or_die: PASS
Detailed call-site narration is not part of the
run_verify output today; the pipeline reports
foundation/axiom/verify counts and the kernel verdict, then writes
the Lean text plus a JSON sidecar with the full
section_counters. For per-call reasoning use
python -m deppy check <module.py> --sidecar math.deppy
instead, which runs the safety pipeline.
Handling Exceptions in Specs
The raises method lets you specify which exceptions a function throws
for given inputs. Deppy uses this in two ways:
- Negative preconditions: If you call
math.sqrt(-1), Deppy knows aValueErrorwill be raised and flags it. - Exception-safe code: If your code wraps calls in
try/except, Deppy verifies you catch the right exception type.
# Your code with try/except
def safe_sqrt(x: float) -> float | None:
try:
return math.sqrt(x)
except ValueError:
return None
# Deppy verifies:
# 1. math.sqrt CAN raise ValueError (from the sidecar spec)
# 2. The except clause catches the right type
# 3. Return type is float | None (union of both branches)
Module Signatures: .deppy-sig Files
For large libraries, writing full ExternalSpec classes for every
function is tedious. Deppy supports a lightweight signature format
for quickly declaring types without full pre/post conditions:
# math.deppy-sig — lightweight type signatures
# Format: function_name : (param_types) -> return_type [trust_level]
sqrt : (x: float {x >= 0}) -> float {result >= 0} [Z3_VERIFIED]
factorial : (n: int {n >= 0}) -> int {result > 0} [Z3_VERIFIED]
gcd : (a: int, b: int) -> int {result >= 0} [Z3_VERIFIED]
ceil : (x: float) -> int {result >= x} [Z3_VERIFIED]
floor : (x: float) -> int {result <= x} [Z3_VERIFIED]
log : (x: float {x > 0}) -> float [Z3_VERIFIED]
log2 : (x: float {x > 0}) -> float [Z3_VERIFIED]
exp : (x: float) -> float {result > 0} [Z3_VERIFIED]
sin : (x: float) -> float {-1 <= result <= 1} [Z3_VERIFIED]
cos : (x: float) -> float {-1 <= result <= 1} [Z3_VERIFIED]
Signature files are automatically combined with full .deppy specs.
If both exist, the .deppy file takes precedence.
JSON Sidecar Output for CI
Every successful run of
python -m deppy.pipeline.run_verify foo.deppy --out foo.lean
writes a machine-readable JSON sidecar at foo.json
next to the Lean file. This is what CI consumes:
$ python -m deppy.pipeline.run_verify sympy_geometry.deppy \
--out sympy_geometry.lean
$ cat sympy_geometry.json | head -30
{
"sidecar": ".../sympy_geometry.deppy",
"lean_path": ".../sympy_geometry.lean",
"library_summary": "sympy 1.14.0 — sympy",
"elapsed_ms": 604.21,
"certified": true,
"failures": [],
"lean_runner": { "invoked": true, "ok": true, "stderr": "" },
"section_counters": {
"foundation_count": 14,
"axiom_count": 38,
"verify_total": 11,
"verify_kernel_verified": 11,
"counterexamples_attempted": 3,
"counterexamples_found": 0,
"soundness_gate_pass": 3,
"soundness_gate_fail": 0,
...
},
"classification_counts": {
"CITES_FOUNDATION": 2,
"REWRITE_THEN_CITE": 6,
"CHAIN": 5,
"CITES_AXIOM": 25,
"COMPOSITION": 7
}
}
Use this JSON in CI pipelines to track specification drift across
versions — certified, section_counters,
and classification_counts are stable, scriptable
fields:
# .github/workflows/verify.yml (excerpt)
- name: Run deppy sidecar pipeline
run: |
python -m deppy.pipeline.run_verify sympy_geometry.deppy \
--out sympy_geometry.lean
- name: Fail if regressed
run: |
python - <<'PY'
import json
j = json.load(open("sympy_geometry.json"))
assert j["certified"], "sympy_geometry.deppy did not certify"
assert j["section_counters"]["verify_kernel_verified"] == \\
j["section_counters"]["verify_total"], \\
"some verify blocks were not kernel-verified"
PY
Best Practices
sqrt(x) >= 0, don't also specify floating-point precision —
keep specs minimal and maintainable.
Exercises
json.loads.
What precondition should you specify? (Hint: the input must be valid JSON.)
What postcondition can you assume about the return type?
os.path.join.
Specify the law: os.path.join(a, os.path.join(b, c)) ==
os.path.join(os.path.join(a, b), c) — is this always true?
When does it fail? (Hint: what if b starts with /?)
heapq module. Specify: heappush maintains the heap
invariant, heappop returns the smallest element, and
heapify is idempotent.
.deppy-sig file for
Python's statistics module covering mean,
median, stdev, and variance.
What preconditions do they share?