Theorem Discovery

JuGeo's discovery engine proactively surfaces verifiable properties of a codebase that developers have not yet specified. It combines novelty search, theorem economics, proof suggestion, and research assistance to bootstrap verification coverage from scratch.

🕑 18 min read Advanced

📋

On this page: Discovery engine · Novelty search · Theorem economics · Theorem ecology · Proof suggestion · Regime bootstrap · Python examples

Discovery Engine

The public ideation surface today is the jugeo ideate CLI command and the jugeo.easy.ideate() helper. They scan a topic, generate candidate theorem statements, and return ranked suggestions that can be fed back into verification or research workflows. The underlying engine still operates in three modes:

INDUCTIVE — generalise from already-verified local facts to conjectured global properties. E.g., if every function in a module is pure, conjecture that the module is pure.
ANALOGICAL — transport known theorems from structurally similar coordinates. E.g., if List.map is proved pure, conjecture the same for Dict.map.
EXPLORATORY — enumerate candidate formulas from the theorem schema library and filter by plausibility using fast solver checks.

Discovery Utility Function

\[ U(T) \;=\; \underbrace{\mathrm{novelty}(T)}_{\text{information gain}} \;+\; \lambda \cdot \underbrace{\mathrm{coverage}(T)}_{\text{frontier size reduction}} \;-\; \mu \cdot \underbrace{\mathrm{effort}(T)}_{\text{estimated proof cost}} \]

The engine ranks candidate theorems by utility and presents the top-\(k\) candidates to the orchestrator as new frontier objectives. Each candidate carries a confidence estimate (plausibility score from fast solver checks) and a proof sketch from the research copilot.

Novelty Search

Novelty search measures how much new information a theorem would add to the theorem ecology relative to what is already known. A theorem that is a trivial corollary of an existing result has low novelty; one that introduces a structurally new property has high novelty.

JuGeo computes novelty using the semantic fingerprint of a theorem — a hash of its logical content after normalisation — and compares it against the fingerprints of all theorems in the ecology. The novelty score is the cosine distance in a learned embedding space of theorem fingerprints.

python

from jugeo.easy import ideate

ideas = ideate("reliable distributed schedulers", n=3)

print(ideas.domain)
print(len(ideas.theorems))
for theorem in ideas.theorems[:3]:
    print(theorem.statement)
    print(theorem.novelty_score)
    print(theorem.confidence)

Theorem Economics

Theorem economics models the value of a theorem to the verification effort of a project. The economic model assigns a “theorem budget” to each project based on available verification resources and allocates that budget to the theorems with the highest expected return on investment.

Three key economic concepts:

Leverage — how many other theorems would become provable (or easier to prove) if this theorem were proved. Corresponds to the backpressure reduction achievable by adding this theorem to the ecology.
Depreciation — theorems that are very specific to the current code structure may become invalid as the code evolves. High-leverage, low-depreciation theorems are the most valuable.
Marginal cost — the additional solver / human effort required to prove the theorem given what is already in the ecology.

python

from jugeo.easy import ideate

ideas = ideate("effect-safe Flask request handlers", n=5)

print(len(ideas.theorems))
print(len(ideas.conjectures))
print(len(ideas.connections))
for connection in ideas.connections[:3]:
    print(connection)

Theorem Ecology

The theorem ecology is JuGeo's persistent store of all verified theorems for a project. It acts as the “institutional memory” of the verification effort: new theorems are added as they are proved, and existing theorems can be retrieved for use in analogy transport, cut introduction, and proof sketching.

The ecology has three layers:

LOCAL

Local Ecology

Theorems specific to the current project. Stored in .jugeo/ecology.db in the project root.

SHARED

Shared Ecology

Theorems shared across multiple projects in an organisation, stored in a central server and queried via the registry API.

GLOBAL

Global Ecology

Standard library and third-party package theorems, maintained by the JuGeo team and accessible to all users.

💡

Ecology-first verification. Before attempting to prove a judgment from scratch, JuGeo automatically searches the ecology for an existing theorem that implies the goal. Ecology hits account for approximately 35% of successfully closed judgments in benchmarks, making the ecology one of the most impactful optimisations in the system.

Proof Suggestion

The proof suggestion system combines the ecology, the research copilot, and the move engine to generate a proof sketch for an open judgment. A proof sketch is not a complete proof; it is a high-level outline of the key moves and lemmas that the full proof would use, intended to guide the orchestrator's search.

python

import subprocess

completed = subprocess.run(
    ["jugeo", "ideate", "--topic", "effect-safe Flask request handlers", "--count", "3"],
    check=True,
    capture_output=True,
    text=True,
)

print(completed.stdout)

Regime Bootstrap

When starting verification on a new codebase that has zero annotations, JuGeo provides a regime bootstrap procedure. The bootstrap runs the discovery engine, selects the most valuable theorems to prove first (using theorem economics), and generates an initial batch of annotation candidates for developer review.

The bootstrap is designed to be run once at the start of a verification engagement and produces a bootstrap plan: an ordered list of recommended annotations, in decreasing order of leverage, that will give the highest verification coverage for the least effort.

python

from jugeo.easy import ideate

bootstrap_notes = ideate("bootstrap verified interfaces for a new Python codebase", n=4)

print(bootstrap_notes.domain)
for theorem in bootstrap_notes.theorems:
    print(theorem.statement)

Python Examples

Full discovery engine workflow:

python

from jugeo.easy import ideate

ideas = ideate("judgment geometry for full-stack Flask applications", n=5)

print(ideas.domain)
for theorem in ideas.theorems[:5]:
    print(theorem.statement)
    print(theorem.proof_sketch)