Chapter 6.5 — Threading & GIL

Verified concurrent Python — locks, data races, and the GIL

Comparison with Pulse. Pulse models concurrency at the kernel level via concurrent separation logic with atomic invariants and full ownership transfer through fractional permissions. Deppy is honest: the kernel does not model concurrency. The classes below (Lock, AtomicCell, Semaphore, spawn) are runtime-checked — invariants are evaluated when locks are acquired/released, and races are raised at join_all time. Soundness comes from the runtime checks, not from a static frame-rule discharge.
APIs you'll see in this chapter. All of the following are real, runtime-checked Python — not aspirational pseudocode:
  • Lock(invariant=callable, invariant_sketch=str) — on acquire/release, the invariant predicate is called with the shared state; failure raises InvariantViolation.
  • AtomicCell + atomic_cas(cell, expected, new) — CAS with a cubical PathAbs history (each successful swap appends a non-trivial path; failure appends a reflexive one).
  • Semaphore(count, invariant=...) — counting semaphore that checks the invariant on every acquire and release.
  • spawn(fn, reads=..., writes=...) and join_all(patches) — thread spawn with the Čech cocycle condition checked in join_all; overlapping writes raise RaceViolation.
  • @parallel_safe — runs lexical shared-state analysis at decoration and raises on unprotected writes.

Import from deppy.concurrency or directly from deppy. What's NOT checked: general deadlock detection, full happens-before reasoning, and unannotated shared reads. Passing reads=/writes= to spawn is how you tell the cocycle check what each thread touches.

This chapter is unique to Deppy. F*'s Pulse targets extracted C code and does not model Python's Global Interpreter Lock, threading.Lock, or the CPython thread scheduler. Everything here is original.

Python's Global Interpreter Lock

The GIL is a mutex held by the CPython interpreter that allows only one thread to execute Python bytecode at a time. Understanding precisely what the GIL guarantees — and what it does not — is essential for verified concurrent Python.

What the GIL Guarantees

  • Each individual bytecode instruction runs to completion before another thread's bytecode can run, so interpreter state is never torn mid-instruction.
  • Operations that compile to a single bytecode or a single C-level call, such as list.append or a plain dict assignment, are effectively atomic in CPython.
  • Reference counts stay consistent, so pure-Python code cannot corrupt memory through racing refcount updates.

What the GIL Does NOT Guarantee

  • Atomicity across multiple bytecodes: counter += 1 is a read, an add, and a store, and another thread can run between any two of them.
  • Check-then-act sequences: "if key not in d: d[key] = v" can interleave with another thread's write (a TOCTOU race).
  • Invariants that span several objects or several statements; the GIL knows nothing about your invariants.
# ✗ NOT THREAD-SAFE despite the GIL
def unsafe_increment():
    global counter
    counter += 1
    # Bytecodes:                  Thread A      Thread B
    # LOAD_GLOBAL counter          → 5
    #                                            LOAD_GLOBAL counter → 5
    # LOAD_CONST 1; INPLACE_ADD    → 6
    #                                            LOAD_CONST 1; INPLACE_ADD → 6
    # STORE_GLOBAL counter         ← 6
    #                                            STORE_GLOBAL counter ← 6  ← LOST UPDATE!
The GIL is an implementation detail of CPython, not a language guarantee. PyPy, GraalPy, and the "no-GIL" free-threading work (PEP 703, available as an experimental build since CPython 3.13) may remove it. Deppy's checks work regardless of whether the GIL exists, because safety comes from the ownership discipline its runtime checks enforce, not from the GIL.
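You can see the multi-step nature of the increment directly with the standard library's dis module. The exact opcode names vary by CPython version (INPLACE_ADD before 3.11, BINARY_OP after), but the increment is never a single instruction:

```python
import dis

counter = 0

def unsafe_increment():
    global counter
    counter += 1

# Inspect the compiled bytecode: += expands to several instructions,
# and the GIL can hand control to another thread between any two.
ops = [ins.opname for ins in dis.Bytecode(unsafe_increment)]
print(ops)
```

Running this prints the full opcode list for your interpreter; the load, the add, and the store are always separate entries.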

Lock Verification

In Deppy's separation logic, a lock owns a piece of the heap described by an SLProp called the lock invariant. When you acquire the lock, you gain ownership of that heap region. When you release it, you give it back.

from deppy import Ref
from deppy.concurrency import Lock, lock_inv
from deppy.separation import pts_to, exists

# Create a lock that protects a counter reference
counter: Ref[int] = Ref(0)
# Lock invariant: "the counter exists and holds some integer"
lock = Lock(invariant=exists('n', pts_to(counter, 'n')))

Acquire and Release

# Acquiring the lock gives you the invariant
@requires(lock_inv(lock))  # must know the lock exists
@ensures(lambda _:
    lock_inv(lock) **            # lock still exists
    exists('n', pts_to(counter, 'n')))  # now own the counter!
def acquire_example(lock: Lock):
    lock.acquire()

# Releasing returns the invariant to the lock
@requires(
    lock_inv(lock) **
    exists('n', pts_to(counter, 'n')))
@ensures(lambda _: lock_inv(lock))  # gave back ownership
def release_example(lock: Lock):
    lock.release()
The lock invariant is a separating conjunction contract: upon acquire, the thread receives disjoint ownership of the protected resources. Upon release, it must return them in a state satisfying the invariant. The lock itself is never "consumed" — lock_inv(lock) persists across acquire/release.
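The runtime-checking behaviour described in the API summary can be made concrete with a short plain-Python sketch: a lock that evaluates its invariant predicate on every acquire and release. CheckedLock and its constructor arguments are hypothetical illustrations, not Deppy's implementation:

```python
import threading

class InvariantViolation(AssertionError):
    """Raised when a lock invariant fails (mirrors Deppy's exception)."""

class CheckedLock:
    """Minimal sketch of an invariant-checking lock: the predicate is
    evaluated against the shared state at every acquire and release."""
    def __init__(self, invariant, state):
        self._lock = threading.Lock()
        self._invariant = invariant  # predicate: state -> bool
        self._state = state          # the protected shared state

    def __enter__(self):
        self._lock.acquire()
        if not self._invariant(self._state):
            self._lock.release()
            raise InvariantViolation("invariant false on acquire")
        return self._state

    def __exit__(self, exc_type, exc, tb):
        holds = self._invariant(self._state)
        self._lock.release()
        if not holds and exc_type is None:
            raise InvariantViolation("invariant false on release")
        return False  # never swallow exceptions from the body

# The invariant "counter is non-negative" is checked at each boundary.
state = {"n": 0}
lock = CheckedLock(lambda s: s["n"] >= 0, state)

with lock as s:
    s["n"] += 1          # fine: invariant still holds on release
print(state["n"])        # → 1
```

A body that drives the state to s["n"] = -5 would raise InvariantViolation at the release boundary, which is exactly the "return the resources in a state satisfying the invariant" obligation.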

Spin Lock Implementation

A spin lock is the simplest lock: a boolean flag that threads poll in a loop. We implement and verify one using separation logic:

from deppy.concurrency import atomic_cas, AtomicCell

class SpinLock:
    def __init__(self, invariant: SLProp):
        self._locked = AtomicCell(False)  # AtomicCell pairs with atomic_cas
        self._invariant = invariant

    @requires(spin_lock_inv(self))
    @ensures(lambda _: spin_lock_inv(self) ** self._invariant)
    def acquire(self):
        """Spin until lock is acquired."""
        while True:
            # atomic compare-and-swap: if _locked is False, set True
            if atomic_cas(self._locked, expected=False, new=True):
                break
        # Now own the invariant

    @requires(spin_lock_inv(self) ** self._invariant)
    @ensures(lambda _: spin_lock_inv(self))
    def release(self):
        """Release the lock, returning the invariant."""
        self._locked.value = False
Verified — mutual exclusion guaranteed

Why This Is Correct

1. atomic_cas is a single bytecode-level operation — atomic even without the GIL.
2. Only the thread that successfully CAS'd False→True receives ownership of the invariant.
3. Release sets _locked = False and transfers the invariant back — any waiting thread can now acquire it.
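Deppy treats atomic_cas as a primitive; plain Python has no hardware compare-and-swap, but the same protocol can be sketched by emulating the CAS step with a tiny guard lock. This SpinLock is a hypothetical stand-in for the verified class above, not a replacement for it:

```python
import threading

class SpinLock:
    """Test-and-set spin lock. The compare-and-swap is emulated with a
    guard lock, since pure Python has no atomic CAS primitive."""
    def __init__(self):
        self._locked = False
        self._guard = threading.Lock()

    def _cas(self, expected, new):
        # Atomic compare-and-swap on the flag: only the swap is serialized.
        with self._guard:
            if self._locked == expected:
                self._locked = new
                return True
            return False

    def acquire(self):
        while not self._cas(False, True):  # spin until the swap succeeds
            pass

    def release(self):
        self._locked = False  # hand the flag (and the invariant) back

# Mutual exclusion in action: 4 threads, 500 increments each.
lock = SpinLock()
total = 0

def worker():
    global total
    for _ in range(500):
        lock.acquire()
        total += 1  # critical section: exclusive access to total
        lock.release()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(total)  # → 2000, no lost updates
```

Note the design trade-off: spinning wastes CPU while waiting, which is why the blocking threading.Lock is preferable in real Python code; the sketch exists to show why the CAS makes the acquire race-free.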

Parallel Increment

The canonical concurrent verification problem: multiple threads increment a shared counter, and we prove the final value equals the initial value plus the number of increments.

from deppy.concurrency import Lock, lock_inv, parallel_safe, spawn, join_all
from deppy.separation import pts_to, exists, pure
from deppy.ghost import ghost_var, ghost_pts_to, ghost_update

counter = Ref(0)
contributions = ghost_var(0)  # ghost: total increments performed

lock = Lock(invariant=
    exists('n', 'g',
        pts_to(counter, 'n') **
        ghost_pts_to(contributions, 'g') **
        pure('n == g')  # counter == total contributions
    ))

@parallel_safe
@requires(lock_inv(lock))
@ensures(lambda _: lock_inv(lock))
def parallel_incr(lock: Lock, counter: Ref[int]):
    """Thread-safe increment with proof."""
    with lock:
        # Inside lock: we own pts_to(counter, n) ** ghost_pts_to(contributions, g)
        # and we know n == g
        counter.value += 1
        ghost_update(contributions, lambda g: g + 1)
        # Now: counter = n+1, contributions = g+1, n+1 == g+1 ✓
Verified — lock invariant preserved

Proving the Final Value

@requires(lock_inv(lock) ** pure('counter.value == 0'))
@ensures(lambda _: pure('counter.value == num_threads'))
def run_parallel_incr(num_threads: int):
    """Spawn num_threads threads, each incrementing once."""
    threads = []
    for _ in range(num_threads):
        t = spawn(parallel_incr, args=(lock, counter))
        threads.append(t)
    join_all(threads)
    # After all joins: contributions == num_threads
    # Lock invariant: counter == contributions
    # Therefore: counter == num_threads ✓
Verified — final value equals thread count
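The same protocol runs unmodified on stock threading. A plain-Python analogue of run_parallel_incr (without the ghost bookkeeping or contracts) demonstrates that the final value equals the thread count:

```python
import threading

counter = 0
lock = threading.Lock()

def parallel_incr():
    """Each call adds exactly one, holding the lock for the update."""
    global counter
    with lock:        # own the counter for the duration of the update
        counter += 1  # the read-modify-write is now indivisible

def run_parallel_incr(num_threads):
    threads = [threading.Thread(target=parallel_incr)
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print(run_parallel_incr(8))  # → 8: final value equals thread count
```

Removing the `with lock:` reintroduces the lost-update interleaving from the bytecode listing earlier in the chapter, though with CPython's coarse thread switching you may need many iterations to observe it.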

Condition Variables

Condition variables let threads wait for a predicate to become true. Deppy verifies that the predicate actually holds when the thread wakes up:

from deppy.concurrency import Condition

class BoundedBuffer:
    def __init__(self, capacity: int):
        self.buf: list = []
        self.capacity = capacity
        self.lock = Lock()
        self.not_full = Condition(self.lock)
        self.not_empty = Condition(self.lock)

    @requires(lock_inv(self.lock))
    @ensures(lambda _:
        lock_inv(self.lock) **
        pure('item in self.buf'))
    def put(self, item):
        with self.lock:
            while len(self.buf) >= self.capacity:
                self.not_full.wait()
            # post-wait: len(self.buf) < self.capacity
            self.buf.append(item)
            self.not_empty.notify()

    @requires(lock_inv(self.lock))
    @ensures(lambda result:
        lock_inv(self.lock) **
        pure('result was in self.buf'))
    def get(self):
        with self.lock:
            while len(self.buf) == 0:
                self.not_empty.wait()
            # post-wait: len(self.buf) > 0
            item = self.buf.pop(0)
            self.not_full.notify()
            return item
Verified — no buffer overflow, no empty pop
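The buffer above maps directly onto the standard library. A plain threading.Condition sketch of the same class (contracts dropped) behaves identically; the while-loops re-test their predicate after every wakeup, which is exactly why the predicate holds when wait() returns:

```python
import threading

class BoundedBuffer:
    """Bounded FIFO buffer on threading.Condition."""
    def __init__(self, capacity):
        self.buf = []
        self.capacity = capacity
        self.lock = threading.Lock()
        # Both conditions share the one lock that protects self.buf.
        self.not_full = threading.Condition(self.lock)
        self.not_empty = threading.Condition(self.lock)

    def put(self, item):
        with self.lock:
            while len(self.buf) >= self.capacity:
                self.not_full.wait()
            self.buf.append(item)      # here: len(buf) < capacity
            self.not_empty.notify()

    def get(self):
        with self.lock:
            while not self.buf:
                self.not_empty.wait()
            item = self.buf.pop(0)     # here: buffer is non-empty
            self.not_full.notify()
            return item

# One producer, one consumer, capacity 2: order is preserved.
buf = BoundedBuffer(2)
results = []

def consume():
    for _ in range(5):
        results.append(buf.get())

consumer = threading.Thread(target=consume)
consumer.start()
for i in range(5):
    buf.put(i)     # blocks whenever the buffer is full
consumer.join()
print(results)  # → [0, 1, 2, 3, 4]
```

Replacing either `while` with an `if` would break the post-wait predicate under spurious wakeups or multiple consumers, which is precisely the property Deppy's verification of the wait loop pins down.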

Data Race Detection

A data race occurs when two threads access the same memory location concurrently and at least one access is a write, without synchronisation. Deppy detects races as ownership violations:

# ✗ DATA RACE: both threads write counter without a lock
counter = Ref(0)

@parallel_safe
def thread_a():
    counter.value += 1  # needs pts_to(counter, _) — but who owns it?

@parallel_safe
def thread_b():
    counter.value += 1  # also needs pts_to(counter, _) — conflict!
Data race detected — counter accessed by multiple threads without lock

Race Detection via Ownership

1. thread_a needs pts_to(counter, _) to write.
2. thread_b also needs pts_to(counter, _) to write.
3. pts_to is exclusive — only one owner at a time. Two threads cannot both own the same reference.
4. Fix: place counter inside a lock invariant. Threads acquire the lock to temporarily own the reference.

Semaphores

from deppy.concurrency import Semaphore, sem_inv

# A semaphore with count n owns n copies of a resource
pool = Semaphore(
    count=3,
    invariant=connection_resource()  # each permit owns a DB connection
)

@requires(sem_inv(pool))
@ensures(lambda _: sem_inv(pool))
def use_connection():
    with pool:
        # Own one connection_resource() here
        db.execute("SELECT 1")
    # Released: connection returned to pool
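Operationally, a plain threading.Semaphore gives the same "n permits, n resources" behaviour. This sketch (the counter names are illustrative) checks that no more than three workers ever hold a permit at once:

```python
import threading
import time

pool = threading.Semaphore(3)  # three permits, like the pool above
active = 0
max_active = 0
guard = threading.Lock()       # protects the two counters

def use_connection():
    global active, max_active
    with pool:                  # blocks once 3 workers hold permits
        with guard:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.01)        # pretend to use one of the 3 resources
        with guard:
            active -= 1

threads = [threading.Thread(target=use_connection) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(max_active)  # never exceeds 3
```

What the plain semaphore cannot do, and Deppy's invariant= argument adds, is check a predicate about the guarded resource on every acquire and release.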

Thread-Safe Data Structures

Verified Thread-Safe Queue

from deppy.concurrency import Lock, Condition
from deppy.ghost import ghost_var, ghost_update

class VerifiedQueue:
    """Thread-safe queue with verified properties:
    1. No data races
    2. FIFO ordering preserved
    3. No lost items
    """
    def __init__(self):
        self._items: list = []
        self._lock = Lock()
        self._not_empty = Condition(self._lock)
        self._enqueued = ghost_var(0)  # total items ever enqueued
        self._dequeued = ghost_var(0)  # total items ever dequeued

    @parallel_safe
    @requires(lock_inv(self._lock))
    @ensures(lambda _: lock_inv(self._lock))
    def put(self, item):
        with self._lock:
            self._items.append(item)
            ghost_update(self._enqueued, lambda n: n + 1)
            self._not_empty.notify()
            # invariant: len(_items) == _enqueued - _dequeued

    @parallel_safe
    @requires(lock_inv(self._lock))
    @ensures(lambda result: lock_inv(self._lock))
    def get(self):
        with self._lock:
            while not self._items:
                self._not_empty.wait()
            item = self._items.pop(0)
            ghost_update(self._dequeued, lambda n: n + 1)
            return item
Verified — race-free, FIFO preserved
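The ghost counters translate directly into runtime assertions. Here is a plain-Python sketch of the same bookkeeping (CountingQueue is a hypothetical name; it omits the blocking wait on empty for brevity):

```python
import threading

class CountingQueue:
    """FIFO queue that runtime-checks the invariant from the text:
    len(items) == enqueued - dequeued, before every lock release."""
    def __init__(self):
        self._items = []
        self._lock = threading.Lock()
        self._enqueued = 0   # total items ever enqueued
        self._dequeued = 0   # total items ever dequeued

    def _check(self):
        assert len(self._items) == self._enqueued - self._dequeued

    def put(self, item):
        with self._lock:
            self._items.append(item)
            self._enqueued += 1
            self._check()    # invariant restored before release

    def get(self):
        with self._lock:
            item = self._items.pop(0)  # IndexError if empty (no wait here)
            self._dequeued += 1
            self._check()
            return item

q = CountingQueue()
for i in range(3):
    q.put(i)
print([q.get(), q.get(), q.get()])  # → [0, 1, 2]: FIFO preserved
```

The difference from VerifiedQueue is that ghost variables vanish at runtime in a fully verified build, whereas these counters are ordinary fields that cost memory and time.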

Verified Thread-Safe Dictionary

class VerifiedDict:
    """Thread-safe dictionary wrapper."""
    def __init__(self):
        self._data = {}
        self._lock = Lock()

    @parallel_safe
    @requires(lock_inv(self._lock))
    @ensures(lambda _:
        lock_inv(self._lock) **
        pure('key in self._data and self._data[key] == value'))
    def put(self, key, value):
        with self._lock:
            self._data[key] = value

    @parallel_safe
    @requires(lock_inv(self._lock))
    @ensures(lambda result: lock_inv(self._lock))
    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    @parallel_safe
    @requires(lock_inv(self._lock))
    @ensures(lambda _: lock_inv(self._lock))
    def setdefault(self, key, value):
        """Atomic check-then-set — no TOCTOU race."""
        with self._lock:
            if key not in self._data:
                self._data[key] = value
            return self._data[key]
Verified — TOCTOU eliminated
The setdefault method is a classic source of TOCTOU (time-of-check, time-of-use) bugs in threaded code. The lock makes the check-and-set atomic, and Deppy verifies that no path through the method accesses _data without holding the lock.
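The same fix works with a plain threading.Lock. This hypothetical SafeCache sketch (not Deppy's VerifiedDict) counts computations to show the expensive call happens at most once per key, even under contention:

```python
import threading

class SafeCache:
    """Check-then-set under one lock: the TOCTOU window is closed."""
    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()
        self.computes = 0   # how many times we actually computed

    def get_or_compute(self, key, compute):
        with self._lock:    # check and set form one atomic step
            if key not in self._data:
                self._data[key] = compute(key)
                self.computes += 1
            return self._data[key]

cache = SafeCache()

def worker():
    cache.get_or_compute("k", lambda k: k.upper())

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(cache.computes)  # → 1: computed once despite 8 racing threads
```

One design trade-off worth noting: holding the lock during compute serializes all callers, including those asking for different keys. Per-key locks fix that at the cost of a more intricate invariant.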

Comparison with Pulse

Pulse (F*)

(* Pulse parallel incr *)
fn incr
  (l: lock (exists n. pts_to c n))
  requires lock_alive l
  ensures  lock_alive l
{
  acquire l;
  let v = !c;
  c := v + 1;
  release l;
}

Deppy (Python)

@parallel_safe
@requires(lock_inv(lock))
@ensures(lambda _: lock_inv(lock))
def incr(lock, counter):
    with lock:
        counter.value += 1

# Python's `with` handles
# acquire + release

Key differences:

  • Verification time: Pulse discharges lock invariants statically during typechecking; Deppy evaluates them at runtime, on every acquire and release.
  • Target: Pulse programs extract to C; Deppy programs run on stock CPython.
  • Resource management: Pulse requires explicit acquire and release calls; Python's with statement pairs them automatically, even when the body raises.

Exercise 6.5.1: Implement a ReadWriteLock that allows multiple concurrent readers OR a single writer. Use fractional permissions from Chapter 6.2 — readers hold fractional ownership, the writer needs full ownership.
Exercise 6.5.2: Verify a transfer(src, dst, amount) function that atomically moves amount from one account to another. Both accounts are protected by locks. How do you prevent deadlock? (Hint: lock ordering.)
Exercise 6.5.3: Consider this code:
if key not in shared_dict:
    shared_dict[key] = expensive_compute(key)
Explain the TOCTOU race. Rewrite it using Deppy's VerifiedDict and prove the race is eliminated.
Exercise 6.5.4: Write a thread-safe Stack with push and pop methods. Verify LIFO ordering: if thread A pushes x then y, and thread B pops twice (and no one else pushes), it gets y then x.
Exercise 6.5.5 (Challenge): Implement a verified Barrier(n) that blocks until n threads have all called barrier.wait(). Use ghost variables to track how many threads have arrived and prove that all threads proceed simultaneously.