Contrax: Lightweight and extensible Design by Contract

The world is producing code — the realization of intelligence — at an exponentially increasing rate. Most of it is full of bugs. Contrax is a living example of that. It is also an attempt at a solution.

Most of the code being produced is hard to audit for bugs. There's a lot of it. Much of it is convoluted. Often, that's unintentional.

The amazing thing we're seeing with the rise of Artificial Intelligence is that many of the behaviors we assumed were because of sinful humanity — cutting corners, being lazy, not thinking problems through fully — may actually be inherent characteristics of intelligence.

I had a conversation with an AI last year that gave the wrong answer for calculating 20% of a quantity of milligrams. I had conniptions for a week that the ONE THING computers should always do better than humans had been corrupted by shitty models. Anyone who has actually worked with AI agents knows they can produce a mountain of code in a way that would make the Infinite Monkey Theorem proud.

I decided to test the limits of what I could accomplish with that AI by diving into a space I'd have written off as too much of a time sink a year ago: implementing Design by Contract as a compiler plugin. To understand why that was worth my weekend, we have to go back forty years.

The man and the book

Bertrand Meyer formalized the concept of Design by Contract as a way to produce higher-quality software. The best treatment of the topic is Chapter 11 of the second edition of his Object-Oriented Software Construction — a good read, if a bit dense at times. But the best part about it is that Professor Meyer offers it for free himself because HE'S A FUCKING MENSCH, check it out here.

Cover of Object-Oriented Software Construction, Second Edition, by Bertrand Meyer
Bertrand Meyer, Object-Oriented Software Construction, 2nd ed. (1997) — Chapter 11 is the definitive treatment of Design by Contract.

Design by Contract uses types and predicates to assert conditions about software. Unfortunately, his contributions were mainly locked away in Eiffel: a programming language no one knows and no one uses. As Richard Gabriel taught us, Worse is Better — hence the rise of JavaScript/TypeScript, Node, and React. Despite Eiffel having genuinely interesting and advanced ideas about how to think about and construct software, it has been "too ivory tower." So we sit down at our typewriters and hope we eventually bang something like it out.

When I started writing this, I thought Meyer's book was one of those interesting-but-forgotten bits of 1980s history. I have it on my shelf, so I pulled it down to cite my sources — and the first thing that struck me was the date: 1997. That's after Jurassic Park. Korn had already released their first two albums. I almost left it at "huh, later than I thought." Something seemed off, I remember hearing that the concept was first derived in the 1980s. So I polished off my old literature review skills.

The literature review

Design by Contract is the engineering of a much older logic. Robert Floyd attached inductive assertions to program points in 1967; Tony Hoare turned them into a calculus — the triple {P} S {Q}, precondition, statement, postcondition — in his 1969 Axiomatic Basis for Computer Programming; Dijkstra pushed it further with the weakest-precondition calculus in 1976. Meyer didn't invent that logic. What he did was productize it:

  • He coined "Design by Contract" and first explicated it in 1986, built into Eiffel — a language almost no one would ever run.
  • He shared it with the broad world in 1992, in the paper "Applying 'Design by Contract'" — published in Computer, the IEEE Computer Society's flagship magazine (vol. 25, no. 10, pp. 40–51) — the version most working engineers actually encountered.
  • He gave it its most thorough treatment in 1997, the second edition of Object-Oriented Software Construction, Chapter 11 — the brick on my shelf. (The first edition, with a lighter treatment, was 1988.)

One more paper belongs here, because contrax leans on it directly: Barbara Liskov and Jeannette Wing formalized how contracts behave under inheritance in 1994 — "A Behavioral Notion of Subtyping." A precondition may only weaken in a subclass; a postcondition may only strengthen. That is the "L" in SOLID. It is exactly the rule contrax follows when it inherits a contract down an override hierarchy.

Floyd, Hoare, Dijkstra, Meyer, Liskov — some of the most decorated names in the field built up the foundations for Design by Contract across thirty years. But it never made its way as a built-in, supported concept in the languages most people actually write: Java and Python still have nothing of the sort, and C++ is only getting it now — in C++26, nearly four decades after Meyer. The places it was fully realized — Eiffel, where Meyer built it; Ada, which standardized Pre/Post aspects in 2012; Clojure, with its :pre/:post conditions — are all languages that don't even crack the top five of Stack Overflow's annual developer survey (Eiffel isn't even on the list). The mainstream industry ignored it. That's the actual indictment.

Why did so many ignore such a valuable concept? Richard Gabriel is indispensable. In his tour de force The Structure of a Programming Language Revolution — the title a deliberate echo of Kuhn's Structure of Scientific Revolutions — Gabriel makes a hard claim. Our understanding of object orientation did not evolve. It suffered an incommensurable paradigm shift, in Kuhn's strict, unforgiving sense of the word. The older conception was programming as living systems. The Smalltalk image. The Lisp machine. Objects as message-receiving agents. The formalist languages paradigm supplanted it: objects-as-abstract-data-types, crystallized around 1990 by William Cook and Gilad Bracha. Incommensurable means exactly what it says. The new generation cannot read the old papers as their authors meant them. The old academic understanding of OOP wasn't refined. It was buried.

That divide rears its head again here. Meyer's contracts got lost in the shuffle. The field that inherited the narrower one had nowhere to put them.

Object orientation "came of age" in the 1990s. While the whole country watched Bill Clinton play word games about what the definition of "is" is, engineers and researchers were playing word games of their own about what it meant to be "object-oriented." Hence, most of the things we learned in school are fucking wrong — not because monied interests buried the truth (though ask the Lisp people; Sun and NEC and Microsoft and Apple certainly weren't building Lisp Machines), but because the discipline was so young that very few had figured out how to do it right. AIs in 2026 still don't do it right, which is one reason so much of the code they generate is shit.

Why not before now?

Implementing Design by Contract is a compiler problem. I haven't worked that low-level in over 20 years. I wrote a compiler in Debra Calliss's CSE340 class 18 years ago, for her simplified Pascal-minus-minus, like most computer-science students did when I was an undergrad — so I know about Abstract Syntax Trees and parsing. But I hadn't "played with them in anger" for almost half my life now.

In a recent consulting engagement my client was insistent that I use Claude Code — so insistent he offered to pay for it. I'll forever be grateful. Like most stodgy old software engineers, I was skeptical. I have been programming seriously for 25 years; I don't just do it as a job, I do it for fun, and I didn't want to think an AI could do it better.

It can't. But it can make me faster, and it can empower me to do things I'd written off as "too difficult" and "too time-consuming" — like implementing a skeleton of Design by Contract as a plugin to most of the languages I work in (or may be forced to work in by market conditions).

What is the contract?

The fundamental concept of Design by Contract is, as one would expect, The Contract. But this isn't just an interface or an API specification. The core ideas are Preconditions and Postconditions, with the advanced topic being Class Invariants.

When I learned about preconditions and postconditions in Renee Turban's CSE210 Data Structures and Algorithms class in 2004 (AAAAAGGGGHHH), we implemented them by stating them in the comments of method headers and testing for them by hand:

/**
 * Precondition:  x != null && y != null
 * Postcondition: returns x + y, never null
 */
Integer add(Integer x, Integer y) {
    Objects.requireNonNull(x);
    Objects.requireNonNull(y);
    return x + y;
}

The idea is that this helps consumers know what to expect of the behavior without reading the code, and helps implementers specify precisely the behaviors they will support.

This understanding is completely wrong — and to see why, we have to go back to what Meyer actually wrote.

The Non-Redundancy Principle

Here is Meyer on preconditions and postconditions, with the Non-Redundancy Principle, and a special emphasis (mine) on his critique of Defensive Programming:

Non-Redundancy Principle: Under no circumstances shall the body of a routine ever test for the routine's precondition.

This rule is the reverse of what many software engineering or programming methodology textbooks advocate, often under the name of defensive programming – the idea that to obtain reliable software you should design every component of a system so that it protects itself as much as possible. Better check too much, this approach holds, than not enough; one is never too careful when dealing with strangers. A redundant check might not help, but at least it will not hurt.

Design by Contract follows from the opposite observation: redundant checks can and indeed will hurt. Of course this will at first seem strange; the natural reaction is to think that an extra check – for example routine sqrt containing the above conditional instruction testing for x < 0 even though callers have been instructed to ensure x >= 0 – may at worst be useless, but cannot possibly cause any damage. Such a comment, however, comes from a microscopic understanding of reliability, focused on the individual software elements such as the sqrt routine. If we restrict our view to the narrow world of sqrt, then the routine seems more robust with the extra test than without it. But the world of a system is not related to a routine; it contains a multitude of routines in a multitude of classes. To obtain reliable systems we must go from the microscopic view to a macroscopic view encompassing the entire architecture.

If we take this global view, simplicity becomes a crucial criterion. As we noted at the beginning of the chapter, complexity is the major enemy of quality. When we bring in this concern, possibly redundant checks do not appear so harmless any more! Extrapolated to thousands of routines of a medium-size system (or tens of thousands of routines of a larger one), the if x < 0 then … of sqrt, innocuous at first sight, begins to look like a monster of useless complexity. By adding possibly redundant checks, you add more software; more software means more complexity, and in particular more sources of conditions that could go wrong; hence the need for more checks, meaning more software; and so on ad infinitum. If we start on this road only one thing is certain; we will never obtain reliability. The more we write, the more we will have to write.

To avoid this infinite chase, we should never start it.

What a powerful fucking section. This is what Bob Martin was talking about in Clean Code. This is why Antoine de Saint-Exupéry said "Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away." My favorite formulation — the one I had above my desk at Amazon — is Buckminster Fuller's: "When I am working on a problem, I never think about beauty… but when I have finished, if the solution is not beautiful, I know it is wrong." This principle drove how I built contrax.

Cheap Enforcement

How does Design by Contract enforce a contract without defensive checks in every routine body? The same way the type Integer in add already saved you from checking whether the input was the string "fuck you": the system helps you. A compiler and types are two ways we try to achieve higher-quality software.

The fundamental insight of Design by Contract is to push the compiler (and, in Eiffel, the runtime) to do more for you — like enforce the contracts.

I'll readily admit I'm not treading on new ground, nor am I doing full Design by Contract. Two projects sit nearby, and it's worth comparing contrax with them.

Cofoja — Contracts for Java — is the real, explicit thing. @Requires, @Ensures, @Invariant, enforced at runtime through annotation processing and bytecode instrumentation. Nhat Minh Le did careful, genuinely good work on it. But it's the issues that give me pause. The last release, v1.3, was cut in 2016. It was built for the Java 6 and 8 world. The tracker shows modern users running aground as the language moved on — records, the module system, newer JDKs. Solid work that simply hasn't kept pace with a language that refused to sit still.

The Checker Framework is the other reference point, and it is genuinely powerful — but it isn't really a Design by Contract framework. It is a pluggable type-checking framework. Contract-style guarantees fall out of writing custom type qualifiers. Apart from the obvious Nullness Checker, preconditions and postconditions aren't what it puts forward. It is also large — a 40-plus-chapter manual — and academic, and complex. That complexity is partly intrinsic: full Design by Contract, class invariants and all, drifts toward data-flow analysis and AspectJ-style enforcement.

  Cofoja Checker Framework Contrax
Markets itself as DbC Yes — explicitly No — a type-checker Yes
Pre/postconditions @Requires / @Ensures via type qualifiers @PreCondition / @PostCondition
When checked Runtime (bytecode woven) Compile-time (type-checking) Compile-time (literals)
Class invariants Yes (@Invariant) via the type system No (on purpose)
Last release v1.3, 2016 active 2026, modern Java
Surface area moderate 40+ chapter manual 733 lines, 5 modules

So this isn't "contrax beats either of them." It's a question of the Pareto Principle. Cofoja and the Checker Framework are each years of work and many thousands of lines — in one language. I built contrax's 80% — null, blank, positive, range, and whatever literal checks you write yourself — in a weekend. It runs today in modern Java, Kotlin, and TypeScript, covering the simple cases that buy most of the value, in 733 lines of Java and four lines of build configuration. Nothing to learn. Nothing abandoned underneath you. The cases it doesn't reach are real. I'm open to extending contrax to cover them, but not at revision 1.

Contrax: Pareto Principle DbC in a weekend

Contrax focuses on Preconditions and Postconditions, which we can evaluate at compile time. At method entry and method exit, I ask one question: "do the inputs and outputs conform to the contract?" I do this in Java with annotations and an annotation processor — javac's compiler-plugin model. I provide three default contracts that almost everybody needs — null checking, blank-string checking, and positive-number checking. I built contrax with a plugin mechanism so users can supply their own enforcements. The whole point of what follows is that none of it costs anything at runtime, and the failure lands where it's cheapest to fix.

The contract is an annotation; the violation is a compile error

You declare the obligation on the parameter, where it becomes part of the signature. When a caller hands a literal that breaks it, the build stops — and the diagnostic lands on the caller's line, not the callee's parameter:

class Subject {
    void score(@PreCondition(IsNotNull.class) Object value) {}
}

subject.score(null);   // the billion-dollar mistake, written as a literal
error: @PreCondition violated at parameter #1 ('value' of type java.lang.Object) of Subject.score
    subject.score(null);
                  ^

A diagnostic, at compile time, pointing at the exact expression that broke the promise. Postconditions work the same way, landing on the offending return.

Where contrax stops

Contrax enforces the subset a compiler can prove soundly: literals. A null literal, an out-of-range int literal, a blank String literal — those are knowable at compile time, and contrax fails the build on them.

Here's the sharpest illustration of where that line falls — the case that looks like it should be caught and isn't:

add(null, 0);                       // NULL_LITERAL → compile error

final Integer n = null;
add(n, 0);                          // IDENTIFIER   → compiles clean

Both pass null to a @PreCondition(IsNotNull.class) parameter; contrax fails the build on the first and compiles the second clean. It evaluates the literal at the call site, not the variable behind it: the null literal is knowable now, so the build breaks; n is a variable, and tracing it back to its initializer is data-flow analysis contrax doesn't do. The variable still throws at runtime — that's the guard's job, not the compiler's.

The literal cases already pay for themselves: they delete the null-check boilerplate I was tired of writing, and they give me a seam to hang every other contract on. Soundness over completeness — contrax never raises a violation it cannot prove, and it adds zero runtime checks. Move the cheap, provable part of the contract to compile time, and get out of the way.

Bring your own enforcement

Each built-in is a few lines that inspect the literal and call violate(). Writing your own is a @FunctionalInterface and a ServiceLoader registration — no fork, no processor changes. Here's one of the built-in enforcements:

public final class IsNotNull implements Enforcement {
    @Override
    public void enforce(final CallSite site) {
        if (site.expression().getKind() == Tree.Kind.NULL_LITERAL) {
            site.violate();
        }
    }
}

Because @PreCondition can be a meta-annotation, a third party can ship a parameterized contract that carries its own data and resolves to an enforcement. range-check is exactly that as an example:

@PreCondition(InRange.class)            // @RangeBounds IS-A precondition, via meta-annotation
@Target({ElementType.PARAMETER, ElementType.METHOD})
public @interface RangeBounds { long lower(); long upper(); }

void setOpacity(@RangeBounds(lower = 0, upper = 100) int percent) {}

setOpacity(150);   // error: @RangeBounds violated at parameter #1 … : 150 not in [0, 100)

Five modules, each its own repository

Monorepos tempt bad dependency hygiene — they force modules that aren't changing to re-version alongside ones that are.

So contrax is atomized: every module is an independently versioned repo, consuming its upstreams as published Maven coordinates.

  • annotations — the SPI everyone reads: @PreCondition, @PostCondition, @Contract, the Enforcement interface, the sealed CallSite, the Details substrate. The contract annotations are @Documented, so a consumer's obligations surface in their generated Javadoc.
  • enforcements — the built-ins: IsNotNull, IsNotBlank, IsPositive.
  • processor — the AST driver: an AbstractProcessor running a TreePathScanner at ANALYZE time — and scanning then, rather than during the processing round, is itself the workaround for a long-open javac bug I detail below.
  • range-check — the "third-party plugin" above, in 60 lines, demonstrating the extension story end to end.
  • apt — general annotation-processing utilities with no knowledge of contrax. It earns its own section.

Getting started

Java 17+, Gradle 8+ or Maven 3.9+. Four lines:

// build.gradle.kts
dependencies {
    api(libs.contrax.enforcements)          // @PreCondition + the built-ins (transitively pulls annotations)
    annotationProcessor(libs.contrax.processor) // runs the checks at compile time
}

The reference implementation is Java. It is also ported to pure Kotlin (K2/FIR) and TypeScript, with Dart and Rust planned — and maybe soon your favorite language.

apt: The library that fell out

Walking method-override hierarchies and computing the transitive closure of meta-annotations are not contract-checking problems. They are annotation-processing problems that any APT author hits. Leaving them tangled inside the processor would have been a private hack. So they became software.visionary:apt — three singleton enums that are functions over the javax.lang.model mirror API:

// The override hierarchy of a method — itself plus every super-method it
// transitively overrides — in encounter order. A BiFunction you can pass anywhere.
public enum OverrideHierarchy
        implements BiFunction<ExecutableElement, ProcessingEnvironment, SequencedSet<ExecutableElement>> {
    INSTANCE;
    // ...
}

AllMetaAnnotations returns the cycle-protected transitive closure of meta-annotations reachable from a mirror; MethodInvocation reifies the (executable, [argument]) zip at a call site so a processor never hand-rolls that index-matching again. None of it knows what a contract is. All of it is yours, for any annotation processor you ever write. This is the same instinct that produced record-jar and cookie-jar out of the Language Encounters Center: doing the job right means leaving behind reusable things, not private hacks.

Fucked by a javac bug

To check a contract, the processor has to resolve the argument you wrote at a call site to the Element it refers to — which means calling Trees.getElement on a TreePath. The obvious place to do that is the annotation-processing round, inside process(). Unfortunately, the obvious place is wrong. Finding out why cost me a day and a bug report.

During a processing round, the trees inside method bodies are not yet fully attributed. Calling Trees.getElement on such a path forces javac's Attr phase to attribute the enclosing method on the fly. For a record's canonical constructor, that on-the-fly re-attribution trips a bug filed as JDK-8006042, open since 2013. Attr re-attributes against an internally-copied tree. The enclosing Env still points at the original. The canonical-constructor validator's identity check for the desugarer's implicit super() fails against the copy. Javac then concludes the body contains an explicit constructor invocation. It then rejects a record that contains no such thing:

public record Subject(String s) {
    public Subject {                       // a compact canonical constructor…
        if (s != null) s = s.trim();       // …with one method call in its body
    }
}
error: invalid canonical constructor in record Subject
    public Subject {
           ^
  (canonical constructor must not contain explicit constructor invocation)

There is no this(...) and no super(...) anywhere. The proof that it's the processor and not the source is bit-identical: vanilla javac compiles that record clean; add a scan-only processor that does nothing but call Trees.getElement on the paths it visits, and the same bytes fail to compile. The processor mutates nothing.

The severity has been quietly escalating — before records this surfaced as a misleading warning. Now, javac's record canonical-constructor validator (JDK 16+) promotes it to a hard compile error, which is why Lombok hit the same Attr-on-copied-record-constructor path.

The workaround is to stop resolving during the processing round entirely. Contrax registers a TaskListener and runs its scan in the finished callback for TaskEvent.Kind.ANALYZEafter javac has fully attributed every tree in the compilation unit. By then Trees.getElement just returns the cached Element; the re-attribution path never runs; the bug never fires. That listener — ContractCheckTaskListener — is the entire reason the shipping processor is 142 lines instead of its minimal 111.

The full minimal reproducer — two files, no build tool, with the vanilla-versus-processor diff — is written up and ready to file, in the repo as JDK-BUG-REPORT.md. Unfortunately, filing JDK bugs is itself a fucking gauntlet, so I didn't pursue it after a cursory web search.

Doing the work right meant not just dodging the bug but documenting it for whoever hits it next.

Master and novice

As I said at the beginning, using Claude gave me the confidence to take on something like this. I hadn't dived deep into the Trees API, it gave me some running code so I could say "ah, that's how that works." But the difference between a software engineer using it as a force multiplier and vibe coder just praying for answer is in what comes next.

I let it build the processor, it produced a confident, plausible, six-class architecture. Then I made it dissolve that architecture, because of everything Meyer said above. The Non-Redundancy Principle is fractal: it is about runtime checks multiplying across a system, but the same disease — more code breeding more code breeding more sources of error — applies to the framework's own guts. The novice accepts the mountain. The master knows the mountain is the bug.

I have the whole git history, measured at every milestone — leanness via scc, smells and coverage via SonarQube. Watch the processor climb to the peak, then collapse:

processor — code lines per milestone. Bar length is code LOC; the climb to the peak, then the collapse.
  • first functional137 LOC · cx5
  • + @PostCondition197 LOC · cx9
  • generalized (tangled)229 LOC · cx25
  • PEAK: "massage"261 LOC · cx23
  • collapse MethodElement224 LOC · cx18
  • walkers → apt175 LOC · cx14
  • final shape111 LOC · cx6
  • release142 LOC · cx10

cx = cyclomatic complexity. Peak in red, dissolved final shape in green.

There are two worst points, and they're different worsts. Complexity density peaked at the generalized-but-tangled commit — cyclomatic complexity 25, the AI's confident first architecture, more branching crammed per line than any shape before or since. The size peaked one commit later: a commit I named, "massage Claude's code into less shit." Like most refactoring, it had to look worse before it got better. I started by spreading that tangle across seven files and 261 lines. Then I made it dissolve. Peak to final shape: 261 → 111 code-LOC (−58%), cyclomatic complexity 23 → 6 (−74%). That is Meyer, Saint-Exupéry, and Fuller, made arithmetic.

The shipping release is 142 lines — the extra 31 over the minimal shape are the TaskListener I had to add to dodge a long-open javac bug, but the core architecture could be simpler without the workaround — I already took it there.

The project as a whole did not shrink. Measured against the retired monorepo's last commit, built and scanned:

Module Before (monorepo) After (release) What happened
processor 228 LOC · cx23 142 LOC · cx10 the tangle dissolved
annotations 32 LOC · 4 annotation types 214 LOC · full SPI grew — absorbed the dispatch primitive
enforcements 33 LOC · 31% coverage 74 LOC · 100% coverage grew + got real tests
apt did not exist 241 LOC · net-new fell out as a reusable library

One 228-line tangle of analyzers and walkers became four single-responsibility modules — none over complexity 20, every one at 100% line and branch coverage, zero bugs, zero vulnerabilities, zero real code smells, A/A/A on SonarQube. The capability grew while it happened: meta-annotations, override-hierarchy inheritance, parameterized contracts, and a reusable APT toolkit that didn't exist before. The real headline isn't "less code." It's "the core collapsed, the capability expanded, and the reusable parts left as their own libraries."

That's the difference between a master and a novice driving the same machine. The machine will hand you a mountain. The novice will give it to you to climb. The master will turn it into a molehill.

Roadmap

Already shipped: Java (the reference), Kotlin (K2/FIR — and Kotlin's null soundness makes IsNotNull redundant there, which is a compliment to Kotlin and a smaller box for contrax), and TypeScript. Planned: Dart, Rust, and whatever the community asks for next.

More provable cases. Anything a compiler can know from a literal or constant expression is fair game — string formats, regex literals, enum exhaustiveness at a seam. The SPI is open; the built-ins grow as real consumers ask for them, not before.

Why it matters

Design by Contract has been correct and available since before most working programmers were born — Floyd in 1967, Hoare in 1969, Meyer from 1986. Yet the industry still checks its contracts at runtime, in front of users, with stack traces that blame the wrong line. We call that normal. It is not normal; it is habitual. Modern compilers and annotation processing can dissolve the provable part of that failure class entirely, at zero runtime cost, with no false alarms — and be clear about the part they can't reach.

Contrax is small, fast, free, and finished. I built it because I needed to stop writing the same null guard for the thousandth time; I open-sourced it because the bug it catches is the most expensive, most embarrassing, most preventable one in our field. I showed you the git history because the most valuable thing I do is not generating code — the machines are good at that now. It's knowing when the confident first draft is a tangle, and having the taste and the discipline to make it disappear.

If you want software that is simple, correct, and proven from someone who builds it that way every day, this is what that looks like. You can contact me at nico@visionary.software with all your cheers, boos, and beers. I'd be happy to build the next one with you, or discuss licensing these for your commercial work.

The contract was always part of the type. We just kept finding out too late. I will keep building things that find out now.