ordeal — Chaos testing for Python

Your tests pass. Your code still breaks in production. Ordeal finds what you missed.

Why ordeal

Your code works until it doesn't. Ordeal finds the failures you didn't think to test — the crash when two things happen at once, the silent wrong answer on a weird input, the timeout that only happens under load. It tries thousands of combinations no human would write by hand, and when it finds a problem, it hands you the exact steps to reproduce it. One command, real bugs, no test code required.

What ordeal does

You give it your Python code. It gives you back:

What your functions actually do — not what you think they do, but what they are observed to do across hundreds of random inputs
What your tests miss — gaps in coverage, mutations your tests don't catch, edge cases you haven't considered
Exactly what to fix — line numbers, specific inputs, concrete suggestions

No test code to write. No configuration required. Just point and run.

Try it right now

Open a terminal and paste this (uvx runs Python tools without installing them):

uvx ordeal mine ordeal.demo

This analyzes ordeal's built-in demo module. You'll see output like:

mine(score): 500 examples
  ALWAYS  output in [0, 1] (500/500)         ← score() always returns a value between 0 and 1
  ALWAYS  monotonically non-decreasing        ← bigger input = bigger output, always

mine(normalize): 500 examples
  ALWAYS  len(output) == len(xs) (500/500)    ← output is always the same length as input
     97%  idempotent (29/30)                  ← normalizing twice SHOULD give the same result
                                                 ...but ordeal found 1 case where it doesn't

Ordeal called each function hundreds of times with random inputs and told you what's always true — and what isn't. That 97% idempotent is a real finding: there's an edge case where normalize(normalize(x)) gives a different result than normalize(x).
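To make the idempotence finding concrete, here is a minimal sketch of how such a check could be run by hand. It does not use ordeal; `normalize` is a hypothetical stand-in for the demo function and `idempotence_rate` is an illustrative helper, so treat the names and the tolerance as assumptions.

```python
import random


def normalize(xs):
    """Hypothetical stand-in: scale a list so its values sum to 1."""
    total = sum(xs)
    return [x / total for x in xs] if total else list(xs)


def idempotence_rate(fn, inputs, tol=1e-9):
    """Count how often fn(fn(x)) matches fn(x) within a tolerance."""
    held = 0
    for xs in inputs:
        once = fn(xs)
        twice = fn(once)
        same_length = len(once) == len(twice)
        if same_length and all(abs(a - b) <= tol for a, b in zip(once, twice)):
            held += 1
    return held, len(inputs)


rng = random.Random(0)  # fixed seed so the sample is repeatable
samples = [
    [rng.uniform(0.1, 10.0) for _ in range(rng.randint(1, 5))]
    for _ in range(30)
]
held, total = idempotence_rate(normalize, samples)
print(f"idempotent ({held}/{total})")
```

A ratio below 100% points at a specific input worth inspecting, which is the kind of lead the mined report gives you.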

Point it at your code

If your project has a file like myapp/scoring.py, the module path is myapp.scoring — the file path with slashes replaced by dots, without the .py:

uvx ordeal scan myapp.scoring --save-artifacts  # find a bug, save report + regressions
uvx ordeal verify fnd_123456789abc              # re-run one saved finding later
uvx ordeal init myapp                           # bootstrap starter tests for an existing package
uvx ordeal mine myapp.scoring                   # what do my functions actually do?
uvx ordeal audit myapp.scoring                  # what are my tests missing?
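The file-path-to-module-path rule described above can be sketched in a few lines. `module_path` is a hypothetical helper, not part of ordeal, and it assumes a POSIX-style path relative to the project root (it does not handle `__init__.py` or a `src/` layout).

```python
from pathlib import PurePosixPath


def module_path(file_path: str) -> str:
    """Turn a relative file path into a dotted module path."""
    # Drop the .py suffix, then join the remaining path parts with dots.
    path = PurePosixPath(file_path).with_suffix("")
    return ".".join(path.parts)


print(module_path("myapp/scoring.py"))  # myapp.scoring
```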

audit goes further — it generates tests for you, measures coverage, and mutation-tests the result:

myapp.scoring
  migrated:  12 tests |   130 lines | 96% coverage [verified]
  mutation: 14/18 (78%)                   ← ordeal flipped operators in your code;
                                             4 changes went undetected by your tests
  suggest:
    - L42 in compute(): test when x < 0
    - L67 in normalize(): test that ValueError is raised

Those suggest lines are real. Line 42 of compute() behaves differently with negative inputs, and your tests never check that.
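The mutation score in that report follows a simple idea: change one operator, rerun the tests, and count how many changes the tests notice. The sketch below illustrates the idea with the standard `ast` module only. It is not how ordeal implements mutation testing; `grade`, `FlipOne`, and the two test suites are made up for the example.

```python
import ast

SOURCE = '''
def grade(score):
    if score < 50:
        return "fail"
    if score > 90:
        return "top"
    return "pass"
'''

# Each comparison operator is swapped for its off-by-one neighbour.
SWAPS = {ast.Lt: ast.LtE, ast.Gt: ast.GtE}


class FlipOne(ast.NodeTransformer):
    """Flip only the comparison at position `target` (in visit order)."""

    def __init__(self, target):
        self.target = target
        self.index = -1

    def visit_Compare(self, node):
        self.generic_visit(node)
        self.index += 1
        if self.index == self.target and type(node.ops[0]) in SWAPS:
            node.ops[0] = SWAPS[type(node.ops[0])]()
        return node


def load(tree):
    """Compile a (possibly mutated) tree and return its grade() function."""
    namespace = {}
    exec(compile(ast.fix_missing_locations(tree), "<mutant>", "exec"), namespace)
    return namespace["grade"]


def mutation_score(cases):
    """Return (killed, total): mutants that fail at least one test case."""
    sites = sum(isinstance(n, ast.Compare) for n in ast.walk(ast.parse(SOURCE)))
    killed = 0
    for site in range(sites):
        mutant = load(FlipOne(site).visit(ast.parse(SOURCE)))
        if any(mutant(arg) != expected for arg, expected in cases):
            killed += 1
    return killed, sites


WEAK = [(10, "fail"), (70, "pass"), (95, "top")]
STRONG = WEAK + [(50, "pass"), (90, "pass")]  # adds the boundary values

print(mutation_score(WEAK))    # (0, 2)
print(mutation_score(STRONG))  # (2, 2)
```

The weak suite never probes the boundaries, so both mutants survive. Adding the boundary inputs kills them, which is the kind of gap a `suggest:` line is meant to point at.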

Let your AI assistant do it

You don't need to learn ordeal's API. Open Claude Code, Cursor, Copilot, or any AI coding assistant and paste:

"Run uv tool install ordeal to install ordeal. Then run ordeal mine on each module in my project and ordeal audit on the ones with existing tests. Read the output, explain what it found, and fix the issues it suggests."

Or without installing anything:

"Run uvx ordeal mine on my main modules. Show me the output and explain what the findings mean."

ordeal ships with an AGENTS.md — your AI assistant reads it automatically and knows every command, every option, and how to interpret every result.

Install

When you're ready to make ordeal part of your workflow:

pip install ordeal           # or: uv tool install ordeal

Then ordeal mine, ordeal audit, and ordeal explore are available directly from your terminal.

Find what you need

Every goal maps to a starting point — a command to run, a module to import, and a page to read. Nothing is hidden.

I want to...	Start here	Learn more
Capture a bug and lock it in	`ordeal scan mymodule --save-artifacts`	Bug Bundle
Understand why `scan` promoted or demoted a crash	Read the scan finding rules	Scan Finding Rules
Re-run one saved finding	`ordeal verify fnd_123456789abc`	Bug Bundle
Bootstrap tests for an existing package	`ordeal init mymodule`	CLI
Find bugs without writing tests	`ordeal mine mymodule`	Auto Testing
Check if my tests are good enough	`ordeal audit mymodule`	Mutations
Write a chaos test	`from ordeal import ChaosTest`	Getting Started
Inject specific failures (timeout, NaN, ...)	`from ordeal.faults import timing`	Fault Injection
Explore all failure combinations	`ordeal explore`	Explorer
Reproduce and shrink a failure	`ordeal replay trace.json`	Shrinking
Add fail-safe gates to production code	`from ordeal.buggify import buggify`	Fault Injection
Make assertions across all runs	`from ordeal import always, sometimes`	Assertions
Control time / filesystem in tests	`from ordeal.simulate import Clock`	Simulation
Compare two implementations	`ordeal mine-pair mod.fn1 mod.fn2`	Auto Testing
Test API endpoints for faults	`from ordeal.integrations.openapi import chaos_api_test`	Integrations
Extend ordeal with a new fault	Follow the pattern in `ordeal/faults/*.py`	Fault Injection
Configure reproducible runs	Create `ordeal.toml`	Configuration
See the next functionality-coverage priorities	Read the roadmap	Roadmap
Inspect every capability before choosing a tool	`ordeal catalog --detail`	API Reference
Discover all available faults, assertions, strategies	`from ordeal import catalog; catalog()`	API Reference
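The "assertions across all runs" row is easier to picture with a toy model. The sketch below is an assumption about the general semantics only: an `always` condition must hold every time it is checked, while a `sometimes` condition must hold at least once over the whole session. `PropertyTracker` is hypothetical and does not reflect ordeal's actual API or signatures.

```python
class PropertyTracker:
    """Toy model of cross-run assertions (hypothetical, not ordeal's API)."""

    def __init__(self):
        self.always_failures = set()  # names whose condition was ever False
        self.sometimes_hits = {}      # name -> True once the condition held

    def always(self, condition, name):
        if not condition:
            self.always_failures.add(name)

    def sometimes(self, condition, name):
        seen = self.sometimes_hits.get(name, False)
        self.sometimes_hits[name] = seen or bool(condition)

    def report(self):
        never_seen = [n for n, hit in self.sometimes_hits.items() if not hit]
        return {
            "always_violated": sorted(self.always_failures),
            "sometimes_never_seen": never_seen,
        }


tracker = PropertyTracker()
for queue_depth in [0, 3, 7, 2]:  # stand-in for values observed across runs
    tracker.always(queue_depth >= 0, "depth is never negative")
    tracker.sometimes(queue_depth > 5, "queue gets busy")
    tracker.sometimes(queue_depth > 100, "queue overflows")

print(tracker.report())
```

Here the report flags "queue overflows" as never observed, which in a real session would suggest the test never reached the state it was meant to exercise.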

Pick your starting point

Every path leads somewhere useful — pick whichever matches what you need right now.

"I just want to see what ordeal does" → Run uvx ordeal mine ordeal.demo in your terminal, then read Getting Started
"I have code and want to find bugs" → Run ordeal mine mymodule — see Auto Testing
"I want to write chaos tests for my service" → Start with Getting Started, then Writing Tests
"I want to understand the ideas behind ordeal" → Read Philosophy, then the Concepts
"I need to check if my tests are any good" → Run ordeal audit — see Mutations
"I want to run ordeal in CI" → See the Explorer guide and Configuration
"I want to explore the source code" → See the Architecture section in the README for a full code map

Start here

Philosophy

Why ordeal exists. What problem it solves. Why it matters for the future of code quality.

Getting Started

Write your first chaos test in 5 minutes. From install to your first failure.

Understand

Chaos Testing

What is chaos testing? Faults, nemesis, swarm mode — explained from the ground up.

Coverage Guidance

How the explorer finds bugs: edge hashing, checkpoints, energy scheduling.

Property Assertions

always, sometimes, reachable, unreachable — the Antithesis assertion model.

Fault Injection

External faults, inline buggify, the FoundationDB model — and when to use each.

Shrinking

How ordeal minimizes failures: delta debugging, step elimination, fault simplification.
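Step elimination, one of the shrinking techniques named above, can be sketched as a greedy loop: drop one step at a time and keep the shorter trace whenever the failure still reproduces. This is a simplified illustration rather than ordeal's shrinker or full delta debugging; `shrink_steps` and the example failure condition are hypothetical.

```python
def shrink_steps(steps, still_fails):
    """Greedily remove steps while the failure still reproduces."""
    current = list(steps)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                current = candidate
                changed = True
                break  # restart the scan on the shorter trace
    return current


def write_after_disconnect(steps):
    """Example failure: a write happens at some point after a disconnect."""
    if "disconnect" not in steps:
        return False
    return "write" in steps[steps.index("disconnect"):]


trace = ["connect", "read", "disconnect", "read", "write", "close"]
print(shrink_steps(trace, write_after_disconnect))  # ['disconnect', 'write']
```

The result is minimal in the sense that removing any single remaining step makes the failure disappear.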

Use

Explorer — Run and configure coverage-guided exploration
Writing Tests — Patterns for effective chaos tests
Auto Testing — Zero-boilerplate: scan_module, fuzz, mine, diff, chaos_for
Simulation — Deterministic Clock and FileSystem
Mutations — Validate that your tests catch real bugs
Integrations — Atheris fuzzing, built-in API chaos testing

Reference

CLI — ordeal scan, verify, init, explore, replay, pytest --chaos
Configuration — ordeal.toml schema and tuning
API Reference — Every function, every parameter, every type
Troubleshooting — Common issues and how to fix them

What ordeal brings together

Capability	Idea	Origin
Stateful chaos testing	Nemesis toggles faults while Hypothesis explores interleavings	Jepsen + Hypothesis
Coverage-guided exploration	Checkpoint interesting states, branch from productive ones	Antithesis
Property assertions	`always`, `sometimes`, `reachable`, `unreachable`	Antithesis
Inline fault injection	`buggify()` — no-op in production, fault in testing	FoundationDB
Boundary-biased generation	Test at 0, -1, empty, max — where bugs cluster	Jane Street
Mutation testing	Verify tests catch real code changes	Meta ACH
Differential testing	Compare two implementations on random inputs	Regression testing
Property mining	Discover invariants from execution traces	Specification mining
Metamorphic testing	Check output relationships across transformed inputs	Metamorphic relations
Network faults	HTTP errors, rate limiting, DNS failure, connection reset	Real-world API failures
Concurrency faults	Lock contention, thread boundaries, stale state	Thread-safety testing
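The inline fault injection row describes a gate that is a no-op in production and a fault in testing. A minimal sketch of that pattern, assuming a seeded random gate, might look like the following. `FaultGate`, `fault_gate`, and `fetch` are hypothetical names for illustration and are not ordeal's `buggify` implementation.

```python
import random


class FaultGate:
    """Returns False unless explicitly activated for a test run."""

    def __init__(self):
        self.active = False
        self.probability = 0.0
        self.rng = random.Random()

    def activate(self, seed, probability=0.25):
        # A fixed seed makes the sequence of injected faults repeatable.
        self.active = True
        self.probability = probability
        self.rng = random.Random(seed)

    def __call__(self):
        return self.active and self.rng.random() < self.probability


fault_gate = FaultGate()


def fetch(value):
    """Production code path with an inline fault point."""
    if fault_gate():
        raise TimeoutError("injected fault")
    return value


# Inactive (the production default): the gate never fires.
print([fetch(i) for i in range(3)])  # [0, 1, 2]

# Active with a seed: the same seed yields the same fault decisions.
fault_gate.activate(seed=42, probability=0.5)
first = [fault_gate() for _ in range(10)]
fault_gate.activate(seed=42, probability=0.5)
second = [fault_gate() for _ in range(10)]
print(first == second)  # True
```

Keeping the gate inside the production code path means the test exercises the real error handling rather than a mock of it.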

Install

pip install ordeal           # core
pip install ordeal[all]      # everything
uv tool install ordeal       # CLI tool