Troubleshooting¶
Find your symptom, follow the fix
Something not working? You're probably not alone. Scan the headings below to find your symptom -- each section explains why it happens and exactly how to fix it. Issues are ordered from most common (explorer configuration) to most specific (simulation edge cases). If nothing here matches, check the Getting Help section at the bottom.
Common issues and how to fix them.
The explorer runs but finds 0 failures¶
Think of it this way
Zero failures can mean your code is solid -- or that the explorer isn't looking in the right places. The edge count in your report is the clue: if edges plateau early, the explorer is stuck in shallow territory and needs more faults, more rules, or broader target_modules to see deeper.
This can mean your code is correct under fault conditions — or that the explorer isn't reaching interesting states.
Check edge count. If edges plateau early (say 10-20), the explorer is stuck in shallow territory.
- Add more target_modules to expand coverage tracking
- Increase steps_per_run to let runs go deeper
- Add more faults: the explorer needs faults to create interesting state combinations
- Check that your target_modules paths are correct (they match by path segment, not substring)
Check your faults. If all faults target the same function, the explorer can't create many combinations. Spread faults across different dependencies.
Check your rules. If you only have one rule, there aren't many interleavings to explore. Add rules for different operations your system supports.
The explorer runs but finds too many failures¶
If every run fails, the explorer can't explore — it's drowning in noise.
Handle expected exceptions in rules. When a timeout fault is active, TimeoutError is expected. Catch it in your rule:
@rule()
def call_service(self):
    try:
        result = self.service.call()
    except TimeoutError:
        return  # expected when timeout fault is active
    always(result is not None, "result exists")
Enable swarm mode. With swarm = True, each run uses a random subset of faults instead of all of them. This reduces noise and lets the explorer find specific fault combinations that cause real bugs.
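To make the idea of "a random subset of faults per run" concrete, here is a minimal sketch. It is not ordeal's implementation: pick_swarm_subset and the fault names are hypothetical, and the assumption that every run keeps at least one fault is ours.

```python
import random


def pick_swarm_subset(faults, rng):
    """Return a random non-empty subset of faults for a single run.

    Hypothetical helper: ordeal's own selection logic may differ.
    """
    size = rng.randint(1, len(faults))  # assumption: at least one fault per run
    return rng.sample(faults, size)     # sample without replacement


# Illustrative fault names only
faults = ["timeout", "disk_full", "conn_reset", "slow_dns"]
rng = random.Random(42)

# Each run sees a different mix, so a failure points at a smaller set of suspects
runs = [pick_swarm_subset(faults, rng) for _ in range(5)]
for number, active in enumerate(runs, start=1):
    print(f"run {number}: {sorted(active)}")
```

Because each run activates only part of the fault list, a failing run narrows down which combination is responsible.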
Reduce faults. Start with 2-3 faults and add more once your base test is stable.
"Cannot import" error with ordeal explore¶
The class path in ordeal.toml must be importable from the working directory. Check:
- Is the module path correct? Format: "module.path:ClassName"
- Is your working directory the project root?
- Is the package installed or on PYTHONPATH?
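One way to check a class path outside of ordeal is to resolve it by hand. The sketch below assumes the "module.path:ClassName" format described above and uses only importlib; resolve_class_path is a hypothetical helper, not ordeal's loader.

```python
import importlib


def resolve_class_path(spec):
    """Resolve a 'module.path:ClassName' string to the class object.

    Hypothetical helper for debugging; ordeal's own loader may behave differently.
    """
    module_path, sep, class_name = spec.partition(":")
    if not sep or not module_path or not class_name:
        raise ValueError(f"expected 'module.path:ClassName', got {spec!r}")
    # Raises ModuleNotFoundError if the module is not importable from here
    module = importlib.import_module(module_path)
    # Raises AttributeError if the class name is misspelled
    return getattr(module, class_name)


# A standard-library class stands in for your own test class
resolved = resolve_class_path("collections:OrderedDict")
print(resolved)
```

Running this from the project root with your own path shows whether the failure is in the module part (import error) or the class part (attribute error).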
Hypothesis shrinking takes too long¶
What's happening
When ordeal finds a failure, it tries to simplify the failing sequence to the shortest possible reproduction. With many faults and long rule chains, this can take minutes. You don't have to wait — skip shrinking during exploration and do it later on just the traces that matter.
Shrinking is Hypothesis finding the minimal reproducing example. It can be slow with many faults and long rule sequences.
- Use --no-shrink during exploration: ordeal explore --no-shrink
- Shrink post-hoc: ordeal replay --shrink trace.json
- Set a time limit: the Explorer's max_shrink_time parameter (default 30s)
- Reduce steps_per_run to produce shorter traces
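To see why shrinking cost grows with trace length, and what a time limit buys you, consider this simplified sketch. It is a greedy one-step-at-a-time reducer with a deadline, not the algorithm Hypothesis or ordeal actually uses; shrink_trace and still_fails are hypothetical names.

```python
import time


def shrink_trace(trace, still_fails, max_seconds=30.0):
    """Greedily drop steps while the failure still reproduces.

    Simplified illustration only: real shrinkers are far more sophisticated.
    """
    deadline = time.monotonic() + max_seconds
    current = list(trace)
    index = 0
    while index < len(current) and time.monotonic() < deadline:
        candidate = current[:index] + current[index + 1:]
        if still_fails(candidate):
            current = candidate  # step was irrelevant, keep the shorter trace
        else:
            index += 1           # step is needed, move on to the next one
    return current


# Hypothetical failure: it only needs steps "b" and "d" to reproduce
def still_fails(steps):
    return "b" in steps and "d" in steps


minimal = shrink_trace(["a", "b", "c", "d"], still_fails)
print(minimal)
```

Every candidate requires re-running the test, so longer traces mean more re-runs, and the deadline caps how long the loop may keep trying.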
buggify() always returns False¶
Why this matters
buggify() is designed to be safe in production -- it does nothing unless you explicitly turn it on. This is a feature, not a bug. You need either --chaos, auto_configure(), or activate() to make it active. See ordeal/buggify.py for the activation logic.
buggify() is a no-op unless explicitly activated. Check:
- Are you running with --chaos? (pytest --chaos)
- Or did you call auto_configure()?
- Or did you call activate() directly?
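The "off unless activated" behavior can be pictured with a small stand-in. This is a sketch of the general gating pattern, not the code in ordeal/buggify.py: the _sketch names, the seed argument, and the probability parameter are all assumptions.

```python
import random

# Module-level switch: everything is inert until someone flips it
_active = False
_rng = random.Random(0)


def activate_sketch(seed=0):
    """Hypothetical stand-in for an explicit activation call."""
    global _active, _rng
    _active = True
    _rng = random.Random(seed)


def buggify_sketch(probability=0.5):
    """Return True occasionally, but only after activation."""
    if not _active:
        return False  # production-safe default: never inject anything
    return _rng.random() < probability


before = [buggify_sketch() for _ in range(100)]
activate_sketch(seed=1)
after = [buggify_sketch() for _ in range(100)]

print(any(before), any(after))
```

If your calls always return False, the switch was most likely never flipped in the process that runs the test.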
Property assertions not tracked in the report¶
always() and unreachable() always raise on violation — they are never silent, with or without --chaos. The property report (hit counts, pass/fail summary) appears whenever there are tracked results — even without --chaos.
sometimes() and reachable() only track when the tracker is active. Without --chaos, they are no-ops unless you use sometimes(..., warn=True) which prints status to stdout (captured by pytest).
To enable the tracker and the property report:
- Run with --chaos flag (enables full tracking)
- Or call auto_configure() at test start
- Use sometimes(..., warn=True) for visibility without --chaos
- The property report prints at the end of pytest output whenever there are tracked results
"sometimes" or "reachable" fails at session end¶
These are deferred assertions — they must be satisfied at least once across the entire session.
- sometimes fails: the condition was never True. Either the code path isn't being exercised, or the condition is too strict. Check your rules: are they actually reaching the state where this condition holds?
- reachable fails: the code path was never executed. Your fault injection might not be creating the conditions that trigger this path. Add more faults or rules.
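A deferred assertion can be thought of as a counter that is only judged at the end. The sketch below illustrates that idea with a hypothetical DeferredTracker class; it is not ordeal's tracker, and the property names are made up.

```python
class DeferredTracker:
    """Record how often each named condition was true; judge at session end."""

    def __init__(self):
        self.hits = {}

    def sometimes(self, condition, name):
        # Register the property on first sight, count only the true observations
        self.hits[name] = self.hits.get(name, 0) + (1 if condition else 0)

    def unmet(self):
        # Properties that were checked but never held, reported at session end
        return sorted(name for name, count in self.hits.items() if count == 0)


tracker = DeferredTracker()
for queue_depth in [0, 1, 3, 2]:
    tracker.sometimes(queue_depth > 2, "queue gets deep")
    tracker.sometimes(queue_depth > 10, "queue overflows")

failures = tracker.unmet()
print(failures)
```

Here "queue overflows" fails at the end because no observed state ever satisfied it, which is the same situation as a rule set that never drives the system far enough.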
Coverage collector shows 0 edges¶
The CoverageCollector uses sys.settrace to track execution. If it shows 0 edges:
- Check target_modules: the collector only tracks files whose path contains a matching segment. ["myapp"] matches myapp/foo.py but NOT tests/test_myapp.py.
- Make sure your code actually runs during the test. If all rules raise immediately, no application code is traced.
- Some C extensions bypass sys.settrace; coverage only tracks Python code.
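The difference between segment matching and substring matching is easy to get wrong, so here is a small sketch of it. The helper names are hypothetical, and treating the file stem (the name without .py) as a segment is our assumption; ordeal's exact matching rules may differ in detail.

```python
from pathlib import PurePosixPath


def path_segments(path):
    """Split a path into directory names plus the file name without its suffix."""
    p = PurePosixPath(path)
    return list(p.parts[:-1]) + [p.stem]


def matches_segment(path, targets):
    """True if any target equals a whole path segment (assumed rule)."""
    return any(target in path_segments(path) for target in targets)


def matches_substring(path, targets):
    """The looser check that segment matching is NOT."""
    return any(target in path for target in targets)


targets = ["myapp"]
for path in ["myapp/foo.py", "tests/test_myapp.py"]:
    print(path, matches_segment(path, targets), matches_substring(path, targets))
```

A substring check would accept tests/test_myapp.py, while a segment check rejects it, which matches the behavior described in the first bullet above.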
Mutation testing: all mutants survive¶
What this means
Mutation testing changes small things in your code (like swapping + to -) and checks if your tests notice. If all mutants survive, your tests aren't checking the function's actual behavior -- they might only check that it runs without crashing. You need assertions that verify specific output values.
If mutate_function_and_test returns a 0% kill score, your tests aren't checking the function's behavior:
- Are you testing the right function? The target is a dotted path: "myapp.scoring.compute".
- Does your test actually call the function? Mutants are only killed if the test raises an exception.
- Are your assertions specific enough? If you only check result is not None, swapping + to - won't be caught. Add value checks.
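The gap between a weak and a specific assertion can be shown without any mutation tooling. In this sketch the mutant is written by hand and is_killed is a hypothetical helper; it only mirrors the rule above that a mutant counts as killed when the test raises.

```python
def compute(a, b):
    return a + b


def compute_mutant(a, b):
    return a - b  # hand-written mutant: + swapped to -


def weak_test(fn):
    assert fn(2, 3) is not None  # only proves the function returned something


def strong_test(fn):
    assert fn(2, 3) == 5  # pins the actual value


def is_killed(test, mutant):
    """A mutant is killed when the test raises against it."""
    try:
        test(mutant)
    except AssertionError:
        return True
    return False


weak_kills = is_killed(weak_test, compute_mutant)
strong_kills = is_killed(strong_test, compute_mutant)
print(weak_kills, strong_kills)
```

The weak test passes against both versions, so the mutant survives; the value check fails against the mutant and kills it.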
PatchFault doesn't seem to work¶
PatchFault resolves the dotted path lazily (on first activation). If the fault seems inactive:
- Check the target path is correct and the module is importable
- Make sure the fault is activated (check fault.active)
- If the target is imported as from module import func, the local binding won't be patched; PatchFault patches the module attribute. Import as import module; module.func() for PatchFault to work.
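The from-import pitfall is a general property of Python name binding, so it can be reproduced with the standard library alone. This sketch uses unittest.mock in place of PatchFault, and demo_service is a throwaway module created just for the example.

```python
import sys
import types
from unittest import mock

# Build a throwaway module so the example is self-contained
demo = types.ModuleType("demo_service")
demo.fetch = lambda: "real"
sys.modules["demo_service"] = demo

from demo_service import fetch as local_fetch  # copies the function reference now
import demo_service

# Patch the module attribute, which is what attribute-based patching does
with mock.patch.object(demo_service, "fetch", lambda: "patched"):
    via_module = demo_service.fetch()  # attribute looked up at call time
    via_local = local_fetch()          # still bound to the original function

print(via_module, via_local)
```

The call through the module sees the patch, while the name imported earlier keeps pointing at the original function.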
FileSystem.read returns bytes, not str¶
FileSystem.read() returns bytes. Use fs.read_text(path) for a decoded string.
Clock.sleep doesn't fire timers¶
Clock.sleep() advances time but does NOT fire timers. Use Clock.advance() instead:
clock = Clock()
clock.set_timer(10.0, callback)
clock.sleep(15.0) # advances time but callback NOT fired
clock.advance(15.0) # advances time AND fires callback
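The distinction between the two calls can be modeled with a tiny fake clock. SketchClock below is a hypothetical stand-in written for this page, not ordeal's Clock; it only mimics the behavior described above.

```python
class SketchClock:
    """Toy simulated clock: sleep moves time, advance moves time and fires timers."""

    def __init__(self):
        self.now = 0.0
        self._timers = []  # list of (deadline, callback)

    def set_timer(self, delay, callback):
        self._timers.append((self.now + delay, callback))

    def sleep(self, seconds):
        self.now += seconds  # time moves, timers are left untouched

    def advance(self, seconds):
        self.now += seconds
        due = [t for t in self._timers if t[0] <= self.now]
        self._timers = [t for t in self._timers if t[0] > self.now]
        for _, callback in sorted(due, key=lambda t: t[0]):
            callback()  # fire in deadline order


fired = []

slept = SketchClock()
slept.set_timer(10.0, lambda: fired.append("sleep"))
slept.sleep(15.0)

advanced = SketchClock()
advanced.set_timer(10.0, lambda: fired.append("advance"))
advanced.advance(15.0)

print(fired, slept.now, advanced.now)
```

Both clocks end at the same time, but only the one moved with advance ran its callback.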
ordeal audit shows FAILED instead of coverage¶
The audit never silently returns 0% — if a measurement fails, it says FAILED: reason. Common reasons:
- "no test files found": test files must be named test_<module_short_name>.py or test_<module_short_name>_*.py. Check the --test-dir flag.
- "pytest not found": install pytest.
- "coverage report not generated": the subprocess crashed before coverage could be written. Check the stderr hint in the failure.
- "timed out after 120s": tests are too slow under coverage. Try with fewer tests or a faster machine.
- "module not found in coverage report": the module path doesn't match what ordeal traced. Check the dotted path matches the file location.
- "coverage data inconsistent": the measured percentage doesn't match the executed/missing line counts. This can happen with dynamic imports or generated code.
Check the warnings field for details: result.warnings lists every problem encountered during the audit.
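For the "no test files found" case, the naming rule can be checked quickly against your own file names. The sketch assumes shell-style glob semantics for the * in the pattern and that the short name is the last component of the dotted module path; is_test_file_for is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def is_test_file_for(filename, module_short_name):
    """Check a file name against the two accepted patterns (assumed glob semantics)."""
    patterns = (
        f"test_{module_short_name}.py",
        f"test_{module_short_name}_*.py",
    )
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


# Short name assumed to be the last part of "myapp.scoring"
short_name = "myapp.scoring".rsplit(".", 1)[-1]

for name in ["test_scoring.py", "test_scoring_edge.py", "scoring_test.py", "test_scoringx.py"]:
    print(name, is_test_file_for(name, short_name))
```

Files named in the suffix style (scoring_test.py) or with extra characters directly after the module name do not match either pattern.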
ordeal audit shows 0% migrated coverage (FAILED)¶
The migrated test is generated to .ordeal/test_<module>_migrated.py. Check:
- Can the module be imported? (python -c "import myapp.scoring")
- Does scan_module("myapp.scoring") find any functions? Functions without type hints and no fixtures are skipped.
- Use --show-generated to see what the generated test looks like.
- Check result.warnings: mining failures are logged there.
Property mining finds no properties¶
ordeal.mine needs the function to be callable with random inputs. If all calls crash, no properties can be observed.
- Provide fixtures for parameters that can't be inferred: mine(fn, model=mock_model)
- Check that the function has type hints; mining uses the same strategy inference as fuzz().
- A function that always raises won't have observable output properties.
Tests pass locally but fail in CI¶
Environment differences
Chaos testing is sensitive to environment: different CPUs, memory pressure, and timing can change which paths the explorer visits. The fix is almost always pinning the seed in ordeal.toml so that local and CI runs explore the same space.
- Seed mismatch: set a fixed seed in ordeal.toml for reproducibility
- Missing dependencies: make sure ordeal[all] or the specific extras are installed
- Timeout: CI may be slower; increase max_time
- PYTHONPATH: ensure the project root is on the path
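Why a pinned seed makes local and CI runs comparable can be shown with the standard library's random module. explore_sketch and the rule names are hypothetical; the sketch says nothing about how ordeal consumes its seed internally.

```python
import random


def explore_sketch(seed, steps=5):
    """Pick a sequence of rule names from a seeded generator (illustration only)."""
    rng = random.Random(seed)
    rules = ["put", "get", "delete"]
    return [rng.choice(rules) for _ in range(steps)]


# Same seed on two "machines": the chosen sequence is identical
local_run = explore_sketch(seed=1234)
ci_run = explore_sketch(seed=1234)

print(local_run == ci_run)
```

With the same seed and the same code, the random choices repeat exactly, so a failure seen in CI can be chased locally along the same sequence.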
Getting help¶
Still stuck?
If nothing on this page matched your problem, you've likely found something new. That's valuable -- open an issue with what you tried and what happened. Include your ordeal.toml, the error output, and your Python version. The fastest path to a fix is a minimal reproducing example, but even a description helps.
- Check the full documentation
- Open an issue at github.com/teilomillet/ordeal