CLI Reference

Calibrax provides a command-line interface for common benchmarking operations. Most commands operate on a store directory specified via `--data`; for `profile` the option is optional and only needed to persist results.

General Usage

calibrax <command> [options]

Commands

`profile`

Profile a JAX function with timing, resource, and optional energy/FLOP measurement.

calibrax profile --module <PYTHON.PATH> --function <NAME> \
    [--warmup <N>] [--iterations <N>] [--energy] [--flops] [--data <PATH>]

Option	Required	Default	Description
`--module`	Yes	—	Python module path (e.g. `my_pkg.benchmark`)
`--function`	Yes	—	Function name within the module
`--warmup`	No	`1`	Number of warmup iterations to exclude
`--iterations`	No	`10`	Number of timed iterations
`--energy`	No	off	Enable energy monitoring
`--flops`	No	off	Enable FLOP counting
`--data`	No	None	Store directory to persist profiling results

calibrax profile --module my_pkg.benchmark --function train_step \
    --warmup 2 --iterations 50

Profiling my_pkg.benchmark.train_step
  Warmup: 2, Iterations: 50

Timing Results:
  Wall clock: 5.2340s
  Batches: 52 (warmup excluded: 2)
  Mean batch time: 0.1047s

Profile complete.
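
The sample figures above are consistent with the mean batch time being the wall-clock time divided by the number of timed iterations (5.2340 s over 50 iterations). A minimal sketch of that arithmetic follows, assuming this is how the value is derived; `mean_batch_time` is a hypothetical helper, not part of Calibrax.

```python
def mean_batch_time(wall_clock_s: float, iterations: int) -> float:
    """Average seconds per timed iteration.

    Hypothetical helper for illustration only; not a Calibrax API.
    """
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    return wall_clock_s / iterations


# Figures taken from the sample output: 5.2340 s over 50 timed iterations.
print(f"Mean batch time: {mean_batch_time(5.2340, 50):.4f}s")
```

Dividing by all 52 batches instead would give roughly 0.1007 s, which does not match the sample, so the warmup batches appear to be left out of the average.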

`ingest`

Import benchmark results from an external JSON file into the store.

calibrax ingest --data <PATH> --input <FILE>

Option	Required	Description
`--data`	Yes	Path to the store directory
`--input`	Yes	Path to the JSON file to import

calibrax ingest --data ./benchmark-data --input results.json

`export`

Export a run to Weights & Biases.

calibrax export --data <PATH> --project <NAME> [--run <ID>] [--entity <NAME>]

Option	Required	Default	Description
`--data`	Yes	—	Path to the store directory
`--run`	No	latest	Run ID to export
`--project`	Yes	—	W&B project name
`--entity`	No	None	W&B entity (team or user)

calibrax export --data ./benchmark-data --project my-benchmarks

Note: requires wandb. Install it with `uv pip install "calibrax[wandb]"` and run `wandb login` first.

`check`

Run a regression check against the stored baseline. Exits with code 1 if any regression exceeds the threshold, which makes it suitable for gating CI pipelines.

calibrax check --data <PATH> [--threshold <FLOAT>]

Option	Required	Default	Description
`--data`	Yes	—	Path to the store directory
`--threshold`	No	`0.05`	Regression threshold (fraction, e.g. 0.05 = 5%)

calibrax check --data ./benchmark-data --threshold 0.05
echo $?  # 0 = pass, 1 = regression detected

PASSED: No regressions detected (threshold=0.05)
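
To make the threshold concrete, here is a minimal sketch of a fractional regression test. It assumes the change is measured relative to the baseline value and that the direction of "worse" depends on the metric; the reference above does not state how Calibrax computes this, and `is_regression` and `higher_is_better` are hypothetical names.

```python
def is_regression(baseline: float, current: float, threshold: float = 0.05,
                  higher_is_better: bool = True) -> bool:
    """Return True when `current` is worse than `baseline` by more than `threshold`.

    Assumption: the change is relative to the baseline. This is an
    illustration, not Calibrax's actual comparison logic.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero for a relative comparison")
    change = (current - baseline) / abs(baseline)
    # For throughput-like metrics a drop is worse; for latency-like metrics a rise is worse.
    worsening = -change if higher_is_better else change
    return worsening > threshold


print(is_regression(1200.0, 1100.0))  # throughput dropped by about 8.3%
print(is_regression(1200.0, 1170.0))  # throughput dropped by 2.5%
```

With the default threshold of 0.05, the first comparison is flagged and the second is not.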

`baseline`

Set a run as the active baseline for regression detection.

calibrax baseline --data <PATH> [--run <ID>]

Option	Required	Default	Description
`--data`	Yes	—	Path to the store directory
`--run`	No	latest	Run ID to set as baseline

calibrax baseline --data ./benchmark-data --run a1b2c3d4e5f6

`trend`

Show metric values over time for a specific point and framework.

calibrax trend --data <PATH> --metric <NAME> --point <NAME> --framework <NAME> [--n-runs <N>]

Option	Required	Default	Description
`--data`	Yes	—	Path to the store directory
`--metric`	Yes	—	Metric name to track
`--point`	Yes	—	Point name to filter by
`--framework`	Yes	—	Framework tag value
`--n-runs`	No	all	Limit to the last N runs

calibrax trend --data ./benchmark-data --metric throughput \
    --point forward_pass --framework flax --n-runs 10

Trend: throughput for forward_pass (flax)
Timestamp                           Value Commit
--------------------------------------------------------
2026-02-20 11:34:24.346877      1200.0000 -
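
If a script needs to consume this output, a row can be split into its three columns. The sketch below infers the layout purely from the sample row above (a date, a time, a value, and a commit shown as `-` when absent); the real format may differ, and `TrendRow` and `parse_trend_row` are hypothetical names.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class TrendRow(NamedTuple):
    timestamp: datetime
    value: float
    commit: Optional[str]


def parse_trend_row(line: str) -> TrendRow:
    """Parse one data row of `calibrax trend` output.

    Assumption: a row is "<date> <time> <value> <commit>", separated by
    whitespace, with "-" standing for a missing commit.
    """
    date, time, value, commit = line.split()
    return TrendRow(
        timestamp=datetime.fromisoformat(f"{date} {time}"),
        value=float(value),
        commit=None if commit == "-" else commit,
    )


row = parse_trend_row("2026-02-20 11:34:24.346877      1200.0000 -")
print(row.timestamp.isoformat(), row.value, row.commit)
```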

`summary`

Display a summary of a run's metrics and metadata.

calibrax summary --data <PATH> [--run <ID>]

Option	Required	Default	Description
`--data`	Yes	—	Path to the store directory
`--run`	No	latest	Run ID to summarize

calibrax summary --data ./benchmark-data

Run: 84cb49fb06a9
  Timestamp: 2026-02-20 11:34:24.346877
  Points: 1

Scenario: training
  flax: latency=0.8000, throughput=1200.0000

Exit Codes

Code	Meaning
0	Success
1	Regression detected (from `check`) or runtime error
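
When `check` is called from a Python build script rather than a shell, the exit code can be read from the process result. This is a sketch using only the flags documented above; `check_command`, `describe_exit`, and `run_check` are hypothetical helpers, and `run_check` is defined but not invoked here because it needs `calibrax` on the PATH.

```python
import subprocess
from typing import List


def check_command(data_dir: str, threshold: float = 0.05) -> List[str]:
    """Build the argument list for `calibrax check`."""
    return ["calibrax", "check", "--data", data_dir, "--threshold", str(threshold)]


def describe_exit(code: int) -> str:
    """Map an exit code to the meanings listed in the table above."""
    if code == 0:
        return "success"
    if code == 1:
        return "regression detected or runtime error"
    return f"unexpected exit code {code}"


def run_check(data_dir: str, threshold: float = 0.05) -> int:
    """Run the check and return its exit code without raising on failure."""
    result = subprocess.run(check_command(data_dir, threshold), check=False)
    return result.returncode


print(" ".join(check_command("./benchmark-data")))
print(describe_exit(1))
```

Because code 1 covers both a regression and a runtime error, a caller that needs to tell them apart has to inspect the command's output as well.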

End-to-End Workflow

A typical workflow using the CLI:

# 1. Run benchmarks and save results to a JSON file
python run_benchmarks.py --output results.json

# 2. Ingest into the store
calibrax ingest --data ./benchmark-data --input results.json

# 3. Set baseline (first time only)
calibrax baseline --data ./benchmark-data

# 4. Run regression check (in CI)
calibrax check --data ./benchmark-data --threshold 0.05

# 5. View trends
calibrax trend --data ./benchmark-data --metric throughput \
    --point forward_pass --framework flax

# 6. Export to W&B
calibrax export --data ./benchmark-data --project my-benchmarks
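
The recurring CI portion of this workflow (ingest, check, export) can also be driven from Python. The sketch below stops at the first failing command so that a regression prevents the export. `workflow_commands` and `run_all` are hypothetical helpers, and the example performs a dry run that prints each command instead of executing it.

```python
import subprocess
from typing import Callable, List, Sequence


def workflow_commands(data_dir: str, results_file: str, project: str) -> List[List[str]]:
    """Commands for steps 2, 4, and 6 of the workflow above."""
    return [
        ["calibrax", "ingest", "--data", data_dir, "--input", results_file],
        ["calibrax", "check", "--data", data_dir, "--threshold", "0.05"],
        ["calibrax", "export", "--data", data_dir, "--project", project],
    ]


def run_all(commands: Sequence[Sequence[str]],
            runner: Callable[[Sequence[str]], int] = subprocess.call) -> int:
    """Run commands in order; return the first non-zero exit code, or 0."""
    for cmd in commands:
        code = runner(cmd)
        if code != 0:
            return code
    return 0


def echo(cmd: Sequence[str]) -> int:
    """Dry-run runner: print the command and report success."""
    print(" ".join(cmd))
    return 0


exit_code = run_all(
    workflow_commands("./benchmark-data", "results.json", "my-benchmarks"),
    runner=echo,
)
```

Dropping the `runner=echo` argument makes `run_all` execute the commands through `subprocess.call`, which returns each command's exit code.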
