calibrax.exporters

Export benchmark results to external systems and publication formats.

Base Exporter

The Exporter ABC defines the interface for all exporters.

calibrax.exporters.base

Abstract base class for benchmark result exporters.

Exporter

Bases: ABC

Base class for exporting benchmark results to external systems.

Subclasses implement export_run for raw data and export_analysis for computed analytics (regressions, rankings, etc.).

export_run(run) abstractmethod

Export a benchmark run to an external system.

Parameters:

Name  Type  Description                   Default
run   Run   The benchmark run to export.  required

Returns:

Type  Description
str   URL or identifier of the exported artifact.

export_analysis(run, baseline=None) abstractmethod

Export analysis results (rankings, regressions, etc.).

Parameters:

Name      Type        Description                            Default
run       Run         Current benchmark run.                 required
baseline  Run | None  Optional baseline run for comparison.  None
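As an illustration of the subclassing contract described above, here is a minimal sketch of a custom exporter. The `Exporter` class below is a local stand-in that only mirrors the two documented abstract methods, `JsonFileExporter` is a hypothetical name, a plain dict stands in for the `Run` type, and the `None` return of `export_analysis` is an assumption (the reference does not state a return type).

```python
from __future__ import annotations

import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any


class Exporter(ABC):
    """Local stand-in mirroring the documented interface (not the real class)."""

    @abstractmethod
    def export_run(self, run: Any) -> str: ...

    @abstractmethod
    def export_analysis(self, run: Any, baseline: Any | None = None) -> None: ...


class JsonFileExporter(Exporter):
    """Hypothetical exporter that writes runs to local JSON files."""

    def __init__(self, output_dir: Path | str) -> None:
        self.output_dir = Path(output_dir)
        self.output_dir.mkdir(parents=True, exist_ok=True)

    def export_run(self, run: Any) -> str:
        # A plain dict stands in for calibrax's Run type in this sketch.
        path = self.output_dir / f"{run['name']}.json"
        path.write_text(json.dumps(run, indent=2))
        return str(path)  # identifier of the exported artifact

    def export_analysis(self, run: Any, baseline: Any | None = None) -> None:
        # Record which baseline (if any) the analysis was computed against.
        summary = {"run": run["name"], "baseline": baseline["name"] if baseline else None}
        (self.output_dir / f"{run['name']}_analysis.json").write_text(json.dumps(summary))


exporter = JsonFileExporter(tempfile.mkdtemp())
artifact = exporter.export_run({"name": "nightly", "metrics": {"latency_ms": 1.5}})
```

Against the real library, the subclass would inherit from `calibrax.exporters.base.Exporter` and read from an actual `Run` object instead of a dict.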

W&B Exporter

Import Path

WandBExporter is not re-exported from calibrax.exporters to avoid loading wandb at import time. Import directly:

from calibrax.exporters.wandb import WandBExporter

Optional Dependency

Requires wandb: uv pip install "calibrax[wandb]"
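Because the exporter lives behind an optional dependency, calling code may want to degrade gracefully when it is missing. The sketch below is one possible pattern, not part of calibrax: `make_optional_exporter` is a hypothetical helper that treats an `ImportError` raised during either the module import or the constructor call as "exporter unavailable".

```python
from __future__ import annotations

import importlib
from typing import Any


def make_optional_exporter(module_path: str, class_name: str, **kwargs: Any) -> Any | None:
    """Build module_path.class_name(**kwargs), or return None on ImportError.

    The ImportError may come from the module import itself or from the
    constructor, which is documented to raise it when wandb is missing.
    """
    try:
        module = importlib.import_module(module_path)
        return getattr(module, class_name)(**kwargs)
    except ImportError:
        return None


# "benchmarks" is a placeholder project name for this sketch.
wandb_exporter = make_optional_exporter(
    "calibrax.exporters.wandb", "WandBExporter", project="benchmarks"
)
if wandb_exporter is None:
    print('W&B export disabled; install with: uv pip install "calibrax[wandb]"')
```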

calibrax.exporters.wandb

Weights & Biases exporter for benchmark results and analysis.

Exports benchmark runs, comparisons, regressions, rankings, and trends to W&B dashboards. Requires the optional wandb dependency.

WandBExporter(project, entity=None, tags=None)

Bases: Exporter

Export benchmark results and analysis to Weights & Biases.

Parameters:

Name     Type              Description                                       Default
project  str               W&B project name.                                 required
entity   str | None        W&B entity (team or user). Uses default if None.  None
tags     list[str] | None  Optional tags applied to all W&B runs.            None

Raises:

Type         Description
ImportError  If wandb is not installed.

Initialize the W&B exporter.

check_auth()

Check if W&B authentication is available.

Returns:

Type  Description
bool  True if authenticated (API key, offline mode, or stored credentials).

export_run(run, *, finish=True)

Export a benchmark run to W&B.

Logs all metrics with slash-grouped panel names, a comparison summary table, and an HTML comparison table.

Parameters:

Name    Type  Description                                  Default
run     Run   Benchmark run to export.                     required
finish  bool  Whether to finish the W&B run after export.  True

Returns:

Type  Description
str   URL of the W&B run.
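W&B groups logged keys into panel sections by the prefix before the slash, which is what "slash-grouped panel names" refers to. The sketch below shows one way such keys could be built; the `point/metric` ordering, the `slash_grouped` helper, and the nested-dict input are assumptions for illustration, not the exporter's actual layout.

```python
def slash_grouped(points: dict[str, dict[str, float]]) -> dict[str, float]:
    """Flatten {point: {metric: value}} into {"point/metric": value} keys."""
    return {
        f"{point}/{metric}": value
        for point, metrics in points.items()
        for metric, value in metrics.items()
    }


# Hypothetical per-point metrics; a real Run object has its own structure.
flat = slash_grouped(
    {
        "jax": {"latency_ms": 1.5, "throughput": 200.0},
        "torch": {"latency_ms": 2.1, "throughput": 150.0},
    }
)
```

A dictionary of this shape is what would typically be passed to a single W&B logging call, so each point name becomes its own panel section.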

export_analysis(run, baseline=None)

Export analysis artifacts: rankings, regressions, aggregate scores, Pareto.

Parameters:

Name      Type        Description                                  Default
run       Run         Current benchmark run.                       required
baseline  Run | None  Optional baseline for regression detection.  None

Export metric trends over time to W&B.

Parameters:

Name        Type            Description                                Default
store       Any             Store instance with extract_trend method.  required
metric      str             Metric name to track.                      required
point_name  str             Point name to match.                       required
tags        dict[str, str]  Tags to filter by.                         required
n_runs      int | None      Optional limit on number of trend points.  None

log_figures(figures)

Log matplotlib figures to W&B.

Parameters:

Name     Type            Description                         Default
figures  dict[str, Any]  {name: matplotlib_figure} mapping.  required

log_html_artifacts(html)

Log HTML strings as W&B artifacts.

Parameters:

Name  Type            Description                   Default
html  dict[str, str]  {name: html_string} mapping.  required

log_extra_tables(tables)

Log additional W&B tables.

Parameters:

Name    Type                                          Description                       Default
tables  dict[str, tuple[list[str], list[list[Any]]]]  {name: (columns, rows)} mapping.  required
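The `{name: (columns, rows)}` shape is easy to get wrong when rows are assembled by hand. The following sketch builds such a mapping and checks that every row matches the column count; `make_table` and the table name are hypothetical and not part of calibrax.

```python
from typing import Any

Table = tuple[list[str], list[list[Any]]]


def make_table(columns: list[str], rows: list[list[Any]]) -> Table:
    """Return a (columns, rows) pair after checking each row's width."""
    for index, row in enumerate(rows):
        if len(row) != len(columns):
            raise ValueError(
                f"row {index} has {len(row)} cells, expected {len(columns)}"
            )
    return (list(columns), [list(row) for row in rows])


# Mapping in the shape documented for log_extra_tables.
tables: dict[str, Table] = {
    "latency_by_framework": make_table(
        ["framework", "latency_ms"],
        [["jax", 1.5], ["torch", 2.1]],
    )
}
```

The resulting `tables` dict matches the documented parameter type and could be passed to `log_extra_tables` on an exporter instance.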

MLflow Exporter

Import Path

MLflowExporter is not re-exported from calibrax.exporters to avoid loading mlflow at import time. Import directly:

from calibrax.exporters.mlflow import MLflowExporter

Optional Dependency

Requires mlflow: uv pip install "calibrax[mlflow]"

calibrax.exporters.mlflow

MLflow exporter for benchmark results and analysis.

Exports benchmark runs, comparisons, and regressions to MLflow tracking. Requires the optional mlflow dependency (uv pip install "calibrax[mlflow]").

Note: MLflowExporter is not re-exported from calibrax.exporters.__init__, so importing calibrax.exporters does not load MLflow. Import it from calibrax.exporters.mlflow.

MLflowExporter(experiment_name, tracking_uri=None)

Bases: Exporter

Export benchmark results and analysis to MLflow.

Logs metrics, parameters, and artifacts to an MLflow tracking server. Each benchmark run becomes an MLflow run within the specified experiment.

Parameters:

Name             Type        Description                                        Default
experiment_name  str         MLflow experiment name.                            required
tracking_uri     str | None  MLflow tracking server URI. Uses default if None.  None

Raises:

Type         Description
ImportError  If mlflow is not installed.

Initialize the MLflow exporter.

export_run(run)

Export a benchmark run to MLflow.

Logs each metric from each point as an MLflow metric, and logs environment/metadata as MLflow parameters.

Parameters:

Name  Type  Description               Default
run   Run   Benchmark run to export.  required

Returns:

Type  Description
str   MLflow run ID.

export_analysis(run, baseline=None)

Export analysis artifacts to MLflow.

Logs regressions as metrics and comparison data as a JSON artifact.

Parameters:

Name      Type        Description                                      Default
run       Run         Current benchmark run.                           required
baseline  Run | None  Optional baseline run for regression detection.  None

Publication Generator

Optional Dependency

Plot generation requires matplotlib: uv pip install "calibrax[publication]"

Table generation (LaTeX, HTML, CSV) works without matplotlib.

calibrax.exporters.publication

Publication-ready plot and table generation for benchmark results.

Generates comparison bar charts, scaling plots, convergence plots, and formatted tables (LaTeX, HTML, CSV). Requires the optional matplotlib dependency for plots; table generation works without it.

PublicationGenerator(output_dir)

Generate publication-ready plots and tables from benchmark data.

Parameters:

Name        Type        Description                                 Default
output_dir  Path | str  Directory where generated files are saved.  required

Initialize the publication generator.

generate_comparison_plot(run, metrics=None, *, output_format='png')

Generate a bar chart comparing frameworks across metrics.

Parameters:

Name           Type                  Description                                       Default
run            Run                   Benchmark run with points tagged by framework.    required
metrics        Sequence[str] | None  Subset of metric names to plot. Defaults to all.  None
output_format  str                   File format (png, pdf, svg).                      'png'

Returns:

Type         Description
Path | None  Path to generated file, or None if matplotlib unavailable.

generate_scaling_plot(sizes, values, *, metric_name='throughput', output_format='png')

Generate a scaling plot (size vs metric value).

Parameters:

Name           Type             Description                        Default
sizes          Sequence[float]  Input sizes (x-axis).              required
values         Sequence[float]  Metric values (y-axis).            required
metric_name    str              Name of the metric being plotted.  'throughput'
output_format  str              File format (png, pdf, svg).       'png'

Returns:

Type         Description
Path | None  Path to generated file, or None if matplotlib unavailable.
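The "Path or None" return contract can be sketched independently of calibrax. The function below is a simplified illustration, not the library's implementation: `scaling_plot`, the log-scaled x-axis, and the output filename pattern are assumptions. It returns `None` when matplotlib cannot be imported and a file path otherwise.

```python
from __future__ import annotations

import tempfile
from collections.abc import Sequence
from pathlib import Path


def scaling_plot(
    sizes: Sequence[float],
    values: Sequence[float],
    output_dir: Path | str,
    *,
    metric_name: str = "throughput",
    output_format: str = "png",
) -> Path | None:
    """Plot size vs. metric value; return None if matplotlib is unavailable."""
    try:
        import matplotlib

        matplotlib.use("Agg")  # headless backend, no display required
        import matplotlib.pyplot as plt
    except ImportError:
        return None

    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    fig, ax = plt.subplots()
    ax.plot(sizes, values, marker="o")
    ax.set_xscale("log")  # assumed: input sizes are positive and span magnitudes
    ax.set_xlabel("size")
    ax.set_ylabel(metric_name)

    path = out_dir / f"scaling_{metric_name}.{output_format}"
    fig.savefig(path)  # format is inferred from the file extension
    plt.close(fig)
    return path


plot_path = scaling_plot([1, 2, 4, 8], [10.0, 19.0, 35.0, 60.0], tempfile.mkdtemp())
```

Callers can then branch on `plot_path is None` to skip figure-dependent steps instead of catching an exception.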

generate_convergence_plot(series, *, output_format='png')

Generate a convergence plot from a trend series.

Parameters:

Name           Type         Description                   Default
series         TrendSeries  Time-series trend data.       required
output_format  str          File format (png, pdf, svg).  'png'

Returns:

Type         Description
Path | None  Path to generated file, or None if matplotlib unavailable.

plot_metric_values(values, *, title, filename, output_format='png')

Plot one or more scalar metric values.

Parameters:

Name           Type                 Description                                Default
values         Mapping[str, float]  Mapping from metric name to scalar value.  required
title          str                  Plot title.                                required
filename       str                  Output filename stem.                      required
output_format  str                  File format (png, pdf, svg).               'png'

Returns:

Type         Description
Path | None  Path to generated file, or None if matplotlib unavailable.

Raises:

Type        Description
ValueError  If no metric values are provided.

generate_table(run, metrics=None, *, output_format='latex', group_by_tag='framework')

Generate a formatted comparison table.

Parameters:

Name           Type                  Description                                     Default
run            Run                   Benchmark run with points and metrics.          required
metrics        Sequence[str] | None  Subset of metrics to include. Defaults to all.  None
output_format  str                   One of "latex", "html", "csv".                  'latex'
group_by_tag   str                   Tag key used for row labels.                    'framework'

Returns:

Type  Description
Path  Path to the generated table file.

Raises:

Type        Description
ValueError  If output_format is not recognized.
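Table generation needs only the standard library, which is why it works without matplotlib. The sketch below renders the same rows as LaTeX, HTML, or CSV and rejects unknown formats. It returns a string rather than writing a file, and `render_table` with its nested-dict input is a hypothetical simplification rather than the generator's real code.

```python
import csv
import io


def render_table(
    rows: dict[str, dict[str, float]],
    metrics: list[str],
    output_format: str = "latex",
    label: str = "framework",
) -> str:
    """Render {row_label: {metric: value}} as a LaTeX, HTML, or CSV table."""
    header = [label, *metrics]
    body = [[name, *(str(vals[m]) for m in metrics)] for name, vals in rows.items()]

    if output_format == "csv":
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator="\n").writerows([header, *body])
        return buffer.getvalue()
    if output_format == "html":
        head = "".join(f"<th>{cell}</th>" for cell in header)
        lines = ["<tr>" + "".join(f"<td>{c}</td>" for c in row) + "</tr>" for row in body]
        return f"<table><tr>{head}</tr>{''.join(lines)}</table>"
    if output_format == "latex":
        # Note: special characters such as "_" are not escaped in this sketch.
        spec = "l" + "r" * len(metrics)
        lines = [" & ".join(row) + r" \\" for row in [header, *body]]
        return "\n".join([rf"\begin{{tabular}}{{{spec}}}", *lines, r"\end{tabular}"])
    raise ValueError(f"unrecognized output_format: {output_format!r}")


example_rows = {"jax": {"latency": 1.5}, "torch": {"latency": 2.1}}
csv_text = render_table(example_rows, ["latency"], "csv")
```

Raising `ValueError` for an unknown format mirrors the documented behavior of `generate_table`.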