
calibrax.profiling.resources¤

Resource monitoring via a background sampling thread. ResourceMonitor tracks CPU usage, RSS memory, and optional GPU utilization/memory/clock/power over time.

Background resource monitoring, sampling at 10 Hz by default.

Provides the ResourceMonitor context manager for tracking CPU, memory, and optional GPU utilization during benchmark execution.

GPUProfilerProtocol ¤

Bases: Protocol

Protocol for GPU profilers providing utilization and memory data.

get_utilization() ¤

Get current GPU utilization percentage.

Returns:

Type Description
float

GPU utilization as a percentage (0-100).

get_memory_usage() ¤

Get current GPU memory usage statistics.

Returns:

Type Description
dict[str, float]

Dictionary containing at least the 'gpu_memory_used_mb' key.

get_clock_info() ¤

Get current GPU clock frequencies.

Returns:

Type Description
dict[str, float]

Dictionary with 'gpu_clock_mhz' and 'mem_clock_mhz' keys.

get_power_info() ¤

Get current GPU power draw and limits.

Returns:

Type Description
dict[str, float]

Dictionary with 'power_draw_w' and 'power_limit_w' keys.
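
Because GPUProfilerProtocol is a Protocol, any object with these four methods should satisfy it without inheriting from it. The sketch below illustrates this with a fixed-value profiler of the kind that is handy in tests. `GPUProfilerLike` and `FakeGPUProfiler` are hypothetical names, and the protocol is re-declared locally so the example runs without calibrax installed; the real protocol may differ in details not shown on this page.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class GPUProfilerLike(Protocol):
    """Local stand-in mirroring the four documented methods (hypothetical)."""

    def get_utilization(self) -> float: ...
    def get_memory_usage(self) -> dict[str, float]: ...
    def get_clock_info(self) -> dict[str, float]: ...
    def get_power_info(self) -> dict[str, float]: ...


class FakeGPUProfiler:
    """Hypothetical profiler that returns fixed values, e.g. for unit tests."""

    def get_utilization(self) -> float:
        return 42.0  # percentage in the documented 0-100 range

    def get_memory_usage(self) -> dict[str, float]:
        return {"gpu_memory_used_mb": 1024.0}

    def get_clock_info(self) -> dict[str, float]:
        return {"gpu_clock_mhz": 1500.0, "mem_clock_mhz": 5000.0}

    def get_power_info(self) -> dict[str, float]:
        return {"power_draw_w": 120.0, "power_limit_w": 250.0}


profiler = FakeGPUProfiler()
# Structural typing: no inheritance is needed, only matching method names.
assert isinstance(profiler, GPUProfilerLike)
```

An object shaped like this is what the `gpu_profiler` argument of ResourceMonitor is documented to accept.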

ResourceSample(*, timestamp, cpu_percent, rss_mb, gpu_util, gpu_mem_mb, gpu_clock_mhz=None, gpu_power_w=None) dataclass ¤

Single resource measurement at a point in time.

Attributes:

Name Type Description
timestamp float

Time of measurement, as returned by time.perf_counter() (a monotonic clock, not wall-clock time).

cpu_percent float

CPU utilization percentage.

rss_mb float

Resident set size in MB.

gpu_util float | None

GPU utilization percentage (None if no GPU).

gpu_mem_mb float | None

GPU memory used in MB (None if no GPU).

ResourceSummary(*, peak_rss_mb, mean_rss_mb, peak_gpu_mem_mb, mean_gpu_util, memory_growth_mb, num_samples, duration_sec, mean_gpu_clock_mhz=None, mean_gpu_power_w=None) dataclass ¤

Aggregated resource usage over a monitoring period.

Attributes:

Name Type Description
peak_rss_mb float

Maximum RSS observed.

mean_rss_mb float

Average RSS across all samples.

peak_gpu_mem_mb float | None

Maximum GPU memory (None if no GPU).

mean_gpu_util float | None

Average GPU utilization (None if no GPU).

memory_growth_mb float

Last RSS minus first RSS (positive = growth).

num_samples int

Total samples collected.

duration_sec float

Time span of monitoring.
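
The attribute descriptions above suggest a straightforward reduction over the collected samples. The following is a minimal sketch of that reduction, not the library's implementation: `Sample` and `summarize` are hypothetical names, only a subset of fields is modelled, and taking the duration as the last timestamp minus the first is an assumption.

```python
from dataclasses import dataclass
from statistics import fmean
from typing import Optional


@dataclass(frozen=True)
class Sample:
    """Hypothetical stand-in for ResourceSample (subset of fields)."""

    timestamp: float
    rss_mb: float
    gpu_util: Optional[float] = None
    gpu_mem_mb: Optional[float] = None


def summarize(samples: list[Sample]) -> dict[str, object]:
    """Aggregate samples into the documented summary fields."""
    if not samples:
        # Zeroed summary when nothing was collected.
        return {
            "peak_rss_mb": 0.0, "mean_rss_mb": 0.0,
            "peak_gpu_mem_mb": None, "mean_gpu_util": None,
            "memory_growth_mb": 0.0, "num_samples": 0, "duration_sec": 0.0,
        }
    rss = [s.rss_mb for s in samples]
    gpu_mem = [s.gpu_mem_mb for s in samples if s.gpu_mem_mb is not None]
    gpu_util = [s.gpu_util for s in samples if s.gpu_util is not None]
    return {
        "peak_rss_mb": max(rss),
        "mean_rss_mb": fmean(rss),
        "peak_gpu_mem_mb": max(gpu_mem) if gpu_mem else None,
        "mean_gpu_util": fmean(gpu_util) if gpu_util else None,
        "memory_growth_mb": rss[-1] - rss[0],  # positive means growth
        "num_samples": len(samples),
        # Assumption: duration spans the first to the last sample.
        "duration_sec": samples[-1].timestamp - samples[0].timestamp,
    }


demo = summarize([Sample(0.0, 100.0), Sample(0.5, 110.0), Sample(1.0, 105.0)])
```

Note that memory_growth_mb compares only the endpoints, so a run that spikes and then returns to baseline reports near-zero growth while still showing a high peak_rss_mb.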

to_dict() ¤

Serialize to a JSON-compatible dictionary.

Optional GPU fields are included only when they are not None. Numeric values are converted to Python primitives so that JAX scalars serialize safely.

Returns:

Type Description
dict[str, Any]

Dictionary representation with all resource summary fields.

from_dict(data) classmethod ¤

Deserialize from a dictionary.

Parameters:

Name Type Description Default
data dict[str, Any]

Dictionary with resource summary fields.

required

Returns:

Type Description
ResourceSummary

Reconstructed ResourceSummary instance.
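
The to_dict/from_dict pair describes a JSON round trip in which optional fields are dropped when None and numbers are coerced to plain Python types. Below is a minimal sketch of that pattern under stated assumptions: `SummarySketch` is a hypothetical stand-in carrying only a few of ResourceSummary's fields, and treating a missing key as None on deserialization is an assumption rather than documented behaviour.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class SummarySketch:
    """Hypothetical stand-in with a subset of ResourceSummary's fields."""

    peak_rss_mb: float
    memory_growth_mb: float
    num_samples: int
    mean_gpu_power_w: Optional[float] = None

    def to_dict(self) -> dict[str, Any]:
        # float()/int() turn array-library scalars into JSON-safe primitives.
        out: dict[str, Any] = {
            "peak_rss_mb": float(self.peak_rss_mb),
            "memory_growth_mb": float(self.memory_growth_mb),
            "num_samples": int(self.num_samples),
        }
        # Optional field is emitted only when a value is present.
        if self.mean_gpu_power_w is not None:
            out["mean_gpu_power_w"] = float(self.mean_gpu_power_w)
        return out

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "SummarySketch":
        power = data.get("mean_gpu_power_w")  # assumed: absent key means None
        return cls(
            peak_rss_mb=float(data["peak_rss_mb"]),
            memory_growth_mb=float(data["memory_growth_mb"]),
            num_samples=int(data["num_samples"]),
            mean_gpu_power_w=None if power is None else float(power),
        )


original = SummarySketch(peak_rss_mb=512.0, memory_growth_mb=8.5, num_samples=30)
payload = json.dumps(original.to_dict())
restored = SummarySketch.from_dict(json.loads(payload))
```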

ResourceMonitor(sample_interval_sec=0.1, gpu_profiler=None) ¤

Background resource sampling via a context manager, at 10 Hz by default.

Usage:

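from calibrax.profiling.resources import ResourceMonitor

with ResourceMonitor() as mon:
    ...  # run the benchmark here
summary = mon.summary  # aggregated ResourceSummary for the block above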

Parameters:

Name Type Description Default
sample_interval_sec float

Seconds between samples (default 0.1, i.e. 10 Hz).

0.1
gpu_profiler GPUProfilerProtocol | None

Optional profiler satisfying GPUProfilerProtocol.

None


samples property ¤

Return a copy of all collected samples.

summary property ¤

Compute aggregated summary from collected samples.

Returns:

Type Description
ResourceSummary

ResourceSummary with aggregated metrics, or a zeroed summary if no samples were collected.

__enter__() ¤

Start background sampling thread.

__exit__(*args) ¤

Stop background sampling thread.
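
The start/stop behaviour of `__enter__` and `__exit__` can be illustrated with a small standard-library sketch. This is not calibrax's code: `TinySampler` and its `read` callback are hypothetical, and the use of `threading.Event` for shutdown is one common way to build such a sampler, assumed here for illustration.

```python
import threading
import time
from typing import Callable, Optional


class TinySampler:
    """Hypothetical sketch: call `read()` periodically on a daemon thread."""

    def __init__(self, read: Callable[[], float], interval: float = 0.1) -> None:
        self._read = read
        self._interval = interval
        self._stop = threading.Event()
        self._lock = threading.Lock()
        self._samples: list[tuple[float, float]] = []
        self._thread: Optional[threading.Thread] = None

    def _run(self) -> None:
        # Event.wait doubles as an interruptible sleep: it returns True as
        # soon as the stop flag is set, which ends the loop promptly.
        while not self._stop.wait(self._interval):
            value = self._read()
            with self._lock:
                self._samples.append((time.perf_counter(), value))

    @property
    def samples(self) -> list[tuple[float, float]]:
        with self._lock:
            return list(self._samples)  # copy, so callers cannot mutate state

    def __enter__(self) -> "TinySampler":
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        return self

    def __exit__(self, *args: object) -> None:
        self._stop.set()
        if self._thread is not None:
            self._thread.join()


with TinySampler(read=lambda: 1.0, interval=0.01) as sampler:
    time.sleep(0.2)  # stands in for the benchmark body
collected = sampler.samples
```

In this sketch the first sample is taken one interval after entry; a real monitor may sample immediately. Joining the thread in `__exit__` means no further samples are appended once the `with` block has ended, so a summary read afterwards sees a stable list.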