calibrax.profiling.resources
Resource monitoring via a background sampling thread, sampling at 10 Hz by default.
Provides the ResourceMonitor context manager, which tracks CPU usage, RSS memory, and optional GPU utilization, memory, clock, and power over time during benchmark execution.
GPUProfilerProtocol
Bases: Protocol
Protocol for GPU profilers providing utilization and memory data.
get_utilization()
Get current GPU utilization percentage.
Returns:
| Type | Description |
|---|---|
| `float` | GPU utilization as a percentage (0-100). |
get_memory_usage()
Get current GPU memory usage statistics.
Returns:
| Type | Description |
|---|---|
| `dict[str, float]` | Dictionary with at least a 'gpu_memory_used_mb' key. |
get_clock_info()
Get current GPU clock frequencies.
Returns:
| Type | Description |
|---|---|
| `dict[str, float]` | Dictionary with 'gpu_clock_mhz' and 'mem_clock_mhz' keys. |
get_power_info()
Get current GPU power draw and limits.
Returns:
| Type | Description |
|---|---|
| `dict[str, float]` | Dictionary with 'power_draw_w' and 'power_limit_w' keys. |
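As a minimal sketch of what a conforming profiler could look like, the stub below returns fixed values for all four methods. `StaticGPUProfiler` and the local `GPUProfilerLike` protocol are hypothetical names introduced here for illustration. The local protocol only mirrors the method names documented above so that the example runs without importing the library.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class GPUProfilerLike(Protocol):
    """Local mirror of the documented method names (assumption, not the library class)."""

    def get_utilization(self) -> float: ...
    def get_memory_usage(self) -> dict[str, float]: ...
    def get_clock_info(self) -> dict[str, float]: ...
    def get_power_info(self) -> dict[str, float]: ...


class StaticGPUProfiler:
    """Hypothetical stub that reports fixed values, e.g. for tests without a GPU."""

    def get_utilization(self) -> float:
        return 42.0  # percentage in the 0-100 range

    def get_memory_usage(self) -> dict[str, float]:
        return {"gpu_memory_used_mb": 1024.0}

    def get_clock_info(self) -> dict[str, float]:
        return {"gpu_clock_mhz": 1500.0, "mem_clock_mhz": 5000.0}

    def get_power_info(self) -> dict[str, float]:
        return {"power_draw_w": 120.0, "power_limit_w": 250.0}


profiler = StaticGPUProfiler()
# Structural check: the stub exposes every method the protocol names.
conforms = isinstance(profiler, GPUProfilerLike)
```

Because the protocol is structural, the stub does not need to inherit from it. Any object with these four methods should be acceptable wherever a profiler is expected.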
ResourceSample(*, timestamp, cpu_percent, rss_mb, gpu_util, gpu_mem_mb, gpu_clock_mhz=None, gpu_power_w=None)
dataclass
Single resource measurement at a point in time.
Attributes:
| Name | Type | Description |
|---|---|---|
| `timestamp` | `float` | Time of measurement (perf_counter). |
| `cpu_percent` | `float` | CPU utilization percentage. |
| `rss_mb` | `float` | Resident set size in MB. |
| `gpu_util` | `float | None` | GPU utilization percentage (None if no GPU). |
| `gpu_mem_mb` | `float | None` | GPU memory used in MB (None if no GPU). |
ResourceSummary(*, peak_rss_mb, mean_rss_mb, peak_gpu_mem_mb, mean_gpu_util, memory_growth_mb, num_samples, duration_sec, mean_gpu_clock_mhz=None, mean_gpu_power_w=None)
dataclass
Aggregated resource usage over a monitoring period.
Attributes:
| Name | Type | Description |
|---|---|---|
| `peak_rss_mb` | `float` | Maximum RSS observed. |
| `mean_rss_mb` | `float` | Average RSS across all samples. |
| `peak_gpu_mem_mb` | `float | None` | Maximum GPU memory (None if no GPU). |
| `mean_gpu_util` | `float | None` | Average GPU utilization (None if no GPU). |
| `memory_growth_mb` | `float` | Last RSS minus first RSS (positive = growth). |
| `num_samples` | `int` | Total samples collected. |
| `duration_sec` | `float` | Time span of monitoring. |
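The aggregate fields above can be illustrated with a small worked example. This is only a sketch: the sample values are made up, and computing `duration_sec` as the last timestamp minus the first is an assumption about how the time span is measured, not a statement of the library's implementation.

```python
from statistics import fmean

# Hypothetical (timestamp, rss_mb) pairs standing in for collected samples.
samples = [(0.0, 100.0), (0.1, 110.0), (0.2, 130.0), (0.3, 120.0)]

timestamps = [t for t, _ in samples]
rss_values = [rss for _, rss in samples]

peak_rss_mb = max(rss_values)                       # maximum RSS observed
mean_rss_mb = fmean(rss_values)                     # average RSS across samples
memory_growth_mb = rss_values[-1] - rss_values[0]   # last minus first
num_samples = len(samples)
duration_sec = timestamps[-1] - timestamps[0]       # assumed: last minus first timestamp
```

Note that `memory_growth_mb` compares only the endpoints, so a run that peaks in the middle and then releases memory reports less growth than `peak_rss_mb` might suggest.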
to_dict()
Serialize to a JSON-compatible dictionary.
Optional GPU fields are included only when they are not None. Numeric values are converted to Python primitives so that JAX scalars serialize safely.
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | Dictionary representation with all resource summary fields. |
from_dict(data)
classmethod
Deserialize from a dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `dict[str, Any]` | Dictionary with resource summary fields. | required |
Returns:
| Type | Description |
|---|---|
| `ResourceSummary` | Reconstructed ResourceSummary instance. |
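The serialization behavior described for `to_dict()` and `from_dict()` can be sketched with a small stand-in dataclass. `SummarySketch` is a hypothetical class carrying only a subset of the documented fields. It illustrates the "omit optional fields when None, coerce to Python primitives" pattern under those assumptions and is not the library's code.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class SummarySketch:
    """Hypothetical stand-in with a subset of the documented fields."""

    peak_rss_mb: float
    num_samples: int
    peak_gpu_mem_mb: Optional[float] = None

    def to_dict(self) -> dict[str, Any]:
        # float()/int() coerce array-library scalars to plain Python numbers.
        out: dict[str, Any] = {
            "peak_rss_mb": float(self.peak_rss_mb),
            "num_samples": int(self.num_samples),
        }
        # Optional GPU field is emitted only when it carries a value.
        if self.peak_gpu_mem_mb is not None:
            out["peak_gpu_mem_mb"] = float(self.peak_gpu_mem_mb)
        return out

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "SummarySketch":
        # A missing optional key maps back to None.
        return cls(
            peak_rss_mb=data["peak_rss_mb"],
            num_samples=data["num_samples"],
            peak_gpu_mem_mb=data.get("peak_gpu_mem_mb"),
        )


cpu_only = SummarySketch(peak_rss_mb=512.0, num_samples=10)
payload = json.dumps(cpu_only.to_dict())
restored = SummarySketch.from_dict(json.loads(payload))
```

Omitting None-valued keys keeps CPU-only results compact, at the cost that consumers must treat a missing GPU key as "no GPU" rather than as an error.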
ResourceMonitor(sample_interval_sec=0.1, gpu_profiler=None)
Background resource sampling (10 Hz by default) via a context manager.
Usage:
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `sample_interval_sec` | `float` | Seconds between samples (default 0.1 = 10 Hz). | `0.1` |
| `gpu_profiler` | `GPUProfilerProtocol | None` | Optional profiler satisfying GPUProfilerProtocol. | `None` |
Initialize ResourceMonitor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `sample_interval_sec` | `float` | Seconds between resource samples. | `0.1` |
| `gpu_profiler` | `GPUProfilerProtocol | None` | Optional GPU profiler for GPU metrics. | `None` |
samples
property
Return a copy of all collected samples.
summary
property
Compute aggregated summary from collected samples.
Returns:
| Type | Description |
|---|---|
| `ResourceSummary` | ResourceSummary with aggregated metrics, or a zeroed summary if no samples were collected. |
__enter__()
Start background sampling thread.
__exit__(*args)
Stop background sampling thread.
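The start/stop behavior of `__enter__` and `__exit__` follows a common background-thread pattern, sketched below using only the standard library. `SamplerSketch` is a hypothetical stand-in that records timestamps only. Whether the real class uses an event, a lock, or a daemon thread is not stated here, so treat those details as assumptions.

```python
import threading
import time


class SamplerSketch:
    """Hypothetical stand-in illustrating the pattern; not the library class."""

    def __init__(self, sample_interval_sec: float = 0.1) -> None:
        self._interval = sample_interval_sec
        self._stop = threading.Event()
        self._lock = threading.Lock()
        self._timestamps = []
        self._thread = None

    def _run(self) -> None:
        while True:
            with self._lock:
                self._timestamps.append(time.perf_counter())
            # wait() returns True as soon as stop is set, ending the loop promptly.
            if self._stop.wait(self._interval):
                break

    def __enter__(self) -> "SamplerSketch":
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        return self

    def __exit__(self, *args) -> None:
        self._stop.set()
        self._thread.join()

    @property
    def samples(self) -> list:
        with self._lock:
            return list(self._timestamps)  # copy, so callers cannot mutate state


with SamplerSketch(sample_interval_sec=0.01) as sampler:
    time.sleep(0.05)  # stands in for the benchmarked workload

collected = sampler.samples
```

Waiting on the event instead of calling `time.sleep` lets the thread exit without sitting out a full interval, which keeps `__exit__` responsive even with long sampling intervals.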