calibrax.profiling.gpu

GPU memory profiling and optimization. Includes GPUMemoryProfiler for memory snapshots, MemoryOptimizer for pipeline memory analysis, and AdaptiveOperation for hardware-aware shape optimization.

GPU memory profiling and hardware-adaptive operations.

Provides hardware detection, shape optimization, GPU memory profiling (satisfying GPUProfilerProtocol), and memory usage analysis. Includes NVML-based GPU clock and power monitoring when pynvml is available.

HardwareConfig(*, platform, precision, tile_size, critical_batch_size, memory_layout, use_vmem_optimization) dataclass

Hardware-specific optimization configuration.

Attributes:

Name Type Description
platform str Detected platform ("cpu", "tpu", "gpu_modern", "gpu_legacy").
precision str Recommended floating-point precision string.
tile_size int Tile size for matrix operation alignment.
critical_batch_size int Optimal batch size for the platform.
memory_layout str Memory layout preference.
use_vmem_optimization bool Whether VMEM optimization is available.

MemoryAnalysis(*, baseline_memory_mb, peak_memory_mb, peak_usage_mb, retained_memory_mb, memory_efficiency, suggestions=()) dataclass

Result of pipeline memory analysis.

Attributes:

Name Type Description
baseline_memory_mb float Memory usage before pipeline execution.
peak_memory_mb float Memory usage at peak during execution.
peak_usage_mb float Peak usage above baseline.
retained_memory_mb float Memory retained after GC.
memory_efficiency float Ratio of freed memory to peak usage.
suggestions tuple[str, ...] Optimization suggestions.
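
The derived fields can be related to the raw readings with a little arithmetic. The sketch below is an illustration only, not the library's implementation: `derive_memory_fields` is a hypothetical helper, and it assumes that `retained_memory_mb` is measured relative to the baseline and that "freed memory" means peak usage minus retained memory. The source does not confirm either assumption.

```python
def derive_memory_fields(
    baseline_mb: float, peak_mb: float, after_gc_mb: float
) -> dict[str, float]:
    """Hypothetical helper: derive MemoryAnalysis-style fields from three readings."""
    # Peak usage above baseline, as described for peak_usage_mb.
    peak_usage_mb = peak_mb - baseline_mb
    # ASSUMPTION: retained memory is what is still held above baseline after GC.
    retained_mb = after_gc_mb - baseline_mb
    # ASSUMPTION: freed memory is the part of the peak that was released again.
    freed_mb = peak_usage_mb - retained_mb
    # ASSUMPTION: with no growth above baseline there is nothing to free,
    # so the ratio is reported as 1.0 rather than dividing by zero.
    efficiency = freed_mb / peak_usage_mb if peak_usage_mb > 0 else 1.0
    return {
        "baseline_memory_mb": baseline_mb,
        "peak_memory_mb": peak_mb,
        "peak_usage_mb": peak_usage_mb,
        "retained_memory_mb": retained_mb,
        "memory_efficiency": efficiency,
    }


fields = derive_memory_fields(baseline_mb=100.0, peak_mb=300.0, after_gc_mb=120.0)
print(fields["peak_usage_mb"], fields["retained_memory_mb"], fields["memory_efficiency"])
```

Under these assumptions, a pipeline that peaks 200 MB above baseline and retains 20 MB after garbage collection has an efficiency of 0.9.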

AdaptiveOperation()

Hardware-adaptive operations with auto-detection.

Detects the current JAX backend (CPU/GPU/TPU) and provides optimized configuration and shape padding.

Initialize with auto-detected hardware configuration.

optimize_shapes(*shapes)

Pad tensor shapes to align with hardware tile size.

Parameters:

Name Type Description Default
*shapes tuple[int, ...] Variable number of tensor shapes to optimize. ()

Returns:

Type Description
list[tuple[int, ...]] List of optimized shapes padded to tile_size multiples.
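
Padding to tile-size multiples amounts to rounding each dimension up. The following is a minimal standalone sketch of that idea, not the calibrax implementation: `round_up` and `pad_shapes` are hypothetical names, the `tile_size=128` default is an arbitrary choice for the example, and the sketch assumes every dimension is padded (the library may treat some dimensions differently).

```python
def round_up(value: int, multiple: int) -> int:
    """Round value up to the nearest multiple (ceiling division, then scale)."""
    return -(-value // multiple) * multiple


def pad_shapes(
    *shapes: tuple[int, ...], tile_size: int = 128
) -> list[tuple[int, ...]]:
    """Hypothetical helper: pad every dimension of every shape to a tile multiple."""
    # ASSUMPTION: all dimensions are padded; already aligned dimensions are unchanged.
    return [tuple(round_up(dim, tile_size) for dim in shape) for shape in shapes]


print(pad_shapes((100, 300), (128, 256), tile_size=128))
# prints [(128, 384), (128, 256)]
```

The first shape grows in both dimensions, while the second is already aligned and comes back unchanged.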

GPUMemoryProfiler()

GPU memory profiling satisfying GPUProfilerProtocol.

Uses a multi-step fallback strategy: memory_stats first, then xla_bridge, then zeros if neither source is available.

Initialize GPU memory profiler with GPU detection.
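
A fallback chain of this kind can be sketched as a list of probes tried in order, with zeros as the last resort. The code below is an illustration under stated assumptions, not the profiler's actual code: `read_memory` and `stats_from_jax_device` are hypothetical names, only the `memory_stats` step is shown (the `xla_bridge` step is omitted because the source does not describe it), and the `bytes_in_use` / `bytes_limit` keys are what JAX devices commonly report rather than something this page guarantees.

```python
from typing import Callable, Optional

Stats = dict[str, float]
MB = 1024 * 1024
ZERO_STATS: Stats = {"gpu_memory_used_mb": 0.0, "gpu_memory_total_mb": 0.0}


def stats_from_jax_device() -> Optional[Stats]:
    """First probe: ask the first local JAX device for its memory stats."""
    import jax  # ImportError (JAX not installed) is handled by read_memory

    stats = jax.local_devices()[0].memory_stats()  # may be None on some backends
    # ASSUMPTION: usable stats expose 'bytes_limit' and 'bytes_in_use'.
    if not stats or "bytes_limit" not in stats:
        return None
    return {
        "gpu_memory_used_mb": stats.get("bytes_in_use", 0) / MB,
        "gpu_memory_total_mb": stats["bytes_limit"] / MB,
    }


def read_memory(probes: list[Callable[[], Optional[Stats]]]) -> Stats:
    """Return the first successful probe result, or zeros if every probe fails."""
    for probe in probes:
        try:
            result = probe()
        except Exception:
            continue  # a failing probe falls through to the next one
        if result is not None:
            return result
    return dict(ZERO_STATS)


usage = read_memory([stats_from_jax_device])
print(usage)
```

Keeping each source behind its own probe means a missing dependency or an unsupported backend degrades to zeros instead of raising inside the monitoring loop.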

get_memory_usage()

Get current GPU memory usage statistics.

Returns:

Type Description
dict[str, float] Dictionary with gpu_memory_used_mb, gpu_memory_total_mb, and optionally gpu_memory_utilization.

get_utilization()

Get GPU utilization percentage for ResourceMonitor.

Returns:

Type Description
float GPU memory utilization as a percentage (0-100), or 0.0.

get_clock_info()

Get current GPU clock frequencies via NVML.

Returns:

Type Description
dict[str, float] Dictionary with 'gpu_clock_mhz' and 'mem_clock_mhz' keys. Returns zeros if NVML is unavailable or query fails.

get_power_info()

Get current GPU power draw and limit via NVML.

Returns:

Type Description
dict[str, float] Dictionary with 'power_draw_w' and 'power_limit_w' keys. Returns zeros if NVML is unavailable or query fails.
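
For readers who want to see what NVML-based clock and power queries with a zero fallback might look like, here is a rough standalone sketch using pynvml directly. It is not the profiler's code: `query_nvml` is a hypothetical function, the choice of the graphics clock for 'gpu_clock_mhz' and the enforced power limit for 'power_limit_w' is an assumption, and device index 0 is arbitrary. NVML reports power in milliwatts, so the sketch converts to watts.

```python
def query_nvml(device_index: int = 0) -> dict[str, float]:
    """Hypothetical helper: read clocks (MHz) and power (W), or zeros on failure."""
    zeros = {
        "gpu_clock_mhz": 0.0,
        "mem_clock_mhz": 0.0,
        "power_draw_w": 0.0,
        "power_limit_w": 0.0,
    }
    try:
        import pynvml  # optional dependency
    except ImportError:
        return zeros
    try:
        pynvml.nvmlInit()
    except Exception:
        return zeros  # no NVIDIA driver or NVML library on this machine
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
        return {
            # ASSUMPTION: 'gpu_clock_mhz' corresponds to the graphics clock.
            "gpu_clock_mhz": float(
                pynvml.nvmlDeviceGetClockInfo(handle, pynvml.NVML_CLOCK_GRAPHICS)
            ),
            "mem_clock_mhz": float(
                pynvml.nvmlDeviceGetClockInfo(handle, pynvml.NVML_CLOCK_MEM)
            ),
            # NVML returns milliwatts; divide by 1000 to get watts.
            "power_draw_w": pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0,
            # ASSUMPTION: the enforced limit is the relevant power limit.
            "power_limit_w": pynvml.nvmlDeviceGetEnforcedPowerLimit(handle) / 1000.0,
        }
    except Exception:
        return zeros  # query unsupported or failed for this device
    finally:
        pynvml.nvmlShutdown()


print(query_nvml())
```

On a machine without pynvml or without an NVIDIA driver, the function returns all zeros, mirroring the fallback behavior described for get_clock_info and get_power_info.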

analyze_memory_pattern(measurements)

Analyze memory usage patterns and suggest optimizations.

Parameters:

Name Type Description Default
measurements list[dict[str, float]] List of memory usage dictionaries. required

Returns:

Type Description
list[str] List of optimization suggestion strings.
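
The source does not list the rules this method applies, so the sketch below only illustrates the general shape of such an analysis. Everything in it is hypothetical: the function name `suggest_from_measurements`, the 90% high-water threshold, the "strictly growing" leak heuristic, and the suggestion wording. It also assumes each measurement carries the `gpu_memory_used_mb` and `gpu_memory_total_mb` keys returned by get_memory_usage.

```python
def suggest_from_measurements(
    measurements: list[dict[str, float]], high_water: float = 0.9
) -> list[str]:
    """Hypothetical heuristics: turn memory samples into suggestion strings."""
    suggestions: list[str] = []
    used = [m.get("gpu_memory_used_mb", 0.0) for m in measurements]
    totals = [m.get("gpu_memory_total_mb", 0.0) for m in measurements]

    # Rule 1 (illustrative): usage close to capacity at any sample.
    ratios = [u / t for u, t in zip(used, totals) if t > 0]
    if ratios and max(ratios) > high_water:
        suggestions.append("Peak usage is near capacity; consider a smaller batch size.")

    # Rule 2 (illustrative): usage grows at every step across three or more samples.
    if len(used) >= 3 and all(b > a for a, b in zip(used, used[1:])):
        suggestions.append("Usage grows monotonically; check for retained buffers.")

    return suggestions


samples = [
    {"gpu_memory_used_mb": 100.0, "gpu_memory_total_mb": 1000.0},
    {"gpu_memory_used_mb": 200.0, "gpu_memory_total_mb": 1000.0},
    {"gpu_memory_used_mb": 950.0, "gpu_memory_total_mb": 1000.0},
]
for line in suggest_from_measurements(samples):
    print(line)
```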

MemoryOptimizer

Memory optimization analysis for pipeline functions.

analyze_pipeline_memory(pipeline_fn, sample_data)

Analyze memory usage of a pipeline function.

Parameters:

Name Type Description Default
pipeline_fn Callable[[Any], Any] Function to analyze. required
sample_data Any Sample input data. required

Returns:

Type Description
MemoryAnalysis | None MemoryAnalysis with measurements and suggestions, or None if the pipeline raises an exception.
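
The measure-run-collect-measure pattern behind this kind of analysis can be sketched with the standard library alone. Note the substitution: the example below tracks Python host allocations with `tracemalloc`, not GPU memory, and `measure_host_memory` is a hypothetical function rather than part of calibrax. It takes a baseline reading, runs the pipeline, records the peak, forces a garbage collection, and returns None if the pipeline raises.

```python
import gc
import tracemalloc
from typing import Any, Callable, Optional

MB = 1024 * 1024


def measure_host_memory(
    pipeline_fn: Callable[[Any], Any], sample_data: Any
) -> Optional[dict[str, float]]:
    """Hypothetical helper: baseline, peak, and post-GC host memory in MB."""
    started_here = not tracemalloc.is_tracing()
    if started_here:
        tracemalloc.start()
    try:
        gc.collect()
        tracemalloc.reset_peak()  # measure the peak for this run only
        baseline, _ = tracemalloc.get_traced_memory()
        try:
            result = pipeline_fn(sample_data)
        except Exception:
            return None  # mirror the documented "None on exception" behavior
        _, peak = tracemalloc.get_traced_memory()
        del result  # drop our reference so GC can reclaim the output
        gc.collect()
        after_gc, _ = tracemalloc.get_traced_memory()
        return {
            "baseline_memory_mb": baseline / MB,
            "peak_memory_mb": peak / MB,
            "after_gc_memory_mb": after_gc / MB,
        }
    finally:
        if started_here:
            tracemalloc.stop()  # only stop tracing if this call started it


readings = measure_host_memory(lambda n: sum(list(range(n))), 100_000)
print(readings)
print(measure_host_memory(lambda _: 1 / 0, None))  # prints None
```

Resetting the peak counter before the run keeps earlier allocations in the process from inflating the reported peak.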