calibrax.profiling.gpu
GPU memory profiling and optimization. Includes GPUMemoryProfiler for memory
snapshots, MemoryOptimizer for pipeline memory analysis, and AdaptiveOperation
for hardware-aware shape optimization.
GPU memory profiling and hardware-adaptive operations.
Provides hardware detection, shape optimization, GPU memory profiling (satisfying GPUProfilerProtocol), and memory usage analysis. Includes NVML-based GPU clock and power monitoring when pynvml is available.
HardwareConfig(*, platform, precision, tile_size, critical_batch_size, memory_layout, use_vmem_optimization)
dataclass
Hardware-specific optimization configuration.
Attributes:
| Name | Type | Description |
|---|---|---|
| `platform` | `str` | Detected platform ("cpu", "tpu", "gpu_modern", "gpu_legacy"). |
| `precision` | `str` | Recommended floating-point precision string. |
| `tile_size` | `int` | Tile size for matrix operation alignment. |
| `critical_batch_size` | `int` | Optimal batch size for the platform. |
| `memory_layout` | `str` | Memory layout preference. |
| `use_vmem_optimization` | `bool` | Whether VMEM optimization is available. |
MemoryAnalysis(*, baseline_memory_mb, peak_memory_mb, peak_usage_mb, retained_memory_mb, memory_efficiency, suggestions=())
dataclass
Result of pipeline memory analysis.
Attributes:
| Name | Type | Description |
|---|---|---|
| `baseline_memory_mb` | `float` | Memory usage before pipeline execution. |
| `peak_memory_mb` | `float` | Memory usage at peak during execution. |
| `peak_usage_mb` | `float` | Peak usage above baseline. |
| `retained_memory_mb` | `float` | Memory retained after GC. |
| `memory_efficiency` | `float` | Ratio of freed memory to peak usage. |
| `suggestions` | `tuple[str, ...]` | Optimization suggestions. |
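The relationship between these fields can be illustrated with a small worked example. This is only a sketch: `MemoryAnalysisSketch` and `summarize` are hypothetical names, and the exact definitions of "retained" and "freed" (here measured relative to the baseline) are assumptions, not the library's confirmed formulas.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class MemoryAnalysisSketch:
    """Hypothetical stand-in mirroring the documented MemoryAnalysis fields."""

    baseline_memory_mb: float
    peak_memory_mb: float
    peak_usage_mb: float
    retained_memory_mb: float
    memory_efficiency: float
    suggestions: tuple[str, ...] = ()


def summarize(baseline_mb: float, peak_mb: float, after_gc_mb: float) -> MemoryAnalysisSketch:
    """Derive the documented fields from three raw readings (in MB)."""
    # Peak usage above baseline, as described in the table.
    peak_usage = max(peak_mb - baseline_mb, 0.0)
    # Assumption: "retained" is what is still held above baseline after GC.
    retained = max(after_gc_mb - baseline_mb, 0.0)
    freed = max(peak_usage - retained, 0.0)
    # Assumption: with no allocation at all, report full efficiency.
    efficiency = freed / peak_usage if peak_usage > 0 else 1.0
    return MemoryAnalysisSketch(
        baseline_memory_mb=baseline_mb,
        peak_memory_mb=peak_mb,
        peak_usage_mb=peak_usage,
        retained_memory_mb=retained,
        memory_efficiency=efficiency,
    )


example = summarize(baseline_mb=100.0, peak_mb=500.0, after_gc_mb=200.0)
```

Under these assumptions, a run that starts at 100 MB, peaks at 500 MB, and settles at 200 MB after GC has a peak usage of 400 MB, retains 100 MB, and frees 300 of the 400 MB it allocated.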
AdaptiveOperation()
Hardware-adaptive operations with auto-detection.
Detects the current JAX backend (CPU/GPU/TPU) and provides optimized configuration and shape padding.
Initialize with auto-detected hardware configuration.
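Backend detection of this kind might look roughly like the following sketch. It relies on `jax.default_backend()`, which is part of JAX's public API; the helper name `detect_backend` and the CPU fallback when JAX cannot be imported are assumptions made for illustration. How the library refines a GPU backend into "gpu_modern" or "gpu_legacy" is not shown here.

```python
def detect_backend() -> str:
    """Return the default JAX backend name, or "cpu" if JAX is unusable."""
    try:
        import jax  # imported lazily so the sketch also runs without JAX

        return str(jax.default_backend()).lower()
    except Exception:
        # Assumption: any import or backend-initialization failure means CPU.
        return "cpu"


backend = detect_backend()
```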
optimize_shapes(*shapes)
Pad tensor shapes to align with hardware tile size.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `*shapes` | `tuple[int, ...]` | Variable number of tensor shapes to optimize. | `()` |
Returns:
| Type | Description |
|---|---|
| `list[tuple[int, ...]]` | List of optimized shapes padded to tile_size multiples. |
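Padding to tile-size multiples can be sketched in a few lines. The function names below are hypothetical, the tile size of 128 is an illustrative value rather than a documented default, and rounding every dimension up is an assumption: the library may treat some dimensions differently.

```python
def pad_shape(shape: tuple[int, ...], tile_size: int) -> tuple[int, ...]:
    """Round each dimension up to the next multiple of tile_size."""
    # -(-dim // tile_size) is integer ceiling division.
    return tuple(-(-dim // tile_size) * tile_size for dim in shape)


def optimize_shapes_sketch(*shapes: tuple[int, ...], tile_size: int = 128) -> list[tuple[int, ...]]:
    """Pad every given shape; tile_size=128 is an assumed example value."""
    return [pad_shape(shape, tile_size) for shape in shapes]


padded = optimize_shapes_sketch((100, 130), (128,), tile_size=128)
```

Dimensions that are already aligned are left unchanged, so `(128,)` stays `(128,)` while `(100, 130)` becomes `(128, 256)`.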
GPUMemoryProfiler()
GPU memory profiling satisfying GPUProfilerProtocol.
Uses a multi-fallback strategy: memory_stats first, then xla_bridge, and finally zeros if neither source is available.
Initialize GPU memory profiler with GPU detection.
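The fallback chain can be pictured as trying a list of probes in order and returning zeros when none succeeds. The sketch below uses hypothetical probe functions in place of the real memory_stats and xla_bridge queries, so it illustrates the control flow only, not the library's actual implementation.

```python
from typing import Callable, Optional

# Zero-valued result used when every probe fails.
ZERO_STATS = {"gpu_memory_used_mb": 0.0, "gpu_memory_total_mb": 0.0}


def first_available(probes: list[Callable[[], Optional[dict]]]) -> dict:
    """Return the first non-empty result; fall back to zeros."""
    for probe in probes:
        try:
            stats = probe()
        except Exception:
            continue  # this source is unavailable, try the next one
        if stats:
            return stats
    return dict(ZERO_STATS)


def failing_probe() -> Optional[dict]:
    """Hypothetical probe standing in for a source that raises."""
    raise RuntimeError("memory_stats not supported")


def empty_probe() -> Optional[dict]:
    """Hypothetical probe standing in for a source that returns nothing."""
    return None


def working_probe() -> Optional[dict]:
    """Hypothetical probe standing in for a source that succeeds."""
    return {"gpu_memory_used_mb": 512.0, "gpu_memory_total_mb": 8192.0}


stats = first_available([failing_probe, empty_probe, working_probe])
no_stats = first_available([failing_probe, empty_probe])
```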
get_memory_usage()
Get current GPU memory usage statistics.
Returns:
| Type | Description |
|---|---|
| `dict[str, float]` | Dictionary with `gpu_memory_used_mb`, `gpu_memory_total_mb`, and optionally `gpu_memory_utilization`. |
get_utilization()
Get GPU utilization percentage for ResourceMonitor.
Returns:
| Type | Description |
|---|---|
| `float` | GPU memory utilization as percentage (0-100), or 0.0. |
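One plausible way to derive the percentage from a memory-usage dictionary is shown below. The helper name is hypothetical, and returning 0.0 when the total is missing or zero is an assumption about when the 0.0 case applies.

```python
def utilization_percent(stats: dict) -> float:
    """Compute used/total as a percentage clamped to the range 0-100."""
    total = stats.get("gpu_memory_total_mb", 0.0)
    used = stats.get("gpu_memory_used_mb", 0.0)
    if total <= 0:
        return 0.0  # assumed behaviour when no GPU memory is reported
    return min(max(100.0 * used / total, 0.0), 100.0)


quarter = utilization_percent({"gpu_memory_used_mb": 4096.0, "gpu_memory_total_mb": 16384.0})
```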
get_clock_info()
Get current GPU clock frequencies via NVML.
Returns:
| Type | Description |
|---|---|
| `dict[str, float]` | Dictionary with 'gpu_clock_mhz' and 'mem_clock_mhz' keys. Returns zeros if NVML is unavailable or the query fails. |
get_power_info()
Get current GPU power draw and limit via NVML.
Returns:
| Type | Description |
|---|---|
| `dict[str, float]` | Dictionary with 'power_draw_w' and 'power_limit_w' keys. Returns zeros if NVML is unavailable or the query fails. |
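For readers unfamiliar with NVML, the clock and power queries can be approximated directly with `pynvml`. The sketch below is not the library's code: the function name `query_nvml`, the choice of device index 0, and the use of the enforced power limit are assumptions. NVML reports power in milliwatts, so the values are converted to watts.

```python
def query_nvml(index: int = 0) -> dict:
    """Read clocks (MHz) and power (W) via NVML, or zeros on any failure."""
    zeros = {
        "gpu_clock_mhz": 0.0,
        "mem_clock_mhz": 0.0,
        "power_draw_w": 0.0,
        "power_limit_w": 0.0,
    }
    try:
        import pynvml  # optional dependency

        pynvml.nvmlInit()
    except Exception:
        return zeros  # pynvml missing or no NVIDIA driver
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        return {
            "gpu_clock_mhz": float(
                pynvml.nvmlDeviceGetClockInfo(handle, pynvml.NVML_CLOCK_GRAPHICS)
            ),
            "mem_clock_mhz": float(
                pynvml.nvmlDeviceGetClockInfo(handle, pynvml.NVML_CLOCK_MEM)
            ),
            # NVML reports milliwatts; convert to watts.
            "power_draw_w": pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0,
            "power_limit_w": pynvml.nvmlDeviceGetEnforcedPowerLimit(handle) / 1000.0,
        }
    except Exception:
        return zeros  # the device does not support one of the queries
    finally:
        try:
            pynvml.nvmlShutdown()
        except Exception:
            pass


nvml_info = query_nvml()
```

On a machine without an NVIDIA GPU or without `pynvml` installed, every value in the result is 0.0, which matches the documented fallback.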
analyze_memory_pattern(measurements)
Analyze memory usage patterns and suggest optimizations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `measurements` | `list[dict[str, float]]` | List of memory usage dictionaries. | required |
Returns:
| Type | Description |
|---|---|
| `list[str]` | List of optimization suggestion strings. |
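The kind of heuristics such an analysis might apply can be sketched as follows. The thresholds, the suggestion wording, and the function name `suggest_optimizations` are all invented for illustration; the actual rules used by `analyze_memory_pattern` are not documented here.

```python
def suggest_optimizations(
    measurements: list,
    high_utilization: float = 90.0,  # assumed threshold, in percent
    growth_mb: float = 100.0,  # assumed threshold for steady growth
) -> list:
    """Return suggestion strings based on two simple heuristics."""
    if not measurements:
        return []
    used = [m.get("gpu_memory_used_mb", 0.0) for m in measurements]
    utilization = [m.get("gpu_memory_utilization", 0.0) for m in measurements]
    suggestions = []
    if max(utilization) >= high_utilization:
        suggestions.append("Peak utilization is high; consider a smaller batch size.")
    if used[-1] - used[0] >= growth_mb:
        suggestions.append("Memory grows across measurements; check for retained buffers.")
    return suggestions


samples = [
    {"gpu_memory_used_mb": 1000.0, "gpu_memory_utilization": 50.0},
    {"gpu_memory_used_mb": 1900.0, "gpu_memory_utilization": 95.0},
]
hints = suggest_optimizations(samples)
```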
MemoryOptimizer
Memory optimization analysis for pipeline functions.
analyze_pipeline_memory(pipeline_fn, sample_data)
Analyze memory usage of a pipeline function.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `pipeline_fn` | `Callable[[Any], Any]` | Function to analyze. | required |
| `sample_data` | `Any` | Sample input data. | required |
Returns:
| Type | Description |
|---|---|
| `MemoryAnalysis \| None` | MemoryAnalysis with measurements and suggestions, or None if the pipeline raises an exception. |
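The measure, run, collect, and re-measure flow can be sketched with the standard library. Note the substitution: this example uses `tracemalloc`, which tracks host-side Python allocations, as a stand-in for the GPU memory readings the real method presumably takes. The function name `measure_pipeline` and the returned dictionary are hypothetical.

```python
import gc
import tracemalloc

BYTES_PER_MB = 1024 * 1024


def measure_pipeline(pipeline_fn, sample_data):
    """Return baseline/peak/retained readings in MB, or None on failure."""
    gc.collect()
    tracemalloc.start()
    try:
        baseline, _ = tracemalloc.get_traced_memory()
        try:
            result = pipeline_fn(sample_data)
        except Exception:
            return None  # mirrors the documented None-on-exception behaviour
        _, peak = tracemalloc.get_traced_memory()
        del result  # drop the output so GC can reclaim it
        gc.collect()
        retained, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return {
        "baseline_memory_mb": baseline / BYTES_PER_MB,
        "peak_memory_mb": peak / BYTES_PER_MB,
        "retained_memory_mb": retained / BYTES_PER_MB,
    }


def failing_pipeline(_data):
    """Hypothetical pipeline that always raises."""
    raise ValueError("bad input")


report = measure_pipeline(lambda n: [0] * n, 1_000_000)
failed = measure_pipeline(failing_pipeline, None)
```

Releasing the pipeline output before the second reading matters: otherwise the result itself would be counted as retained memory.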