calibrax.metrics.functional.classification
Classification metrics for binary and multiclass evaluation, covering both discrete predictions (class indices) and probability scores.
Pure functions for computing standard classification metrics. All functions accept JAX arrays and return scalar values, except confusion_matrix, which returns a JAX array.
Includes 14 functions: accuracy, precision, recall, f1_score, fbeta_score, confusion_matrix, roc_auc, average_precision, log_loss, matthews_corrcoef, cohen_kappa, balanced_accuracy, specificity, sensitivity.
accuracy(predictions: Any, targets: Any) -> Any
Fraction of correct predictions.
Note
Direction: HIGHER (1.0 = perfect). Range: [0, 1]. Not a proper scoring rule.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |

Returns:
| Type | Description |
|---|---|
| Any | Accuracy as a scalar value. |
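
The definition above ("fraction of correct predictions") can be illustrated with a minimal plain-Python sketch. It is not the calibrax implementation: `accuracy_sketch` is a hypothetical name, it takes lists of class indices only, and it uses no JAX so that the arithmetic stays visible.

```python
def accuracy_sketch(predictions, targets):
    """Return the fraction of positions where prediction equals target."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    correct = sum(1 for p, t in zip(predictions, targets) if p == t)
    return correct / len(targets)


preds = [0, 1, 1, 0, 1]
labels = [0, 1, 0, 0, 1]
acc = accuracy_sketch(preds, labels)  # 4 of the 5 predictions match
```

Here `acc` is 4/5, since only the third prediction disagrees with its label.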
confusion_matrix(predictions: Any, targets: Any, *, num_classes: int | None = None) -> jax.Array
Compute the confusion matrix.
Note
Helper function; not registered in MetricRegistry because it returns an array rather than a scalar. Rows are true classes, columns are predicted classes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |
| num_classes | int \| None | Number of classes. Inferred from data if None. | None |

Returns:
| Type | Description |
|---|---|
| Array | Confusion matrix of shape (num_classes, num_classes) as JAX array. |
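
To make the row/column convention concrete (rows are true classes, columns are predicted classes), here is a minimal plain-Python sketch. `confusion_matrix_sketch` is a hypothetical name, and inferring `num_classes` as the largest label plus one is an assumption; the documentation above only says the value is inferred from the data.

```python
def confusion_matrix_sketch(predictions, targets, num_classes=None):
    """Count (true class, predicted class) pairs into a square matrix."""
    if num_classes is None:
        # Assumption: infer the class count from the largest label seen.
        num_classes = max(max(predictions), max(targets)) + 1
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(predictions, targets):
        matrix[t][p] += 1  # row = true class, column = predicted class
    return matrix


preds = [0, 1, 1, 2, 2, 0]
labels = [0, 1, 2, 2, 1, 0]
cm = confusion_matrix_sketch(preds, labels)
```

The diagonal of `cm` holds the correct predictions per class, and each row sums to the number of samples whose true label is that class.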
precision(predictions: Any, targets: Any, *, average: str = 'binary') -> Any
Precision: TP / (TP + FP).
Note
Direction: HIGHER (1.0 = no false positives). Range: [0, 1].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |
| average | str | Averaging method. "binary" for binary classification, "micro" sums globally, "macro" averages per-class, "weighted" weights by class frequency. | 'binary' |

Returns:
| Type | Description |
|---|---|
| Any | Precision as a scalar value. |
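
The four averaging modes can be compared side by side in a small plain-Python sketch. This illustrates the definitions only and is not the calibrax code: `precision_sketch` and its `num_classes` argument are hypothetical, class 1 is assumed to be the positive class in "binary" mode, and an undefined ratio is assumed to count as 0.

```python
def precision_sketch(predictions, targets, average="binary", num_classes=2):
    tp = [0] * num_classes       # true positives per class
    fp = [0] * num_classes       # false positives per class
    support = [0] * num_classes  # number of true samples per class
    for p, t in zip(predictions, targets):
        support[t] += 1
        if p == t:
            tp[p] += 1
        else:
            fp[p] += 1

    def ratio(num, den):
        # Assumption: a class that is never predicted gets precision 0.
        return num / den if den else 0.0

    per_class = [ratio(tp[c], tp[c] + fp[c]) for c in range(num_classes)]
    if average == "binary":
        return per_class[1]  # assumption: class 1 is the positive class
    if average == "micro":
        return ratio(sum(tp), sum(tp) + sum(fp))
    if average == "macro":
        return sum(per_class) / num_classes
    if average == "weighted":
        return sum(s * v for s, v in zip(support, per_class)) / sum(support)
    raise ValueError(f"unknown average: {average!r}")


labels = [0, 0, 0, 0, 1, 1, 2, 2]
preds = [0, 0, 0, 1, 1, 1, 2, 0]
micro = precision_sketch(preds, labels, average="micro", num_classes=3)
macro = precision_sketch(preds, labels, average="macro", num_classes=3)
weighted = precision_sketch(preds, labels, average="weighted", num_classes=3)
```

On this imbalanced example the per-class precisions are 3/4, 2/3, and 1, so the three averages differ: micro pools all counts (6/8), macro is the unweighted mean (29/36), and weighted uses class supports 4, 2, 2 (19/24).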
recall(predictions: Any, targets: Any, *, average: str = 'binary') -> Any
Recall (sensitivity): TP / (TP + FN).
Note
Direction: HIGHER (1.0 = no false negatives). Range: [0, 1].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |
| average | str | Averaging method. "binary", "micro", "macro", "weighted". | 'binary' |

Returns:
| Type | Description |
|---|---|
| Any | Recall as a scalar value. |
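
For the binary case, TP / (TP + FN) can be worked through by hand. The sketch below is a plain-Python illustration with a hypothetical name (`binary_recall_sketch`); returning 0 when there are no positive targets is an assumption, not documented calibrax behavior.

```python
def binary_recall_sketch(predictions, targets):
    """Recall for 0/1 labels: share of true positives that were found."""
    tp = sum(1 for p, t in zip(predictions, targets) if p == 1 and t == 1)
    fn = sum(1 for p, t in zip(predictions, targets) if p == 0 and t == 1)
    if tp + fn == 0:
        return 0.0  # assumption: no positive targets means recall 0
    return tp / (tp + fn)


labels = [1, 1, 1, 1, 0, 0]
preds = [1, 0, 0, 1, 0, 0]
rec = binary_recall_sketch(preds, labels)  # 2 of the 4 positives are found
```

This classifier makes no false-positive errors, so its precision is 1.0, yet it misses half of the positives and its recall is 0.5.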
fbeta_score(predictions: Any, targets: Any, *, beta: float = 1.0, average: str = 'binary') -> Any
Generalized F-measure with configurable beta.
F_beta = (1 + beta^2) * (precision * recall) / (beta^2 * precision + recall)
Note
Direction: HIGHER (1.0 = perfect). Range: [0, 1]. beta < 1 weights precision more; beta > 1 weights recall more.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |
| beta | float | Weight of recall vs precision. 1.0 = F1, 2.0 = F2. | 1.0 |
| average | str | Averaging method. "binary", "micro", "macro", "weighted". | 'binary' |

Returns:
| Type | Description |
|---|---|
| Any | F-beta score as a scalar value. |
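
The effect of beta is easiest to see by plugging fixed precision and recall values into the F_beta formula. The helper below, `fbeta_from_pr`, is a hypothetical plain-Python function that starts from already-computed precision and recall; returning 0 when both are 0 is an assumption.

```python
def fbeta_from_pr(precision, recall, beta=1.0):
    """F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)."""
    beta_sq = beta * beta
    denom = beta_sq * precision + recall
    if denom == 0:
        return 0.0  # assumption: define the score as 0 when P = R = 0
    return (1 + beta_sq) * precision * recall / denom


p, r = 0.5, 0.8
f_half = fbeta_from_pr(p, r, beta=0.5)  # leans toward precision
f1 = fbeta_from_pr(p, r)                # harmonic mean of P and R
f2 = fbeta_from_pr(p, r, beta=2.0)      # leans toward recall
```

Because recall (0.8) exceeds precision (0.5) here, the score rises as beta grows: `f_half` is 20/37, `f1` is 8/13, and `f2` is 5/7.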
f1_score(predictions: Any, targets: Any, *, average: str = 'binary') -> Any
F1 score: harmonic mean of precision and recall.
Equivalent to fbeta_score(predictions, targets, beta=1.0) with the same average setting.
Note
Direction: HIGHER (1.0 = perfect). Range: [0, 1].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |
| average | str | Averaging method. "binary", "micro", "macro", "weighted". | 'binary' |

Returns:
| Type | Description |
|---|---|
| Any | F1 score as a scalar value. |
roc_auc(predictions: Any, targets: Any) -> Any
Area under the ROC curve (binary classification only).
Note
Direction: HIGHER (1.0 = perfect separation, 0.5 = random). Range: [0, 1].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | Any | Probability scores for the positive class. | required |
| targets | Any | Binary labels (0 or 1). | required |

Returns:
| Type | Description |
|---|---|
| Any | AUC-ROC as a scalar value. |
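
One way to build intuition for the value is the pairwise interpretation: the area under the ROC curve equals the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counted as one half. The plain-Python sketch below uses that formulation; `roc_auc_sketch` is a hypothetical name, and how calibrax computes the area internally is not stated in this reference.

```python
def roc_auc_sketch(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined unless both classes are present")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5  # a tie counts as half a correctly ranked pair
    return wins / (len(pos) * len(neg))


scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
auc = roc_auc_sketch(scores, labels)  # 3 of 4 pairs are ordered correctly
```

The quadratic loop is chosen for readability; a sort-based computation gives the same number on larger inputs.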
average_precision(predictions: Any, targets: Any) -> Any
Area under the precision-recall curve.
Note
Direction: HIGHER (1.0 = perfect). Range: [0, 1].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | Any | Probability scores for the positive class. | required |
| targets | Any | Binary labels (0 or 1). | required |

Returns:
| Type | Description |
|---|---|
| Any | Average precision as a scalar value. |
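
A common way to summarize the precision-recall curve is the step-wise sum: the mean of precision-at-rank taken over the ranks where a positive appears. The plain-Python sketch below assumes that definition and assumes all scores are distinct; whether calibrax uses the same summation or an interpolated area is not specified here. `average_precision_sketch` is a hypothetical name.

```python
def average_precision_sketch(scores, labels):
    """Mean of precision@k over the ranks k that hold a positive sample."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive")
    # Rank samples from highest to lowest score (assumes no tied scores).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` samples
    return total / n_pos


scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
ap = average_precision_sketch(scores, labels)
```

The positives sit at ranks 1 and 3, giving precisions 1/1 and 2/3, so `ap` is their mean, 5/6.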
log_loss(predictions: Any, targets: Any, *, eps: float = 1e-07) -> Any
Logarithmic loss (cross-entropy).
Note
Direction: LOWER (0.0 = perfect). Range: [0, inf). Is a proper scoring rule — minimized by the true distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | Any | Predicted probabilities. For binary: 1D array of P(class=1). For multiclass: 2D array of shape (n_samples, n_classes). | required |
| targets | Any | Ground truth class indices. | required |
| eps | float | Clipping value to avoid log(0). | 1e-07 |

Returns:
| Type | Description |
|---|---|
| Any | Log loss as a scalar value. |
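
The two input layouts and the role of `eps` can be sketched in plain Python as follows. The function names are hypothetical, and clipping probabilities to the interval [eps, 1 - eps] is an assumption about what "clipping value" means; the calibrax implementation may differ in detail.

```python
import math


def clip(p, eps):
    # Assumption: probabilities are clipped to [eps, 1 - eps].
    return min(max(p, eps), 1.0 - eps)


def binary_log_loss(probs, targets, eps=1e-7):
    """probs[i] is P(class=1) for sample i; targets are 0 or 1."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = clip(p, eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(targets)


def multiclass_log_loss(prob_rows, targets, eps=1e-7):
    """prob_rows[i][c] is P(class=c) for sample i; targets are class indices."""
    total = 0.0
    for row, y in zip(prob_rows, targets):
        total -= math.log(clip(row[y], eps))
    return total / len(targets)


loss = binary_log_loss([0.9, 0.2, 0.6], [1, 0, 1])
same = multiclass_log_loss([[0.1, 0.9], [0.8, 0.2], [0.4, 0.6]], [1, 0, 1])
worst = binary_log_loss([0.0], [1])  # finite only because of clipping
```

The binary and two-column multiclass calls describe the same predictions and therefore agree. Without clipping, the last call would evaluate log(0).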
matthews_corrcoef(predictions: Any, targets: Any) -> Any
Matthews correlation coefficient.
Note
Direction: HIGHER (1.0 = perfect, 0.0 = random, -1.0 = inverse). Range: [-1, 1]. Often considered one of the most informative single scores for binary classification, because it uses all four cells of the confusion matrix.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |

Returns:
| Type | Description |
|---|---|
| Any | MCC as a scalar value. |
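
For binary labels the coefficient has a closed form in terms of the four confusion-matrix counts: (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The plain-Python sketch below applies that textbook formula; `binary_mcc_sketch` is a hypothetical name, and returning 0 for a zero denominator is an assumed convention.

```python
import math


def binary_mcc_sketch(predictions, targets):
    """Matthews correlation coefficient for 0/1 predictions and labels."""
    tp = tn = fp = fn = 0
    for p, t in zip(predictions, targets):
        if p == 1 and t == 1:
            tp += 1
        elif p == 0 and t == 0:
            tn += 1
        elif p == 1 and t == 0:
            fp += 1
        else:
            fn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # assumption: degenerate matrices are scored as 0
    return (tp * tn - fp * fn) / denom


labels = [1, 1, 0, 0, 0, 0, 1, 1]
preds = [1, 1, 1, 0, 0, 0, 0, 1]
mcc = binary_mcc_sketch(preds, labels)             # TP=3, TN=3, FP=1, FN=1
flipped = binary_mcc_sketch([1 - p for p in preds], labels)
```

Inverting every prediction flips the sign of the score, which matches the "-1.0 = inverse" end of the range.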
cohen_kappa(predictions: Any, targets: Any) -> Any
Cohen's kappa coefficient for inter-rater agreement.
kappa = (accuracy - expected_accuracy) / (1 - expected_accuracy)
Note
Direction: HIGHER (1.0 = perfect agreement, 0.0 = chance agreement). Range: [-1, 1] (negative = worse than chance).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |

Returns:
| Type | Description |
|---|---|
| Any | Cohen's kappa as a scalar value. |
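
In the kappa formula, expected_accuracy is the agreement that would occur by chance if predictions and targets were independent with their observed class frequencies. The plain-Python sketch below computes it that way; `cohen_kappa_sketch` is a hypothetical name, and returning 0 when expected accuracy equals 1 is an assumption for the undefined case.

```python
from collections import Counter


def cohen_kappa_sketch(predictions, targets):
    """kappa = (accuracy - expected_accuracy) / (1 - expected_accuracy)."""
    n = len(targets)
    observed = sum(1 for p, t in zip(predictions, targets) if p == t) / n
    pred_counts = Counter(predictions)
    true_counts = Counter(targets)
    # Chance agreement: sum over classes of P(true = c) * P(pred = c).
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 0.0  # assumption: kappa is undefined here, report 0
    return (observed - expected) / (1.0 - expected)


labels = [0, 0, 0, 0, 1, 1, 2, 2]
preds = [0, 0, 0, 1, 1, 1, 2, 0]
kappa = cohen_kappa_sketch(preds, labels)
```

Accuracy is 6/8 and chance agreement is 24/64, so `kappa` is (0.75 − 0.375) / 0.625 = 0.6.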
balanced_accuracy(predictions: Any, targets: Any) -> Any
Balanced accuracy: average recall per class.
Note
Direction: HIGHER (1.0 = perfect). Range: [0, 1]. Equal to standard accuracy on balanced datasets. More informative than accuracy on imbalanced datasets.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |

Returns:
| Type | Description |
|---|---|
| Any | Balanced accuracy as a scalar value. |
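
The difference from plain accuracy shows up on imbalanced data. The plain-Python sketch below averages recall over the classes present in the targets; `balanced_accuracy_sketch` is a hypothetical name, and restricting the average to classes that occur in the targets is an assumption.

```python
def balanced_accuracy_sketch(predictions, targets):
    """Unweighted mean of per-class recall."""
    recalls = []
    for c in sorted(set(targets)):  # assumption: only classes seen in targets
        support = sum(1 for t in targets if t == c)
        hits = sum(1 for p, t in zip(predictions, targets) if t == c and p == c)
        recalls.append(hits / support)
    return sum(recalls) / len(recalls)


labels = [0] * 8 + [1] * 2
preds = [0] * 10  # a classifier that always predicts the majority class
plain = sum(1 for p, t in zip(preds, labels) if p == t) / len(labels)
balanced = balanced_accuracy_sketch(preds, labels)
```

The majority-class predictor reaches a plain accuracy of 0.8 but a balanced accuracy of only 0.5, because its recall is 1.0 on class 0 and 0.0 on class 1.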
specificity(predictions: Any, targets: Any) -> Any
Specificity (true negative rate): TN / (TN + FP).
Note
Direction: HIGHER (1.0 = no false positives on negatives). Range: [0, 1].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | Any | Binary predictions (0 or 1). | required |
| targets | Any | Binary ground truth (0 or 1). | required |

Returns:
| Type | Description |
|---|---|
| Any | Specificity as a scalar value. |
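
As a worked instance of TN / (TN + FP), the plain-Python sketch below counts only the samples whose true label is 0. `specificity_sketch` is a hypothetical name, and returning 0 when there are no negative targets is an assumption.

```python
def specificity_sketch(predictions, targets):
    """True negative rate for 0/1 predictions and labels."""
    tn = sum(1 for p, t in zip(predictions, targets) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(predictions, targets) if p == 1 and t == 0)
    if tn + fp == 0:
        return 0.0  # assumption: no negative targets means specificity 0
    return tn / (tn + fp)


labels = [1, 1, 1, 0, 0, 0, 0]
preds = [1, 0, 1, 1, 0, 0, 0]
spec = specificity_sketch(preds, labels)  # 3 of the 4 negatives are kept
```

Errors on the positive samples (the second prediction here) do not affect the result; they are captured by sensitivity instead.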
sensitivity(predictions: Any, targets: Any) -> Any
Sensitivity (true positive rate): TP / (TP + FN).
Equivalent to recall for binary classification.
Note
Direction: HIGHER (1.0 = no false negatives). Range: [0, 1].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | Any | Binary predictions (0 or 1). | required |
| targets | Any | Binary ground truth (0 or 1). | required |

Returns:
| Type | Description |
|---|---|
| Any | Sensitivity as a scalar value. |