
calibrax.metrics.functional.classification

Classification metrics for binary and multiclass evaluation. Covers accuracy, precision, recall, F1, ROC-AUC, average precision, log loss, Matthews correlation, Cohen's kappa, balanced accuracy, and the confusion matrix.

Pure functions for computing standard classification metrics. All functions accept JAX arrays and return a scalar, except confusion_matrix, which returns a JAX array.

Includes 14 functions: accuracy, precision, recall, f1_score, fbeta_score, confusion_matrix, roc_auc, average_precision, log_loss, matthews_corrcoef, cohen_kappa, balanced_accuracy, specificity, sensitivity.

accuracy(predictions: Any, targets: Any) -> Any

Fraction of correct predictions.

Note

Direction: HIGHER (1.0 = perfect). Range: [0, 1]. Not a proper scoring rule.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |

Returns:

| Type | Description |
| --- | --- |
| Any | Accuracy as a scalar value. |
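To make the definition concrete, here is a minimal JAX sketch of accuracy. The helper name accuracy_sketch is hypothetical, and the rule that a 2D input is treated as per-class probabilities and reduced with argmax is an assumption for this example, not confirmed calibrax behavior.

```python
import jax.numpy as jnp


def accuracy_sketch(predictions, targets):
    """Fraction of predictions equal to targets (illustrative only)."""
    predictions = jnp.asarray(predictions)
    targets = jnp.asarray(targets)
    # Assumption: a 2D input holds per-class probabilities, so take the argmax.
    if predictions.ndim == 2:
        predictions = jnp.argmax(predictions, axis=-1)
    return jnp.mean(predictions == targets)


labels = jnp.array([0, 1, 2, 1])
class_indices = jnp.array([0, 1, 1, 1])
probabilities = jnp.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.6, 0.1],
])

# Both inputs describe the same predicted classes, so the scores agree.
acc_from_indices = float(accuracy_sketch(class_indices, labels))
acc_from_probs = float(accuracy_sketch(probabilities, labels))
```

Three of the four samples are classified correctly in both cases, so each call yields 0.75.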

confusion_matrix(predictions: Any, targets: Any, *, num_classes: int | None = None) -> jax.Array

Compute the confusion matrix.

Note

Helper function. It is not registered in MetricRegistry because it returns an array rather than a scalar. Rows are true classes, columns are predicted classes.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |
| num_classes | int \| None | Number of classes. Inferred from data if None. | None |

Returns:

| Type | Description |
| --- | --- |
| Array | Confusion matrix of shape (num_classes, num_classes) as JAX array. |
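The row and column convention is easiest to see on a small example. The sketch below is one possible way to build such a matrix with jnp.bincount; the helper name confusion_matrix_sketch is hypothetical, it requires num_classes explicitly, and it is not taken from the calibrax source.

```python
import jax.numpy as jnp


def confusion_matrix_sketch(predictions, targets, num_classes):
    """Count (true, predicted) pairs; rows are true classes, columns predicted."""
    predictions = jnp.asarray(predictions)
    targets = jnp.asarray(targets)
    # Flatten each (true, predicted) pair into a single bin index.
    flat_index = targets * num_classes + predictions
    counts = jnp.bincount(flat_index, length=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)


targets = jnp.array([0, 0, 1, 1, 2, 2])
predictions = jnp.array([0, 1, 1, 1, 2, 0])
cm = confusion_matrix_sketch(predictions, targets, num_classes=3)

# Row sums give the number of true samples per class (the support).
support = cm.sum(axis=1)
```

Here cm[0, 1] counts samples whose true class is 0 but were predicted as class 1, and the diagonal holds the correct predictions.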

precision(predictions: Any, targets: Any, *, average: str = 'binary') -> Any

Precision: TP / (TP + FP).

Note

Direction: HIGHER (1.0 = no false positives). Range: [0, 1].

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |
| average | str | Averaging method. "binary" for binary classification, "micro" sums globally, "macro" averages per-class, "weighted" weights by class frequency. | 'binary' |

Returns:

| Type | Description |
| --- | --- |
| Any | Precision as a scalar value. |
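The averaging modes differ only in how per-class counts are combined. The following sketch computes the "micro", "macro", and "weighted" variants from a confusion matrix whose rows are true classes. The helper name precision_from_confusion and the choice to score a class with no predictions as 0 are assumptions for illustration; the "binary" mode is omitted.

```python
import jax.numpy as jnp


def precision_from_confusion(cm, average="macro"):
    """Multiclass precision from a confusion matrix (rows true, columns predicted)."""
    cm = jnp.asarray(cm, dtype=jnp.float32)
    true_positives = jnp.diag(cm)
    predicted_totals = cm.sum(axis=0)  # TP + FP per class (column sums)
    support = cm.sum(axis=1)           # true samples per class (row sums)
    # Assumption: a class that is never predicted gets precision 0.
    per_class = true_positives / jnp.maximum(predicted_totals, 1.0)
    if average == "micro":
        return true_positives.sum() / predicted_totals.sum()
    if average == "macro":
        return per_class.mean()
    if average == "weighted":
        return jnp.sum(per_class * support) / support.sum()
    raise ValueError(f"unsupported average: {average!r}")


cm = [[3, 1, 0],
      [0, 2, 0],
      [1, 0, 1]]
micro = float(precision_from_confusion(cm, "micro"))
macro = float(precision_from_confusion(cm, "macro"))
weighted = float(precision_from_confusion(cm, "weighted"))
```

With per-class precisions of 3/4, 2/3, and 1, the three modes give 0.75, about 0.806, and about 0.792 respectively, which shows how class frequency shifts the weighted result toward the largest class.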

recall(predictions: Any, targets: Any, *, average: str = 'binary') -> Any

Recall (sensitivity): TP / (TP + FN).

Note

Direction: HIGHER (1.0 = no false negatives). Range: [0, 1].

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |
| average | str | Averaging method. "binary", "micro", "macro", "weighted". | 'binary' |

Returns:

| Type | Description |
| --- | --- |
| Any | Recall as a scalar value. |
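For the binary case, the TP / (TP + FN) formula can be sketched directly from label arrays. The helper name binary_recall_sketch, the convention that class 1 is the positive class, and the choice to return 0 when there are no positive targets are assumptions for this example.

```python
import jax.numpy as jnp


def binary_recall_sketch(predictions, targets):
    """TP / (TP + FN), treating class 1 as the positive class."""
    predictions = jnp.asarray(predictions)
    targets = jnp.asarray(targets)
    true_positives = jnp.sum((predictions == 1) & (targets == 1))
    false_negatives = jnp.sum((predictions == 0) & (targets == 1))
    # Assumption: with no positive targets the score is defined as 0.
    return true_positives / jnp.maximum(true_positives + false_negatives, 1)


targets = jnp.array([1, 1, 1, 0, 0, 1])
predictions = jnp.array([1, 0, 1, 0, 1, 1])
recall_value = float(binary_recall_sketch(predictions, targets))
```

Three of the four positive samples are recovered, so the result is 0.75; the false positive at index 4 does not affect recall.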

fbeta_score(predictions: Any, targets: Any, *, beta: float = 1.0, average: str = 'binary') -> Any

Generalized F-measure with configurable beta.

F_beta = (1 + beta^2) * (precision * recall) / (beta^2 * precision + recall)

Note

Direction: HIGHER (1.0 = perfect). Range: [0, 1]. beta < 1 weights precision more; beta > 1 weights recall more.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |
| beta | float | Weight of recall vs precision. 1.0 = F1, 2.0 = F2. | 1.0 |
| average | str | Averaging method. "binary", "micro", "macro", "weighted". | 'binary' |

Returns:

| Type | Description |
| --- | --- |
| Any | F-beta score as a scalar value. |
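The effect of beta is easiest to see by applying the F_beta formula to fixed precision and recall values. The helper fbeta_from_precision_recall below is a hypothetical name, and returning 0 when the denominator is zero is an assumption for this sketch.

```python
import jax.numpy as jnp


def fbeta_from_precision_recall(precision, recall, beta=1.0):
    """(1 + beta^2) * P * R / (beta^2 * P + R), with 0 for an empty denominator."""
    precision = jnp.asarray(precision, dtype=jnp.float32)
    recall = jnp.asarray(recall, dtype=jnp.float32)
    beta_sq = beta ** 2
    denominator = beta_sq * precision + recall
    # Guard the division so a zero denominator does not produce NaN.
    safe_denominator = jnp.where(denominator > 0, denominator, 1.0)
    score = (1.0 + beta_sq) * precision * recall / safe_denominator
    return jnp.where(denominator > 0, score, 0.0)


# A classifier with low precision but perfect recall.
p, r = 0.5, 1.0
f_half = float(fbeta_from_precision_recall(p, r, beta=0.5))
f_one = float(fbeta_from_precision_recall(p, r, beta=1.0))
f_two = float(fbeta_from_precision_recall(p, r, beta=2.0))
```

Because recall exceeds precision here, the score rises as beta grows: roughly 0.556 for beta = 0.5, 0.667 for beta = 1, and 0.833 for beta = 2.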

f1_score(predictions: Any, targets: Any, *, average: str = 'binary') -> Any

F1 score: harmonic mean of precision and recall.

Equivalent to fbeta_score(predictions, targets, beta=1.0).

Note

Direction: HIGHER (1.0 = perfect). Range: [0, 1].

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |
| average | str | Averaging method. "binary", "micro", "macro", "weighted". | 'binary' |

Returns:

| Type | Description |
| --- | --- |
| Any | F1 score as a scalar value. |

roc_auc(predictions: Any, targets: Any) -> Any

Area under the ROC curve (binary classification only).

Note

Direction: HIGHER (1.0 = perfect separation, 0.5 = random). Range: [0, 1].

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| predictions | Any | Probability scores for the positive class. | required |
| targets | Any | Binary labels (0 or 1). | required |

Returns:

| Type | Description |
| --- | --- |
| Any | AUC-ROC as a scalar value. |
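One way to interpret ROC-AUC is as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as half. The sketch below computes it by comparing all positive/negative pairs. This pairwise formulation and the name roc_auc_pairwise are assumptions for illustration; the library may use a different algorithm, and this version needs memory quadratic in the number of samples.

```python
import jax.numpy as jnp


def roc_auc_pairwise(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    scores = jnp.asarray(scores, dtype=jnp.float32)
    labels = jnp.asarray(labels)
    is_positive = labels == 1
    is_negative = labels == 0
    # pair_mask[i, j] is True when sample i is positive and sample j is negative.
    pair_mask = is_positive[:, None] & is_negative[None, :]
    diff = scores[:, None] - scores[None, :]
    # A correctly ordered pair counts 1, a tie counts 0.5.
    pair_score = (diff > 0) + 0.5 * (diff == 0)
    return jnp.where(pair_mask, pair_score, 0.0).sum() / pair_mask.sum()


scores = jnp.array([0.1, 0.4, 0.35, 0.8])
labels = jnp.array([0, 0, 1, 1])
auc = float(roc_auc_pairwise(scores, labels))
```

Of the four positive/negative pairs, three are ordered correctly (only 0.35 versus 0.4 is inverted), so the result is 0.75.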

average_precision(predictions: Any, targets: Any) -> Any

Area under the precision-recall curve.

Note

Direction: HIGHER (1.0 = perfect). Range: [0, 1].

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| predictions | Any | Probability scores for the positive class. | required |
| targets | Any | Binary labels (0 or 1). | required |

Returns:

| Type | Description |
| --- | --- |
| Any | Average precision as a scalar value. |
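A common way to summarize the precision-recall curve is the step-wise sum of precision at each rank where a positive sample appears, divided by the number of positives. The sketch below uses that formulation. Whether calibrax uses this step-wise sum or another interpolation is not stated in this reference, so treat average_precision_sketch as a hypothetical helper; it also ignores tied scores.

```python
import jax.numpy as jnp


def average_precision_sketch(scores, labels):
    """Mean of precision@k over the ranks k at which a positive sample occurs."""
    scores = jnp.asarray(scores, dtype=jnp.float32)
    labels = jnp.asarray(labels, dtype=jnp.float32)
    # Rank samples from highest to lowest score.
    order = jnp.argsort(-scores)
    sorted_labels = labels[order]
    cumulative_tp = jnp.cumsum(sorted_labels)
    ranks = jnp.arange(1, sorted_labels.size + 1)
    precision_at_k = cumulative_tp / ranks
    # Only ranks holding a positive sample contribute to the sum.
    return jnp.sum(precision_at_k * sorted_labels) / jnp.sum(sorted_labels)


scores = jnp.array([0.1, 0.4, 0.35, 0.8])
labels = jnp.array([0, 0, 1, 1])
ap = float(average_precision_sketch(scores, labels))
```

The positives land at ranks 1 and 3 with precisions 1 and 2/3, giving an average precision of about 0.833.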

log_loss(predictions: Any, targets: Any, *, eps: float = 1e-07) -> Any

Logarithmic loss (cross-entropy).

Note

Direction: LOWER (0.0 = perfect). Range: [0, inf). A proper scoring rule: its expected value is minimized by the true distribution.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| predictions | Any | Predicted probabilities. For binary: 1D array of P(class=1). For multiclass: 2D array of shape (n_samples, n_classes). | required |
| targets | Any | Ground truth class indices. | required |
| eps | float | Clipping value to avoid log(0). | 1e-07 |

Returns:

| Type | Description |
| --- | --- |
| Any | Log loss as a scalar value. |
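The two input layouts described above can be handled by picking out the probability assigned to the true class and averaging its negative log. The sketch below does this under stated assumptions: the name log_loss_sketch is hypothetical, probabilities are clipped to [eps, 1 - eps], and multiclass rows are not renormalized after clipping.

```python
import jax.numpy as jnp


def log_loss_sketch(probabilities, targets, eps=1e-7):
    """Mean negative log-probability assigned to the true class."""
    probabilities = jnp.asarray(probabilities, dtype=jnp.float32)
    # Clip so that log(0) can never occur.
    probabilities = jnp.clip(probabilities, eps, 1.0 - eps)
    targets = jnp.asarray(targets)
    if probabilities.ndim == 1:
        # Binary: the input is P(class=1), so class 0 gets 1 - p.
        picked = jnp.where(targets == 1, probabilities, 1.0 - probabilities)
    else:
        # Multiclass: select the column that matches each target index.
        picked = jnp.take_along_axis(probabilities, targets[:, None], axis=1)[:, 0]
    return -jnp.mean(jnp.log(picked))


binary_loss = float(log_loss_sketch(jnp.array([0.9, 0.2, 0.8]), jnp.array([1, 0, 1])))
multiclass_loss = float(
    log_loss_sketch(
        jnp.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]),
        jnp.array([0, 1]),
    )
)
```

In the binary example the true-class probabilities are 0.9, 0.8, and 0.8, so the loss is the mean of their negative logs, about 0.184.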

matthews_corrcoef(predictions: Any, targets: Any) -> Any

Matthews correlation coefficient.

Note

Direction: HIGHER (1.0 = perfect, 0.0 = random, -1.0 = inverse). Range: [-1, 1]. Often regarded as one of the most informative single scores for binary classification, because it uses all four cells of the confusion matrix.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |

Returns:

| Type | Description |
| --- | --- |
| Any | MCC as a scalar value. |
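For binary labels, the standard MCC formula is (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below applies it directly. The name binary_mcc_sketch and the convention of returning 0 when the denominator is zero are assumptions for this example, and the multiclass generalization is not shown.

```python
import jax.numpy as jnp


def binary_mcc_sketch(predictions, targets):
    """Matthews correlation coefficient from the four binary confusion counts."""
    predictions = jnp.asarray(predictions)
    targets = jnp.asarray(targets)
    tp = jnp.sum((predictions == 1) & (targets == 1)).astype(jnp.float32)
    tn = jnp.sum((predictions == 0) & (targets == 0)).astype(jnp.float32)
    fp = jnp.sum((predictions == 1) & (targets == 0)).astype(jnp.float32)
    fn = jnp.sum((predictions == 0) & (targets == 1)).astype(jnp.float32)
    numerator = tp * tn - fp * fn
    denominator = jnp.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: an undefined ratio (zero denominator) is reported as 0.
    safe_denominator = jnp.where(denominator > 0, denominator, 1.0)
    return jnp.where(denominator > 0, numerator / safe_denominator, 0.0)


targets = jnp.array([1, 1, 1, 0, 0, 1])
predictions = jnp.array([1, 0, 1, 0, 1, 1])
mcc = float(binary_mcc_sketch(predictions, targets))
```

With TP = 3, TN = 1, FP = 1, and FN = 1, the numerator is 2 and the denominator is 8, so the score is 0.25 even though plain accuracy is about 0.67.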

cohen_kappa(predictions: Any, targets: Any) -> Any

Cohen's kappa coefficient for inter-rater agreement.

kappa = (accuracy - expected_accuracy) / (1 - expected_accuracy)

Note

Direction: HIGHER (1.0 = perfect agreement, 0.0 = chance agreement). Range: [-1, 1] (negative = worse than chance).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |

Returns:

| Type | Description |
| --- | --- |
| Any | Cohen's kappa as a scalar value. |
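The kappa formula above needs two quantities: the observed accuracy and the accuracy expected if predictions and targets were independent with the same marginals. The sketch below derives both from a confusion matrix (rows true, columns predicted). The helper name kappa_from_confusion is hypothetical and the code is not taken from calibrax.

```python
import jax.numpy as jnp


def kappa_from_confusion(cm):
    """(accuracy - expected_accuracy) / (1 - expected_accuracy)."""
    cm = jnp.asarray(cm, dtype=jnp.float32)
    total = cm.sum()
    observed = jnp.trace(cm) / total
    # Chance agreement: product of true and predicted class frequencies.
    true_freq = cm.sum(axis=1) / total
    predicted_freq = cm.sum(axis=0) / total
    expected = jnp.sum(true_freq * predicted_freq)
    return (observed - expected) / (1.0 - expected)


cm = [[3, 1, 0],
      [0, 2, 0],
      [1, 0, 1]]
kappa = float(kappa_from_confusion(cm))
```

Observed accuracy is 6/8 = 0.75 and expected accuracy is 24/64 = 0.375, so kappa is 0.375 / 0.625 = 0.6.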

balanced_accuracy(predictions: Any, targets: Any) -> Any

Balanced accuracy: average recall per class.

Note

Direction: HIGHER (1.0 = perfect). Range: [0, 1]. Equal to standard accuracy when every class has the same number of samples. More informative than accuracy on imbalanced datasets, where a majority-class predictor can score high accuracy.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| predictions | Any | Predicted class indices or probability array. | required |
| targets | Any | Ground truth class indices. | required |

Returns:

| Type | Description |
| --- | --- |
| Any | Balanced accuracy as a scalar value. |
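The difference from plain accuracy shows up clearly on an imbalanced dataset. The sketch below averages per-class recall; the name balanced_accuracy_sketch, the explicit num_classes argument, and the choice to average only over classes present in the targets are assumptions for this example.

```python
import jax.numpy as jnp


def balanced_accuracy_sketch(predictions, targets, num_classes):
    """Mean of per-class recall over the classes that appear in targets."""
    predictions = jnp.asarray(predictions)
    targets = jnp.asarray(targets)
    # one_hot_targets[i, c] is True when sample i belongs to class c.
    one_hot_targets = targets[:, None] == jnp.arange(num_classes)[None, :]
    correct = (predictions == targets)[:, None] & one_hot_targets
    support = one_hot_targets.sum(axis=0)
    recall_per_class = correct.sum(axis=0) / jnp.maximum(support, 1)
    # Assumption: classes absent from targets are left out of the average.
    present = support > 0
    return jnp.sum(jnp.where(present, recall_per_class, 0.0)) / jnp.sum(present)


# Eight negatives, two positives, and a model that always predicts class 0.
targets = jnp.array([0] * 8 + [1] * 2)
predictions = jnp.zeros(10, dtype=jnp.int32)

plain_accuracy = float(jnp.mean(predictions == targets))
balanced = float(balanced_accuracy_sketch(predictions, targets, num_classes=2))
```

The always-negative model reaches 0.8 plain accuracy but only 0.5 balanced accuracy, since its recall is 1 on class 0 and 0 on class 1.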

specificity(predictions: Any, targets: Any) -> Any

Specificity (true negative rate): TN / (TN + FP).

Note

Direction: HIGHER (1.0 = no false positives). Range: [0, 1].

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| predictions | Any | Binary predictions (0 or 1). | required |
| targets | Any | Binary ground truth (0 or 1). | required |

Returns:

| Type | Description |
| --- | --- |
| Any | Specificity as a scalar value. |
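Specificity mirrors recall but is computed over the negative class. The sketch below applies TN / (TN + FP) to binary label arrays; the name specificity_sketch and the choice to return 0 when there are no negative targets are assumptions for illustration.

```python
import jax.numpy as jnp


def specificity_sketch(predictions, targets):
    """TN / (TN + FP): the share of actual negatives predicted as negative."""
    predictions = jnp.asarray(predictions)
    targets = jnp.asarray(targets)
    true_negatives = jnp.sum((predictions == 0) & (targets == 0))
    false_positives = jnp.sum((predictions == 1) & (targets == 0))
    # Assumption: with no negative targets the score is defined as 0.
    return true_negatives / jnp.maximum(true_negatives + false_positives, 1)


targets = jnp.array([1, 1, 1, 0, 0, 1])
predictions = jnp.array([1, 0, 1, 0, 1, 1])
specificity_value = float(specificity_sketch(predictions, targets))
```

Of the two negative samples, one is predicted correctly and one is flagged as positive, so the result is 0.5.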

sensitivity(predictions: Any, targets: Any) -> Any

Sensitivity (true positive rate): TP / (TP + FN).

Equivalent to recall for binary classification.

Note

Direction: HIGHER (1.0 = no false negatives). Range: [0, 1].

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| predictions | Any | Binary predictions (0 or 1). | required |
| targets | Any | Binary ground truth (0 or 1). | required |

Returns:

| Type | Description |
| --- | --- |
| Any | Sensitivity as a scalar value. |