calibrax.metrics.functional.ranking

Ranking and retrieval metrics for evaluating ordered predictions. Provides NDCG, mean average precision, precision/recall at k, mean reciprocal rank, hit rate, and catalog coverage.

Pure functions for evaluating the quality of ranked lists and information retrieval systems. Functions accept predicted scores and ground truth relevance labels.

Includes 8 functions: ndcg, ndcg_at_k, mean_average_precision, precision_at_k, recall_at_k, mean_reciprocal_rank, hit_rate, coverage.

ndcg(scores: Any, relevance: Any) -> Any

Normalized Discounted Cumulative Gain (full list).

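NDCG = DCG / IDCG, where DCG = sum((2^rel_i - 1) / log2(i + 2)) over 0-based rank positions i in the predicted order, and IDCG is the DCG of the ideal ordering (items sorted by descending relevance).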

Note

Direction: HIGHER (1.0 = perfect ranking). Range: [0, 1].

Parameters:

Name       Type  Description                                    Default
scores     Any   Predicted relevance scores.                    required
relevance  Any   Ground truth relevance labels (non-negative).  required

Returns:

Type  Description
Any   NDCG as a scalar value.

Examples:

>>> import jax.numpy as jnp
>>> ndcg(jnp.array([3.0, 2.0, 1.0]), jnp.array([3, 2, 1]))
1.0
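
To make the formula concrete, here is a minimal plain-Python sketch of the same arithmetic. It is not the library's implementation: `ndcg_reference` is a hypothetical helper, ties are assumed to keep input order, and returning 0.0 when IDCG is zero is an assumption.

```python
import math

def ndcg_reference(scores, relevance):
    """Hypothetical helper: NDCG with exponential gain, in plain Python."""
    # Rank item indices by descending predicted score (ties keep input order).
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranked_rel = [relevance[i] for i in order]
    ideal_rel = sorted(relevance, reverse=True)

    def dcg(rels):
        # i is the 0-based rank, so the top item is divided by log2(2) = 1.
        return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels))

    ideal = dcg(ideal_rel)
    # Assumption: report 0.0 when no item has positive relevance.
    return dcg(ranked_rel) / ideal if ideal > 0 else 0.0

perfect = ndcg_reference([3.0, 2.0, 1.0], [3, 2, 1])         # 1.0
reversed_order = ndcg_reference([1.0, 2.0, 3.0], [3, 2, 1])  # roughly 0.68
```

Reversing the ranking lowers the value because the most relevant item now receives the largest discount.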

ndcg_at_k(scores: Any, relevance: Any, *, k: int) -> Any

NDCG truncated to top-k results.

Note

Direction: HIGHER (1.0 = perfect ranking in top-k). Range: [0, 1].

Parameters:

Name       Type  Description                         Default
scores     Any   Predicted relevance scores.         required
relevance  Any   Ground truth relevance labels.      required
k          int   Number of top results to consider.  required

Returns:

Type  Description
Any   NDCG@k as a scalar value.
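
The truncation can be illustrated with a plain-Python sketch. It assumes the same exponential gain as `ndcg` and assumes the ideal ordering is also cut to k items; `ndcg_at_k_reference` is a hypothetical name, not part of the library.

```python
import math

def ndcg_at_k_reference(scores, relevance, *, k):
    """Hypothetical helper: NDCG computed over the top-k ranks only."""
    # Rank item indices by descending score; ties keep input order.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranked_rel = [relevance[i] for i in order][:k]
    # Assumption: the ideal list is truncated to k as well.
    ideal_rel = sorted(relevance, reverse=True)[:k]

    def dcg(rels):
        return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels))

    ideal = dcg(ideal_rel)
    return dcg(ranked_rel) / ideal if ideal > 0 else 0.0

scores = [0.9, 0.8, 0.1]
relevance = [0, 3, 2]
top1 = ndcg_at_k_reference(scores, relevance, k=1)  # 0.0: the top item is irrelevant
top3 = ndcg_at_k_reference(scores, relevance, k=3)  # roughly 0.67
```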

mean_average_precision(scores: Any, relevance: Any) -> Any

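Average precision (AP) for a single query; averaging AP over queries gives mean average precision (MAP).

AP is the mean of the precision values measured at each rank that holds a relevant item.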

Note

Direction: HIGHER (1.0 = all relevant items ranked first). Range: [0, 1]. Relevance must be binary (0/1).

Parameters:

Name       Type  Description                    Default
scores     Any   Predicted relevance scores.    required
relevance  Any   Binary ground truth (0 or 1).  required

Returns:

Type  Description
Any   Average precision as a scalar value.

Examples:

>>> import jax.numpy as jnp
>>> mean_average_precision(jnp.array([3.0, 1.0, 2.0]), jnp.array([1, 0, 1]))
1.0
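
A case where a non-relevant item interrupts the ranking shows how the average is formed. The sketch below is plain Python under stated assumptions: `average_precision_reference` is a hypothetical helper, and returning 0.0 when nothing is relevant is an assumption.

```python
def average_precision_reference(scores, relevance):
    """Hypothetical helper: average precision for one query."""
    # Rank item indices by descending score; ties keep input order.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    precisions = []
    for rank, i in enumerate(order, start=1):
        if relevance[i] == 1:
            hits += 1
            # Precision at this rank: relevant items seen so far / rank.
            precisions.append(hits / rank)
    # Assumption: 0.0 when the query has no relevant items.
    return sum(precisions) / len(precisions) if precisions else 0.0

# Relevant items land at ranks 1 and 3, so AP = (1/1 + 2/3) / 2 = 5/6.
ap = average_precision_reference([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```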

precision_at_k(scores: Any, relevance: Any, *, k: int) -> Any

Fraction of relevant items in top-k.

Note

Direction: HIGHER (1.0 = all top-k are relevant). Range: [0, 1].

Parameters:

Name       Type  Description                         Default
scores     Any   Predicted relevance scores.         required
relevance  Any   Binary ground truth (0 or 1).       required
k          int   Number of top results to consider.  required

Returns:

Type  Description
Any   Precision@k as a scalar value.
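
As a rough illustration of the definition, the following plain-Python sketch counts relevant items among the k highest-scored ones. `precision_at_k_reference` is a hypothetical helper, and dividing by k (rather than by the number of items available) is an assumption.

```python
def precision_at_k_reference(scores, relevance, *, k):
    """Hypothetical helper: share of the top-k items that are relevant."""
    # Rank item indices by descending score; ties keep input order.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    # Assumption: the denominator is always k.
    return sum(relevance[i] for i in order[:k]) / k

scores = [0.9, 0.1, 0.8, 0.4]
relevance = [1, 1, 0, 0]
# Top-2 by score are items 0 and 2; only item 0 is relevant.
p_at_2 = precision_at_k_reference(scores, relevance, k=2)  # 0.5
```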

recall_at_k(scores: Any, relevance: Any, *, k: int) -> Any

Fraction of relevant items found in top-k.

Note

Direction: HIGHER (1.0 = all relevant items in top-k). Range: [0, 1].

Parameters:

Name       Type  Description                         Default
scores     Any   Predicted relevance scores.         required
relevance  Any   Binary ground truth (0 or 1).       required
k          int   Number of top results to consider.  required

Returns:

Type  Description
Any   Recall@k as a scalar value.
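
Recall@k differs from precision@k only in its denominator, which a small plain-Python sketch can show. `recall_at_k_reference` is a hypothetical helper, and returning 0.0 when there are no relevant items is an assumption.

```python
def recall_at_k_reference(scores, relevance, *, k):
    """Hypothetical helper: share of all relevant items found in the top-k."""
    # Rank item indices by descending score; ties keep input order.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    total_relevant = sum(relevance)
    if total_relevant == 0:
        # Assumption: 0.0 when the query has no relevant items.
        return 0.0
    return sum(relevance[i] for i in order[:k]) / total_relevant

scores = [0.9, 0.1, 0.8, 0.4]
relevance = [1, 1, 1, 0]
# Top-2 by score are items 0 and 2: two of the three relevant items.
r_at_2 = recall_at_k_reference(scores, relevance, k=2)
r_at_4 = recall_at_k_reference(scores, relevance, k=4)  # 1.0
```

On this input every item in the top-2 is relevant, yet recall@2 is only 2/3 because one relevant item is ranked last.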

mean_reciprocal_rank(scores: Any, relevance: Any) -> Any

Reciprocal of the rank of the first relevant item.

Note

Direction: HIGHER (1.0 = first item is relevant). Range: [0, 1]. Returns 0.0 if no relevant items.

Parameters:

Name       Type  Description                    Default
scores     Any   Predicted relevance scores.    required
relevance  Any   Binary ground truth (0 or 1).  required

Returns:

Type  Description
Any   MRR as a scalar value.

Examples:

>>> import jax.numpy as jnp
>>> mean_reciprocal_rank(jnp.array([1.0, 3.0, 2.0]), jnp.array([0, 1, 0]))
1.0
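
For a case where the relevant item is not ranked first, a plain-Python sketch of the same rule might look like this. `reciprocal_rank_reference` is a hypothetical helper, and ties are assumed to keep input order.

```python
def reciprocal_rank_reference(scores, relevance):
    """Hypothetical helper: 1 / rank of the first relevant item."""
    # Rank item indices by descending score; ties keep input order.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    for rank, i in enumerate(order, start=1):
        if relevance[i] == 1:
            return 1.0 / rank
    # No relevant item at all: 0.0, as stated in the note above.
    return 0.0

# The only relevant item has the lowest score, so it sits at rank 3.
rr = reciprocal_rank_reference([0.2, 0.9, 0.5], [1, 0, 0])
no_hit = reciprocal_rank_reference([0.2, 0.9, 0.5], [0, 0, 0])  # 0.0
```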

hit_rate(scores: Any, relevance: Any, *, k: int) -> Any

Whether any relevant item appears in top-k.

Note

Direction: HIGHER. Range: {0.0, 1.0} (binary).

Parameters:

Name       Type  Description                         Default
scores     Any   Predicted relevance scores.         required
relevance  Any   Binary ground truth (0 or 1).       required
k          int   Number of top results to consider.  required

Returns:

Type  Description
Any   1.0 if any relevant item in top-k, 0.0 otherwise.
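
The binary outcome can be sketched in plain Python as follows. `hit_rate_reference` is a hypothetical helper, not the library's code, and ties are assumed to keep input order.

```python
def hit_rate_reference(scores, relevance, *, k):
    """Hypothetical helper: 1.0 if the top-k holds a relevant item, else 0.0."""
    # Rank item indices by descending score; ties keep input order.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return 1.0 if any(relevance[i] == 1 for i in order[:k]) else 0.0

scores = [0.2, 0.9, 0.5]
relevance = [1, 0, 0]
hit_at_2 = hit_rate_reference(scores, relevance, k=2)  # 0.0: relevant item is ranked third
hit_at_3 = hit_rate_reference(scores, relevance, k=3)  # 1.0
```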

coverage(scores: Any, relevance: Any, *, catalog_size: int) -> Any

Fraction of catalog covered by recommendations.

Note

Direction: HIGHER (1.0 = full catalog coverage). Range: [0, 1].

Parameters:

Name          Type  Description                                             Default
scores        Any   Recommended item IDs (1D integer array).                required
relevance     Any   Unused (present for API consistency). Pass any array.  required
catalog_size  int   Total number of unique items in catalog.                required

Returns:

Type  Description
Any   Coverage as a scalar value.
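
One plausible reading of this metric, sketched in plain Python: count the distinct recommended IDs and divide by the catalog size. `coverage_reference` is a hypothetical helper that drops the unused `relevance` argument, and counting repeated IDs once is an assumption.

```python
def coverage_reference(item_ids, *, catalog_size):
    """Hypothetical helper: distinct recommended items / catalog size."""
    # Assumption: an ID recommended several times is counted once.
    return len(set(item_ids)) / catalog_size

# Three distinct items (0, 2, 5) out of a 10-item catalog.
cov = coverage_reference([0, 2, 2, 5], catalog_size=10)  # 0.3
```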