Distances and Spaces¤


Level	Intermediate
Time	~15 minutes
Prerequisites	Quickstart, Regression Metrics
Format	Python + Jupyter

Overview¤

Calibrax organises distance and divergence functions along a geometric hierarchy: flat (Euclidean, Manhattan, cosine), hyperbolic (Poincaré, Lorentz), distributional (KL, JS, Wasserstein, Sinkhorn), and information-theoretic (entropy, mutual information). This example walks through each family with concrete computations, explaining the mathematical properties that determine when each is appropriate.

These metrics form the foundation for embedding evaluation, distribution comparison, and metric learning. Understanding the distinction between true distances (metrics, which are symmetric and satisfy the triangle inequality) and divergences (which may be asymmetric and need not satisfy the triangle inequality) is key to selecting the right tool for a given task.

What You'll Learn¤

Compute Euclidean, cosine, and Manhattan distances on flat vector spaces
Measure hyperbolic distances in both the Poincaré ball and Lorentz hyperboloid models
Compare probability distributions with KL divergence, JS divergence, and Wasserstein distance
Evaluate point-cloud similarity with Sinkhorn divergence
Quantify uncertainty and dependence with entropy and mutual information

Files¤

Python Script: examples/metrics/04_distances.py
Jupyter Notebook: examples/metrics/04_distances.ipynb

Quick Start¤

source activate.sh && uv run python examples/metrics/04_distances.py

Key Concepts¤

Flat Vector Distances¤

The simplest distance family operates on vectors in Euclidean space.

import jax.numpy as jnp

from calibrax.metrics.functional.distance import (
    euclidean_distance, cosine_distance, manhattan_distance,
)

a = jnp.array([1.0, 0.0, 0.0])
b = jnp.array([0.0, 1.0, 0.0])

euclidean_distance(a, b)  # L2 norm of (a - b)
cosine_distance(a, b)     # 1 - cosine_similarity; 0 = same direction, 1 = orthogonal
manhattan_distance(a, b)  # L1 norm of (a - b)

Euclidean: rotation-invariant, sensitive to magnitude.
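Cosine: scale-invariant, measures angular separation only. It does not satisfy the triangle inequality, so it is not a metric in the strict sense.
Manhattan: sums absolute coordinate differences, so it depends on the choice of axes (not rotation-invariant) and is less dominated by a single large coordinate difference than Euclidean distance.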
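
The properties listed above can be checked with a small reference sketch in plain Python. The helper names (euclidean_ref, manhattan_ref, cosine_ref) are illustrative and are not part of Calibrax; they only spell out the textbook formulas, so the library's own implementations may differ in detail.

```python
import math

def euclidean_ref(a, b):
    """L2 norm of the difference vector."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_ref(a, b):
    """L1 norm of the difference vector."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_ref(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

a = [1.0, 0.0, 0.0]
b = [0.0, 1.0, 0.0]
b_scaled = [0.0, 10.0, 0.0]  # same direction as b, ten times longer

# Scaling b changes the Euclidean and Manhattan values but not the cosine value.
print(euclidean_ref(a, b), manhattan_ref(a, b), cosine_ref(a, b))
print(euclidean_ref(a, b_scaled), manhattan_ref(a, b_scaled), cosine_ref(a, b_scaled))
```

Comparing the two printed rows shows why cosine distance suits embeddings whose norm carries no meaning, while Euclidean and Manhattan distances are the better fit when magnitude matters.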

Hyperbolic Distances¤

Hyperbolic space has negative curvature, making it naturally suited for embedding hierarchical structures (trees, taxonomies). Calibrax supports two equivalent models.

Poincaré ball: points lie inside the unit ball (||x|| < 1). Distance grows without bound as points approach the boundary, even though the Euclidean gap to the boundary shrinks to zero.

from calibrax.metrics.functional.distance import poincare_distance

origin = jnp.array([0.0, 0.0])
near = jnp.array([0.3, 0.0])
far = jnp.array([0.8, 0.0])

poincare_distance(origin, near)  # moderate
poincare_distance(origin, far)   # much larger -- distance diverges as ||x|| approaches 1
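
To make the boundary behaviour concrete, here is a minimal plain-Python sketch of the standard Poincaré distance formula, assuming curvature -1. The name poincare_ref is illustrative and not a Calibrax function; Calibrax may use a different curvature convention or extra numerical safeguards.

```python
import math

def poincare_ref(u, v):
    """arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2))), curvature -1."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(u, v))
    sq_u = sum(x * x for x in u)
    sq_v = sum(y * y for y in v)
    return math.acosh(1.0 + 2.0 * sq_dist / ((1.0 - sq_u) * (1.0 - sq_v)))

origin = [0.0, 0.0]
for r in (0.3, 0.8, 0.99, 0.9999):
    d = poincare_ref(origin, [r, 0.0])
    # From the origin the formula reduces to 2 * artanh(r).
    print(f"r = {r}: distance = {d:.3f}, 2 * atanh(r) = {2.0 * math.atanh(r):.3f}")
```

The denominator (1 - ||u||^2)(1 - ||v||^2) is what drives the divergence: it tends to zero as either point approaches the unit sphere.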

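Lorentz hyperboloid: points satisfy -x_0^2 + ||x_spatial||^2 = -1, so the first (timelike) component is x_0 = sqrt(1 + ||x_spatial||^2). The distance formula involves no division by 1 - ||x||^2, so it tends to be numerically more stable than the Poincaré form for points far from the origin (those that map close to the boundary of the ball).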

from calibrax.metrics.functional.distance import lorentz_distance

p1 = jnp.array([1.0, 0.0, 0.0])  # origin on hyperboloid
spatial = jnp.array([0.5, 0.3])
p2 = jnp.concatenate([jnp.sqrt(1.0 + jnp.sum(spatial**2))[None], spatial])

lorentz_distance(p1, p2)
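
The equivalence of the two models can be checked numerically. The sketch below is plain Python with illustrative helper names (lift, lorentz_ref, to_poincare, poincare_ref), none of which are Calibrax functions, and it assumes curvature -1 in both models.

```python
import math

def lift(spatial):
    """Place spatial coordinates on the hyperboloid by solving for x_0."""
    return [math.sqrt(1.0 + sum(s * s for s in spatial))] + list(spatial)

def lorentz_ref(x, y):
    """arcosh of the negated Minkowski inner product -x_0*y_0 + <x_s, y_s>."""
    inner = -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))
    return math.acosh(max(1.0, -inner))  # clamp guards against rounding below 1

def to_poincare(x):
    """Map a hyperboloid point into the unit ball: x_spatial / (1 + x_0)."""
    return [s / (1.0 + x[0]) for s in x[1:]]

def poincare_ref(u, v):
    """Standard Poincare ball distance, curvature -1."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    return math.acosh(1.0 + 2.0 * sq_dist / ((1.0 - sq_u) * (1.0 - sq_v)))

p1 = lift([0.0, 0.0])  # origin on the hyperboloid: [1, 0, 0]
p2 = lift([0.5, 0.3])

d_lorentz = lorentz_ref(p1, p2)
d_poincare = poincare_ref(to_poincare(p1), to_poincare(p2))
print(d_lorentz, d_poincare)  # the two values agree up to rounding
```

Because the two models are isometric, the choice between them is a matter of numerical behaviour and convenience rather than of geometry.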

Distribution Divergences¤

Divergences measure how different two probability distributions are. Unlike true distances, they may be asymmetric and need not satisfy the triangle inequality.

from calibrax.metrics.functional.divergence import (
    kl_divergence, js_divergence, wasserstein_1d, sinkhorn_divergence,
)

p = jnp.array([0.4, 0.3, 0.2, 0.1])
q = jnp.array([0.25, 0.25, 0.25, 0.25])

kl_divergence(p, q)  # asymmetric: KL(p||q) != KL(q||p)
js_divergence(p, q)  # symmetric: JS(p,q) == JS(q,p)

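KL divergence: KL(p || q) measures the information lost when q is used to approximate p. Asymmetric and unbounded; it is infinite when q assigns zero probability to an outcome that has positive probability under p.
JS divergence: the average of KL(p || m) and KL(q || m), where m = (p + q) / 2. Symmetric and bounded in [0, log(2)] with the natural logarithm, which makes it a common choice when a symmetric, finite score is needed.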
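
The definitions above translate directly into a few lines of plain Python. The names kl_ref and js_ref are illustrative, not Calibrax functions, and the sketch assumes natural logarithms and that q is positive wherever p is positive; Calibrax may handle zeros or the log base differently.

```python
import math

def kl_ref(p, q):
    """KL(p || q) in nats; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_ref(p, q):
    """Average KL divergence of p and q from their midpoint m."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl_ref(p, m) + 0.5 * kl_ref(q, m)

p = [0.4, 0.3, 0.2, 0.1]
q = [0.25, 0.25, 0.25, 0.25]

print(kl_ref(p, q), kl_ref(q, p))  # two different values
print(js_ref(p, q), js_ref(q, p))  # equal, and no larger than log(2)
```

The midpoint m is positive wherever p or q is, which is why the JS divergence stays finite even when the supports of p and q differ.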

Sample-Based Distances¤

When you have samples rather than explicit distributions, use Wasserstein or Sinkhorn:

samples_a = jnp.array([1.0, 2.0, 3.0, 4.0, 5.0])
samples_b = jnp.array([2.0, 3.0, 4.0, 5.0, 6.0])

wasserstein_1d(samples_a, samples_b)  # 1D optimal transport

# For multidimensional point clouds, use Sinkhorn divergence
points_x = jnp.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
points_y = jnp.array([[0.5, 0.5], [1.5, 0.5], [0.5, 1.5], [1.5, 1.5]])
sinkhorn_divergence(points_x, points_y, regularization=0.1)

Sinkhorn divergence is a debiased version of the entropic optimal transport cost: it subtracts the self-transport terms for X and Y, so S(X, X) = 0, whereas the raw entropic cost of X against itself is generally nonzero.
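
Both sample-based quantities can be sketched in plain Python. Everything here is an assumption for illustration rather than Calibrax's implementation: the helper names, uniform sample weights, a squared Euclidean cost, a fixed number of Sinkhorn iterations, and the use of the plan's transport cost <P, C> without the entropy term.

```python
import math

def wasserstein_1d_ref(xs, ys):
    """W1 between two equal-size 1D samples: mean gap between sorted values."""
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)

def entropic_ot_cost(xs, ys, eps, n_iter=200):
    """Transport cost <P, C> of the Sinkhorn plan between two point clouds."""
    n, m = len(xs), len(ys)
    cost = [[sum((a - b) ** 2 for a, b in zip(x, y)) for y in ys] for x in xs]
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the uniform marginals.
        u = [(1.0 / n) / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [(1.0 / m) / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return sum(u[i] * kernel[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))

def sinkhorn_divergence_ref(xs, ys, eps):
    """Debiased cost: subtract half of each self-transport term."""
    return (entropic_ot_cost(xs, ys, eps)
            - 0.5 * entropic_ot_cost(xs, xs, eps)
            - 0.5 * entropic_ot_cost(ys, ys, eps))

samples_a = [1.0, 2.0, 3.0, 4.0, 5.0]
samples_b = [2.0, 3.0, 4.0, 5.0, 6.0]
print(wasserstein_1d_ref(samples_a, samples_b))  # every sample shifts by 1

points_x = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
points_y = [[0.5, 0.5], [1.5, 0.5], [0.5, 1.5], [1.5, 1.5]]
print(entropic_ot_cost(points_x, points_x, 0.1))         # small but nonzero
print(sinkhorn_divergence_ref(points_x, points_x, 0.1))  # zero after debiasing
print(sinkhorn_divergence_ref(points_x, points_y, 0.1))
```

The plain kernel exp(-C / eps) underflows when eps is small relative to the costs, so a production implementation would normally run the same iterations in the log domain.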

Information-Theoretic Metrics¤

Entropy and mutual information quantify uncertainty and statistical dependence.

from calibrax.metrics.functional.information import entropy, mutual_information

uniform = jnp.array([0.25, 0.25, 0.25, 0.25])
peaked = jnp.array([0.9, 0.05, 0.03, 0.02])

entropy(uniform)  # maximum for 4 outcomes: log(4)
entropy(peaked)   # low -- most mass on one outcome

# Mutual information from a joint probability table
joint = jnp.array([[0.45, 0.05], [0.05, 0.45]])
mutual_information(joint)  # well above 0 (independence); the cap for two binary variables is log(2)
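
For reference, both quantities follow from their definitions in a few lines of plain Python. The names entropy_ref and mutual_information_ref are illustrative, not Calibrax functions, and the sketch assumes natural logarithms (nats); Calibrax may use a different base.

```python
import math

def entropy_ref(p):
    """Shannon entropy in nats; zero-probability outcomes contribute nothing."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def mutual_information_ref(joint):
    """Sum of p(x, y) * log(p(x, y) / (p(x) * p(y))) over a joint table."""
    row = [sum(r) for r in joint]        # marginal over rows
    col = [sum(c) for c in zip(*joint)]  # marginal over columns
    return sum(
        pxy * math.log(pxy / (row[i] * col[j]))
        for i, r in enumerate(joint)
        for j, pxy in enumerate(r)
        if pxy > 0
    )

uniform = [0.25, 0.25, 0.25, 0.25]
peaked = [0.9, 0.05, 0.03, 0.02]
print(entropy_ref(uniform), math.log(4.0))  # uniform attains the maximum
print(entropy_ref(peaked))

dependent = [[0.45, 0.05], [0.05, 0.45]]
independent = [[0.25, 0.25], [0.25, 0.25]]
print(mutual_information_ref(dependent))    # positive
print(mutual_information_ref(independent))  # zero: the joint factorises
```

Mutual information is the KL divergence between the joint table and the product of its marginals, which is why it vanishes exactly when the two variables are independent.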

Example Code¤

The script demonstrates the asymmetry of KL divergence concretely:

p = jnp.array([0.4, 0.3, 0.2, 0.1])
q = jnp.array([0.25, 0.25, 0.25, 0.25])  # uniform

kl_divergence(p, q)  # KL(p || q)
kl_divergence(q, p)  # KL(q || p) -- different value

js_divergence(p, q)  # symmetric
js_divergence(q, p)  # same value

Next Steps¤

Model Evaluation with Composition -- combine metrics into collections, suites, and quality gates
Advanced Manifold and Graph Metrics -- SPD, Grassmann, and graph-theoretic distances
API Reference: calibrax.metrics.functional.distance -- full signatures