viva_math/entropy

Entropy and information theory functions.

Based on Shannon (1948) and Kullback & Leibler (1951). Used for memory consolidation scoring and uncertainty quantification.

References:

Shannon (1948) “A Mathematical Theory of Communication”
Cover & Thomas (2006) “Elements of Information Theory”

Values

binary_cross_entropy

pub fn binary_cross_entropy(
  p: Float,
  q: Float,
) -> Result(Float, Nil)

Binary cross-entropy between a target probability p and a predicted probability q.

H(p, q) = -[p log(q) + (1-p) log(1-q)]
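The formula above can be sketched in Python as follows. This is an illustration, not the library's implementation: the natural logarithm, the input checks, and the use of `None` to stand in for `Error(Nil)` are all assumptions, since the documentation does not state the log base or the error conditions.

```python
import math

def binary_cross_entropy(p, q):
    """Return -[p ln(q) + (1 - p) ln(1 - q)], or None when it is undefined."""
    # Assumption: p must be a probability, and q must lie strictly inside
    # (0, 1) so that both logarithms are finite.
    if not (0.0 <= p <= 1.0) or not (0.0 < q < 1.0):
        return None
    return -(p * math.log(q) + (1.0 - p) * math.log(1.0 - q))

# A confident, correct prediction costs little; a confident, wrong one costs a lot.
print(binary_cross_entropy(1.0, 0.9))
print(binary_cross_entropy(1.0, 0.1))
print(binary_cross_entropy(0.5, 0.0))  # None: log(0) is undefined
```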

conditional_entropy

pub fn conditional_entropy(
  px: List(Float),
  pxy: List(List(Float)),
) -> Float

Conditional entropy: H(X|Y) = H(X, Y) - H(Y)

Measures the uncertainty that remains in X once Y is known.
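As a worked illustration of the identity H(X|Y) = H(X, Y) - H(Y), here is a minimal Python sketch. It assumes the joint table is laid out as `pxy[x][y]` and derives the marginal of Y from the column sums; the documented function instead takes a marginal `px` as an argument, and how it uses that argument is not specified here. Base-2 logarithms are also an assumption.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0.0)

def conditional_entropy_x_given_y(pxy):
    """H(X|Y) = H(X, Y) - H(Y) for a joint table indexed as pxy[x][y]."""
    joint = [p for row in pxy for p in row]   # flatten the table for H(X, Y)
    py = [sum(col) for col in zip(*pxy)]      # marginal of Y: column sums
    return entropy(joint) - entropy(py)

independent = [[0.25, 0.25], [0.25, 0.25]]    # Y says nothing about X
determined = [[0.5, 0.0], [0.0, 0.5]]         # Y fixes X completely
print(conditional_entropy_x_given_y(independent))  # 1.0
print(conditional_entropy_x_given_y(determined))   # 0.0
```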

cross_entropy

pub fn cross_entropy(
  p: List(Float),
  q: List(Float),
) -> Result(Float, Nil)

Cross-entropy: H(P, Q) = -Σ p(x) log q(x)

Commonly used as a loss function in machine learning. It decomposes as H(P, Q) = H(P) + D_KL(P || Q).
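The decomposition H(P, Q) = H(P) + D_KL(P || Q) can be checked numerically. The Python sketch below is illustrative only: base-2 logarithms, the error conditions, and `None` as a stand-in for `Error(Nil)` are assumptions rather than documented behaviour.

```python
import math

def cross_entropy(p, q):
    """H(P, Q) = -sum p(x) log2 q(x); None if it cannot be computed."""
    if len(p) != len(q):
        return None
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi <= 0.0:          # log2(0) is undefined where P has mass
                return None
            total -= pi * math.log2(qi)
    return total

p = [0.5, 0.5]
q = [0.25, 0.75]
h_p = cross_entropy(p, p)          # H(P, P) is just H(P)
kl = sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))
print(cross_entropy(p, q), h_p + kl)   # the two values agree
```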

jensen_shannon

pub fn jensen_shannon(
  p: List(Float),
  q: List(Float),
) -> Result(Float, Nil)

Jensen-Shannon divergence: JS(P, Q) = (D_KL(P || M) + D_KL(Q || M)) / 2, where M = (P + Q) / 2.

The divergence is symmetric and, with log₂, bounded in [0, 1].
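A short Python sketch of the definition, using log₂ so that the [0, 1] bound is visible. The helper `kl` and the `None` return for mismatched lengths are assumptions made for the illustration, not the library's code.

```python
import math

def kl(p, q):
    """D_KL(P || Q) in bits, skipping terms where p(x) = 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def jensen_shannon(p, q):
    """JS(P, Q) = (D_KL(P || M) + D_KL(Q || M)) / 2 with M = (P + Q) / 2."""
    if len(p) != len(q):
        return None
    # M is positive wherever P or Q is, so both KL terms are always finite.
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return (kl(p, m) + kl(q, m)) / 2.0

print(jensen_shannon([0.5, 0.5], [0.5, 0.5]))  # 0.0: identical distributions
print(jensen_shannon([1.0, 0.0], [0.0, 1.0]))  # 1.0: disjoint support
```

Because each distribution is compared against the mixture M, the result stays finite even when Q is zero where P is not.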

kl_divergence

pub fn kl_divergence(
  p: List(Float),
  q: List(Float),
) -> Result(Float, Nil)

KL Divergence: D_KL(P || Q) = Σ p(x) log(p(x) / q(x))

Measures how much P diverges from Q. It is not symmetric: in general D_KL(P || Q) ≠ D_KL(Q || P). P is the “true” distribution and Q is the approximation.

Returns Error(Nil) if the distributions have different lengths or if Q is zero where P is non-zero.
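The asymmetry and the two error conditions can be seen in a minimal Python sketch. It assumes base-2 logarithms (the documentation does not state the base for this function) and uses `None` where the Gleam function would return an error.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) = sum p(x) log2(p(x) / q(x)), or None on invalid input."""
    if len(p) != len(q):
        return None                # distributions of different lengths
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi <= 0.0:
                return None        # Q is zero where P is non-zero
            total += pi * math.log2(pi / qi)
    return total

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, q))                    # differs from the line below
print(kl_divergence(q, p))
print(kl_divergence([0.5, 0.5], [1.0, 0.0]))  # None
```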

mutual_information

pub fn mutual_information(
  px: List(Float),
  py: List(Float),
  pxy: List(List(Float)),
) -> Float

Mutual Information: I(X; Y) = H(X) + H(Y) - H(X, Y)

Measures the information shared between two variables. Takes the marginal distributions px and py and the joint distribution pxy as input.
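The identity I(X; Y) = H(X) + H(Y) - H(X, Y) can be sketched in Python as below. The argument order mirrors the documented signature, but the base-2 logarithm and the `entropy` helper are assumptions for the illustration.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0.0)

def mutual_information(px, py, pxy):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) from marginals and a joint table."""
    joint = [p for row in pxy for p in row]   # flatten the table for H(X, Y)
    return entropy(px) + entropy(py) - entropy(joint)

half = [0.5, 0.5]
# Independent variables share no information.
print(mutual_information(half, half, [[0.25, 0.25], [0.25, 0.25]]))  # 0.0
# Perfectly correlated variables share one full bit.
print(mutual_information(half, half, [[0.5, 0.0], [0.0, 0.5]]))      # 1.0
```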

relative_entropy_rate

pub fn relative_entropy_rate(
  observed: List(Float),
  expected: List(Float),
) -> Result(Float, Nil)

Relative entropy rate for sequences.

Used for measuring “surprise” in temporal data.

shannon

pub fn shannon(probabilities: List(Float)) -> Float

Shannon entropy: H(X) = -Σ p(x) log₂ p(x)

Measures uncertainty/information content of a distribution. Higher entropy = more uncertainty.

Examples

shannon([0.5, 0.5])               // -> 1.0 (maximum for 2 outcomes)
shannon([1.0, 0.0])               // -> 0.0 (no uncertainty)
shannon([0.25, 0.25, 0.25, 0.25]) // -> 2.0
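The same three values can be reproduced with a short Python sketch of the formula. It is not the library's code; in particular, skipping zero-probability terms (the convention 0 · log₂ 0 = 0) is an assumption, although it is the convention under which the second example evaluates to 0.0.

```python
import math

def shannon(probabilities):
    """H(X) = -sum p(x) log2 p(x), treating 0 * log2(0) as 0."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0.0)

print(shannon([0.5, 0.5]))                # 1.0
print(shannon([1.0, 0.0]))                # 0.0
print(shannon([0.25, 0.25, 0.25, 0.25]))  # 2.0
```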

shannon_normalized

pub fn shannon_normalized(probabilities: List(Float)) -> Float

Normalized Shannon entropy, in the range 0 to 1.

Divides the Shannon entropy by log₂(n), its maximum value for n outcomes.
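A minimal Python sketch of the normalization follows. Two details are assumptions, since the documentation does not cover them: n is taken to be the length of the list (zero-probability outcomes included), and lists with fewer than two outcomes return 0.0 to avoid dividing by log₂(1) = 0.

```python
import math

def shannon_normalized(probabilities):
    """Shannon entropy divided by log2(n), giving a value between 0 and 1."""
    n = len(probabilities)
    if n < 2:
        return 0.0  # assumption: a single outcome carries no uncertainty
    h = sum(-p * math.log2(p) for p in probabilities if p > 0.0)
    return h / math.log2(n)

print(shannon_normalized([0.25, 0.25, 0.25, 0.25]))  # 1.0: uniform is the maximum
print(shannon_normalized([0.5, 0.5, 0.0, 0.0]))      # 0.5: 1 bit out of a possible 2
print(shannon_normalized([1.0, 0.0]))                # 0.0: no uncertainty
```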

symmetric_kl

pub fn symmetric_kl(
  p: List(Float),
  q: List(Float),
) -> Result(Float, Nil)

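Symmetric KL divergence, also known as the Jeffreys divergence. Unlike the Jensen-Shannon divergence, it compares P and Q directly rather than against their mixture, so it is unbounded.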

D_sym(P, Q) = D_KL(P || Q) + D_KL(Q || P)
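The sum of the two directed divergences can be sketched in Python as follows. Base-2 logarithms, the `kl` helper, and `None` as a stand-in for `Error(Nil)` are assumptions made for the illustration.

```python
import math

def kl(p, q):
    """D_KL(P || Q) in bits, or None if lengths differ or Q lacks support."""
    if len(p) != len(q):
        return None
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi <= 0.0:
                return None
            total += pi * math.log2(pi / qi)
    return total

def symmetric_kl(p, q):
    """D_sym(P, Q) = D_KL(P || Q) + D_KL(Q || P); fails if either term fails."""
    forward, backward = kl(p, q), kl(q, p)
    if forward is None or backward is None:
        return None
    return forward + backward

p = [0.5, 0.5]
q = [0.9, 0.1]
print(symmetric_kl(p, q))   # same value as the line below
print(symmetric_kl(q, p))
```

For this pair the result is larger than 1 bit, and it fails outright as soon as either distribution has a zero where the other does not.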