viva_math/entropy
Entropy and information theory functions.
Based on Shannon (1948) and Kullback & Leibler (1951). Used for memory consolidation scoring and uncertainty quantification.
References:
- Shannon (1948) “A Mathematical Theory of Communication”
- Cover & Thomas (2006) “Elements of Information Theory”
Values
pub fn binary_cross_entropy(
p: Float,
q: Float,
) -> Result(Float, Nil)
Binary cross-entropy for a single probability, where p is the target probability and q is its estimate.
H(p, q) = -[p log(q) + (1-p) log(1-q)]
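A minimal usage sketch, assuming the module is imported as `viva_math/entropy`. Which inputs produce `Error` is not stated here, so the sketch only handles both outcomes. The log base is also unspecified, so no numeric result is shown.

```gleam
import gleam/float
import gleam/io
import viva_math/entropy

pub fn main() {
  // p = 1.0 is the target label, q = 0.8 is the predicted probability.
  case entropy.binary_cross_entropy(1.0, 0.8) {
    Ok(loss) -> io.println("loss = " <> float.to_string(loss))
    // The error conditions are not documented above; out-of-range inputs
    // or q equal to 0 or 1 are plausible causes (an assumption).
    Error(Nil) -> io.println("invalid input")
  }
}
```

With p = 1 the formula reduces to -log(q), so the loss shrinks toward 0 as q approaches 1.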
pub fn conditional_entropy(
px: List(Float),
pxy: List(List(Float)),
) -> Float
Conditional entropy: H(X|Y) = H(X, Y) - H(Y)
The uncertainty that remains in X once Y is known.
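Two hand-worked cases may help fix the intuition. They assume base-2 logarithms (bits), which is an assumption here since the log base of this function is not stated.

```latex
% Independent, uniform X and Y: p(x, y) = 1/4 for all four pairs.
\begin{align*}
H(X, Y)     &= -4 \cdot \tfrac14 \log_2 \tfrac14 = 2 \text{ bits}, \\
H(Y)        &= -2 \cdot \tfrac12 \log_2 \tfrac12 = 1 \text{ bit}, \\
H(X \mid Y) &= H(X, Y) - H(Y) = 1 \text{ bit} = H(X).
\end{align*}

% Perfectly coupled X = Y: p(0, 0) = p(1, 1) = 1/2, other cells 0.
\begin{align*}
H(X, Y)     &= 1 \text{ bit}, \qquad H(Y) = 1 \text{ bit}, \\
H(X \mid Y) &= 1 - 1 = 0 \text{ bits}.
\end{align*}
```

In the first case knowing Y tells us nothing about X. In the second it removes all uncertainty.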
pub fn cross_entropy(
p: List(Float),
q: List(Float),
) -> Result(Float, Nil)
Cross-entropy: H(P, Q) = -Σ p(x) log q(x)
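Commonly used as a loss function in machine learning. It decomposes as H(P, Q) = H(P) + D_KL(P || Q), so for a fixed P, minimizing the cross-entropy over Q also minimizes D_KL(P || Q).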
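The decomposition can be checked by hand. The sketch below assumes base-2 logarithms (the base used by this function is not stated) and the illustrative distributions P = (0.5, 0.5), Q = (0.25, 0.75).

```latex
\begin{align*}
H(P, Q) &= -\left[\tfrac12 \log_2 \tfrac14 + \tfrac12 \log_2 \tfrac34\right]
         \approx 1 + 0.2075 = 1.2075, \\
H(P)    &= -\left[\tfrac12 \log_2 \tfrac12 + \tfrac12 \log_2 \tfrac12\right] = 1, \\
D_{KL}(P \,\|\, Q) &= \tfrac12 \log_2 \frac{0.5}{0.25} + \tfrac12 \log_2 \frac{0.5}{0.75}
         \approx 0.5 - 0.2925 = 0.2075, \\
H(P) + D_{KL}(P \,\|\, Q) &\approx 1.2075 = H(P, Q).
\end{align*}
```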
pub fn jensen_shannon(
p: List(Float),
q: List(Float),
) -> Result(Float, Nil)
Jensen-Shannon divergence: JS(P, Q) = (D_KL(P || M) + D_KL(Q || M)) / 2, where M = (P + Q) / 2 is the pointwise average of the two distributions.
Unlike KL divergence, it is symmetric in P and Q, and it is bounded to [0, 1] when using log₂.
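A worked example of the upper bound, using log₂ and the usual convention 0 log 0 = 0. The two distributions are illustrative and have disjoint support.

```latex
% P = (1, 0), Q = (0, 1), so M = (P + Q) / 2 = (1/2, 1/2).
\begin{align*}
D_{KL}(P \,\|\, M) &= 1 \cdot \log_2 \frac{1}{1/2} = 1, \\
D_{KL}(Q \,\|\, M) &= 1 \cdot \log_2 \frac{1}{1/2} = 1, \\
JS(P, Q) &= \frac{1 + 1}{2} = 1.
\end{align*}
```

At the other extreme, P = Q gives M = P and therefore JS(P, Q) = 0.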
pub fn kl_divergence(
p: List(Float),
q: List(Float),
) -> Result(Float, Nil)
KL Divergence: D_KL(P || Q) = Σ p(x) log(p(x) / q(x))
Measures how far P diverges from Q. It is not symmetric: in general D_KL(P || Q) ≠ D_KL(Q || P). P is the “true” distribution and Q is the approximation.
Returns an Error if the distributions have different lengths or if Q is zero where P is non-zero.
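A minimal usage sketch, assuming the module is imported as `viva_math/entropy`. It exercises the asymmetry and the two error cases described above. The numeric results depend on the log base, which is not stated here, so they are printed rather than asserted.

```gleam
import gleam/io
import gleam/string
import viva_math/entropy

pub fn main() {
  let p = [0.5, 0.5]
  let q = [0.9, 0.1]

  // KL divergence is not symmetric, so these two calls generally differ.
  io.println(string.inspect(entropy.kl_divergence(p, q)))
  io.println(string.inspect(entropy.kl_divergence(q, p)))

  // Documented error case: the lists have different lengths.
  io.println(string.inspect(entropy.kl_divergence(p, [0.2, 0.3, 0.5])))

  // Documented error case: Q is zero where P is non-zero.
  io.println(string.inspect(entropy.kl_divergence(p, [1.0, 0.0])))
}
```

Because the return type is `Result(Float, Nil)`, callers need to handle the `Error(Nil)` branch before using the value.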
pub fn mutual_information(
px: List(Float),
py: List(Float),
pxy: List(List(Float)),
) -> Float
Mutual Information: I(X; Y) = H(X) + H(Y) - H(X, Y)
Measures shared information between two variables. Takes marginal distributions and joint distribution as input.
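A minimal usage sketch, assuming the module is imported as `viva_math/entropy`. Both joint tables are symmetric, so the sketch does not depend on whether rows of `pxy` index X or Y, which is not specified here.

```gleam
import gleam/float
import gleam/io
import viva_math/entropy

pub fn main() {
  let px = [0.5, 0.5]
  let py = [0.5, 0.5]

  // Independent variables: every joint cell equals p(x) * p(y),
  // so I(X; Y) is 0 mathematically (up to floating-point error).
  let independent = [[0.25, 0.25], [0.25, 0.25]]
  let mi_independent = entropy.mutual_information(px, py, independent)
  io.println(float.to_string(mi_independent))

  // Perfectly coupled variables (X = Y): I(X; Y) = H(X), which is
  // 1 bit if the function uses log₂ (an assumption).
  let coupled = [[0.5, 0.0], [0.0, 0.5]]
  let mi_coupled = entropy.mutual_information(px, py, coupled)
  io.println(float.to_string(mi_coupled))
}
```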
pub fn relative_entropy_rate(
observed: List(Float),
expected: List(Float),
) -> Result(Float, Nil)
Relative entropy rate for sequences.
Used for measuring “surprise” in temporal data.
pub fn shannon(probabilities: List(Float)) -> Float
Shannon entropy: H(X) = -Σ p(x) log₂ p(x)
Measures uncertainty/information content of a distribution. Higher entropy = more uncertainty.
Examples
shannon([0.5, 0.5]) // -> 1.0 (maximum for 2 outcomes)
shannon([1.0, 0.0]) // -> 0.0 (no uncertainty)
shannon([0.25, 0.25, 0.25, 0.25]) // -> 2.0
pub fn shannon_normalized(probabilities: List(Float)) -> Float
Normalized Shannon entropy (0 to 1 range).
Divides the Shannon entropy by its maximum possible value, log₂(n), where n is the number of outcomes.
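A short usage sketch, assuming the module is imported as `viva_math/entropy`. The values in the comments are worked by hand from the formula above and are approximate. They are not captured output.

```gleam
import gleam/float
import gleam/io
import viva_math/entropy

pub fn main() {
  // Uniform over 4 outcomes: H = 2 bits and log₂(4) = 2, so the ratio is 1.
  let uniform = entropy.shannon_normalized([0.25, 0.25, 0.25, 0.25])
  io.println(float.to_string(uniform))

  // Skewed over 2 outcomes: H ≈ 0.469 bits and log₂(2) = 1,
  // so the ratio is ≈ 0.469.
  let skewed = entropy.shannon_normalized([0.9, 0.1])
  io.println(float.to_string(skewed))
}
```

Normalizing makes entropies comparable across distributions with different numbers of outcomes.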
pub fn symmetric_kl(
p: List(Float),
q: List(Float),
) -> Result(Float, Nil)
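Symmetric KL divergence, also known as the Jeffreys divergence. Unlike the Jensen-Shannon divergence, it compares P and Q directly rather than through the mixture M, and it is not bounded.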
D_sym(P, Q) = D_KL(P || Q) + D_KL(Q || P)
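A worked example with the illustrative distributions P = (0.5, 0.5) and Q = (0.25, 0.75). It assumes base-2 logarithms, since the base used by this function is not stated.

```latex
\begin{align*}
D_{KL}(P \,\|\, Q) &= \tfrac12 \log_2 \frac{0.5}{0.25} + \tfrac12 \log_2 \frac{0.5}{0.75}
                    \approx 0.5 - 0.2925 = 0.2075, \\
D_{KL}(Q \,\|\, P) &= \tfrac14 \log_2 \frac{0.25}{0.5} + \tfrac34 \log_2 \frac{0.75}{0.5}
                    \approx -0.25 + 0.4387 = 0.1887, \\
D_{sym}(P, Q)      &\approx 0.2075 + 0.1887 = 0.3962.
\end{align*}
```

The two directed divergences differ, but their sum is the same whichever distribution is passed first.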