
Calculates the partial expected mutual information (EMI) of two variables.

EMI reflects the MI expected by chance, and is used to compute adjusted mutual information. See

The EMI for two variables x and y is the sum of the expected mutual information over every pairing of a value of x with a value of y. This function computes the EMI contribution of a single pair of values (x_i, y_j), so it is a partial EMI calculation.

Specifically:

EMI(x_i, y_j) = sum over n_ij from max(0, x_i + y_j - n) to min(x_i, y_j) of
    (n_ij / n) * log2((n * n_ij) / (x_i * y_j))
    * (x_i! * y_j! * (n - x_i)! * (n - y_j)!)
    / (n! * n_ij! * (x_i - n_ij)! * (y_j - n_ij)! * (n - x_i - y_j + n_ij)!)

where n_ij is the joint count of x taking on value i and y taking on value j, x_i is the count of x taking on value i, y_j is the count of y taking on value j, and n is the total count.
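The formula above can be sketched in Python as follows. This is a minimal illustration, not the library's implementation: the names `partial_emi` and `emi` are hypothetical, the inputs are assumed to be non-negative integer counts (not fractional weights) with x_i <= n and y_j <= n, and the factorials are evaluated in log space with `math.lgamma` to avoid overflow. The n_ij = 0 term is skipped because it contributes 0 by the usual convention that 0 * log(0) = 0.

```python
import math


def partial_emi(n: int, x_i: int, y_j: int) -> float:
    """Partial EMI for one (x_i, y_j) pair (hypothetical helper, integer counts assumed)."""
    if n <= 0 or x_i <= 0 or y_j <= 0:
        return 0.0
    # Log of the factorial terms that do not depend on n_ij:
    # x_i! * y_j! * (n - x_i)! * (n - y_j)! / n!
    log_fixed = (
        math.lgamma(x_i + 1)
        + math.lgamma(y_j + 1)
        + math.lgamma(n - x_i + 1)
        + math.lgamma(n - y_j + 1)
        - math.lgamma(n + 1)
    )
    total = 0.0
    # Start no lower than 1: the n_ij = 0 term is treated as 0.
    lower = max(1, x_i + y_j - n)
    upper = min(x_i, y_j)
    for n_ij in range(lower, upper + 1):
        # Log of the n_ij-dependent denominator terms.
        log_varying = (
            math.lgamma(n_ij + 1)
            + math.lgamma(x_i - n_ij + 1)
            + math.lgamma(y_j - n_ij + 1)
            + math.lgamma(n - x_i - y_j + n_ij + 1)
        )
        probability = math.exp(log_fixed - log_varying)
        mi_term = (n_ij / n) * math.log2(n * n_ij / (x_i * y_j))
        total += mi_term * probability
    return total


def emi(x_counts, y_counts) -> float:
    """Full EMI: the sum of partial EMI over every (x_i, y_j) pair (hypothetical helper)."""
    n = sum(x_counts)
    return sum(partial_emi(n, x_i, y_j) for x_i in x_counts for y_j in y_counts)
```

For example, `partial_emi(4, 2, 2)` has only one non-zero term (n_ij = 2, with probability 1/6 and MI term 0.5), giving 1/12. Summing the partial values over all pairs, as `emi` does, yields the full EMI described above.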

n: The sum of weights for all values.
x_i: The sum of weights for the first variable taking on value i.
y_j: The sum of weights for the second variable taking on value j.

The calculated partial expected mutual information for (x_i, y_j).