# tft.CovarianceCombiner

## Class `CovarianceCombiner`

Combines the PCollection to compute the biased covariance matrix.

## `__init__`

``````
__init__(
    numpy_dtype=np.float64,
    output_shape=None
)
``````

Stores the NumPy dtype used for arrays and matrices, which determines the precision of the computation.

## Methods

### `add_input`

``````
add_input(
    accumulator,
    batch_values
)
``````

Compute sum of input cross-terms, sum of inputs, and count.

The cross terms for a numeric 1d array `x` are given by the set {z_ij = x_i * x_j for all indices i and j}, which is stored as a 2d array. Here `next_input` denotes the 2d array held in `batch_values`. Because `next_input` is an array of 1d numeric arrays (i.e. a 2d array), `matmul(transpose(next_input), next_input)` sums the cross terms of every 1d array in `next_input` in a single operation.
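The identity described above can be checked with a short NumPy sketch. The `batch` array and its values are made up for illustration and are not part of the library; the snippet only shows that one matrix product equals the explicit sum of per-row cross terms.

```python
import numpy as np

# Hypothetical batch: three 1d input vectors stacked as rows of a 2d array.
batch = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])

# Cross terms of each row (z_ij = x_i * x_j), summed explicitly over rows.
explicit_sum = sum(np.outer(row, row) for row in batch)

# The same quantity from a single matrix product.
matmul_sum = np.matmul(np.transpose(batch), batch)

assert np.allclose(explicit_sum, matmul_sum)
```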

#### Args:

• `accumulator`: running sum of cross terms, input vectors, and count
• `batch_values`: entries from the pipeline, which must be a single-element list containing a 2d array that represents multiple 1d arrays

#### Returns:

An accumulator whose running `sum_product`, `sum_vectors`, and count of input rows include the new batch.

### `create_accumulator`

``````
create_accumulator()
``````

Create an accumulator with all zero entries.

### `extract_output`

``````
extract_output(accumulator)
``````

Run covariance logic on sum_product, sum of input vectors, and count.

The formula used to compute the covariance is cov(x) = E(xx^T) - uu^T, where x is the original input to the combiner and u = mean(x). E(xx^T) is computed by dividing the sum of cross terms (accumulator index 0) by the count (index 2). u is computed by dividing the sum of rows (index 1) by the count (index 2). Because both terms are normalized by the count, the result is the biased covariance.
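As a rough illustration of this formula, the sketch below computes the covariance from an accumulator laid out as `[sum_product, sum_vectors, count]`. The helper name `covariance_from_accumulator` and the sample data are hypothetical, and the code is not taken from the library; it only mirrors the arithmetic described above and compares it with NumPy's biased estimator.

```python
import numpy as np

def covariance_from_accumulator(accumulator):
    """Hypothetical helper: biased covariance from [sum_product, sum_vectors, count]."""
    sum_product, sum_vectors, count = accumulator
    expected_cross_terms = sum_product / count  # E(xx^T)
    mean = sum_vectors / count                  # u
    return expected_cross_terms - np.outer(mean, mean)

# Made-up batch: each row is one input vector.
batch = np.array([[1.0, 2.0],
                  [3.0, 5.0],
                  [5.0, 6.0]])
accumulator = [batch.T @ batch, batch.sum(axis=0), batch.shape[0]]
cov = covariance_from_accumulator(accumulator)

# Agrees with NumPy's biased estimator (normalized by N rather than N - 1).
assert np.allclose(cov, np.cov(batch, rowvar=False, bias=True))
```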

#### Args:

• `accumulator`: final accumulator as a list of the sum of cross-terms matrix, sum of input vectors, and count.

#### Returns:

A list containing a single 2d ndarray, the covariance matrix.

### `merge_accumulators`

``````
merge_accumulators(accumulators)
``````

Sums values in each accumulator entry.
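Entry-wise summation works because every accumulator entry is itself a sum over rows. The following sketch, which assumes the `[sum_product, sum_vectors, count]` layout and uses hypothetical helpers `make_accumulator` and `merge`, shows that merging per-batch accumulators gives the same result as accumulating all rows at once. It is an illustration, not the library's implementation.

```python
import numpy as np

def make_accumulator(batch):
    """Hypothetical helper: [sum of cross terms, sum of rows, row count] for one batch."""
    return [batch.T @ batch, batch.sum(axis=0), batch.shape[0]]

def merge(accumulators):
    """Entry-wise sum over accumulators (a sketch, not the library's code)."""
    sum_products, sum_vectors, counts = zip(*accumulators)
    return [sum(sum_products), sum(sum_vectors), sum(counts)]

# Made-up batches, as if processed by two different workers.
first = np.array([[1.0, 2.0], [3.0, 4.0]])
second = np.array([[5.0, 6.0]])

merged = merge([make_accumulator(first), make_accumulator(second)])
whole = make_accumulator(np.vstack([first, second]))

# Merging partial accumulators matches accumulating every row together.
assert all(np.allclose(m, w) for m, w in zip(merged, whole))
```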

### `output_tensor_infos`

``````
output_tensor_infos()
``````