tfio.genome.phred_sequences_to_probability

Converts raw phred quality scores into base-calling error probabilities.

For each ASCII-encoded phred quality character, with ASCII code X, the probability P that the base was called incorrectly is computed as:

P = 10 ^ (-(X - 33) / 10)

This assumes an "ASCII base" (offset) of 33, i.e. the Phred+33 encoding.
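
The formula can be sketched in plain Python for a single quality character. This is only an illustration of the arithmetic, not the library's implementation; the helper name `phred_char_to_probability` and its `ascii_base` parameter are hypothetical.

```python
def phred_char_to_probability(ch, ascii_base=33):
    """Error probability for one ASCII-encoded phred quality character.

    Hypothetical helper: mirrors P = 10 ^ (-(X - 33) / 10), where X = ord(ch).
    """
    quality = ord(ch) - ascii_base  # phred score Q = X - 33
    return 10 ** (-quality / 10)    # P = 10^(-Q / 10)


print(phred_char_to_probability("!"))  # X = 33, Q = 0:  every call is an error
print(phred_char_to_probability("+"))  # X = 43, Q = 10: about 1 error in 10
print(phred_char_to_probability("B"))  # X = 66, Q = 33: about 5 errors in 10,000
```

Each increase of 10 in the phred score lowers the error probability by a factor of 10.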

The input is a tf.string tensor of ASCII-encoded phred qualities, one string per DNA sequence, with each character representing the quality of a nucleotide.

For example:

phred_qualities = [["BB<"], ["BBBB"]]
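
To see what values this input corresponds to, here is a pure-Python sketch that applies the formula to every character of each string. It uses nested lists to stand in for the ragged result (rows of different lengths); the function name `sequences_to_probabilities` is hypothetical, and the exact shape and dtype returned by the real op are not asserted here.

```python
phred_qualities = [["BB<"], ["BBBB"]]


def sequences_to_probabilities(sequences, ascii_base=33):
    """Hypothetical stand-in: one list of error probabilities per sequence."""
    return [
        [10 ** (-(ord(ch) - ascii_base) / 10) for ch in seq]
        for (seq,) in sequences  # each inner list holds a single quality string
    ]


probabilities = sequences_to_probabilities(phred_qualities)
for row in probabilities:
    print([round(p, 6) for p in row])
# [0.000501, 0.000501, 0.001995]
# [0.000501, 0.000501, 0.000501, 0.000501]
```

The rows have lengths 3 and 4, matching the lengths of the two quality strings, which is why the result is ragged rather than rectangular.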

Args

phred_qualities: A tf.string tensor where each string represents the phred quality of a DNA sequence. Each character in the string is the ASCII representation of the phred quality number.

Returns

tf.RaggedTensor: The base-calling error probability for each base in each sequence provided.