tfio.genome.sequences_to_onehot

Convert DNA sequences into a one-hot nucleotide encoding.

tfio.genome.sequences_to_onehot(sequences)

Each nucleotide in each sequence is mapped as follows:

  • A -> [1, 0, 0, 0]
  • C -> [0, 1, 0, 0]
  • G -> [0, 0, 1, 0]
  • T -> [0, 0, 0, 1]

Any character in the string other than A, C, G, or T is currently mapped to the error one-hot encoding [1, 1, 1, 1].
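
To make the mapping concrete, here is a minimal plain-Python sketch that mirrors the rules described above. It is an illustration only, not the tfio implementation: the names `ONEHOT`, `ERROR_ENCODING`, `encode_sequence`, and `encode_sequences` are hypothetical, it works on Python strings and lists rather than tensors, and it assumes only uppercase letters count as valid nucleotides (how the library treats lowercase input is not stated here).

```python
# Illustrative sketch of the documented mapping, in plain Python.
# NOT the tfio implementation; all names below are hypothetical.

# One-hot vectors for the four nucleotides, in A, C, G, T order.
ONEHOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}

# Encoding used for any character outside A, C, G, T.
ERROR_ENCODING = [1, 1, 1, 1]


def encode_sequence(sequence):
    """Return one 4-element vector per character of `sequence`."""
    # Assumption: matching is case-sensitive, so lowercase letters
    # fall through to the error encoding in this sketch.
    return [list(ONEHOT.get(base, ERROR_ENCODING)) for base in sequence]


def encode_sequences(sequences):
    """Encode a list of strings; the rows may differ in length."""
    return [encode_sequence(s) for s in sequences]


encoded = encode_sequences(["ACGT", "AN"])
print(encoded[0])  # [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(encoded[1])  # [[1, 0, 0, 0], [1, 1, 1, 1]]
```

Note that sequences of different lengths produce rows of different lengths, and that the unrecognized character `N` in the second sequence receives the error encoding.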

Args:

  • sequences: A tf.string tensor in which each string represents a DNA sequence.

Returns: