tfio.genome.sequences_to_onehot

Convert DNA sequences into a one-hot nucleotide encoding.

Each nucleotide in each sequence is mapped as follows:

A -> [1, 0, 0, 0]
C -> [0, 1, 0, 0]
G -> [0, 0, 1, 0]
T -> [0, 0, 0, 1]

Any character in the string other than A, C, G, or T is currently mapped to the error one-hot encoding [1, 1, 1, 1].
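
The mapping described above can be illustrated with a small plain-Python sketch. This is not the library's implementation: the names `ONEHOT`, `ERROR`, and `sequences_to_onehot_reference` are hypothetical, and the sketch returns nested lists rather than a tensor. It matches uppercase letters only, which is an assumption; this page does not say how lowercase input is handled.

```python
# Hypothetical reference sketch of the documented mapping (not the tfio code).
ONEHOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}

# Encoding used for any character outside A, C, G, T.
ERROR = [1, 1, 1, 1]


def sequences_to_onehot_reference(sequences):
    """Encode each string as a list of 4-element one-hot rows."""
    return [
        [list(ONEHOT.get(base, ERROR)) for base in seq]
        for seq in sequences
    ]


# Two sequences of different lengths; "N" is not a recognized nucleotide.
encoded = sequences_to_onehot_reference(["ACGT", "AN"])
for seq, rows in zip(["ACGT", "AN"], encoded):
    print(seq, rows)
```

Because the input sequences may differ in length, the rows of the result differ in length as well, which is why a ragged container suits the output.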

Args:
sequences: A tf.string tensor in which each string represents a DNA sequence.

Returns:
A tf.RaggedTensor containing the output sequences with nucleotides one-hot encoded.