
tft.ngrams

Create a SparseTensor of n-grams.

Given a SparseTensor of tokens, returns a SparseTensor containing the ngrams that can be constructed from each row.

separator is inserted between each pair of tokens, so " " would be an appropriate choice if the tokens are words, while "" would be an appropriate choice if they are characters.
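The effect of the separator can be sketched in plain Python, without TensorFlow. The helper `row_ngrams` below is hypothetical and written only for illustration; it is not part of `tft` and makes no claim about how the library is implemented. It builds the ngrams of a single row, ordered by start position and then by length.

```python
def row_ngrams(tokens, ngram_range, separator):
    """Illustrative helper (not part of tft): ngrams of one row of tokens."""
    lo, hi = ngram_range  # inclusive bounds on the ngram size
    out = []
    for start in range(len(tokens)):
        for n in range(lo, hi + 1):
            # Keep only ngrams that fit entirely inside the row.
            if start + n <= len(tokens):
                out.append(separator.join(tokens[start:start + n]))
    return out


# Word tokens: a space keeps the words readable.
word_grams = row_ngrams(["a", "rat"], (1, 2), " ")
# Character tokens: an empty separator glues the characters back together.
char_grams = row_ngrams(list("rat"), (2, 3), "")

print(word_grams)  # ['a', 'a rat', 'rat']
print(char_grams)  # ['ra', 'rat', 'at']
```

With word tokens and an empty separator, the same call would produce fused strings such as 'arat', which is why the separator should match the kind of token.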

Example:

import tensorflow as tf
import tensorflow_transform as tft

tokens = tf.SparseTensor(
    indices=[[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], [1, 3]],
    values=['One', 'was', 'Johnny', 'Two', 'was', 'a', 'rat'],
    dense_shape=[2, 4])
print(tft.ngrams(tokens, ngram_range=(1, 3), separator=' '))
SparseTensor(indices=tf.Tensor(
[[0 0] [0 1] [0 2] [0 3] [0 4] [0 5]
[1 0] [1 1] [1 2] [1 3] [1 4] [1 5] [1 6] [1 7] [1 8]],
shape=(15, 2), dtype=int64),
values=tf.Tensor(
[b'One' b'One was' b'One was Johnny' b'was' b'was Johnny' b'Johnny' b'Two'
b'Two was' b'Two was a' b'was' b'was a' b'was a rat' b'a' b'a rat'
b'rat'], shape=(15,), dtype=string),
dense_shape=tf.Tensor([2 9], shape=(2,), dtype=int64))
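
The sizes in the output above can be checked by hand. A row with L tokens contributes L - n + 1 ngrams of size n (when that is positive), summed over the sizes in ngram_range. The sketch below is plain Python with a hypothetical helper, `ngram_count`, and assumes the second dimension of dense_shape is the largest per-row count, which is consistent with the [2 9] shown.

```python
def ngram_count(row_length, ngram_range):
    """Illustrative helper: number of ngrams one row of tokens yields."""
    lo, hi = ngram_range  # inclusive bounds on the ngram size
    return sum(max(0, row_length - n + 1) for n in range(lo, hi + 1))


row_lengths = [3, 4]  # tokens per row in the example above
counts = [ngram_count(length, (1, 3)) for length in row_lengths]
# Assumption: the dense width is the largest per-row count.
dense_shape = [len(row_lengths), max(counts)]

print(counts)       # [6, 9]
print(sum(counts))  # 15
print(dense_shape)  # [2, 9]
```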

Args:

tokens: A two-dimensional SparseTensor of dtype tf.string containing the tokens that will be used to construct ngrams.
ngram_range: A pair with the range (inclusive) of ngram sizes to return.
separator: A string that will be inserted between tokens when ngrams are constructed.
name: (Optional) A name for this operation.

Returns:

A SparseTensor containing all ngrams from each row of the input. Note: if an ngram appears multiple times in the input row, it will be present the same number of times in the output. For unique ngrams, see tft.bag_of_words.
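
The note about repeated ngrams can be illustrated with a small plain-Python sketch. It does not call `tft`; the deduplication step only shows what "unique" means here and makes no claim about the ordering or implementation of tft.bag_of_words.

```python
from collections import Counter

tokens = ["was", "a", "was", "a"]
# All bigrams of the row, in order of appearance; repeats are kept.
bigrams = [" ".join(tokens[i:i + 2]) for i in range(len(tokens) - 1)]
counts = Counter(bigrams)
# Order-preserving deduplication, for illustration only.
unique_bigrams = list(dict.fromkeys(bigrams))

print(bigrams)          # ['was a', 'a was', 'was a']
print(counts["was a"])  # 2
print(unique_bigrams)   # ['was a', 'a was']
```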

Raises:

ValueError: if tokens is not 2D.
ValueError: if ngram_range[0] < 1 or ngram_range[1] < ngram_range[0].
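
As a rough sketch, the range check amounts to requiring a lower bound of at least 1 and an upper bound no smaller than the lower bound. The function `check_ngram_range` below is hypothetical, written in plain Python for illustration; the error message and structure are assumptions, not the library's code.

```python
def check_ngram_range(ngram_range):
    """Illustrative validation (not tft's code): reject invalid size ranges."""
    lo, hi = ngram_range
    if lo < 1 or hi < lo:
        raise ValueError(f"Invalid ngram_range: {ngram_range}")


check_ngram_range((1, 3))  # valid: passes silently

rejected = []
for bad in [(0, 2), (3, 2)]:
    try:
        check_ngram_range(bad)
    except ValueError:
        rejected.append(bad)

print(rejected)  # [(0, 2), (3, 2)]
```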
