Computes a bag of "words" based on the specified ngram configuration.
tft.bag_of_words( tokens, ngram_range, separator, name=None )
A light wrapper around tft.ngrams. First computes ngrams, then transforms the ngram representation (list semantics) into a Bag of Words (set semantics) per row. Each row reflects the set of unique ngrams present in an input record.
See tft.ngrams for more information.
|ngram_range|A pair with the range (inclusive) of ngram sizes to compute.|
|separator|A string that will be inserted between tokens when ngrams are constructed.|
|name|(Optional) A name for this operation.|
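The difference between list semantics and set semantics described above can be illustrated with a small pure-Python sketch. This is not the library's implementation and does not call tft; the helpers `ngrams_sketch` and `bag_of_words_sketch` are hypothetical names introduced here, rows are plain Python lists rather than tensors, and the sorted output order is a choice made for this sketch only.

```python
def ngrams_sketch(tokens, ngram_range, separator):
    """List semantics: one entry per ngram occurrence, duplicates kept."""
    min_n, max_n = ngram_range  # both ends are inclusive
    out = []
    for n in range(min_n, max_n + 1):
        for start in range(len(tokens) - n + 1):
            out.append(separator.join(tokens[start:start + n]))
    return out


def bag_of_words_sketch(rows, ngram_range, separator):
    """Set semantics: the unique ngrams of each row (sorted for readability)."""
    return [sorted(set(ngrams_sketch(row, ngram_range, separator)))
            for row in rows]


rows = [["the", "cat", "the", "cat"], ["a", "b"]]
bags = bag_of_words_sketch(rows, ngram_range=(1, 2), separator=" ")
# bags[0] == ["cat", "cat the", "the", "the cat"]
# bags[1] == ["a", "a b", "b"]
```

The first row yields seven ngrams under list semantics but only four distinct entries in its bag, which is the per-row deduplication the description refers to.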