tft.bag_of_words
Computes a bag of "words" based on the specified ngram configuration.
tft.bag_of_words(
    tokens: tf.SparseTensor,
    ngram_range: Tuple[int, int],
    separator: str,
    name: Optional[str] = None
) -> tf.SparseTensor
A light wrapper around tft.ngrams. First computes ngrams, then transforms the
ngram representation (list semantics) into a Bag of Words (set semantics) per
row. Each row reflects the set of unique ngrams present in an input record.
See tft.ngrams for more information.
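The difference between list semantics and set semantics can be illustrated with a small pure-Python sketch. This is not the library's implementation: the helper names `ngrams_per_row` and `bag_of_words_per_row` are hypothetical, rows are plain Python lists rather than a SparseTensor, and the sorted output is only a convenience to make the sketch deterministic.

```python
from typing import List, Tuple


def ngrams_per_row(tokens: List[str], ngram_range: Tuple[int, int], separator: str) -> List[str]:
    """List semantics: every ngram occurrence is kept, duplicates included."""
    smallest, largest = ngram_range
    out = []
    for start in range(len(tokens)):
        # Both ends of ngram_range are inclusive.
        for size in range(smallest, largest + 1):
            if start + size <= len(tokens):
                out.append(separator.join(tokens[start:start + size]))
    return out


def bag_of_words_per_row(tokens: List[str], ngram_range: Tuple[int, int], separator: str) -> List[str]:
    """Set semantics: each distinct ngram appears at most once per row."""
    # Sorting is an assumption of this sketch only; the real op makes no ordering promise.
    return sorted(set(ngrams_per_row(tokens, ngram_range, separator)))


rows = [["a", "b", "a", "b"], ["c"]]
as_lists = [ngrams_per_row(r, (1, 2), " ") for r in rows]
as_bags = [bag_of_words_per_row(r, (1, 2), " ") for r in rows]

print(as_lists[0])  # ['a', 'a b', 'b', 'b a', 'a', 'a b', 'b']
print(as_bags[0])   # ['a', 'a b', 'b', 'b a']
```

In the first row the unigram "a" and the bigram "a b" each occur twice in the ngram list but only once in the bag.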
Args
tokens: A two-dimensional SparseTensor of dtype tf.string containing tokens that will be used to construct a bag of words.
ngram_range: A pair with the range (inclusive) of ngram sizes to compute.
separator: A string that will be inserted between tokens when ngrams are constructed.
name: (Optional) A name for this operation.
Returns
A SparseTensor containing the unique set of ngrams from each row of the input. Note: the original order of the ngrams may not be preserved.
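To picture what a sparse result with one bag per row looks like, the sketch below lays out per-row bags in the coordinate form a tf.SparseTensor uses (indices, values, dense_shape). It is plain Python for illustration only. The helper `to_coo` is hypothetical, and the assumption that the second dimension equals the largest bag size is a simplification, not a statement about the op's exact output shape.

```python
from typing import List, Tuple


def to_coo(bags: List[List[str]]) -> Tuple[List[Tuple[int, int]], List[str], Tuple[int, int]]:
    """Flatten per-row bags into (indices, values, dense_shape) triples."""
    indices, values = [], []
    for row, bag in enumerate(bags):
        for col, ngram in enumerate(bag):
            indices.append((row, col))  # position of this ngram in the 2-D layout
            values.append(ngram)
    # Assumed for this sketch: the width is the size of the largest bag.
    dense_shape = (len(bags), max((len(b) for b in bags), default=0))
    return indices, values, dense_shape


bags = [["a", "a b", "b"], ["c"]]
indices, values, dense_shape = to_coo(bags)

print(indices)      # [(0, 0), (0, 1), (0, 2), (1, 0)]
print(values)       # ['a', 'a b', 'b', 'c']
print(dense_shape)  # (2, 3)
```

Rows with fewer unique ngrams simply have fewer entries; no padding values are stored.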
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-04-26 UTC.