tf.contrib.lookup.IdTableWithHashBuckets

class tf.contrib.lookup.IdTableWithHashBuckets

String to Id table wrapper that assigns out-of-vocabulary keys to buckets.

For example, if an instance of IdTableWithHashBuckets is initialized with a string-to-id table that maps:

  • emerson -> 0
  • lake -> 1
  • palmer -> 2

The IdTableWithHashBuckets object will perform the following mapping:

  • emerson -> 0
  • lake -> 1
  • palmer -> 2
  • <other term> -> bucket id between 3 and 3 + num_oov_buckets - 1, calculated by: hash(<term>) % num_oov_buckets + vocab_size

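If input_tensor is ["emerson", "lake", "palmer", "king", "crimson"], the lookup result is, for example, [0, 1, 2, 4, 7]. The first three ids come from the table; the last two are hash bucket ids, and an id of 7 implies that num_oov_buckets is at least 5.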

If table is None, only out-of-vocabulary buckets are used.
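
The mapping described above can be sketched in plain Python, without TensorFlow. This is only an illustration under stated assumptions: the function name lookup_with_oov_buckets is hypothetical, a dict stands in for the wrapped table, and CRC32 stands in for the hash function, which in the real class is chosen by hasher_spec. The bucket ids it produces will therefore not match the ones TensorFlow returns.

```python
import zlib

def lookup_with_oov_buckets(keys, vocab, num_oov_buckets):
    """Map each key to its vocab id, or to a hash bucket placed after the vocab."""
    vocab = vocab or {}  # vocab=None mimics table=None: every key is hashed
    vocab_size = len(vocab)
    ids = []
    for key in keys:
        if key in vocab:
            ids.append(vocab[key])
        else:
            # Stand-in hash (CRC32), an assumption; not the library's hash.
            bucket = zlib.crc32(key.encode("utf-8")) % num_oov_buckets
            ids.append(bucket + vocab_size)
    return ids

vocab = {"emerson": 0, "lake": 1, "palmer": 2}
keys = ["emerson", "lake", "palmer", "king", "crimson"]

# In-vocabulary keys keep their ids; the others land in [3, 3 + 5 - 1].
ids = lookup_with_oov_buckets(keys, vocab, 5)

# With no vocabulary, only the buckets [0, 5 - 1] are used.
oov_only = lookup_with_oov_buckets(["king", "crimson"], None, 5)
```

Adding vocab_size to the bucket index keeps the bucket ids from colliding with the ids assigned by the table.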

Example usage:

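import tensorflow as tf

num_oov_buckets = 3
input_tensor = tf.constant(["emerson", "lake", "palmer", "king", "crimson"])
# filename (path to the vocabulary file) and default_value (id the wrapped
# table returns for missing keys) must be defined by the caller.
table = tf.contrib.lookup.IdTableWithHashBuckets(
    tf.contrib.lookup.HashTable(
        tf.contrib.lookup.TextFileIdTableInitializer(filename), default_value),
    num_oov_buckets)
out = table.lookup(input_tensor)
with tf.Session():  # init.run() and eval() need a default session
    table.init.run()
    print(out.eval())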

The hash function used to generate out-of-vocabulary bucket ids is specified by hasher_spec.

Properties

init

The table initialization op.

key_dtype

The table key dtype.

name

The name of the table.

value_dtype

The table value dtype.

Methods

__init__(table, num_oov_buckets, hasher_spec=tf.contrib.lookup.FastHashSpec, name=None)

Construct an IdTableWithHashBuckets object.

Args:

  • table: Table that maps strings to ids.
  • num_oov_buckets: Number of buckets to use for out-of-vocabulary keys.
  • hasher_spec: A HasherSpec that specifies the hash function used to assign out-of-vocabulary keys to buckets (optional).
  • name: A name for the operation (optional).

Raises:

  • ValueError: when table is None and num_oov_buckets is not positive.
  • TypeError: when hasher_spec is invalid.
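
The two error conditions listed above can be illustrated with a small plain-Python sketch. Everything here is an assumption made for illustration: validate_table_args and raised are hypothetical helpers, the HasherSpec namedtuple is a stand-in for the library's type, and "invalid" is taken to mean "not a HasherSpec instance". The library's actual checks and messages may differ.

```python
import collections

# Hypothetical stand-in for the library's HasherSpec type.
HasherSpec = collections.namedtuple("HasherSpec", ["hasher", "key"])

def validate_table_args(table, num_oov_buckets, hasher_spec):
    """Raise the documented errors for bad constructor arguments."""
    # Without a table, the buckets are the only place keys can go.
    if table is None and num_oov_buckets <= 0:
        raise ValueError("num_oov_buckets must be positive when table is None.")
    if not isinstance(hasher_spec, HasherSpec):
        raise TypeError("hasher_spec must be a HasherSpec.")

def raised(exc_type, *args):
    """Return True if validate_table_args(*args) raises exc_type."""
    try:
        validate_table_args(*args)
    except exc_type:
        return True
    return False

spec = HasherSpec("fasthash", None)
no_table_no_buckets = raised(ValueError, None, 0, spec)   # True
bad_spec = raised(TypeError, {"emerson": 0}, 3, "fasthash")  # True
table_without_buckets = raised(Exception, {"emerson": 0}, 0, spec)  # False
```

Note that zero buckets is acceptable in this sketch as long as a table is supplied, which matches the wording of the ValueError condition.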

check_table_dtypes(key_dtype, value_dtype)

Check that the given key_dtype and value_dtype match the table dtypes.

Args:

  • key_dtype: The key data type to check.
  • value_dtype: The value data type to check.

Raises:

  • TypeError: when 'key_dtype' or 'value_dtype' doesn't match the table data types.

lookup(keys, name=None)

Looks up keys in the table, outputs the corresponding values.

It assigns out-of-vocabulary keys to buckets based on their hashes.

Args:

  • keys: Keys to look up. May be either a SparseTensor or dense Tensor.
  • name: Optional name for the op.

Returns:

A SparseTensor if keys are sparse, otherwise a dense Tensor.

Raises:

  • TypeError: when keys do not match the table key data type.
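
The dense and sparse return behavior can be pictured with the following plain-Python sketch. It assumes a simplified model: the Sparse namedtuple is a hypothetical container loosely modeled on a SparseTensor, the function lookup here is a toy and not the library method, and the toy mapping sends every out-of-vocabulary key to a single id instead of hashing it.

```python
import collections

# Hypothetical minimal sparse container, loosely modeled on a SparseTensor.
Sparse = collections.namedtuple("Sparse", ["indices", "values", "dense_shape"])

def lookup(keys, key_to_id):
    """Apply key_to_id to dense (list) or sparse keys, preserving the container."""
    if isinstance(keys, Sparse):
        # Only the stored values are translated; positions and shape are kept.
        new_values = [key_to_id(k) for k in keys.values]
        return Sparse(keys.indices, new_values, keys.dense_shape)
    return [key_to_id(k) for k in keys]

vocab = {"emerson": 0, "lake": 1, "palmer": 2}

def to_id(key):
    # Toy mapping: one out-of-vocabulary id (3) instead of hashed buckets.
    return vocab.get(key, 3)

dense_out = lookup(["lake", "king"], to_id)
sparse_in = Sparse([(0, 0), (1, 2)], ["palmer", "king"], (2, 3))
sparse_out = lookup(sparse_in, to_id)
```

In this model the output always has the same layout as the input, so callers can pass either form without converting.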

size(name=None)

Compute the number of elements in this table.

Defined in tensorflow/contrib/lookup/lookup_ops.py.