tf.random.learned_unigram_candidate_sampler

TensorFlow 2 version View source on GitHub

Samples a set of classes from a distribution learned during training.

This operation randomly samples a tensor of sampled classes (sampled_candidates) from the range of integers [0, range_max).

The elements of sampled_candidates are drawn without replacement (if unique=True) or with replacement (if unique=False) from the base distribution.

The base distribution for this operation is constructed on the fly during training. It is a unigram distribution over the target classes seen so far during training. Every integer in [0, range_max) begins with a weight of 1, and is incremented by 1 each time it is seen as a target class. The base distribution is not saved to checkpoints, so it is reset when the model is reloaded.

In addition, this operation returns tensors true_expected_count and sampled_expected_count representing the number of times each of the target classes (true_classes) and the sampled classes (sampled_candidates) is expected to occur in an average tensor of sampled classes. These values correspond to Q(y|x) defined in this document. If unique=True, then these are post-rejection probabilities and we compute them approximately.

true_classes A Tensor of type int64 and shape [batch_size, num_true]. The target classes.
num_true An int. The number of target classes per training example.
num_sampled An int. The number of classes to randomly sample.
unique A bool. Determines whether all sampled classes in a batch are unique.
range_max An int. The number of possible classes.
seed An int. An operation-specific seed. Default is 0.
name A name for the operation (optional).

sampled_candidates A tensor of type int64 and shape [num_sampled]. The sampled classes.
true_expected_count A tensor of type float. Same shape as true_classes. The expected counts under the sampling distribution of each of true_classes.
sampled_expected_count A tensor of type float. Same shape as sampled_candidates. The expected counts under the sampling distribution of each of sampled_candidates.