tfm.vision.layers.ROISampler

Samples ROIs and assigns targets to the sampled ROIs.

mix_gt_boxes A bool of whether to mix the groundtruth boxes with proposed ROIs.
num_sampled_rois An int of the number of sampled ROIs per image.
foreground_fraction A float in [0, 1], the fraction of sampled ROIs that should be drawn from the foreground boxes.
foreground_iou_threshold A float that represents the IoU threshold for a box to be considered as positive (if >= foreground_iou_threshold).
background_iou_high_threshold A float that represents the IoU threshold for a box to be considered as negative (if overlap in [background_iou_low_threshold, background_iou_high_threshold]).
background_iou_low_threshold A float that represents the IoU threshold for a box to be considered as negative (if overlap in [background_iou_low_threshold, background_iou_high_threshold]).
skip_subsampling A bool that determines whether to skip the sampling procedure that balances the fg/bg classes. Used for upper frcnn layers in cascade RCNN.
**kwargs Additional keyword arguments passed to Layer.
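
The interplay of the IoU thresholds and foreground_fraction can be illustrated with a small NumPy sketch. This is not the layer's implementation: the helper names label_by_iou and sample_balanced are hypothetical, the threshold values are illustrative rather than the layer's defaults, and letting foreground win when the ranges overlap is an assumption.

```python
import numpy as np

def label_by_iou(max_iou, fg_thresh, bg_high, bg_low):
    """Labels proposals from their best IoU: 1 = foreground, 0 = background, -1 = neither."""
    max_iou = np.asarray(max_iou, dtype=np.float64)
    labels = np.full(max_iou.shape, -1, dtype=np.int64)
    # Background: overlap inside [bg_low, bg_high], as described for the arguments.
    labels[(max_iou >= bg_low) & (max_iou <= bg_high)] = 0
    # Foreground: overlap >= fg_thresh. Applied last so it wins on ties (assumption).
    labels[max_iou >= fg_thresh] = 1
    return labels

def sample_balanced(labels, num_samples, fg_fraction, rng):
    """Draws up to num_samples indices with at most fg_fraction of them foreground."""
    fg = np.flatnonzero(labels == 1)
    bg = np.flatnonzero(labels == 0)
    num_fg = min(int(num_samples * fg_fraction), fg.size)
    num_bg = min(num_samples - num_fg, bg.size)
    # Simplification: no padding if there are too few candidates.
    return np.concatenate([rng.permutation(fg)[:num_fg], rng.permutation(bg)[:num_bg]])

ious = [0.05, 0.3, 0.45, 0.7, 0.9, 0.2, 0.15, 0.35]
labels = label_by_iou(ious, fg_thresh=0.5, bg_high=0.4, bg_low=0.1)
picked = sample_balanced(labels, num_samples=4, fg_fraction=0.25,
                         rng=np.random.default_rng(0))
print(labels.tolist())  # [-1, 0, -1, 1, 1, 0, 0, 0]
```

With these example values, proposals whose best IoU falls between the background and foreground ranges (0.45 here) or below background_iou_low_threshold (0.05) are neither positive nor negative in this sketch.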

Methods

call

Assigns the proposals with groundtruth classes and performs subsampling.

Given proposed_boxes, gt_boxes, and gt_classes, the function uses the following algorithm to generate the final num_samples_per_image RoIs.

  1. Calculates the IoU between each proposal box and each groundtruth box in gt_boxes.
  2. Assigns each proposed box with a groundtruth class and box by choosing the largest IoU overlap.
  3. Samples num_samples_per_image boxes from all proposed boxes, and returns box_targets, class_targets, and RoIs.
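
Steps 1 and 2 can be sketched for a single image in NumPy as follows. This is an illustration under stated assumptions, not the layer's code: pairwise_iou and assign_targets are hypothetical helper names, and treating padded groundtruth rows (all -1) as never matching is an assumption about how padding is handled.

```python
import numpy as np

def pairwise_iou(boxes, gt_boxes):
    """IoU between [N, 4] proposals and [M, 4] groundtruth boxes in [ymin, xmin, ymax, xmax]."""
    a = np.asarray(boxes, dtype=np.float64)[:, None, :]     # [N, 1, 4]
    b = np.asarray(gt_boxes, dtype=np.float64)[None, :, :]  # [1, M, 4]
    # Height and width of the intersection rectangle, clipped at zero.
    h = np.clip(np.minimum(a[..., 2], b[..., 2]) - np.maximum(a[..., 0], b[..., 0]), 0, None)
    w = np.clip(np.minimum(a[..., 3], b[..., 3]) - np.maximum(a[..., 1], b[..., 1]), 0, None)
    inter = h * w
    area_a = (a[..., 2] - a[..., 0]) * (a[..., 3] - a[..., 1])
    area_b = (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    union = area_a + area_b - inter
    iou = np.where(union > 0, inter / np.maximum(union, 1e-12), 0.0)
    # Padded groundtruth rows (coordinates of -1) get IoU -1 so they never win.
    valid = np.all(b >= 0, axis=-1)                          # [1, M]
    return np.where(valid, iou, -1.0)

def assign_targets(boxes, gt_boxes, gt_classes):
    """Matches every proposal to the groundtruth box with the largest IoU."""
    iou = pairwise_iou(boxes, gt_boxes)                      # [N, M]
    matched_idx = np.argmax(iou, axis=1)
    matched_iou = iou[np.arange(iou.shape[0]), matched_idx]
    return matched_idx, matched_iou, np.asarray(gt_classes)[matched_idx]

boxes = [[0, 0, 10, 10], [0, 0, 10, 5], [20, 20, 30, 30]]
gt_boxes = [[0, 0, 10, 10], [20, 20, 30, 40], [-1, -1, -1, -1]]
gt_classes = [3, 7, -1]
matched_idx, matched_iou, matched_classes = assign_targets(boxes, gt_boxes, gt_classes)
print(matched_idx.tolist(), matched_iou.tolist(), matched_classes.tolist())
# [0, 0, 1] [1.0, 0.5, 0.5] [3, 3, 7]
```

The layer itself operates on batched tensors of shape [batch_size, N, 4]; the single-image form here only keeps the indexing easy to follow.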

Args
boxes A tf.Tensor of shape of [batch_size, N, 4]. N is the number of proposals before groundtruth assignment. The last dimension is the box coordinates w.r.t. the scaled images in [ymin, xmin, ymax, xmax] format.
gt_boxes A tf.Tensor of shape of [batch_size, MAX_NUM_INSTANCES, 4]. The coordinates of gt_boxes are in the pixel coordinates of the scaled image. This tensor might have padding of values -1 indicating the invalid box coordinates.
gt_classes A tf.Tensor with a shape of [batch_size, MAX_NUM_INSTANCES]. This tensor might have paddings with values of -1 indicating the invalid classes.
gt_outer_boxes A tf.Tensor of shape of [batch_size, MAX_NUM_INSTANCES, 4]. The coordinates of gt_outer_boxes are in the pixel coordinates of the scaled image. This tensor might have padding of values -1 indicating the invalid box coordinates. Ignored if not provided.

Returns
sampled_rois A tf.Tensor of shape of [batch_size, K, 4], representing the coordinates of the sampled RoIs, where K is the number of the sampled RoIs, i.e. K = num_samples_per_image.
sampled_gt_boxes A tf.Tensor of shape of [batch_size, K, 4], storing the box coordinates of the matched groundtruth boxes of the sampled RoIs.
sampled_gt_outer_boxes A tf.Tensor of shape of [batch_size, K, 4], storing the box coordinates of the matched groundtruth outer boxes of the sampled RoIs. This field is missing if gt_outer_boxes is None.
sampled_gt_classes A tf.Tensor of shape of [batch_size, K], storing the classes of the matched groundtruth boxes of the sampled RoIs.
sampled_gt_indices A tf.Tensor of shape of [batch_size, K], storing the indices of the sampled groundtruth boxes in the original gt_boxes tensor, i.e., gt_boxes[sampled_gt_indices[:, i]] = sampled_gt_boxes[:, i].
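
The relation between sampled_gt_indices and sampled_gt_boxes is a per-image gather. A minimal NumPy sketch with made-up values is shown below; reading the relation as a batched gather is an interpretation of the description above, and in TensorFlow the same lookup could be written with tf.gather(gt_boxes, sampled_gt_indices, batch_dims=1).

```python
import numpy as np

# Made-up inputs: batch_size = 2, MAX_NUM_INSTANCES = 3, K = 4.
gt_boxes = np.array(
    [[[0, 0, 10, 10], [20, 20, 30, 40], [-1, -1, -1, -1]],
     [[5, 5, 15, 15], [-1, -1, -1, -1], [-1, -1, -1, -1]]],
    dtype=np.float32)                                   # [2, 3, 4]
sampled_gt_indices = np.array([[1, 0, 0, 1],
                               [0, 0, 0, 0]])           # [2, 4]

# For every image b and sample i, pick gt_boxes[b, sampled_gt_indices[b, i]].
sampled_gt_boxes = np.take_along_axis(
    gt_boxes, sampled_gt_indices[..., None], axis=1)    # [2, 4, 4]

print(sampled_gt_boxes.shape)           # (2, 4, 4)
print(sampled_gt_boxes[0, 0].tolist())  # [20.0, 20.0, 30.0, 40.0]
```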