tfm.vision.models.RetinaNetModel

The RetinaNet model class.

backbone tf.keras.Model a backbone network.
decoder tf.keras.Model a decoder network.
head RetinaNetHead, the RetinaNet head.
detection_generator the detection generator.
min_level Minimum level in output feature maps.
max_level Maximum level in output feature maps.
num_scales A number representing intermediate scales added on each level. For instance, num_scales=2 adds one additional intermediate anchor scale, giving the scales [2^0, 2^0.5] on each level.
aspect_ratios A list representing the aspect ratio anchors added on each level. Each number indicates the ratio of width to height. For instance, aspect_ratios=[1.0, 2.0, 0.5] adds three anchors on each scale level.
anchor_size A number representing the ratio of the base anchor size to the feature stride 2^level.
**kwargs keyword arguments to be passed.
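
As a rough illustration of how num_scales, aspect_ratios and anchor_size combine, the sketch below enumerates anchor widths and heights per level in plain Python. It is not the library's anchor generator: the helper name anchor_sizes_per_level is hypothetical, and the choice to keep the anchor area fixed while varying the aspect ratio is an assumption about a common convention, not something stated on this page.

```python
import math


def anchor_sizes_per_level(min_level, max_level, num_scales,
                           aspect_ratios, anchor_size):
    """Hypothetical helper: returns {level_str: [(width, height), ...]}."""
    sizes = {}
    for level in range(min_level, max_level + 1):
        stride = 2 ** level  # feature stride at this level
        level_sizes = []
        for scale in range(num_scales):
            # Intermediate scales 2^(0/num_scales), 2^(1/num_scales), ...
            base = anchor_size * stride * 2 ** (scale / num_scales)
            for ratio in aspect_ratios:
                # ratio is width / height; assumption: area stays base^2.
                width = base * math.sqrt(ratio)
                height = base / math.sqrt(ratio)
                level_sizes.append((width, height))
        # Keys are strings, matching the "key: str" convention used for
        # the multilevel dicts on this page.
        sizes[str(level)] = level_sizes
    return sizes


sizes = anchor_sizes_per_level(
    min_level=3, max_level=7, num_scales=2,
    aspect_ratios=[1.0, 2.0, 0.5], anchor_size=4.0)
num_anchors_per_location = len(sizes["3"])
```

Under these assumptions, the number of anchors per location is num_scales multiplied by the number of aspect ratios, which is the num_anchors_per_location factor that appears in the output shapes of call.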

backbone

checkpoint_items Returns a dictionary of items to be additionally checkpointed.
decoder

detection_generator

head

Methods

call

Forward pass of the RetinaNet model.

Args
images Tensor or a sequence of Tensor, the input batched images to the backbone network, whose shape(s) is [batch, height, width, 3]. If it is a sequence of Tensor, we will assume the anchors are generated based on the shape of the first image(s).
image_shape Tensor, the actual shape of the input images, whose shape is [batch, 2] where the last dimension is [height, width]. Note that this is the actual image shape excluding paddings. For example, images in the batch may be resized into different shapes before padding to the fixed size.
anchor_boxes a dict of tensors which includes multilevel anchors.

  • key: str, the level of the multilevel predictions.
  • values: Tensor, the anchor coordinates of a particular feature level, whose shape is [height_l, width_l, num_anchors_per_location].
output_intermediate_features bool indicating whether to return the intermediate feature maps generated by backbone and decoder.
training bool, indicating whether it is in training mode.

Returns
scores a dict of tensors which includes scores of the predictions.

  • key: str, the level of the multilevel predictions.
  • values: Tensor, the box scores predicted from a particular feature level, whose shape is [batch, height_l, width_l, num_classes * num_anchors_per_location].
boxes a dict of tensors which includes coordinates of the predictions.

  • key: str, the level of the multilevel predictions.
  • values: Tensor, the box coordinates predicted from a particular feature level, whose shape is [batch, height_l, width_l, 4 * num_anchors_per_location].
attributes a dict of (attribute_name, attribute_predictions). Each attribute prediction is a dict that includes:

  • key: str, the level of the multilevel predictions.
  • values: Tensor, the attribute predictions from a particular feature level, whose shape is [batch, height_l, width_l, att_size * num_anchors_per_location].
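
To make the documented shapes concrete, the following sketch computes the expected scores and boxes shapes for each level. It is only a shape calculation in plain Python, not a call into the model: the helper name expected_output_shapes is hypothetical, and it assumes the padded input height and width are divisible by 2^max_level so that height_l = height / 2^level.

```python
def expected_output_shapes(batch, height, width, min_level, max_level,
                           num_classes, num_anchors_per_location):
    """Hypothetical helper: expected per-level shapes of scores and boxes."""
    shapes = {"scores": {}, "boxes": {}}
    for level in range(min_level, max_level + 1):
        stride = 2 ** level
        # Assumption: height and width are divisible by 2^max_level.
        h_l, w_l = height // stride, width // stride
        shapes["scores"][str(level)] = (
            batch, h_l, w_l, num_classes * num_anchors_per_location)
        shapes["boxes"][str(level)] = (
            batch, h_l, w_l, 4 * num_anchors_per_location)
    return shapes


shapes = expected_output_shapes(
    batch=2, height=640, width=640, min_level=3, max_level=7,
    num_classes=90, num_anchors_per_location=9)
for level, shape in shapes["scores"].items():
    print(level, shape, shapes["boxes"][level])
```

Comparing these tuples against the shapes of the tensors returned by call can be a quick sanity check when wiring up a custom backbone or decoder.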