Module google/delf/1

Attentive local feature descriptor trained on photographs of landmarks.

Module URL:

The DELF module takes an image as input and describes its noteworthy points with vectors. The points and vectors can be used for large-scale image retrieval, or for matching two images of the same landmark to obtain local correspondences.

For more information about DELF, e.g. its architecture and applications, please see the paper [1] and the DELF project on GitHub.

Example use

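import tensorflow as tf
import tensorflow_hub as hub

# Prepare an image tensor. decode_jpeg expects the encoded file contents,
# not a file name, so read the file first.
image = tf.image.decode_jpeg(tf.io.read_file('my_image.jpg'), channels=3)
image = tf.image.convert_image_dtype(image, tf.float32)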

# Instantiate the DELF module.
delf_module = hub.Module("")

delf_inputs = {
  # An image tensor with dtype float32 and shape [height, width, 3], where
  # height and width are positive integers:
  'image': image,
  # Scaling factors for building the image pyramid as described in the paper:
  'image_scales': [0.25, 0.3536, 0.5, 0.7071, 1.0, 1.4142, 2.0],
  # Image features whose attention score exceeds this threshold will be
  # returned:
  'score_threshold': 100.0,
  # The maximum number of features that should be returned:
  'max_feature_num': 1000,
}

# Apply the DELF module to the inputs to get the outputs.
delf_outputs = delf_module(delf_inputs, as_dict=True)

# delf_outputs is a dictionary of named tensors:
# * delf_outputs['locations']: a Tensor with dtype float32 and shape [None, 2],
#   where each entry is a coordinate (vertical-offset, horizontal-offset) in
#   pixels from the top-left corner of the image.
# * delf_outputs['descriptors']: a Tensor with dtype float32 and shape
#   [None, 40], where delf_outputs['descriptors'][i] is a 40-dimensional
#   descriptor for the image at location delf_outputs['locations'][i].
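
The 'image_scales' values in the example are consecutive powers of the square root of 2, rounded to four decimals. As a small sketch (the variable name below is only illustrative), the same list can be generated instead of typed by hand:

```python
# Exponents k = -4 .. 2 give 2**(k/2) = 0.25 .. 2.0 in half-octave steps.
image_scales = [round(2.0 ** (k / 2.0), 4) for k in range(-4, 3)]
print(image_scales)
```

Generating the list this way makes it easier to widen or narrow the pyramid by changing the exponent range; whether other ranges work well for a given dataset is not stated here and would need to be checked.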

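Once the 'locations' and 'descriptors' tensors of two images have been evaluated to NumPy arrays, local correspondences can be found by comparing descriptors. The following is a minimal sketch, not the matching procedure of the DELF project: it uses brute-force mutual nearest-neighbour matching, and the function name match_descriptors and the max_distance threshold of 0.8 are assumptions made for illustration. The descriptors in the usage lines are synthetic stand-ins for real module outputs.

```python
import numpy as np


def match_descriptors(desc_a, desc_b, max_distance=0.8):
    """Mutual nearest-neighbour matching between two descriptor sets.

    desc_a: float array of shape [Na, D]; desc_b: float array of shape [Nb, D].
    Returns an int array of shape [M, 2] holding (index_in_a, index_in_b).
    max_distance is a hypothetical threshold, not a documented value.
    """
    desc_a = np.asarray(desc_a, dtype=np.float32)
    desc_b = np.asarray(desc_b, dtype=np.float32)
    if len(desc_a) == 0 or len(desc_b) == 0:
        return np.empty((0, 2), dtype=np.int64)

    # Pairwise Euclidean distances, shape [Na, Nb]. This is brute force and
    # memory-hungry; it is only meant for small descriptor sets.
    diff = desc_a[:, None, :] - desc_b[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    nn_ab = dist.argmin(axis=1)  # closest descriptor in b for each a
    nn_ba = dist.argmin(axis=0)  # closest descriptor in a for each b
    idx_a = np.arange(len(desc_a))

    # Keep a pair only if each side is the other's nearest neighbour and the
    # distance is below the threshold.
    mutual = nn_ba[nn_ab] == idx_a
    close = dist[idx_a, nn_ab] <= max_distance
    keep = mutual & close
    return np.stack([idx_a[keep], nn_ab[keep]], axis=1)


# Synthetic 40-dimensional descriptors standing in for real module outputs:
# desc_a holds slightly perturbed copies of rows 3, 1 and 4 of desc_b.
rng = np.random.default_rng(0)
desc_b = rng.normal(size=(5, 40)).astype(np.float32)
noise = 0.01 * rng.normal(size=(3, 40)).astype(np.float32)
desc_a = desc_b[[3, 1, 4]] + noise
matches = match_descriptors(desc_a, desc_b)
print(matches)
```

Each row of the result indexes into the two 'locations' arrays, so the matched pixel coordinates can be looked up directly. In practice such putative matches are usually filtered further with a geometric verification step such as RANSAC, and a KD-tree is a common replacement for the brute-force distance matrix when there are many features.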

The feature extraction and attention weights were trained on the "full" and "clean" subsets of the data as introduced in the paper [2].


