Module google/imagenet/mobilenet_v2_050_96/classification/1

Imagenet (ILSVRC-2012-CLS) classification with MobileNet V2 (depth multiplier 0.50).

Module URL: https://tfhub.dev/google/imagenet/mobilenet_v2_050_96/classification/1

Overview

MobileNet V2 is a family of neural network architectures for efficient on-device image classification and related tasks, originally published by

MobileNets come in various sizes, controlled by a multiplier for the depth (number of features) in the convolutional layers. They can also be trained for various input image sizes to control inference speed.

This TF-Hub module uses the TF-Slim implementation of mobilenet_v2 with a depth multiplier of 0.5 and an input size of 96x96 pixels. This implementation of MobileNet V2 rounds feature depths to multiples of 8 (an optimization not described in the paper). Depth multipliers less than 1.0 are not applied to the last convolutional layer (from which the module takes the image feature vector).
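The rounding of feature depths can be illustrated with a short sketch. The helper below is modeled on the kind of "make divisible" rule used in MobileNet implementations; the function names, the 10% guard, and the base channel counts are illustrative assumptions, not a specification of what this module does internally.

```python
def make_divisible(value, divisor=8, min_value=None):
    """Round `value` to the nearest multiple of `divisor`, never below `min_value`."""
    if min_value is None:
        min_value = divisor
    rounded = max(min_value, int(value + divisor / 2) // divisor * divisor)
    # Assumed guard: do not let rounding reduce the depth by more than 10%.
    if rounded < 0.9 * value:
        rounded += divisor
    return rounded


def scaled_depth(base_depth, multiplier=0.5):
    """Apply the depth multiplier, then round to a multiple of 8."""
    return make_divisible(base_depth * multiplier)


# Hypothetical base channel counts, chosen only for illustration.
depths = [scaled_depth(d) for d in (32, 24, 64, 96)]
print(depths)
```

Note that a base depth of 24 scales to 12, which this rule rounds up to 16 rather than down to 8.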

The module contains a trained instance of the network, packaged to do the image classification that the network was trained on. If you merely want to transform images into feature vectors, use module google/imagenet/mobilenet_v2_050_96/feature_vector/1 instead, and save the space occupied by the classification layer.

Training

The checkpoint exported into this module was mobilenet_v2_0.5_96/mobilenet_v2_0.5_96.ckpt, downloaded from the published MobileNet V2 pre-trained models. Its weights were originally obtained by training on the ILSVRC-2012-CLS dataset for image classification ("Imagenet").

Usage

This module implements the common signature for image classification. It can be used as follows:

import tensorflow as tf  # TensorFlow 1.x graph mode, as required by hub.Module.
import tensorflow_hub as hub

module = hub.Module("https://tfhub.dev/google/imagenet/mobilenet_v2_050_96/classification/1")
height, width = hub.get_expected_image_size(module)
images = ...  # A batch of images with shape [batch_size, height, width, 3].
logits = module(images)  # Logits with shape [batch_size, num_classes].

...or using the signature name image_classification. The indices into logits are the num_classes = 1001 classes of the classification from the original training (see above).
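To read the logits as class predictions, one option is to apply a softmax and take the highest-scoring indices. The sketch below uses plain Python on a made-up logits row; the helper names and the example values are hypothetical. In TF-Slim ImageNet checkpoints with 1001 classes, index 0 is commonly a "background" class, but verify this against the label file you use.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=5):
    """Return the k most probable (class_index, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Made-up logits for a single image: 1001 entries, two of them raised.
example_logits = [0.0] * 1001
example_logits[282] = 5.0
example_logits[7] = 3.0
best = top_k(example_logits, k=2)
```

In a real pipeline, each row of the `logits` array fetched from a session run would take the place of `example_logits`.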

This module can also be used to compute image feature vectors, using the signature name image_feature_vector.

For this module, the size of the input image is fixed to height x width = 96 x 96 pixels. The input images are expected to have color values in the range [0,1], following the common image input conventions.
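As a minimal sketch of preparing inputs, the helper below converts a batch of 8-bit RGB images to floats in [0,1] and checks the fixed 96 x 96 size. It assumes NumPy is available and that the images have already been resized; the function name `to_module_input` is hypothetical.

```python
import numpy as np

EXPECTED_HEIGHT, EXPECTED_WIDTH = 96, 96  # Fixed input size stated above.


def to_module_input(images_uint8):
    """Scale a [batch, 96, 96, 3] uint8 batch to float32 values in [0, 1]."""
    images = np.asarray(images_uint8)
    expected = (EXPECTED_HEIGHT, EXPECTED_WIDTH, 3)
    if images.ndim != 4 or images.shape[1:] != expected:
        raise ValueError(
            "expected shape [batch, 96, 96, 3], got %s" % (images.shape,))
    return images.astype(np.float32) / 255.0


# Random pixels stand in for real, already-resized images.
batch = np.random.randint(0, 256, size=(2, 96, 96, 3), dtype=np.uint8)
inputs = to_module_input(batch)
```

The resulting array can be fed to the `images` tensor in the usage snippet above.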

Fine-tuning

In principle, consumers of this module can fine-tune it. However, fine-tuning through a large classification layer might be prone to overfitting.

Fine-tuning requires importing the graph version with tag set {"train"} in order to operate batch normalization in training mode.
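A minimal sketch of instantiating the module for fine-tuning follows, assuming TensorFlow 1.x graph mode and the `hub.Module` API shown in the usage section. The function name `build_finetune_logits` is hypothetical, and collecting `UPDATE_OPS` reflects the usual TF 1.x convention for batch normalization rather than anything stated on this page.

```python
def build_finetune_logits(images, module_url):
    """Instantiate the module in training mode and return (logits, update_ops)."""
    # Imported inside the function so the sketch can be loaded without
    # TensorFlow installed; TF 1.x and tensorflow_hub are assumed.
    import tensorflow as tf
    import tensorflow_hub as hub

    # trainable=True exposes the module's variables to the optimizer;
    # tags={"train"} selects the graph version with batch norm in training mode.
    module = hub.Module(module_url, trainable=True, tags={"train"})
    logits = module(images)

    # Assumed convention: batch-norm moving averages are updated through ops
    # in this collection, which should run together with the training op.
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    return logits, update_ops
```

A training op would then typically be wrapped in `tf.control_dependencies(update_ops)` so that the moving averages are updated on each step.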