Options to configure the image processing pipeline, which operates before inference.
The Task Library Vision API preprocesses the input image over the region of interest so that it fits the model requirements (e.g. upright 224x224 RGB) and populates the corresponding input tensor. Preprocessing consists of the following steps, in this order:
- cropping the frame buffer to the region of interest (which, in most cases, just covers the entire input image),
- resizing it (with bilinear interpolation, aspect-ratio *not* preserved) to the dimensions of the model input tensor,
- converting it to the colorspace of the input tensor (i.e. RGB, which is the only supported colorspace for now),
- rotating it according to its ImageProcessingOptions.Orientation so that inference is performed on an "upright" image.
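Because the resize step does not preserve aspect ratio, the horizontal and vertical scale factors are generally different. The following is a minimal sketch of that arithmetic in plain Java. The class `ResizeScaleSketch` and its method are hypothetical illustration names and are not part of the Task Library; the sketch only computes the two ratios and does not reproduce the library's bilinear interpolation.

```java
/** Hypothetical helper for illustration only; not part of the Task Library. */
final class ResizeScaleSketch {
    private ResizeScaleSketch() {}

    /**
     * Returns {scaleX, scaleY}: the independent factors by which a cropped
     * region of roiWidth x roiHeight pixels is stretched or shrunk to fill
     * a model input of inputWidth x inputHeight pixels.
     */
    static double[] scaleFactors(int roiWidth, int roiHeight, int inputWidth, int inputHeight) {
        if (roiWidth <= 0 || roiHeight <= 0 || inputWidth <= 0 || inputHeight <= 0) {
            throw new IllegalArgumentException("all dimensions must be positive");
        }
        return new double[] {
            (double) inputWidth / roiWidth,
            (double) inputHeight / roiHeight
        };
    }

    public static void main(String[] args) {
        // A 640x480 region resized to a 224x224 input tensor.
        double[] s = scaleFactors(640, 480, 224, 224);
        // The two factors differ, so the image content is distorted.
        System.out.println("scaleX=" + s[0] + " scaleY=" + s[1]);
    }
}
```

When the two factors differ noticeably, choosing a region of interest whose aspect ratio matches the model input is one way to limit the distortion.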
IMPORTANT: because cropping occurs first, the provided region of interest is expressed in the unrotated frame of reference, i.e. in [0, TensorImage.getWidth()) x [0, TensorImage.getHeight()), which are the dimensions of the underlying image data before any orientation is applied. If the region is out of these bounds, the inference method, such as ImageClassifier.classify(MlImage), returns an error.
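The bounds rule above can be checked before calling inference. The sketch below is plain Java with no library dependency. The class `RoiBoundsSketch` and its method are hypothetical names, and treating the rectangle as half-open with exclusive right and bottom edges is an assumption; the exact validation performed inside the library may differ.

```java
/** Hypothetical helper for illustration only; not part of the Task Library. */
final class RoiBoundsSketch {
    private RoiBoundsSketch() {}

    /**
     * Returns true if the half-open rectangle [left, right) x [top, bottom)
     * is non-empty and lies inside [0, width) x [0, height), where width and
     * height are the dimensions of the stored (unrotated) image data.
     */
    static boolean isWithinUnrotatedBounds(
            int left, int top, int right, int bottom, int width, int height) {
        boolean nonEmpty = left < right && top < bottom;
        boolean inside = left >= 0 && top >= 0 && right <= width && bottom <= height;
        return nonEmpty && inside;
    }

    public static void main(String[] args) {
        // Image data stored as 640x480, even if it is displayed rotated as 480x640.
        int width = 640;
        int height = 480;
        // Full frame in stored coordinates: inside the bounds.
        System.out.println(isWithinUnrotatedBounds(0, 0, 640, 480, width, height)); // true
        // Full frame in rotated (display) coordinates: bottom edge exceeds the stored height.
        System.out.println(isWithinUnrotatedBounds(0, 0, 480, 640, width, height)); // false
    }
}
```

The second call shows the typical mistake: a region chosen on the upright preview has to be converted back to stored coordinates before it is passed as the region of interest.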
Nested Classes
class | ImageProcessingOptions.Builder | Builder for ImageProcessingOptions. |
enum | ImageProcessingOptions.Orientation | Orientation type that follows EXIF specification. |
Public Constructors
Public Methods
static ImageProcessingOptions.Builder | builder() |
abstract ImageProcessingOptions.Orientation | |
abstract Rect | getRoi() |