tensorflow::serving::Loader

This is an abstract class.

#include <loader.h>

A standardized abstraction for an object that manages the lifecycle of a servable, including loading and unloading it.

Summary

Servables are arbitrary objects that serve algorithms or data that often, though not necessarily, use a machine-learned model.

A Loader for a servable object represents one instance of a stream of servable versions, all sharing a common name (e.g. "my_servable") and increasing version numbers, typically representing updated model parameters learned from fresh training data.

A Loader should start in an unloaded state, meaning that no work has been done to prepare to perform operations. A typical instance that has not yet been loaded contains merely a pointer to a location from which its data can be loaded (e.g. a file-system path or network location). Construction and destruction of instances should be fairly cheap. Expensive initialization operations should be done in Load().

Subclasses may optionally store a pointer to the Source that originated it, for accessing state shared across multiple servable objects in a given servable stream.

Implementations need to ensure that the methods they expose are thread-safe, or carefully document and/or coordinate their thread-safety properties with their clients to ensure correctness. Servables do not need to worry about concurrent execution of Load()/Unload() as the caller will ensure that does not happen.

Inheritance

Direct Known Subclasses:tensorflow::serving::ResourceUnsafeLoader

Constructors and Destructors

~Loader()
The destructor will never be called on a Loader whose servable is currently loaded, i.e.

Public functions

EstimateResources(ResourceAllocation *estimate) const =0
virtual Status
Estimates the resources a servable will use.
Load()
virtual Status
Fetches any data that needs to be loaded before using the servable returned by servable().
LoadWithMetadata(const Metadata & metadata)
virtual Status
Similar to the above method, but takes Metadata as a param, which may be used by the loader implementation appropriately.
Unload()=0
virtual void
Frees any resources allocated during Load() (except perhaps for resources shared across servables that are still needed for other active ones).
servable()=0
virtual AnyPtr
Returns an opaque interface to the underlying servable object.

Structs

tensorflow::serving::Loader::Metadata

The metadata consists of the ServableId.

Public functions

EstimateResources

virtual Status EstimateResources(
  ResourceAllocation *estimate
) const =0

Estimates the resources a servable will use.

IMPORTANT: This method's implementation must obey following requirements, which enable the serving system to reason correctly about which servables can be loaded safely:

  1. The estimate must represent an upper bound on the actual value.
  2. Prior to load, the estimate may include resources that are not bound to any specific device instance, e.g. RAM on one of the two GPUs.
  3. While loaded, for any devices with multiple instances (e.g. two GPUs), the estimate must specify the instance to which each resource is bound.
  4. The estimate must be monotonically non-increasing, i.e. it cannot increase over time. Reasons to have it potentially decrease over time Returns
    an estimate of the resources the servable will consume once loaded. If the servable has already been loaded, returns an estimate of the actual resource usage.

Load

virtual Status Load()

Fetches any data that needs to be loaded before using the servable returned by servable().

May use no more resources than the estimate reported by EstimateResources().

If implementing Load(), you don't have to override LoadWithMetadata().

LoadWithMetadata

virtual Status LoadWithMetadata(
  const Metadata & metadata
)

Similar to the above method, but takes Metadata as a param, which may be used by the loader implementation appropriately.

If you're overriding LoadWithMetadata(), because you can use the metadata appropriately, you can skip overriding Load().

Unload

virtual void Unload()=0

Frees any resources allocated during Load() (except perhaps for resources shared across servables that are still needed for other active ones).

The loader does not need to return to the "new" state (i.e. Load() cannot be called after Unload()).

servable

virtual AnyPtr servable()=0

Returns an opaque interface to the underlying servable object.

The caller should know the precise type of the interface in order to make actual use of it. For example:

CustomLoader implementation:

class CustomLoader : public Loader {
 public:
  ...
  Status Load() override {
    servable_ = ...;
  }

  AnyPtr servable() override { return servable_; }

 private:
  CustomServable* servable_ = nullptr;
};

Serving user request:

ServableHandle<CustomServable> handle = ...
CustomServable* servable = handle.get();
servable->...

If servable() is called after successful Load() and before Unload(), it returns a valid, non-null AnyPtr object. If called before a successful Load() call or after Unload(), it returns null AnyPtr.

~Loader

virtual  ~Loader()=default

The destructor will never be called on a Loader whose servable is currently loaded, i.e.

between (successful) calls to Load() and Unload().