Layers

Just as Tensor is our fundamental building block for accelerated parallel computation, most machine learning models and operations will be expressed in terms of the Layer protocol. Layer defines an interface for types that take a differentiable input, process it, and produce a differentiable output. A Layer can contain state, such as trainable weights.

Layer is a refinement of the Module protocol, with Module defining the more general case where the input to the type is not necessarily differentiable. Most components in a model will deal with differentiable inputs, but there are cases where types may need to conform to Module instead.

If you create an operation that has no trainable parameters within it, you'll want to define it in terms of ParameterlessLayer instead of Layer.
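As a minimal sketch of what a parameterless operation could look like, the following defines a hypothetical `Scale` layer (the name and its `factor` property are illustrative assumptions, not part of the library). It follows the pattern used by built-in parameterless layers: an empty tangent vector and configuration values marked `@noDerivative`.

```swift
import TensorFlow

// Hypothetical example layer: multiplies its input by a fixed, non-trainable factor.
public struct Scale: ParameterlessLayer {
  // Nothing is trained, so the tangent vector is empty.
  public typealias TangentVector = EmptyTangentVector

  // Fixed configuration value, excluded from differentiation.
  @noDerivative public let factor: Float

  public init(factor: Float) {
    self.factor = factor
  }

  @differentiable
  public func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
    return input * factor
  }
}

// Usage: the layer is called like a function.
let scaled = Scale(factor: 2)(Tensor<Float>([1, 2, 3]))
```

Because the layer has no trainable state, an optimizer has nothing to update for it; gradients still flow through it to any layers that come before it.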

Models themselves are often defined as Layers, and are regularly composed of other Layers. A model or subunit that has been defined as a Layer can be treated just like any other Layer, allowing for the construction of arbitrarily complex models from other models or subunits.
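To illustrate this kind of composition, here is a rough sketch in which a hypothetical subunit, `Block`, is reused inside a hypothetical `StackedModel`. Both type names and all sizes are assumptions made for the example; `Dense`, `relu`, and `sequenced(through:)` are the library pieces it relies on.

```swift
import TensorFlow

// Hypothetical subunit: two dense layers applied in sequence.
struct Block: Layer {
  var first: Dense<Float>
  var second: Dense<Float>

  init(size: Int) {
    first = Dense<Float>(inputSize: size, outputSize: size, activation: relu)
    second = Dense<Float>(inputSize: size, outputSize: size, activation: relu)
  }

  @differentiable
  func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
    return input.sequenced(through: first, second)
  }
}

// Hypothetical model that uses Block the same way it uses a built-in layer.
struct StackedModel: Layer {
  var blockA = Block(size: 8)
  var blockB = Block(size: 8)
  var output = Dense<Float>(inputSize: 8, outputSize: 2)

  @differentiable
  func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
    return input.sequenced(through: blockA, blockB, output)
  }
}

// A batch of 4 inputs with 8 features each produces 4 outputs with 2 values each.
let stackedOutput = StackedModel()(Tensor<Float>(zeros: [4, 8]))
```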

To define a custom Layer for a model or operation of your own, you will generally follow a template similar to this:

public struct MyModel: Layer {
  // Define your layers or other properties here.

  // A custom initializer may be desired to configure the model.
  public init() {}

  @differentiable
  public func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
    // Define the sequence of operations performed on model input to arrive at the output.
    return ...
  }
}

Trainable components of Layers, such as weights and biases, as well as other Layers, can be declared as properties. A custom initializer is a good place to expose customizable parameters for a model, such as a variable number of layers or the output size of a classification model. Finally, the core of the Layer is callAsFunction(), where you will define the types for the input and output as well as the transformation that takes in one and returns the other.
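A filled-in version of the template might look like the sketch below. The `Classifier` type, its property names, and the initializer parameters (`inputSize`, `hiddenSize`, `classCount`) are hypothetical choices for illustration; only `Dense` and `relu` come from the library.

```swift
import TensorFlow

// Hypothetical classifier showing properties, a custom initializer, and callAsFunction().
public struct Classifier: Layer {
  // Other Layers declared as properties; their weights and biases are trainable.
  public var hidden: Dense<Float>
  public var logits: Dense<Float>

  // The initializer exposes the configurable sizes, including the output size.
  public init(inputSize: Int, hiddenSize: Int, classCount: Int) {
    hidden = Dense<Float>(inputSize: inputSize, outputSize: hiddenSize, activation: relu)
    logits = Dense<Float>(inputSize: hiddenSize, outputSize: classCount)
  }

  // Input and output types are fixed here, along with the transformation between them.
  @differentiable
  public func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
    return logits(hidden(input))
  }
}

// Usage: 2 examples with 4 features each, scored against 3 classes.
let classifier = Classifier(inputSize: 4, hiddenSize: 16, classCount: 3)
let scores = classifier(Tensor<Float>(zeros: [2, 4]))
```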

Built-in layers

Many common machine learning operations have been encapsulated as Layers for you to use when defining models or subunits. The following is a list of the layers provided by Swift for TensorFlow, grouped by functional area:

Augmentation

Convolution

Embedding

Morphological

Normalization

Pooling

Recurrent neural networks

Reshaping

Upsampling

Optimizers

Optimizers are a key component of training a machine learning model: they update the model based on a calculated gradient. Ideally, these updates adjust the model's parameters so that the loss decreases over the course of training.

To use an optimizer, first initialize it for a target model with appropriate training parameters:

let optimizer = RMSProp(for: model, learningRate: 0.0001, decay: 1e-6)

Train a model by obtaining the gradient of a loss function with respect to the model for a given input, and then update the model along that gradient using your optimizer:

optimizer.update(&model, along: gradients)
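Putting the two steps together, a single training step could be sketched as follows. The single `Dense` layer standing in for a model, the synthetic data, and the choice of `meanSquaredError` as the loss are all assumptions made to keep the example self-contained; a real training loop would iterate over batches of a dataset.

```swift
import TensorFlow

// Hypothetical stand-in model and synthetic data, for illustration only.
var model = Dense<Float>(inputSize: 4, outputSize: 1)
let optimizer = RMSProp(for: model, learningRate: 0.0001, decay: 1e-6)

let input = Tensor<Float>(randomNormal: [8, 4])
let target = Tensor<Float>(zeros: [8, 1])

// Compute the loss and its gradient with respect to the model.
let (loss, gradients) = valueWithGradient(at: model) { model -> Tensor<Float> in
  let prediction = model(input)
  return meanSquaredError(predicted: prediction, expected: target)
}

// Move the model's parameters along the gradient.
optimizer.update(&model, along: gradients)
print("Loss: \(loss)")
```

The gradient has the same structure as the model, so here it carries one entry for the layer's weight and one for its bias.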

Built-in optimizers

Several common optimizers are provided by Swift for TensorFlow. These include the following: