Cross Layer in Deep & Cross Network to learn explicit feature interactions.
tfrs.layers.feature_interaction.MultiLayerDCN(
projection_dim: Optional[int] = 1,
num_layers: Optional[int] = 3,
use_bias: bool = True,
kernel_initializer: Union[Text, tf.keras.initializers.Initializer] = 'truncated_normal',
bias_initializer: Union[Text, tf.keras.initializers.Initializer] = 'zeros',
kernel_regularizer: Union[Text, None, tf.keras.regularizers.Regularizer] = None,
bias_regularizer: Union[Text, None, tf.keras.regularizers.Regularizer] = None,
**kwargs
)
A layer that creates explicit, bounded-degree feature interactions
efficiently. Each Cross layer in the stack combines two tensors. The first,
x0, is the base layer that contains the original features (usually the
embedding layer); the second, xi, is the output of the previous Cross layer
in the stack, i.e., the i-th Cross layer. For the first Cross layer in the
stack, x0 = xi. The call method documented under Methods takes only x0.
The output is x_{i+1} = x0 .* (W * xi + bias + diag_scale * xi) + xi,
where .* denotes elementwise multiplication, W can be a full-rank
matrix or a low-rank matrix U*V that reduces the computational cost, and
diag_scale increases the diagonal of W to improve training stability
(especially in the low-rank case).
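As a rough illustration of the update rule above, the following sketch applies the formula to a single example using plain Python lists. The helper names `cross_step` and `stacked_cross` are hypothetical and not part of the library. The sketch assumes a full-rank W and fixed weights, whereas the real layer works on batched tensors with learned weights.

```python
def cross_step(x0, xi, w, bias, diag_scale=0.0):
    """One cross step for one example: x0 .* (W * xi + bias + diag_scale * xi) + xi."""
    dim = len(x0)
    out = []
    for r in range(dim):
        # Row r of the matrix-vector product W * xi.
        wx = sum(w[r][c] * xi[c] for c in range(dim))
        out.append(x0[r] * (wx + bias[r] + diag_scale * xi[r]) + xi[r])
    return out


def stacked_cross(x0, weights, biases, diag_scale=0.0):
    """Apply one cross step per (W, bias) pair, starting from xi = x0."""
    xi = x0
    for w, b in zip(weights, biases):
        xi = cross_step(x0, xi, w, b, diag_scale)
    return xi


identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [0.0, 0.0]
# Two stacked steps with W = I and bias = 0.
print(stacked_cross([1.0, 2.0], [identity, identity], [zeros, zeros]))  # [4.0, 18.0]
```

With W = I and zero bias, each step reduces to x0 .* xi + xi, so the first step gives [2, 6] and the second gives [4, 18]. The degree of the interaction grows by one with every step.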
References:
1. [R. Wang et al.](https://arxiv.org/pdf/2008.13535.pdf)
See Eq. (1) for full-rank and Eq. (2) for low-rank version.
2. [R. Wang et al.](https://arxiv.org/pdf/1708.05123.pdf)
Example:
```python
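import tensorflow as tf
import tensorflow_recommenders as tfrs

MultiLayerDCN = tfrs.layers.feature_interaction.MultiLayerDCN

# After an embedding layer in a functional model. Scalar ids are used here so
# that the embedding output has shape (batch_size, 6).
inputs = tf.keras.Input(shape=(), name='index', dtype=tf.int64)
x0 = tf.keras.layers.Embedding(input_dim=32, output_dim=6)(inputs)
# Each layer takes a single tensor; the second is chained on the first.
x1 = MultiLayerDCN()(x0)
x2 = MultiLayerDCN()(x1)
logits = tf.keras.layers.Dense(units=10)(x2)
model = tf.keras.Model(inputs, logits)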
```
Attributes:
projection_dim: Projection dimension used to reduce the computational cost.
A low-rank matrix W = U*V will be used, where U is of size input_dim by
projection_dim and V is of size projection_dim by input_dim.
projection_dim needs to be smaller than input_dim/2 to improve the
model efficiency. In practice, we've observed that projection_dim =
input_dim/4 consistently preserved the accuracy of a full-rank version.
num_layers: The number of stacked DCN layers.
use_bias: Whether to add a bias term for this layer. If set to False, no
bias term will be used.
kernel_initializer: Initializer to use on the kernel matrix.
bias_initializer: Initializer to use on the bias vector.
kernel_regularizer: Regularizer to use on the kernel matrix.
bias_regularizer: Regularizer to use on the bias vector.
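To see why projection_dim should stay below input_dim/2, it can help to count kernel parameters for one cross layer. The sketch below is only back-of-the-envelope arithmetic based on the U and V sizes given above. The function name `kernel_param_counts` is hypothetical, and bias terms are ignored.

```python
def kernel_param_counts(input_dim, projection_dim):
    """Return (full_rank, low_rank) kernel parameter counts for one cross layer."""
    # Full-rank W is input_dim x input_dim.
    full_rank = input_dim * input_dim
    # U is input_dim x projection_dim and V is projection_dim x input_dim.
    low_rank = 2 * input_dim * projection_dim
    return full_rank, low_rank


print(kernel_param_counts(input_dim=64, projection_dim=16))  # (4096, 2048)
print(kernel_param_counts(input_dim=64, projection_dim=32))  # (4096, 4096)
```

At projection_dim = input_dim/2 the two counts are equal, so the low-rank form only saves parameters below that point. At input_dim/4 it uses half as many.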
Input shape: A (batch_size, input_dim) dimensional input, matching the single x0 argument of call.
Output shape: A single (batch_size, input_dim) dimensional output.
Methods
call
call(
x0: tf.Tensor
) -> tf.Tensor
Computes the multi layer DCN feature cross.
| Args | |
|---|---|
| x0 | The input tensor. |
| Returns | |
|---|---|
| Tensor of crosses. |