Cross Layer in Deep & Cross Network to learn explicit feature interactions.
tfrs.layers.dcn.Cross(
projection_dim: Optional[int] = None,
diag_scale: Optional[float] = 0.0,
use_bias: bool = True,
kernel_initializer: Union[Text, tf.keras.initializers.Initializer] = 'truncated_normal',
bias_initializer: Union[Text, tf.keras.initializers.Initializer] = 'zeros',
kernel_regularizer: Union[Text, None, tf.keras.regularizers.Regularizer] = None,
bias_regularizer: Union[Text, None, tf.keras.regularizers.Regularizer] = None,
**kwargs
)
A layer that creates explicit and bounded-degree feature interactions efficiently. The call method accepts inputs as a tuple of two tensors. The first input x0 is the base layer that contains the original features (usually the embedding layer); the second input xi is the output of the previous Cross layer in the stack, i.e., the i-th Cross layer. For the first Cross layer in the stack, x0 = xi.
The output is x_{i+1} = x0 .* (W * xi + bias + diag_scale * xi) + xi, where .* denotes elementwise multiplication. W can be a full-rank matrix, or a low-rank matrix U*V to reduce the computational cost, and diag_scale increases the diagonal of W to improve training stability (especially for the low-rank case).
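To make the recurrence concrete, here is a minimal sketch of a single cross step written directly in TensorFlow ops. The variable names (W, b, U, V) and sizes are illustrative, not the layer's internal implementation, and diag_scale is left at its default of 0:

import tensorflow as tf

batch, d, r = 8, 6, 3  # illustrative sizes; r plays the role of projection_dim

x0 = tf.random.normal([batch, d])  # original features, e.g. the embedding output
xi = tf.random.normal([batch, d])  # output of the previous Cross layer

# Full-rank step: x_{i+1} = x0 .* (W * xi + b) + xi
W = tf.random.normal([d, d])
b = tf.zeros([d])
x_next = x0 * (tf.einsum('ij,bj->bi', W, xi) + b) + xi

# Low-rank variant: replace W with U*V, where U is (d, r) and V is (r, d)
U = tf.random.normal([d, r])
V = tf.random.normal([r, d])
x_next_low_rank = x0 * (tf.einsum('ij,bj->bi', U, tf.einsum('ij,bj->bi', V, xi)) + b) + xi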
References:
- R. Wang et al., "DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-scale Learning to Rank Systems" (2020). See Eq. (1) for the full-rank and Eq. (2) for the low-rank version.
- R. Wang et al., "Deep & Cross Network for Ad Click Predictions" (2017).
Example:
import tensorflow as tf
from tensorflow_recommenders.layers.dcn import Cross

# after embedding layer in a functional model:
input = tf.keras.Input(shape=(None,), name='index', dtype=tf.int64)
x0 = tf.keras.layers.Embedding(input_dim=32, output_dim=6)(input)
x1 = Cross()(x0, x0)
x2 = Cross()(x0, x1)
logits = tf.keras.layers.Dense(units=10)(x2)
model = tf.keras.Model(input, logits)
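The resulting model can then be called on a batch of integer indices. A minimal usage sketch with illustrative sample values; the expected shape in the comment follows from the functional graph above, where the Cross layers operate on the last (embedding) dimension:

indices = tf.constant([[1, 5, 9], [2, 4, 7]], dtype=tf.int64)  # batch of 2 sequences of length 3
logits_out = model(indices)
print(logits_out.shape)  # expected (2, 3, 10)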
Args
projection_dim: Projection dimension to reduce the computational cost. Default is None, in which case a full (input_dim by input_dim) matrix W is used. If set, a low-rank matrix W = U*V will be used, where U is of size input_dim by projection_dim and V is of size projection_dim by input_dim. projection_dim needs to be smaller than input_dim/2 to improve model efficiency. In practice, we've observed that projection_dim = d/4 consistently preserved the accuracy of the full-rank version. (A short sketch comparing the two variants follows the shape notes below.)
diag_scale: A non-negative float used to increase the diagonal of the kernel W by diag_scale, that is, W + diag_scale * I, where I is an identity matrix.
use_bias: Whether to add a bias term for this layer. If set to False, no bias term will be used.
kernel_initializer: Initializer to use on the kernel matrix.
bias_initializer: Initializer to use on the bias vector.
kernel_regularizer: Regularizer to use on the kernel matrix.
bias_regularizer: Regularizer to use on the bias vector.
Input shape: A tuple of 2 (batch_size, input_dim)-dimensional inputs.
Output shape: A single (batch_size, input_dim)-dimensional output.
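As a quick illustration of the shape contract and the low-rank option, this sketch builds a full-rank and a low-rank Cross layer and compares their parameter counts. The sizes and the arithmetic in the comments are illustrative assumptions, not guarantees of the layer's internal variable layout:

import tensorflow as tf
import tensorflow_recommenders as tfrs

d = 128
x0 = tf.random.normal([4, d])  # (batch_size, input_dim)

full = tfrs.layers.dcn.Cross()
low = tfrs.layers.dcn.Cross(projection_dim=d // 4)

print(full(x0, x0).shape)  # (4, 128): output keeps (batch_size, input_dim)
print(low(x0, x0).shape)   # (4, 128)

# The full-rank kernel alone holds d*d = 16384 weights; the low-rank factors
# U, V together hold about 2 * d * (d // 4) = 8192 (bias terms excluded),
# roughly halving the cost at projection_dim = d/4.
print(full.count_params(), low.count_params())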
Methods
call
call(
x0: tf.Tensor, x: Optional[tf.Tensor] = None
) -> tf.Tensor
Computes the feature cross.
Args
x0: The input tensor.
x: Optional second input tensor. If provided, the layer will compute crosses between x0 and x; if not provided, the layer will compute crosses between x0 and itself.

Returns
Tensor of crosses.
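For example, a minimal sketch of both call signatures (sizes illustrative):

import tensorflow as tf
import tensorflow_recommenders as tfrs

cross = tfrs.layers.dcn.Cross()
x0 = tf.random.normal([2, 4])

x1 = cross(x0)      # no second argument: crosses x0 with itself
x2 = cross(x0, x1)  # crosses the base features x0 with the previous output x1

Note that reusing one Cross instance like this shares its weights across both calls; the class example above instead stacks separate Cross() layers, giving each step its own kernel.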