tfl.conditional_cdf.cdf_fn

Maps inputs through a CDF function specified by keypoint parameters.

cdf_fn is similar to tfl.layers.CDF, which computes an additive / multiplicative average of a few shifted and scaled sigmoid or relu6 basis functions. The difference is that the basis functions are parametrized by the provided parameters instead of by learnable weights belonging to a tfl.layers.CDF layer.

These parameters can be one of:

  • constants,
  • trainable variables,
  • outputs from other TF modules.

For inputs of shape (batch_size, input_dim), two sets of free-form parameters are used to configure the CDF function:

  • location_parameters for where to place the sigmoid / relu6 transformation basis,
  • scaling_parameters (optional) for the horizontal scaling before applying the transformation basis.

The transformation per dimension is x -> activation(scale * (x - location)), where:

  • scale (specified via scaling_parameters) is the input scaling for each dimension and needs to be strictly positive for the CDF function to be monotonic. If needed, you can set scaling_exp_transform_multiplier to get scale = exp(scaling_parameters * scaling_exp_transform_multiplier), which guarantees strict positivity.
  • location (specified via location_parameters) is the input shift. Note that for relu6 this is where the transformation starts to be nonzero, whereas for sigmoid this is where the transformation reaches 0.5.
  • activation is either sigmoid or relu6 (applied as relu6 / 6).
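
The per-dimension transformation above can be sketched for a single scalar input using only the standard library. This is an illustrative sketch, not the library's implementation: the helper names basis, sigmoid and relu6 are hypothetical, and the relu6 branch assumes the activation is applied to scale * (x - location) and then divided by 6, as the formula above suggests.

```python
import math


def sigmoid(z):
    # Logistic function; equals 0.5 at z = 0.
    return 1.0 / (1.0 + math.exp(-z))


def relu6(z):
    # Clip the argument to the interval [0, 6].
    return min(max(z, 0.0), 6.0)


def basis(x, location, scaling_parameter, activation="sigmoid", multiplier=None):
    """Hypothetical helper: one basis function applied to one scalar input."""
    if multiplier is None:
        # Without the exp transform the caller must supply a positive scale.
        scale = scaling_parameter
    else:
        # exp(...) is strictly positive for any real scaling_parameter.
        scale = math.exp(scaling_parameter * multiplier)
    z = scale * (x - location)
    if activation == "sigmoid":
        return sigmoid(z)
    if activation == "relu6":
        return relu6(z) / 6.0
    raise ValueError(f"unsupported activation: {activation}")


# At x == location, sigmoid is at its midpoint and relu6 is just leaving zero.
mid_sigmoid = basis(1.0, location=1.0, scaling_parameter=2.0, activation="sigmoid")
start_relu6 = basis(1.0, location=1.0, scaling_parameter=2.0, activation="relu6")

# A negative free-form parameter still yields a positive scale via exp.
low = basis(-1.0, location=0.0, scaling_parameter=-3.0, multiplier=1.0)
high = basis(1.0, location=0.0, scaling_parameter=-3.0, multiplier=1.0)
```

Because the exp transform makes the scale positive, low is smaller than high even though the raw scaling parameter is negative, which is the monotonicity property described above.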

An optional reduction operation computes the additive / multiplicative average over the input dims after their individual CDF transformations. mean and geometric_mean are supported if specified.
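
To make the difference between the two averages concrete, the sketch below reduces a list of per-dimension CDF values for one example. It assumes the values have already been transformed and that every input dim takes part in the average; reduce_cdf is a hypothetical helper, not part of the library.

```python
import math


def reduce_cdf(values, reduction="mean"):
    """Hypothetical helper: average already-transformed per-dimension values."""
    if reduction == "mean":
        # Additive average.
        return sum(values) / len(values)
    if reduction == "geometric_mean":
        # Multiplicative average, computed as exp(mean(log(v))).
        # Every value must be strictly positive for the log to be defined.
        return math.exp(sum(math.log(v) for v in values) / len(values))
    if reduction == "none":
        # No averaging: keep one value per input dim.
        return list(values)
    raise ValueError(f"unsupported reduction: {reduction}")


per_dim = [0.9, 0.4, 0.1]
additive = reduce_cdf(per_dim, "mean")
multiplicative = reduce_cdf(per_dim, "geometric_mean")
```

The geometric mean is never larger than the arithmetic mean, so a single small per-dimension value pulls the multiplicative average down more strongly than the additive one.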

sparsity_factor decides the level of sparsity during reduction. For instance, the default sparsity_factor = 1 averages all input dims, whereas sparsity_factor = 2 averages every other input dim, and so on.

We denote num_functions as the number of sigmoid or relu6 / 6 basis functions used for each CDF transformation.

inputs should be:

  • (batch_size, input_dim).

location_parameters should be:

  • (batch_size, input_dim, num_functions, units // sparsity_factor).

scaling_parameters, when provided, should be broadcast-compatible with location_parameters, e.g. one of:

  • (batch_size, input_dim, 1, 1),
  • (batch_size, input_dim, num_functions, 1),
  • (batch_size, input_dim, 1, units // sparsity_factor),
  • (batch_size, input_dim, num_functions, units // sparsity_factor).
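
The shape requirement above can be checked ahead of time with a small pure-Python helper. This is a sketch under the assumption that "broadcast friendly" means the usual right-aligned broadcasting rule, where each dimension either matches or equals 1; broadcasts_to and the concrete sizes are hypothetical and chosen only for illustration.

```python
def broadcasts_to(shape, target):
    """Hypothetical helper: True if `shape` can be broadcast to `target`."""
    if len(shape) > len(target):
        return False
    # Right-align the shapes by padding the shorter one with leading 1s.
    padded = (1,) * (len(target) - len(shape)) + tuple(shape)
    # Each dimension must either match the target or be 1.
    return all(s in (1, t) for s, t in zip(padded, target))


# Illustrative sizes; input_dim and units are both divisible by sparsity_factor.
batch_size, input_dim, num_functions, units, sparsity_factor = 8, 4, 5, 6, 2
location_shape = (batch_size, input_dim, num_functions, units // sparsity_factor)

candidate_scaling_shapes = [
    (batch_size, input_dim, 1, 1),
    (batch_size, input_dim, num_functions, 1),
    (batch_size, input_dim, 1, units // sparsity_factor),
    (batch_size, input_dim, num_functions, units // sparsity_factor),
]
all_ok = all(broadcasts_to(s, location_shape) for s in candidate_scaling_shapes)

# A mismatched num_functions axis is rejected.
bad_ok = broadcasts_to((batch_size, input_dim, num_functions + 1, 1), location_shape)
```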

Args:

  • inputs: Inputs to the CDF function.
  • location_parameters: Parameters for deciding the locations of the transformations.
  • scaling_parameters: Parameters for deciding the horizontal scaling of the transformations.
  • units: Output dimension.
  • activation: Either sigmoid or relu6, for selecting the transformation.
  • reduction: Either mean, geometric_mean, or none, to specify whether to perform averaging and which average to perform.
  • sparsity_factor: Decides the level of sparsity during reduction. input_dim and units should both be divisible by sparsity_factor.
  • scaling_exp_transform_multiplier: If provided, will be used inside an exponential transformation for scaling_parameters. This can be useful if scaling_parameters is free-form.
  • return_derived_parameters: Whether location_parameters and scaling_parameters should be output along with the model output (e.g. for loss function computation purposes).

If return_derived_parameters = False:

  • The CDF transformed outputs as a tensor with shape either (batch_size, units) if reduction = 'mean' / 'geometric_mean', or (batch_size, input_dim // sparsity_factor, units) if reduction = 'none'.
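
The output shape rule above, together with the divisibility requirement on sparsity_factor, can be written down as a small shape calculator. This is a sketch for planning downstream layers; cdf_output_shape is a hypothetical helper and does not call the library.

```python
def cdf_output_shape(batch_size, input_dim, units, reduction="mean", sparsity_factor=1):
    """Hypothetical helper: expected output shape for the documented reductions."""
    # Both input_dim and units should be divisible by sparsity_factor.
    if input_dim % sparsity_factor != 0 or units % sparsity_factor != 0:
        raise ValueError("input_dim and units must be divisible by sparsity_factor")
    if reduction in ("mean", "geometric_mean"):
        # Averaging removes the input dimension axis.
        return (batch_size, units)
    if reduction == "none":
        # No averaging: one slice per retained input dim.
        return (batch_size, input_dim // sparsity_factor, units)
    raise ValueError(f"unsupported reduction: {reduction}")


reduced_shape = cdf_output_shape(32, input_dim=4, units=6, reduction="mean")
unreduced_shape = cdf_output_shape(
    32, input_dim=4, units=6, reduction="none", sparsity_factor=2
)
```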

If return_derived_parameters = True:

  • A tuple of three elements:

    1. The CDF transformed outputs.
    2. location_parameters.
    3. scaling_parameters, with exp transformation applied if specified.