model_remediation.min_diff.losses.MinDiffLoss

MinDiffLoss abstract base class.

model_remediation.min_diff.losses.MinDiffLoss(
    membership_transform=None,
    predictions_transform=None,
    membership_kernel=None,
    predictions_kernel=None,
    name: Optional[str] = None,
    enable_summary_histogram: Optional[bool] = True
)

Inherits from: tf.keras.losses.Loss

Arguments
`membership_transform`	Transform function used on `membership`. If `None` is passed in then `membership` is left as is. The function must return a `tf.Tensor`.
`predictions_transform`	Transform function used on `predictions`. If `None` is passed in then `predictions` is left as is. The function must return a `tf.Tensor`.
`membership_kernel`	String (name of kernel) or `min_diff.losses.MinDiffKernel` to be applied on `membership`. If `None` is passed in, then `membership` is left untouched when applying kernels.
`predictions_kernel`	String (name of kernel) or `min_diff.losses.MinDiffKernel` to be applied on `predictions`. If `None` is passed in, then `predictions` is left untouched when applying kernels.
`name`	Name used for logging and tracking.
`enable_summary_histogram`	Boolean indicating if histogram summary should be written. Defaults to `True`.

To be implemented by subclasses:

call(): Contains the logic for loss calculation using membership, predictions and optionally sample_weight.

Example subclass implementation:

class MyMinDiffLoss(MinDiffLoss):

  def call(self, membership, predictions, sample_weight=None):
    loss = ...  # Internal logic to calculate loss.
    return loss

A MinDiffLoss instance measures the difference in prediction scores (typically score distributions) between two groups of examples identified by the value in the membership column.

If the predictions of the two groups are indistinguishable, the loss should be 0. The more the two groups' scores differ, the higher the loss.
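The behavior described above can be illustrated with a toy loss written in plain NumPy. This is only a sketch under stated assumptions: `toy_min_diff_loss` is a hypothetical function, not one of the library's losses, and it compares only the weighted mean prediction of each group rather than the full score distributions. It also assumes binary membership values and that both groups are non-empty.

```python
import numpy as np


def toy_min_diff_loss(membership, predictions, sample_weight=None):
    """Hypothetical loss: absolute gap between the groups' weighted mean predictions."""
    membership = np.asarray(membership, dtype=np.float32).reshape(-1)
    predictions = np.asarray(predictions, dtype=np.float32).reshape(-1)
    if sample_weight is None:
        sample_weight = np.ones_like(predictions)
    weights = np.asarray(sample_weight, dtype=np.float32).reshape(-1)

    # Split the weights by group (assumes membership is 0 or 1 and that
    # each group has at least one example with nonzero weight).
    in_group = weights * membership
    out_group = weights * (1.0 - membership)
    mean_in = np.sum(in_group * predictions) / np.sum(in_group)
    mean_out = np.sum(out_group * predictions) / np.sum(out_group)
    return float(abs(mean_in - mean_out))


membership = [1, 1, 0, 0]
# Both groups receive the same scores, so the loss is (numerically) zero.
same = toy_min_diff_loss(membership, [0.2, 0.8, 0.2, 0.8])
# The groups receive clearly different scores, so the loss is larger.
different = toy_min_diff_loss(membership, [0.9, 0.9, 0.1, 0.1])
```

A mean-gap measure like this cannot tell apart two groups whose scores share a mean but differ in spread, which is one reason a real implementation would compare distributions instead.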

Raises
`ValueError`	If a `*_transform` parameter is passed in but is not callable.
`ValueError`	If a `*_kernel` parameter has an unrecognized type or value.

Methods

`__call__`


__call__(
    membership: types.TensorType,
    predictions: types.TensorType,
    sample_weight: Optional[types.TensorType] = None
)

Invokes the MinDiffLoss instance.

Args
`membership`	Labels indicating whether examples are part of the sensitive group. Shape must be `[batch_size, d0, .. dN]`.
`predictions`	Predicted values. Must be the same shape as `membership`.
`sample_weight`	(Optional) Acts as a coefficient for the loss. Must be of shape `[batch_size]` or `[batch_size, 1]`. If `None`, a tensor of ones with the appropriate shape is used.

Returns
Scalar `min_diff_loss`.

`_apply_kernels`


_apply_kernels(
    membership: types.TensorType, predictions: types.TensorType
) -> Tuple[types.TensorType, types.TensorType]

Applies losses.MinDiffKernel attributes to corresponding inputs.

Arguments
`membership`	`membership` column as described in `MinDiffLoss.call`.
`predictions`	`predictions` tensor as described in `MinDiffLoss.call`.

In particular, self.membership_kernel, if not None, will be applied to membership and self.predictions_kernel, if not None, will be applied to predictions.

loss = ...  # MinDiffLoss subclass instance
loss.membership_kernel = min_diff.losses.GaussKernel()
loss.predictions_kernel = min_diff.losses.LaplaceKernel()

# Call with test inputs.
loss._apply_kernels([1, 2, 3], [4, 5, 6])  # (GaussKernel([1, 2, 3]),
                                           #  LaplaceKernel([4, 5, 6]))

If self.*_kernel is None, then the corresponding input is returned unchanged.

loss = ...  # MinDiffLoss subclass instance
loss.membership_kernel = None
loss.predictions_kernel = min_diff.losses.GaussKernel()
# Call with test inputs.
loss._apply_kernels([1, 2, 3], [4, 5, 6])  # ([1, 2, 3], GaussKernel([4, 5, 6]))

# With both kernels set to None, _apply_kernels is the identity.
loss.predictions_kernel = None
# Call with test inputs.
loss._apply_kernels([1, 2, 3], [4, 5, 6])  # ([1, 2, 3], [4, 5, 6])

This method is meant for internal use by subclasses when the instance is invoked.
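The "apply only if not None" rule can be sketched without the library. In the sketch below, `apply_kernels` and `toy_kernel` are hypothetical names introduced for illustration. The toy kernel is a simplified one-argument callable and is not a `MinDiffKernel`; it exists only to make the effect visible.

```python
import numpy as np


def apply_kernels(membership, predictions,
                  membership_kernel=None, predictions_kernel=None):
    """Hypothetical stand-in: apply each kernel only when it is not None."""
    if membership_kernel is not None:
        membership = membership_kernel(membership)
    if predictions_kernel is not None:
        predictions = predictions_kernel(predictions)
    return membership, predictions


def toy_kernel(x):
    # Placeholder for a kernel: doubles its input so the change is easy to see.
    return 2 * np.asarray(x)


# Only the predictions kernel is set, so membership passes through unchanged.
m, p = apply_kernels([1, 2, 3], [4, 5, 6], predictions_kernel=toy_kernel)

# With both kernels left as None, the function is the identity.
m_id, p_id = apply_kernels([1, 2, 3], [4, 5, 6])
```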

Returns
Tuple of (`membership_kernel_output`, `predictions_kernel_output`).

`_preprocess_inputs`


_preprocess_inputs(
    membership: types.TensorType,
    predictions: types.TensorType,
    sample_weight: Optional[types.TensorType] = None
) -> Tuple[types.TensorType, types.TensorType, types.TensorType]

Preprocesses inputs by applying transforms and normalizing weights.

Arguments
`membership`	`membership` column as described in `MinDiffLoss.call`.
`predictions`	`predictions` tensor as described in `MinDiffLoss.call`.
`sample_weight`	`sample_weight` tensor as described in `MinDiffLoss.call`.

The three inputs are processed in the following way:

membership: has self.membership_transform applied to it, if it's present, and is cast to tf.float32.
predictions: has self.predictions_transform applied to it, if it's present, and is cast to tf.float32.
sample_weight: is validated, cast to tf.float32, and normalized. If it is None, it is set to a normalized tensor of equal weights.

This method is meant for internal use by subclasses when the instance is invoked.
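The three processing steps can be approximated in NumPy as follows. This is a sketch, not the library's code: `preprocess_inputs` is a hypothetical function, normalization is assumed to mean dividing the weights by their sum, and a `ValueError` stands in for `tf.errors.InvalidArgumentError`.

```python
import numpy as np


def preprocess_inputs(membership, predictions, sample_weight=None,
                      membership_transform=None, predictions_transform=None):
    """Hypothetical NumPy analogue of the preprocessing described above."""
    # Apply the optional transforms, then cast both inputs to float32.
    if membership_transform is not None:
        membership = membership_transform(membership)
    if predictions_transform is not None:
        predictions = predictions_transform(predictions)
    membership = np.asarray(membership, dtype=np.float32)
    predictions = np.asarray(predictions, dtype=np.float32)

    # Default to equal weights, one per example in the batch.
    if sample_weight is None:
        sample_weight = np.ones((membership.shape[0], 1), dtype=np.float32)
    weights = np.asarray(sample_weight, dtype=np.float32).reshape(-1, 1)

    # Validate: negative weights are rejected.
    if np.any(weights < 0):
        raise ValueError("sample_weight must not contain negative entries")

    # Assumption: normalize so the weights sum to 1; all-zero weights stay zero.
    total = weights.sum()
    normed_weights = weights / total if total > 0 else weights
    return membership, predictions, normed_weights


def sigmoid(x):
    # Example transform (an assumption): map raw scores into (0, 1).
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=np.float32)))


m, p, w = preprocess_inputs(
    [[1], [0], [1], [0]],
    [[0.0], [2.0], [-2.0], [0.5]],
    predictions_transform=sigmoid,
)
```

With no `sample_weight` given, each of the four examples ends up with a normalized weight of 0.25.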

Returns
Tuple of (`membership`, `predictions`, `normed_weights`).

Raises
`tf.errors.InvalidArgumentError`	If `sample_weight` has negative entries.
