Privacy in Machine Learning

An important aspect of responsible AI usage is ensuring that ML models do not expose potentially sensitive information, such as demographic information or other attributes in the training dataset that could be used to identify people. One way to achieve this is to train with differentially private stochastic gradient descent (DP-SGD), a modification of the standard stochastic gradient descent (SGD) algorithm in machine learning.

Models trained with DP-SGD provide measurable differential privacy (DP) guarantees, which help mitigate the risk of exposing sensitive training data. Because the purpose of DP is to help prevent individual data points from being identified, a model trained with DP should not be significantly affected by any single example in its training dataset. DP-SGD techniques can also be used in federated learning to provide user-level differential privacy. You can learn more about differentially private deep learning in the original paper.
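
For reference, the "measurable" privacy mentioned above is usually quantified as (ε, δ)-differential privacy. The statement below is the standard textbook definition, not something specific to TF Privacy; the symbols M (the randomized training procedure), D and D' (two datasets that differ in a single example), and S (any set of possible outputs) are notation introduced here for illustration.

```latex
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \, \Pr[\,M(D') \in S\,] + \delta
```

Smaller values of ε and δ mean that adding or removing any one example changes the distribution of trained models less, which is the sense in which a single training example should have little effect on the model.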

import tensorflow as tf
import tensorflow_privacy

# Select your differentially private optimizer.
# l2_norm_clip, noise_multiplier, num_microbatches, and learning_rate are
# hyperparameters that you define beforehand.
optimizer = tensorflow_privacy.DPKerasSGDOptimizer(
    l2_norm_clip=l2_norm_clip,
    noise_multiplier=noise_multiplier,
    num_microbatches=num_microbatches,
    learning_rate=learning_rate)

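# Select your loss function. Reduction.NONE keeps one loss value per example,
# which the DP optimizer needs in order to clip gradients per microbatch.
loss = tf.keras.losses.CategoricalCrossentropy(
    from_logits=True, reduction=tf.losses.Reduction.NONE)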

# Compile your model
model.compile(optimizer=optimizer, loss=loss, metrics=['accuracy'])

# Fit your model
model.fit(train_data, train_labels,
  epochs=epochs,
  validation_data=(test_data, test_labels),
  batch_size=batch_size)
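
To make the roles of l2_norm_clip and noise_multiplier more concrete, here is a minimal pure-Python sketch of a single DP-SGD update for a small linear model with squared-error loss. It is an illustration of the general technique under simplifying assumptions, not the TF Privacy implementation: the function names clip_by_l2_norm and dp_sgd_step are hypothetical, and gradients are clipped per example, which corresponds to setting num_microbatches equal to the batch size.

```python
import math
import random


def clip_by_l2_norm(grad, l2_norm_clip):
    """Scale grad down so that its L2 norm is at most l2_norm_clip."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, l2_norm_clip / norm) if norm > 0.0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(weights, examples, l2_norm_clip, noise_multiplier,
                learning_rate, rng):
    """One DP-SGD update for a linear model with squared-error loss."""
    summed = [0.0] * len(weights)
    for features, label in examples:
        # Per-example gradient of 0.5 * (w . x - y)^2 with respect to w.
        error = sum(w * x for w, x in zip(weights, features)) - label
        grad = [error * x for x in features]
        # Clipping bounds how far any single example can move the weights.
        clipped = clip_by_l2_norm(grad, l2_norm_clip)
        summed = [s + c for s, c in zip(summed, clipped)]
    # Gaussian noise proportional to the clipping bound is added to the sum
    # of clipped gradients before averaging over the batch.
    stddev = noise_multiplier * l2_norm_clip
    noisy_mean = [(s + rng.gauss(0.0, stddev)) / len(examples) for s in summed]
    return [w - learning_rate * g for w, g in zip(weights, noisy_mean)]


# Hypothetical toy batch: the third example is an outlier whose raw gradient
# would dominate a plain SGD step; clipping limits its influence.
rng = random.Random(0)
batch = [([1.0, 2.0], 3.0), ([0.5, -1.0], 0.0), ([100.0, 100.0], -500.0)]
new_weights = dp_sgd_step([0.0, 0.0], batch, l2_norm_clip=1.0,
                          noise_multiplier=1.1, learning_rate=0.1, rng=rng)
```

In this sketch, a larger noise_multiplier gives stronger privacy at the cost of noisier updates, while a smaller l2_norm_clip limits each example's influence more tightly but can slow learning.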
  

TensorFlow Privacy

TensorFlow Privacy (TF Privacy) is an open source library developed by teams in Google Research. The library includes implementations of commonly used TensorFlow optimizers for training ML models with DP. The goal is to enable ML practitioners who use standard TensorFlow APIs to train privacy-preserving models by changing only a few lines of code.

The differentially private optimizers can be used in conjunction with high-level APIs that use the Optimizer class, especially Keras. Additionally, you can find differentially private implementations of some Keras models. All of the Optimizers and models can be found in the API Documentation.