tf_privacy.TreeResidualSumQuery

Implements DPQuery for adding correlated noise through a tree structure.

Inherits From: SumAggregationDPQuery, DPQuery

Clips and sums the records in the current sample, x_i = sum_{j=0}^{n-1} x_{i,j}, and returns that sum plus the noise residual from tree aggregation. The returned value is conceptually equivalent to the following procedure: compute the cumulative sum of samples over time, s_i = sum_{k=0}^{i} x_k (instead of only the current sample), and add noise to it with the tree aggregation protocol, whose noise is proportional to log(T), where T is the number of times the query is called; then, on each call, return the residual between the current noised cumulative sum noised(s_i) and the previous one noised(s_{i-1}).
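The conceptual procedure described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `prefix_nodes`, `TreeNoise`, and `noised_residuals` are hypothetical names, records are scalars, and the tree is a simple binary tree in which each node draws independent Gaussian noise the first time it is needed.

```python
import random


def prefix_nodes(t):
    """Returns (level, index) tree nodes whose leaf ranges tile [0, t)."""
    nodes, start = [], 0
    for level in reversed(range(t.bit_length())):
        size = 1 << level
        if t & size:
            nodes.append((level, start // size))
            start += size
    return nodes


class TreeNoise:
    """Hypothetical helper: one cached Gaussian draw per tree node."""

    def __init__(self, stddev, seed=0):
        self._rng = random.Random(seed)
        self._stddev = stddev
        self._cache = {}

    def prefix_noise(self, t):
        """Noise attached to the cumulative sum of the first t samples."""
        total = 0.0
        for node in prefix_nodes(t):
            if node not in self._cache:
                self._cache[node] = self._rng.gauss(0.0, self._stddev)
            total += self._cache[node]
        return total


def noised_residuals(samples, stddev, seed=0):
    """Returns noised(s_i) - noised(s_{i-1}) for each sample sum x_i."""
    tree = TreeNoise(stddev, seed)
    residuals, cumsum, prev_noised = [], 0.0, 0.0
    for i, x in enumerate(samples):
        cumsum += x                                   # s_i
        noised = cumsum + tree.prefix_noise(i + 1)    # noised(s_i)
        residuals.append(noised - prev_noised)
        prev_noised = noised
    return residuals
```

Summing the first t residuals telescopes back to noised(s_t), so the cumulative sum carries the noise of only the few tree nodes that cover the first t samples.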

This can be used as a drop-in replacement for GaussianSumQuery, and can offer stronger utility/privacy tradeoffs when amplification-via-sampling is not possible, or when the privacy epsilon is relatively large. This may result in more noise by a log(T) factor in each individual estimate of x_i, but if the x_i are used in the underlying code to compute cumulative sums, the noise in those sums can be less. That is, this allows us to adapt code that was written to use a regular SumQuery to benefit from the tree aggregation protocol.
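To see why cumulative sums can end up with less noise, it helps to count how many independent noise terms enter a t-step prefix sum. The sketch below assumes a plain binary tree, where a prefix of length t is covered by one node per set bit of t. The function names are hypothetical, and the per-term standard deviation is left as an input because calibrating it to a privacy target is outside this sketch.

```python
import math


def noise_terms_independent(t):
    """Independent per-step noise: every step adds one term to the sum."""
    return t


def noise_terms_tree(t):
    """Binary tree aggregation: one node per set bit in t covers [0, t)."""
    return bin(t).count("1")


def cumsum_noise_std(num_terms, per_term_std):
    """Standard deviation of a sum of independent, equally scaled terms."""
    return per_term_std * math.sqrt(num_terms)


# After 1000 steps, a prefix sum contains 1000 independent terms in the
# first case but only 6 node terms in the tree case (1000 = 0b1111101000).
print(noise_terms_independent(1000), noise_terms_tree(1000))
```

The number of tree terms never exceeds the bit length of t, which is where the log(T) dependence comes from.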

Combining this query with an SGD optimizer implements the DP-FTRL algorithm described in "Practical and Private (Deep) Learning without Sampling or Shuffling".

Example usage:

    query = TreeResidualSumQuery(...)
    global_state = query.initial_global_state()
    params = query.derive_sample_params(global_state)
    for i, samples in enumerate(streaming_samples):
        sample_state = query.initial_sample_state(samples[0])
        # Compute x_i = sum_{j=0}^{n-1} x_{i,j}
        for j, sample in enumerate(samples):
            sample_state = query.accumulate_record(params, sample_state, sample)
        # noised_sum is a privatized estimate of x_i, obtained by conceptually
        # postprocessing the noised cumulative sum s_i
        noised_sum, global_state, event = query.get_noised_result(
            sample_state, global_state)

Args
record_specs A nested structure of tf.TensorSpecs specifying structure and shapes of records.
noise_generator tree_aggregation.ValueGenerator to generate the noise value for a tree node. Should be coupled with clipping norm to guarantee privacy.
clip_fn Callable that specifies clipping function. Input to clip is a flat list of vars in a record.
clip_value Float indicating the value at which to clip the record.
use_efficient Boolean indicating the usage of the efficient tree aggregation algorithm based on the paper "Efficient Use of Differentially Private Binary Trees".

Attributes
clip_fn Callable that specifies clipping function. clip_fn receives two arguments: a flat list of vars in a record and a clip_value to clip the corresponding record, e.g. clip_fn(flat_record, clip_value).
clip_value float indicating the value at which to clip the record.
record_specs A nested structure of tf.TensorSpecs specifying structure and shapes of records.
tree_aggregator tree_aggregation.TreeAggregator initialized with user defined noise_generator. noise_generator is a tree_aggregation.ValueGenerator to generate the noise value for a tree node. Noise standard deviation is specified outside the dp_query by the user when defining noise_fn and should have order O(clip_norm*log(T)/eps) to guarantee eps-DP.
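As an illustration of the clip_fn(flat_record, clip_value) calling convention, here is a minimal pure-Python sketch of joint L2 clipping over a flat list of vectors. The name `l2_clip_fn` is hypothetical. Returning a (clipped, norm) pair mirrors what tf.clip_by_global_norm returns, but whether the query expects that exact return shape is an assumption here.

```python
import math


def l2_clip_fn(flat_record, clip_value):
    """Scales all vectors jointly so their global L2 norm is <= clip_value."""
    global_norm = math.sqrt(sum(v * v for vec in flat_record for v in vec))
    # Records that are already within the bound are left unchanged.
    scale = min(1.0, clip_value / global_norm) if global_norm > 0 else 1.0
    clipped = [[v * scale for v in vec] for vec in flat_record]
    return clipped, global_norm


# A record with global norm 5 is scaled down to norm 1.
clipped, norm = l2_clip_fn([[3.0], [4.0]], clip_value=1.0)
```

The norm is computed over the whole record rather than per tensor, so the bound applies to the record as a single unit.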

Child Classes

class GlobalState

Methods

accumulate_preprocessed_record

Implements tensorflow_privacy.DPQuery.accumulate_preprocessed_record.

accumulate_record

Accumulates a single record into the sample state.

This is a helper method that simply delegates to preprocess_record and accumulate_preprocessed_record for the common case when both of those functions run on a single device. Typically this will be a simple sum.

Args
params The parameters for the sample. In standard DP-SGD training, the clipping norm for the sample's microbatch gradients (i.e., a maximum norm magnitude to which each gradient is clipped).
sample_state The current sample state. In standard DP-SGD training, the accumulated sum of previous clipped microbatch gradients.
record The record to accumulate. In standard DP-SGD training, the gradient computed for the examples in one microbatch, which may be the gradient for just one example (for size 1 microbatches).

Returns
The updated sample state. In standard DP-SGD training, the set of previous microbatch gradients with the addition of the record argument.
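The delegation described for accumulate_record can be sketched with plain lists standing in for tensors. This is an illustrative stand-in only: the functions below are free functions rather than methods, and the preprocessing step uses a coordinate-wise clamp for brevity instead of the query's configured clip_fn.

```python
def preprocess_record(params, record):
    """Stand-in preprocessing: clamps each coordinate to [-params, params]."""
    return [max(-params, min(params, v)) for v in record]


def accumulate_preprocessed_record(sample_state, preprocessed_record):
    """Adds an already preprocessed record to the running sum."""
    return [s + r for s, r in zip(sample_state, preprocessed_record)]


def accumulate_record(params, sample_state, record):
    """Preprocesses the record, then accumulates it, in a single call."""
    preprocessed = preprocess_record(params, record)
    return accumulate_preprocessed_record(sample_state, preprocessed)


state = [0.0, 0.0]
for record in [[0.5, 2.0], [-3.0, 0.25]]:
    state = accumulate_record(1.0, state, record)
# state now holds the sum of the clamped records.
```

Keeping the two steps separate lets preprocessing run where the record lives and accumulation run elsewhere, while accumulate_record covers the single-device case.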

build_l2_gaussian_query

Returns TreeResidualSumQuery with L2 norm clipping and Gaussian noise.

Args
clip_norm Each record will be clipped so that it has L2 norm at most clip_norm.
noise_multiplier The effective noise multiplier for the sum of records. Noise standard deviation is clip_norm*noise_multiplier.
record_specs A nested structure of tf.TensorSpecs specifying structure and shapes of records.
noise_seed Integer seed for the Gaussian noise generator. If None, a nondeterministic seed based on system time will be generated.
use_efficient Boolean indicating the usage of the efficient tree aggregation algorithm based on the paper "Efficient Use of Differentially Private Binary Trees".
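The relationship between clip_norm, noise_multiplier, and noise_seed can be illustrated with a small stand-in generator. `SeededGaussianNoise` is a hypothetical class, not the library's tree_aggregation.ValueGenerator, and it draws scalars rather than tensors.

```python
import random


class SeededGaussianNoise:
    """Hypothetical per-node noise source with stddev clip_norm * multiplier."""

    def __init__(self, clip_norm, noise_multiplier, seed=None):
        self.stddev = clip_norm * noise_multiplier
        # seed=None lets Python choose a nondeterministic seed.
        self._rng = random.Random(seed)

    def next(self):
        """Draws one Gaussian noise value for a tree node."""
        return self._rng.gauss(0.0, self.stddev)


first = SeededGaussianNoise(clip_norm=2.0, noise_multiplier=0.5, seed=7)
second = SeededGaussianNoise(clip_norm=2.0, noise_multiplier=0.5, seed=7)
# Both generators have stddev 1.0 and, sharing a seed, yield the same draws.
draws_match = [first.next() for _ in range(3)] == [second.next() for _ in range(3)]
```

A fixed seed is useful for reproducible tests. For actual private training the noise should not be predictable, so this sketch says nothing about how to choose a seed in production.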

derive_metrics

Derives metric information from the current global state.

Any metrics returned should be derived only from privatized quantities.

Args
global_state The global state from which to derive metrics.

Returns
A collections.OrderedDict mapping string metric names to tensor values.

derive_sample_params

Implements tensorflow_privacy.DPQuery.derive_sample_params.

get_noised_result

Implements tensorflow_privacy.DPQuery.get_noised_result.

Updates tree state, and returns residual of noised cumulative sum.

Args
sample_state Sum of clipped records for this round.
global_state Global state with the current sample's cumulative sum and tree state.

Returns
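A tuple of (noised_residual, new_global_state, event), where noised_residual is the residual of the noised cumulative sum for this round. This matches the three values unpacked from get_noised_result in the example usage.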
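One way to picture the state update is the sketch below, which keeps only the previous round's tree noise in the global state and returns the sample plus the change in tree noise. Everything here is a simplified assumption: `ToyGlobalState` and `tree_noise_at` are hypothetical, records are scalars, and the event value is omitted.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyGlobalState:
    """Hypothetical state: round counter plus last round's tree noise."""
    step: int = 0
    previous_tree_noise: float = 0.0


def get_noised_result(sample_state, global_state, tree_noise_at):
    """Returns the sample sum plus the change in cumulative tree noise."""
    step = global_state.step + 1
    tree_noise = tree_noise_at(step)  # noise on the cumulative sum s_step
    noised = sample_state + tree_noise - global_state.previous_tree_noise
    new_state = replace(global_state, step=step, previous_tree_noise=tree_noise)
    return noised, new_state


state = ToyGlobalState()
outputs = []
for x in [1.0, 2.0, 3.0]:
    # A deterministic stand-in for the tree noise keeps the example checkable.
    out, state = get_noised_result(x, state, lambda t: 0.25 * t)
    outputs.append(out)
```

Because each round subtracts the previous round's noise, the outputs sum to the true cumulative sum plus only the latest tree noise.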

initial_global_state

Implements tensorflow_privacy.DPQuery.initial_global_state.

initial_sample_state

Implements tensorflow_privacy.DPQuery.initial_sample_state.

merge_sample_states

Implements tensorflow_privacy.DPQuery.merge_sample_states.

preprocess_record

Implements tensorflow_privacy.DPQuery.preprocess_record.

Args
params clip_value for the record.
record The record to be processed.

Returns
Structure of clipped tensors.

reset_state

Returns state after resetting the tree.

This function will be used in restart_query.RestartQuery after calling get_noised_result when the restarting condition is met.

Args
noised_results Noised cumulative sum returned by get_noised_result.
global_state Updated global state returned by get_noised_result, which records noise for the conceptual cumulative sum of the current leaf node, and tree state for the next conceptual cumulative sum.

Returns
New global state with zero noise and restarted tree state.
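The restart flow can be sketched as a loop that resets the state whenever a restart condition is met. This is a toy model under stated assumptions: `ToyGlobalState` and `run_with_restarts` are hypothetical, the restart condition is a fixed number of rounds, and the tree noise is replaced by a deterministic placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyGlobalState:
    """Hypothetical state: position in the tree plus last round's noise."""
    step: int = 0
    previous_tree_noise: float = 0.0


def reset_state(noised_results, global_state):
    """Returns a fresh state: zero noise and a restarted tree position."""
    del noised_results, global_state  # unused in this sketch
    return ToyGlobalState()


def run_with_restarts(num_rounds, restart_every):
    """Advances the toy state each round, resetting every restart_every."""
    state = ToyGlobalState()
    steps_seen = []
    for _ in range(num_rounds):
        step = state.step + 1
        # Placeholder for the state update done by get_noised_result.
        state = ToyGlobalState(step=step, previous_tree_noise=0.1 * step)
        steps_seen.append(step)
        if step == restart_every:
            state = reset_state(None, state)
    return steps_seen, state
```

For example, run_with_restarts(5, 2) visits tree positions 1, 2, 1, 2, 1, so the tree never grows beyond the restart period.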