Implements DPQuery for adding correlated noise through tree structure.
Inherits From: SumAggregationDPQuery, DPQuery
tf_privacy.TreeResidualSumQuery(
record_specs, noise_generator, clip_fn, clip_value, use_efficient=True
)
Clips and sums the records in the current sample, x_i = sum_{j=0}^{n-1} x_{i,j}, and returns the current sample plus the noise residual from tree aggregation. The returned value is conceptually equivalent to the following: compute the cumulative sum of samples over time, s_i = sum_{k=0}^{i} x_k (instead of only the current sample), with noise added by the tree aggregation protocol that is proportional to log(T), T being the number of times the query is called; then return the residual between the current noised cumulative sum noised(s_i) and the previous one noised(s_{i-1}) each time the query is called.
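The residual construction described above can be sketched in plain Python. This is an illustrative sketch only, not the library's implementation: the helper names tree_noise_for_prefix and noised_residuals are hypothetical, records are scalars, and it uses the basic binary-tree mechanism (one Gaussian noise value per tree node) rather than the efficient variant.

```python
import random


def tree_noise_for_prefix(t, node_noise, rng, stddev):
    """Returns the total noise added to the sum of the first t samples.

    The prefix [0, t) is covered by one tree node per set bit of t.
    Each node's noise is drawn once and reused (hypothetical helper).
    """
    total = 0.0
    start = 0
    for level in reversed(range(t.bit_length())):
        width = 1 << level
        if t & width:
            key = (level, start // width)  # (height, index at that height)
            if key not in node_noise:
                node_noise[key] = rng.gauss(0.0, stddev)
            total += node_noise[key]
            start += width
    return total


def noised_residuals(samples, stddev, seed=0):
    """Returns noised(s_i) - noised(s_{i-1}) for each scalar sample x_i."""
    rng = random.Random(seed)
    node_noise = {}
    residuals = []
    prev_noise = 0.0  # noise in the (empty) cumulative sum before step 0
    for i, x in enumerate(samples):
        cur_noise = tree_noise_for_prefix(i + 1, node_noise, rng, stddev)
        # (s_i + cur_noise) - (s_{i-1} + prev_noise) = x_i + cur_noise - prev_noise
        residuals.append(x + cur_noise - prev_noise)
        prev_noise = cur_noise
    return residuals
```

Summing the first i+1 residuals telescopes back to the noised cumulative sum noised(s_i), which is why downstream code that accumulates the returned values sees tree-aggregation noise rather than independently accumulated noise.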
This can be used as a drop-in replacement for GaussianSumQuery, and can
offer stronger utility/privacy tradeoffs when amplification-via-sampling is not
possible, or when privacy epsilon is relatively large. This may result in
more noise by a log(T) factor in each individual estimate of x_i, but if the
x_i are used in the underlying code to compute cumulative sums, the noise in
those sums can be less. That is, this allows us to adapt code that was written
to use a regular SumQuery to benefit from the tree aggregation protocol.
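To make the cumulative-sum argument concrete, the sketch below counts how many independent noise terms end up in a cumulative sum of t samples. It is a simplified illustration with a hypothetical helper name: it assumes the basic binary-tree mechanism, and it deliberately ignores the per-node noise scale, which, as noted above, can be larger by a log(T) factor under tree aggregation.

```python
def noise_terms_in_cumsum(t):
    """Counts noise terms in the cumulative sum of the first t samples.

    Returns (independent, tree): with independent per-sample noise every
    sample contributes one term; with a binary tree, one node per set bit
    of t covers the whole prefix.
    """
    independent = t
    tree = bin(t).count("1")
    return independent, tree


independent, tree = noise_terms_in_cumsum(1000)
print(independent, tree)
```

The number of tree nodes is at most the bit length of t, so it grows logarithmically while the independent count grows linearly.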
Combining this query with an SGD optimizer can be used to implement the DP-FTRL algorithm in "Practical and Private (Deep) Learning without Sampling or Shuffling".
Example usage:
query = TreeResidualSumQuery(...)
global_state = query.initial_global_state()
params = query.derive_sample_params(global_state)
for i, samples in enumerate(streaming_samples):
  sample_state = query.initial_sample_state(samples[0])
  # Compute x_i = sum_{j=0}^{n-1} x_{i,j}
  for j, sample in enumerate(samples):
    sample_state = query.accumulate_record(params, sample_state, sample)
  # noised_sum is a privatized estimate of x_i, obtained by conceptually
  # post-processing the noised cumulative sum s_i
  noised_sum, global_state, event = query.get_noised_result(
      sample_state, global_state)
Args  

record_specs

A nested structure of tf.TensorSpecs specifying the structure
and shapes of records.

noise_generator

tree_aggregation.ValueGenerator to generate the noise
value for a tree node. Should be coupled with clipping norm to guarantee
privacy.

clip_fn

Callable that specifies the clipping function. It is called as clip_fn(flat_record, clip_value), where flat_record is a flat list of vars in a record.
clip_value

Float indicating the value at which to clip the record. 
use_efficient

Boolean indicating the usage of the efficient tree aggregation algorithm based on the paper "Efficient Use of Differentially Private Binary Trees". 
Attributes  

clip_fn

Callable that specifies clipping function. clip_fn receives two
arguments: a flat list of vars in a record and a clip_value to clip the
corresponding record, e.g. clip_fn(flat_record, clip_value).

clip_value

float indicating the value at which to clip the record. 
record_specs

A nested structure of tf.TensorSpecs specifying the structure
and shapes of records.

tree_aggregator

tree_aggregation.TreeAggregator initialized with the user-defined
noise_generator. noise_generator is a
tree_aggregation.ValueGenerator to generate the noise value for a tree
node. The noise standard deviation is specified outside the dp_query by the
user when defining noise_fn and should have order
O(clip_norm*log(T)/eps) to guarantee eps-DP.
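As a rough illustration of how the O(clip_norm*log(T)/eps) order scales, the sketch below evaluates the expression directly. The function name is hypothetical, and the base-2 logarithm and the omitted constant factor are assumptions; the result is not a calibrated noise level, and a privacy accountant should be used to choose the actual standard deviation.

```python
import math


def noise_stddev_order(clip_norm, total_steps, eps):
    """Order-of-magnitude noise scale; constants are deliberately omitted."""
    return clip_norm * math.log2(total_steps) / eps


# Doubling the number of query calls adds a constant, not a factor.
print(noise_stddev_order(clip_norm=1.0, total_steps=1024, eps=2.0))
print(noise_stddev_order(clip_norm=1.0, total_steps=2048, eps=2.0))
```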

Child Classes
Methods
accumulate_preprocessed_record
accumulate_preprocessed_record(
sample_state, preprocessed_record
)
Implements tensorflow_privacy.DPQuery.accumulate_preprocessed_record.
accumulate_record
accumulate_record(
params, sample_state, record
)
Accumulates a single record into the sample state.
This is a helper method that simply delegates to preprocess_record and
accumulate_preprocessed_record for the common case when both of those
functions run on a single device. Typically this will be a simple sum.
Args  

params

The parameters for the sample. In standard DP-SGD training, the clipping norm for the sample's microbatch gradients (i.e., a maximum norm magnitude to which each gradient is clipped).
sample_state

The current sample state. In standard DP-SGD training, the accumulated sum of previous clipped microbatch gradients.
record

The record to accumulate. In standard DP-SGD training, the gradient computed for the examples in one microbatch, which may be the gradient for just one example (for size 1 microbatches).
Returns  

The updated sample state. In standard DP-SGD training, the set of previous microbatch gradients with the addition of the record argument.
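The delegation described above (preprocess, then accumulate) can be sketched with plain Python lists standing in for tensors. This is an assumption-laden illustration, not the library code: the functions below only mirror the documented method names, params is taken to be the clip value, and per-coordinate clipping stands in for whatever clip_fn is configured.

```python
def preprocess_record(params, record):
    """Clips each coordinate to [-params, params] (stand-in for clip_fn)."""
    return [max(-params, min(params, v)) for v in record]


def accumulate_preprocessed_record(sample_state, preprocessed_record):
    """Adds the clipped record to the running sum."""
    return [s + r for s, r in zip(sample_state, preprocessed_record)]


def accumulate_record(params, sample_state, record):
    """Delegates to the two steps above, as in the single-device case."""
    return accumulate_preprocessed_record(
        sample_state, preprocess_record(params, record))


state = [0.0, 0.0]
for record in ([0.5, -3.0], [2.0, 0.25]):
    state = accumulate_record(1.0, state, record)
print(state)
```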
build_l2_gaussian_query
@classmethod
build_l2_gaussian_query( clip_norm, noise_multiplier, record_specs, noise_seed=None, use_efficient=True )
Returns a TreeResidualSumQuery with L2 norm clipping and Gaussian noise.
Args  

clip_norm

Each record will be clipped so that it has L2 norm at most
clip_norm.

noise_multiplier

The effective noise multiplier for the sum of records.
Noise standard deviation is clip_norm*noise_multiplier. The value can
be used as the input of the privacy accounting functions in
analysis.tree_aggregation_accountant.

record_specs

A nested structure of tf.TensorSpecs specifying the structure
and shapes of records.

noise_seed

Integer seed for the Gaussian noise generator. If None, a
non-deterministic seed based on system time will be generated.

use_efficient

Boolean indicating the usage of the efficient tree aggregation algorithm based on the paper "Efficient Use of Differentially Private Binary Trees". 
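The relationship between clip_norm, noise_multiplier, and noise_seed can be sketched with the standard library. The helper make_gaussian_noise_fn is hypothetical and is not part of TensorFlow Privacy; it only illustrates that the per-node standard deviation is clip_norm*noise_multiplier and that a fixed seed makes the noise reproducible.

```python
import random


def make_gaussian_noise_fn(clip_norm, noise_multiplier, noise_seed=None):
    """Returns (stddev, sample_fn) for zero-mean Gaussian noise."""
    stddev = clip_norm * noise_multiplier
    # With noise_seed=None, Python seeds from OS entropy or the system time.
    rng = random.Random(noise_seed)

    def sample(num_values):
        return [rng.gauss(0.0, stddev) for _ in range(num_values)]

    return stddev, sample


stddev, sample = make_gaussian_noise_fn(1.0, 0.5, noise_seed=42)
_, sample_again = make_gaussian_noise_fn(1.0, 0.5, noise_seed=42)
print(stddev, sample(3) == sample_again(3))
```

Fixing the seed is mainly useful for tests and debugging; reproducible noise should not be relied on for actual privacy guarantees.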
derive_metrics
derive_metrics(
global_state
)
Derives metric information from the current global state.
Any metrics returned should be derived only from privatized quantities.
Args  

global_state

The global state from which to derive metrics. 
Returns  

A collections.OrderedDict mapping string metric names to tensor values.

derive_sample_params
derive_sample_params(
global_state
)
Implements tensorflow_privacy.DPQuery.derive_sample_params.
get_noised_result
get_noised_result(
sample_state, global_state
)
Implements tensorflow_privacy.DPQuery.get_noised_result.
Updates the tree state and returns the residual of the noised cumulative sum.
Args  

sample_state

Sum of clipped records for this round. 
global_state

Global state with the current sample's cumulative sum and tree state.
Returns  

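A tuple of (noised_residual, new_global_state, event): the residual of the noised cumulative sum for this round, the updated global state, and the event that the example usage unpacks as the third return value.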
initial_global_state
initial_global_state()
Implements tensorflow_privacy.DPQuery.initial_global_state.
initial_sample_state
initial_sample_state(
template=None
)
Implements tensorflow_privacy.DPQuery.initial_sample_state.
merge_sample_states
merge_sample_states(
sample_state_1, sample_state_2
)
Implements tensorflow_privacy.DPQuery.merge_sample_states.
preprocess_record
preprocess_record(
params, record
)
Implements tensorflow_privacy.DPQuery.preprocess_record.
Args  

params

clip_value for the record.

record

The record to be processed. 
Returns  

Structure of clipped tensors. 
preprocess_record_l2_impl
preprocess_record_l2_impl(
params, record
)
Clips the l2 norm, returning the clipped record and the l2 norm.
Args  

params

The parameters for the sample. 
record

The record to be processed. 
Returns  

A tuple (preprocessed_records, l2_norm) where preprocessed_records is
the structure of preprocessed tensors, and l2_norm is the total l2 norm
before clipping.
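The clip-and-report-norm behavior described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the library's code: clip_by_global_l2_norm is a hypothetical name, nested Python lists stand in for the flat list of tensors, and clip_value is assumed to be non-negative.

```python
import math


def clip_by_global_l2_norm(flat_record, clip_value):
    """Scales the whole record so its total L2 norm is at most clip_value.

    Returns (clipped_record, l2_norm), where l2_norm is measured before
    clipping across all parts of the record.
    """
    l2_norm = math.sqrt(sum(v * v for part in flat_record for v in part))
    scale = 1.0 if l2_norm <= clip_value else clip_value / l2_norm
    clipped = [[v * scale for v in part] for part in flat_record]
    return clipped, l2_norm


clipped, norm = clip_by_global_l2_norm([[3.0], [4.0]], clip_value=1.0)
print(clipped, norm)
```

Scaling all parts by the same factor preserves the direction of the record, which is why the norm is computed jointly over the flattened structure rather than per tensor.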

reset_l2_clip_gaussian_noise
reset_l2_clip_gaussian_noise(
global_state, clip_norm, stddev
)
reset_state
reset_state(
noised_results, global_state
)
Returns state after resetting the tree.
This function will be used in restart_query.RestartQuery after calling
get_noised_result when the restarting condition is met.
Args  

noised_results

Noised results returned by get_noised_result.

global_state

Updated global state returned by get_noised_result, which
records noise for the conceptual cumulative sum of the current leaf
node, and tree state for the next conceptual cumulative sum.

Returns  

New global state with zero noise and restarted tree state. 