Time a TensorFlow function under a variety of strategies and hardware.
tfp.debugging.benchmarking.benchmark_tf_function(
    user_fn,
    iters=1,
    config=default_benchmark_config(),
    extra_columns=None,
    use_autograph=False,
    print_intermediates=False,
    cpu_device='cpu:0',
    gpu_device='gpu:0'
)
Runs the callable user_fn iters times under the strategies (any of Eager, tfe.function + graph, and XLA) and hardware (CPU, GPU).
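Under a graph or XLA strategy, the first call to a function typically includes tracing or compilation, so it is often useful to look at the first iteration separately from the later, warm ones. The following is a minimal, library-independent sketch of timing a callable `iters` times. It is not the implementation of `benchmark_tf_function`; the helper `time_callable` and the dictionary keys are hypothetical names chosen for illustration.

```python
import time


def time_callable(fn, iters=1):
    """Hypothetical helper: call `fn` `iters` times and summarize wall time."""
    if iters < 1:
        raise ValueError("iters must be >= 1")
    durations = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    warm = durations[1:]  # every call after the first
    return {
        "iters": iters,
        "first_iter_time": durations[0],
        # None when iters == 1, because there are no warm calls to average.
        "avg_warm_iter_time": sum(warm) / len(warm) if warm else None,
    }


def work():
    # Stand-in workload; a real benchmark would call a TensorFlow function.
    return sum(i * i for i in range(10_000))


result = time_callable(work, iters=5)
print(result)
```

With `iters=1` the sketch reports only the first-call time, which under a compiled strategy would be dominated by one-time setup rather than steady-state cost.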
Example:
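import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

data_dicts = []
# Sweep over loop length and matrix size; each combination is benchmarked separately.
for inner_iters in [10, 100]:
  for size in [100, 1000]:
    def f():
      total = tf.constant(0.0)
      for _ in np.arange(inner_iters):
        m = tf.random.uniform((size, size))
        total += tf.reduce_sum(tf.matmul(m, m))
      return total

    # Tag the results with the sweep parameters via extra_columns.
    data_dicts += tfp.debugging.benchmarking.benchmark_tf_function(
        f,
        iters=5,
        extra_columns={'inner_iters': inner_iters,
                       'size': size})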
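The example accumulates results with `+=`, which suggests that each call returns a list of per-run records. The sketch below shows how such a sweep can be grouped afterwards. It rests on two assumptions that the text above does not confirm: that each record is a dict, and that the `extra_columns` entries are copied into every record. `run_benchmark_stub` is a hypothetical stand-in written in plain Python (no TensorFlow), and the `avg_time` key is likewise made up for illustration.

```python
import time


def run_benchmark_stub(fn, iters=1, extra_columns=None):
    """Hypothetical stand-in: time `fn` and return a list holding one result dict."""
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    elapsed = time.perf_counter() - start
    row = {"iters": iters, "avg_time": elapsed / iters}
    # Assumed behavior: the caller's extra columns are merged into each row.
    row.update(extra_columns or {})
    return [row]


rows = []
for inner_iters in [10, 100]:
    for size in [100, 1000]:
        # Bind the loop variables as defaults so each f keeps its own values.
        def f(inner_iters=inner_iters, size=size):
            total = 0
            for _ in range(inner_iters):
                total += sum(range(size))
            return total

        rows += run_benchmark_stub(
            f, iters=5,
            extra_columns={"inner_iters": inner_iters, "size": size})

# Group the rows by one of the sweep parameters.
by_size = {}
for row in rows:
    by_size.setdefault(row["size"], []).append(row)

for size, group in sorted(by_size.items()):
    print(size, [(r["inner_iters"], r["avg_time"]) for r in group])
```

Tagging each row with its sweep parameters keeps the accumulated list flat, so it can be filtered or grouped without tracking which call produced which row.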