tfl.kronecker_factored_lattice_lib.custom_reduce_prod

tf.reduce_prod(t, axis) with a faster custom gradient.

Shows comparable speed on CPU, up to a 2x speedup on GPU, and 7x on TPU.

Args:
  t: The tensor to reduce.
  axis: The dimension to reduce.

Returns:
  prod(t) and grad(prod(t))
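The pairing of a product with its gradient can be illustrated with a small NumPy sketch. It is not the library's code: the helper name reduce_prod_with_grad is hypothetical, and it assumes the gradient is formed with the identity d prod / d t_i = prod / t_i, which only holds when no entry of t is zero. Whether custom_reduce_prod uses this exact identity is an assumption here.

```python
import numpy as np


def reduce_prod_with_grad(t, axis):
    """Hypothetical helper: product along `axis` plus a gradient function.

    Assumes every entry of `t` is nonzero, since the gradient divides by `t`.
    """
    t = np.asarray(t, dtype=np.float64)
    prod = np.prod(t, axis=axis)

    def grad_fn(upstream):
        # d prod / d t_i = prod / t_i, scaled by the upstream gradient.
        # expand_dims restores the reduced axis so the result broadcasts
        # back to the shape of `t`.
        return np.expand_dims(upstream * prod, axis) / t

    return prod, grad_fn


t = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
prod, grad_fn = reduce_prod_with_grad(t, axis=1)
grad = grad_fn(np.ones_like(prod))

print(prod)  # products of each row
print(grad)  # gradient of each row product w.r.t. each entry
```

The appeal of this form is that the gradient reuses the forward product and needs only one elementwise division, rather than a separate product over all other entries for each element. The cost is the nonzero requirement noted above.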