
Record operations for automatic differentiation.

Operations are recorded if they are executed within this context manager and at least one of their inputs is being "watched".

Trainable variables (created by `tf.Variable` or `tf.compat.v1.get_variable`, where `trainable=True` is default in both cases) are automatically watched. Tensors can be manually watched by invoking the `watch` method on this context manager.
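The watching rules above can be sketched as follows. This is a minimal illustration, not part of the original reference; the names `v`, `c`, and `tape` are assumptions chosen for the example. A trainable `tf.Variable` is watched automatically, while a `tf.constant` that was never passed to `watch` yields a gradient of `None`.

```python
import tensorflow as tf

v = tf.Variable(3.0)   # trainable by default, so the tape watches it automatically
c = tf.constant(4.0)   # plain tensor: not watched unless tape.watch(c) is called

with tf.GradientTape() as tape:
  y = v * v + c * c

# Request both gradients in a single call, because a non-persistent tape
# can only be used to compute gradients once.
dy_dv, dy_dc = tape.gradient(y, [v, c])
print(dy_dv)  # gradient with respect to the automatically watched variable
print(dy_dc)  # None, because `c` was never watched
```

Calling `tape.watch(c)` inside the context would make the gradient with respect to `c` available as well.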

For example, consider the function `y = x * x`. The gradient at `x = 3.0` can be computed as:

```
import tensorflow as tf

x = tf.constant(3.0)
with tf.GradientTape() as g:
  g.watch(x)
  y = x * x
dy_dx = g.gradient(y, x)
print(dy_dx)  # tf.Tensor(6.0, shape=(), dtype=float32)
```

GradientTapes can be nested to compute higher-order derivatives. For example:

```
x = tf.constant(5.0)
with tf.GradientTape() as g:
  g.watch(x)
  with tf.GradientTape() as gg:
    gg.watch(x)
    y = x * x
  dy_dx = gg.gradient(y, x)  # dy_dx = 2 * x
d2y_dx2 = g.gradient(dy_dx, x)  # d2y_dx2 = 2
print(dy_dx)  # tf.Tensor(10.0, shape=(), dtype=float32)
print(d2y_dx2)  # tf.Tensor(2.0, shape=(), dtype=float32)
```

By default, the resources held by a GradientTape are released as soon as the `GradientTape.gradient()` method is called. To compute multiple gradients over the same computation, create a persistent gradient tape. This allows multiple calls to the `gradient()` method, because the resources are then released only when the tape object is garbage collected. For example:

```
x = tf.constant(3.0)
with tf.GradientTape(persistent=True) as g:
  g.watch(x)
  y = x * x
  z = y * y
dz_dx = g.gradient(z, x)  # (4*x^3 at x = 3)
print(dz_dx)  # tf.Tensor(108.0, shape=(), dtype=float32)
dy_dx = g.gradient(y, x)
print(dy_dx)  # tf.Tensor(6.0, shape=(), dtype=float32)
```
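For contrast, the sketch below shows what happens without `persistent=True`: the first `gradient()` call releases the tape's resources, so a second call raises a `RuntimeError`. The names `second_call_failed` and `p` are assumptions introduced for illustration. The final lines show one way to drop a persistent tape once it is no longer needed.

```python
import tensorflow as tf

x = tf.constant(3.0)
with tf.GradientTape() as g:  # persistent defaults to False
  g.watch(x)
  y = x * x
  z = y * y

dz_dx = g.gradient(z, x)  # first call succeeds and releases the tape's resources

try:
  g.gradient(y, x)  # second call on a non-persistent tape
  second_call_failed = False
except RuntimeError as err:
  second_call_failed = True
  print("second call failed:", err)

# With a persistent tape, drop the reference when finished so that the
# recorded operations can be garbage collected.
with tf.GradientTape(persistent=True) as p:
  p.watch(x)
  y = x * x
dy_dx = p.gradient(y, x)
del p
```

Keeping a persistent tape alive longer than necessary holds on to the recorded intermediate values, so deleting the reference promptly is a reasonable habit.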

By default, GradientTape automatically watches any trainable variables that are accessed inside the context. If you want fine-grained control over which variables are watched, you can disable automatic tracking by passing `watch_accessed_variables=False` to the tape constructor:

```
x = tf.Variable(2.0)
w = tf.Variable(5
```
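As a separate, minimal sketch of this option (the variables `a` and `b` and their values are hypothetical, chosen only for illustration), the tape below records gradients for the variable that is watched explicitly and returns `None` for the one that is not:

```python
import tensorflow as tf

a = tf.Variable(2.0)  # hypothetical variable that is watched explicitly
b = tf.Variable(3.0)  # hypothetical variable that is deliberately left unwatched

with tf.GradientTape(watch_accessed_variables=False) as tape:
  tape.watch(a)       # only `a` is tracked by this tape
  y = a * a + b * b

dy_da, dy_db = tape.gradient(y, [a, b])
print(dy_da)  # gradient with respect to the explicitly watched variable
print(dy_db)  # None: automatic watching is disabled and `b` was not watched
```

This pattern can be useful when a computation touches many variables but gradients are needed for only a few of them.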