
Generate Artificial Faces with CelebA Progressive GAN Model


This Colab demonstrates the use of a TF Hub module based on a generative adversarial network (GAN). The module maps from N-dimensional vectors, called the latent space, to RGB images.

Two examples are provided:

  • mapping from the latent space to images (a minimal sketch follows this list), and
  • given a target image, using gradient descent to find a latent vector that generates an image similar to the target image.
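For orientation, here is a minimal sketch of the first example. It assumes TensorFlow 2 and tensorflow_hub are available (the actual setup and helper functions follow in the sections below), and that progan-128 uses a 512-dimensional latent space.

# Minimal sketch: map one latent vector to an RGB image with progan-128.
import tensorflow as tf
import tensorflow_hub as hub

progan = hub.load("https://tfhub.dev/google/progan-128/1").signatures['default']
z = tf.random.normal([1, 512])  # one vector from the (assumed 512-D) latent space
image = progan(z)['default']    # float image batch of shape [1, 128, 128, 3]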

Optional prerequisites

More models

Here you can find all models currently hosted on tfhub.dev that can be used to generate images.

Setup

# Install imageio for creating animations.
pip -q install imageio
pip -q install scikit-image
pip install git+https://github.com/tensorflow/docs

Imports and function definitions
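The imports-and-definitions cell is collapsed in the rendered notebook. The sketch below is a hedged reconstruction of what that cell must provide for the rest of the tutorial to run: the imports used later (np, plt, transform, files), the latent dimensionality latent_dim of progan-128 (assumed to be 512), and the helpers interpolate_hypersphere, display_image, and animate. Exact details may differ from the original hidden cell.

# Hedged reconstruction of the collapsed "Imports and function definitions" cell.
import imageio
import matplotlib.pyplot as plt
import numpy as np
import PIL.Image
import tensorflow as tf
import tensorflow_hub as hub
from skimage import transform

try:
  from google.colab import files  # only available when running in Colab
except ImportError:
  files = None

latent_dim = 512  # assumed latent dimensionality of progan-128

def interpolate_hypersphere(v1, v2, num_steps):
  """Interpolates from v1 to v2, rescaling every step to v1's norm."""
  v1_norm = tf.norm(v1)
  v2_normalized = v2 * (v1_norm / tf.norm(v2))
  vectors = []
  for step in range(num_steps):
    interpolated = v1 + (v2_normalized - v1) * step / (num_steps - 1)
    interpolated_normalized = interpolated * (v1_norm / tf.norm(interpolated))
    vectors.append(interpolated_normalized)
  return tf.stack(vectors)

def display_image(image):
  """Converts a float image in [0, 1] to a PIL image for display."""
  image = tf.image.convert_image_dtype(tf.constant(image), tf.uint8)
  return PIL.Image.fromarray(image.numpy())

def animate(images):
  """Writes a batch of float images in [0, 1] to an animated GIF."""
  converted = np.clip(np.array(images) * 255, 0, 255).astype(np.uint8)
  imageio.mimsave('./animation.gif', converted)
  return './animation.gif'  # the original Colab embeds this file inline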


Latent space interpolation

Random vectors

Latent space interpolation between two randomly initialized vectors. We will use the TF Hub module progan-128, which contains a pre-trained Progressive GAN. The interpolation runs along a hypersphere rather than a straight line, so every intermediate vector keeps a norm close to that of a typical sample from the prior and the intermediate images stay realistic.

progan = hub.load("https://tfhub.dev/google/progan-128/1").signatures['default']

def interpolate_between_vectors():
  v1 = tf.random.normal([latent_dim])
  v2 = tf.random.normal([latent_dim])

  # Creates a tensor with 50 steps of interpolation between v1 and v2.
  vectors = interpolate_hypersphere(v1, v2, 50)

  # Uses module to generate images from the latent space.
  interpolated_images = progan(vectors)['default']

  return interpolated_images

interpolated_images = interpolate_between_vectors()
animate(interpolated_images)

[gif: animation interpolating between two random latent vectors]

Finding the closest vector in latent space

Decide on a target image. As an example, use an image generated from the module, or upload your own.

image_from_module_space = True  # @param { isTemplate:true, type:"boolean" }

def get_module_space_image():
  vector = tf.random.normal([1, latent_dim])
  image = progan(vector)['default'][0]
  return image

def upload_image():
  uploaded = files.upload()
  image = imageio.imread(uploaded[list(uploaded.keys())[0]])
  return transform.resize(image, [128, 128])

if image_from_module_space:
  target_image = get_module_space_image()
else:
  target_image = upload_image()

display_image(target_image)

[png: the target image]

Having defined a loss between the target image and the image generated by a latent space variable, we can use gradient descent to find variable values that minimize the loss.

tf.random.set_seed(42)
initial_vector = tf.random.normal([1, latent_dim])
display_image(progan(initial_vector)['default'][0])

[png: image generated from the initial vector]

def find_closest_latent_vector(initial_vector, num_optimization_steps,
                               steps_per_image):
  images = []
  losses = []

  vector = tf.Variable(initial_vector)  
  optimizer = tf.optimizers.Adam(learning_rate=0.01)
  loss_fn = tf.losses.MeanAbsoluteError(reduction="sum")

  for step in range(num_optimization_steps):
    if (step % 100)==0:
      print()
    print('.', end='')
    with tf.GradientTape() as tape:
      image = progan(vector.read_value())['default'][0]
      if (step % steps_per_image) == 0:
        images.append(image.numpy())
      target_image_difference = loss_fn(image, target_image[:,:,:3])
      # The latent vectors were sampled from a normal distribution. We can get
      # more realistic images if we regularize the length of the latent vector
      # to the average length of vectors from this distribution, sqrt(latent_dim).
      regularizer = tf.abs(tf.norm(vector) - np.sqrt(latent_dim))

      loss = target_image_difference + regularizer
      losses.append(loss.numpy())
    grads = tape.gradient(loss, [vector])
    optimizer.apply_gradients(zip(grads, [vector]))

  return images, losses


num_optimization_steps=200
steps_per_image=5
images, loss = find_closest_latent_vector(initial_vector, num_optimization_steps, steps_per_image)
....................................................................................................
....................................................................................................
plt.plot(loss)
plt.ylim([0,max(plt.ylim())])
(0.0, 6696.250219726562)

[png: plot of the loss over optimization steps]

animate(np.stack(images))

[gif: intermediate images produced during optimization]

Compare the result to the target:

display_image(np.concatenate([images[-1], target_image], axis=1))

[png: final image next to the target image]

Playing with the above example

If the image is from the module space, the descent is quick and converges to a reasonable sample. Try descending to an image that is not from the module space: the descent will only converge if the image is reasonably close to the space of training images.

How can you make the descent faster and converge to a more realistic image? You can try the following (a sketch of the first and third ideas follows this list):

  • using a different loss on the image difference, e.g., a quadratic loss,
  • using a different regularizer on the latent vector,
  • initializing from a random vector in multiple runs,
  • etc.
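As a hedged starting point, the sketch below tries the first and third ideas: a quadratic (sum-reduced squared error) image loss in place of the mean absolute error, and several restarts from random initial vectors, keeping the run with the lowest final loss. It reuses find_closest_latent_vector, target_image, and the hyperparameters defined above; num_restarts is a hypothetical knob to tune.

# Sketch of two of the ideas above (not the tutorial's reference solution).

# Idea 1: a quadratic loss on the image difference. Inside
# find_closest_latent_vector, swap the loss function for:
#   loss_fn = tf.losses.MeanSquaredError(reduction="sum")

# Idea 3: restart the descent from several random vectors and keep the run
# with the lowest final loss.
num_restarts = 3  # hypothetical hyperparameter

best_loss = float('inf')
best_images = None
for restart in range(num_restarts):
  start_vector = tf.random.normal([1, latent_dim])
  run_images, run_losses = find_closest_latent_vector(
      start_vector, num_optimization_steps, steps_per_image)
  if run_losses[-1] < best_loss:
    best_loss = run_losses[-1]
    best_images = run_images

display_image(np.concatenate([best_images[-1], target_image], axis=1))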