Wiki40B Language Models

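Generate Wikipedia-like text using the Wiki40B language models from TensorFlow Hub.

This notebook illustrates how to:

  • Load the 41 monolingual and 2 multilingual language models that are part of the Wiki40b-LM collection on TF-Hub
  • Use the models to obtain perplexity, per-layer activations, and word embeddings for a given piece of text
  • Generate text token by token from a piece of seed text

The language models are trained on the newly published, cleaned-up Wiki40B dataset, which is available on TensorFlow Datasets. The training setup is based on the paper "Wiki-40B: Multilingual Language Model Dataset".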

Setup

Installing Dependencies

Imports

Choose Language

Let's choose which language model to load from TF-Hub and the length of the text to generate.

Using the https://tfhub.dev/google/wiki40b-lm-en/1 model to generate sequences of max length 20.
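The handle printed above suggests a naming pattern for the models. As a minimal sketch, assuming every monolingual model follows the same `wiki40b-lm-<language>/1` pattern as the English one, the selection step might look like the following. The helper name `hub_handle` and the variable names `language` and `max_gen_len` are illustrative and may differ from the notebook's own code.

```python
def hub_handle(language: str, version: int = 1) -> str:
    """Build a TF-Hub handle for a Wiki40B language model.

    Assumption: all models follow the URL pattern of the English model
    shown in this notebook's output.
    """
    return f"https://tfhub.dev/google/wiki40b-lm-{language}/{version}"


language = "en"    # language code of the model to load
max_gen_len = 20   # maximum number of tokens to generate

message = (
    f"Using the {hub_handle(language)} model to generate "
    f"sequences of max length {max_gen_len}."
)
print(message)
```

Keeping the language code and the generation length in two variables means the rest of the notebook can be rerun for another language by changing a single value.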

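Building the model

Now that the pretrained model is selected, we can configure it to generate text up to max_gen_len tokens long. To do this, we load the language model from TF-Hub, feed in a piece of starter text, and then iteratively feed tokens back in as they are generated.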

Load the language model pieces

Construct the per-token generation graph

Build the statically unrolled graph for max_gen_len tokens
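
The token-by-token procedure described in this section can be illustrated without TensorFlow. The following is only a sketch of the general autoregressive idea, not the notebook's graph code: `toy_next_token_probs` is a hypothetical stand-in for the TF-Hub language model, and sampling directly from the predicted distribution is an assumption, since the text does not say how the next token is chosen.

```python
import random
from typing import Callable, Dict, List


def generate(
    seed_tokens: List[str],
    next_token_probs: Callable[[List[str]], Dict[str, float]],
    max_gen_len: int,
    rng: random.Random,
) -> List[str]:
    """Extend seed_tokens by max_gen_len tokens, one token at a time."""
    tokens = list(seed_tokens)
    for _ in range(max_gen_len):
        # Ask the model for a distribution over the next token,
        # conditioned on everything generated so far.
        probs = next_token_probs(tokens)
        candidates = list(probs)
        weights = [probs[token] for token in candidates]
        # Sample one token and feed it back in on the next iteration.
        tokens.append(rng.choices(candidates, weights=weights, k=1)[0])
    return tokens


def toy_next_token_probs(tokens: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in model: a fixed table keyed on the last token."""
    table = {
        "the": {"house": 1.0},
        "house": {"was": 1.0},
        "was": {"built": 1.0},
        "built": {"the": 1.0},
    }
    return table[tokens[-1]]


generated = generate(["the"], toy_next_token_probs, max_gen_len=3, rng=random.Random(0))
print(" ".join(generated))
```

The notebook unrolls this loop statically for max_gen_len steps when the graph is built, whereas the sketch uses an ordinary Python loop; the feed-back structure is the same in both cases.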

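Generating some text

Let's generate some text! First, we set a text seed to prompt the language model.

You can use one of the predefined seeds or, optionally, enter your own. This text is used as the seed for the language model and helps prompt it on what to generate next.

You can use the following special tokens before specific parts of the generated article. Use _START_ARTICLE_ to indicate the beginning of the article, _START_SECTION_ to indicate the beginning of a section, and _START_PARAGRAPH_ to generate text in the article.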
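
As a rough illustration of how these special tokens combine into a seed, the sketch below assembles one from a title, an optional section heading, and a paragraph. The helper name `make_seed` is hypothetical, and the one-item-per-line layout is an assumption based on the seed printed later in this notebook.

```python
from typing import Optional


def make_seed(title: str, paragraph: str, section: Optional[str] = None) -> str:
    """Assemble a seed string using the Wiki40B special tokens."""
    lines = ["_START_ARTICLE_", title]
    if section is not None:
        # The section marker goes between the title and the paragraph.
        lines += ["_START_SECTION_", section]
    lines += ["_START_PARAGRAPH_", paragraph]
    # Each special token and each piece of text goes on its own line.
    return "\n".join(lines)


seed = make_seed(
    "1882 Prince Edward Island general election",
    "The 1882 Prince Edward Island election was held on May 8, 1882 "
    "to elect members of the House of Assembly of the province of "
    "Prince Edward Island, Canada.",
)
print(seed)
```

Ending the seed after _START_PARAGRAPH_ and a few words of text leaves the model to continue the paragraph, while ending it right after a title gives the model more freedom over what follows.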

Predefined Seeds

Enter your own seed (Optional).

Generating text from seed:

_START_ARTICLE_
1882 Prince Edward Island general election
_START_PARAGRAPH_
The 1882 Prince Edward Island election was held on May 8, 1882 to elect members of the House of Assembly of the province of Prince Edward Island, Canada.

Initialize session.

Generate text

_START_ARTICLE_ Ernest Watson House _START_SECTION_ History _START_PARAGRAPH_ The house was built in 1975 by Eugene Pittrain

We can also look at the other outputs of the model: the perplexity, the token IDs, the intermediate activations, and the embeddings.

ppl_result
array([23.507753], dtype=float32)
token_ids_result
array([[   8,    3, 6794, 1579, 1582,  721,  489,  448,    8,    5,   26,
        6794, 1579, 1582,  721,  448,   17,  245,   22,  166, 2928, 6794,
          16, 7690,  384,   11,    7,  402,   11, 1172,   11,    7, 2115,
          11, 1579, 1582,  721,    9,  646,   10]], dtype=int32)
activations_result.shape
(12, 1, 39, 768)
embeddings_result
array([[[ 0.12262525,  5.548009  ,  1.4743135 , ...,  2.4388404 ,
         -2.2788858 ,  2.172028  ],
        [-2.3905468 , -0.97108954, -1.5513545 , ...,  8.458472  ,
         -2.8723319 ,  0.6534524 ],
        [-0.83790785,  0.41630274, -0.8740793 , ...,  1.6446769 ,
         -0.9074106 ,  0.3339265 ],
        ...,
        [-0.8054745 , -1.2495526 ,  2.6232922 , ...,  2.893288  ,
         -0.91287214, -1.1259722 ],
        [ 0.64944506,  3.3696785 ,  0.09543293, ..., -0.7839227 ,
         -1.3573489 ,  1.862214  ],
        [-1.2970612 ,  0.5961366 ,  3.3531897 , ...,  3.2853985 ,
         -1.6212384 ,  0.30257902]]], dtype=float32)
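
Perplexity is conventionally defined as the exponential of the mean negative log-likelihood per token, so lower values mean the model found the text less surprising. The sketch below illustrates that definition with made-up probabilities. It is an assumption that `ppl_result` above is computed the same way, and the function name `perplexity` is illustrative.

```python
import math
from typing import Sequence


def perplexity(token_probs: Sequence[float]) -> float:
    """Return exp of the mean negative log-likelihood of the tokens."""
    neg_log_likelihoods = [-math.log(p) for p in token_probs]
    return math.exp(sum(neg_log_likelihoods) / len(neg_log_likelihoods))


# A model that assigns probability 1/4 to every token is as uncertain
# as a uniform choice among four options, so its perplexity is 4.
uniform = perplexity([0.25, 0.25, 0.25, 0.25])
# Higher probabilities on the observed tokens give lower perplexity.
confident = perplexity([0.9, 0.8, 0.95])
print(uniform, confident)
```

Read this way, a perplexity of about 23.5 would mean the model was, on average, roughly as uncertain as if it were choosing uniformly among 23 or 24 tokens at each step.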