Wiki40B Language Models

View on TensorFlow.org Run in Google Colab View on GitHub Download notebook See TF Hub models

Generate Wikipedia-like text using the Wiki40B language models from TensorFlow Hub!

This notebook illustrates how to:

  • Load the 41 monolingual and 2 multilingual language models that are part of the Wiki40b-LM collection on TF-Hub
  • Use the models to obtain perplexity, per layer activations, and word embeddings for a given piece of text
  • Generate text token-by-token from a piece of seed text

The language models are trained on the newly published, cleaned-up Wiki40B dataset available on TensorFlow Datasets. The training setup is based on the paper “Wiki-40B: Multilingual Language Model Dataset”.

Setup

Installing Dependencies

Imports

2021-07-29 11:12:44.079951: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0

Choose Language

Let's choose which language model to load from TF-Hub and the length of text to be generated.

Using the https://tfhub.dev/google/wiki40b-lm-en/1 model to generate sequences of max length 20.

Build the Model

Okay, now that we've configured which pre-trained model to use, let's configure it to generate text up to max_gen_len. We will need to load the language model from TF-Hub, feed in a piece of starter text, and then iteratively feed in tokens as they are generated.

Load the language model pieces

2021-07-29 11:12:56.943119: W tensorflow/core/common_runtime/graph_constructor.cc:1529] Importing a graph with a lower producer version 359 into an existing graph with producer version 716. Shape inference will have run different parts of the graph with different producer versions.

Construct the per-token generation graph

Build the statically unrolled graph for max_gen_len tokens

Generate some text

Let's generate some text! We'll set a text seed to prompt the language model.

You can use one of the predefined seeds or optionally enter your own. This text will be used as seed for the language model to help prompt the language model for what to generate next.

You can use the following special tokens precede special parts of the generated article. Use _START_ARTICLE_ to indicate the beginning of the article, _START_SECTION_ to indicate the beginning of a section, and _START_PARAGRAPH_ to generate text in the article

Predefined Seeds

Enter your own seed (Optional).

Generating text from seed:

_START_ARTICLE_
1882 Prince Edward Island general election
_START_PARAGRAPH_
The 1882 Prince Edward Island election was held on May 8, 1882 to elect members of the House of Assembly of the province of Prince Edward Island, Canada.

Initialize session.

2021-07-29 11:14:02.521710: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-07-29 11:14:03.234092: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:14:03.235135: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-29 11:14:03.235174: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-07-29 11:14:03.241070: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-07-29 11:14:03.241201: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-07-29 11:14:03.243019: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-07-29 11:14:03.243467: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-07-29 11:14:03.245436: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-07-29 11:14:03.247079: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-07-29 11:14:03.247303: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-07-29 11:14:03.247467: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:14:03.248577: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:14:03.249532: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-29 11:14:03.250442: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-07-29 11:14:03.251081: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:14:03.252183: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-29 11:14:03.252322: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:14:03.253331: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:14:03.254314: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-29 11:14:03.254391: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-07-29 11:14:03.967949: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-29 11:14:03.968006: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-07-29 11:14:03.968016: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-07-29 11:14:03.968287: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:14:03.969348: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:14:03.970343: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:14:03.971311: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14646 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0)
2021-07-29 11:14:11.453361: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 2000179999 Hz

Generate text

2021-07-29 11:14:39.248200: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-07-29 11:14:39.719264: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
_NEWLINE_The election was a show of how 123 of the members of the 36-member house were considered

We can also look at the other outputs of the model - the perplexity, the token ids, the intermediate activations, and the embeddings

ppl_result
array([23.507753], dtype=float32)
token_ids_result
array([[   8,    3, 6794, 1579, 1582,  721,  489,  448,    8,    5,   26,
        6794, 1579, 1582,  721,  448,   17,  245,   22,  166, 2928, 6794,
          16, 7690,  384,   11,    7,  402,   11, 1172,   11,    7, 2115,
          11, 1579, 1582,  721,    9,  646,   10]], dtype=int32)
activations_result.shape
(12, 1, 39, 768)
embeddings_result
array([[[ 0.12262525,  5.548009  ,  1.4743135 , ...,  2.4388404 ,
         -2.2788858 ,  2.172028  ],
        [-2.3905468 , -0.97108954, -1.5513545 , ...,  8.458472  ,
         -2.8723319 ,  0.6534524 ],
        [-0.83790785,  0.41630274, -0.8740793 , ...,  1.6446769 ,
         -0.9074106 ,  0.3339265 ],
        ...,
        [-0.8054745 , -1.2495526 ,  2.6232922 , ...,  2.893288  ,
         -0.91287214, -1.1259722 ],
        [ 0.64944506,  3.3696785 ,  0.09543293, ..., -0.7839227 ,
         -1.3573489 ,  1.862214  ],
        [-1.2970612 ,  0.5961366 ,  3.3531897 , ...,  3.2853985 ,
         -1.6212384 ,  0.30257902]]], dtype=float32)