Multilingual Universal Sentence Encoder Q&A Retrieval

View on TensorFlow.org Run in Google Colab View on GitHub Download notebook See TF Hub models

This is a demo for using Univeral Encoder Multilingual Q&A model for question-answer retrieval of text, illustrating the use of question_encoder and response_encoder of the model. We use sentences from SQuAD paragraphs as the demo dataset, each sentence and its context (the text surrounding the sentence) is encoded into high dimension embeddings with the response_encoder. These embeddings are stored in an index built using the simpleneighbors library for question-answer retrieval.

On retrieval a random question is selected from the SQuAD dataset and encoded into high dimension embedding with the question_encoder and query the simpleneighbors index returning a list of approximate nearest neighbors in semantic space.

More models

You can find all currently hosted text embedding models here and all models that have been trained on SQuAD as well here.

Setup

Setup Environment

%%capture
# Install the latest Tensorflow version.
!pip install -q tensorflow_text
!pip install -q simpleneighbors[annoy]
!pip install -q nltk
!pip install -q tqdm

Setup common imports and functions

2021-07-29 11:39:05.044788: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
[nltk_data] Downloading package punkt to /home/kbuilder/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.

Run the following code block to download and extract the SQuAD dataset into:

  • sentences is a list of (text, context) tuples - each paragraph from the SQuAD dataset are splitted into sentences using nltk library and the sentence and paragraph text forms the (text, context) tuple.
  • questions is a list of (question, answer) tuples.

Download and extract SQuAD data

squad_url = 'https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json'

squad_json = download_squad(squad_url)
sentences = extract_sentences_from_squad_json(squad_json)
questions = extract_questions_from_squad_json(squad_json)
print("%s sentences, %s questions extracted from SQuAD %s" % (len(sentences), len(questions), squad_url))

print("\nExample sentence and context:\n")
sentence = random.choice(sentences)
print("sentence:\n")
pprint.pprint(sentence[0])
print("\ncontext:\n")
pprint.pprint(sentence[1])
print()
10455 sentences, 10552 questions extracted from SQuAD https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json

Example sentence and context:

sentence:

('As a result, with the exception of the largest markets, ABC was relegated to '
 'secondary status on one or both of the existing stations, usually via '
 'off-hours clearances (a notable exception during this time was WKST-TV in '
 'Youngstown, Ohio, now WYTV, despite the small size of the surrounding market '
 'and its close proximity to Cleveland and Pittsburgh even decades before the '
 "city's economic collapse).")

context:

('As a result, with the exception of the largest markets, ABC was relegated to '
 'secondary status on one or both of the existing stations, usually via '
 'off-hours clearances (a notable exception during this time was WKST-TV in '
 'Youngstown, Ohio, now WYTV, despite the small size of the surrounding market '
 'and its close proximity to Cleveland and Pittsburgh even decades before the '
 "city's economic collapse). According to Goldenson, this meant that an hour "
 'of ABC programming reported five times lower viewership than its '
 "competitors. However, the network's intake of money at the time would allow "
 "it to accelerate its content production. Still, ABC's limited reach would "
 'continue to hobble it for the next two decades; several smaller markets '
 'would not grow large enough to support a full-time ABC affiliate until the '
 '1960s, with some very small markets having to wait as late as the 1980s or '
 'even the advent of digital television in the 2000s, which allowed stations '
 'like WTRF-TV in Wheeling, West Virginia to begin airing ABC programming on a '
 "digital subchannel after airing the network's programs outside of "
 'recommended timeslots decades before.')

The following code block setup the tensorflow graph g and session with the Univeral Encoder Multilingual Q&A model's question_encoder and response_encoder signatures.

Load model from tensorflow hub

2021-07-29 11:39:17.800072: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-07-29 11:39:18.457510: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:39:18.458540: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-29 11:39:18.458591: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-07-29 11:39:18.462207: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-07-29 11:39:18.462328: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-07-29 11:39:18.463419: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-07-29 11:39:18.463795: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-07-29 11:39:18.464903: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-07-29 11:39:18.465811: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-07-29 11:39:18.466000: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-07-29 11:39:18.466119: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:39:18.467066: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:39:18.467989: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-29 11:39:18.468511: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-07-29 11:39:18.469131: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:39:18.470030: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-29 11:39:18.470123: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:39:18.471024: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:39:18.471927: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-29 11:39:18.471981: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-07-29 11:39:19.076400: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-29 11:39:19.076437: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-07-29 11:39:19.076445: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-07-29 11:39:19.076755: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:39:19.077888: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:39:19.078905: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-29 11:39:19.079878: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14646 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0)
2021-07-29 11:39:26.438160: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-07-29 11:39:26.529811: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 2000179999 Hz

The following code block compute the embeddings for all the text, context tuples and store them in a simpleneighbors index using the response_encoder.

Compute embeddings and build simpleneighbors index

2021-07-29 11:39:28.496160: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
Computing embeddings for 10455 sentences
2021-07-29 11:39:28.910219: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
0%|          | 0/104 [00:00<?, ?it/s]
simpleneighbors index for 10455 sentences built.

On retrieval, the question is encoded using the question_encoder and the question embedding is used to query the simpleneighbors index.

Retrieve nearest neighbors for a random question from SQuAD

num_results = 25

query = random.choice(questions)
display_nearest_neighbors(query[0], query[1])