BERT Question Answer with TensorFlow Lite Model Maker


The TensorFlow Lite Model Maker library simplifies the process of adapting and converting a TensorFlow model to particular input data when deploying this model for on-device ML applications.

This notebook shows an end-to-end example that utilizes the Model Maker library to illustrate the adaptation and conversion of a commonly-used question answering model for the question answering task.

Introduction to the BERT question answering task

The task supported in this library is extractive question answering: given a passage and a question, the answer is a span in the passage. The image below shows an example of question answering.

Answers are spans in the passage (image credit: SQuAD blog)

As for the question answering model, the inputs are the already preprocessed passage and question pair, and the outputs are the start logits and end logits for each token in the passage. The input size can be set and adjusted according to the length of the passage and question.
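
To make this input/output contract concrete, here is a minimal sketch, with made-up logits, of how the start and end logits can be decoded into an answer span. Real decoders additionally score start/end pairs jointly and cap the answer length.

import numpy as np

tokens = ['the', 'cat', 'sat', 'on', 'the', 'mat']        # toy passage tokens
start_logits = np.array([0.1, 2.5, 0.3, 0.2, 0.1, 0.4])   # hypothetical model outputs
end_logits = np.array([0.2, 0.3, 2.8, 0.1, 0.2, 0.5])

start = int(np.argmax(start_logits))
end = start + int(np.argmax(end_logits[start:]))          # end must not precede start
print(' '.join(tokens[start:end + 1]))                    # -> cat sat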

End-to-end overview

The following code snippet demonstrates how to obtain the model within a few lines of code. The overall process includes 5 steps: (1) choose a model, (2) load data, (3) retrain the model, (4) evaluate it, and (5) export it to TensorFlow Lite format.

# Chooses a model specification that represents the model.
spec = model_spec.get('mobilebert_qa')

# Gets the training data and validation data.
train_data = DataLoader.from_squad(train_data_path, spec, is_training=True)
validation_data = DataLoader.from_squad(validation_data_path, spec, is_training=False)

# Fine-tunes the model.
model = question_answer.create(train_data, model_spec=spec)

# Gets the evaluation result.
metric = model.evaluate(validation_data)

# Exports the model to the TensorFlow Lite format with metadata in the export directory.
model.export(export_dir)

The following sections explain the code in more detail.

Prerequisites

To run this example, install the required packages, including the Model Maker package from the GitHub repo.

pip install -q tflite-model-maker

Import the required packages.

import numpy as np
import os

import tensorflow as tf
assert tf.__version__.startswith('2')

from tflite_model_maker import model_spec
from tflite_model_maker import question_answer
from tflite_model_maker.config import ExportFormat
from tflite_model_maker.question_answer import DataLoader

A "Visão geral de ponta a ponta" demonstra um exemplo ponta a ponta simples. As seções a seguir percorrem o exemplo passo a passo para mostrar mais detalhes.

Choose a model_spec that represents a model for question answering

Each model_spec object represents a specific model for question answering. Model Maker currently supports the MobileBERT and BERT-Base models.

Supported Model    Name of model_spec       Model Description
MobileBERT         'mobilebert_qa'          4.3x smaller and 5.5x faster than BERT-Base while achieving competitive results, suitable for on-device scenarios.
MobileBERT-SQuAD   'mobilebert_qa_squad'    Same model architecture as the MobileBERT model; the initial model is already retrained on SQuAD1.1.
BERT-Base          'bert_qa'                Standard BERT model widely used in NLP tasks.

In this tutorial, MobileBERT-SQuAD is used as an example. Since the model is already retrained on SQuAD1.1, it converges faster on the question answering task.

spec = model_spec.get('mobilebert_qa_squad')

Load input data specific to an on-device ML app and preprocess the data

TriviaQA is a reading comprehension dataset containing over 650K question-answer-evidence triples. In this tutorial, you will use a subset of this dataset to learn how to use the Model Maker library.

To load the data, convert the TriviaQA dataset to the SQuAD1.1 format by running the converter Python script with --sample_size=8000 and a set of web data. Modify the conversion code slightly by:

  • Skipping the samples that couldn't find any answer in the context document;
  • Getting the original answer in the context without uppercase or lowercase.

Download the archived version of the already converted dataset.

train_data_path = tf.keras.utils.get_file(
    fname='triviaqa-web-train-8000.json',
    origin='https://storage.googleapis.com/download.tensorflow.org/models/tflite/dataset/triviaqa-web-train-8000.json')
validation_data_path = tf.keras.utils.get_file(
    fname='triviaqa-verified-web-dev.json',
    origin='https://storage.googleapis.com/download.tensorflow.org/models/tflite/dataset/triviaqa-verified-web-dev.json')
Downloading data from https://storage.googleapis.com/download.tensorflow.org/models/tflite/dataset/triviaqa-web-train-8000.json
32571392/32570663 [==============================] - 0s 0us/step
32579584/32570663 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/download.tensorflow.org/models/tflite/dataset/triviaqa-verified-web-dev.json
1171456/1167744 [==============================] - 0s 0us/step
1179648/1167744 [==============================] - 0s 0us/step

You can also train the MobileBERT model with your own dataset. If you are running this notebook on Colab, upload your data by using the left sidebar.


If you prefer not to upload your data to the cloud, you can also run the library offline by following the guide.

Use the DataLoader.from_squad method to load and preprocess the SQuAD format data according to a specific model_spec. You can use either the SQuAD2.0 or SQuAD1.1 format. Setting the version_2_with_negative parameter to True means the format is SQuAD2.0; otherwise, the format is SQuAD1.1. By default, version_2_with_negative is False.

train_data = DataLoader.from_squad(train_data_path, spec, is_training=True)
validation_data = DataLoader.from_squad(validation_data_path, spec, is_training=False)
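
For instance, if your training file were in SQuAD2.0 format, you could flip that flag when loading it. A minimal sketch, with a hypothetical file path:

squad2_train_data = DataLoader.from_squad(
    'my_squad2.0_train.json',    # hypothetical SQuAD2.0-format file
    spec,
    is_training=True,
    version_2_with_negative=True)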

Customize the TensorFlow model

Create a custom question answering model based on the loaded data. The create function comprises the following steps:

  1. Creates the question answering model according to model_spec.
  2. Trains the question answering model. The default epochs and the default batch size are set by the default_training_epochs and default_batch_size variables in the model_spec object.
model = question_answer.create(train_data, model_spec=spec)
INFO:tensorflow:Retraining the models...
Epoch 1/2
1067/1067 [==============================] - 423s 350ms/step - loss: 1.1346 - start_positions_loss: 1.1321 - end_positions_loss: 1.1371
Epoch 2/2
1067/1067 [==============================] - 373s 350ms/step - loss: 0.7933 - start_positions_loss: 0.7927 - end_positions_loss: 0.7939

Have a look at the detailed model structure.

model.summary()
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_word_ids (InputLayer)     [(None, 384)]        0                                            
__________________________________________________________________________________________________
input_mask (InputLayer)         [(None, 384)]        0                                            
__________________________________________________________________________________________________
input_type_ids (InputLayer)     [(None, 384)]        0                                            
__________________________________________________________________________________________________
hub_keras_layer_v1v2 (HubKerasL {'start_logits': (No 24582914    input_word_ids[0][0]             
                                                                 input_mask[0][0]                 
                                                                 input_type_ids[0][0]             
__________________________________________________________________________________________________
start_positions (Lambda)        (None, None)         0           hub_keras_layer_v1v2[0][1]       
__________________________________________________________________________________________________
end_positions (Lambda)          (None, None)         0           hub_keras_layer_v1v2[0][0]       
==================================================================================================
Total params: 24,582,914
Trainable params: 24,582,914
Non-trainable params: 0
__________________________________________________________________________________________________

Evaluate the customized model

Evaluate the model on the validation data and get a dict of metrics, including the f1 score and exact match, etc. Note that the metrics are different for SQuAD1.1 and SQuAD2.0.

model.evaluate(validation_data)
INFO:tensorflow:Made predictions for 200 records.
INFO:tensorflow:Made predictions for 400 records.
INFO:tensorflow:Made predictions for 600 records.
INFO:tensorflow:Made predictions for 800 records.
INFO:tensorflow:Made predictions for 1000 records.
INFO:tensorflow:Made predictions for 1200 records.
{'exact_match': 0.5884353741496599, 'final_f1': 0.6621698029861295}
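
The returned dict can be consumed directly. For example, using the key names shown in the output above:

metrics = model.evaluate(validation_data)
print('exact match: %.3f, F1: %.3f' % (metrics['exact_match'], metrics['final_f1']))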

Export to TensorFlow Lite model

Convert the trained model to the TensorFlow Lite model format with metadata so that you can later use it in an on-device ML application. The vocab file is embedded in the metadata. The default TFLite filename is model.tflite.

In many on-device ML applications, the model size is an important factor. Therefore, it is recommended that you quantize the model to make it smaller and potentially run faster. The default post-training quantization technique is dynamic range quantization for the BERT and MobileBERT models.

model.export(export_dir='.')
INFO:tensorflow:Assets written to: /tmp/tmp7t_bxd9h/saved_model/assets
2021-08-12 12:16:43.756590: I tensorflow/compiler/mlir/lite/flatbuffer_export.cc:1899] Estimated count of arithmetic ops: 18.380 G  ops, equivalently 9.190 G  MACs
2021-08-12 12:16:43.920701: I tensorflow/lite/tools/optimize/quantize_weights.cc:234] Skipping quantization of tensor bert/encoder/layer_0/attention/self/MatMul15 because it has no allocated buffer.
...
INFO:tensorflow:Vocab file is inside the TFLite model with metadata.
INFO:tensorflow:Saved vocabulary in /tmp/tmpjncdf_eu/vocab.txt.
INFO:tensorflow:Finished populating metadata and associated file to the model:
INFO:tensorflow:./model.tflite
INFO:tensorflow:The associated file that has been packed to the model is:
INFO:tensorflow:['vocab.txt']
INFO:tensorflow:TensorFlow Lite model exported successfully: ./model.tflite

You can use the TensorFlow Lite model file in the bert_qa reference app using the BertQuestionAnswerer API in the TensorFlow Lite Task Library by downloading it from the left sidebar on Colab.
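
As a minimal sketch of what inference with the exported model looks like through the Task Library's Python bindings (this assumes the separate tflite-support package is installed; the passage and question strings are made up):

from tflite_support.task import text

# The exported model carries its vocab in the metadata, so the file is all that is needed.
answerer = text.BertQuestionAnswerer.create_from_file('model.tflite')

context = ('TensorFlow Lite Model Maker simplifies the process of adapting and '
           'converting a TensorFlow model to particular input data.')
question = 'What does Model Maker simplify?'

result = answerer.answer(context, question)   # candidate answers, best first
print(result.answers[0].text)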

The allowed export formats can be one or a list of the following:

  • ExportFormat.TFLITE
  • ExportFormat.VOCAB
  • ExportFormat.SAVED_MODEL

By default, it just exports the TensorFlow Lite model with metadata. You can also selectively export different files. For instance, export only the vocab file as follows:

model.export(export_dir='.', export_format=ExportFormat.VOCAB)
INFO:tensorflow:Saved vocabulary in ./vocab.txt.

You can also evaluate the tflite model with the evaluate_tflite method. This step is expected to take a long time.

model.evaluate_tflite('model.tflite', validation_data)
INFO:tensorflow:Made predictions for 100 records.
INFO:tensorflow:Made predictions for 200 records.
INFO:tensorflow:Made predictions for 300 records.
INFO:tensorflow:Made predictions for 400 records.
INFO:tensorflow:Made predictions for 500 records.
INFO:tensorflow:Made predictions for 600 records.
INFO:tensorflow:Made predictions for 700 records.
INFO:tensorflow:Made predictions for 800 records.
INFO:tensorflow:Made predictions for 900 records.
INFO:tensorflow:Made predictions for 1000 records.
INFO:tensorflow:Made predictions for 1100 records.
INFO:tensorflow:Made predictions for 1200 records.
{'exact_match': 0.5918367346938775, 'final_f1': 0.6682598580557765}

Advanced Usage

The create function is the critical part of this library, in which the model_spec parameter defines the model specification. The BertQASpec class is currently supported. There are 2 models: the MobileBERT model and the BERT-Base model. The create function comprises the following steps:

  1. Creates the question answering model according to model_spec.
  2. Trains the question answering model.

This section describes several advanced topics, including adjusting the model, tuning the training hyperparameters, etc.

Adjust the model

You can adjust the model infrastructure, such as the parameters seq_len and query_len, in the BertQASpec class.

Adjustable parameters for the model:

  • seq_len: Length of the passage to feed into the model.
  • query_len: Length of the question to feed into the model.
  • doc_stride: The stride when doing a sliding window approach to take chunks of the documents.
  • initializer_range: The stdev of the truncated_normal_initializer for initializing all weight matrices.
  • trainable: Boolean, whether the pre-trained layer is trainable.

Adjustable parameters for the training pipeline (a short sketch of setting them follows the list):

  • model_dir: The location of the model checkpoint files. If not set, a temporary directory will be used.
  • dropout_rate: The rate for dropout.
  • learning_rate: The initial learning rate for Adam.
  • predict_batch_size: Batch size for prediction.
  • tpu: TPU address to connect to. Only used if using TPU.
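
These training parameters are plain attributes on the spec object, so a minimal sketch of tuning the pipeline (the values below are arbitrary illustrations, not recommendations) looks like:

tuned_spec = model_spec.get('mobilebert_qa_squad')
tuned_spec.learning_rate = 3e-5           # initial learning rate for Adam
tuned_spec.dropout_rate = 0.1             # dropout rate
tuned_spec.model_dir = 'qa_checkpoints'   # hypothetical checkpoint directory

model = question_answer.create(train_data, model_spec=tuned_spec)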

For example, you can train the model with a longer sequence length. If you change the model, you must first construct a new model_spec.

new_spec = model_spec.get('mobilebert_qa')
new_spec.seq_len = 512

The remaining steps are the same. Note that you must rerun both the dataloader and create parts, as different model specs may have different preprocessing steps.

Tune training hyperparameters

You can also tune training hyperparameters such as epochs and batch_size to impact model performance. For instance,

  • epochs: more epochs could achieve better performance, but may lead to overfitting.
  • batch_size: number of samples to use in one training step.

For example, you can train with more epochs and with a bigger batch size, like:

model = question_answer.create(train_data, model_spec=spec, epochs=5, batch_size=64)

Change the model architecture

You can change the base model your data trains on by changing the model_spec. For example, to switch to the BERT-Base model, run:

spec = model_spec.get('bert_qa')

The remaining steps are the same.

Customize post-training quantization on the TensorFlow Lite model

Post-training quantization is a conversion technique that can reduce model size and inference latency while also improving CPU and hardware accelerator inference speed, with little degradation in model accuracy. Thus, it's widely used to optimize the model.

The Model Maker library applies a default post-training quantization technique when exporting the model. If you want to customize post-training quantization, Model Maker supports multiple post-training quantization options using QuantizationConfig as well. Let's take float16 quantization as an instance. First, define the quantization config.

from tflite_model_maker.config import QuantizationConfig

config = QuantizationConfig.for_float16()

Then export the TensorFlow Lite model with this configuration.

model.export(export_dir='.', tflite_filename='model_fp16.tflite', quantization_config=config)

Read more

You can read our BERT Question and Answer example to learn the technical details. For more information, please refer to: