Transfer Learning for the Audio Domain with TensorFlow Lite Model Maker


In this colab notebook, you'll learn how to use the TensorFlow Lite Model Maker to train a custom audio classification model.

The Model Maker library uses transfer learning to simplify the process of training a TensorFlow Lite model using a custom dataset. Retraining a TensorFlow Lite model with your own custom dataset reduces the amount of training data and time required.

It is part of the Codelab to Customize an Audio model and deploy on Android.

You'll use a custom birds dataset and export a TFLite model that can be used on a phone, a TensorFlow.js model that can be used for inference in the browser, and a SavedModel version that you can use for serving.

Installing dependencies

Model Maker for the Audio domain needs TensorFlow 2.5 to work.

 pip install tflite-model-maker tensorflow==2.5

Import TensorFlow, Model Maker and other libraries

Among the dependencies that are needed, you'll use TensorFlow and Model Maker. Aside from those, the others are for audio manipulation, playback, and visualization.

import tensorflow as tf
import tflite_model_maker as mm
from tflite_model_maker import audio_classifier
import os

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import itertools
import glob
import random

from IPython.display import Audio, Image
from scipy.io import wavfile

print(f"TensorFlow Version: {tf.__version__}")
print(f"Model Maker Version: {mm.__version__}")
TensorFlow Version: 2.5.0
Model Maker Version: 0.3.2

The Birds dataset

The Birds dataset is an educational collection of songs from 5 types of birds:

  • White-breasted Wood-Wren
  • House Sparrow
  • Red Crossbill
  • Chestnut-crowned Antpitta
  • Azara's Spinetail

The original audio came from Xeno-canto, a website dedicated to sharing bird sounds from all over the world.

Let's start by downloading the data.

birds_dataset_folder = tf.keras.utils.get_file('birds_dataset.zip',
                                                'https://storage.googleapis.com/laurencemoroney-blog.appspot.com/birds_dataset.zip',
                                                cache_dir='./',
                                                cache_subdir='dataset',
                                                extract=True)
Downloading data from https://storage.googleapis.com/laurencemoroney-blog.appspot.com/birds_dataset.zip
343687168/343680986 [==============================] - 5s 0us/step

Explore the data

The audio files are already split into train and test folders. Inside each split folder, there's one folder per bird, named with its bird_code.

The audio files are all mono with a 16kHz sample rate.

For more information about each file, you can read the metadata.csv file. It contains each file's author, license, and some additional information. You won't need to read it yourself in this tutorial.
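If you want to confirm the audio format quickly, a minimal check like the one below works (the file picked is arbitrary; the path assumes the extraction folder used in the next cell):

import glob
from scipy.io import wavfile

# Read one arbitrary training file and confirm the format described above.
sample_file = glob.glob('./dataset/small_birds_dataset/train/*/*.wav')[0]
sample_rate, audio_data = wavfile.read(sample_file)
print(f'{sample_file}: {sample_rate} Hz, mono={audio_data.ndim == 1}')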

# @title [Run this] Util functions and data structures.

data_dir = './dataset/small_birds_dataset'

bird_code_to_name = {
  'wbwwre1': 'White-breasted Wood-Wren',
  'houspa': 'House Sparrow',
  'redcro': 'Red Crossbill',  
  'chcant2': 'Chestnut-crowned Antpitta',
  'azaspi1': "Azara's Spinetail",   
}

birds_images = {
  'wbwwre1': 'https://upload.wikimedia.org/wikipedia/commons/thumb/2/22/Henicorhina_leucosticta_%28Cucarachero_pechiblanco%29_-_Juvenil_%2814037225664%29.jpg/640px-Henicorhina_leucosticta_%28Cucarachero_pechiblanco%29_-_Juvenil_%2814037225664%29.jpg', #   Alejandro Bayer Tamayo from Armenia, Colombia 
  'houspa': 'https://upload.wikimedia.org/wikipedia/commons/thumb/5/52/House_Sparrow%2C_England_-_May_09.jpg/571px-House_Sparrow%2C_England_-_May_09.jpg', #    Diliff
  'redcro': 'https://upload.wikimedia.org/wikipedia/commons/thumb/4/49/Red_Crossbills_%28Male%29.jpg/640px-Red_Crossbills_%28Male%29.jpg', #  Elaine R. Wilson, www.naturespicsonline.com
  'chcant2': 'https://upload.wikimedia.org/wikipedia/commons/thumb/6/67/Chestnut-crowned_antpitta_%2846933264335%29.jpg/640px-Chestnut-crowned_antpitta_%2846933264335%29.jpg', #   Mike's Birds from Riverside, CA, US
  'azaspi1': 'https://upload.wikimedia.org/wikipedia/commons/thumb/b/b2/Synallaxis_azarae_76608368.jpg/640px-Synallaxis_azarae_76608368.jpg', # https://www.inaturalist.org/photos/76608368
}

test_files = os.path.abspath(os.path.join(data_dir, 'test/*/*.wav'))

def get_random_audio_file():
  test_list = glob.glob(test_files)
  random_audio_path = random.choice(test_list)
  return random_audio_path


def show_bird_data(audio_path):
  sample_rate, audio_data = wavfile.read(audio_path)

  bird_code = audio_path.split('/')[-2]
  print(f'Bird name: {bird_code_to_name[bird_code]}')
  print(f'Bird code: {bird_code}')
  display(Image(birds_images[bird_code]))

  plttitle = f'{bird_code_to_name[bird_code]} ({bird_code})'
  plt.title(plttitle)
  plt.plot(audio_data)
  display(Audio(audio_data, rate=sample_rate))

print('functions and data structures created')
functions and data structures created

Playing some audio

To get a better understanding of the data, let's listen to a random audio file from the test split.

random_audio = get_random_audio_file()
show_bird_data(random_audio)
Bird name: Chestnut-crowned Antpitta
Bird code: chcant2

(bird photo and waveform plot displayed here)

Training the Model

When using Model Maker for audio, you have to start with a model spec. This is the base model that your new model will extract information from to learn about the new classes. It also affects how the dataset will be transformed to respect the model spec's parameters, like sample rate and number of channels.

YAMNet is an audio event classifier trained on the AudioSet dataset to predict audio events from the AudioSet ontology.

Its input is expected to be 16kHz audio with 1 channel.

You don't need to do any resampling yourself. Model Maker takes care of that for you.

  • frame_length decides how long each training sample is. In this case, each sample is 6 * EXPECTED_WAVEFORM_LENGTH samples long.

  • frame_step decides how far apart consecutive training samples start. In this case, the ith sample starts 3 * EXPECTED_WAVEFORM_LENGTH samples after the (i-1)th sample, so consecutive samples overlap.

The reason to set these values is to work around limitations of real-world datasets.

For example, in the bird dataset, birds don't sing all the time. They sing, rest, and sing again, with noises in between. Having a long frame helps capture the singing, but setting it too long reduces the number of samples for training.
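To make the windowing concrete, here is a minimal sketch (not Model Maker's internal code) of how overlapping windows get carved out of a clip. 15600 is YAMNet's expected waveform length in samples (0.975s at 16kHz):

EXPECTED_WAVEFORM_LENGTH = 15600  # YAMNet's input size: 0.975s at 16kHz

frame_length = 6 * EXPECTED_WAVEFORM_LENGTH  # each training sample is ~5.85s long
frame_step = 3 * EXPECTED_WAVEFORM_LENGTH    # consecutive samples start ~2.9s apart

# A fake 30s mono clip at 16kHz, just to count the resulting windows.
fake_clip = tf.zeros([30 * 16000])
windows = tf.signal.frame(fake_clip, frame_length, frame_step, pad_end=True)
print(windows.shape)  # (11, 93600): overlapping windows, the last ones zero-padded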

spec = audio_classifier.YamNetSpec(
    keep_yamnet_and_custom_heads=True,
    frame_step=3 * audio_classifier.YamNetSpec.EXPECTED_WAVEFORM_LENGTH,
    frame_length=6 * audio_classifier.YamNetSpec.EXPECTED_WAVEFORM_LENGTH)
INFO:tensorflow:Checkpoints are stored in /tmp/tmpb9s06uc_

Loading the data

Model Maker has an API to load the data from a folder and put it in the format expected by the model spec.

The train and test splits are based on the folders. The validation dataset will be created as 20% of the train split.

train_data = audio_classifier.DataLoader.from_folder(
    spec, os.path.join(data_dir, 'train'), cache=True)
train_data, validation_data = train_data.split(0.8)
test_data = audio_classifier.DataLoader.from_folder(
    spec, os.path.join(data_dir, 'test'), cache=True)
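A quick sanity check on the splits (a small sketch; it assumes the DataLoader supports len(), and index_to_label is the same attribute used later in this tutorial):

print(f'Training samples: {len(train_data)}')
print(f'Validation samples: {len(validation_data)}')
print(f'Test samples: {len(test_data)}')
print(f'Classes: {train_data.index_to_label}')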

Training the model

The audio_classifier module has a create method that creates a model and already starts training it.

You can customize many parameters; for more details, read the documentation.

On this first try you'll use the default configuration and train for 100 epochs with a batch size of 128.

batch_size = 128
epochs = 100

print('Training the model')
model = audio_classifier.create(
    train_data,
    spec,
    validation_data,
    batch_size=batch_size,
    epochs=epochs)
Training the model
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
classification_head (Dense)  (None, 5)                 5125      
=================================================================
Total params: 5,125
Trainable params: 5,125
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
21/21 [==============================] - 49s 2s/step - loss: 1.5436 - acc: 0.3267 - val_loss: 1.3580 - val_acc: 0.4677
Epoch 2/100
21/21 [==============================] - 0s 22ms/step - loss: 1.2939 - acc: 0.4795 - val_loss: 1.1890 - val_acc: 0.6200
Epoch 3/100
21/21 [==============================] - 1s 24ms/step - loss: 1.1496 - acc: 0.5800 - val_loss: 1.0764 - val_acc: 0.6784
Epoch 4/100
21/21 [==============================] - 0s 22ms/step - loss: 1.0317 - acc: 0.6549 - val_loss: 0.9985 - val_acc: 0.7174
Epoch 5/100
21/21 [==============================] - 0s 21ms/step - loss: 0.9562 - acc: 0.6835 - val_loss: 0.9408 - val_acc: 0.7479
Epoch 6/100
21/21 [==============================] - 0s 21ms/step - loss: 0.8858 - acc: 0.7226 - val_loss: 0.9004 - val_acc: 0.7564
Epoch 7/100
21/21 [==============================] - 0s 22ms/step - loss: 0.8362 - acc: 0.7426 - val_loss: 0.8671 - val_acc: 0.7381
Epoch 8/100
21/21 [==============================] - 0s 21ms/step - loss: 0.7918 - acc: 0.7572 - val_loss: 0.8422 - val_acc: 0.7272
Epoch 9/100
21/21 [==============================] - 0s 21ms/step - loss: 0.7533 - acc: 0.7693 - val_loss: 0.8217 - val_acc: 0.7174
Epoch 10/100
21/21 [==============================] - 0s 22ms/step - loss: 0.7272 - acc: 0.7738 - val_loss: 0.8074 - val_acc: 0.7065
Epoch 11/100
21/21 [==============================] - 0s 22ms/step - loss: 0.7011 - acc: 0.7866 - val_loss: 0.7961 - val_acc: 0.7016
Epoch 12/100
21/21 [==============================] - 0s 22ms/step - loss: 0.6673 - acc: 0.8058 - val_loss: 0.7868 - val_acc: 0.6845
Epoch 13/100
21/21 [==============================] - 0s 22ms/step - loss: 0.6464 - acc: 0.8084 - val_loss: 0.7805 - val_acc: 0.6833
Epoch 14/100
21/21 [==============================] - 0s 22ms/step - loss: 0.6321 - acc: 0.8077 - val_loss: 0.7762 - val_acc: 0.6748
Epoch 15/100
21/21 [==============================] - 0s 21ms/step - loss: 0.6132 - acc: 0.8160 - val_loss: 0.7704 - val_acc: 0.6724
Epoch 16/100
21/21 [==============================] - 0s 22ms/step - loss: 0.6020 - acc: 0.8201 - val_loss: 0.7678 - val_acc: 0.6663
Epoch 17/100
21/21 [==============================] - 0s 22ms/step - loss: 0.5800 - acc: 0.8227 - val_loss: 0.7628 - val_acc: 0.6699
Epoch 18/100
21/21 [==============================] - 0s 22ms/step - loss: 0.5681 - acc: 0.8257 - val_loss: 0.7618 - val_acc: 0.6590
Epoch 19/100
21/21 [==============================] - 0s 22ms/step - loss: 0.5534 - acc: 0.8359 - val_loss: 0.7600 - val_acc: 0.6602
Epoch 20/100
21/21 [==============================] - 0s 22ms/step - loss: 0.5458 - acc: 0.8254 - val_loss: 0.7600 - val_acc: 0.6590
Epoch 21/100
21/21 [==============================] - 0s 22ms/step - loss: 0.5368 - acc: 0.8382 - val_loss: 0.7573 - val_acc: 0.6602
Epoch 22/100
21/21 [==============================] - 0s 22ms/step - loss: 0.5298 - acc: 0.8393 - val_loss: 0.7579 - val_acc: 0.6577
Epoch 23/100
21/21 [==============================] - 0s 22ms/step - loss: 0.5167 - acc: 0.8495 - val_loss: 0.7569 - val_acc: 0.6577
Epoch 24/100
21/21 [==============================] - 0s 21ms/step - loss: 0.5080 - acc: 0.8453 - val_loss: 0.7549 - val_acc: 0.6590
Epoch 25/100
21/21 [==============================] - 0s 22ms/step - loss: 0.4954 - acc: 0.8521 - val_loss: 0.7560 - val_acc: 0.6553
Epoch 26/100
21/21 [==============================] - 0s 22ms/step - loss: 0.4891 - acc: 0.8525 - val_loss: 0.7585 - val_acc: 0.6590
Epoch 27/100
21/21 [==============================] - 1s 24ms/step - loss: 0.4860 - acc: 0.8536 - val_loss: 0.7577 - val_acc: 0.6565
Epoch 28/100
21/21 [==============================] - 0s 21ms/step - loss: 0.4785 - acc: 0.8521 - val_loss: 0.7590 - val_acc: 0.6541
Epoch 29/100
21/21 [==============================] - 0s 21ms/step - loss: 0.4702 - acc: 0.8589 - val_loss: 0.7585 - val_acc: 0.6590
Epoch 30/100
21/21 [==============================] - 0s 21ms/step - loss: 0.4712 - acc: 0.8540 - val_loss: 0.7603 - val_acc: 0.6577
Epoch 31/100
21/21 [==============================] - 0s 21ms/step - loss: 0.4598 - acc: 0.8600 - val_loss: 0.7633 - val_acc: 0.6602
Epoch 32/100
21/21 [==============================] - 0s 21ms/step - loss: 0.4538 - acc: 0.8615 - val_loss: 0.7655 - val_acc: 0.6541
Epoch 33/100
21/21 [==============================] - 0s 21ms/step - loss: 0.4509 - acc: 0.8615 - val_loss: 0.7649 - val_acc: 0.6577
Epoch 34/100
21/21 [==============================] - 0s 21ms/step - loss: 0.4381 - acc: 0.8679 - val_loss: 0.7635 - val_acc: 0.6565
Epoch 35/100
21/21 [==============================] - 0s 21ms/step - loss: 0.4362 - acc: 0.8694 - val_loss: 0.7664 - val_acc: 0.6602
Epoch 36/100
21/21 [==============================] - 1s 23ms/step - loss: 0.4353 - acc: 0.8585 - val_loss: 0.7633 - val_acc: 0.6590
Epoch 37/100
21/21 [==============================] - 0s 22ms/step - loss: 0.4203 - acc: 0.8664 - val_loss: 0.7650 - val_acc: 0.6553
Epoch 38/100
21/21 [==============================] - 0s 22ms/step - loss: 0.4298 - acc: 0.8660 - val_loss: 0.7666 - val_acc: 0.6565
Epoch 39/100
21/21 [==============================] - 0s 21ms/step - loss: 0.4214 - acc: 0.8705 - val_loss: 0.7673 - val_acc: 0.6577
Epoch 40/100
21/21 [==============================] - 0s 21ms/step - loss: 0.4201 - acc: 0.8653 - val_loss: 0.7677 - val_acc: 0.6590
Epoch 41/100
21/21 [==============================] - 0s 22ms/step - loss: 0.4118 - acc: 0.8705 - val_loss: 0.7670 - val_acc: 0.6565
Epoch 42/100
21/21 [==============================] - 0s 21ms/step - loss: 0.4030 - acc: 0.8799 - val_loss: 0.7683 - val_acc: 0.6590
Epoch 43/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3988 - acc: 0.8762 - val_loss: 0.7711 - val_acc: 0.6590
Epoch 44/100
21/21 [==============================] - 0s 21ms/step - loss: 0.4010 - acc: 0.8773 - val_loss: 0.7722 - val_acc: 0.6577
Epoch 45/100
21/21 [==============================] - 0s 21ms/step - loss: 0.4047 - acc: 0.8743 - val_loss: 0.7752 - val_acc: 0.6602
Epoch 46/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3962 - acc: 0.8709 - val_loss: 0.7755 - val_acc: 0.6590
Epoch 47/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3944 - acc: 0.8724 - val_loss: 0.7813 - val_acc: 0.6626
Epoch 48/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3907 - acc: 0.8818 - val_loss: 0.7789 - val_acc: 0.6626
Epoch 49/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3783 - acc: 0.8792 - val_loss: 0.7822 - val_acc: 0.6614
Epoch 50/100
21/21 [==============================] - 1s 24ms/step - loss: 0.3770 - acc: 0.8822 - val_loss: 0.7847 - val_acc: 0.6577
Epoch 51/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3742 - acc: 0.8886 - val_loss: 0.7897 - val_acc: 0.6602
Epoch 52/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3700 - acc: 0.8897 - val_loss: 0.7900 - val_acc: 0.6638
Epoch 53/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3693 - acc: 0.8799 - val_loss: 0.7970 - val_acc: 0.6590
Epoch 54/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3698 - acc: 0.8875 - val_loss: 0.7987 - val_acc: 0.6565
Epoch 55/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3732 - acc: 0.8830 - val_loss: 0.8003 - val_acc: 0.6577
Epoch 56/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3641 - acc: 0.8912 - val_loss: 0.8000 - val_acc: 0.6590
Epoch 57/100
21/21 [==============================] - 0s 23ms/step - loss: 0.3667 - acc: 0.8811 - val_loss: 0.8048 - val_acc: 0.6565
Epoch 58/100
21/21 [==============================] - 0s 23ms/step - loss: 0.3611 - acc: 0.8878 - val_loss: 0.8086 - val_acc: 0.6553
Epoch 59/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3568 - acc: 0.8909 - val_loss: 0.8079 - val_acc: 0.6577
Epoch 60/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3596 - acc: 0.8886 - val_loss: 0.8063 - val_acc: 0.6590
Epoch 61/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3538 - acc: 0.8946 - val_loss: 0.8061 - val_acc: 0.6577
Epoch 62/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3573 - acc: 0.8905 - val_loss: 0.8049 - val_acc: 0.6577
Epoch 63/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3568 - acc: 0.8875 - val_loss: 0.8026 - val_acc: 0.6614
Epoch 64/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3461 - acc: 0.8860 - val_loss: 0.8126 - val_acc: 0.6590
Epoch 65/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3502 - acc: 0.8920 - val_loss: 0.8133 - val_acc: 0.6590
Epoch 66/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3498 - acc: 0.8893 - val_loss: 0.8178 - val_acc: 0.6553
Epoch 67/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3436 - acc: 0.8788 - val_loss: 0.8198 - val_acc: 0.6565
Epoch 68/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3372 - acc: 0.8995 - val_loss: 0.8164 - val_acc: 0.6602
Epoch 69/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3432 - acc: 0.8867 - val_loss: 0.8216 - val_acc: 0.6553
Epoch 70/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3345 - acc: 0.8961 - val_loss: 0.8202 - val_acc: 0.6553
Epoch 71/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3366 - acc: 0.8973 - val_loss: 0.8299 - val_acc: 0.6577
Epoch 72/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3367 - acc: 0.8973 - val_loss: 0.8232 - val_acc: 0.6590
Epoch 73/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3342 - acc: 0.8924 - val_loss: 0.8332 - val_acc: 0.6553
Epoch 74/100
21/21 [==============================] - 1s 25ms/step - loss: 0.3310 - acc: 0.8961 - val_loss: 0.8306 - val_acc: 0.6565
Epoch 75/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3345 - acc: 0.8950 - val_loss: 0.8350 - val_acc: 0.6590
Epoch 76/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3359 - acc: 0.8886 - val_loss: 0.8325 - val_acc: 0.6577
Epoch 77/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3290 - acc: 0.8939 - val_loss: 0.8317 - val_acc: 0.6602
Epoch 78/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3325 - acc: 0.8961 - val_loss: 0.8412 - val_acc: 0.6590
Epoch 79/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3316 - acc: 0.8920 - val_loss: 0.8401 - val_acc: 0.6577
Epoch 80/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3319 - acc: 0.8976 - val_loss: 0.8398 - val_acc: 0.6577
Epoch 81/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3279 - acc: 0.8935 - val_loss: 0.8364 - val_acc: 0.6614
Epoch 82/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3270 - acc: 0.8893 - val_loss: 0.8469 - val_acc: 0.6577
Epoch 83/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3212 - acc: 0.8920 - val_loss: 0.8512 - val_acc: 0.6565
Epoch 84/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3276 - acc: 0.8973 - val_loss: 0.8528 - val_acc: 0.6577
Epoch 85/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3305 - acc: 0.8878 - val_loss: 0.8530 - val_acc: 0.6577
Epoch 86/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3132 - acc: 0.9021 - val_loss: 0.8465 - val_acc: 0.6602
Epoch 87/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3129 - acc: 0.8965 - val_loss: 0.8530 - val_acc: 0.6590
Epoch 88/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3147 - acc: 0.8916 - val_loss: 0.8476 - val_acc: 0.6602
Epoch 89/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3248 - acc: 0.8830 - val_loss: 0.8564 - val_acc: 0.6590
Epoch 90/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3129 - acc: 0.8939 - val_loss: 0.8500 - val_acc: 0.6626
Epoch 91/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3162 - acc: 0.8924 - val_loss: 0.8585 - val_acc: 0.6614
Epoch 92/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3071 - acc: 0.8984 - val_loss: 0.8640 - val_acc: 0.6602
Epoch 93/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3094 - acc: 0.8976 - val_loss: 0.8658 - val_acc: 0.6602
Epoch 94/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3029 - acc: 0.8988 - val_loss: 0.8623 - val_acc: 0.6590
Epoch 95/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3085 - acc: 0.8976 - val_loss: 0.8666 - val_acc: 0.6614
Epoch 96/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3155 - acc: 0.8939 - val_loss: 0.8631 - val_acc: 0.6602
Epoch 97/100
21/21 [==============================] - 0s 22ms/step - loss: 0.3118 - acc: 0.8976 - val_loss: 0.8695 - val_acc: 0.6602
Epoch 98/100
21/21 [==============================] - 1s 23ms/step - loss: 0.3074 - acc: 0.8957 - val_loss: 0.8638 - val_acc: 0.6614
Epoch 99/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3121 - acc: 0.8954 - val_loss: 0.8693 - val_acc: 0.6590
Epoch 100/100
21/21 [==============================] - 0s 21ms/step - loss: 0.3107 - acc: 0.8965 - val_loss: 0.8754 - val_acc: 0.6614

The accuracy looks good, but it's important to run the evaluation step on the test data and verify that your model achieves good results on unseen data.

print('Evaluating the model')
model.evaluate(test_data)
Evaluating the model
28/28 [==============================] - 12s 405ms/step - loss: 0.8017 - acc: 0.7968
[0.8017078042030334, 0.796785295009613]
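The two returned values are the test loss (0.80) and the accuracy (0.80), so the model classifies roughly 80% of the unseen test windows correctly.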

Understanding your model

When training a classifier, it's useful to see the confusion matrix. The confusion matrix gives you detailed knowledge of how your classifier is performing on test data.

Model Maker already creates the confusion matrix for you.

def show_confusion_matrix(confusion, test_labels):
  """Normalize the confusion matrix and plot it as a heatmap."""
  # Normalize each row (one row per true label) so it sums to 1.
  confusion_normalized = confusion.astype("float") / confusion.sum(axis=1)[:, np.newaxis]
  axis_labels = test_labels
  ax = sns.heatmap(
      confusion_normalized, xticklabels=axis_labels, yticklabels=axis_labels,
      cmap='Blues', annot=True, fmt='.2f', square=True)
  plt.title("Confusion matrix")
  plt.ylabel("True label")
  plt.xlabel("Predicted label")

confusion_matrix = model.confusion_matrix(test_data)
show_confusion_matrix(confusion_matrix.numpy(), test_data.index_to_label)

(normalized confusion matrix heatmap shown here)

Testing the model [Optional]

You can try the model on a sample audio from the test dataset just to see the results.

First you get the serving model.

serving_model = model.create_serving_model()

print(f'Model\'s input shape and type: {serving_model.inputs}')
print(f'Model\'s output shape and type: {serving_model.outputs}')
Model's input shape and type: [<KerasTensor: shape=(None, 15600) dtype=float32 (created by layer 'audio')>]
Model's output shape and type: [<KerasTensor: shape=(1, 521) dtype=float32 (created by layer 'keras_layer')>, <KerasTensor: shape=(1, 5) dtype=float32 (created by layer 'sequential')>]
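The first output is YAMNet's original 521-class AudioSet score vector, and the second is the 5-class bird head you've just trained. Both are present because the spec was created with keep_yamnet_and_custom_heads=True.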

Coming back to the random audio you loaded earlier:

# run this cell again if you want to try a different file
random_audio = get_random_audio_file()
show_bird_data(random_audio)
Bird name: Azara's Spinetail
Bird code: azaspi1

(bird photo and waveform plot displayed here)

The model you created has a fixed-size input window.

For a given audio file, you'll have to split it into windows of data of the expected size. The last window might need to be padded with zeros.

sample_rate, audio_data = wavfile.read(random_audio)

# Normalize the int16 samples to [-1, 1] floats, which is what the model expects.
audio_data = np.array(audio_data) / tf.int16.max
input_size = serving_model.input_shape[1]

splitted_audio_data = tf.signal.frame(audio_data, input_size, input_size, pad_end=True, pad_value=0)

print(f'Test audio path: {random_audio}')
print(f'Original size of the audio data: {len(audio_data)}')
print(f'Number of windows for inference: {len(splitted_audio_data)}')
Test audio path: /tmpfs/src/temp/tensorflow/lite/g3doc/tutorials/dataset/small_birds_dataset/test/azaspi1/XC513872.wav
Original size of the audio data: 894944
Number of windows for inference: 58
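As a sanity check: the input size is 15600 samples per window, so splitting 894944 samples with zero-padding gives ceil(894944 / 15600) = 58 windows, matching the output above.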

You'll loop over all the split windows and apply the model to each one of them.

The model you've just trained has 2 outputs: the original YAMNet output and the one you've just trained. This is important because a real-world environment is more complicated than just bird sounds. You can use YAMNet's output to filter out non-relevant audio. For example, on the birds use case, if YAMNet is not classifying Birds or Animals, this might show that the output from your model has an irrelevant classification.

Below, both outputs are printed to make it easier to understand their relationship. Most of the mistakes your model makes happen when YAMNet's prediction is not related to your domain (e.g. birds).
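A minimal sketch of that filtering idea (the label names come from the AudioSet ontology, and the threshold is just an illustrative value):

RELEVANT_YAMNET_CLASSES = {'Animal', 'Bird', 'Bird vocalization, bird call, tweet'}

def is_relevant_window(yamnet_scores, yamnet_labels, threshold=0.3):
  """Keep a window only if YAMNet's top class suggests animal or bird sounds."""
  top_index = int(tf.argmax(yamnet_scores))
  top_label = yamnet_labels[top_index]
  return top_label in RELEVANT_YAMNET_CLASSES and float(yamnet_scores[top_index]) >= threshold

# Example usage inside the loop below: is_relevant_window(yamnet_output[0], spec._yamnet_labels())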

print(random_audio)

results = []
print('Result of the window i:  your model class -> score,  (YAMNet class -> score)')
for i, data in enumerate(splitted_audio_data):
  yamnet_output, inference = serving_model(data)
  results.append(inference[0].numpy())
  result_index = tf.argmax(inference[0])
  spec_result_index = tf.argmax(yamnet_output[0])
  result_str = f'Result of the window {i}: ' \
  f'\t{test_data.index_to_label[result_index]} -> {inference[0][result_index].numpy():.3f}, ' \
  f'\t({spec._yamnet_labels()[spec_result_index]} -> {yamnet_output[0][spec_result_index]:.3f})'
  print(result_str)


results_np = np.array(results)
mean_results = results_np.mean(axis=0)
result_index = mean_results.argmax()
print(f'Mean result: {test_data.index_to_label[result_index]} -> {mean_results[result_index]}')
/tmpfs/src/temp/tensorflow/lite/g3doc/tutorials/dataset/small_birds_dataset/test/azaspi1/XC513872.wav
Result of the window i:  your model class -> score,  (YAMNet class -> score)
Result of the window 0:   chcant2 -> 0.969,    (Outside, rural or natural -> 0.364)
Result of the window 1:   azaspi1 -> 0.606,    (Animal -> 0.441)
Result of the window 2:   chcant2 -> 0.608,    (Outside, rural or natural -> 0.338)
Result of the window 3:   chcant2 -> 0.759,    (Outside, rural or natural -> 0.411)
Result of the window 4:   azaspi1 -> 0.588,    (Animal -> 0.636)
Result of the window 5:   azaspi1 -> 0.525,    (Animal -> 0.692)
Result of the window 6:   azaspi1 -> 0.969,    (Animal -> 0.703)
Result of the window 7:   chcant2 -> 0.973,    (Outside, rural or natural -> 0.249)
Result of the window 8:   azaspi1 -> 0.956,    (Animal -> 0.626)
Result of the window 9:   chcant2 -> 0.867,    (Outside, rural or natural -> 0.451)
Result of the window 10:  chcant2 -> 0.792,    (Animal -> 0.267)
Result of the window 11:  azaspi1 -> 0.621,    (Outside, rural or natural -> 0.310)
Result of the window 12:  azaspi1 -> 0.955,    (Animal -> 0.598)
Result of the window 13:  azaspi1 -> 0.911,    (Animal -> 0.724)
Result of the window 14:  azaspi1 -> 0.368,    (Outside, rural or natural -> 0.347)
Result of the window 15:  azaspi1 -> 0.718,    (Animal -> 0.465)
Result of the window 16:  chcant2 -> 0.875,    (Ocean -> 0.306)
Result of the window 17:  azaspi1 -> 0.645,    (Animal -> 0.379)
Result of the window 18:  chcant2 -> 0.985,    (Stream -> 0.710)
Result of the window 19:  azaspi1 -> 0.958,    (Animal -> 0.574)
Result of the window 20:  chcant2 -> 0.867,    (Outside, rural or natural -> 0.425)
Result of the window 21:  azaspi1 -> 0.884,    (Animal -> 0.478)
Result of the window 22:  chcant2 -> 0.991,    (Water -> 0.549)
Result of the window 23:  chcant2 -> 0.709,    (Outside, rural or natural -> 0.404)
Result of the window 24:  chcant2 -> 0.642,    (Outside, rural or natural -> 0.606)
Result of the window 25:  azaspi1 -> 0.972,    (Animal -> 0.715)
Result of the window 26:  azaspi1 -> 0.810,    (Animal -> 0.477)
Result of the window 27:  chcant2 -> 0.882,    (Rustling leaves -> 0.300)
Result of the window 28:  chcant2 -> 0.788,    (Outside, rural or natural -> 0.439)
Result of the window 29:  chcant2 -> 0.983,    (Water -> 0.221)
Result of the window 30:  wbwwre1 -> 0.810,    (Outside, rural or natural -> 0.333)
Result of the window 31:  chcant2 -> 0.576,    (Outside, rural or natural -> 0.492)
Result of the window 32:  redcro -> 0.988,     (Outside, rural or natural -> 0.503)
Result of the window 33:  chcant2 -> 0.933,    (Outside, rural or natural -> 0.394)
Result of the window 34:  azaspi1 -> 0.757,    (Animal -> 0.746)
Result of the window 35:  redcro -> 0.558,     (Rustling leaves -> 0.723)
Result of the window 36:  azaspi1 -> 0.995,    (Animal -> 0.907)
Result of the window 37:  redcro -> 0.774,     (Outside, rural or natural -> 0.478)
Result of the window 38:  azaspi1 -> 0.981,    (Animal -> 0.751)
Result of the window 39:  azaspi1 -> 0.814,    (Animal -> 0.508)
Result of the window 40:  houspa -> 0.913,     (Animal -> 0.709)
Result of the window 41:  azaspi1 -> 0.976,    (Animal -> 0.636)
Result of the window 42:  chcant2 -> 0.884,    (Wind -> 0.586)
Result of the window 43:  azaspi1 -> 0.999,    (Animal -> 0.791)
Result of the window 44:  chcant2 -> 0.972,    (Water -> 0.414)
Result of the window 45:  chcant2 -> 0.990,    (Water -> 0.540)
Result of the window 46:  azaspi1 -> 0.472,    (Animal -> 0.556)
Result of the window 47:  chcant2 -> 0.993,    (Water -> 0.176)
Result of the window 48:  azaspi1 -> 0.878,    (Animal -> 0.630)
Result of the window 49:  chcant2 -> 0.515,    (Outside, rural or natural -> 0.500)
Result of the window 50:  chcant2 -> 0.995,    (Stream -> 0.626)
Result of the window 51:  azaspi1 -> 0.420,    (Animal -> 0.822)
Result of the window 52:  chcant2 -> 0.987,    (Pour -> 0.361)
Result of the window 53:  azaspi1 -> 0.991,    (Animal -> 0.648)
Result of the window 54:  chcant2 -> 0.964,    (Bee, wasp, etc. -> 0.881)
Result of the window 55:  azaspi1 -> 0.463,    (Animal -> 0.788)
Result of the window 56:  chcant2 -> 0.524,    (Pink noise -> 0.281)
Result of the window 57:  chcant2 -> 0.684,    (Pink noise -> 0.316)
Mean result: chcant2 -> 0.4399028718471527

Exporting the model

The last step is exporting your model to be used on embedded devices or in the browser.

The export method exports both formats for you.

models_path = './birds_models'
print(f'Exporting the TFLite model to {models_path}')

model.export(models_path, tflite_filename='my_birds_model.tflite')
Exporting the TFLite model to ./birds_models
WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
INFO:tensorflow:Assets written to: /tmp/tmpvml0r1hi/assets
INFO:tensorflow:TensorFlow Lite model exported successfully: ./birds_models/my_birds_model.tflite

You can also export the SavedModel version for serving or for use in a Python environment.

model.export(models_path, export_format=[mm.ExportFormat.SAVED_MODEL, mm.ExportFormat.LABEL])
WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
INFO:tensorflow:Assets written to: ./birds_models/saved_model/assets
INFO:tensorflow:Saving labels in ./birds_models/labels.txt
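You can load the exported SavedModel back in Python to double-check it before serving. A minimal sketch (the 'serving_default' signature name and the 'audio' input name are assumptions based on the serving model inspected earlier):

loaded = tf.saved_model.load('./birds_models/saved_model')
serving_fn = loaded.signatures['serving_default']

# One window of 15600 float32 samples (~0.975s of 16kHz audio), with a batch dimension.
window = tf.zeros([1, 15600], dtype=tf.float32)
outputs = serving_fn(audio=window)  # 'audio' is the input layer name printed earlier
print({name: t.shape for name, t in outputs.items()})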

Next Steps

You did it.

Now your new model can be deployed on mobile devices using the TFLite AudioClassifier Task API.
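Before integrating it into an app, you can also sanity-check the exported .tflite file with the TF Lite Python interpreter. A minimal sketch, fed with silence:

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='./birds_models/my_birds_model.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]

# Feed one window of silence shaped like the model's input tensor.
dummy = np.zeros(input_details['shape'], dtype=np.float32)
interpreter.set_tensor(input_details['index'], dummy)
interpreter.invoke()

for detail in interpreter.get_output_details():
  print(detail['name'], interpreter.get_tensor(detail['index']).shape)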

You can also try the same process with your own data with different classes; here is the documentation for Model Maker for Audio Classification.

Also learn from end-to-end reference apps: Android, iOS.