TensorFlow Hub

Introduction

TensorFlow Hub is a library to foster the publication, discovery, and consumption of reusable parts of machine learning models. A module is a self-contained piece of a TensorFlow graph, along with its weights and assets, that can be reused across different tasks in a process known as transfer learning.

Modules contain variables that have been pre-trained for a task using a large dataset. By reusing a module on a related task, you can:

  • train a model with a smaller dataset,
  • improve generalization, or
  • significantly speed up training.
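The reuse pattern behind these benefits can be sketched without TensorFlow at all. The toy example below is only an illustration under stated assumptions: `frozen_module` and `FROZEN_WEIGHTS` are hypothetical stand-ins for a pre-trained module whose weights stay fixed, and only a small logistic-regression "head" is trained on a four-example dataset. It is not how TensorFlow Hub itself is implemented.

```python
import math

# Hypothetical stand-in for a pre-trained module: these weights are fixed
# and are never updated while training on the new task.
FROZEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def frozen_module(x):
    """Map a 2-d input to 2 features using the fixed 'pre-trained' weights."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
            for row in FROZEN_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(examples, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression head on top of the frozen features."""
    head = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            feats = frozen_module(x)
            pred = sigmoid(sum(h * f for h, f in zip(head, feats)) + bias)
            err = pred - y  # gradient of the log loss w.r.t. the logit
            head = [h - lr * err * f for h, f in zip(head, feats)]
            bias -= lr * err
    return head, bias


def predict(head, bias, x):
    """Return True when the head assigns the positive class to x."""
    feats = frozen_module(x)
    return sigmoid(sum(h * f for h, f in zip(head, feats)) + bias) > 0.5


# A deliberately small dataset: the label is 1 when x[0] > x[1].
EXAMPLES = [[2.0, 0.0], [0.0, 2.0], [1.0, -1.0], [-1.0, 1.0]]
LABELS = [1, 0, 1, 0]

head, bias = train_head(EXAMPLES, LABELS)
print(predict(head, bias, [3.0, 1.0]), predict(head, bias, [1.0, 3.0]))
```

Because the frozen features already separate the two classes, only three parameters (two head weights and a bias) are learned, which is why so few examples are enough in this toy setting.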

Here's an example that uses an English embedding module to map an array of strings to their embeddings:

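import tensorflow as tf
import tensorflow_hub as hub

# Build the graph: load the module and apply it to a batch of strings.
with tf.Graph().as_default():
  embed = hub.Module("https://tfhub.dev/google/nnlm-en-dim128-with-normalization/1")
  embeddings = embed(["A long sentence.", "single-word", "http://example.com"])

  with tf.Session() as sess:
    # Initialize the module's variables and its lookup tables before use.
    sess.run(tf.global_variables_initializer())
    sess.run(tf.tables_initializer())

    # Evaluate the embeddings: one vector per input string.
    print(sess.run(embeddings))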

Getting Started

Additional Information

Fairness

As in all of machine learning, fairness is an important consideration. Modules are typically pre-trained on large datasets. When reusing such a module, it’s important to be mindful of what data it was trained on (and whether that data contains any existing biases), and how these might impact your downstream experiments.

Status

Although we hope to prevent breaking changes, this project is still under active development and is not yet guaranteed to have a stable API or module format.

Security

Since they contain arbitrary TensorFlow graphs, modules can be thought of as programs. The guide "Using TensorFlow Securely" describes the security implications of referencing a module from an untrusted source.
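One general precaution, not described in the source, is to check a downloaded module archive against a checksum obtained through a trusted channel before loading it. The sketch below uses only Python's standard library; the function names and the temporary stand-in file are hypothetical. A matching checksum only shows that the file is the one you expected. It does not make an untrusted publisher trustworthy.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_expected_archive(path, expected_sha256):
    """Compare the file's digest with a checksum from a trusted channel."""
    return hmac.compare_digest(sha256_of_file(path), expected_sha256.lower())


# Demonstration with a temporary file standing in for a module archive.
payload = b"stand-in for module archive bytes"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name

expected = hashlib.sha256(payload).hexdigest()
matches = is_expected_archive(demo_path, expected)   # same bytes
tampered = is_expected_archive(demo_path, "0" * 64)  # wrong checksum
os.remove(demo_path)
print(matches, tampered)
```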

Source-Code & Bug Reports

The source code is available on GitHub. Use GitHub issues for feature requests and bugs. Please see the TensorFlow Hub mailing list for general questions and discussion.