To contribute models to tfhub.dev, you must provide documentation in Markdown. For a full overview of the process of adding models to tfhub.dev, see the contribute a model guide.
Types of Markdown documentation
There are 3 types of Markdown documentation used in tfhub.dev:
- Publisher Markdown - contains information about a publisher (learn more in the become a publisher guide).
- Model Markdown - contains information about a specific model.
- Collection Markdown - contains information about a publisher-defined collection of models (learn more in the create a collection guide).
The following content organization is recommended when contributing to the TensorFlow Hub GitHub repository:
- each publisher directory is in the
- each publisher directory contains optional
- each model should have its own directory under
- each collection should have its own directory under
Publisher and collection Markdowns are unversioned, while models can have different versions. Each model version requires a separate Markdown file named after the version it describes (e.g. 1.md, 2.md).
All model versions for a given model should be located in the model directory.
Below is an illustration of how the Markdown content is organized:
assets
├── publisher_name_a
│   ├── publisher_name_a.md  -> Documentation of the publisher.
│   └── models
│       └── model            -> Model name with slashes encoded as sub-path.
│           ├── 1.md         -> Documentation of the model version 1.
│           └── 2.md         -> Documentation of the model version 2.
├── publisher_name_b
│   ├── publisher_name_b.md  -> Documentation of the publisher.
│   ├── models
│   │   └── ...
│   └── collections
│       └── collection       -> Documentation for the collection feature.
│           └── 1.md
├── publisher_name_c
│   └── ...
└── ...
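The layout illustrated above can be expressed as a small path helper. This is only a sketch: the function name `model_doc_path` is hypothetical and not part of any TensorFlow Hub tooling, and it assumes the root directory is `assets` as shown in the illustration.

```python
from pathlib import PurePosixPath


def model_doc_path(publisher: str, model: str, version: int) -> PurePosixPath:
    """Return the assumed location of a model version's Markdown file.

    Hypothetical helper that mirrors the directory illustration.
    """
    # Slashes in the model name are encoded as sub-directories.
    return PurePosixPath(
        "assets", publisher, "models", *model.split("/"), f"{version}.md"
    )


print(model_doc_path("publisher_name_a", "model", 1))
# assets/publisher_name_a/models/model/1.md
```

A model name that contains slashes, such as the made-up name `a/b`, would map to `assets/<publisher>/models/a/b/<version>.md` under the same assumption.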
Model page specific Markdown format
The model documentation is a Markdown file with some add-on syntax. See the minimal example below, or a more realistic example Markdown file.
High-quality model documentation contains code snippets, information on how the model was trained, and its intended usage. You should also make use of the model-specific metadata properties explained below so users can find your models on tfhub.dev faster.
# Module google/text-embedding-model/1
Simple one sentence description.

<!-- asset-path: https://path/to/text-embedding-model/model.tar.gz -->
<!-- module-type: text-embedding -->
<!-- fine-tunable: true -->
<!-- format: saved_model_2 -->

## Overview

Here we give more information about the model including how it was trained, expected use cases, and code snippets demonstrating how to use the model:

```
# Code snippet demonstrating use (e.g. for a TF model using the tensorflow_hub library)

import tensorflow_hub as hub

model = hub.KerasLayer(<model name>)
inputs = ...
output = model(inputs)
```
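To make the structure of the minimal example concrete, the sketch below assembles the same pieces (title line, one-sentence description, metadata comments, overview) from plain Python values. The function `render_model_markdown` is a hypothetical helper written for illustration; tfhub.dev does not require or provide it.

```python
def render_model_markdown(handle: str, description: str, metadata: dict, overview: str) -> str:
    """Assemble a minimal model Markdown file (hypothetical helper)."""
    lines = [f"# Module {handle}", description, ""]
    # Each metadata property becomes one Markdown comment after the description.
    lines += [f"<!-- {key}: {value} -->" for key, value in metadata.items()]
    lines += ["", "## Overview", "", overview]
    return "\n".join(lines) + "\n"


doc = render_model_markdown(
    handle="google/text-embedding-model/1",
    description="Simple one sentence description.",
    metadata={
        "asset-path": "https://path/to/text-embedding-model/model.tar.gz",
        "module-type": "text-embedding",
        "fine-tunable": "true",
        "format": "saved_model_2",
    },
    overview="How the model was trained, expected use cases, and code snippets.",
)
print(doc)
```

Generating the file this way is purely a convenience; writing the Markdown by hand produces the same result.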
Model deployments and grouping deployments together
tfhub.dev allows publishing TF.js, TFLite and Coral deployments of a TensorFlow model.
The first line of the Markdown file should specify the type of the deployment format:
- # Tfjs publisher/model/version for TF.js deployments
- # Lite publisher/model/version for TFLite deployments
- # Coral publisher/model/version for Coral deployments
It is a good idea for these different deployments to show up on the same model page on tfhub.dev. To associate a given TF.js, TFLite or Coral deployment with a TensorFlow model, specify the parent-model tag:
<!-- parent-model: publisher/model/version -->
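As a rough sketch of how the first line and the parent-model tag fit together, the snippet below builds the opening lines of a deployment's Markdown file. The helper name and both handles are hypothetical, and placing the tag after the description is an assumption based on how metadata comments are used elsewhere in this guide.

```python
# First-line prefixes for each deployment type, as listed in this guide.
DEPLOYMENT_PREFIXES = {"tfjs": "Tfjs", "tflite": "Lite", "coral": "Coral"}


def deployment_markdown_head(kind, handle, description, parent_handle):
    """Build the opening lines of a deployment's Markdown file (hypothetical helper)."""
    prefix = DEPLOYMENT_PREFIXES[kind]  # raises KeyError for an unknown deployment type
    return "\n".join([
        f"# {prefix} {handle}",
        description,
        "",
        f"<!-- parent-model: {parent_handle} -->",
    ])


head = deployment_markdown_head(
    "tfjs",
    "publisher_name_a/model_tfjs/1",  # hypothetical deployment handle
    "TF.js deployment of the model.",
    "publisher_name_a/model/1",       # hypothetical parent model handle
)
print(head)
```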
Sometimes you might want to publish one or more deployments without a TensorFlow
SavedModel. In that case, you'll need to create a Placeholder model and specify
its handle in the
parent-model tag. The placeholder Markdown is identical to
TensorFlow model Markdown, except that the first line is:
publisher/model/version and it doesn't require the
Model Markdown specific metadata properties
The Markdown files can contain metadata properties. These are represented as Markdown comments after the description of the Markdown file, e.g.
# Module google/universal-sentence-encoder/1
Encoder of greater-than-word length text trained on a variety of data.

<!-- module-type: text-embedding -->
...
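Because the metadata properties are ordinary Markdown comments of the form shown above, they can be read back with a short regular expression. The following is a minimal sketch, not the parser used by tfhub.dev: the pattern and the function name `parse_metadata` are assumptions, and a property that appears more than once keeps only its last value.

```python
import re

# Matches comments such as: <!-- module-type: text-embedding -->
METADATA_PATTERN = re.compile(r"<!--\s*([a-z-]+):\s*(.*?)\s*-->")


def parse_metadata(markdown: str) -> dict:
    """Collect metadata properties from Markdown comments (illustrative only)."""
    return {key: value for key, value in METADATA_PATTERN.findall(markdown)}


sample = """# Module google/universal-sentence-encoder/1
Encoder of greater-than-word length text trained on a variety of data.

<!-- module-type: text-embedding -->
"""
print(parse_metadata(sample))
```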
The following metadata properties exist:
- format: For TensorFlow models: the TensorFlow Hub format of the model. Valid values are hub when the model was exported via the legacy TF1 hub format, or saved_model_2 when the model was exported via a TF2 SavedModel.
- asset-path: the world-readable remote path to the actual model assets to upload, such as to a Google Cloud Storage bucket. The host's robots.txt file must allow the URL to be fetched (for this reason, "https://github.com/./releases/download/." is not supported, as it is forbidden by https://github.com/robots.txt).
- parent-model: For TF.js/TFLite/Coral models: handle of the accompanying SavedModel/Placeholder.
- module-type: the problem domain, e.g. "text-embedding" or "image-classification".
- dataset: the dataset the model was trained on, e.g. "ImageNet-21k" or "Wikipedia".
- network-architecture: the network architecture the model is based on, e.g. "BERT" or "Mobilenet V3".
- language: the language code of the language a text model was trained on, e.g. "en" or "fr".
- fine-tunable: Boolean, whether the model can be fine-tuned by the user.
- license: The license that applies to the model. The default assumed license for a published model is the Apache 2.0 License. The other accepted options are listed in OSI Approved Licenses. The possible (literal) values are:
  - custom. Note that a custom license requires special consideration on a case-by-case basis.
The Markdown documentation types support different required and optional metadata properties:
| Type | Required | Optional |
|---|---|---|
| Collection | module-type | dataset, language, network-architecture |
| Placeholder | module-type | dataset, fine-tunable, language, license, network-architecture |
| SavedModel | asset-path, module-type, fine-tunable, format | dataset, language, license, network-architecture |
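The required and optional properties in the table lend themselves to a simple pre-submission check. The sketch below encodes only the three rows shown above; the function name `check_metadata` is hypothetical, and whether tfhub.dev rejects unlisted properties is not stated here, so the function merely reports them.

```python
# Required and optional properties per documentation type, copied from the table.
REQUIRED = {
    "Collection": {"module-type"},
    "Placeholder": {"module-type"},
    "SavedModel": {"asset-path", "module-type", "fine-tunable", "format"},
}
OPTIONAL = {
    "Collection": {"dataset", "language", "network-architecture"},
    "Placeholder": {"dataset", "fine-tunable", "language", "license", "network-architecture"},
    "SavedModel": {"dataset", "language", "license", "network-architecture"},
}


def check_metadata(doc_type: str, metadata: dict):
    """Return (missing required properties, properties not listed for this type)."""
    present = set(metadata)
    missing = sorted(REQUIRED[doc_type] - present)
    unlisted = sorted(present - REQUIRED[doc_type] - OPTIONAL[doc_type])
    return missing, unlisted


missing, unlisted = check_metadata(
    "SavedModel",
    {"module-type": "text-embedding", "format": "saved_model_2"},
)
print(missing)   # ['asset-path', 'fine-tunable']
print(unlisted)  # []
```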