To contribute to tfhub.dev, you must provide documentation in Markdown format. For a full overview of the process of contributing models to tfhub.dev, see the contribute a model guide.
Types of Markdown documentation
There are 3 types of Markdown documentation used in tfhub.dev:
- Publisher Markdown - information about a publisher (see markdown syntax)
- Model Markdown - information about a specific model and how to use it (see markdown syntax)
- Collection Markdown - information about a publisher-defined collection of models (see markdown syntax)
The following content organization is required when contributing to the TensorFlow Hub GitHub repository:
- each publisher directory is in the
- each publisher directory contains optional
- each model should have its own directory under
- each collection should have its own directory under
Publisher Markdown files are unversioned, while models can have different versions. Each model version requires a separate Markdown file named after the version it describes (e.g. 1.md, 2.md). Collections are versioned, but only a single version (1) is supported.
All model versions for a given model should be located in the model directory.
Below is an illustration of how the Markdown content is organized:

    assets/docs
    ├── <publisher_name_a>
    │   ├── <publisher_name_a>.md  -> Documentation of the publisher.
    │   └── models
    │       └── <model_name>       -> Model name with slashes encoded as sub-path.
    │           ├── 1.md           -> Documentation of the model version 1.
    │           └── 2.md           -> Documentation of the model version 2.
    ├── <publisher_name_b>
    │   ├── <publisher_name_b>.md  -> Documentation of the publisher.
    │   ├── models
    │   │   └── ...
    │   └── collections
    │       └── <collection_name>
    │           └── 1.md           -> Documentation for the collection.
    ├── <publisher_name_c>
    │   └── ...
    └── ...
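The layout above can be expressed as a few path helpers. This is only a sketch for illustration: the function names are hypothetical and are not part of any TensorFlow Hub tooling. They simply assemble the paths shown in the tree.

```python
from pathlib import PurePosixPath

# Root of the documentation tree, as shown in the illustration above.
DOCS_ROOT = PurePosixPath("assets/docs")


def publisher_doc_path(publisher: str) -> PurePosixPath:
    """Path of the (unversioned) publisher Markdown file."""
    return DOCS_ROOT / publisher / f"{publisher}.md"


def model_doc_path(publisher: str, model_name: str, version: int) -> PurePosixPath:
    """Path of one model version; slashes in model_name become sub-paths."""
    return DOCS_ROOT / publisher / "models" / model_name / f"{version}.md"


def collection_doc_path(publisher: str, collection_name: str) -> PurePosixPath:
    """Path of a collection; only version 1 is supported."""
    return DOCS_ROOT / publisher / "collections" / collection_name / "1.md"
```

For example, `collection_doc_path("vtab", "benchmark")` yields `assets/docs/vtab/collections/benchmark/1.md`.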
Publisher markdown format
Publisher documentation is declared in the same kind of markdown files as models, with slight syntactic differences.
The correct location for the publisher file in the TensorFlow Hub repo is: tfhub.dev/assets/docs/<publisher_name>/<publisher_name>.md
See the minimal publisher documentation example for the "vtab" publisher:
    # Publisher vtab
    Visual Task Adaptation Benchmark

    [![Icon URL]](https://storage.googleapis.com/vtab/vtab_logo_120.png)

    ## VTAB
    The Visual Task Adaptation Benchmark (VTAB) is a diverse, realistic and
    challenging benchmark to evaluate image representations.

The example above specifies the publisher id, a short description, the path to the icon to use, and longer free-form Markdown documentation.
Publisher name guideline
Your publisher name can be your GitHub username or the name of the GitHub organization you manage.
Model page markdown format
The model documentation is a Markdown file with some add-on syntax. See the minimal example below or a more realistic example Markdown file.
High-quality model documentation contains code snippets and information on how the model was trained and how it is intended to be used. You should also make use of the model-specific metadata properties explained below so that users can find your models on tfhub.dev faster.
    # Module google/text-embedding-model/1
    Simple one sentence description.

    <!-- asset-path: https://path/to/text-embedding-model/model.tar.gz -->
    <!-- task: text-embedding -->
    <!-- fine-tunable: true -->
    <!-- format: saved_model_2 -->

    ## Overview

    Here we give more information about the model including how it was trained,
    expected use cases, and code snippets demonstrating how to use the model:

    ```
    Code snippet demonstrating use (e.g. for a TF model using the tensorflow_hub library)

    import tensorflow_hub as hub

    model = hub.KerasLayer(<model name>)
    inputs = ...
    output = model(inputs)
    ```
Model deployments and grouping deployments together
tfhub.dev allows publishing TF.js, TFLite and Coral deployments of a TensorFlow SavedModel.
The first line of the Markdown file should specify the type of the format:
- # Module publisher/model/version for SavedModels
- # Tfjs publisher/model/version for TF.js deployments
- # Lite publisher/model/version for Lite deployments
- # Coral publisher/model/version for Coral deployments
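As a rough illustration of this first-line convention, the sketch below splits such a line into its type and handle. The function name and the regular expression are assumptions made for this example, not the validation logic used by tfhub.dev. It assumes the version is a positive integer and that the model name may itself contain slashes.

```python
import re

# Matches "# <Type> <publisher>/<model>/<version>" for the four types listed above.
# Assumption: the version is numeric and the model name may contain slashes.
_HEADER_RE = re.compile(r"^# (Module|Tfjs|Lite|Coral) ([^/\s]+)/(\S+)/(\d+)\s*$")


def parse_header(first_line: str):
    """Return (doc_type, publisher, model_name, version) or raise ValueError."""
    match = _HEADER_RE.match(first_line)
    if match is None:
        raise ValueError(f"Unrecognized first line: {first_line!r}")
    doc_type, publisher, model_name, version = match.groups()
    return doc_type, publisher, model_name, int(version)
```

Calling `parse_header("# Module google/text-embedding-model/1")` returns the type `"Module"` together with the three parts of the handle.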
It is a good idea for these different formats of the same conceptual model to show up on the same model page on tfhub.dev. To associate a given TF.js, TFLite or Coral deployment to a TensorFlow SavedModel model, specify the parent-model tag:
<!-- parent-model: publisher/model/version -->
Sometimes you might want to publish one or more deployments without a TensorFlow
SavedModel. In that case, you'll need to create a Placeholder model and specify
its handle in the
parent-model tag. The placeholder Markdown is identical to
TensorFlow model Markdown, except that the first line is:
publisher/model/version and it doesn't require the
Model Markdown specific metadata properties
The Markdown files can contain metadata properties. These are used to provide filters and tags to help users find your model. The metadata attributes are included as Markdown comments after the short description of the Markdown file, e.g.
    # Module google/universal-sentence-encoder/1
    Encoder of greater-than-word length text trained on a variety of data.

    <!-- task: text-embedding -->
    ...
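To make the comment syntax concrete, here is a minimal sketch that extracts such metadata comments from a Markdown string. The helper name and the regular expression are hypothetical and only approximate the format shown above. If a key appears more than once, this sketch keeps the last value.

```python
import re

# Matches "<!-- key: value -->" where the key consists of lowercase letters and hyphens.
_METADATA_RE = re.compile(r"<!--\s*([a-z-]+)\s*:\s*(.*?)\s*-->")


def parse_metadata(markdown_text: str) -> dict:
    """Collect metadata properties declared as Markdown comments."""
    return {key: value for key, value in _METADATA_RE.findall(markdown_text)}
```

Values that contain a colon themselves, such as an asset-path URL, are kept intact because only the first colon after the key is treated as the separator.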
The following metadata properties are supported:
format: For TensorFlow models: the TensorFlow Hub format of the model. Valid values are "hub" when the model was exported via the legacy TF1 hub format or "saved_model_2" when the model was exported via a TF2 SavedModel.
asset-path: the world-readable remote path to the actual model assets to upload, such as a path in a Google Cloud Storage bucket. The robots.txt file of the host must allow the URL to be fetched (for this reason, "https://github.com/./releases/download/." is not supported, as it is forbidden by https://github.com/robots.txt). See below for more information on the expected file type and content.
parent-model: For TF.js/TFLite/Coral models: handle of the accompanying SavedModel/Placeholder
fine-tunable: Boolean, whether the model can be fine-tuned by the user.
task: the problem domain, e.g. "text-embedding". All supported values are defined in task.yaml.
dataset: the dataset the model was trained on, e.g. "wikipedia". All supported values are defined in dataset.yaml.
network-architecture: the network architecture the model is based on, e.g. "mobilenet-v3". All supported values are defined in network_architecture.yaml.
language: the language code of the language a text model was trained on, e.g. "en". All supported values are defined in language.yaml.
license: the license that applies to the model, e.g. "mit". The default assumed license for a published model is the Apache 2.0 License. All supported values are defined in license.yaml. Note that the "custom" license will require special consideration case by case.
colab: HTTPS URL to a notebook that demonstrates how the model can be used or trained (example for bigbigan-resnet50). Must lead to colab.research.google.com. Note that Jupyter notebooks hosted on GitHub can be accessed via https://colab.research.google.com/github/ORGANIZATION/PROJECT/blob/master/.../my_notebook.ipynb.
demo: HTTPS URL to a website that demonstrates how the TF.js model can be used (example for posenet).
interactive-visualizer: name of the visualizer that should be embedded on the model page, e.g. "vision". Displaying a visualizer allows users to explore the model's predictions interactively. All supported values are defined in interactive_visualizer.yaml.
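For the colab property described above, the URL of a notebook hosted on GitHub can be assembled from the URL pattern given there. The helper below is a hypothetical sketch. Its name and parameters are illustrative, and the default branch name "master" is taken from the pattern in the text.

```python
COLAB_GITHUB_PREFIX = "https://colab.research.google.com/github"


def colab_url(organization: str, project: str, notebook_path: str,
              branch: str = "master") -> str:
    """Build a Colab URL for a notebook stored in a GitHub repository.

    notebook_path is the path of the .ipynb file relative to the repository root.
    """
    return f"{COLAB_GITHUB_PREFIX}/{organization}/{project}/blob/{branch}/{notebook_path}"
```

The result always starts with colab.research.google.com, which is what the colab property requires.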
The Markdown documentation types support different required and optional metadata properties:
| Type | Required | Optional |
|------|----------|----------|
| Collection | task | dataset, language, network-architecture |
| Placeholder | task | dataset, fine-tunable, interactive-visualizer, language, license, network-architecture |
| SavedModel | asset-path, task, fine-tunable, format | colab, dataset, interactive-visualizer, language, license, network-architecture |
| Tfjs | asset-path, parent-model | colab, demo, interactive-visualizer |
| Lite | asset-path, parent-model | colab, interactive-visualizer |
| Coral | asset-path, parent-model | colab, interactive-visualizer |
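The table can be turned into a simple check that reports missing and unsupported properties for a documentation type. This is a sketch under the assumption that the table is exhaustive. The dictionary and function names are made up for this example and do not reflect the validator used by tfhub.dev.

```python
# Required and optional metadata properties per documentation type (from the table above).
REQUIRED = {
    "Collection": {"task"},
    "Placeholder": {"task"},
    "SavedModel": {"asset-path", "task", "fine-tunable", "format"},
    "Tfjs": {"asset-path", "parent-model"},
    "Lite": {"asset-path", "parent-model"},
    "Coral": {"asset-path", "parent-model"},
}
OPTIONAL = {
    "Collection": {"dataset", "language", "network-architecture"},
    "Placeholder": {"dataset", "fine-tunable", "interactive-visualizer",
                    "language", "license", "network-architecture"},
    "SavedModel": {"colab", "dataset", "interactive-visualizer",
                   "language", "license", "network-architecture"},
    "Tfjs": {"colab", "demo", "interactive-visualizer"},
    "Lite": {"colab", "interactive-visualizer"},
    "Coral": {"colab", "interactive-visualizer"},
}


def check_metadata(doc_type: str, metadata: dict):
    """Return (missing required keys, unsupported keys), each sorted."""
    keys = set(metadata)
    missing = REQUIRED[doc_type] - keys
    unsupported = keys - REQUIRED[doc_type] - OPTIONAL[doc_type]
    return sorted(missing), sorted(unsupported)
```

For instance, a Tfjs file that declares only asset-path and demo would be reported as missing parent-model, with no unsupported keys.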
Model-specific asset content
Depending on the model type, the following file types and contents are expected:
- SavedModel: a tar.gz archive containing content like so:
    saved_model.tar.gz
    ├── assets/            # Optional.
    ├── assets.extra/      # Optional.
    ├── variables/
    │   ├── variables.data-?????-of-?????
    │   └── variables.index
    ├── saved_model.pb
    ├── keras_metadata.pb  # Optional, only required for Keras models.
    └── tfhub_module.pb    # Optional, only required for TF1 models.
- TF.js: a tar.gz archive containing content like so:
    tf_js_model.tar.gz
    ├── group*
    ├── *.json
    ├── *.txt
    └── *.pb
- TFLite: a .tflite file
- Coral: a .tflite file
Generally, the names of all files and directories (whether compressed or uncompressed) must start with a word character, so, for example, a dot is not a valid first character of a file or directory name.
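The naming rule can be checked before uploading an archive. The sketch below assumes that "word character" means what the regular expression class `\w` matches (letters, digits, and underscore, including Unicode letters in Python). The function name is hypothetical.

```python
import re
from pathlib import PurePosixPath

# Assumption: "word character" corresponds to the regex class \w.
_WORD_START = re.compile(r"\w")


def invalid_names(paths):
    """Return the paths that have a component not starting with a word character."""
    bad = []
    for path in paths:
        parts = PurePosixPath(path).parts
        if any(not _WORD_START.match(part) for part in parts):
            bad.append(path)
    return bad
```

For a tar.gz archive, the member names could be obtained with the standard library, e.g. `tarfile.open(archive_path).getnames()`, and passed to this function.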
Collection page markdown format
Collections are a feature of tfhub.dev that enables publishers to bundle related models together to improve user search experience.
See the list of all collections on tfhub.dev.
Here is a minimal example that would go into
assets/docs/vtab/collections/benchmark/1.md. Note that the
collection's name in the first line does not include the
which is included in the filepath.
    # Collection vtab/benchmark/1
    Collection of visual representations that have been evaluated on the VTAB
    benchmark.

    <!-- task: image-feature-vector -->

    ## Overview
    This is the list of visual representations in TensorFlow Hub that have been
    evaluated on VTAB. Results can be seen in
    [google-research.github.io/task_adaptation/](https://google-research.github.io/task_adaptation/)

    #### Models
    |                   |
    |-------------------|
    | [vtab/sup-100/1](https://tfhub.dev/vtab/sup-100/1)   |
    | [vtab/rotation/1](https://tfhub.dev/vtab/rotation/1) |
    |------------------------------------------------------|

The example specifies the name of the collection, a short one-sentence description, problem domain metadata, and free-form Markdown documentation.