TensorFlow Recommenders: Quickstart

In this tutorial, we build a simple matrix factorization model with TFRS using the MovieLens 100K dataset. We can use this model to recommend movies for a given user.

Import TFRS

First, install and import TFRS:

pip install -q tensorflow-recommenders
pip install -q --upgrade tensorflow-datasets
from typing import Dict, Text

import numpy as np
import tensorflow as tf

import tensorflow_datasets as tfds
import tensorflow_recommenders as tfrs

Read the data

# Ratings data.
ratings = tfds.load('movielens/100k-ratings', split="train")
# Features of all the available movies.
movies = tfds.load('movielens/100k-movies', split="train")

# Select the basic features.
ratings = ratings.map(lambda x: {
    "movie_title": x["movie_title"],
    "user_id": x["user_id"]
})
movies = movies.map(lambda x: x["movie_title"])

Build vocabularies to convert user ids and movie titles into integer indices for the embedding layers:

user_ids_vocabulary = tf.keras.layers.StringLookup(mask_token=None)
user_ids_vocabulary.adapt(ratings.map(lambda x: x["user_id"]))

movie_titles_vocabulary = tf.keras.layers.StringLookup(mask_token=None)
movie_titles_vocabulary.adapt(movies)
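To make the vocabulary step concrete, here is a minimal pure-Python sketch of the idea behind a string lookup: each distinct string gets an integer index, and index 0 is reserved for strings that were never seen. The helper names `build_vocabulary` and `lookup` are hypothetical, and the alphabetical tie-breaking is an assumption of this sketch rather than a statement about how the Keras layer orders tokens.

```python
from collections import Counter


def build_vocabulary(values):
    """Map each distinct string to an integer index, most frequent first.

    Index 0 is reserved for out-of-vocabulary strings, so known tokens
    start at 1. Ties are broken alphabetically (an assumption of this sketch).
    """
    counts = Counter(values)
    ordered = sorted(counts, key=lambda token: (-counts[token], token))
    return {token: index for index, token in enumerate(ordered, start=1)}


def lookup(vocabulary, value):
    """Return the index of `value`, or 0 if it was never seen."""
    return vocabulary.get(value, 0)


# Toy user ids standing in for the "user_id" feature.
user_ids = ["42", "7", "42", "138", "7", "42"]
vocabulary = build_vocabulary(user_ids)
indices = [lookup(vocabulary, uid) for uid in ["42", "7", "999"]]

# An embedding table needs one row per index, including the reserved 0.
table_rows = len(vocabulary) + 1
```

The number of rows computed at the end is why the embedding layers are sized from the vocabulary: every index the lookup can emit needs its own row.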

Define a model

We can define a TFRS model by inheriting from tfrs.Model and implementing the compute_loss method:

class MovieLensModel(tfrs.Model):
  # We derive from a custom base class to help reduce boilerplate. Under the hood,
  # these are still plain Keras Models.

  def __init__(
      self,
      user_model: tf.keras.Model,
      movie_model: tf.keras.Model,
      task: tfrs.tasks.Retrieval):
    super().__init__()

    # Set up user and movie representations.
    self.user_model = user_model
    self.movie_model = movie_model

    # Set up a retrieval task.
    self.task = task

  def compute_loss(self, features: Dict[Text, tf.Tensor], training=False) -> tf.Tensor:
    # Define how the loss is computed.

    user_embeddings = self.user_model(features["user_id"])
    movie_embeddings = self.movie_model(features["movie_title"])

    return self.task(user_embeddings, movie_embeddings)
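As a rough illustration of the kind of loss a retrieval task can compute from the two sets of embeddings, the sketch below implements an in-batch softmax loss in plain Python. It assumes that row i of the user embeddings is paired with row i of the movie embeddings and that the other movies in the batch serve as negatives. The function name `in_batch_softmax_loss` is hypothetical, and this is not presented as the library's exact implementation.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def in_batch_softmax_loss(user_embeddings, movie_embeddings):
    """Sum of cross-entropy losses where user i's positive is movie i."""
    total = 0.0
    for i, user in enumerate(user_embeddings):
        # Score this user against every movie in the batch.
        scores = [dot(user, movie) for movie in movie_embeddings]
        # Log-sum-exp with the max subtracted for numerical stability.
        top = max(scores)
        log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
        # Negative log-probability of the true (diagonal) pair.
        total += log_norm - scores[i]
    return total


users = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[2.0, 0.0], [0.0, 2.0]]   # each user points at its own movie
swapped = [[0.0, 2.0], [2.0, 0.0]]   # each user points at the other movie

loss_aligned = in_batch_softmax_loss(users, aligned)
loss_swapped = in_batch_softmax_loss(users, swapped)
```

The loss is small when each user embedding scores highest against its own movie and large when the pairing is wrong, which is the signal that pushes matching pairs together during training.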

Define the two models and the retrieval task.

# Define user and movie models.
user_model = tf.keras.Sequential([
    user_ids_vocabulary,
    tf.keras.layers.Embedding(user_ids_vocabulary.vocabulary_size(), 64)
])
movie_model = tf.keras.Sequential([
    movie_titles_vocabulary,
    tf.keras.layers.Embedding(movie_titles_vocabulary.vocabulary_size(), 64)
])

# Define your objectives.
task = tfrs.tasks.Retrieval(metrics=tfrs.metrics.FactorizedTopK(
    movies.batch(128).map(movie_model)
  )
)
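The metric attached to the task is a top-K measure over the full set of candidate movies. A minimal sketch of that idea, assuming dot-product scoring and a small in-memory candidate set, might look like the following. The function name `top_k_accuracy` and the toy embeddings are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k_accuracy(user_embeddings, true_movie_ids, candidates, k):
    """Fraction of users whose true movie is among their k best-scoring candidates."""
    hits = 0
    for user, true_id in zip(user_embeddings, true_movie_ids):
        # Rank every candidate movie by its score for this user.
        ranked = sorted(
            candidates,
            key=lambda movie_id: dot(user, candidates[movie_id]),
            reverse=True,
        )
        if true_id in ranked[:k]:
            hits += 1
    return hits / len(user_embeddings)


# Toy candidate embeddings keyed by a made-up movie id.
candidates = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [0.7, 0.7]}
users = [[1.0, 0.1], [0.1, 1.0]]
true_ids = ["C", "A"]

top_1 = top_k_accuracy(users, true_ids, candidates, k=1)
top_2 = top_k_accuracy(users, true_ids, candidates, k=2)
top_3 = top_k_accuracy(users, true_ids, candidates, k=3)
```

Accuracy can only grow as k grows, which is why such metrics are usually reported at several values of k.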
Fit and evaluate it.

Create the model, train it, and generate predictions:

# Create a retrieval model.
model = MovieLensModel(user_model, movie_model, task)

model.compile(optimizer=tf.keras.optimizers.Adagrad(0.5))

# Train for 3 epochs.
model.fit(ratings.batch(4096), epochs=3)

# Use brute-force search to set up retrieval using the trained representations.
index = tfrs.layers.factorized_top_k.BruteForce(model.user_model)
index.index_from_dataset(
    movies.batch(100).map(lambda title: (title, model.movie_model(title))))

# Get some recommendations.
_, titles = index(np.array(["42"]))
print(f"Top 3 recommendations for user 42: {titles[0, :3]}")
Top 3 recommendations for user 42: [b'Rent-a-Kid (1995)' b'Just Cause (1995)'
 b'Land Before Time III: The Time of the Great Giving (1995) (V)']
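Conceptually, brute-force retrieval scores the query embedding against every candidate and keeps the best few. The sketch below shows that idea in plain Python, assuming dot-product scoring. The function name `brute_force_top_k`, the movie titles, and the embedding values are all made up for illustration and are not taken from the trained model above.

```python
import heapq


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def brute_force_top_k(query, candidates, k):
    """Return the titles of the k candidates with the highest dot-product score."""
    # Score every candidate; nothing is pruned, hence "brute force".
    scored = ((dot(query, embedding), title) for title, embedding in candidates.items())
    return [title for _, title in heapq.nlargest(k, scored)]


# Hypothetical candidate embeddings keyed by made-up titles.
candidates = {
    "Movie A": [0.9, 0.1],
    "Movie B": [0.2, 0.8],
    "Movie C": [0.6, 0.6],
}
user_embedding = [1.0, 0.5]

recommendations = brute_force_top_k(user_embedding, candidates, k=2)
```

Scoring every candidate costs time linear in the catalogue size, which is acceptable for a catalogue as small as MovieLens 100K; larger catalogues typically call for an approximate index instead.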