
Text classification with an RNN


This text classification tutorial trains a recurrent neural network on the IMDB large movie review dataset for sentiment analysis.

Setup

pip install -q tensorflow_datasets
import numpy as np

import tensorflow_datasets as tfds
import tensorflow as tf

tfds.disable_progress_bar()

Import matplotlib and create a helper function to plot graphs:

import matplotlib.pyplot as plt

def plot_graphs(history, metric):
  plt.plot(history.history[metric])
  plt.plot(history.history['val_'+metric], '')
  plt.xlabel("Epochs")
  plt.ylabel(metric)
  plt.legend([metric, 'val_'+metric])

Setup input pipeline

The IMDB large movie review dataset is a binary classification dataset—all the reviews have either a positive or negative sentiment.

Download the dataset using TFDS. See the loading text tutorial for details on how to load this sort of data manually.

dataset, info = tfds.load('imdb_reviews', with_info=True,
                          as_supervised=True)
train_dataset, test_dataset = dataset['train'], dataset['test']

train_dataset.element_spec
(TensorSpec(shape=(), dtype=tf.string, name=None),
 TensorSpec(shape=(), dtype=tf.int64, name=None))

Initially this returns a dataset of (text, label) pairs:

for example, label in train_dataset.take(1):
  print('text: ', example.numpy())
  print('label: ', label.numpy())
text:  b"This was an absolutely terrible movie. Don't be lured in by Christopher Walken or Michael Ironside. Both are great actors, but this must simply be their worst role in history. Even their great acting could not redeem this movie's ridiculous storyline. This movie is an early nineties US propaganda piece. The most pathetic scenes were those when the Columbian rebels were making their cases for revolutions. Maria Conchita Alonso appeared phony, and her pseudo-love affair with Walken was nothing but a pathetic emotional plug in a movie that was devoid of any real meaning. I am disappointed that there are movies like this, ruining actor's like Christopher Walken's good name. I could barely sit through it."
label:  0

Next shuffle the data for training and create batches of these (text, label) pairs:

BUFFER_SIZE = 10000
BATCH_SIZE = 64
train_dataset = train_dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
test_dataset = test_dataset.batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
for example, label in train_dataset.take(1):
  print('texts: ', example.numpy()[:3])
  print()
  print('labels: ', label.numpy()[:3])
texts:  [b'Cult of the Cobra is now available on DVD in a pristine print that does full justice to whatever merits it has as a movie. Unfortunately, that is not saying much.<br /><br />It has a competent cast of second-rankers that acquit themselves as well as could be expected under the circumstances. It is efficiently directed, entirely on sound stages and standing sets on the studio backlot. It looks OK, but is ponderously over-plotted and at a scant 80 minutes it is still heavily padded.<br /><br />For example, the double cobra attack on the first of the GIs was surely one attack too many.<br /><br />The business about Julia choosing to marry Pete rather than Tom never amounts to anything. Tom immediately falls in love with Lisa and she never has any reason to be jealous of Julia (nor is she).<br /><br />Julia\'s \'feminine intuition\' is introduced as if it is going to lead to an important plot development, but it doesn\'t. Similarly, Pete\'s investigation into cobra cults and the suspicion that briefly falls on Tom serve no purpose other than to fill up screen time.<br /><br />These are just symptoms of the underlying problem. The movie is structured like a mystery but it isn\'t. As soon as the curse is pronounced we know exactly where the story is heading, so the characters are left painstakingly uncovering what we already know.<br /><br />The ending is particularly lame. Julia is menaced purely by accident. Lisa has no reason to want to kill her - she just happens to be in the wrong place at the wrong time. When Tom turns up in the nick of time to save her, it is not even clear whether she was threatened at all. He then simply disposes of the cobra in the way any of the previous victims might have done.<br /><br />It is such an inconsequential little pipsqueak of a story that I found myself wondering how on earth it had been pitched to the studio heads. Then it occurred to me. Someone said: "Those Val Lewton movies were very successful over at RKO, so why don\'t we make one like that?"<br /><br />Cult of the Cobra is clearly modelled on Cat People: mysterious, troubled, shape-shifting woman falls in love with the hero, is apparently frigid, kills people, arouses the suspicions of the hero\'s woman friend and dies at the end. But \'modelled on\' doesn\'t mean \'as good as\' - by a wide margin. It copies, but doesn\'t understand what it is copying.<br /><br />It is obviously trying for the low-key, suggestive Lewton style, but this approach doesn\'t follow through into the story. Lisa is no Irene. She is meant to be strange and mysterious but there is no mystery about her. We get a glimpse of her after the first attack in Asia, so immediately recognise her when she turns up in New York. There is never any doubt about her purpose. Neither is there any ambiguity about whether of not she actually turns into a snake.<br /><br />Then again, during her nocturnal prowling we get, not one, but two attempts at \'buses\'. Neither come off, because the director doesn\'t understand what makes a \'bus\' work and, in any case, they happen to the stalker, not the person being stalked.<br /><br />These faint echoes of Cat People give Cult of the Cobra whatever small distinction it might have, but they only draw attention to the yawning gulf between the original and the imitation.<br /><br />Plagiarism may be the sincerest form of flattery, but I doubt if Lewton or Tourneur were particularly flattered when this tepid little time-passer came out.'
 b'as an actor I really like independent films but this one is amateur at best.<br /><br />The boys go to Vermont for a civil service yet when the plane lands it flies over a palm tree - were the directors aware that palm trees are not in Vermont? Pines yes - palms no. And the same for the wedding service - again nice grove of palm trees.<br /><br />When the boys are leaving VT they apparently could not get a ticket on any major airline since the plane that is filmed is Federal Express. Did they ship themselves Overnight in a crate? Come on guys little details like this separate an indi film from totally amateur.<br /><br />The Christian brother is far gayer than Arthur with his bleached hair and tribal band tattoo. The two should have switched roles.<br /><br />The minor characters are laughable and overact something terrible.<br /><br />Applause to the directors for making a gay film but pay some attention to your locations and casting next time'
 b'"Come Undone" appears to elicit a lot of opinions among the contributors to this forum. Granted, it\'s a film that promises a take on gay life, as most viewers expect and somehow, it gets away from that promise into an introspective view at a young man\'s soul. The film has a way of staying with us even when it has ended. It is a character study about how a young man gets involved into a love affair with someone so much different than him that, in the end, will leave Mathieu confused, hurt and depressed when things don\'t go according to what he hoped the relationship would be.<br /><br />If you haven\'t seen the film, perhaps you would like to stop reading.<br /><br />Sebastien Lifshitz, the director of the film, has told his story from Mathieu\'s viewpoint. Most viewers appear to be disoriented by the different times within the film, but there are hints that are not obvious, as one can see, in retrospect. The story is told in flashbacks that might add to the way some people will view the film. This is a story about the doomed the love Mathieu felt for Cedric and the ultimate breakdown of their life together.<br /><br />First of all, Cedric, the handsome young local, pursues Mathieu until he succeeds in convincing him he likes him. Mathieu feels the attraction for Cedric too. We realize how different both young men are by the way Cedric tells Mathieu\'s family how he feels school is not for him. On the other hand, Mathieu, who wants to be an architect, finds beauty in the abandoned place where Cedric has taken him. We watch as Mathieu, reading from the guide book, wants Cedric\'s attention.<br /><br />When Mathieu comes out to his mother, she wisely tells him about the importance of continuing his career. She also points out about what future both of them would have together, which proves to be true. Mathieu appears to have learned his lesson, the hard way. He goes on to an uncertain life with Cedric and attempts to take his own life. We watch him in the hospital speaking to a psychiatrist that has treated his wounded soul.<br /><br />The ending might be confusing for most viewers, but there is a moment in the film when Mathieu goes to work in a bar where we see him washing glasses and looking intently to Pierre, the young man who frequents the bar. That is why when Mathieu goes looking for Pierre at his house, appears to be hard to imagine. Yet, we have seen the way Mathieu is obviously interested in Pierre. The last scene at the beach, when Pierre and Mathieu are seen strolling in the sand, has a hopeful sign that things will be better between them as they watch a young boy, apparently lost, but then realizing the father is nearby.<br /><br />Jeremie Elkaim makes Mathieu one of the most complex characters in recent films. This is a young man who is hard to understand on a simple level. Mathieu has suffered a lot, first with the separation of his parents, then with his depressed mother and with losing Cedric. Stephan Rideau, who has been seen on other important French films, is equally good, as the shallow Cedric.<br /><br />While "Come Undone" will divide opinions, the film deserves a viewing because of the complexity and the care Sebastien Lifshitz gives to the story.']

labels:  [0 0 1]

Create the text encoder

The raw text loaded by tfds needs to be processed before it can be used in a model. The simplest way to process text for training is using the experimental.preprocessing.TextVectorization layer. This layer has many capabilities, but this tutorial sticks to the default behavior.

Create the layer, and pass the dataset's text to the layer's .adapt method:

VOCAB_SIZE=1000
encoder = tf.keras.layers.experimental.preprocessing.TextVectorization(
    max_tokens=VOCAB_SIZE)
encoder.adapt(train_dataset.map(lambda text, label: text))

The .adapt method sets the layer's vocabulary. Here are the first 20 tokens. After the padding and unknown tokens they're sorted by frequency:

vocab = np.array(encoder.get_vocabulary())
vocab[:20]
array(['', '[UNK]', 'the', 'and', 'a', 'of', 'to', 'is', 'in', 'it', 'i',
       'this', 'that', 'br', 'was', 'as', 'for', 'with', 'movie', 'but'],
      dtype='<U14')

Once the vocabulary is set, the layer can encode text into indices. The tensors of indices are 0-padded to the longest sequence in the batch (unless you set a fixed output_sequence_length):

encoded_example = encoder(example)[:3].numpy()
encoded_example
array([[  1,   5,   2, ...,   1, 366,  46],
       [ 15,  34, 288, ...,   0,   0,   0],
       [209,   1, 717, ...,   0,   0,   0]])
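
If you instead want every batch to have the same shape, here is a minimal sketch (illustrative, not part of the original notebook) of pinning the length with the output_sequence_length argument:

fixed_encoder = tf.keras.layers.experimental.preprocessing.TextVectorization(
    max_tokens=VOCAB_SIZE,
    output_sequence_length=250)  # pad or truncate every example to 250 tokens
fixed_encoder.adapt(train_dataset.map(lambda text, label: text))
print(fixed_encoder(example).shape)  # (batch_size, 250)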

With the default settings, the process is not completely reversible. There are two main reasons for that:

  1. The default value for preprocessing.TextVectorization's standardize argument is "lower_and_strip_punctuation".
  2. The limited vocabulary size and the lack of a character-based fallback result in some unknown tokens.

for n in range(3):
  print("Original: ", example[n].numpy())
  print("Round-trip: ", " ".join(vocab[encoded_example[n]]))
  print()
Original:  b'Cult of the Cobra is now available on DVD in a pristine print that does full justice to whatever merits it has as a movie. Unfortunately, that is not saying much.<br /><br />It has a competent cast of second-rankers that acquit themselves as well as could be expected under the circumstances. It is efficiently directed, entirely on sound stages and standing sets on the studio backlot. It looks OK, but is ponderously over-plotted and at a scant 80 minutes it is still heavily padded.<br /><br />For example, the double cobra attack on the first of the GIs was surely one attack too many.<br /><br />The business about Julia choosing to marry Pete rather than Tom never amounts to anything. Tom immediately falls in love with Lisa and she never has any reason to be jealous of Julia (nor is she).<br /><br />Julia\'s \'feminine intuition\' is introduced as if it is going to lead to an important plot development, but it doesn\'t. Similarly, Pete\'s investigation into cobra cults and the suspicion that briefly falls on Tom serve no purpose other than to fill up screen time.<br /><br />These are just symptoms of the underlying problem. The movie is structured like a mystery but it isn\'t. As soon as the curse is pronounced we know exactly where the story is heading, so the characters are left painstakingly uncovering what we already know.<br /><br />The ending is particularly lame. Julia is menaced purely by accident. Lisa has no reason to want to kill her - she just happens to be in the wrong place at the wrong time. When Tom turns up in the nick of time to save her, it is not even clear whether she was threatened at all. He then simply disposes of the cobra in the way any of the previous victims might have done.<br /><br />It is such an inconsequential little pipsqueak of a story that I found myself wondering how on earth it had been pitched to the studio heads. Then it occurred to me. Someone said: "Those Val Lewton movies were very successful over at RKO, so why don\'t we make one like that?"<br /><br />Cult of the Cobra is clearly modelled on Cat People: mysterious, troubled, shape-shifting woman falls in love with the hero, is apparently frigid, kills people, arouses the suspicions of the hero\'s woman friend and dies at the end. But \'modelled on\' doesn\'t mean \'as good as\' - by a wide margin. It copies, but doesn\'t understand what it is copying.<br /><br />It is obviously trying for the low-key, suggestive Lewton style, but this approach doesn\'t follow through into the story. Lisa is no Irene. She is meant to be strange and mysterious but there is no mystery about her. We get a glimpse of her after the first attack in Asia, so immediately recognise her when she turns up in New York. There is never any doubt about her purpose. Neither is there any ambiguity about whether of not she actually turns into a snake.<br /><br />Then again, during her nocturnal prowling we get, not one, but two attempts at \'buses\'. Neither come off, because the director doesn\'t understand what makes a \'bus\' work and, in any case, they happen to the stalker, not the person being stalked.<br /><br />These faint echoes of Cat People give Cult of the Cobra whatever small distinction it might have, but they only draw attention to the yawning gulf between the original and the imitation.<br /><br />Plagiarism may be the sincerest form of flattery, but I doubt if Lewton or Tourneur were particularly flattered when this tepid little time-passer came out.'
Round-trip:  [UNK] of the [UNK] is now [UNK] on dvd in a [UNK] [UNK] that does full [UNK] to whatever [UNK] it has as a movie unfortunately that is not saying [UNK] br it has a [UNK] cast of [UNK] that [UNK] themselves as well as could be expected under the [UNK] it is [UNK] directed [UNK] on sound [UNK] and [UNK] sets on the [UNK] [UNK] it looks ok but is [UNK] [UNK] and at a [UNK] [UNK] minutes it is still [UNK] [UNK] br for example the [UNK] [UNK] [UNK] on the first of the [UNK] was [UNK] one [UNK] too [UNK] br the business about [UNK] [UNK] to [UNK] [UNK] rather than tom never [UNK] to anything tom [UNK] falls in love with [UNK] and she never has any reason to be [UNK] of [UNK] nor is [UNK] br [UNK] [UNK] [UNK] is [UNK] as if it is going to lead to an important plot development but it doesnt [UNK] [UNK] [UNK] into [UNK] [UNK] and the [UNK] that [UNK] falls on tom [UNK] no [UNK] other than to [UNK] up screen [UNK] br these are just [UNK] of the [UNK] problem the movie is [UNK] like a mystery but it isnt as soon as the [UNK] is [UNK] we know exactly where the story is [UNK] so the characters are left [UNK] [UNK] what we already [UNK] br the ending is particularly lame [UNK] is [UNK] [UNK] by [UNK] [UNK] has no reason to want to kill her she just happens to be in the wrong place at the wrong time when tom turns up in the [UNK] of time to save her it is not even clear whether she was [UNK] at all he then simply [UNK] of the [UNK] in the way any of the previous [UNK] might have [UNK] br it is such an [UNK] little [UNK] of a story that i found myself [UNK] how on earth it had been [UNK] to the [UNK] [UNK] then it [UNK] to me someone said those [UNK] [UNK] movies were very [UNK] over at [UNK] so why dont we make one like [UNK] br [UNK] of the [UNK] is clearly [UNK] on [UNK] people [UNK] [UNK] [UNK] woman falls in love with the hero is apparently [UNK] [UNK] people [UNK] the [UNK] of the [UNK] woman friend and [UNK] at the end but [UNK] on doesnt mean as good as by a [UNK] [UNK] it [UNK] but doesnt understand what it is [UNK] br it is obviously trying for the [UNK] [UNK] [UNK] style but this [UNK] doesnt follow through into the story [UNK] is no [UNK] she is meant to be strange and [UNK] but there is no mystery about her we get a [UNK] of her after the first [UNK] in [UNK] so [UNK] [UNK] her when she turns up in new york there is never any doubt about her [UNK] [UNK] is there any [UNK] about whether of not she actually turns into a [UNK] br then again during her [UNK] [UNK] we get not one but two attempts at [UNK] [UNK] come off because the director doesnt understand what makes a [UNK] work and in any case they happen to the [UNK] not the person being [UNK] br these [UNK] [UNK] of [UNK] people give [UNK] of the [UNK] whatever small [UNK] it might have but they only [UNK] attention to the [UNK] [UNK] between the original and the [UNK] br [UNK] may be the [UNK] form of [UNK] but i doubt if [UNK] or [UNK] were particularly [UNK] when this [UNK] little [UNK] came out

Original:  b'as an actor I really like independent films but this one is amateur at best.<br /><br />The boys go to Vermont for a civil service yet when the plane lands it flies over a palm tree - were the directors aware that palm trees are not in Vermont? Pines yes - palms no. And the same for the wedding service - again nice grove of palm trees.<br /><br />When the boys are leaving VT they apparently could not get a ticket on any major airline since the plane that is filmed is Federal Express. Did they ship themselves Overnight in a crate? Come on guys little details like this separate an indi film from totally amateur.<br /><br />The Christian brother is far gayer than Arthur with his bleached hair and tribal band tattoo. The two should have switched roles.<br /><br />The minor characters are laughable and overact something terrible.<br /><br />Applause to the directors for making a gay film but pay some attention to your locations and casting next time'
Round-trip:  as an actor i really like [UNK] films but this one is [UNK] at [UNK] br the boys go to [UNK] for a [UNK] [UNK] yet when the [UNK] [UNK] it [UNK] over a [UNK] [UNK] were the directors [UNK] that [UNK] [UNK] are not in [UNK] [UNK] yes [UNK] no and the same for the [UNK] [UNK] again nice [UNK] of [UNK] [UNK] br when the boys are [UNK] [UNK] they apparently could not get a [UNK] on any major [UNK] since the [UNK] that is filmed is [UNK] [UNK] did they [UNK] themselves [UNK] in a [UNK] come on guys little [UNK] like this [UNK] an [UNK] film from totally [UNK] br the [UNK] brother is far [UNK] than [UNK] with his [UNK] [UNK] and [UNK] [UNK] [UNK] the two should have [UNK] [UNK] br the [UNK] characters are [UNK] and [UNK] something [UNK] br [UNK] to the directors for making a [UNK] film but pay some attention to your [UNK] and casting next time                                                                                                                                                                                                                                                                                                                                                                                                                                                     

Original:  b'"Come Undone" appears to elicit a lot of opinions among the contributors to this forum. Granted, it\'s a film that promises a take on gay life, as most viewers expect and somehow, it gets away from that promise into an introspective view at a young man\'s soul. The film has a way of staying with us even when it has ended. It is a character study about how a young man gets involved into a love affair with someone so much different than him that, in the end, will leave Mathieu confused, hurt and depressed when things don\'t go according to what he hoped the relationship would be.<br /><br />If you haven\'t seen the film, perhaps you would like to stop reading.<br /><br />Sebastien Lifshitz, the director of the film, has told his story from Mathieu\'s viewpoint. Most viewers appear to be disoriented by the different times within the film, but there are hints that are not obvious, as one can see, in retrospect. The story is told in flashbacks that might add to the way some people will view the film. This is a story about the doomed the love Mathieu felt for Cedric and the ultimate breakdown of their life together.<br /><br />First of all, Cedric, the handsome young local, pursues Mathieu until he succeeds in convincing him he likes him. Mathieu feels the attraction for Cedric too. We realize how different both young men are by the way Cedric tells Mathieu\'s family how he feels school is not for him. On the other hand, Mathieu, who wants to be an architect, finds beauty in the abandoned place where Cedric has taken him. We watch as Mathieu, reading from the guide book, wants Cedric\'s attention.<br /><br />When Mathieu comes out to his mother, she wisely tells him about the importance of continuing his career. She also points out about what future both of them would have together, which proves to be true. Mathieu appears to have learned his lesson, the hard way. He goes on to an uncertain life with Cedric and attempts to take his own life. We watch him in the hospital speaking to a psychiatrist that has treated his wounded soul.<br /><br />The ending might be confusing for most viewers, but there is a moment in the film when Mathieu goes to work in a bar where we see him washing glasses and looking intently to Pierre, the young man who frequents the bar. That is why when Mathieu goes looking for Pierre at his house, appears to be hard to imagine. Yet, we have seen the way Mathieu is obviously interested in Pierre. The last scene at the beach, when Pierre and Mathieu are seen strolling in the sand, has a hopeful sign that things will be better between them as they watch a young boy, apparently lost, but then realizing the father is nearby.<br /><br />Jeremie Elkaim makes Mathieu one of the most complex characters in recent films. This is a young man who is hard to understand on a simple level. Mathieu has suffered a lot, first with the separation of his parents, then with his depressed mother and with losing Cedric. Stephan Rideau, who has been seen on other important French films, is equally good, as the shallow Cedric.<br /><br />While "Come Undone" will divide opinions, the film deserves a viewing because of the complexity and the care Sebastien Lifshitz gives to the story.'
Round-trip:  come [UNK] appears to [UNK] a lot of [UNK] among the [UNK] to this [UNK] [UNK] its a film that [UNK] a take on [UNK] life as most viewers expect and somehow it gets away from that [UNK] into an [UNK] view at a young [UNK] [UNK] the film has a way of [UNK] with us even when it has [UNK] it is a character [UNK] about how a young man gets involved into a love [UNK] with someone so much different than him that in the end will leave [UNK] [UNK] [UNK] and [UNK] when things dont go [UNK] to what he [UNK] the relationship would [UNK] br if you havent seen the film perhaps you would like to stop [UNK] br [UNK] [UNK] the director of the film has told his story from [UNK] [UNK] most viewers appear to be [UNK] by the different times within the film but there are [UNK] that are not obvious as one can see in [UNK] the story is told in [UNK] that might add to the way some people will view the film this is a story about the [UNK] the love [UNK] felt for [UNK] and the [UNK] [UNK] of their life [UNK] br first of all [UNK] the [UNK] young local [UNK] [UNK] until he [UNK] in [UNK] him he [UNK] him [UNK] feels the [UNK] for [UNK] too we realize how different both young men are by the way [UNK] tells [UNK] family how he feels school is not for him on the other hand [UNK] who wants to be an [UNK] finds beauty in the [UNK] place where [UNK] has taken him we watch as [UNK] reading from the [UNK] book wants [UNK] [UNK] br when [UNK] comes out to his mother she [UNK] tells him about the [UNK] of [UNK] his career she also points out about what future both of them would have together which [UNK] to be true [UNK] appears to have [UNK] his [UNK] the hard way he goes on to an [UNK] life with [UNK] and attempts to take his own life we watch him in the [UNK] [UNK] to a [UNK] that has [UNK] his [UNK] [UNK] br the ending might be [UNK] for most viewers but there is a moment in the film when [UNK] goes to work in a [UNK] where we see him [UNK] [UNK] and looking [UNK] to [UNK] the young man who [UNK] the [UNK] that is why when [UNK] goes looking for [UNK] at his house appears to be hard to imagine yet we have seen the way [UNK] is obviously interested in [UNK] the last scene at the [UNK] when [UNK] and [UNK] are seen [UNK] in the [UNK] has a [UNK] [UNK] that things will be better between them as they watch a young boy apparently lost but then [UNK] the father is [UNK] br [UNK] [UNK] makes [UNK] one of the most [UNK] characters in [UNK] films this is a young man who is hard to understand on a simple level [UNK] has [UNK] a lot first with the [UNK] of his parents then with his [UNK] mother and with [UNK] [UNK] [UNK] [UNK] who has been seen on other important french films is [UNK] good as the [UNK] [UNK] br while come [UNK] will [UNK] [UNK] the film deserves a viewing because of the [UNK] and the care [UNK] [UNK] gives to the story                               


Create the model

[Diagram: a drawing of the information flow in the model]

Above is a diagram of the model.

  1. This model can be built as a tf.keras.Sequential.

  2. The first layer is the encoder, which converts the text to a sequence of token indices.

  3. After the encoder is an embedding layer. An embedding layer stores one vector per word. When called, it converts the sequences of word indices to sequences of vectors. These vectors are trainable. After training (on enough data), words with similar meanings often have similar vectors.

    This index-lookup is much more efficient than the equivalent operation of passing a one-hot encoded vector through a tf.keras.layers.Dense layer (see the sketch after this list).

  4. A recurrent neural network (RNN) processes sequence input by iterating through the elements. RNNs pass the outputs from one timestep to their input on the next timestep.

    The tf.keras.layers.Bidirectional wrapper can also be used with an RNN layer. This propagates the input forward and backwards through the RNN layer and then concatenates the final output.

    • The main advantage of a bidirectional RNN is that the signal from the beginning of the input doesn't need to be processed all the way through every timestep to affect the output.

    • The main disadvantage of a bidirectional RNN is that you can't efficiently stream predictions as words are being added to the end.

  5. After the RNN has converted the sequence to a single vector, the two tf.keras.layers.Dense layers do some final processing, converting this vector representation to a single logit as the classification output.
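
The efficiency claim in step 3 can be checked directly. Here is a minimal sketch (illustrative, not part of the original notebook) showing that an embedding lookup returns the same values as multiplying a one-hot encoding by the embedding matrix:

embedding = tf.keras.layers.Embedding(input_dim=5, output_dim=3)
ids = tf.constant([[0, 2, 4]])
looked_up = embedding(ids)                            # shape (1, 3, 3), a cheap gather
one_hot = tf.one_hot(ids, depth=5)                    # shape (1, 3, 5)
via_dense = tf.matmul(one_hot, embedding.embeddings)  # same values via a full matmul
np.testing.assert_allclose(looked_up.numpy(), via_dense.numpy())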

The code to implement this is below:

model = tf.keras.Sequential([
    encoder,
    tf.keras.layers.Embedding(
        input_dim=len(encoder.get_vocabulary()),
        output_dim=64,
        # Use masking to handle the variable sequence lengths
        mask_zero=True),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1)
])

Note that a Keras sequential model is used here since all the layers in the model have a single input and produce a single output. If you want to use a stateful RNN layer, you might want to build your model with the Keras functional API or model subclassing so that you can retrieve and reuse the RNN layer states. Check the Keras RNN guide for more details.
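
For example, here is a minimal functional-API sketch (an illustration, not part of this tutorial; it assumes token indices as input rather than raw text) that exposes the LSTM's final states for reuse:

inputs = tf.keras.Input(shape=(None,), dtype=tf.int64)
x = tf.keras.layers.Embedding(VOCAB_SIZE, 64, mask_zero=True)(inputs)
# return_state=True makes the LSTM also return its final hidden and cell states.
x, state_h, state_c = tf.keras.layers.LSTM(64, return_state=True)(x)
outputs = tf.keras.layers.Dense(1)(x)
states_model = tf.keras.Model(inputs, [outputs, state_h, state_c])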

The embedding layer uses masking to handle the varying sequence-lengths. All the layers after the Embedding support masking:

print([layer.supports_masking for layer in model.layers])
[False, True, True, True, True]

To confirm that this works as expected, evaluate a sentence twice. First, alone so there's no padding to mask:

# predict on a sample text without padding.

sample_text = ('The movie was cool. The animation and the graphics '
               'were out of this world. I would recommend this movie.')
predictions = model.predict(np.array([sample_text]))
print(predictions[0])
[-0.02237528]

Now, evaluate it again in a batch with a longer sentence. The result should be identical:

# predict on a sample text with padding

padding = "the " * 2000
predictions = model.predict(np.array([sample_text, padding]))
print(predictions[0])
[-0.02237528]

Compile the Keras model to configure the training process:

model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              optimizer=tf.keras.optimizers.Adam(1e-4),
              metrics=['accuracy'])

Train the model

history = model.fit(train_dataset, epochs=10,
                    validation_data=test_dataset, 
                    validation_steps=30)
Epoch 1/10
391/391 [==============================] - 44s 95ms/step - loss: 0.6858 - accuracy: 0.5052 - val_loss: 0.5634 - val_accuracy: 0.7193
Epoch 2/10
391/391 [==============================] - 34s 87ms/step - loss: 0.4815 - accuracy: 0.7763 - val_loss: 0.4102 - val_accuracy: 0.8328
Epoch 3/10
391/391 [==============================] - 34s 87ms/step - loss: 0.3718 - accuracy: 0.8415 - val_loss: 0.3441 - val_accuracy: 0.8542
Epoch 4/10
391/391 [==============================] - 34s 86ms/step - loss: 0.3363 - accuracy: 0.8604 - val_loss: 0.3365 - val_accuracy: 0.8594
Epoch 5/10
391/391 [==============================] - 34s 87ms/step - loss: 0.3221 - accuracy: 0.8638 - val_loss: 0.3395 - val_accuracy: 0.8583
Epoch 6/10
391/391 [==============================] - 34s 86ms/step - loss: 0.3133 - accuracy: 0.8692 - val_loss: 0.3307 - val_accuracy: 0.8583
Epoch 7/10
391/391 [==============================] - 34s 87ms/step - loss: 0.3106 - accuracy: 0.8687 - val_loss: 0.3240 - val_accuracy: 0.8599
Epoch 8/10
391/391 [==============================] - 34s 86ms/step - loss: 0.3014 - accuracy: 0.8759 - val_loss: 0.3178 - val_accuracy: 0.8630
Epoch 9/10
391/391 [==============================] - 33s 84ms/step - loss: 0.2971 - accuracy: 0.8747 - val_loss: 0.3278 - val_accuracy: 0.8562
Epoch 10/10
391/391 [==============================] - 33s 85ms/step - loss: 0.2982 - accuracy: 0.8731 - val_loss: 0.3194 - val_accuracy: 0.8656

test_loss, test_acc = model.evaluate(test_dataset)

print('Test Loss: {}'.format(test_loss))
print('Test Accuracy: {}'.format(test_acc))
391/391 [==============================] - 16s 42ms/step - loss: 0.3157 - accuracy: 0.8639
Test Loss: 0.315724641084671
Test Accuracy: 0.8639199733734131

plt.figure(figsize=(16,8))
plt.subplot(1,2,1)
plot_graphs(history, 'accuracy')
plt.ylim(None,1)
plt.subplot(1,2,2)
plot_graphs(history, 'loss')
plt.ylim(0,None)
(0.0, 0.6800670087337494)

[Plot: training and validation accuracy (left) and loss (right) across epochs]

Run a prediction on a new sentence:

If the prediction is >= 0.0, the sentiment is positive; otherwise it is negative.

sample_text = ('The movie was cool. The animation and the graphics '
               'were out of this world. I would recommend this movie.')
predictions = model.predict(np.array([sample_text]))
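
For example (an illustrative addition, not in the original notebook), threshold the raw logit at 0.0 to get a label:

print('positive' if predictions[0][0] >= 0.0 else 'negative')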

Stack two or more LSTM layers

Keras recurrent layers have two available modes that are controlled by the return_sequences constructor argument:

  • If False it returns only the last output for each input sequence (a 2D tensor of shape (batch_size, output_features)). This is the default, used in the previous model.

  • If True the full sequences of successive outputs for each timestep are returned (a 3D tensor of shape (batch_size, timesteps, output_features)).
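
A quick shape check (illustrative, not from the original notebook) makes the difference concrete:

x = tf.random.normal([64, 100, 8])  # (batch, timesteps, features)
print(tf.keras.layers.LSTM(16)(x).shape)                         # (64, 16)
print(tf.keras.layers.LSTM(16, return_sequences=True)(x).shape)  # (64, 100, 16)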

Here is what the flow of information looks like with return_sequences=True:

[Diagram: information flow with two stacked bidirectional RNN layers]

The interesting thing about using an RNN with return_sequences=True is that the output still has 3 axes, like the input, so it can be passed to another RNN layer, like this:

model = tf.keras.Sequential([
    encoder,
    tf.keras.layers.Embedding(len(encoder.get_vocabulary()), 64, mask_zero=True),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1)
])
model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              optimizer=tf.keras.optimizers.Adam(1e-4),
              metrics=['accuracy'])
history = model.fit(train_dataset, epochs=10,
                    validation_data=test_dataset,
                    validation_steps=30)
Epoch 1/10
391/391 [==============================] - 80s 164ms/step - loss: 0.6861 - accuracy: 0.5069 - val_loss: 0.4562 - val_accuracy: 0.7812
Epoch 2/10
391/391 [==============================] - 58s 149ms/step - loss: 0.4216 - accuracy: 0.8140 - val_loss: 0.3823 - val_accuracy: 0.8146
Epoch 3/10
391/391 [==============================] - 59s 150ms/step - loss: 0.3561 - accuracy: 0.8470 - val_loss: 0.3346 - val_accuracy: 0.8396
Epoch 4/10
391/391 [==============================] - 59s 150ms/step - loss: 0.3264 - accuracy: 0.8627 - val_loss: 0.3252 - val_accuracy: 0.8500
Epoch 5/10
391/391 [==============================] - 58s 149ms/step - loss: 0.3225 - accuracy: 0.8629 - val_loss: 0.3498 - val_accuracy: 0.8271
Epoch 6/10
391/391 [==============================] - 58s 149ms/step - loss: 0.3055 - accuracy: 0.8738 - val_loss: 0.3250 - val_accuracy: 0.8568
Epoch 7/10
391/391 [==============================] - 58s 149ms/step - loss: 0.3110 - accuracy: 0.8679 - val_loss: 0.3410 - val_accuracy: 0.8422
Epoch 8/10
391/391 [==============================] - 58s 149ms/step - loss: 0.2991 - accuracy: 0.8724 - val_loss: 0.3189 - val_accuracy: 0.8573
Epoch 9/10
391/391 [==============================] - 60s 154ms/step - loss: 0.2977 - accuracy: 0.8736 - val_loss: 0.3241 - val_accuracy: 0.8505
Epoch 10/10
391/391 [==============================] - 60s 153ms/step - loss: 0.2976 - accuracy: 0.8727 - val_loss: 0.3212 - val_accuracy: 0.8557

test_loss, test_acc = model.evaluate(test_dataset)

print('Test Loss: {}'.format(test_loss))
print('Test Accuracy: {}'.format(test_acc))
391/391 [==============================] - 28s 72ms/step - loss: 0.3196 - accuracy: 0.8563
Test Loss: 0.31963279843330383
Test Accuracy: 0.8562800288200378

# predict on a sample text without padding.

sample_text = ('The movie was not good. The animation and the graphics '
                    'were terrible. I would not recommend this movie.')
predictions = model.predict(np.array([sample_text]))
print(predictions)
[[-1.7240916]]

plt.figure(figsize=(16,6))
plt.subplot(1,2,1)
plot_graphs(history, 'accuracy')
plt.subplot(1,2,2)
plot_graphs(history, 'loss')

[Plot: training and validation accuracy (left) and loss (right) across epochs]

Check out other existing recurrent layers such as GRU layers.
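
For instance, here is a minimal sketch (not from this tutorial) of the same architecture with a GRU in place of the LSTM:

gru_model = tf.keras.Sequential([
    encoder,
    tf.keras.layers.Embedding(len(encoder.get_vocabulary()), 64, mask_zero=True),
    tf.keras.layers.Bidirectional(tf.keras.layers.GRU(64)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1)
])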

If you're interested in building custom RNNs, see the Keras RNN Guide.