How Positive was your Year with TensorFlow.js and Twilio

Lizzie Siegle - Dec 16 '19 - Dev Community

This blog post was written for Twilio and originally published on the Twilio blog.

As 2019 (and the decade) comes to an end, it's interesting to reflect on the time spent. What do our text messages say about how positive or negative our time was? This post uses TensorFlow.js to analyze the sentiment of your Twilio text messages for the year.

Prerequisites

How does TensorFlow.js help with sentiment analysis?

TensorFlow makes it easier to perform machine learning (you can read 10 things you need to know before getting started with it here), and for this post we will use one of its pre-trained models and training data. Let's go over some high-level definitions:

  • Convolutional Neural Network (CNN): a neural network often used to classify images and video that takes input and returns output of a fixed size. Exhibits translational invariance, that is, a cat is a cat regardless of where in an image it is.
  • Recurrent Neural Network (RNN): a neural network best-suited for text and speech analysis that can work with sequential input and output of arbitrary sizes.
  • Long Short-Term Memory networks (LSTM): a special type of RNN often used in practice due to its ability to learn to both remember and forget important details.

TensorFlow.js provides a pre-trained model trained on a set of 25,000 movie reviews from IMDB, given either a positive or negative sentiment label, and two model architectures to use: CNN or LSTM. This post will be using the CNN.

What do your Twilio texts say about you?

To see what the messages sent to or from your Twilio account say about you, you could view previous messages in your SMS logs, but let's do it with code.

Setting up

Create a new directory to work in called sentiment, and open your terminal in that directory. Run:

npm init --yes

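to create a new Node.js project. Install the dependencies we will use: TensorFlow.js, node-fetch (to fetch metadata for the TensorFlow.js sentiment convolutional neural network), and Twilio. The code below loads node-fetch with require(), which works with version 2 of that package but not with the ESM-only version 3, so we pin the version:
npm install @tensorflow/tfjs node-fetch@2 twilio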
Make a file called sentiment.js and require the Node.js modules at the top. A JavaScript function setup() will retrieve the text messages sent from a personal phone number to our Twilio account using the Twilio client (make sure to get your Account SID and Auth Token from the Twilio console). We set the dates so we retrieve messages sent in 2019, but you can change them to reflect a time period of your choosing. setup() will then return an array of text message bodies.

const tf = require("@tensorflow/tfjs");
const fetch = require("node-fetch");
const client = require("twilio")(
  'REPLACE-WITH-YOUR-TWILIO-ACCOUNT-SID',
  'REPLACE-WITH-YOUR-TWILIO-AUTH-TOKEN'
);

// Retrieve the messages sent from a personal number and return their bodies
const setup = async () => {
  const messages = await client.messages.list({
    // Months are zero-based in Date.UTC: 0 is January, 11 is December
    dateSentAfter: new Date(Date.UTC(2019, 0, 1, 0, 0, 0)),
    dateSentBefore: new Date(Date.UTC(2019, 11, 31, 0, 0, 0)),
    from: "REPLACE-WITH-YOUR-PERSONAL-PHONE-NUMBER"
  });
  return messages.map(m => m.body);
};
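Hard-coding the Account SID and Auth Token makes them easy to leak if the file is ever shared or committed. Here is a minimal sketch of one alternative, assuming the credentials are exported as environment variables. The helper name getTwilioCredentials and the variable names TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN are choices made for this example, not something the code above requires.

```javascript
// Hypothetical helper: read Twilio credentials from environment variables
// instead of hard-coding them. The variable names are an assumed convention.
const getTwilioCredentials = (env = process.env) => {
  const accountSid = env.TWILIO_ACCOUNT_SID;
  const authToken = env.TWILIO_AUTH_TOKEN;
  if (!accountSid || !authToken) {
    throw new Error("Set TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN before running");
  }
  return { accountSid, authToken };
};

// Usage (commented out so the sketch runs without the twilio package installed):
// const { accountSid, authToken } = getTwilioCredentials();
// const client = require("twilio")(accountSid, authToken);
```

Failing early with a clear message is usually friendlier than letting the first API call fail with an authentication error.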

If you have a lot of duplicate messages, you could analyze the sentiment of each unique message by returning Array.from(new Set(messages.map(m => m.body)));.
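As a small illustration of that deduplication step, here is a sketch that uses made-up message bodies in place of real Twilio results. The names sampleMessages and uniqueBodies are introduced only for this example.

```javascript
// Stand-in for the objects returned by client.messages.list(); only `body` is used.
const sampleMessages = [
  { body: "On my way!" },
  { body: "Happy birthday" },
  { body: "On my way!" }
];

// A Set keeps one copy of each distinct string and preserves insertion order.
const uniqueBodies = (messages) => Array.from(new Set(messages.map(m => m.body)));

console.log(uniqueBodies(sampleMessages)); // [ 'On my way!', 'Happy birthday' ]
```

The comparison is an exact string match, so "On my way!" and "on my way" would still count as two different messages.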

Prepare, clean, and vectorize data

Next we want to fetch some metadata, which describes the shape and type of the model and can generally be viewed as the training configuration that does some heavy lifting for us. This is where we'll use node-fetch to get the metadata hosted at a remote URL. The model is already trained, so we use the metadata to prepare our text in the same way the training data was prepared.

const getMetaData = async () => {
  // fetch() resolves to a response object; json() parses its body
  const response = await fetch("https://storage.googleapis.com/tfjs-models/tfjs/sentiment_cnn_v1/metadata.json");
  return response.json();
};

Soon we will convert words to sequences of word indices based on the metadata, a process called vectorizing, but first we need a way to make those sequences the same length. Sequences longer than the size of the last dimension of the model's input tensor (metadata.max_len) are truncated by dropping items from the start, and sequences shorter than it are padded with zeros at the start. This function is credited to the TensorFlow.js sentiment example.

const padSequences = (sequences, metadata) => {
  return sequences.map(seq => {
    if (seq.length > metadata.max_len) {
      seq.splice(0, seq.length - metadata.max_len);
    }
    if (seq.length < metadata.max_len) {
      const pad = [];
      for (let i = 0; i < metadata.max_len - seq.length; ++i) {
        pad.push(0);
      }
      seq = pad.concat(seq);
    }
    return seq;
  });
}
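To make the truncating and padding concrete, here is a sketch that applies the same rule to a single sequence with a toy max_len of 5. The helper padOne and the object toyMetadata are hypothetical names for this example; the hosted metadata.json defines its own max_len.

```javascript
// Toy metadata for illustration only.
const toyMetadata = { max_len: 5 };

// Non-mutating equivalent of padSequences for one sequence.
const padOne = (seq, maxLen) => {
  // Keep only the last maxLen entries when the sequence is too long.
  const kept = seq.length > maxLen ? seq.slice(seq.length - maxLen) : seq;
  // Left-pad with zeros when the sequence is too short.
  const padding = new Array(maxLen - kept.length).fill(0);
  return padding.concat(kept);
};

console.log(padOne([7, 8, 9], toyMetadata.max_len));             // [ 0, 0, 7, 8, 9 ]
console.log(padOne([1, 2, 3, 4, 5, 6, 7], toyMetadata.max_len)); // [ 3, 4, 5, 6, 7 ]
```

Unlike the splice() call in padSequences, slice() leaves the caller's array untouched, which can help if the same sequence is reused elsewhere.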

Make a prediction for each text message

We need to load our model before we can predict the sentiment of a text message. This is done in a function similar to the one that loaded our metadata:

const loadModel = async () => {
    const url = `https://storage.googleapis.com/tfjs-models/tfjs/sentiment_cnn_v1/model.json`;
    const model = await tf.loadLayersModel(url);
    return model;
};

Then the function that predicts how positive a text message is accepts three parameters: one text message, the model loaded from a remote URL, and the metadata. In predict, the input text is first trimmed and converted to lower case, then a regular expression removes periods, commas, and exclamation marks before the text is split into words on spaces.

const predict = (text, model, metadata) => {
  const trimmed = text.trim().toLowerCase().replace(/(\.|\,|\!)/g, '').split(' ');

Next, those trimmed words are converted to a sequence of word indices based on the metadata. Suppose a word is in the input but not in the training data or recognition vocabulary. This is called out-of-vocabulary, or OOV. With this conversion, even if a word is OOV, like a misspelling or emoji, it is mapped to a reserved index and can still be embedded as a vector, or array of numbers, which the machine learning model needs as input.

  const sequence = trimmed.map(word => {
    const wordIndex = metadata.word_index[word];
    if (typeof wordIndex === 'undefined') {
      return  2; //oov_index
    }

    return wordIndex + metadata.index_from;
  });
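To see the cleaning and index lookup on a concrete input, here is a sketch with a toy two-word vocabulary. The names toyVocabulary, OOV_INDEX, and toSequence are hypothetical and exist only for this example; the real word_index and index_from values come from the fetched metadata.

```javascript
// Toy vocabulary for illustration; the hosted metadata has a far larger word_index.
const toyVocabulary = {
  word_index: { great: 1, movie: 2 },
  index_from: 3
};
const OOV_INDEX = 2; // same out-of-vocabulary index used in predict

const toSequence = (text, metadata) => {
  // Same cleaning steps as predict: trim, lower-case, drop . , ! and split on spaces.
  const words = text.trim().toLowerCase().replace(/(\.|\,|\!)/g, '').split(' ');
  return words.map(word => {
    const wordIndex = metadata.word_index[word];
    return typeof wordIndex === 'undefined' ? OOV_INDEX : wordIndex + metadata.index_from;
  });
};

console.log(toSequence("Great movie, wow!", toyVocabulary)); // [ 4, 5, 2 ]
```

Here "great" and "movie" are shifted by index_from, while "wow" is not in the toy vocabulary and falls back to the OOV index.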

Finally, the model predicts how positive the text is. We create a tensor from our padded sequence of word indices. Once the output data is downloaded from the GPU to the CPU with the synchronous dataSync() function, we need to explicitly manage memory and free that tensor's memory with dispose() before returning a decimal showing how positive the model thinks the text is.

  const paddedSequence = padSequences([sequence], metadata);
  const input = tf.tensor2d(paddedSequence, [1, metadata.max_len]);

  const predictOut = model.predict(input);
  const score = predictOut.dataSync()[0];
  // Release the input and output tensors now that the score has been read
  input.dispose();
  predictOut.dispose();
  return score;
}

Here's the complete code for predict:

const predict = (text, model, metadata) => {
  // Lower-case the text, strip . , ! and split it into words
  const trimmed = text.trim().toLowerCase().replace(/(\.|\,|\!)/g, '').split(' ');
  // Map each word to its index, falling back to the OOV index for unknown words
  const sequence = trimmed.map(word => {
    const wordIndex = metadata.word_index[word];
    if (typeof wordIndex === 'undefined') {
      return 2; //oov_index
    }
    return wordIndex + metadata.index_from;
  });
  const paddedSequence = padSequences([sequence], metadata);
  const input = tf.tensor2d(paddedSequence, [1, metadata.max_len]);

  const predictOut = model.predict(input);
  const score = predictOut.dataSync()[0];
  // Release the input and output tensors now that the score has been read
  input.dispose();
  predictOut.dispose();
  return score;
}

We could use a helper function that compares a positivity score against fixed thresholds and determines whether that makes the text message positive, negative, or neutral.

const getSentiment = (score) => {
  if (score > 0.66) {
    return `Score of ${score} is Positive`;
  }
  else if (score > 0.4) {
    return `Score of ${score} is Neutral`;
  }
  else {
    return `Score of ${score} is Negative`;
  }
}

This helper function will be called in run(), which calls most of our functions. In run(), we first load our pre-trained model from a remote URL with the TensorFlow.js-specific function loadLayersModel() (load_model() in Keras, a high-level open source neural networks Python library that can run on top of TensorFlow and other machine learning tools), which accepts the URL of a model.json file as its argument. If you have an HDF5 file (which is how models are saved in Keras), you can convert that to a model.json using the TensorFlow.js pip package.

For each text, the model makes a prediction and adds it to a running sum of decimals before finally calling getSentiment() on the average of the predictions for each text message.

async function run(texts) {
  const url = `https://storage.googleapis.com/tfjs-models/tfjs/sentiment_cnn_v1/model.json`;
  const model = await tf.loadLayersModel(url);
  const metadata = await getMetaData();
  let sum = 0;
  texts.forEach(function (text) {
    console.log(` ${text}`);
    // predict() already returns a number, so it can be added directly
    const score = predict(text, model, metadata);
    sum += score;
  });
  // Average the scores across all messages
  console.log(getSentiment(sum / texts.length));
}
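An average can hide how the individual messages split across categories, and dividing by zero yields NaN when no messages are found, which the thresholds would then report as Negative. As a sketch of one way to handle both, the hypothetical helpers labelFor and summarize below reuse the same thresholds as getSentiment, guard against an empty list, and count messages per label.

```javascript
// Same thresholds as getSentiment, returning only the label.
const labelFor = (score) => {
  if (score > 0.66) return "Positive";
  if (score > 0.4) return "Neutral";
  return "Negative";
};

// Hypothetical helper: average the scores and count how many fall under each label.
const summarize = (scores) => {
  const counts = { Positive: 0, Neutral: 0, Negative: 0 };
  if (scores.length === 0) {
    // Nothing to average; report null instead of NaN.
    return { average: null, counts };
  }
  let sum = 0;
  for (const score of scores) {
    sum += score;
    counts[labelFor(score)] += 1;
  }
  return { average: sum / scores.length, counts };
};

console.log(summarize([0.75, 0.5, 0.25]));
// { average: 0.5, counts: { Positive: 1, Neutral: 1, Negative: 1 } }
```

In run(), the scores returned by predict could be collected into an array and passed to a helper like this instead of being added to a running sum.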

Don't forget to call run()!

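setup().then(function(result) {
  // Return the promise so errors inside run() reach the catch handler
  return run(result);
}).catch(function(err) {
  console.error(err);
});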

Test your app

On the command line, run node sentiment.js. You should see each message printed, followed by whether your texts for the year are positive, negative, or neutral on average.
Was your year positive? What about your decade maybe?

What's Next?

In this post, you saw how to retrieve old text messages from the Twilio API, clean input with regular expressions, and perform sentiment analysis on texts with TensorFlow in JavaScript. You can also change the dates you retrieve text messages from or change the phone number (maybe your Twilio number sent more positive messages than your personal one sent to a Twilio number!).

For other projects, you can apply sentiment analysis to other forms of input like text files of stories (the first Harry Potter story is a text file on GitHub here, you're welcome!), real-time chat (maybe with Twilio), email, social media posts like tweets, GitHub commit messages, and more!

If you have any questions or are working with TensorFlow and communications, I'd love to chat with you!
