Building BookWorm: A book info & recommendation bot using Twilio!

Jayesh Bapu Ahire - Mar 30 '20 - Dev Community

In the previous article, we built a WhatsApp bot to fight fake news! If you missed it, you can check it out here. In this detailed tutorial, we will build a bot that recommends books and tells us about any book we ask for.

Let's just jump into this!

Aim:

We will build a WhatsApp bot that takes a book's name as input, returns more information about that book, and recommends similar books!

What will we need?

Dataset:

We will be using the goodbooks-10k dataset.

This dataset contains six million ratings for the ten thousand most popular books (those with the most ratings). There are also:

  • books marked to read by the users
  • book metadata (author, year, etc.)
  • tags/shelves/genres

You can download the zipped data from here: https://github.com/zygmuntz/goodbooks-10k/releases

Pre-Processing:

We will start with some light preprocessing. The dataset is ready to use as it is, but we will drop the columns we won't be using and fill the blank cells.
The dataset initially has 23 columns, out of which we drop 4: title, work_ratings_count, image_url and small_image_url.

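# importing pandas module
import pandas as pd

# making data frame from csv file
# (books.csv has no "Name" column, so we keep the default integer index)
books = pd.read_csv("books.csv")

# dropping the columns we won't use
books.drop(["title", "work_ratings_count", "image_url", "small_image_url"], axis=1, inplace=True)

# filling blank values with "Not Available"
books = books.fillna("Not Available")

# saving the result as clean_books.csv, the file the later steps read
books.to_csv("clean_books.csv", index=False)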


We will do some occasional formatting whenever needed.

Let's split the remaining work into two modules:

  1. Fetch Book information
  2. Book recommendation system

Let's look at the first part:

1. Fetch Book information:

This part is very simple, and you don't need to know anything beyond basic Python.
Here, we go through the clean CSV file we got after preprocessing and search each row for the book title we received from the user (the title is stored in the original_title field). Note that the code below compares the query against the whole row, so it can also match other fields, such as the author.
If we find a match, we store the index of that row in a list of matched books.
We now have the list of row indexes matching the user's query. We can fetch whatever information we want from it, so let's keep this list aside for a moment.

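def get_matches(book_title):
    """Return the row indexes of clean_books.csv that contain book_title."""
    matching_books_list = []
    with open('clean_books.csv', 'r') as file_reader:
        # read the header line first so that data rows are numbered from 0
        file_reader.readline()
        search = file_reader.readlines()

        for i, sline in enumerate(search):
            # case-insensitive substring match against the whole row
            if book_title.upper() in sline.upper():
                matching_books_list.append(i)
    return matching_books_list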
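The loop above matches the query against every field in a row. If you would rather match only the title column, a sketch along these lines could work. It uses only the standard csv module; get_title_matches, its parameters, and the sample rows are hypothetical names and made-up data for illustration, not part of the original code.

```python
import csv
import io

def get_title_matches(book_title, csv_file, title_column="original_title"):
    """Return 0-based row positions whose title column contains book_title."""
    query = book_title.strip().upper()
    matches = []
    if not query:
        return matches
    reader = csv.DictReader(csv_file)
    for i, row in enumerate(reader):
        # Row 0 is the first data row, which lines up with df.iloc[0]
        if query in (row.get(title_column) or "").upper():
            matches.append(i)
    return matches

# Tiny in-memory stand-in for clean_books.csv (made-up rows for illustration)
sample = io.StringIO(
    "authors,original_title,average_rating\n"
    "J.R.R. Tolkien,The Hobbit,4.25\n"
    "Hobbit Fan,Some Other Book,3.10\n"
    '"Tolkien, Christopher","The Hobbit, Annotated",4.00\n'
)
print(get_title_matches("hobbit", sample))  # -> [0, 2]
```

With the real file you would pass an open file object instead, for example one opened with open('clean_books.csv', newline=''). The second sample row is skipped because "Hobbit" appears only in its author field.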

Now let's see how to serve these results on WhatsApp!
We'll create a Flask server that exposes our book information as an API.

import pandas as pd
from flask import Flask, request
from twilio.twiml.messaging_response import MessagingResponse
import book_ratings  # our module containing get_matches()

app = Flask(__name__)
@app.route('/sms', methods=['POST'])
def sms():
    resp = MessagingResponse()
    inbMsg = request.values.get('Body')
    book_list = book_ratings.get_matches(inbMsg)
    df = pd.read_csv('clean_books.csv')
    for i in book_list:
        # one WhatsApp message per matching book
        resp.message(
            'The book with title ' + str(df['original_title'].iloc[i])
            + ' written by ' + str(df['authors'].iloc[i])
            + ' has average user rating of ' + str(df['average_rating'].iloc[i])
            + ' and this book is reviewed by ' + str(df['work_text_reviews_count'].iloc[i])
            + ' people.\n ---------------------------------')
    return str(resp)
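It can help to know roughly what str(resp) hands back to Twilio: an XML document (TwiML) with one Message element per reply. The sketch below rebuilds that shape using only the standard library, so you can see the structure without installing anything. build_twiml is a hypothetical helper, not a Twilio API; the real MessagingResponse also emits an XML declaration and supports features omitted here.

```python
import xml.etree.ElementTree as ET

def build_twiml(messages):
    """Build a TwiML-style document with one <Message> element per reply."""
    response = ET.Element("Response")
    for text in messages:
        ET.SubElement(response, "Message").text = text
    # encoding="unicode" returns a str rather than bytes
    return ET.tostring(response, encoding="unicode")

print(build_twiml(["First reply", "Second reply"]))
# -> <Response><Message>First reply</Message><Message>Second reply</Message></Response>
```

Each call to resp.message(...) in the handler above therefore becomes a separate WhatsApp message.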

Now let's go ahead and see how we can build the recommendation engine.

2. Book recommendation engine:

We will generate recommendations using three different criteria. For each of them, we vectorize the relevant column of our dataset with TF-IDF, compute the cosine similarity between books, and return the books most similar to the one the user asked for.
Let's see how we can do this:
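Before the three variants, a toy version of the underlying idea may help. The sketch below computes TF-IDF vectors and cosine similarity in plain Python on a few hand-written author strings. It is a simplification for illustration: it splits on whitespace and uses single words only, whereas the scikit-learn code in this article also removes stop words and adds bigrams. The smoothed idf and L2 normalisation are, to my understanding, what TfidfVectorizer does by default, which is why a plain dot product (linear_kernel) gives the cosine similarity. The function names are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build L2-normalised TF-IDF vectors (as dicts) from whitespace-split docs."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term appears
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        # Smoothed idf: ln((1 + n) / (1 + df)) + 1
        vec = {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors

def cosine(u, v):
    """Dot product of two unit-length sparse vectors, i.e. their cosine similarity."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())

# Toy author strings, not rows taken from the dataset
authors = ["suzanne collins", "suzanne collins", "stephen king", "stephen king peter straub"]
vecs = tfidf_vectors(authors)

query = 2  # position of "stephen king"
scores = sorted(
    ((cosine(vecs[query], v), i) for i, v in enumerate(vecs) if i != query),
    reverse=True,
)
for score, i in scores:
    print(f"{authors[i]!r}: {score:.2f}")
```

The string sharing both words with the query ranks first, and the two strings with no words in common score zero. The recommendation functions in this section do the same ranking, just on the full dataset.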

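  • Author-based recommendations:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

# Note: `books` needs its 'title' column here, so this assumes it is loaded
# from books.csv without dropping that column.
# min_df=1 is the default and keeps every term; the original passed min_df=0,
# which some recent scikit-learn versions reject.
tf = TfidfVectorizer(analyzer='word', ngram_range=(1, 2), min_df=1, stop_words='english')
tfidf_matrix = tf.fit_transform(books['authors'])
# TF-IDF rows are L2-normalised, so the dot product is the cosine similarity
cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)

# Build a 1-dimensional array with book titles
titles = books['title']
indices = pd.Series(books.index, index=books['title'])

# Function that gets book recommendations based on the cosine similarity score of book authors
def authors_recommendations(title):
    idx = indices[title]
    sim_scores = list(enumerate(cosine_sim[idx]))
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    # skip position 0 (the book itself) and keep the next 20
    sim_scores = sim_scores[1:21]
    book_indices = [i[0] for i in sim_scores]
    return titles.iloc[book_indices]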
  • Tag-based recommendations:
# `books_with_tags` is assumed to be the books data joined with its tag names
# (one 'tag_name' string per row); it is not built in this snippet.
tf1 = TfidfVectorizer(analyzer='word', ngram_range=(1, 2), min_df=1, stop_words='english')
tfidf_matrix1 = tf1.fit_transform(books_with_tags['tag_name'].head(10000))
cosine_sim1 = linear_kernel(tfidf_matrix1, tfidf_matrix1)

# Build a 1-dimensional array with book titles
titles1 = books['title']
indices1 = pd.Series(books.index, index=books['title'])

# Function that gets book recommendations based on the cosine similarity score of book tags
def tags_recommendations(title):
    idx = indices1[title]
    sim_scores = list(enumerate(cosine_sim1[idx]))
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    # skip position 0 (the book itself) and keep the next 20
    sim_scores = sim_scores[1:21]
    book_indices = [i[0] for i in sim_scores]
    return titles1.iloc[book_indices]

  • Corpus-based recommendations: Here we build recommendations using both the authors and the tags attributes for better results. We create a corpus of features by joining the two attributes and calculate TF-IDF on that corpus.
# `books` is assumed to have both an 'authors' and a 'tag_name' column here
books['corpus'] = (pd.Series(books[['authors', 'tag_name']]
                .fillna('')
                .values.tolist()
                ).str.join(' '))

tf_corpus = TfidfVectorizer(analyzer='word', ngram_range=(1, 2), min_df=1, stop_words='english')
tfidf_matrix_corpus = tf_corpus.fit_transform(books['corpus'])
cosine_sim_corpus = linear_kernel(tfidf_matrix_corpus, tfidf_matrix_corpus)

# Build a 1-dimensional array with book titles
titles = books['title']
indices = pd.Series(books.index, index=books['title'])

# Function that gets book recommendations based on the cosine similarity score of the corpus (authors + tags)
def corpus_recommendations(title):
    idx = indices[title]
    sim_scores = list(enumerate(cosine_sim_corpus[idx]))
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    # skip position 0 (the book itself) and keep the next 4
    sim_scores = sim_scores[1:5]
    book_indices = [i[0] for i in sim_scores]
    return titles.iloc[book_indices]

The functions above return the titles of the recommended books. To serve the results from our API, we will return the list of matching indexes instead, using return book_indices in place of return titles.iloc[book_indices].
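The index-returning step can be sketched on its own, without the dataset. Here sim_row stands in for one row of a similarity matrix such as cosine_sim_corpus[idx]; top_similar_indices and the sample scores are hypothetical and only illustrate the sort-and-slice logic used in the functions above.

```python
def top_similar_indices(sim_row, k=4):
    """Return positions of the k most similar items, skipping the top hit."""
    scored = sorted(enumerate(sim_row), key=lambda pair: pair[1], reverse=True)
    # scored[0] is normally the queried book itself (similarity 1.0), so skip it
    return [i for i, _ in scored[1:k + 1]]

# Made-up similarity scores; position 1 plays the role of the queried book
row = [0.10, 1.00, 0.80, 0.00, 0.35, 0.60]
print(top_similar_indices(row, k=3))  # -> [2, 5, 4]
```

Returning positions rather than titles lets the server look up any column it needs for each recommended book.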

Now let's see how serving works for the book recommendation system.

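import pandas as pd
from flask import Flask, request
from twilio.twiml.messaging_response import MessagingResponse
import recommendations  # our module containing corpus_recommendations()

app = Flask(__name__)
@app.route('/sms', methods=['POST'])
def sms():
    resp = MessagingResponse()
    inbMsg = request.values.get('Body')
    # list of row indexes of the recommended books
    rec = recommendations.corpus_recommendations(inbMsg)
    df = pd.read_csv('clean_books.csv')
    resp.message('Recommendations based on your input:')
    for i in rec:
        # NOTE: the +2 offset is as in the original code; verify it against your copy of clean_books.csv
        resp.message(str(df['original_title'].iloc[i + 2]) + "\n")
    return str(resp)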

You can use any of the recommendation functions; I have used the corpus-based one here because it considers both authors and tags.

Final steps

Once this is done, we can run our Flask server like this:

FLASK_APP=app:app FLASK_ENV=development flask run

To test this, we'll need to open a tunnel to the server running on our machine. We will use ngrok for this. Once you have installed ngrok, run:

ngrok http 5000

This opens a tunnel to port 5000 and gives us a public ngrok URL that points to our local application. Next, we open the WhatsApp Sandbox in our Twilio console and enter that URL plus the path /sms into the field labelled "When a message comes in".

Let's send our sandbox number a message with a book name and see the results:

  • Book Information

  • Book Recommendation

We have successfully generated book information and recommendation! Isn't this cool?

You can find the complete code here.

What's next?

This was a basic intro to how you can create a recommendation system using the Twilio WhatsApp API or Messaging API. You can use a similar approach to enhance the customer experience in your business.
What are you planning to build with this? Let me know in the comments below or hit me up on Twitter!
