How to Build a Machine Learning Chatbot for Your Business

Ivan Ozhiganov

Founder & CEO at Azoft

#Advanced technology


24 Jan 2019


Let’s imagine a situation: Helen has decided to take a seaside vacation, but she has no time to search for options, especially because there are so many resorts. So the job goes to a travel agency. Helen sends a message to the online chats of three tour operators. Nobody replies in the first chat. An agent from the second invites Helen to their office to look at catalogs. Finally, Pete, a smart chatbot from the third tour agency, evaluates Helen’s request in the chat and offers five different trips.

Pete asked just a couple of questions. In five minutes, he determined when Helen wants to go on vacation and how much she’s willing to spend. At the same time, the smart chatbot promoted the tour agency and helped the business save on hiring additional staff to support the chat. Helen is happy and dreaming about a new bikini. Pete the Chatbot is already answering the next client’s questions. Meanwhile, the tour agency owner is counting the profit.

This story should surprise no one. The market for machine learning chatbots continues to develop. According to experts, the natural language processing market is expected to grow to USD 16 billion by 2021. Modern companies see these prospects and use AI chatbots as a new channel for communicating with clients.

How does a machine learning chatbot work?

AI chatbots are no longer just a standard set of answers to questions. They rely on natural language processing technologies, neural network models, and machine learning algorithms.

Machine learning allows a smart chatbot to communicate with users appropriately and consistently. The more naturally a smart chatbot talks, the more likely a user is to treat it as a real person. In the future, five or ten years from now, a customer won’t even be able to tell who is on the other end: a human or a bot.

Today chatbots demonstrate the magic of understanding:

  • asking questions according to a specific topic
  • responding precisely
  • remembering the context of the conversation

To make a chatbot work, you need to complete these four steps:

  • prepare topics for conversation
  • create samples of the phrases
  • train a chatbot to talk
  • assign the chatbot a specific task

Who needs a machine learning chatbot?

Smart chatbots can replace human support in almost any industry. In particular, consider using machine learning chatbots if your business relates to banking, e-commerce, delivery and logistics services, mobile and internet providers, food service, and so on. And if your company has a contact center, a chatbot can become a new sales channel and a way to improve customer service.

When a machine learning chatbot is effective

Chatbots will be perfect for:

  • consulting about products and services
  • direct sales
  • technical support
  • polls and email campaigns

Chatbots allow businesses to:

  • support communication with customers 24/7
  • automate typical request processing
  • save money on human resources — you don’t need contact center operators or consultants anymore
  • reduce reputational risks

How we created a machine learning chatbot for our HR department

About 100 employees work at Azoft. The company has a social program and a corporate events calendar. There is also a lot of useful data on the corporate web portal. However, HR officers often face a situation where somebody, let’s say John, has a question. John is too lazy to look for the answer, so he writes to the HR department: “I’d like to get compensation for the sports activity. Does it include compensation for go-karting?” There could be more than 10 similar questions a day, all from one employee.

Therefore, we decided to develop a smart chatbot that can determine the needs of Azoft employees and answer their questions. We conducted an R&D study and built a machine learning chatbot for the Azoft HR department. Our virtual assistant for HR is based on a recurrent neural network.

The Project Stages

1. Developed an architecture for a chatbot that doesn’t ask questions.

Before the project began, we set two key requirements for the chatbot:

  • The chatbot has to answer customer inquiries.
  • The chatbot has to engage in dialogue similar to a live conversation to give the impression of “being there”.

First, we created an architecture where the chatbot could answer inquiries from users but couldn’t ask counter questions. This gave us a simple algorithm for processing a user’s phrase:

  1. Convert a user’s phrase from text to a numeric vector.
  2. Process a numeric vector with a User Intent Classifier.
  3. Choose a phrase that answers the user’s inquiry from the list of prepared phrases matching the User Intent.

The simplest chatbot architecture
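
The three steps above can be sketched as a small program. This is only an illustration under stated assumptions, not the Azoft implementation: the names `VOCABULARY`, `RESPONSES`, `vectorize`, `classify_intent`, and `reply` are hypothetical, and the hand-written rule in `classify_intent` stands in for a trained User Intent Classifier.

```python
import random
import re

# Hypothetical vocabulary and prepared phrases, for illustration only.
VOCABULARY = ["vacation", "days", "sports", "compensation"]
RESPONSES = {
    "vacation": ["You can check your remaining vacation days on the portal."],
    "sports": ["Sports compensation is described in the social program."],
    "unknown": ["Sorry, I did not understand. Could you rephrase?"],
}


def vectorize(phrase):
    """Step 1: convert text to a numeric vector (one count per vocabulary word)."""
    words = re.findall(r"[a-z']+", phrase.lower())
    return [words.count(term) for term in VOCABULARY]


def classify_intent(vector):
    """Step 2: map the vector to an intent.

    A toy rule that stands in for a trained classifier.
    """
    vacation_score = vector[0] + vector[1]
    sports_score = vector[2] + vector[3]
    if vacation_score == 0 and sports_score == 0:
        return "unknown"
    return "vacation" if vacation_score >= sports_score else "sports"


def reply(phrase, rng=random):
    """Step 3: choose one of the prepared phrases for the detected intent."""
    intent = classify_intent(vectorize(phrase))
    return rng.choice(RESPONSES[intent])


print(reply("How many vacation days do I have?"))
```

Keeping the three steps as separate functions means the vectorizer or the classifier can later be swapped for a different model without touching the response lookup.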

2. Defined the User Intent using a decision tree.

We decided to try decision trees to determine user intents. To process phrases we had to convert them into numerical data.

We used a bag-of-words algorithm to convert user phrases into numeric vectors. Bag-of-words converts a user’s phrase into a numeric vector whose length equals the number of words in the chatbot’s vocabulary. If a word, for example “basil”, appears in a sentence twice (“I want to buy a bunch of basil but don’t know where to find basil”), then the element for this word in the numeric vector equals two (2). If a word, for example “artichoke”, doesn’t appear at all, then its element equals zero (0).

A numeric vector is an array of numbers in which each element is the count of the vocabulary word at the same index. In effect, the vector records how often each word appears in a sentence and combines these frequencies into a single data record.

Before converting a sentence into a numeric vector, we removed the most frequent words: conjunctions, prepositions, and others.
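
As a rough illustration of this conversion, the sketch below counts vocabulary words in the basil sentence after dropping frequent words. The stop-word list and the seven-word vocabulary are assumptions made up for the example. The original chatbot’s vocabulary and preprocessing are not given in this article.

```python
import re
from collections import Counter

# Hypothetical stop-word list and vocabulary, chosen only for this example.
STOP_WORDS = {"i", "to", "a", "of", "but", "don't", "where", "the"}
VOCABULARY = ["artichoke", "basil", "bunch", "buy", "find", "know", "want"]


def tokenize(sentence):
    """Lowercase the sentence, split it into words, and drop stop words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [word for word in words if word not in STOP_WORDS]


def bag_of_words(sentence, vocabulary=VOCABULARY):
    """Return one count per vocabulary word, in vocabulary order."""
    counts = Counter(tokenize(sentence))
    return [counts[word] for word in vocabulary]


vector = bag_of_words(
    "I want to buy a bunch of basil but don't know where to find basil"
)
print(vector)  # [0, 2, 1, 1, 1, 1, 1]
```

The vector always has one element per vocabulary word: “artichoke” maps to 0 and “basil” to 2, regardless of how long the sentence is.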

An example of how we converted a sentence to a numeric vector using the bag-of-words algorithm

This vector has a position for every word that can occur in a user’s phrase.


The final scheme of communication between a user and chatbot based on the bag-of-words

To classify the user’s intentions, we used a DecisionTreeClassifier from the scikit-learn library.

The bag-of-words algorithm turned out to be quite precise and handled even situations where unfamiliar words were used.
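
A minimal sketch of this setup, assuming scikit-learn’s `CountVectorizer` for the bag-of-words step (the article does not say how the vectors were built in practice). The training phrases, intent labels, and the `predict_intent` helper are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training phrases and intent labels, for illustration only.
phrases = [
    "how many vacation days do I have left",
    "I want to take a vacation next month",
    "can I get compensation for sports activity",
    "does sports compensation include go-karting",
    "what is the phone number of my coworker",
    "find the phone number of a colleague",
]
intents = ["vacation", "vacation", "sports", "sports", "contacts", "contacts"]

# Bag-of-words features; common English words are dropped as stop words.
vectorizer = CountVectorizer(stop_words="english")
features = vectorizer.fit_transform(phrases)

classifier = DecisionTreeClassifier(random_state=0)
classifier.fit(features, intents)


def predict_intent(phrase):
    """Vectorize a new phrase with the fitted vocabulary and classify it."""
    return classifier.predict(vectorizer.transform([phrase]))[0]


# "yoga" was never seen in training; the vectorizer simply ignores it.
print(predict_intent("is yoga covered by sports compensation"))
```

Words outside the fitted vocabulary are skipped by `transform`, so a phrase with an unfamiliar word is classified from the familiar words that remain.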


A conversation with a smart chatbot based on the bag-of-words algorithm

3. Developed an architecture for a chatbot that asks questions and answers user queries.

A “bag-of-words” chatbot could easily answer questions, but it didn’t create the illusion of a live conversation. It couldn’t ask questions. For this reason, we decided to teach our chatbot to ask questions, and we started to change the chatbot architecture.

First, we created a new entity UserIntent. It keeps the information about user intention, a set of phrases that can express this intention, and actions that can be done in response to the user inquiry.

Another entity we added was Action. It determines what actions a chatbot has to take to respond to the user inquiry. This change allowed the chatbot not only to answer a user with a predefined phrase but also to take actions that help create a new phrase in response to the user’s question.

For example, an employee asks the chatbot to find a coworker’s phone number. The algorithm then queries the contact database of the user’s colleagues to find the required phone number. All of these actions must be carried out to fulfill the user’s inquiry.

Besides Action and UserIntent, we added a DataEntity entity that represents the data the chatbot needs to request from the user.

When a chatbot doesn’t know the user’s intention, it waits for the entered phrase. A user says something and the chatbot tries to recognize the user intention from the phrase. If the chatbot succeeds, it can take an action to fulfill the user inquiry. If there is enough data, the chatbot takes an action and “forgets” a user’s question as it’s already answered. If there is not enough data, the chatbot returns to the user with a request to give additional data. So this is what a conversation between the chatbot and a user could look like.
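
The loop described above can be sketched with the three entities. The class names `UserIntent`, `Action`, and `DataEntity` come from the article, but their fields, the `Chatbot` class, the contact data, and the substring matching used in place of the trained classifier are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DataEntity:
    """A piece of data the chatbot must obtain from the user."""
    name: str
    question: str  # what the chatbot asks when the value is missing


@dataclass
class Action:
    """Work the chatbot performs once all required data is available."""
    required: List[DataEntity]
    run: Callable[[Dict[str, str]], str]


@dataclass
class UserIntent:
    """An intention, the phrases that express it, and the responding action."""
    name: str
    phrases: List[str]
    action: Action


# Hypothetical contact data, for illustration only.
CONTACTS = {"john": "555-0100"}


def find_phone(data):
    return CONTACTS.get(data["coworker"].lower(), "I could not find that coworker.")


PHONE_INTENT = UserIntent(
    name="find_phone",
    phrases=["find a phone number"],
    action=Action(
        required=[DataEntity("coworker", "Whose phone number do you need?")],
        run=find_phone,
    ),
)


class Chatbot:
    def __init__(self, intents):
        self.intents = intents
        self.current = None  # intent being handled, if any
        self.pending = None  # DataEntity the chatbot has just asked for
        self.data = {}

    def recognize(self, phrase):
        # Naive substring match standing in for the trained classifier.
        text = phrase.lower()
        for intent in self.intents:
            if any(sample in text for sample in intent.phrases):
                return intent
        return None

    def handle(self, phrase):
        if self.current is None:
            self.current = self.recognize(phrase)
            if self.current is None:
                return "Sorry, I did not understand."
        elif self.pending is not None:
            # The phrase is the answer to the chatbot's last question.
            self.data[self.pending.name] = phrase.strip()
            self.pending = None
        for entity in self.current.action.required:
            if entity.name not in self.data:
                self.pending = entity
                return entity.question  # not enough data: ask the user
        answer = self.current.action.run(self.data)
        self.current, self.data = None, {}  # "forget" the answered question
        return answer


bot = Chatbot([PHONE_INTENT])
print(bot.handle("Please find a phone number"))  # Whose phone number do you need?
print(bot.handle("John"))                        # 555-0100
```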


The architecture of interaction between a user and a chatbot

4. Changed the model in favor of neural networks

The decision tree model based on the bag-of-words algorithm showed pretty good results for recognizing written text. But when we increased the number of user intents, the chatbot vocabulary grew, and so did the number of elements in the bag-of-words vectors produced from user phrases. As a result, we had to spend more and more time training the chatbot. This didn’t work for us, so we decided to train a neural network to determine the user’s intention.

We used a pretrained word2vec model from the open Facebook AI repository to convert phrases into numeric vectors. This type of model produces a corresponding numeric vector for every word.

At this stage, we experimented with a neural network with an MLP (multilayer perceptron) architecture. The numeric vector of a user’s phrase was the sum of the numeric vectors of all the words in the phrase. However, the MLP model was not robust to new words that appeared in questions but weren’t seen during training.

We settled on a neural network model with a bi-directional LSTM layer because we had successfully used it for other NLP tasks.
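
The phrase vector used at this stage can be illustrated with a toy example. The three-dimensional `EMBEDDINGS` table below is invented for the sketch, since a real pretrained model supplies much longer vectors. Skipping unknown words is likewise an assumption about how out-of-vocabulary words might be handled, not a description of the original code.

```python
# Hypothetical 3-dimensional word vectors, for illustration only.
EMBEDDINGS = {
    "vacation": [1.0, 0.0, 0.5],
    "days": [0.5, 0.25, 0.0],
    "sports": [0.0, 1.0, 0.5],
}
DIMENSIONS = 3


def phrase_vector(phrase, embeddings=EMBEDDINGS):
    """Sum the vectors of all known words; unknown words contribute nothing."""
    total = [0.0] * DIMENSIONS
    for word in phrase.lower().split():
        vector = embeddings.get(word)
        if vector is None:
            continue  # out-of-vocabulary word
        total = [t + v for t, v in zip(total, vector)]
    return total


print(phrase_vector("vacation days"))  # [1.5, 0.25, 0.5]
print(phrase_vector("days vacation"))  # same result: the sum ignores word order
```

The result has the same length no matter how many words the phrase contains, but the sum discards word order, and in this sketch a phrase made only of unseen words collapses to a zero vector.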


The architecture of the neural network model with a bi-directional LSTM layer

A machine learning chatbot model based on the neural network recognized user intentions perfectly. The neural network analyzed every word separately, taking all the neighboring words into account, and only then decided what the user wants.

When we changed the chatbot architecture and used another model of the neural network to classify phrases based on the user intention, we got a smart chatbot. Our AI chatbot can answer user questions and ask follow-up questions as well.


Example of conversation between a user and neural network-based chatbot

5. The Outcome

We developed a smart chatbot based on a neural network that determines what a user wants from their phrase alone. Our chatbot demonstrated 99.9% accuracy in understanding natural language during a conversation.

Our study showed that the word2vec algorithm can be used to obtain numeric vectors whose size, unlike that of bag-of-words vectors, does not grow in proportion to the number of words in the chatbot’s vocabulary.

We plan to add new user intentions to the chatbot’s knowledge base and to implement data search in the user’s phrases.

Why do you need to develop a smart chatbot

If your business works with customers every day, you need a chatbot. This applies first of all to B2C companies with high customer traffic. However, a simple chatbot is no longer enough for messengers and live chats. More and more clients interact with companies via digital channels: websites, mobile apps, and messengers. Thus, it’s time to automate the process of interacting with customers.

Virtual assistants now face new requirements: a chatbot has to understand natural language, know business terminology, extract meaning from inquiries, and give logical responses to users’ questions. That’s why chatbot developers are moving from simple virtual assistants and experimental versions to machine learning chatbots.

You too can learn how to create a chatbot and create one for your business. All that’s left is to start.