Chatbots – What, Why, and How?

What are Chatbots?

Why are chatbots all the rage? Why are they popping up all over the tech news? Why are big companies like Google, Facebook, and Microsoft jumping in and creating platforms and products?

Let us start with a few descriptions from the Web.

Wikipedia has an elaborate description.

A chatbot (also known as a talkbot, chatterbot, Bot, chatterbox, Artificial Conversational Entity) is a computer program which conducts a conversation via auditory or textual methods. Such programs are often designed to convincingly simulate how a human would behave as a conversational partner…

Here is one from Kik, that I like.

Bots are like mini-apps that live in a conversation thread. Consumers can chat to bots as if they were chatting to a friend. Bots help people find information, have fun, or get connected to the real world

The key words are “convincingly simulate”. Another term for this is “seemingly intelligent”. Pay careful attention: bots are not humans. Bot makers try their level best to simulate humans, but bots have a long way to go before they come anywhere near human intelligence (or lack of it).

Why do we need Chatbots?

Why do we need chatbots? We have been pretty happy living our lives without them so far. So why, and why now? There are several good stories if you just Google “why chatbots”, but I am going to give you just one of my favorite answers, from Kik.

Why? Three simple reasons:

Messaging has surpassed social media in usage.

Consumers don’t download new apps.

And if chat is the new browser, bots are the new websites.

The beauty of bots is that you don’t have to download new apps. Bots live in your chat app, for which you already have an account. Also, you don’t have to learn a new UI, since you already know how to use your chat app.

How do Chatbots work?

Let us walk through the flow of a simple request to a chatbot and its response. This is an oversimplified version; in reality, the components and interactions are more complex.

Let us assume that you are the user.

You make a simple request to find out how to return a gadget you purchased. This is a typical customer service request. You invoke the customer service chatbot and enter a text message.
In the chatbot world, your request “How do I return my gadget X” is an utterance. The intent of the request is finding instructions on how to send the gadget back for a refund. Humans, being humans, have a variety of ways of expressing the intent. Here are a few. They are all asking for the same thing.
1. I would like to return my mobile phone that I just purchased. Can you tell me how to do it?
2. How do I return my gadget I received yesterday? It is not working.
3. You sent me a defective gadget. I want to send it back.
Once the bot understands this request (using some pattern matching or natural language understanding), it has to map this request to a service at the vendor site. In our example, the request for finding “instructions to return a product” initiates a search in the rules/policy/procedure part of the database.
The bot application (typically server/cloud based) receives this request and searches the knowledge base. For example, the return policies may vary based on products and customer shipping locations.
The knowledge base consists of information about products, users, policies, procedures and other information. It can be a typical company database or some other structure.
The response may indicate the location and instructions on how to return the package. This may be in some geeky format like JSON that programs understand (better than humans).
The bot application extracts relevant information from search results and sends it to the language module.
A component of the language module (called the Natural Language Generator) formats this cryptic answer from the bot application into a polite, human-readable reply and sends it to the chat client.
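
To make the understanding and mapping steps a little more concrete, here is a minimal Python sketch. It uses plain regular expressions for the pattern matching, which is only one of the two options mentioned above and far cruder than real natural language understanding. The intent names, the patterns, and the service names are all made up for this example and do not come from any real bot platform.

```python
import re

# Hypothetical intents and patterns, invented for this example.
# A production bot would more likely use a trained NLU model.
INTENT_PATTERNS = {
    "return_product": re.compile(
        r"\b(return|refund|send (it|this|that) back)\b", re.IGNORECASE
    ),
    "track_order": re.compile(r"\bwhere is my order\b", re.IGNORECASE),
}

# Hypothetical mapping from an intent to a service name at the vendor site.
INTENT_TO_SERVICE = {
    "return_product": "lookup_return_policy",
    "track_order": "lookup_order_status",
}


def detect_intent(utterance):
    """Return the first intent whose pattern matches, or None if nothing matches."""
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(utterance):
            return intent
    return None


def map_to_service(utterance):
    """Map an utterance to a service name, falling back to a human hand-off."""
    intent = detect_intent(utterance)
    return INTENT_TO_SERVICE.get(intent, "handoff_to_human")


# The three example utterances from the walkthrough above.
utterances = [
    "I would like to return my mobile phone that I just purchased. Can you tell me how to do it?",
    "How do I return my gadget I received yesterday. It is not working.",
    "You sent me a defective gadget. I want to send it back.",
]
for text in utterances:
    print(detect_intent(text), "->", map_to_service(text))
```

All three utterances land on the same intent even though they are worded differently, which is the whole point of separating the utterance from the intent. Anything the patterns do not recognize falls through to a hand-off, an assumption of this sketch rather than something every bot does.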

This completes one round trip, from request to response.
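
The second half of the round trip, the knowledge base lookup and the formatted reply, can be sketched in the same spirit. Here the knowledge base is just a Python dictionary keyed by product and shipping region, the lookup returns JSON, and the "Natural Language Generator" is a simple template. The record fields (return_window_days, drop_off), the sample data, and the function names are assumptions made for illustration only.

```python
import json

# Hypothetical knowledge base: return policies keyed by (product, shipping region).
KNOWLEDGE_BASE = {
    ("gadget X", "US"): {
        "return_window_days": 30,
        "drop_off": "any partner courier location",
    },
}


def lookup_return_policy(product, region):
    """Search the knowledge base and return the raw result as a JSON string."""
    record = KNOWLEDGE_BASE.get((product, region))
    if record is None:
        return json.dumps({"status": "not_found", "product": product})
    return json.dumps({"status": "ok", "product": product, **record})


def generate_reply(raw_json):
    """Turn the machine-friendly JSON into a polite, human-readable sentence."""
    data = json.loads(raw_json)
    if data["status"] != "ok":
        return f"Sorry, I could not find return instructions for {data['product']}."
    return (
        f"You can return your {data['product']} within "
        f"{data['return_window_days']} days. "
        f"Please drop it off at {data['drop_off']}."
    )


raw = lookup_return_policy("gadget X", "US")
print(raw)                  # the "geeky" JSON the bot application works with
print(generate_reply(raw))  # the reply sent to the chat client
```

Keeping the lookup and the wording separate means the same JSON could feed a different template, for example a shorter reply for a voice interface, without touching the knowledge base code.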

A few things to think about

So how do various components of the bot work? For example, how does the natural language understanding (NLU) module know how to extract the intent of the user?
How does NLU know what other information is needed from the client to satisfy the request?
How does the mapper work? How is the intent mapped to a set of functions in the bot application?
How does Natural Language Generation (NLG) work?
What happens if the user has typos in their request?
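
As a small taste of the typo question, here is one naive approach: snap each word to the closest entry in a short list of words the bot cares about, using difflib from Python's standard library. The word list and the 0.8 similarity cutoff are arbitrary choices for this sketch, and this is not a claim about how any particular NLU module handles misspellings.

```python
import difflib

# Hypothetical vocabulary of words that matter to the bot's intents.
KNOWN_WORDS = ["return", "refund", "gadget", "order", "defective"]


def correct_typos(utterance, cutoff=0.8):
    """Replace each word with its closest known word if it is similar enough.

    Words are lowercased and split on whitespace only, so punctuation
    attached to a word is not handled in this sketch.
    """
    corrected = []
    for word in utterance.lower().split():
        matches = difflib.get_close_matches(word, KNOWN_WORDS, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


print(correct_typos("how do i retrun my gadgte"))
```

Words that are not close to anything in the list pass through unchanged, so ordinary words like "how" or "my" are left alone.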

These are the kinds of problems that keep bot makers up at night, worrying. We will discuss each component and how it works in more detail in future posts.