Copyright. All rights reserved. No part of the material protected by copyright may be reproduced or utilized in any form, electronic or mechanical, including photocopying, recording, or by any other information storage or retrieval system, without the written consent of the copyright owner.
Who should read this book? This book is valuable to anyone looking to gain insights into chatbots and how to build them. It is intended for beginners who want to get started with the basics of chatbots and the api.ai platform. The book walks you through the following:
• What are chatbots?
• Types of chatbots
• What is the api.ai platform?
• Basics of the api.ai platform
• How to create your own conversational bot
No programming knowledge or specific technical skill sets are required.
Table of Contents
Chapter 1: Introduction
1.1 What are Chatbots?
1.2 Types of Chatbots
1.3 Design considerations for chatbot
Chapter 2: Introduction to api.ai
2.1 What is api.ai?
2.2 Basics of api.ai
2.3 Agents in api.ai
2.4 Domains in api.ai
Chapter 3: Intents in api.ai
3.1 What are intents?
3.2 Creating new Intent
3.3 Actions and Speech responses
3.4 Illustration of Intent: Order Pizza
3.5 Intent Priority
3.6 Operations on Intents
Chapter 4: Entities in api.ai
4.1 What are entities?
4.2 Creating new Entity
4.3 Using Entities in Intents
4.4 Types of entities
4.5 Operations on Entities
Chapter 5: Contexts in api.ai
5.1 What are contexts?
5.2 Types of Context
5.3 Need of contexts
5.4 Property of Context
Chapter 6: Dialogs in api.ai
6.1 Types of Dialogs
Chapter 7: Hands-On
Resources
References
Chapter 1: Introduction 1.1 What are Chatbots? A chatbot is a computer program that can hold a conversation with humans. The conversation medium can be voice or text. Over the last few years, messaging apps have become more popular than social media, and people are spending more time on them. This makes it an attractive proposition for businesses to be available on messaging apps such as Messenger, Slack, Skype, Telegram, etc. to interact with their users or potential customers. To interact with many users at a time, businesses would otherwise need to hire more customer care professionals. A computer application that can simulate human conversation, in other words a chatbot, makes it much more affordable for businesses to interact with their customers on these messaging platforms. Here is an example to help you visualize a chatbot: if you wanted to buy a book from Amazon online, you would go to their website, look around until you found the book you wanted, and then purchase it.
Fig 1: Example of Chatbot in E-commerce
If Amazon made a bot, you would simply be able to message Amazon on Facebook. It would ask you which book you are looking for and find it for you. Instead of browsing a website, you would have a conversation with the Amazon bot, mirroring the type of experience you would get when you buy a book from a store. You might have used Apple's Siri, which is also an example of a chatbot.
1.2 Types of Chatbots From a development perspective, you can classify chatbots into two main categories:
• Limited rule-based bots
• Machine learning bots
1.2.1 Limited rule-based bots A rule-based bot is limited to certain texts or commands. If the user says something other than the defined commands, the bot won't be able to respond with the desired effect.
1.2.2 Machine Learning bots Machine learning chatbots work using Artificial Intelligence that can understand language, not just commands. The best part is that these bots get continuously smarter and learn from past conversations. To help illustrate how they work, here is a typical conversation between a human and a chatbot:
Human: "I need a direct flight from Singapore to Germany"
Bot: "Great! When would you like to travel?"
Human: "From Dec 24, 2016 to Jan 10, 2017"
Bot: "Got it! Looking for flights from Singapore to Germany on 24-12-2016 returning 10-01-2017 that are direct."
1.3 Design considerations for chatbot As far as the design of chatbots is concerned, defining the conversation (conversational design) is the most crucial part of building a bot. Good conversational design is bot driven. Bots may also follow a certain pattern in which they receive and respond to user messages in natural language. Thus, it is always advisable to keep the bot updated and refined so that it understands the conversational flow between the bot and the user. The following are some key points to follow while designing any bot:
• Create a chatbot that helps the user and enhances the user experience.
• Bots work great in situations where there is a single response to a query, but they are less effective at walking users through a process that has multiple steps. If you can answer in a few words, use a bot; if the answer is more complex, it should be handled by a human.
• Try to keep the conversation bot driven, because you cannot always predict how end users will interact with the system.
• Limit the number of bots in your application, typically to one if it is a web-based application.
• Monitor the performance of the bot in order to understand the way it responds to a particular set of commands or natural language statements.
Chapter 2: Introduction to api.ai 2.1 What is api.ai? Api.ai is a conversational user experience platform that enables developers to design and integrate natural language interactions into their solutions in a matter of minutes. Api.ai was named a Gartner Cool Vendor in 2015. It is an exciting development in the world of APIs and the 'Internet of Things'. It is a platform that lets developers seamlessly integrate intelligent voice command systems into their products to create consumer-friendly, voice-enabled user interfaces. Example: Apple's Siri. Sign up at https://api.ai/ to explore more about api.ai.
2.2 Basics of api.ai The api.ai platform comprises the following:
• Agents: Agents can be described as NLU (Natural Language Understanding) modules for applications. Their purpose is to transform natural user language into actionable data.
• Intents: An intent represents a mapping between what a user says and what action should be taken by your software.
• Entities: An entity is a concept we want our bot to understand when it is mentioned by the user in conversation. Each entity has a range of values and properties that contain the terms the bot will need to understand to respond to this concept.
• Contexts: A context is a powerful tool that can help to create sophisticated voice interaction scenarios.
• Domains: Domains are pre-defined knowledge packages. By enabling Domains, you can make your agent understand thousands of diverse requests, with no coding required. There are different types of domains available, such as Weather, Booking and Calculator.
• Actions: An action corresponds to the step your application will take when a specific intent has been triggered by a user input. Actions can have parameters for extracting information from user inputs.
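To see how these pieces fit together in practice, here is a minimal sketch of querying an agent over the api.ai v1 REST API. The endpoint, version parameter and response field names follow the v1 API as generally documented at the time of writing, and the client access token is a placeholder you would copy from your agent's settings; treat all of these as assumptions to verify against the current documentation.

```python
import json
import uuid
import requests

# Assumptions: api.ai v1 query endpoint and a placeholder client access token.
API_AI_QUERY_URL = "https://api.api.ai/v1/query?v=20150910"
CLIENT_ACCESS_TOKEN = "YOUR_CLIENT_ACCESS_TOKEN"  # replace before running

def ask_agent(text, session_id):
    """Send one user utterance to the agent and return the parsed result."""
    response = requests.post(
        API_AI_QUERY_URL,
        headers={
            "Authorization": "Bearer " + CLIENT_ACCESS_TOKEN,
            "Content-Type": "application/json",
        },
        data=json.dumps({"query": text, "lang": "en", "sessionId": session_id}),
    )
    result = response.json().get("result", {})
    # The result ties the concepts above together: the matched intent,
    # the action, the extracted entity values and the speech response.
    print("Intent:    ", result.get("metadata", {}).get("intentName"))
    print("Action:    ", result.get("action"))
    print("Parameters:", result.get("parameters"))
    print("Speech:    ", result.get("fulfillment", {}).get("speech"))
    return result

if __name__ == "__main__":
    ask_agent("I want to order a pizza", session_id=str(uuid.uuid4()))
```

The sessionId simply identifies one conversation; reusing it across calls is what lets the agent keep track of contexts later in the book.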
2.3 Agents in api.ai Agents can be described as ‘Natural Language Understanding’ modules for applications. You would be prompted to create a new agent once you register with api.ai.
Fig 2: Create new agent
Agents can be designed to manage a conversation flow in a specific way. This can be done with the help of contexts, intent priorities and fulfilment in the form of speech responses. Agents are platform agnostic: you only have to design an agent once, and you can then integrate it with a variety of platforms or download files compatible with various applications. The main components of an agent are the intents or domains, which are linked to corresponding actions (which may contain parameters from user input) and the speech response.
2.3.1 Export and Import for Agents
• Export Agent: You can export your entire agent as a zip file, which will contain all of the intents and entities from your agent in JSON format. To do this, go to your agent settings, select 'Export and Import' from the horizontal menu, and click on the corresponding button. (Refer Fig 3)
• Import Agent: To import an agent, first create a new agent, then set its language to the same as the agent being imported, select 'Export and Import' from the horizontal menu, click the 'Import from zip' button, and follow the instructions. (Refer Fig 3)
Fig 3: Export and Import Agent
2.4 Domains in api.ai Domains are pre-defined knowledge packages. By enabling Domains, you can make your agent understand thousands of diverse requests, with no coding required. There are different types of domains available, such as Weather, Booking and Calculator. Users can manage the different domains for their applications from the 'Domains' tab, as shown in the figure below.
Fig 4: Domains in api.ai
Chapter 3: Intents in api.ai 3.1. What are intents? An intent represents a mapping between what a user says and what action should be taken by your software. There are four main components of an intent:
• User says: what the user says in natural language (specific to the intent).
• Action: the steps the application will take when a specific intent is triggered by user input.
• Speech Response: how the software or bot responds to the intent in natural language.
• Context: a tool that can help to create sophisticated voice interaction scenarios.
3.2 Creating new Intent While creating a new intent, we need to define its four components. The following figures show how to create an intent by defining these four components.
Fig 5: Create Intent and components of intent The 'User Says' component in an intent can be in two modes:
• Example Mode (indicated by the " icon): examples are written in natural language and annotated so that parameter values can be extracted.
• Template Mode (indicated by the @ icon): templates contain direct references to entities instead of annotations, i.e., entity names are prefixed with the @ sign.
3.3 Actions and Speech responses An action corresponds to the step your application will take when a specific intent has been triggered by a user input. The action name can be defined manually. Actions can have parameters for extracting information from user inputs. Parameters can be filled in automatically from the 'User says' examples and templates, or added manually. Speech responses can contain references to parameter values. If a parameter is present in the parameter table, use the following format to refer to its value in the 'Speech Response' field: $parameter_name. There are special parameter value types that don't appear automatically in the parameter table. If you need to refer to such a special type of value, you'll have to add a new parameter to the parameter table, define its value manually, and then reference it in the response as $parameter_name. Example: we define an 'Order food' intent, where 'food-type' is a user-defined parameter that can be defined as follows. (Refer fig)
Fig 6: Passing parameter values
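To make the $parameter_name mechanism concrete, here is a tiny illustration of how a speech response template resolves once a parameter value has been extracted. Api.ai performs this substitution itself on the platform; the template text and the 'food-type' parameter below are assumptions taken from the 'Order food' example above.

```python
# Illustration only: api.ai substitutes $parameter_name references itself.
# The template and the 'food-type' parameter are assumed from the example above.
speech_template = "Sure, one $food-type pizza coming right up!"
extracted_parameters = {"food-type": "vegetarian"}

speech = speech_template
for name, value in extracted_parameters.items():
    speech = speech.replace("$" + name, value)

print(speech)  # Sure, one vegetarian pizza coming right up!
```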
3.4 Illustration of Intent: Order Pizza
Fig 7: Illustration of Intent: Order Pizza In the above example of 'order a pizza', we specify the components for defining an intent. They are as follows:
• User says: I want a pizza / order pizza / deliver pizza, etc.
• Speech Response: Sure, I will get you the best pizza ever. Which one would you like?
• Action: order.pizza
• Context: topping type, pizza
Api.ai will return JSON containing the respective parameters, which can then be used in the application we intend to develop.
Fig 8: JSON for the illustration of ‘order pizza’
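As a rough sketch of how an application might consume a response like the one in Fig 8: the outer structure below follows the api.ai v1 response format (result.action, result.parameters, result.fulfillment.speech), while the exact parameter names ('topping', 'pizza-size') are assumptions for this illustration.

```python
import json

# A response similar in shape to Fig 8; parameter names are illustrative assumptions.
raw_response = """
{
  "result": {
    "action": "order.pizza",
    "parameters": {"topping": "pepperoni", "pizza-size": "large"},
    "fulfillment": {"speech": "Sure, I will get you the best pizza ever. Which one would you like?"}
  }
}
"""

result = json.loads(raw_response)["result"]

if result["action"] == "order.pizza":
    topping = result["parameters"].get("topping", "plain")
    size = result["parameters"].get("pizza-size", "medium")
    # Hand the extracted values to the rest of the application,
    # e.g. an order-management backend.
    print("Placing order: {} pizza with {}".format(size, topping))
```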
3.5 Intent Priority Api.ai allows the user to prioritize intents. Intent priority lets you assign more weight to one of the intents in case an input phrase matches multiple intents. An intent's priority can be changed by clicking on the blue (default) dot to the left of the intent name and selecting the priority from the drop-down menu. (Refer to fig)
Fig 9: Intent Priority
3.6 Operations on Intents There are two basic operations that can be performed on intents: downloading and uploading intents. 3.6.1 Download Intents Intents can be downloaded in JSON or CSV format. To do so, move the cursor over the intent, click the little cloud sign, and choose the format. (Refer Fig)
Fig 10: Download Intents 3.6.2 Upload Intents You can upload intents as CSV, JSON or XML files. To do so, click on the menu button next to the 'Create Intent' button, choose 'Upload intent' and follow the instructions on the page.
Fig 11: Upload Intents
Chapter 4: Entities in api.ai 4.1. What are entities? An entity is a concept we want our bot to understand when it is mentioned by the user in conversation. Each entity has a range of values and properties that contain the terms the bot will need to understand to respond to this concept. While creating intents, it is important to understand which entities are involved and to define them accordingly.
4.2. Creating new Entity To understand entities and how to create them, let's consider the example of an intent to book a room. The entities involved in this intent would be:
• Date: the date for which the user wishes to book the room. (This is a system entity defined by api.ai.)
• Room type: which type of room the user would like to book (e.g., Deluxe, Superior).
Fig 12: Create Entity
4.3. Using Entities in Intents Let's continue with the 'book a room' intent and see how to define the entities and use them in speech responses.
Fig 13: Using Entities in Intents
When the user says 'book a room', they will be prompted to enter the room type. Once they enter the room type, they will be prompted to specify a date. Once both parameters have been provided by the user, the booking is successful. (Refer fig)
Fig 14: Example of book a room
4.4. Types of entities Entities play an important role in defining intents when building a conversational bot. Api.ai classifies entities into three main types:
• System entities
• Developer entities
• User entities
4.4.1. System entities These are pre-built entities provided by api.ai in order to facilitate handling of the most popular common concepts. These are the entities that api.ai understands out of the box, like date, color, email, etc. The full list can be found in the api.ai documentation. Below are some examples of system entities, distinguished by their structure:
• System Mapping Entity: @sys.date matches common date references such as "November 1, 2016" or "The first of November of 2016" and maps them to a reference value in ISO-8601 format, e.g., "2016-11-01T12:00:00-03:00".
• System Enum Type Entity: @sys.color matches most popular colors and returns the matched color as it is, without mapping it to any reference value. For example, shades of red, such as "scarlet" or "crimson", won't be mapped to "red" and will return their original values "scarlet" and "crimson".
• System Composite Entity: @sys.unit-currency is meant for matching amounts of money with an indication of the currency name, e.g., "50 euros" or "twenty dollars and five cents". It returns an object type value consisting of two attribute-value pairs: {"amount": 50, "currency": "EUR"}
4.4.2 Developer entities You can create your own entities for your agents, either through web forms, by uploading them in JSON or CSV format, or via API calls. There are three types of developer entities, as listed below:
• Developer mapping entities
• Developer enum entities
• Developer composite entities
Developer mapping entities This entity type allows the mapping of a group of synonyms to a reference value. In natural language, you can often have many ways to say the same thing. For this reason, each mapping entity has a list of entries, where each entry contains a mapping between a group of synonyms (ways that a particular concept could be expressed in natural language) and a reference value. For example, a food type entity could have an entry with a reference value of "vegetarian" with synonyms of "veg" and "veggie". To create such an entity, leave the 'Define Synonyms' checkbox checked and add a reference value and synonyms.
Fig 15: Developer Mapping Entity
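For readers who prefer to prepare entities as files rather than through the web form, here is a rough sketch of what the food type mapping entity might look like when serialized for JSON upload. The field names ("name", "entries", "value", "synonyms") follow the api.ai entity export format as understood at the time of writing and should be checked against the current documentation before uploading.

```python
import json

# Sketch of a developer mapping entity prepared for JSON upload.
# Field names are assumptions based on the api.ai export format.
food_type_entity = {
    "name": "food-type",
    "entries": [
        {"value": "vegetarian", "synonyms": ["vegetarian", "veg", "veggie"]},
        {"value": "non-vegetarian", "synonyms": ["non-vegetarian", "non-veg", "meat"]},
    ],
}

with open("food-type.json", "w") as f:
    json.dump(food_type_entity, f, indent=2)
```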
Developer enum entities Enum type entities contain a set of entries that do not have mappings to reference values. To create an enum type entity, uncheck the checkbox ‘Define Synonyms’ and add entries. Entries can contain simple words or phrases, or other entities. Fig 16: Developer Enum Entity
Developer composite entities Composite entities are enum type entities whose entries contain other entities used with aliases. They return object type values (if not defined in the parameter table). Composite entities are most useful for describing objects or concepts that can have several different attributes. For example, simple robot moves can have two characteristics: direction and number of steps. We may want to describe all possible combinations of these two characteristics within one entity. First, we create a mapping entity @direction for direction options:
Fig 17(a): Developer Composite Entity Then we create a composite entity @move for the move options. Make sure to uncheck the 'Define synonyms' checkbox.
Fig 17(b): Developer Composite Entity Finally, we'll use the @move entity with an alias in the "Robot moves" intent.
Fig 17(c): Developer Composite Entity The phrase "Move five steps forward" will return the following parameter values:
"parameters": {
  "move": {
    "direction": "forward",
    "steps": 5
  }
}
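Because a composite entity returns an object type value, the application has to unpack its attributes itself. The minimal sketch below assumes the "move" parameter and its "direction"/"steps" attributes from the example above; the surrounding response structure is simplified.

```python
# Unpacking the object-type value returned for the composite @move entity.
# The parameter name "move" and its attributes are taken from the example above;
# the surrounding structure is an assumption for illustration.
parameters = {
    "move": {"direction": "forward", "steps": 5},
}

move = parameters.get("move", {})
direction = move.get("direction", "forward")
steps = int(move.get("steps", 1))  # numeric values may arrive as strings, so coerce

print("Robot will take {} step(s) {}".format(steps, direction))
```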
4.4.3 User entities These are entities that are defined for a specific end user. For example, a @playlist entity that holds generic playlist names could be defined per request or for a given session, since playlists tend to be user-specific.
4.5 Operations on Entities There are two basic operations that can be performed on entities: Downloading and Uploading Entities.
4.5.1 Download Entities Entities can be downloaded in JSON or CSV format. To do so, move the cursor over the entity, click the little cloud sign, and choose the format. (Refer Fig)
Fig 18: Download Entity
4.5.2 Upload Entities You can upload entities as CSV, JSON or XML files. To do so, click on the menu button next to the 'Create Entity' button, choose 'Upload entity' and follow the instructions on the page. Fig 19: Upload Entity
Chapter 5: Contexts in api.ai 5.1. What are contexts? Defining the conversation is a key aspect of designing conversational bots. A context is a powerful tool that can help to create sophisticated voice interaction scenarios. When a dialog is created, it's usually the case that many branches of conversation are possible. Therefore, intents in an agent should be designed so that they follow each other in the correct order during a conversation.
5.2. Types of Context
• Output context
• Input context
Corresponding input and output contexts in intents determine whether they follow or precede one another.
Fig 20: Input and Output Contexts
5.3. Need of contexts To understand contexts, let's take the example of a 'play music' intent. The entities involved in this intent would be the music artists whose tracks are to be played. So the user can say 'I want to listen to Eminem', and the speech response can be 'Playing tracks of Eminem'. This is a case where there is a single query and response. But what if a user wants to issue follow-up commands like 'Play the next song'? This is where contexts come into the picture. Contexts are like 'topics in discussion' and they help us coordinate the discussion. So, in order to handle the 'Play the next song' request, we define the output context 'playing-music' in the 'play music' intent. We also create a new intent 'next song' with 'playing-music' as both the input and output context. (Refer fig)
Fig 21: Defining output context for an intent
Fig 22: Defining input context for intent Thus, by defining contexts, commands related to multiple queries can be handled.
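The follow-up command only works because both requests share the same session, so the 'playing-music' context set by the first intent is still active when the second arrives. The sketch below sends two queries with one sessionId over the api.ai v1 API; the endpoint, token placeholder and the "contexts" field of the response are assumptions based on the v1 API as generally documented.

```python
import json
import uuid
import requests

# Assumed api.ai v1 endpoint and a placeholder client access token, as before.
API_AI_QUERY_URL = "https://api.api.ai/v1/query?v=20150910"
CLIENT_ACCESS_TOKEN = "YOUR_CLIENT_ACCESS_TOKEN"

def query(text, session_id):
    response = requests.post(
        API_AI_QUERY_URL,
        headers={"Authorization": "Bearer " + CLIENT_ACCESS_TOKEN,
                 "Content-Type": "application/json"},
        data=json.dumps({"query": text, "lang": "en", "sessionId": session_id}),
    )
    return response.json().get("result", {})

session = str(uuid.uuid4())

# First turn matches the 'play music' intent and sets the
# 'playing-music' output context for this session.
first = query("I want to listen to Eminem", session)
print(first.get("fulfillment", {}).get("speech"))
print("Active contexts:", [c.get("name") for c in first.get("contexts", [])])

# Second turn in the same session: 'Play the next song' matches the
# 'next song' intent because its input context 'playing-music' is active.
second = query("Play the next song", session)
print(second.get("fulfillment", {}).get("speech"))
```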
5.4. Property of Context ‘Lifespan’ is the most important property possessed by all contexts. By default, the lifespan value is set to five. It means that the context will last for the next five matched intents. So if you have different input/output contexts in each of the following intents, all of them will be collected in the next five stages of the dialog, like a chain of contexts. Sometimes you might want to get rid of a context after the following intent is reached. In these situations, you can simply change the lifespan to one.
Fig 23: Lifespan of Context
Chapter 6: Dialogs in api.ai 'Dialogs' are an integral part of building conversational bots, especially in voice interaction scenarios. Dialogs can be classified into two main types.
6.1. Types of Dialogs
• Linear dialogs
• Non-linear dialogs
6.1.1 Linear Dialogs These are dialogs that help to collect the information necessary to complete the required action. We can use 'slot filling' in order to build linear dialogs. Let's take the example of a ticket booking application for a sightseeing attraction. A user may request a booking using natural language statements with some search parameters, as follows:
• I want to book tickets for Universal Studios
• Can I book tickets for Universal Studios for Sunday?
Our application needs three parameters (necessary information) in order to complete the booking action. They are as follows:
• Date
• Number of Adults
• Number of Children
This is where the 'slot filling' feature can be used. Even before you define the intent, you can list all of the parameters your application can use to perform a search (both required and optional). Thus, we list all the parameters first.
Fig 24: Define parameters in conversation
When you mark a parameter as “Required”, you’ll be asked to write prompts that your application will address to your user when they make a request that does not include that parameter.
Fig 25: Specify ‘Required’ parameters For all the required parameters, you would be requested to ‘define prompts’. This process is known as ‘Slot filling’. (Refer Fig)
Fig 26: Slot Filling in api.ai
Thus, the system will prompt for the date, the number of adults and the number of children before completing the 'book.ticket' action. The agent will continue to ask questions until all the information for the required parameters has been collected. At any point, the user can ask to cancel or start the dialog from the beginning.
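From a client application's point of view, a slot-filling dialog is simply a loop of prompts and answers within one session. The sketch below drives such a loop over the api.ai v1 API; the endpoint, token placeholder and the "actionIncomplete" flag (which the v1 response uses to signal that required slots are still missing, as understood at the time of writing) are assumptions to verify against the documentation.

```python
import json
import uuid
import requests

# Assumed api.ai v1 endpoint and placeholder token, as in earlier sketches.
API_AI_QUERY_URL = "https://api.api.ai/v1/query?v=20150910"
CLIENT_ACCESS_TOKEN = "YOUR_CLIENT_ACCESS_TOKEN"

def query(text, session_id):
    response = requests.post(
        API_AI_QUERY_URL,
        headers={"Authorization": "Bearer " + CLIENT_ACCESS_TOKEN,
                 "Content-Type": "application/json"},
        data=json.dumps({"query": text, "lang": "en", "sessionId": session_id}),
    )
    return response.json().get("result", {})

session = str(uuid.uuid4())
result = query("I want to book tickets for Universal Studios", session)

# While required slots (date, adults, children) are still missing, the agent
# keeps prompting; this sketch assumes the response flags that with "actionIncomplete".
while result.get("actionIncomplete"):
    prompt = result.get("fulfillment", {}).get("speech", "")
    answer = input(prompt + " ")
    result = query(answer, session)

print("All slots filled:", result.get("parameters"))
```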
6.1.2 Non-linear Dialogs These kinds of dialogs may have several branches, depending on the user's answers to conditional statements. An example is a multiple choice question, where the user has to choose one of the pre-defined options. Here, we can take the example of a customer feedback form for a sightseeing attraction. It can have questions like 'Rate your experience at the Universal Studios'. The possible options would be:
• Excellent
• Good
• Satisfactory
• Bad
To build this dialog we'll need to use contexts. Here's how it's done: first, we create an intent which reacts to the "start" command and triggers the dialog. In this intent we ask the first question and set the outgoing context to "rateyourexperience-question". As a result, the intents for the answers to this question will have this as their incoming context.
Fig 27: Define outgoing context
Next we create intents for each of the four expected response variations, with the incoming context “rateyourexperience-question”. These four questions will work only while this context is active.
Fig 28: Defining input contexts for each of the options
Thus, non-linear dialogs in a conversation or voice interaction scenarios can be handled using contexts in api.ai.
Chapter 7: Hands-On Now that we are familiar with the concepts of api.ai, let's try building our first conversational bot. Problem Statement: build a bot that can book a hotel room for a user. There are 4 steps involved in building any conversational bot: 1. Conversational design: define the conversation between the user and the bot after understanding the objective of building the bot. 2. Sketch a flowchart in order to have a pictorial view of the conversation defined. 3. List the intents, entities and contexts involved in building the bot after analyzing the conversation. 4. Develop the bot by creating the required intents, entities and contexts, and test the bot for the expected responses.
7.1. Conversational Design In this section, we define the conversation for our hotel room booking bot. The conversation can be defined as follows:
Bot: Hey, I can assist you with the room booking services for our hotel. Would you be interested?
User: Yes. I would like to book a room from 25th December 2016 to 5th Jan 2017.
Bot: Sure. We have deluxe and superior rooms. The rate for deluxe is 120 USD per night and that for superior is 140 USD per night. The rates are inclusive of breakfast. Which one would you like to book?
User: I would go for a superior.
Bot: Shall I proceed to book?
User: Yes, please.
Bot: Kindly enter your email ID to send a booking confirmation.
User: [email protected]
Bot: Your booking is confirmed. I have sent the booking details to [email protected]. Hope to see you again!
7.2. Flowchart for Conversational Design: Hotel Room Booking
7.3. Determining intents, entities and contexts With reference to the above flowchart, we list the intents, entities and contexts involved in developing the bot.
Agent: Hotel-Room-Booking
Intents: greeting, to book a room, confirm-booking, reject-booking
Entities:
• System defined entities: from date, to date, email-ID
• Developer entities: roomtype
Action: book.room
Contexts: interested, not_interested, confirm_booking, reject_booking
7.4. Develop the Bot Refer to the 'Resources' section for the 'Hotel-Room-Booking' agent.
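Once the agent has been built in the console, you can exercise it end to end from a simple command-line client before wiring it into a messaging channel. The sketch below reuses the assumed api.ai v1 endpoint and a placeholder client access token taken from the 'Hotel-Room-Booking' agent's settings; the action name 'book.room' is the one listed in section 7.3, and the 'actionIncomplete' flag is an assumption about the v1 response format.

```python
import json
import uuid
import requests

# Assumptions: api.ai v1 endpoint, placeholder token from the
# 'Hotel-Room-Booking' agent, and the v1 response field names.
API_AI_QUERY_URL = "https://api.api.ai/v1/query?v=20150910"
CLIENT_ACCESS_TOKEN = "YOUR_CLIENT_ACCESS_TOKEN"

def query(text, session_id):
    response = requests.post(
        API_AI_QUERY_URL,
        headers={"Authorization": "Bearer " + CLIENT_ACCESS_TOKEN,
                 "Content-Type": "application/json"},
        data=json.dumps({"query": text, "lang": "en", "sessionId": session_id}),
    )
    return response.json().get("result", {})

def chat():
    """Simple console loop for exercising the Hotel-Room-Booking agent."""
    session = str(uuid.uuid4())
    print("Type 'quit' to stop.")
    while True:
        text = input("You: ")
        if text.strip().lower() == "quit":
            break
        result = query(text, session)
        print("Bot:", result.get("fulfillment", {}).get("speech"))
        # Once the booking action completes, show the collected parameters.
        if result.get("action") == "book.room" and not result.get("actionIncomplete"):
            print("(booking parameters)", result.get("parameters"))

if __name__ == "__main__":
    chat()
```

Walking through the scripted conversation from section 7.1 in this client is a quick way to confirm that the intents, entities and contexts listed in section 7.3 behave as expected.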
Resources For more examples refer: https://github.com/getstarted-guru/ebookresource/