
A Comprehensive Survey of Existing Chatbot Architectures and Techniques (Part-2) | by Tejwinder Singh
Several techniques and algorithms are available to a programmer designing a chatbot. Some are based on natural language processing and were common in earlier dialog systems, while others rely on deep neural networks. Bradeško and Mladenić [1] and Abdul-Kader and Woods [2] investigated the design techniques and algorithms used in the conversational chatbots of the Loebner Prize Competition, some of which are described below:
1. Pattern Matching: One of the most widely used techniques for designing chatbots. It matches the input message against the patterns stored in the knowledge database using sentence-similarity measures such as bigram methods [3].
2. Parsing: This technique takes the user’s message as input and identifies grammatical components such as subjects, objects, verbs, nouns, and common phrases in order to find dependent and related phrases and to determine the grammatical structure, which can then be analyzed to check whether it forms a valid expression. In this way, a chatbot can cover varied input sentences with similar keywords using a limited set of patterns.

3. AIML: Artificial Intelligence Markup Language, a universal, easy-to-understand derivative of XML. It consists of data objects called AIML objects, which are made up of two kinds of units, topics and categories, containing either parsed or unparsed data. A topic has a name attribute and a set of categories related to that topic. Each category consists of a pattern, which represents the user input, and a template, which gives the chatbot’s response. Categories can be atomic, default, or recursive [4].
4. Markov Chain Models: The idea behind Markov chains is that each letter or word appears in a dataset with a fixed probability given the letters or words that precede it. This idea is used to generate responses that are probabilistically more plausible. As explained by Serban et al., a two-step procedure is followed. First, a seed reply is created from a sequence of keywords extracted from the message. Second, two separate Markov chains generate the words preceding and following the seed keywords, starting from the seed reply. In this way, many candidate responses are produced, and the one with the highest entropy is returned as the chatbot’s response [5].
5. Ontologies: Modern task-based dialogue systems are built on domain ontologies, which hold knowledge as a set of concepts that are hierarchically and relationally interconnected into graphs. This structure can be used to represent the intentions that the system can extract from messages. An ontology defines one or more frames, each frame having a group of slots, and defines the values that each slot can take. Milward and Beveridge examined the use of ontological domain knowledge for dialogue-based breast cancer referrals and for the control of networked home appliances [6].
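The frame-and-slot idea from item 5 can be illustrated with a minimal Python sketch. The frame name `switch_device`, its slots and values, and the helper functions `fill_frame` and `missing_slots` are all hypothetical and chosen only for illustration; they are loosely inspired by the home-appliance setting mentioned above and are not taken from Milward and Beveridge's system. Slot filling is done here by naive keyword lookup, which a real system would replace with proper language understanding.

```python
import re

# Hypothetical mini-ontology: one frame whose slots each list the
# values they are allowed to take (illustrative names only).
ONTOLOGY = {
    "switch_device": {
        "device": {"lamp", "heater", "television"},
        "state": {"on", "off"},
    },
}


def fill_frame(frame_name, message):
    """Fill each slot of the frame with an allowed value found in the message.

    Slots with no matching word in the message are set to None.
    """
    tokens = set(re.findall(r"[a-z]+", message.lower()))
    filled = {}
    for slot, allowed_values in ONTOLOGY[frame_name].items():
        matches = sorted(allowed_values & tokens)
        filled[slot] = matches[0] if matches else None
    return filled


def missing_slots(filled):
    """List the slots the dialogue system would still have to ask about."""
    return [slot for slot, value in filled.items() if value is None]


frame = fill_frame("switch_device", "Please turn the lamp on")
print(frame)                 # {'device': 'lamp', 'state': 'on'}
print(missing_slots(frame))  # []
```

A task-based system would typically keep asking follow-up questions until `missing_slots` returns an empty list, and only then carry out the action the frame describes.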
6. Word Embeddings: Bengio et al. proposed a distributed vector representation of words called word embeddings [7]. Word embeddings convert words into numerical vectors that capture their meanings, semantic relationships, and contexts, and the same word may have different numerical representations. Types of word embeddings include frequency-based embeddings and prediction-based embeddings.
7. TF-IDF: Term frequency-inverse document frequency captures how important a particular word is to a document within a corpus [8]. The ‘term frequency’ component counts the number of times the word appears in the document, and the ‘inverse document frequency’ component penalizes words that appear in many documents across the corpus. The final score is the product of these two terms.
8. Recurrent Neural Networks: A variant of neural networks that stores the state of the previous input and combines it with the current input, which helps preserve some of the relationship between the current input and the inputs that came before it. The result is an internal state that models time-dependent data and is updated at each time step [8].
9. LSTM: To model longer dependencies, the hidden units in recurrent neural networks are replaced with Long Short-Term Memory units [9]. An LSTM unit maintains two different memories: a cell state, which is the long-term memory, and a hidden state, which is the short-term memory. A series of gates determines whether a new input is remembered, forgotten, or used as output.
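To make the TF-IDF scoring from item 7 concrete, here is a minimal sketch in plain Python. It assumes one common variant of the formula: term frequency as the word count divided by the document length, and inverse document frequency as the unsmoothed logarithm of the number of documents divided by the number of documents containing the word. Other variants exist, and the cited work may use a different one. The function name `tf_idf` and the tiny corpus are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {word: score} dict per document.

    Assumed variant (others exist):
      tf(w, d) = count of w in d / number of words in d
      idf(w)   = log(N / number of documents containing w)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each word occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


# Illustrative corpus (not from the article).
corpus = [
    "how do i install ubuntu",
    "how do i update ubuntu",
    "how do i install a printer driver",
]
scores = tf_idf(corpus)
print(max(scores[1], key=scores[1].get))  # update
```

Words such as "how", "do", and "i" occur in every document, so their inverse document frequency is log(1) = 0 and they contribute nothing, while "update" occurs in only one document and receives the highest score there.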
One of the primary objectives of artificial intelligence is to make machines act like humans. As Io and Lee point out, research on chatbots and conversational agents has increased sharply since 2015, but further research should pay more attention to new technologies such as deep learning and to new trends such as mobile chatting apps [10]. Keyner et al. investigated the importance of Open Data as a new trend that enables government transparency and citizen participation. Their chatbot uses machine learning models for entity recognition and intent classification, and a neural network that selects the response from a predefined set of actions [11].
Young, Cambria et al. made the first attempt to integrate a large commonsense knowledge base into an end-to-end conversational model [12]. In artificial intelligence research, commonsense knowledge refers to facts about the everyday world that humans are assumed to know. It has to be integrated into a conversational model to make human-computer interaction more interesting and engaging. Because this kind of knowledge is vast, the model in this paper stores it in an external memory module, which works better than forcing the system to encode it in the model parameters. The model takes both the message content and the related commonsense knowledge into account when selecting an appropriate response, and it employs retrieval-based methods.
Another important issue is the emotional intelligence of these dialogue systems or chatbots. Cambria et al. study the use of commonsense knowledge for developing emotionally sensitive systems [13]. Such systems, or social chatbots, must be designed to enhance user engagement while taking both intellectual quotient (IQ) and emotional quotient (EQ) into account [14]. The primary task of a social chatbot is not to solve every problem but to act as a virtual companion to the user. Social chatbots need to form an emotional connection with the user and can interact through a number of modalities, including text, speech, and vision.
We studied and analyzed the various types of dialogue systems that exist, including rule-based and corpus-based systems. Starting from simple natural language processing techniques such as pattern matching, parsing, and AIML, dialogue systems have come a long way and nowadays implement complex neural network architectures for response generation. The field has moved from rule-based approaches such as ELIZA to data-driven approaches, which can be based either on information retrieval methods, as in the DBpedia Chatbot, or on generation-based methods that use recurrent or hierarchical neural networks to model short-term dependencies. The architecture was further improved by the introduction of long short-term memory units, which made dialogue systems more engaging and helped them produce more natural, human-like responses. Commonsense knowledge and other techniques are being developed to build intellectually and emotionally capable chatbots like XiaoIce. There has clearly been a great deal of research and improvement in the design of dialogue systems.
[1] Bradeško, L., & Mladenić, D. (2012, October). A survey of chatbot systems through a Loebner Prize competition. In Proceedings of Slovenian Language Technologies Society Eighth Conference of Language Technologies (pp. 34–37).
[2] Abdul-Kader, S. A., & Woods, J. C. (2015). Survey on chatbot design techniques in speech conversation systems. International Journal of Advanced Computer Science and Applications, 6(7).
[3] Setiaji, B., & Wibowo, F. W. (2016, January). Chatbot using a knowledge in database: human-to-machine conversation modeling. In 2016 7th International Conference on Intelligent Systems, Modelling and Simulation (ISMS) (pp. 72–77). IEEE.
[4] Shawar, B. A., & Atwell, E. (2002). A comparison between Alice and Elizabeth chatbot systems. University of Leeds, School of Computing research report 2002.19.
[5] Serban, I. V., Lowe, R., Henderson, P., Charlin, L., & Pineau, J. (2015). A survey of available corpora for building data-driven dialogue systems. arXiv preprint arXiv:1512.05742.
[6] Milward, D., & Beveridge, M. (2003, August). Ontology-based dialogue systems. In Proc. 3rd Workshop on Knowledge and reasoning in practical dialogue systems (IJCAI03) (pp. 9–18).
[7] Bengio, Y., Ducharme, R., Vincent, P., & Jauvin, C. (2003). A neural probabilistic language model. Journal of machine learning research, 3(Feb), 1137–1155.
[8] Lowe, R., Pow, N., Serban, I., & Pineau, J. (2015). The Ubuntu dialogue corpus: A large dataset for research in unstructured multi-turn dialogue systems. arXiv preprint arXiv:1506.08909.
[9] Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural computation, 9(8), 1735–1780.
[10] Io, H. N., & Lee, C. B. (2017, December). Chatbots and conversational agents: A bibliometric analysis. In 2017 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM) (pp. 215–219). IEEE.
[11] Keyner, S., Savenkov, V., & Vakulenko, S. (2019). Open Data Chatbot. arXiv preprint arXiv:1909.03653.
[12] Young, T., Cambria, E., Chaturvedi, I., Zhou, H., Biswas, S., & Huang, M. (2018, April). Augmenting end-to-end dialogue systems with commonsense knowledge. In Thirty-Second AAAI Conference on Artificial Intelligence.
[13] Cambria, E., Hussain, A., Havasi, C., & Eckl, C. (2010). Sentic computing: Exploitation of common sense for the development of emotion-sensitive systems. In Development of Multimodal Interfaces: Active Listening and Synchrony (pp. 148–156). Springer, Berlin, Heidelberg.
[14] Shum, H. Y., He, X. D., & Li, D. (2018). From Eliza to XiaoIce: challenges and opportunities with social chatbots. Frontiers of Information Technology & Electronic Engineering, 19(1), 10–26.