Mekhatronika, Avtomatizatsiya, Upravlenie

Dialogue System of Controlling Robot Based on the Theory of Finite-State Automata

https://doi.org/10.17587/mau.20.686-695

Abstract

This article discusses a dialogue control system for manipulation robots. It analyzes the basic methods of automatic speech recognition, speech understanding, dialogue management, and voice response synthesis used in dialogue systems. Three types of dialogue management are considered: "system initiative", "user initiative", and "combined initiative". An object-oriented dialogue control system for a robot is proposed, based on the theory of finite-state machines and using a deep neural network. The main distinguishing feature of the proposed system is that the dialogue process and the robot's actions are implemented separately, which brings the interaction close to the pace of natural dialogue. Constructing the dialogue control system in this way allows it to automatically correct both the speech recognition result and the robot's actions according to the task. Such corrections may be needed because of a user's accent, noise in the working environment, or incorrect voice commands. The correction process consists of three stages and operates in two modes, a special mode and a general mode. The special mode allows users to control the manipulator directly by voice commands. The general mode extends the users' capabilities by allowing them to obtain additional information in real time. At the first stage, continuous speech recognition, that is, real-time conversion of voice to text, is performed by a deep neural network that takes into account the accents and speech rates of different users. At the second stage, the speech recognition result is corrected by managing the dialogue on the basis of finite automata theory. At the third stage, the robot's actions are corrected depending on the operating state of the robot and the dialogue management process.
To achieve a natural dialogue between users and robots, a small database of possible dialogues is created and varied training data are used. In the experiments, the dialogue system controls a KUKA manipulator (KRC4 controller) that places a requested block at a specified location; the system is implemented in Python using the RoboDK software. The experimental procedure and results confirming the operability of the interactive robot control system are presented. The system achieved a fairly high accuracy (92%) and an automatic speech recognition speed close to the rate of natural speech.
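
The second and third stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the command vocabulary, the state names, and the transition table are hypothetical, and the closest-match correction from Python's standard `difflib` module stands in for whatever correction procedure the article actually uses. The sketch only shows how a finite-state dialogue manager can reject or repair a noisy recognition result before any robot action is triggered.

```python
from difflib import get_close_matches

# Hypothetical command vocabulary (stand-in for the small dialogue database).
COMMANDS = ["take red block", "take blue block", "put on table", "put in box", "stop"]

# Hypothetical transition table: (current state, command) -> next state.
TRANSITIONS = {
    ("idle", "take red block"): "holding",
    ("idle", "take blue block"): "holding",
    ("holding", "put on table"): "idle",
    ("holding", "put in box"): "idle",
}


def correct(text, cutoff=0.6):
    """Map noisy recognizer output to the closest known command, or None."""
    matches = get_close_matches(text.lower().strip(), COMMANDS, n=1, cutoff=cutoff)
    return matches[0] if matches else None


class DialogueManager:
    """Finite-state dialogue manager; the robot itself is not modelled here."""

    def __init__(self):
        self.state = "idle"

    def step(self, recognized_text):
        # Stage 2 (assumed form): repair the recognition result.
        command = correct(recognized_text)
        if command is None:
            return "Please repeat the command."
        if command == "stop":
            self.state = "idle"
            return "Stopped."
        # Stage 3 (assumed form): allow only actions valid in the current state.
        next_state = TRANSITIONS.get((self.state, command))
        if next_state is None:
            return f"Cannot '{command}' while {self.state}."
        self.state = next_state
        return f"OK: {command}."
```

In this sketch a misrecognized phrase such as "take red blok" is mapped to the nearest known command, while a command that is invalid in the current state (for example, placing a block when none is held) leaves the state unchanged and produces a clarifying reply instead of a robot action.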

About the Authors

Yin Shuai
Bauman Moscow State Technical University
Russian Federation
Post-Graduate Student of Robotic Systems and Mechatronics Department


A. S. Yuschenko
Bauman Moscow State Technical University
Russian Federation


For citations:


Shuai Y., Yuschenko A.S. Dialogue System of Controlling Robot Based on the Theory of Finite-State Automata. Mekhatronika, Avtomatizatsiya, Upravlenie. 2019;20(11):686-695. (In Russ.) https://doi.org/10.17587/mau.20.686-695


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1684-6427 (Print)
ISSN 2619-1253 (Online)