Mekhatronika, Avtomatizatsiya, Upravlenie


Planning the Movement of Robots in a Social Environment Via Reinforcement Learning

https://doi.org/10.17587/mau.25.520-529

Abstract

The work addresses the problem of controlling robot motion in a social environment, i.e., in crowded places. An algorithm for planning the movement of mobile robots among stationary and moving obstacles using reinforcement learning has been developed and studied. The GA3C-CADRL algorithm, in which the robot and the obstacles are treated as interacting agents, was chosen as a prototype. The algorithm was modified and implemented using an LSTM recurrent neural network that approximates the value and policy functions simultaneously. The network was trained on a common data set obtained through actor-critic reinforcement learning. In addition, the rl_planner and social_msgs components were developed to integrate the pre-trained planning algorithm into a robot control system on the Robot Operating System 2 software platform: the first processes the input data, computes the robot's actions, and generates the required movement velocity, while the second defines messages carrying information about neighboring agents. To test the algorithm, experiments were carried out in three scenarios: with static obstacles only, mixed, and with dynamic agents only. Training the algorithm with 5 agents required up to 1,500,000 episodes. Simulation of robot motion on two tracks in the Gazebo environment showed that among static obstacles the robot reaches the goal in the shortest time, whereas in the presence of dynamic obstacles the travel time doubled because of collision avoidance; the distance to the nearest agent nevertheless remained safe (more than 2 meters).
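The architecture described in the abstract (a shared LSTM that encodes the sequence of neighboring agents, feeding two heads that approximate the policy and value functions) can be sketched roughly as follows. This is a minimal NumPy illustration with random, untrained weights; all names, dimensions, and the discrete action count are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    # One LSTM cell step: compute the input/forget/cell/output gates
    # from the current input x and the previous hidden state h.
    z = W @ x + U @ h + b
    n = h.size
    i, f, g, o = z[:n], z[n:2*n], z[2*n:3*n], z[3*n:]
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    c = f * c + i * np.tanh(g)          # updated cell state
    h = o * np.tanh(c)                  # updated hidden state
    return h, c

def actor_critic(agent_states, rng, obs_dim=4, hid=8, n_actions=11):
    # A shared LSTM consumes a variable-length sequence of neighboring
    # agents' observed states; a policy head and a value head read the
    # final hidden state. Weights are random here (untrained sketch).
    W = rng.standard_normal((4 * hid, obs_dim)) * 0.1
    U = rng.standard_normal((4 * hid, hid)) * 0.1
    b = np.zeros(4 * hid)
    h, c = np.zeros(hid), np.zeros(hid)
    for x in agent_states:              # one step per neighboring agent
        h, c = lstm_step(x, h, c, W, U, b)
    Wp = rng.standard_normal((n_actions, hid)) * 0.1   # policy head
    Wv = rng.standard_normal((1, hid)) * 0.1           # value head
    logits = Wp @ h
    policy = np.exp(logits - logits.max())
    policy /= policy.sum()              # softmax over discrete actions
    value = float((Wv @ h)[0])          # scalar state-value estimate
    return policy, value
```

The key property this sketch shares with the approach in the paper is that the LSTM lets the network handle an arbitrary number of neighboring agents while a single shared trunk serves both the actor and the critic.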

About the Authors

L. A. Stankevich
Peter the Great St. Petersburg Polytechnic University
Russian Federation

Stankevich L. A., Cand. of Tech. Sc., Associate Professor,

Saint-Petersburg.



A. A. Larionov
LTD "Special Technologic Center"
Russian Federation

Saint-Petersburg.





For citations:


Stankevich L.A., Larionov A.A. Planning the Movement of Robots in a Social Environment Via Reinforcement Learning. Mekhatronika, Avtomatizatsiya, Upravlenie. 2024;25(10):520-529. (In Russ.) https://doi.org/10.17587/mau.25.520-529



ISSN 1684-6427 (Print)
ISSN 2619-1253 (Online)