Structural Detection of Visual Objects for Mobile Robots

N. A. Sergievskiy; A. A. Kharlamov

doi:10.17587/mau/17.187-192

Abstract

This paper presents StructDetect, a fast method for object detection. Detection proceeds in two stages: (1) generation of hypotheses (object proposals) and (2) verification of those hypotheses. Object proposals are generated by a simple structural model that combines line segments. Line segments are detected with the EDLines algorithm. Attributes are then computed for the individual segments and for segment pairs, and a "connection table" is built that filters out some of the combinations. Triples of line segments are then formed from the combinations that pass the connection table. Each triple has a handcrafted descriptor based on the line segment attributes. This descriptor is used to train a kNN classifier and to generate object proposals in the region of the three line segments. These proposals define the set of candidate bounding boxes available to the detector. The second stage is based on a convolutional neural network, which produces a fixed-length feature vector for each region. The network is computed once per image, and the feature vectors are extracted by adaptively sized pooling from the last convolutional layer. The feature vectors are then classified by the random forest algorithm. The accuracy of this approach is comparable to that of modern detectors such as SPPNet and R-CNN. StructDetect is 7 times faster than SPPNet and runs at 4 fps on a CPU.
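
As a rough illustration of the proposal stage described in the abstract, the sketch below builds a connection table over line segments and forms triples from it. The abstract does not give the segment attributes, the filtering rule, or the descriptor, so the endpoint-gap criterion, the `max_gap` threshold, the rule that a triple needs at least two connected pairs, and all function names are assumptions made for illustration. The segments are hard-coded rather than produced by EDLines, and the kNN and CNN stages are omitted.

```python
import math
from itertools import combinations


def endpoint_gap(a, b):
    """Smallest distance between an endpoint of segment a and one of segment b.

    A segment is a tuple (x1, y1, x2, y2).
    """
    ends_a = [(a[0], a[1]), (a[2], a[3])]
    ends_b = [(b[0], b[1]), (b[2], b[3])]
    return min(math.dist(p, q) for p in ends_a for q in ends_b)


def connection_table(segments, max_gap):
    """Symmetric boolean table of segment pairs.

    Assumed criterion (not from the paper): two segments are connected
    when their closest endpoints lie within max_gap pixels.
    """
    n = len(segments)
    table = [[False] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if endpoint_gap(segments[i], segments[j]) <= max_gap:
            table[i][j] = table[j][i] = True
    return table


def triple_proposals(segments, max_gap=5.0):
    """Return (indices, bounding box) for every triple the table lets through.

    Assumed rule (not from the paper): a triple is kept when at least two
    of its three pairs are connected, so the segments form a chain.
    """
    table = connection_table(segments, max_gap)
    proposals = []
    for i, j, k in combinations(range(len(segments)), 3):
        links = sum((table[i][j], table[j][k], table[i][k]))
        if links < 2:
            continue
        triple = (segments[i], segments[j], segments[k])
        xs = [v for s in triple for v in (s[0], s[2])]
        ys = [v for s in triple for v in (s[1], s[3])]
        # The proposal is the box enclosing all three segments.
        proposals.append(((i, j, k), (min(xs), min(ys), max(xs), max(ys))))
    return proposals


# Hypothetical input: three sides of a square plus one distant segment.
segments = [
    (0, 0, 10, 0),
    (10, 0, 10, 10),
    (10, 10, 0, 10),
    (100, 100, 120, 100),
]
proposals = triple_proposals(segments)
```

With these four hypothetical segments, only the triple (0, 1, 2) passes the table, and its proposal is the box (0, 0, 10, 10). In the method described above, each surviving triple would additionally be scored by the kNN classifier on its descriptor before being passed to the verification stage.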

Keywords

object detection, object proposals, computer vision, robot vision, neural networks, deep learning, random forest

About the Authors

N. A. Sergievskiy

ELVEES-NeoTek
Russian Federation

A. A. Kharlamov

Institute of Higher Nervous Activity and Neurophysiology of the Russian Academy of Sciences (IHNA&N RAS); Moscow State Linguistics University
Russian Federation

For citations:

Sergievskiy N.A., Kharlamov A.A. Structural Detection of Visual Objects for Mobile Robots. Mekhatronika, Avtomatizatsiya, Upravlenie. 2016;17(3):187-192. (In Russ.) https://doi.org/10.17587/mau/17.187-192

This work is licensed under a Creative Commons Attribution 4.0 License.

ISSN 1684-6427 (Print)
ISSN 2619-1253 (Online)
