
Mekhatronika, Avtomatizatsiya, Upravlenie


Structural Detection of Visual Objects for Mobile Robots

https://doi.org/10.17587/mau/17.187-192

Abstract

This paper presents StructDetect, a fast method for object detection. Detection proceeds in two stages: (1) generation of hypotheses (object proposals) and (2) verification of those hypotheses. Object proposals are generated by a simple structural model that combines line segments. Line segments are detected with the EDLines algorithm. The method then computes attributes for the line segments and for their pairs and builds a "connection table", which filters out some combinations. Next, it forms triples of line segments from the combinations that pass the connection table. Each triple is described by a handcrafted descriptor based on the line segment attributes. This descriptor is used to train a kNN classifier and to generate object proposals in the area covered by the three line segments. These proposals define the set of candidate bounding boxes available to the detector. The second stage is based on a convolutional neural network, which yields a fixed-length feature vector for each region. The convolutional network is computed once per image, and the feature vectors are extracted from the last convolutional layer by adaptively sized pooling. The feature vectors are then classified by a random forest. The accuracy of this approach is comparable to that of modern detectors such as SPPNet and R-CNN. StructDetect is 7 times faster than SPPNet and runs at 4 fps on a CPU.
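The proposal stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the segment attributes, the connection-table rule, or the descriptor, so the endpoint-gap rule, the `max_gap` threshold, the six-value descriptor, and every function name below are assumptions chosen only for illustration. Line segments are taken as given (in the paper they come from EDLines), and a tiny brute-force kNN stands in for the trained classifier.

```python
import math
from itertools import combinations

# A segment is (x1, y1, x2, y2) in pixel coordinates.

def length(s):
    return math.hypot(s[2] - s[0], s[3] - s[1])

def angle(s):
    # Orientation in [0, pi): the direction of traversal does not matter.
    return math.atan2(s[3] - s[1], s[2] - s[0]) % math.pi

def endpoint_gap(a, b):
    # Smallest distance between an endpoint of a and an endpoint of b.
    pa = [(a[0], a[1]), (a[2], a[3])]
    pb = [(b[0], b[1]), (b[2], b[3])]
    return min(math.dist(p, q) for p in pa for q in pb)

def connection_table(segments, max_gap=10.0):
    # Assumed rule: two segments are "connected" if their endpoints nearly touch.
    n = len(segments)
    table = [[False] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if endpoint_gap(segments[i], segments[j]) <= max_gap:
            table[i][j] = table[j][i] = True
    return table

def triples(table):
    # Keep triples in which at least two of the three pairs are connected.
    for i, j, k in combinations(range(len(table)), 3):
        if table[i][j] + table[j][k] + table[i][k] >= 2:
            yield i, j, k

def descriptor(segs):
    # Assumed descriptor: sorted relative lengths, then sorted angle differences.
    total = sum(length(s) for s in segs)
    rel = sorted(length(s) / total for s in segs)
    diffs = []
    for a, b in combinations(segs, 2):
        d = abs(angle(a) - angle(b))
        diffs.append(min(d, math.pi - d))
    return rel + sorted(diffs)

def bounding_box(segs):
    xs = [v for s in segs for v in (s[0], s[2])]
    ys = [v for s in segs for v in (s[1], s[3])]
    return (min(xs), min(ys), max(xs), max(ys))

def knn_is_object(query, train, k=3):
    # train is a list of (descriptor, label) pairs with label 1 (object) or 0.
    nearest = sorted(train, key=lambda t: math.dist(query, t[0]))[:k]
    votes = sum(label for _, label in nearest)
    return votes * 2 > len(nearest)

def propose(segments, train, max_gap=10.0, k=3):
    # Returns candidate boxes for the triples that the kNN vote accepts.
    table = connection_table(segments, max_gap)
    boxes = []
    for idx in triples(table):
        segs = [segments[i] for i in idx]
        if knn_is_object(descriptor(segs), train, k):
            boxes.append(bounding_box(segs))
    return boxes
```

Calling `propose(segments, train)` with a list of segments and a few labeled descriptors returns one bounding box per accepted triple; in the method described above, those boxes would then be passed to the CNN and random forest verification stage.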

About the Authors

N. A. Sergievskiy
ELVEES-NeoTek
Russian Federation


A. A. Kharlamov
Institute of Higher Nervous Activity and Neurophysiology of the Russian Academy of Sciences (IHNA&N RAS); Moscow State Linguistics University
Russian Federation


References

1. Girshick R. B. From rigid templates to grammars: Object detection with structured models. University of Chicago, 2012.

2. Viola P., Jones M. J. Robust real-time face detection // International journal of computer vision. 2004. V. 57, N. 2. P. 137-154.

3. LeCun Y. Gradient-based learning applied to document recognition // Proc. of the IEEE. 1998. V. 86, N. 11. P. 2278-2324.

4. Krizhevsky A., Sutskever I., Hinton G. E. Imagenet classification with deep convolutional neural networks // Advances in neural information processing systems. 2012.

5. Fan R. E., Chang K. W., Hsieh C. J., Wang X. R., Lin C. J. LIBLINEAR: A library for large linear classification // The Journal of Machine Learning Research. 2008. V. 9. P. 1871-1874.

6. Uijlings J. R., van de Sande K. E., Gevers T., Smeulders A. W. Selective search for object recognition // International journal of computer vision. 2013. V. 104, N. 2. P. 154-171.

7. Everingham M., Van Gool L., Williams C. K., Winn J., Zisserman A. The pascal visual object classes (voc) challenge // International journal of computer vision. 2010. V. 88, N. 2. P. 303-338.

8. Deng J., Dong W., Socher R., Li L. J., Li K., Fei-Fei L. Imagenet: A large-scale hierarchical image database // Computer Vision and Pattern Recognition. CVPR 2009. IEEE Conference on IEEE, 2009.

9. Sivic J., Russell B. C., Efros A., Zisserman A., Freeman W. T. Discovering objects and their location in images // Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on. 2005. V. 1.

10. Zitnick C. L., Dollár P. Edge boxes: Locating object proposals from edges // Computer Vision-ECCV 2014. Springer International Publishing, 2014. P. 391-405.

11. Hosang J., Benenson R., Schiele B. How good are detection proposals, really? arXiv preprint arXiv:1406.6962 (2014).

12. Girshick R., Donahue J., Darrell T., Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation // Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on. IEEE, 2014.

13. He K., Zhang X., Ren S., Sun J. Spatial pyramid pooling in deep convolutional networks for visual recognition // Computer Vision-ECCV 2014. Springer International Publishing, 2014. P. 346-361.

14. Girshick R. Fast R-CNN. arXiv preprint arXiv:1504.08083 (2015).

15. Lobov S. A., Sergievskiy N. A., Kharlamov A. A. Adaptation of the convolutional neural network algorithm to FPGA // Program Systems: Theory and Applications. 2013. N. 3 (17). (In Russ.)

16. Cheng M. M., Zhang Z., Lin W. Y., Torr P. BING: Binarized normed gradients for objectness estimation at 300fps // Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on. IEEE, 2014.

17. Akinlar C., Topal C. EDLines: A real-time line segment detector with a false detection control // Pattern Recognition Letters. 2011. V. 32, N. 13. P. 1633-1642.

18. Chatfield K., Simonyan K., Vedaldi A., Zisserman A. Return of the devil in the details: Delving deep into convolutional nets // arXiv preprint arXiv:1405.3531 (2014).

19. Liaw A., Wiener M. Classification and regression by randomForest // R news. 2002. V. 2, N. 3. P. 18-22.



For citations:


Sergievskiy N.A., Kharlamov A.A. Structural Detection of Visual Objects for Mobile Robots. Mekhatronika, Avtomatizatsiya, Upravlenie. 2016;17(3):187-192. (In Russ.) https://doi.org/10.17587/mau/17.187-192



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1684-6427 (Print)
ISSN 2619-1253 (Online)