JBE, vol. 26, no. 2, pp.184-196, March, 2021

DOI: https://doi.org/10.5909/JBE.2021.26.2.184

Object Size Prediction based on Statistics Adaptive Linear Regression for Object Detection

Yonghye Kwon, Jongseok Lee, and Donggyu Sim

Corresponding author e-mail: dgsim@kw.ac.kr

Abstract:

This paper proposes an object size prediction method for object detection based on statistics adaptive linear regression. YOLOv2 and YOLOv3, two representative deep learning-based object detection algorithms, use a statistics adaptive exponential regression model in the last layer of the network to predict object sizes. However, because the derivative of an exponential function grows with its output, an exponential regression model can propagate large gradients of the loss function to all parameters in the network. We propose a statistics adaptive linear regression layer to mitigate this exploding-gradient problem. The proposed model is used in the last layer of the network and predicts object sizes using statistics estimated from the training dataset. We designed a new network based on YOLOv3 tiny, and it achieves higher performance than YOLOv3 tiny on the UFPR-ALPR dataset.
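The contrast between the two regression models can be illustrated with a minimal sketch. The exponential decoding below follows the YOLOv2/YOLOv3 form, in which the predicted size is an anchor size (a statistic obtained from the training boxes) multiplied by exp(t). The linear decoding is only an assumed illustrative form (std * t + mean); the abstract does not give the exact formulation, so the function names and the parameters `mean` and `std` are hypothetical.

```python
import math


def exp_size(t, anchor):
    """YOLOv2/YOLOv3-style size decoding: anchor * exp(t)."""
    return anchor * math.exp(t)


def exp_size_grad(t, anchor):
    """Derivative of exp_size with respect to t; it grows with exp(t)."""
    return anchor * math.exp(t)


def linear_size(t, mean, std):
    """Hypothetical linear decoding (assumed form): std * t + mean."""
    return std * t + mean


def linear_size_grad(t, mean, std):
    """Derivative of linear_size with respect to t; constant in t."""
    return std


if __name__ == "__main__":
    # Illustrative values only: anchor/mean of 50 pixels, std of 12 pixels.
    for t in (0.0, 2.0, 6.0):
        g_exp = exp_size_grad(t, anchor=50.0)
        g_lin = linear_size_grad(t, mean=50.0, std=12.0)
        print(f"t={t}: exponential gradient={g_exp:.1f}, linear gradient={g_lin:.1f}")
```

Under these assumptions, a large raw output t inflates the gradient of the exponential model, while the gradient of the linear model stays fixed at the scale statistic, which is the behavior the abstract attributes to the proposed layer.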



Keywords: Object Detection, Statistics Adaptive Linear Regression, YOLO



