JBE, vol. 23, no. 5, pp. 598-605, September 2018


Behavior Recognition Algorithm Using Skeleton Vector Information and RNN Learning

Mi-Kyung Kim and Eui-Young Cha



Behavior recognition is a technology that recognizes human behavior from data, and it can be used in applications such as detecting risky behavior through video surveillance systems. Conventional behavior recognition algorithms have relied on 2D camera images, multimodal sensors, multi-view setups, or 3D equipment. With two-dimensional data, the recognition rate for behavior in three-dimensional space was low, while the other approaches were difficult to deploy because of their complicated configuration and expensive additional equipment. In this paper, we propose a method that recognizes human behavior from CCTV images alone, using only RGB and depth information and no additional equipment. First, a skeleton extraction algorithm is applied to extract the points of joints and body parts. These points are then transformed by equations into feature vectors that include displacement vectors and relational vectors, and the resulting sequential vector data is learned by an RNN model. Applying the trained model to various data sets and measuring the behavior recognition accuracy confirmed that performance similar to that of existing algorithms using 3D information can be achieved with 2D information alone.
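The pipeline described above (joint points, then displacement and relational vectors, then an RNN over the sequence) can be illustrated with a minimal sketch. The abstract does not give the paper's equations, so everything here is an assumption made for illustration: the displacement vector is taken as the frame-to-frame movement of each joint, the relational vector as the offset between selected joint pairs within a frame, and a plain tanh RNN with random weights stands in for the trained model. The function names, the joint pairs, and the array shapes are all hypothetical.

```python
import numpy as np


def displacement_vectors(joints):
    """Frame-to-frame movement of each joint (assumed definition).

    joints: array of shape (T, J, 2) with 2D joint positions for T frames.
    Returns an array of shape (T - 1, J, 2).
    """
    return joints[1:] - joints[:-1]


def relational_vectors(joints, pairs):
    """Offset from joint a to joint b within each frame (assumed definition).

    pairs: list of (a, b) joint indices. Returns shape (T, len(pairs), 2).
    """
    a, b = np.array(pairs).T
    return joints[:, b] - joints[:, a]


def build_features(joints, pairs):
    """Concatenate both vector types into one feature row per frame."""
    disp = displacement_vectors(joints)            # (T - 1, J, 2)
    rel = relational_vectors(joints, pairs)[1:]    # aligned to frames 1..T-1
    steps = disp.shape[0]
    return np.concatenate(
        [disp.reshape(steps, -1), rel.reshape(steps, -1)], axis=1
    )


def rnn_forward(features, W_x, W_h, b, W_out):
    """Plain tanh RNN (a stand-in, not the paper's model).

    Returns class scores computed from the final hidden state.
    """
    h = np.zeros(W_h.shape[0])
    for x in features:
        h = np.tanh(W_x @ x + W_h @ h + b)
    return W_out @ h


# Toy input: 5 frames, 4 joints, 8 hidden units, 3 behavior classes.
rng = np.random.default_rng(0)
T, J, H, C = 5, 4, 8, 3
joints = rng.normal(size=(T, J, 2))
pairs = [(0, 1), (1, 2), (1, 3)]  # hypothetical joint pairs

features = build_features(joints, pairs)
D = features.shape[1]
scores = rnn_forward(
    features,
    rng.normal(size=(H, D)),
    rng.normal(size=(H, H)),
    np.zeros(H),
    rng.normal(size=(C, H)),
)
print(features.shape, scores.shape)
```

In this sketch each frame after the first yields one feature row, so a clip of T frames becomes a sequence of T - 1 vectors that the RNN reads in order. A real system would replace the random weights with trained ones and the random joints with the output of a skeleton extractor.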

Keywords: skeleton, feature vector, RNN, SELU, deep learning



Copyrightⓒ 2012 The Korean Institute of Broadcast and Media Engineers
All Rights Reserved