
JBE, vol. 27, no. 4, pp.519-526, July, 2022

DOI: https://doi.org/10.5909/JBE.2022.27.4.519

Character Recognition and Search for Media Editing

Yong-Suk Park and Hyun-Sik Kim

Corresponding author e-mail: hskim@keti.re.kr

Abstract:

Identifying and searching for characters that appear in scenes is an arduous and time-consuming part of multimedia video editing. Applying artificial intelligence to such labor-intensive editing tasks can greatly reduce media production time and improve the efficiency of the creative process. This paper proposes a method that combines existing artificial-intelligence-based techniques to automate character recognition and search for video editing. Object detection, face detection, and pose estimation are used to localize characters, and face recognition and color space analysis are used to extract unique representation information.
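The abstract does not say how the extracted representation information is compared during search, so the following is only a minimal sketch under stated assumptions. It assumes that each localized character is described by a face embedding (produced elsewhere by some face recognition model) and a hue histogram standing in for color space analysis, and that candidates are ranked by a weighted cosine similarity. The function names, the `face_weight` parameter, the bin count, and the dictionary layout are all hypothetical and are not taken from the paper.

```python
import colorsys
import math


def hue_histogram(pixels, bins=8):
    """Normalized hue histogram of (r, g, b) pixels with channels in 0..255.

    Assumption: hue alone stands in for "color space analysis".
    """
    hist = [0.0] * bins
    count = 0
    for r, g, b in pixels:
        h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hist[min(int(h * bins), bins - 1)] += 1.0
        count += 1
    return [x / count for x in hist] if count else hist


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def character_score(query, candidate, face_weight=0.7):
    """Weighted mix of face and color similarity (the weighting is an assumption)."""
    face = cosine_similarity(query["face"], candidate["face"])
    color = cosine_similarity(query["color"], candidate["color"])
    return face_weight * face + (1.0 - face_weight) * color


def search(query, candidates):
    """Return candidates ordered from most to least similar to the query."""
    return sorted(candidates, key=lambda c: character_score(query, c), reverse=True)


# Usage with toy data: the face vectors are placeholders, not real embeddings.
red = hue_histogram([(255, 0, 0)] * 4)
blue = hue_histogram([(0, 0, 255)] * 4)
query = {"name": "query", "face": [1.0, 0.0], "color": red}
candidates = [
    {"name": "a", "face": [0.0, 1.0], "color": blue},
    {"name": "b", "face": [1.0, 0.0], "color": red},
]
ranked = search(query, candidates)
```

In this toy setup the candidate whose face vector and clothing color both match the query ranks first. A real system would replace the placeholder vectors with embeddings from a trained face recognition model and would compute the histogram over the pixels of the localized character region.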



Keywords: Video editing, Character recognition, Object detection, Face recognition, Feature extraction




Editorial Office
1108, New building, 22, Teheran-ro 7-gil, Gangnam-gu, Seoul, Korea
Homepage: www.kibme.org TEL: +82-2-568-3556 FAX: +82-2-568-3557
Copyrightⓒ 2012 The Korean Institute of Broadcast and Media Engineers
All Rights Reserved