
JBE, vol. 25, no. 4, pp.620-628, July, 2020


Subdivision Ensemble Model for Highlight Detection

Hansol Lee and Gyemin Lee



Automatically predicting video highlights is an important task for the media industry and streaming platform providers, because it saves the time and cost of manual video editing. We propose a new ensemble model that combines multiple highlight detectors, each focusing on a different part of a highlight event. The model can therefore capture the more information-rich sections of events. Furthermore, the proposed model can extract improved features for highlight detection, particularly when the training video set is small. We evaluate our model on e-sports and baseball videos.

Keywords: Video highlight, Ensemble model, BiLSTM, Event subsection, Event subdivision
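
The abstract does not state how highlight events are subdivided or how the detectors' outputs are combined, so the following is only a minimal sketch of the general idea under stated assumptions: each highlight event (a contiguous run of positive segment labels) is split into equal-length subsections, one label sequence is produced per subsection as a training target for one detector, and the detectors' per-segment scores are merged by simple averaging. The function names `find_events`, `subdivide_labels`, and `ensemble_scores` are hypothetical, and the BiLSTM detectors themselves are not implemented here; plain score lists stand in for their outputs.

```python
from typing import List, Tuple


def find_events(labels: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for each contiguous run of 1s."""
    events = []
    start = None
    for i, value in enumerate(labels):
        if value == 1 and start is None:
            start = i                      # a highlight event begins here
        elif value != 1 and start is not None:
            events.append((start, i))      # the event ended at the previous index
            start = None
    if start is not None:                  # event runs to the end of the video
        events.append((start, len(labels)))
    return events


def subdivide_labels(labels: List[int], num_parts: int) -> List[List[int]]:
    """Build one label sequence per event subsection.

    Assumption: every event is cut into `num_parts` roughly equal pieces;
    sequence k marks only the k-th piece of each event as positive.
    """
    parts = [[0] * len(labels) for _ in range(num_parts)]
    for start, end in find_events(labels):
        length = end - start
        for offset in range(length):
            k = offset * num_parts // length   # subsection index for this segment
            parts[k][start + offset] = 1
    return parts


def ensemble_scores(per_detector_scores: List[List[float]]) -> List[float]:
    """Combine detectors by averaging their per-segment scores (an assumption)."""
    n = len(per_detector_scores)
    return [sum(column) / n for column in zip(*per_detector_scores)]


if __name__ == "__main__":
    labels = [0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1]
    for k, part in enumerate(subdivide_labels(labels, num_parts=3)):
        print(f"detector {k} targets: {part}")
    # Stand-in scores from three hypothetical detectors over two segments.
    print(ensemble_scores([[0.2, 0.8], [0.4, 0.6], [0.6, 0.1]]))
```

In this sketch each detector would be trained against one of the subdivided label sequences, so that it specializes in the beginning, middle, or end of events. Other splitting rules or combination schemes (for example, a weighted sum or a maximum) would fit the same structure.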


