
JBE, vol. 26, no. 4, pp. 441-452, July 2021


Luma Mapping Function Generation Method Using Attention Map of Convolutional Neural Network in Versatile Video Coding Encoder

Naseong Kwon, Jongseok Lee, Joohyung Byeon, and Donggyu Sim



This paper proposes a method for generating the luma mapping function that improves the coding efficiency of luma mapping in luma mapping with chroma scaling (LMCS). The existing LMCS uses local spatial variance to reflect local features. The proposed method multiplies this local spatial variance by the attention map of a convolutional neural network (CNN) so that cognitive and perceptual features are reflected as well. To evaluate the proposed method, its BD-rate is compared with that of VTM-12.0 on classes A1, A2, B, C, and D of the MPEG standard test sequences under the all-intra (AI) configuration. In the experiments, the proposed method achieves an average BD-rate of -0.07% for the luma component relative to VTM-12.0, a slight coding gain, while the encoding and decoding times remain almost the same.
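The weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the window radius, the border handling, the assumption that the attention map has already been resampled to the luma resolution, and the function names `local_variance` and `attention_weighted_variance` are all hypothetical choices made for illustration. How the weighted variance is then turned into the LMCS mapping function is not covered here.

```python
from typing import List

Image = List[List[float]]


def local_variance(luma: Image, radius: int = 1) -> Image:
    """Variance of luma samples in a (2*radius+1)^2 window.

    The window is clipped at picture borders (an assumption; the
    abstract does not state the window size or border handling).
    """
    height, width = len(luma), len(luma[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                luma[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            mean = sum(window) / len(window)
            out[y][x] = sum((v - mean) ** 2 for v in window) / len(window)
    return out


def attention_weighted_variance(
    luma: Image, attention: Image, radius: int = 1
) -> Image:
    """Multiply each local variance by the co-located attention weight.

    `attention` is assumed to have the same dimensions as `luma`.
    """
    variance = local_variance(luma, radius)
    return [
        [v * a for v, a in zip(var_row, att_row)]
        for var_row, att_row in zip(variance, attention)
    ]
```

In this sketch, a sample whose attention weight is zero contributes no variance, and a sample with a larger weight contributes proportionally more, which is one simple way to let a CNN attention map modulate the local statistics.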

Keywords: VVC, Encoder, Luma Mapping with Chroma Scaling, CNN


