JBE, vol. 26, no. 5, pp.599-607, September, 2021
Reinforcement Learning based Inactive Region Padding Method
Dongsin Kim, Kutub Uddin, and Byung Tae Oh
An inactive region is a region that is filled with invalid pixel values in order to represent a specific image. Inactive regions generally occur when non-rectangular images are converted to a rectangular format, especially when 3D images are represented in a 2D format. Because these inactive regions severely degrade compression efficiency, filtering is often applied at the boundaries between active and inactive regions. However, such filtering does not carefully consider the image characteristics. The proposed method pads inactive regions using reinforcement learning, which can take both the compression process and the image characteristics into account. Experimental results show that the proposed method performs 3.4% better on average than the conventional padding method.
Keywords: Inactive region padding, Reinforcement learning, Deep learning, Immersive video
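To make the notion of an inactive region concrete, the following minimal sketch builds an active/inactive mask for a disc-shaped image stored in a square frame, then fills the inactive pixels by replicating the nearest active pixel on the same row. This is only an illustrative replication baseline, not the proposed reinforcement learning method and not necessarily the conventional padding method used in the comparison. The function names `make_disc_mask` and `pad_rows_nearest` and the disc-shaped layout are assumptions introduced for this example.

```python
def make_disc_mask(size):
    """Return a size x size mask: True marks an active pixel inside the
    inscribed disc, False marks an inactive pixel (hypothetical layout)."""
    center = (size - 1) / 2.0
    radius = size / 2.0
    return [
        [(x - center) ** 2 + (y - center) ** 2 <= radius ** 2 for x in range(size)]
        for y in range(size)
    ]


def pad_rows_nearest(image, mask):
    """Fill each inactive pixel with the value of the nearest active pixel
    on the same row. Rows without any active pixel are left unchanged."""
    padded = [row[:] for row in image]  # copy so the input is not modified
    for y, (row, mask_row) in enumerate(zip(image, mask)):
        active = [x for x, is_active in enumerate(mask_row) if is_active]
        if not active:
            continue
        for x, is_active in enumerate(mask_row):
            if not is_active:
                nearest = min(active, key=lambda a: abs(a - x))
                padded[y][x] = row[nearest]
    return padded


# Small hand-made example: 0 stands in for an invalid (inactive) value.
image = [[0, 5, 7, 0],
         [0, 0, 0, 0],
         [9, 0, 0, 3]]
mask = [[False, True, True, False],
        [False, False, False, False],
        [True, False, False, True]]
padded = pad_rows_nearest(image, mask)
```

Replication of this kind removes the sharp edge between active and inactive pixels along each row, but it ignores both the image content and the behavior of the encoder, which is the gap the abstract says the learning-based padding is meant to address.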