
JBE, vol. 27, no. 4, pp. 487-498, July 2022

DOI: https://doi.org/10.5909/JBE.2022.27.4.487

360 RGBD Image Synthesis from a Sparse Set of Images with Narrow Field-of-View

Soojie Kim and In Kyu Park

Corresponding Author E-mail: pik@inha.ac.kr

Abstract:

A depth map is an image that represents 3D distance information on a 2D plane and is used in various 3D vision tasks. Most existing depth estimation studies use narrow field-of-view (FoV) images, so a significant portion of the scene is lost. In this paper, we propose a technique for generating 360° omnidirectional RGBD images from a sparse set of narrow-FoV images. The proposed generative adversarial network (GAN) based image generation model estimates the relative FoV of each input with respect to the full panorama from a small number of non-overlapping images and produces a 360° RGB image and depth map simultaneously. In addition, it achieves improved performance by configuring the network to reflect the spherical characteristics of the 360° image.
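The abstract does not specify the network architecture, but its two key ideas (compositing sparse narrow-FoV inputs onto an equirectangular canvas, and using layers that respect the 360° wrap-around) can be illustrated with a minimal, hypothetical PyTorch sketch. The `SphereConv` wrapping, the toy `RGBD360Generator` encoder-decoder, and the canvas layout below are illustrative assumptions, not the authors' actual model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SphereConv(nn.Module):
    """Conv2d with circular padding along longitude (width) so features
    wrap around the 0°/360° seam, and zero padding along latitude (height).
    This is one common way to reflect the spherical nature of
    equirectangular images; the paper's actual layers may differ."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.pad = k // 2
        self.conv = nn.Conv2d(in_ch, out_ch, k)  # padding applied manually

    def forward(self, x):
        x = F.pad(x, (self.pad, self.pad, 0, 0), mode="circular")  # wrap W
        x = F.pad(x, (0, 0, self.pad, self.pad), mode="constant")  # pad H
        return self.conv(x)

class RGBD360Generator(nn.Module):
    """Toy encoder-decoder: takes sparse NFoV views composited onto an
    equirectangular canvas (RGB + binary validity mask) and predicts a
    full 360° RGB image and depth map simultaneously."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(SphereConv(4, 32), nn.ReLU(),
                                 SphereConv(32, 64), nn.ReLU())
        self.dec_rgb = SphereConv(64, 3)
        self.dec_depth = SphereConv(64, 1)

    def forward(self, canvas):  # canvas: (B, 4, H, 2H)
        feat = self.enc(canvas)
        return torch.sigmoid(self.dec_rgb(feat)), self.dec_depth(feat)

# One sparse, non-overlapping NFoV crop pasted into a 256x512 canvas;
# the crop's placement/size would come from the estimated relative FoV.
canvas = torch.zeros(1, 4, 256, 512)
canvas[:, :3, 96:160, 200:280] = torch.rand(1, 3, 64, 80)  # NFoV view
canvas[:, 3, 96:160, 200:280] = 1.0                        # validity mask
rgb, depth = RGBD360Generator()(canvas)
print(rgb.shape, depth.shape)  # (1, 3, 256, 512), (1, 1, 256, 512)
```

Circular padding along the width axis is a standard trick for equirectangular inputs: without it, the network treats the 0°/360° longitude boundary as an image border and tends to produce a visible seam there.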



Keywords: 360 Image, Panorama, Image Synthesis, Depth Estimation, 3D Scene Reconstruction
