JBE, vol. 27, no. 4, pp. 487-498, July 2022
DOI: https://doi.org/10.5909/JBE.2022.27.4.487

360 RGBD Image Synthesis from a Sparse Set of Images with Narrow Field-of-View
Soojie Kim and In Kyu Park (corresponding author, e-mail: pik@inha.ac.kr)

Abstract: A depth map is an image that represents the distances to 3D scene points on a 2D plane and is used in various 3D vision tasks. Most existing depth estimation studies use narrow field-of-view (FoV) images, in which a significant portion of the scene is lost. In this paper, we propose a technique for generating a 360° omnidirectional RGBD image from a sparse set of narrow-FoV images. The proposed generative adversarial network (GAN) based model estimates, from a small number of non-overlapping images, their relative FoV within the full panorama, and synthesizes the 360° RGB and depth images simultaneously. In addition, performance is further improved by designing the network to reflect the spherical characteristics of 360° images.

Keywords: 360 Image, Panorama, Image Synthesis, Depth Estimation, 3D Scene Reconstruction
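As background for the abstract: a 360° depth map stores, at each equirectangular pixel, the radial distance to the scene, so every pixel corresponds to a 3D point. The sketch below makes that mapping concrete. It is standard spherical-to-Cartesian geometry, not code from the paper, and the axis convention (y up, z forward) is an assumption.

```python
import numpy as np

def equirect_depth_to_points(depth):
    """Map an equirectangular 360-degree depth map (H x W, radial
    distance in meters) to an (H, W, 3) point cloud. Standard
    spherical geometry; the axis convention is assumed, not taken
    from the paper."""
    h, w = depth.shape
    # Pixel centers -> longitude in [-pi, pi), latitude in (-pi/2, pi/2)
    lon = (np.arange(w) + 0.5) / w * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (np.arange(h) + 0.5) / h * np.pi
    lon, lat = np.meshgrid(lon, lat)
    # Spherical -> Cartesian, scaled by the per-pixel radial depth
    x = depth * np.cos(lat) * np.sin(lon)
    y = depth * np.sin(lat)
    z = depth * np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)
```

This is also why the abstract stresses the spherical characteristics of 360° images: in the equirectangular domain, pixel spacing near the poles corresponds to far smaller angular area than at the equator, so a network treating all pixels uniformly is geometrically biased.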
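The pipeline sketched in the abstract, placing a few non-overlapping narrow-FoV inputs within a full panorama and letting a GAN synthesize the rest, can be pictured as a masked-canvas completion problem. The sketch below shows only that data layout, under two loud assumptions: the patches are already warped into equirectangular coordinates, and their pixel offsets are given. In the paper, the relative FoV/placement is itself estimated by the network, which this sketch does not attempt.

```python
import numpy as np

def make_generator_input(canvas_hw, patches, offsets):
    """Compose a masked equirectangular canvas from sparse inputs.
    Simplified illustration of the input a panorama-synthesis GAN
    might consume. ASSUMPTIONS: patches are pre-warped to
    equirectangular coordinates; (top, left) offsets are known."""
    h, w = canvas_hw
    rgb = np.zeros((h, w, 3), dtype=np.float32)
    mask = np.zeros((h, w, 1), dtype=np.float32)  # 1 = observed pixel
    for patch, (top, left) in zip(patches, offsets):
        ph, pw = patch.shape[:2]
        cols = np.arange(left, left + pw) % w  # wrap around longitude
        rgb[top:top + ph, cols] = patch
        mask[top:top + ph, cols] = 1.0
    # Generator input: observed pixels plus the visibility mask
    return np.concatenate([rgb * mask, mask], axis=-1)
```

Note the longitude wrap-around in the column indices: unlike ordinary image inpainting, the left and right borders of an equirectangular panorama are adjacent on the sphere, which any 360° synthesis network must respect.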