JBE, vol. 23, no. 1, pp.3-10, January, 2018
Implementing a Depth Map Generation Algorithm by Convolutional Neural Network
Seungsoo Lee, Hong Jin Kim, and Manbae Kim
C.A E-mail: firstname.lastname@example.org
Depth maps have been utilized in a variety of fields. Recently, research on generating depth maps with artificial neural networks (ANNs) has gained much interest. This paper validates the feasibility of implementing a ready-made depth map generation method with a convolutional neural network (CNN). First, for a given image, a depth map is generated as the weighted average of a saliency map and a motion history image. Then the CNN is trained on the test images and the resulting depth maps. Objective and subjective experiments performed on the CNN showed that it can replace the ready-made depth generation method.
Keywords: Depth map, CNN, Saliency map, Motion History Image, Ready-made depth map
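The weighted-average step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `blend_depth`, the weight `w_saliency`, the min-max normalization, and the 8-bit output range are all assumptions made for illustration, since the abstract does not state the weights or the value ranges used.

```python
import numpy as np


def blend_depth(saliency, mhi, w_saliency=0.5):
    """Weighted average of a saliency map and a motion history image (MHI).

    Hypothetical helper: the weight and the normalization are assumptions,
    not values taken from the paper.
    """
    def normalize(m):
        # Min-max normalize to [0, 1]; a constant map becomes all zeros.
        m = np.asarray(m, dtype=np.float64)
        span = m.max() - m.min()
        return (m - m.min()) / span if span > 0 else np.zeros_like(m)

    depth = w_saliency * normalize(saliency) + (1.0 - w_saliency) * normalize(mhi)
    # Scale to an 8-bit depth map (assumed output format).
    return np.round(depth * 255.0).astype(np.uint8)


# Toy 2x2 inputs standing in for real saliency and MHI maps.
saliency = np.array([[0, 1], [2, 4]])
mhi = np.array([[4, 4], [0, 0]])
depth_map = blend_depth(saliency, mhi, w_saliency=0.5)
```

Under this reading, a pair of (input image, `depth_map`) would serve as one training sample for the CNN, with the blended map acting as the regression target.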