
JBE, vol. 26, no. 5, pp.608-620, September, 2021

DOI: https://doi.org/10.5909/JBE.2021.26.5.608

Face Super-Resolution using Adversarial Distillation of Multi-Scale Facial Region Dictionary

Byungho Jo, In Kyu Park, and Sungeun Hong

Corresponding author e-mail: csehong@inha.ac.kr

Abstract:

Recent deep learning-based face super-resolution (FSR) methods have achieved strong performance by exploiting facial prior knowledge, such as facial landmarks and dictionaries that reflect the structural or semantic characteristics of the human face. However, most of these methods require additional processing time and memory. To address this issue, this paper proposes an efficient FSR model based on knowledge distillation. The intermediate features of a teacher network, which contain dictionary information for the major facial regions, are transferred to the student network through adversarial multi-scale feature distillation. Experimental results show that the proposed model outperforms other SR methods and demonstrate its effectiveness compared with the teacher model.
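The idea of adversarial multi-scale feature distillation can be illustrated with a minimal sketch. The abstract does not state the exact objective, so the least-squares adversarial loss used here is an assumption, and every name (`mean_critic`, `discriminator_loss`, `student_loss`) is hypothetical. Each scale gets its own critic that tries to score teacher features as 1 and student features as 0, while the student is trained so that its features are scored as teacher-like.

```python
from typing import Callable, List, Sequence

Feature = Sequence[float]             # a flattened feature map at one scale
Critic = Callable[[Feature], float]   # hypothetical per-scale discriminator


def discriminator_loss(critics: List[Critic],
                       teacher_feats: List[Feature],
                       student_feats: List[Feature]) -> float:
    """Least-squares critic loss (assumed form), averaged over scales.

    Each critic is pushed to output 1 for teacher features
    and 0 for student features.
    """
    total = 0.0
    for critic, t, s in zip(critics, teacher_feats, student_feats):
        total += 0.5 * ((critic(t) - 1.0) ** 2 + critic(s) ** 2)
    return total / len(critics)


def student_loss(critics: List[Critic],
                 student_feats: List[Feature]) -> float:
    """Least-squares distillation loss for the student (assumed form).

    The loss is zero when every critic scores the student features as 1,
    i.e. as indistinguishable from teacher features.
    """
    total = 0.0
    for critic, s in zip(critics, student_feats):
        total += 0.5 * (critic(s) - 1.0) ** 2
    return total / len(critics)


def mean_critic(feat: Feature) -> float:
    """Toy critic: the mean activation stands in for a learned network."""
    return sum(feat) / len(feat)


# Two scales with toy values. At the second scale the student already
# matches the teacher; at the first scale it does not.
teacher = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0]]
student = [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0]]
critics = [mean_critic, mean_critic]

d_loss = discriminator_loss(critics, teacher, student)
g_loss = student_loss(critics, student)
print(f"critic loss: {d_loss:.2f}, student loss: {g_loss:.2f}")
```

In a real system the critics would be small trainable networks and the two losses would be minimized alternately. The toy critic here only makes the arithmetic easy to follow.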



Keywords: Image super-resolution, Face super-resolution, Knowledge distillation, Adversarial learning, Deep learning


Editorial Office
1108, New building, 22, Teheran-ro 7-gil, Gangnam-gu, Seoul, Korea
Homepage: www.kibme.org TEL: +82-2-568-3556 FAX: +82-2-568-3557
Copyrightⓒ 2012 The Korean Institute of Broadcast and Media Engineers
All Rights Reserved