
JBE, vol. 27, no. 4, pp. 561-568, July 2022


Efficient Memory Update Module for Video Object Segmentation

Junho Jo and Nam Ik Cho



Most deep learning-based video object segmentation methods segment each frame using past predictions stored in an external memory. In general, the more past information the memory holds, the better the results, because the stored evidence covers more of the changes in the objects of interest. However, hardware limitations make it impossible to store all of this information, and performance degrades as a result. In this paper, we propose a method that stores new information in the external memory without allocating additional memory. Specifically, we compute attention scores between the existing memory and the information to be newly stored, and then add the new information to each corresponding memory entry according to its score. Because the attention mechanism reflects object changes well, the method works robustly while using no additional memory. In addition, the update rate is determined adaptively from the accumulated number of matches of each memory entry, so that frequently updated samples store more information and the memory remains reliable.
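
The update described above can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the scaled dot-product attention, the soft assignment to every memory slot, the running-average rate `a / (count + a)`, and the names `update_memory`, `memory`, and `counts` are all assumptions made for illustration. The abstract specifies only that new information is merged according to attention scores and that the update rate depends on the accumulated number of matches.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def update_memory(memory, counts, new_features):
    """Fold new feature vectors into a fixed-size memory.

    memory: list of K slot vectors (the number of slots never grows).
    counts: list of K accumulated match weights, assumed to start at 1.0.
    new_features: list of vectors to be stored.
    """
    scale = 1.0 / math.sqrt(len(memory[0]))
    for feat in new_features:
        # Attention of the new feature over the existing memory slots
        # (assumption: scaled dot-product similarity followed by softmax).
        attn = softmax([dot(feat, slot) * scale for slot in memory])
        for k, a in enumerate(attn):
            # Assumed adaptive rate: a slot that has accumulated many matches
            # changes less per update, so it behaves like a weighted running
            # mean of everything matched to it.
            rate = a / (counts[k] + a)
            memory[k] = [(1.0 - rate) * m + rate * f
                         for m, f in zip(memory[k], feat)]
            counts[k] += a
    return memory, counts
```

Calling `update_memory([[1.0, 0.0], [0.0, 1.0]], [1.0, 1.0], [[0.9, 0.1]])` leaves the memory at two slots, moves the first slot more strongly toward the new feature than the second, and raises the first slot's match count by the larger share of the attention weight.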

Keywords: Video Object Segmentation, Memory, Update, Adaptive


