Research Interests: Computer Vision, Multimodal Machine Learning, Multimedia Analysis
Personal Homepage (English): https://jianjieluo.github.io
Work Email: jianjieluo@gdut.edu.cn
Biography:
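Jianjie Luo (罗剑杰) is a Lecturer at Guangdong University of Technology, recruited under its "Young Hundred Talents Program". He received his Ph.D. through a joint program between Sun Yat-sen University and JD AI Research, advised by Prof. Hongyang Chao, Prof. Jianlin Feng, and Prof. Tao Mei, Fellow of the Canadian Academy of Engineering. His long-term research covers computer vision, multimodal machine learning, and multimedia analysis. He has participated in two national-level research projects, and his research results have been deployed in products such as JD's intelligent customer service. In recent years he has focused on multimodal learning centered on visual content, working on key problems in visual understanding, generation, and forgery detection, and has published 8 papers in total at international venues including CVPR, ECCV, ACM Multimedia, and IEEE TCSVT. His awards include Finalist (F Prize) in the 2018 Mathematical Contest in Modeling (MCM/ICM) and third place in the "Identity-Preserving Video Generation" track of the ACM MM 2025 international multimedia challenge.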
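Education:
2019.09 - 2024.12 Sun Yat-sen University (joint program with JD AI Research), School of Computer Science (Ph.D. in Engineering)
2015.09 - 2019.06 Sun Yat-sen University, School of Computer Science (B.Eng.)
2012.09 - 2015.06 Zhanjiang No.1 Middle School, Excellence Class (high school)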
Selected Publications (* denotes corresponding author):
[1] [CVPR] J. Luo (罗剑杰), Y. Li, Y. Pan, T. Yao, J. Feng, H. Chao, T. Mei. "Semantic-Conditional Diffusion Networks for Image Captioning", in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023. (CCF-A; one of the three top computer vision conferences)
[2] [ECCV] J. Luo (罗剑杰), J. Chen, Y. Li, Y. Pan, J. Feng, H. Chao, T. Yao. "Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning", in European Conference on Computer Vision (ECCV), 2024. (CCF-B; one of the three top computer vision conferences)
[3] [MM] J. Luo (罗剑杰), Y. Li, Y. Pan, T. Yao, H. Chao, T. Mei. "CoCo-BERT: Improving Video-Language Pre-training with Contrastive Cross-modal Matching and Denoising", in ACM International Conference on Multimedia (MM), 2021. (CCF-A; top international multimedia conference)
[4] [TCSVT] J. Luo (罗剑杰), Y. Li, Y. Pan, T. Yao, J. Feng, H. Chao, T. Mei. "Exploring Vision-Language Foundation Model for Novel Object Captioning", in IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2024. (CCF-B; Chinese Academy of Sciences Q1 Top journal)
[5] [MM] J. Xu, J. Luo* (罗剑杰), Z. Yang. "Improving Identity Preservation in Video Generation with Multi-Branch Models", in ACM International Conference on Multimedia (MM), 2025. (CCF-A; top international multimedia conference)
[6] [ECAI] Z. Xu, J. Luo* (罗剑杰), F. Yu, Z. Yang. "Multi-perspective Frequency Domain Learning for Generalizable AI-Generated Image Detection", in European Conference on Artificial Intelligence (ECAI), 2025. (CCF-B; international artificial intelligence conference)
[7] [TOMM] J. Chen, J. Luo (罗剑杰), Y. Pan, Y. Li, T. Yao, H. Chao, T. Mei. "Boosting Vision-and-Language Navigation with Direction Guiding and Backtracing", in ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2022. (CCF-B; flagship multimedia journal)
[8] [MM] Y. Pan, Y. Li, J. Luo (罗剑杰), J. Xu, T. Yao, T. Mei. "Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training", in ACM International Conference on Multimedia (MM), 2022. (CCF-A; top international multimedia conference)
Academic Service: