Yantao Lai
I am a second-year master's student in the Computer Science and Technology College at Nanjing University of Aeronautics and Astronautics, with a keen interest in multimodal artificial intelligence research, particularly the interaction between human attention and natural language. My thesis explores the dynamic shifts of human gaze and attention across various scenarios. I am fortunate to be advised by Rong Quan (dissertation advisor), Wentong Li, Dong Liang, and Jie Qin.
Previously, I was an undergraduate student at the School of Internet at Anhui University, majoring in Intelligent Science and Technology, where I primarily studied foundational theories related to computer science and artificial intelligence.
Résumé  / 
Email  / 
WeChat
|
|
News
- [Feb 2025] A patent related to gaze prediction for panoramic images has been granted!
- [Dec 2024] One paper on goal-directed scanpath prediction has been accepted to ICASSP 2025!
- [Oct 2024] A patent related to goal-directed scanpath prediction has been disclosed.
- [Oct 2024] One paper on gaze prediction for panoramic images has been accepted to ECCV 2024!
|
Research
I am broadly interested in Computer Vision and Multimodal AI (Vision-Language Modeling). My primary research focus is modeling human attention in multimodal scenes by predicting human gaze. For more details, refer to my résumé.
|
|
Pathformer3D: A 3D Scanpath Transformer for 360° Images
Rong Quan*,
Yantao Lai*,
Mengyu Qiu,
Dong Liang†
ECCV, 2024
PDF
/
Bibtex
/
Code
|
|
A Method for Predicting Human Eye Scanpaths in Panoramic Images (一种面向全景图像的人眼扫视轨迹预测方法)
Rong Quan,
Yantao Lai,
Dong Liang,
Mengyu Qiu,
Jie Qin
Patent, Granted
PDF
/
Bibtex
/
Code
|
|
CLIPGaze: Zero-Shot Goal-Directed Scanpath Prediction Using CLIP
Yantao Lai,
Rong Quan,
Dong Liang†,
Jie Qin
ICASSP (Oral), 2025
PDF
/
Bibtex
/
Code
|
|
A Goal-Directed Scanpath Prediction Method (一种目标导向的扫视路径预测方法)
Rong Quan,
Yantao Lai,
Dong Liang,
Jie Qin
Patent, Disclosed
PDF
/
Bibtex
/
Code
|
|