Yantao Lai

I am a second-year master's student in the College of Computer Science and Technology at Nanjing University of Aeronautics and Astronautics. My research interest is multimodal artificial intelligence, particularly the interaction between human attention and natural language. My thesis explores how human gaze and attention shift dynamically across different scenarios. I am fortunate to be advised by Rong Quan (dissertation advisor), Wentong Li, Dong Liang, and Jie Qin.

Previously, I was an undergraduate student at the School of Internet at Anhui University, majoring in Intelligent Science and Technology, where I studied the foundational theory of computer science and artificial intelligence.

Résumé  /  Email  /  WeChat

News
  • [Oct 2024] A patent on goal-directed scanpath prediction has been disclosed.
  • [Feb 2025] A patent on gaze prediction for panoramic images has been granted!
  • [Dec 2024] A paper on goal-directed scanpath prediction has been accepted to ICASSP 2025!
  • [Oct 2024] A paper on gaze prediction for panoramic images has been accepted to ECCV 2024!
Research

I am broadly interested in Computer Vision and Multimodal AI (Vision-Language Modeling). My primary research focus is modeling human attention in multimodal scenes by predicting human gaze. For more details, see my résumé.

Pathformer3D: A 3D Scanpath Transformer for 360° Images
Rong Quan*, Yantao Lai*, Mengyu Qiu, Dong Liang
ECCV, 2024
PDF / Bibtex / Code

A Method for Predicting Human Eye Scanpaths in Panoramic Images (一种面向全景图像的人眼扫视轨迹预测方法)
Rong Quan, Yantao Lai, Dong Liang, Mengyu Qiu, Jie Qin
Patent, Granted
PDF / Bibtex / Code

CLIPGaze: Zero-Shot Goal-Directed Scanpath Prediction Using CLIP
Yantao Lai, Rong Quan, Dong Liang, Jie Qin
ICASSP (Oral), 2025
PDF / Bibtex / Code

A Goal-Directed Scanpath Prediction Method (一种目标导向的扫视路径预测方法)
Rong Quan, Yantao Lai, Dong Liang, Jie Qin
Patent, Disclosed
PDF / Bibtex / Code


Webpage template from Jon Barron