About Me

I am Yudong Zhang (张宇东), a PhD candidate in the NICS-EFC Laboratory, Department of Electronic Engineering, Tsinghua University, under the supervision of Professor Yu Wang (汪玉). I completed my bachelor’s degree in the Department of Electronic Engineering at Tsinghua University in 2020, advised by Professor Jiansheng Chen (陈健生).

My research focuses on enhancing the safety and efficiency of vision-language models. To date, I have authored or co-authored 10 peer-reviewed papers, including 4 first-author publications at conferences such as AAAI, ACM Multimedia (ACMMM), and NAACL.

Currently, I am an intern at Tencent’s Hunyuan team, mentored by Xingwu Sun (孙兴武) and Ruobing Xie (谢若冰), where I focus on the pre-training of large language models.

If you are interested in academic collaboration or would like to discuss potential research opportunities, please feel free to reach out via email at zhangyd16@mails.tsinghua.edu.cn.

I will be graduating in June 2026 and I am looking for job opportunities.

🔥 News

  • 2025.08: 🎉 One paper with me as the first author (F3) was accepted by ACMMM 2025 as an Oral.
  • 2025.07: 🎉 One paper with me as the first author (DHCP) was accepted by ACMMM 2025.
  • 2025.05: I was featured by Tsinghua University on Twitter and Facebook; see details at Twitter 1, Twitter 2, Twitter 3, Facebook.
  • 2025.04: I received the Outstanding Intern Award of the Machine Learning Platform Department, Technology and Engineering Group (TEG), Tencent.
  • 2025.01: 🎉 One paper with me as the first author (QAVA) was accepted by NAACL 2025.
  • 2024.12: 🎉 One paper with me as the first author (JointAugmentation) was accepted by AAAI 2025.
  • 2024.07: 🎉 One paper with me as the first author (PIP) was accepted by ACMMM 2024 as an Oral.
  • 2024.04: I joined the Tencent Hunyuan Team as an intern in Beijing!

📝 Publications

(* indicates equal contribution, † indicates corresponding author.)

Safety of Large Models

  • ACM-MM 2024 (Oral) PIP: Detecting Adversarial Examples in Large Vision-Language Models via Attention Patterns of Irrelevant Probe Questions, Yudong Zhang, Ruobing Xie†, Jiansheng Chen†, Xingwu Sun, Yu Wang† | Paper | Code | Slide | Video
  • NAACL 2025 QAVA: Query-Agnostic Visual Attack to Large Vision-Language Models, Yudong Zhang, Ruobing Xie†, Jiansheng Chen†, Xingwu Sun, Zhanhui Kang, Yu Wang† | Paper | Code | Video
  • ACM-MM 2025 DHCP: Detecting Hallucinations by Cross-modal Attention Pattern in Large Vision-Language Models, Yudong Zhang, Ruobing Xie†, Xingwu Sun, Yiqing Huang, Jiansheng Chen†, Zhanhui Kang, Di Wang, Yu Wang† | Paper | Code
  • ACM-MM 2025 (Oral) Fighting Fire with Fire (F3): A Training-free and Efficient Visual Adversarial Example Purification Method in LVLMs, Yudong Zhang, Ruobing Xie†, Yiqing Huang, Jiansheng Chen†, Xingwu Sun, Zhanhui Kang, Di Wang, Yu Wang† | Paper | Code

Efficiency of Pre-training

  • AAAI 2025 Enhancing Contrastive Learning Inspired by the Philosophy of “the Blind Men and the Elephant”, Yudong Zhang, Ruobing Xie†, Jiansheng Chen†, Xingwu Sun, Zhanhui Kang, Yu Wang† | Paper | Code | Video

Preprint

  • arXiv The Security Threat of Compressed Projectors in Large Vision-Language Models, Yudong Zhang, Ruobing Xie, Xingwu Sun, Jiansheng Chen, Zhanhui Kang, Di Wang, Yu Wang | Paper

Others (non-first author)

  • ACM-MM 2022 3D Human Mesh Reconstruction by Learning to Sample Joint Adaptive Tokens for Transformers, Youze Xue, Jiansheng Chen†, Yudong Zhang, Cheng Yu, Huimin Ma, Hongbing Ma | Paper

  • CIKM 2023 Transferable Structure-based Adversarial Attack of Heterogeneous Graph Neural Network, Yu Shang, Yudong Zhang, Jiansheng Chen†, Depeng Jin, Yong Li | Paper

  • AAAI 2024 Step Vulnerability Guided Mean Fluctuation Adversarial Attack against Conditional Diffusion Models, Hongwei Yu, Jiansheng Chen†, Xinlong Ding, Yudong Zhang, Ting Tang, Huimin Ma | Paper

  • Knowledge-Based Systems (KBS) Image paragraph captioning with topic clustering and topic shift prediction, Ting Tang, Jiansheng Chen†, Yiqing Huang, Huimin Ma, Yudong Zhang, Hongwei Yu | Paper

  • ICCV 2025 DADet: Safeguarding Image Conditional Diffusion Models against Adversarial and Backdoor Attacks via Diffusion Anomaly Detection, Hongwei Yu, Xinlong Ding, Jiawei Li, Jinlong Wang, Yudong Zhang, Rongquan Wang, Huimin Ma, Jiansheng Chen

🎖 Honors and Awards

Highlighted Honors

Others

  • 2022.12 Tsinghua University December Ninth Counselor Award
  • 2024.12 Graduate Student “Star of Electronics”, Department of Electronic Engineering, Tsinghua University (no more than 5 graduate students per year)
  • 2019.12 Undergraduate Student “Star of Electronics”, Department of Electronic Engineering, Tsinghua University (no more than 5 undergraduate students per year)
  • Tsinghua University General Excellence Scholarship (2 times as an undergraduate and 4 times as a graduate student)
  • Tsinghua University Science and Technology Innovation Excellence Award, Volunteer Excellence Award, Social Work Excellence Award (2 times), Friends of Tsinghua-Changfei Scholarship First Prize and Second Prize
  • Outstanding Student Leader of Tsinghua University (3 times)

📖 Education

  • 2020.09 - now, PhD student, Department of Electronic Engineering, Tsinghua University, Beijing.
  • 2016.09 - 2020.06, Undergraduate, Department of Electronic Engineering, Tsinghua University, Beijing.
  • 2013.09 - 2016.06, Baotou No. 95 Middle School (Baogang No. 1 Middle School), Baotou, Inner Mongolia.

💻 Internships

🔧 Patents

Chinese patents (granted)

  • Kubernetes container access methods, devices, computing devices, and storage media (2024104387144), Yu Wang, Yudong Zhang.
  • Distributed Task Dynamic Service Discovery Method, Device, and Task Training System (2024104387341), Yu Wang, Yudong Zhang.
  • Methods and apparatus for multi-user collaborative use of GPU computing capabilities (2024104384428), Yu Wang, Yudong Zhang (Inventor registration error, change in progress).
  • Sample purification methods, apparatus, equipment, and media (2024109947320), Yudong Zhang, Ruobing Xie, Xingwu Sun, Zhanhui Kang.
  • Model hallucination detection method, apparatus, device, storage medium, and program product (2024110474366), Yudong Zhang, Ruobing Xie, Xingwu Sun, Zhanhui Kang.
  • Text generation method, apparatus, device, and readable storage medium (2024109116223), Yudong Zhang, Ruobing Xie, Xingwu Sun, Zhanhui Kang.

Chinese patents (pending)

  • Adversarial example image generation method, apparatus, computer device, and storage medium (2024110458537), Yudong Zhang, Ruobing Xie, Xingwu Sun, Zhanhui Kang.
  • An image processing method and related apparatus (2024110598759), Yudong Zhang, Ruobing Xie, Xingwu Sun, Zhanhui Kang.
  • Image processing methods, devices, equipment, readable storage media, and program products (2024111072007), Yudong Zhang, Ruobing Xie, Xingwu Sun, Zhanhui Kang.

✍️ Academic Service

Reviewer

  • CVPR
  • ICCV
  • ACM-MM
  • ICLR
  • ARR (ACL/EMNLP/NAACL)
  • AAAI