Publications

Below is a curated list of my publications.


1) My Favorite Streamer is an LLM: Discovering, Bonding, and Co-Creating in AI VTuber Fandom

  • Authors: Jiayi Ye*, Chaoran Chen, Yue Huang, Yanfang Ye, Toby Jia-Jun Li, Xiangliang Zhang
  • Venue: CHI 2026
  • Links: arXiv · PDF

2) Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge

  • Authors: Jiayi Ye*, Yanbo Wang*, Yue Huang*, Dongping Chen, Qihui Zhang, Nuno Moniz, Tian Gao, Werner Geyer, Chao Huang, Pin-Yu Chen, Nitesh V Chawla, Xiangliang Zhang (* Equal Contribution)
  • Venue: ICLR 2025
  • Links: arXiv · PDF

3) Dissecting Logical Reasoning in LLMs: A Fine-Grained Evaluation and Supervision Study

  • Authors: Yujun Zhou*, Jiayi Ye*, Zipeng Ling*, Yufei Han, Yue Huang, Haomin Zhuang, Zhenwen Liang, Kehan Guo, Taicheng Guo, Xiangqi Wang, Xiangliang Zhang (* Equal Contribution)
  • Venue: EMNLP 2025 Findings
  • Links: arXiv · PDF

4) AutoBench-V: Can Large Vision-Language Models Benchmark Themselves?

  • Authors: Han Bao*, Yue Huang*, Yanbo Wang*, Jiayi Ye*, Xiangqi Wang, Xiuying Chen, Mohamed Elhoseiny, Xiangliang Zhang (* Equal Contribution)
  • Venue: SFLLM Workshop @ NeurIPS 2024
  • Links: Website · arXiv · PDF

5) TRUSTEVAL: A Dynamic Evaluation Toolkit on Trustworthiness of Generative Foundation Models

  • Authors: Yanbo Wang*, Jiayi Ye*, Siyuan Wu*, Chujie Gao, Yue Huang, Xiuying Chen, Yue Zhao, Xiangliang Zhang (* Equal Contribution)
  • Venue: NAACL 2025 System Demonstration Track
  • Links: Website · Code

6) UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation

  • Authors: Qihui Zhang, Munan Ning, Zheyuan Liu, Yanbo Wang, Jiayi Ye, Yue Huang, Shuo Yang, Xiao Chen, Yibing Song, Li Yuan
  • Venue: CVPR 2025
  • Links: arXiv · PDF

7) On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective

  • Authors: Yue Huang, Chujie Gao, Siyuan Wu, Haoran Wang, Xiangqi Wang, Yujun Zhou, Yanbo Wang, Jiayi Ye, Jiawen Shi, Zhaoyi Liu, Tianrui Guan, Dongping Chen, Ruoxi Chen, et al.
  • Venue: arXiv Preprint
  • Links: Website · arXiv · PDF · Documentation · Demo · GitHub