Xinyuan Wang

Hi! I am Xinyuan Wang (王心远).

I am a Ph.D. student at XLANG Lab, the University of Hong Kong, supervised by Prof. Tao Yu.

I currently work on agentic foundation models, especially computer-use agent models (OpenCUA), agent evaluation (Computer Agent Arena, OSWorld-Verified), and agent data synthesis (Jedi, VideoAgentTrek). At UCSD, I worked on automatic LLM prompt optimization (PromptAgent) and LLM reasoning (LLM Reasoners). I also worked in Prof. Zhuowen Tu’s group, exploring how to improve diffusion models’ conceptual performance.

News

Jan 26, 2026 Computer Agent Arena and VideoAgentTrek have been accepted to ICLR 2026!
Oct 11, 2025 🎉 OpenCUA received the Best Paper Award at the COLM AIA Workshop!
Sep 19, 2025 OpenCUA and Jedi have been accepted to NeurIPS 2025 as Spotlight papers!
Sep 19, 2025 OpenCUA has been accepted to the COLM 2025 AIA Workshop as an Oral paper!
Aug 13, 2025 OpenCUA: Open Foundations for Computer-Use Agents is published on arXiv! It is the first open-source foundation for computer-use agents, including infrastructure, a dataset, a training recipe, models, and a benchmark.

Selected Publications

  1. PhoneBuddy: Training Open Models for Agentic Phone Use
    Zhengyang Tang, Xin Lai, Pengyuan Lyu, Xinyuan Wang, Tianyi Bai, and 5 more authors
    2026
  2. Do Phone-Use Agents Respect Your Privacy?
    Zhengyang Tang, Ke Ji, Xidong Wang, Zihan Ye, Xinyuan Wang, and 5 more authors
    2026
  3. GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine?
    Tongxu Luo, Rongsheng Wang, Jiaxi Bi, Chenming Xu, Zhengyang Tang, and 6 more authors
    2026
  4. VideoAgentTrek: Computer Use Pretraining from Unlabeled Videos
    Dunjie Lu, Yiheng Xu, Junli Wang, Haoyuan Wu, Xinyuan Wang, and 10 more authors
    2025
  5. OpenCUA: Open Foundations for Computer-Use Agents
    Xinyuan Wang, Bowen Wang, Dunjie Lu, Junlin Yang, Tianbao Xie, and 6 more authors
    NeurIPS 2025 (Spotlight), COLM 2025 Workshop AIA (Best Paper), 2025
  6. Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
    Tianbao Xie, Jiaqi Deng, Xiaochuan Li, Junlin Yang, Haoyuan Wu, and 6 more authors
    NeurIPS 2025 (Spotlight), 2025
  7. Introducing OSWorld-Verified
    Tianbao Xie, Mengqi Yuan, Danyang Zhang, Xinzhuang Xiong, Zhennan Shen, and 12 more authors
    xlang.ai, Jul 2025
  8. Computer Agent Arena: Compare & Test Computer Use Agents on Crowdsourced Real-World Tasks
    Bowen Wang, Xinyuan Wang, Jiaqi Deng, Tianbao Xie, Ryan Li, and 11 more authors
    Jul 2025
  9. LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models
    Shibo Hao, Yi Gu, Haotian Luo, Tianyang Liu, Xiyan Shao, and 6 more authors
    COLM 2024, Jul 2024
  10. PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization
    Xinyuan Wang, Chenxi Li, Zhen Wang, Fan Bai, Haotian Luo, and 4 more authors
    ICLR 2024, Jul 2024