Siyuan Chen (陈思元)

I am a second-year Ph.D. student in the Computer Science Department at Carnegie Mellon University, happily advised by Phil Gibbons and Heather Miller, and I work closely with Ben Titzer. I am a member of PDL and Catalyst at CMU. I finished my undergraduate studies in the Turing Class at Peking University, where I was fortunate to be advised by Yun (Eric) Liang.

My research interests lie in systems for machine learning, especially efficient and secure solutions for post-training systems. For instance, I have worked on LLM fine-tuning on commodity hardware through compressed offloading (LSP-Offload, AAAI'25), SLO-customized LLM serving (SLOs-Serve, preprint), and defending against prompt injection attacks in tool-based agentic systems (RTBAS, preprint).

Before my Ph.D., I worked on machine learning compiler and runtime optimizations, including optimizing the batching of dynamic neural networks (ED-Batch, ICML'23), performance modeling of data movement for tensor programs (Chimera, HPCA'23), mapping DNNs onto complex heterogeneous SoCs (COMB, DAC'23), and an analytical simulator for fused programs on general hardware (TileFlow, MICRO'23).

News

  • May 2025. I am at SystemResearch@Google (SRG) this summer working on RL systems, hosted by Samira Khan.

  • May 2025. I am honored to be selected as an MLCommons ML and Systems Rising Star. Many thanks to the organizers and my advisors!

  • Apr. 2025. A new paper, RTBAS, is on arXiv, where we develop a novel approach to reduce user fatigue in tool-based agentic systems.

  • Mar. 2025. A new paper, SLOs-Serve, is on arXiv, where we develop a novel scheduling algorithm to support customized service-level objectives in LLM serving.

  • June 2024. I am starting an internship as a student researcher at SystemResearch@Google (SRG), working on LLM serving, hosted by Samira Khan.

  • Sept. 2023. I am starting my CS Ph.D. at Carnegie Mellon University, co-advised by Phil Gibbons and Heather Miller. Hoping for a stimulating and fruitful journey in Pittsburgh!

  • Aug. 2023. TileFlow is publicly available!

  • Apr. 2023. ED-Batch is accepted to ICML'23. Thanks to my mentors and professors!

Publications

(*Equal Contribution)

  • Practical Offloading for Fine-Tuning LLM on Commodity GPU via Learned Sparse Projectors. Siyuan Chen, Zhuofeng Wang, Zelong Guan, Yudong Liu, Phillip B. Gibbons. AAAI'25. Full Version, Code.

  • TileFlow: A Framework for Modeling Fusion Dataflow via Tree-based Analysis. Size Zheng, Siyuan Chen, Siyuan Gao, Liancheng Jia, Guangyu Sun, Runsheng Wang, Yun Liang. MICRO 2023. PDF

  • ED-Batch: Efficient Automatic Batching of Dynamic Deep Neural Networks via Finite State Machine. Siyuan Chen, Pratik Fegade, Tianqi Chen, Phillip B. Gibbons, Todd C. Mowry. ICML'23. PDF, Code, Poster, Video.

  • Memory and Computation Coordinated Mapping of DNNs onto Complex Heterogeneous SoC. Size Zheng, Siyuan Chen, Yun Liang. The 60th Design Automation Conference (DAC), July 2023. PDF

  • Chimera: An Analytical Optimizing Framework for Effective Compute-intensive Operators Fusion. Size Zheng*, Siyuan Chen*, Pedi Song, Renze Chen, Xiuhong Li, Shengen Yan, Dahua Lin, Jingwen Leng, Yun Liang. The 29th International Symposium on High-Performance Computer Architecture (HPCA), February 2023. PDF.

Preprints

  • SLOs-Serve: Optimized Serving of Multi-SLO LLMs. Siyuan Chen, Zhipeng Jia, Samira Khan, Arvind Krishnamurthy, and Phillip B. Gibbons. arXiv preprint.

  • RTBAS: Defending LLM Agents Against Prompt Injection and Privacy Leakage. Peter Yong Zhong*, Siyuan Chen*, Ruiqi Wang, McKenna McCall, Ben L. Titzer, Heather Miller, Phillip B. Gibbons. arXiv preprint.