Siyuan Chen (陈思元)

I am a second-year Ph.D. student in the Computer Science Department at Carnegie Mellon University, happily advised by Phil Gibbons and Heather Miller, and I work closely with Ben Titzer. I am a member of PDL and Catalyst at CMU. I finished my undergraduate studies in the Turing Class at Peking University, where I was fortunate to be advised by Yun (Eric) Liang.

My research interests lie in systems for machine learning, especially efficient and secure solutions for post-training systems. For instance, I have worked on LLM fine-tuning on commodity hardware through compressed offloading (LSP-Offload, AAAI'25), SLO-customized LLM serving (SLOs-Serve, preprint), and defending against prompt injection attacks in tool-based agentic systems (RTBAS, preprint).

Before my Ph.D., I worked on machine learning compiler and runtime optimizations, including optimizing the batching of dynamic neural networks (ED-Batch, ICML'23), performance modeling of data movement for tensor programs (Chimera, HPCA'23), mapping DNNs onto complex heterogeneous SoCs (COMB, DAC'23), and an analytical simulator for fused programs on general hardware (TileFlow, MICRO'23).

News

  • May 2025. I am at SystemResearch@Google (SRG) this summer working on RL systems, hosted by Samira Khan.

  • May 2025. I am honored to be selected as an MLCommons ML and Systems Rising Star. Many thanks to the organizers and my advisors!

  • Apr. 2025. A new paper, RTBAS, is on arXiv, where we develop a novel approach to reduce user fatigue in tool-based agentic systems.

  • Mar. 2025. A new paper, SLOs-Serve, is on arXiv, where we develop a novel scheduling algorithm to support customized service-level objectives in LLM serving.

  • June 2024. I am starting an internship as a student researcher at SystemResearch@Google (SRG), working on LLM serving, hosted by Samira Khan.

  • Sept. 2023. I am starting my CS Ph.D. at Carnegie Mellon University, co-advised by Phil Gibbons and Heather Miller. Hoping for a stimulating and fruitful journey in Pittsburgh!

  • Aug. 2023. TileFlow is publicly available!

  • Apr. 2023. ED-Batch is accepted to ICML'23. Thanks to my mentors and professors!

Publications

(*Equal Contribution)

  • Practical Offloading for Fine-Tuning LLM on Commodity GPU via Learned Sparse Projectors. Siyuan Chen, Zhuofeng Wang, Zelong Guan, Yudong Liu, Phillip B. Gibbons. AAAI'25. Full Version, Code.

  • TileFlow: A Framework for Modeling Fusion Dataflow via Tree-based Analysis. Size Zheng, Siyuan Chen, Siyuan Gao, Liancheng Jia, Guangyu Sun, Runsheng Wang, Yun Liang. MICRO 2023. PDF

  • ED-Batch: Efficient Automatic Batching of Dynamic Deep Neural Networks via Finite State Machine. Siyuan Chen, Pratik Fegade, Tianqi Chen, Phillip B. Gibbons, Todd C. Mowry. ICML'23. PDF, Code, Poster, Video.

  • Memory and Computation Coordinated Mapping of DNNs onto Complex Heterogeneous SoC. Size Zheng, Siyuan Chen, Yun Liang. The 60th Design Automation Conference (DAC), July 2023. PDF

  • Chimera: An Analytical Optimizing Framework for Effective Compute-intensive Operators Fusion. Size Zheng*, Siyuan Chen*, Pedi Song, Renze Chen, Xiuhong Li, Shengen Yan, Dahua Lin, Jingwen Leng, Yun Liang. The 29th International Symposium on High-Performance Computer Architecture (HPCA), February 2023. PDF.

Preprints

  • SLOs-Serve: Optimized Serving of Multi-SLO LLMs. Siyuan Chen, Zhipeng Jia, Samira Khan, Arvind Krishnamurthy, and Phillip B. Gibbons. arXiv preprint.

  • RTBAS: Defending LLM Agents Against Prompt Injection and Privacy Leakage. Peter Yong Zhong*, Siyuan Chen*, Ruiqi Wang, McKenna McCall, Ben L. Titzer, Heather Miller, Phillip B. Gibbons. arXiv preprint.