My research interests mainly lie in Machine Learning Systems, especially in designing efficient
and scalable systems for GenAI. I am particularly interested in KVCache storage and transmission,
multi-tenant LoRA serving, and distributed architecture scheduling for AIGC and RLHF frameworks.
The goal of my research is to push the boundary of how large-scale models can be designed,
trained and efficiently served.
Undergraduate Student | University of Michigan, Ann Arbor
August '25 - Present
Member of SymbioticLab in the CSE Department, College of Engineering.
Advisor: Mosharaf Chowdhury.
Committed to serving Any-to-Any Multimodal LLMs with high throughput and low latency
on distributed heterogeneous hardware clusters.
Research Intern | kvcache.ai
May '25 - Present
Working on the Mooncake project from the MADSys Group at THU.
Mentor: Teng Ma.
Committed to promoting a KVCache-centric disaggregated architecture for LLM serving
and building a more efficient and flexible data plane for RLHF training frameworks.
Undergraduate Student | Shanghai Jiao Tong University
September '22 - Present
Member of the UM-SJTU Joint Institute, pursuing dual Bachelor's degrees. Working in the EPCC Lab under
the School of Computer Science. Advisor: Shixuan Sun.
Committed to reducing latency for multi-tenant concurrent LoRA serving and building a disaggregated architecture for serverless graph processing.
FaaSBoard: Efficient Graph Processing with a Disaggregated Architecture
on Serverless Services
We analyze the limitations of monolithic function architectures in graph processing,
and motivate the design of a disaggregated serverless architecture. We introduce FaaSBoard,
a graph processing system built on serverless cloud services to validate the effectiveness
of the disaggregated architectural design. We conduct a comprehensive evaluation of FaaSBoard
to demonstrate its advantages in both performance and cost.
This template is a modification of Jon Barron's website.