My research interests mainly lie in Large Model Systems, particularly in disaggregating
and saturating hardware resources. Currently, I'm working on RL Infra and VLA Inference.
The goal of my research is to push the boundary of how large-scale models can be designed,
trained, and efficiently served to make a physical impact.
Development Intern | Minimax
December '25 - Present
RL Infra team, responsible for developing frameworks for training production models.
Tech Member | SGLang
October '25 - Present
Member of SGLang team, responsible for supporting RL features.
Research Intern | KVCache-AI.Org
May '25 - Nov '25
Core dev of Mooncake from MADSys Group at THU.
Promoted disaggregated storage and communication layers for large model systems.
Mentors: Mingxing Zhang and Teng Ma.
* denotes equal contribution
Bridging the GPU Utilization Gap: Predictive Multi-Dimensional Resource Scheduling for AI Workloads
I'm a sports enthusiast, especially when it comes to basketball and running. I've been watching the NBA since 2018,
and I'm a big fan of the Golden State Warriors and Stephen Curry!
Hollow Knight is my favorite video game; it took me about 50 hours to complete all of its challenges.
This template is a modification of Jon Barron's website.