I am a PhD student at the Georgia Institute of Technology, advised by Prof. Tushar Krishna.
I am currently visiting MIT, where I am advised by Prof. Song Han and work closely with Han Cai.
Earlier in my PhD, I focused on efficient quantization and sparsity for LLMs
to reduce serving cost, and learned CUDA kernel development through these projects.
Now I work on agentic LLM efficiency and system-algorithm co-design. I believe
future AI systems will be adaptive, learning from environment, algorithm,
and workload feedback.
Prior to Georgia Tech, I worked with Prof. Baharan Mirzasoleiman at UCLA
on efficient machine learning from massive datasets.
I received my B.Eng. in Computer Science from Zhejiang University in 2023.