Xintian Han, Honggang Chen, Quan Lin, Jingyue Gao, Xiangyuan Ren, Lifei Zhu, Zhisheng YE, Shikang Wu, XiongHang Xie, Xiaochu Gan, Bingzheng Wei, Peng Xu, Zhe Wang, Yuchao Zheng, Jingjian Lin, Di Wu, Junfeng Ge (2025). LEMUR: Large Scale End-to-End Multimodal Recommendation. arXiv.
Chenxiang Ma, Zhisheng YE, Hanyu Zhao, Zehua Yang, Tianhao Fu, Jiaxun Han, Jie Zhang, Yingwei Luo, Xiaolin Wang, Zhenlin Wang, Yong Li, Diyu Zhou (2025). Memory Offloading for Large Language Model Inference with Latency SLO Guarantees. arXiv.
Qinghao Hu, Zhisheng YE, Zerui Wang, Guoteng Wang, Meng Zhang, Qiaoling Chen, Peng Sun, Dahua Lin, Xiaolin Wang, Yingwei Luo, Yonggang Wen, Tianwei Zhang (2024). Characterization of Large Language Model Development in the Datacenter. In NSDI.