木叶吟
Congestion Control
CONCUR: Controlling Mid-Phase Thrashing in Agentic Batch Inference
A technical note on CONCUR, an agent-level admission control layer that prevents KV cache collapse during long-running agentic LLM inference.
Zhisheng YE
May 17, 2026
5 min read