Junyuan Shang
sjy1203
AI & ML interests
NLP
Recent Activity
authored a paper 24 minutes ago
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
authored a paper 24 minutes ago
NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time
authored a paper 28 minutes ago
ERNIE 5.0 Technical Report
Organizations
None yet