Tanishq Abraham is at ICML @iScienceLuvr
Kimi K2 paper dropped!
describes:
- MuonClip optimizer
- large-scale agentic data synthesis pipeline that systematically generates tool-use demonstrations via simulated and real-world environments
- an RL framework that combines RLVR with a self-critique rubric reward mechanism that allows model to evaluate its own outputs
2025-07-21 14:11