What is Kimi K2.5?
Kimi K2.5 is Moonshot AI's most powerful multimodal model as of January 2026.
Like other AI models, Kimi K2.5 can understand plain English and produce text that resembles human writing. However, what makes this version of Kimi unique is its ability to generate code from visuals, such as UI designs or videos, and orchestrate other tools autonomously. It can also coordinate up to 100 sub-agents working simultaneously to complete one large task.
It uses a mixture-of-experts architecture with 1 trillion total parameters. Those parameters are split across many expert sub-networks, and a router sends each token to only a small subset of them, so only a fraction of the model is active for any given token.
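To make the routing idea concrete, here is a minimal sketch of top-k expert selection in plain Python. It is a toy illustration of the general mixture-of-experts technique, not Moonshot's actual router: the function names, the eight scalar "experts", the router scores and `k=2` are all assumptions made up for this example.

```python
import math

def softmax(scores):
    """Turn raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so the selected experts' weights sum to 1.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_scores, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))

# Hypothetical toy setup: 8 "experts", each a simple scalar function.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

selected = route_top_k(scores, k=2)   # only 2 of the 8 experts run
output = moe_forward(1.0, experts, scores, k=2)
```

The point of the design is that compute per token scales with `k`, not with the total number of experts, which is how a model with a very large parameter count can still respond quickly.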
In our testing, the model answered quickly and precisely, on par with the best AI models. As a point of comparison, it felt faster than Claude Opus 4.5 while being just as helpful and accurate.
Kimi K2.5 is released under a Modified MIT License, which makes it an open-source Chinese AI model that competes directly with DeepSeek V4 and Baidu Ernie 5.0.
Kimi K2.5 Features
Native multimodality: Kimi can process visual and text-based information together and, in our testing, it proved highly accurate. For example, we gave it a screenshot of a design dashboard and it identified all the colors, giving precise hex values.
Agent Swarm: for complex tasks, Kimi K2.5 can self-direct a swarm of up to 100 sub-agents, each executing a separate subtask in parallel. Compared with running in single-agent mode, this lets it complete tasks up to 4.5x faster.
Large context window: the model has a 256K-token context window. That is smaller than the largest we've seen (up to 2 million tokens) but still larger than Claude Opus 4.5's 200K tokens. This makes it well suited to large codebases, spreadsheets, tables and very long chats.
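The Agent Swarm pattern described above, many sub-agents working on separate subtasks at once, can be sketched with Python's `asyncio`. This is only an illustration of the fan-out idea under stated assumptions: `run_sub_agent` is a hypothetical placeholder that sleeps instead of calling a model, and the sketch says nothing about how Kimi K2.5 actually plans or schedules its sub-agents. The only figure taken from the article is the cap of 100.

```python
import asyncio

MAX_SUB_AGENTS = 100  # upper bound quoted in the article

async def run_sub_agent(task: str) -> str:
    """Hypothetical stand-in for one sub-agent handling one subtask."""
    await asyncio.sleep(0.01)  # simulates model or tool latency
    return f"done: {task}"

async def run_swarm(tasks, max_parallel=MAX_SUB_AGENTS):
    """Run all subtasks concurrently, never more than max_parallel at once."""
    limit = asyncio.Semaphore(max_parallel)

    async def guarded(task):
        async with limit:
            return await run_sub_agent(task)

    # gather returns results in the same order as the input tasks
    return await asyncio.gather(*(guarded(t) for t in tasks))

results = asyncio.run(run_swarm([f"subtask-{i}" for i in range(5)]))
```

Because the subtasks wait concurrently rather than one after another, total wall-clock time is close to that of the slowest subtask, which is the intuition behind the speedup over single-agent mode.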
Kimi K2.5 Pricing
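API pricing is $0.60 per 1M tokens for uncached input and $2.50 per 1M tokens for output. Cached input is billed at a discount; both $0.30 and $0.15 per 1M tokens have been quoted, so check Moonshot's current price list before budgeting. Caching is automatic and needs no configuration.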
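As a rough budgeting aid, the per-token rates above can be turned into a small cost estimator. This is a sketch, not an official calculator: the function name and the example token counts are made up, and the cached-input rate is left as a parameter because more than one figure is quoted for it above.

```python
UNCACHED_INPUT_PER_M = 0.60  # USD per 1M uncached input tokens (from the article)
OUTPUT_PER_M = 2.50          # USD per 1M output tokens (from the article)

def estimate_cost(uncached_in, cached_in, out, cached_rate_per_m):
    """Estimate the USD cost of a workload from its token counts.

    cached_rate_per_m is supplied by the caller: USD per 1M cached input tokens.
    """
    total = (
        uncached_in * UNCACHED_INPUT_PER_M
        + cached_in * cached_rate_per_m
        + out * OUTPUT_PER_M
    )
    return total / 1_000_000

# Hypothetical workload: 2M uncached input, 1M cached input, 400K output tokens.
cost = estimate_cost(2_000_000, 1_000_000, 400_000, cached_rate_per_m=0.15)
```

With the $0.15 cached rate, this example workload comes to about $2.35 (1.20 for uncached input, 0.15 for cached input, 1.00 for output).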
Kimi K2.5 Benchmarks
Moonshot compared Kimi K2.5 against GPT-5.2, Claude Opus 4.5 and other reasoning models across more than two dozen benchmarks.
The model achieved the highest score on HLE-Full, one of the industry's most difficult evaluations. Key scores include: 96.1 on AIME 2025, 87.6 on GPQA Diamond, 85 on LiveCodeBench v6, and 76.8 on SWE-Bench Verified.









