
Moonshot AI Releases Its Strongest Model, Kimi K2.6: Codes Non-Stop for 13 Hours, Benchmarked Against GPT-5.4

Moonshot AI recently released and open-sourced the Kimi K2.6 model, with across-the-board upgrades in coding, long-horizon task execution, and Agent cluster capabilities. Kimi K2.6 is now available on the official Kimi website, in the latest version of the Kimi app, through the API, and in the Kimi Code programming assistant. It has posted strong results on several widely used benchmarks.

Whether on the highly difficult Humanity's Last Exam, on SWE-Bench Pro, which focuses on real-world software engineering, or on DeepSearchQA, which evaluates Agent retrieval, its scores reached industry-leading levels, matching or exceeding closed-source models such as GPT-5.4, Claude Opus 4.6, and Gemini 3.1 Pro.

As Kimi's strongest coding model to date, Kimi K2.6 significantly improves long-horizon coding.

In testing, it coded continuously for 13 hours, writing or modifying more than 4,000 lines of code to complete the development and optimization of a complex system.

By tightly integrating coding with visual capabilities, Kimi K2.6 can also deliver professional-grade Web applications with highly creative designs.

On Kimi Code Bench, an internal coding evaluation, K2.6 scores approximately 20% higher than the previous-generation K2.5.

Its generalization ability is also notable.

Tests show that Kimi K2.6 can be deployed locally on a Mac and optimize the inference process using the Zig language. Over more than 4,000 tool calls and 12 hours of continuous operation, throughput rose from approximately 15 tokens/s to approximately 193 tokens/s, ultimately running inference approximately 20% faster than LM Studio.
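To put those figures in perspective, the short calculation below derives the implied speedup and the average tool-call rate. It is only arithmetic on the numbers quoted above; the variable names are illustrative, and "more than 4,000" is treated as exactly 4,000, so the call rate is a lower bound.

```python
# Figures quoted in the article (all approximate).
start_tps = 15.0    # tokens/s before optimization
end_tps = 193.0     # tokens/s after optimization
tool_calls = 4000   # "more than 4,000", so this is a lower bound
hours = 12          # continuous run time

# Throughput improvement as a multiplier.
speedup = end_tps / start_tps

# Average number of tool calls issued per minute over the run.
calls_per_minute = tool_calls / (hours * 60)

print(f"speedup: about {speedup:.1f}x")
print(f"tool calls per minute: at least {calls_per_minute:.1f}")
```

In other words, the reported run corresponds to roughly a 13-fold throughput gain, at an average pace of more than five tool calls per minute.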

In terms of Agent capabilities, Kimi K2.6 supports multi-Agent collaboration: it can schedule combinations of Agents with different specialties to complete complex tasks, combining search, in-depth research, document analysis, and long-form generation to significantly improve overall task quality.

Its Agent cluster architecture has also been upgraded. It supports up to 300 sub-Agents running in parallel across approximately 4,000 collaborative steps, delivering end to end, from documents to web pages to slide decks and spreadsheets, in a single run.
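The article does not describe how the cluster schedules its sub-Agents. As a rough illustration of the general idea of capping parallel workers, here is a minimal sketch using Python's standard `asyncio` library. Everything in it is an assumption: `run_sub_agent` and `run_cluster` are hypothetical names, the sub-Agent is a stub that does no real work, and nothing here reflects Kimi's actual API or architecture. Only the limit of 300 comes from the article.

```python
import asyncio

MAX_PARALLEL_SUB_AGENTS = 300  # upper bound quoted in the article


async def run_sub_agent(agent_id: int, task: str) -> str:
    """Hypothetical stand-in for one sub-agent; a real one would call a model."""
    await asyncio.sleep(0)  # yield control, simulating asynchronous work
    return f"agent-{agent_id}: finished {task!r}"


async def run_cluster(tasks: list[str], limit: int = MAX_PARALLEL_SUB_AGENTS) -> list[str]:
    """Run every task, keeping at most `limit` sub-agents in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(agent_id: int, task: str) -> str:
        # The semaphore blocks here once `limit` sub-agents are already running.
        async with semaphore:
            return await run_sub_agent(agent_id, task)

    # gather returns results in the same order as the submitted tasks.
    return await asyncio.gather(*(guarded(i, t) for i, t in enumerate(tasks)))


results = asyncio.run(
    run_cluster(["draft document", "build web page", "make slides", "fill spreadsheet"])
)
for line in results:
    print(line)
```

A semaphore is used here because it is the simplest way to bound concurrency without managing a worker pool by hand; a production scheduler would also need retries, dependencies between steps, and result merging, none of which are covered in the article.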