Xiaomi and MiniMax both unleash their ultimate moves, signaling the start of the Agent Pricing War.

By: blockbeats|2026/03/20 13:25:43

On March 18 and 19, two Chinese companies released Agent-oriented large models one day apart. The domestic AI startup MiniMax launched M2.7, and Xiaomi's large model team MiMo introduced V2-Pro. Both models entered the global top tier on Agent benchmarks, yet their API output prices are roughly 1/21 and 1/8 of Claude Opus 4.6's, respectively.

Both companies played their cards in the same week, but with very different hands. They represent two distinct technical paths, each betting on a different future for the Agent era.

Same Exam, 1/17 Tuition Fee

First, let's look at the most intuitive comparison.

According to OpenRouter and the companies' official pricing pages, the API output price (per million tokens) is $1.2 for MiniMax M2.7 and $3 for MiMo-V2-Pro. For reference, the output price is $25 for Claude Opus 4.6, $14 for GPT-5.2, and $15 for Claude Sonnet 4.6.
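The "1/21" and "1/8" figures follow directly from these list prices. A minimal sketch of the arithmetic, using only the prices quoted above (the dictionary and the `price_multiple` helper are illustrative names, not part of any vendor API):

```python
# Output prices in USD per million tokens, as quoted in the article.
OUTPUT_PRICE_USD = {
    "MiniMax M2.7": 1.2,
    "MiMo-V2-Pro": 3.0,
    "Claude Opus 4.6": 25.0,
    "GPT-5.2": 14.0,
    "Claude Sonnet 4.6": 15.0,
}


def price_multiple(model: str, reference: str = "Claude Opus 4.6") -> float:
    """Return how many times more expensive `reference` is than `model`."""
    return OUTPUT_PRICE_USD[reference] / OUTPUT_PRICE_USD[model]


for name in ("MiniMax M2.7", "MiMo-V2-Pro"):
    multiple = price_multiple(name)
    # 25 / 1.2 is about 20.8 and 25 / 3 is about 8.3, hence "1/21" and "1/8".
    print(f"{name}: about 1/{round(multiple)} of the Claude Opus 4.6 output price")
```

Passing a different `reference`, such as `"Claude Sonnet 4.6"`, gives the multiple against that model instead.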

The price difference is about an order of magnitude, but the performance difference is not. On SWE-bench Verified (the current mainstream benchmark for code engineering capability), MiMo-V2-Pro scored 78% and Sonnet 4.6 scored 79.6%, a gap of less than two percentage points. M2.7's SWE-Pro score is 56.22%, on par with GPT-5.3-Codex. On VIBE-Pro (end-to-end project delivery capability), M2.7 scored 55.6%, approaching the level of Opus 4.6.

The point of this comparison is not who scores higher or lower: the vendors' benchmark setups are not fully aligned, so direct comparisons call for caution. The point is the widening gap between price and performance: domestic Agent models have squeezed into the same performance band while sitting in a completely different price range.

Trillion Parameters vs. Self-evolution

Price is just the surface. The two companies have presented two completely different sets of trump cards.

MiMo-V2-Pro follows the "go big or go home" route. According to Xiaomi's official announcement, V2-Pro has over 1 trillion total parameters, 42B active parameters, and supports an ultra-long context of 1 million tokens. Its core innovation is a Hybrid Attention mechanism that sets the ratio of Sliding Window Attention (SWA) to Global Attention (GA) at 7:1, up from 5:1 in its predecessor V2-Flash. This architecture makes the model more stable when processing long documents and when issuing multiple tool calls in parallel in Agent scenarios. On PinchBench (an evaluation of Agent tool invocation capability), MiMo-V2-Pro scored 84%.

M2.7 took a completely different path. According to MiniMax's official tech blog post on March 18, the company did not disclose M2.7's parameter count but demonstrated a "self-iterative evolution" mechanism: the model autonomously ran over 100 optimization loops, each consisting of analyzing failure trajectories, planning modifications, modifying its own code architecture, running evaluations, and looping again, ultimately achieving a 30% performance improvement on an internal evaluation set. On MLE Bench Lite (a benchmark built from machine learning competition tasks), out of 22 challenging problems, M2.7 secured 9 gold, 5 silver, and 1 bronze, with an average medal rate of 66.6%.

Across five dimensions, the two paths point in different directions: MiMo-V2-Pro clearly leads in context length and code engineering, while M2.7 pulls ahead in office automation and self-iteration. According to the same MiniMax tech blog post, M2.7 scored an ELO of 1495 on GDPval-AA (an office document processing evaluation), ranking first among open-source models, and maintained a 97% skill compliance rate in the MM-Claw test, which covers over 40 complex skills.

Four Versions in Five Months

The two companies differ not only in technical path but also in iteration rhythm.

According to public release records, from the release of M2 in October 2025 to the release of M2.7 in March 2026, MiniMax iterated four versions within five months, averaging a major version every 49 days. The gap between M2.5 and M2.7 was only about 30 days.

Xiaomi's MiMo follows a different rhythm: MiMo-7B, an open-source reasoning model with 7B parameters, was released in April 2025; V2-Flash, with 309B total parameters, followed in December of the same year; and V2-Pro, with 1T total parameters, arrived in March 2026. The jump in parameter scale between generations is much larger, but the intervals between versions are also longer.

MiMo's rival MiniMax chose small, frequent steps: no single iteration makes a big leap, but releases come at a very high frequency. M2.7's self-iterative mechanism is itself designed for "continuous evolution." Xiaomi opted for fewer, larger releases, with each version bringing significant changes in parameter scale and architecture.

8 Days Anonymous, Top of OpenRouter

In addition to the technical roadmap, Xiaomi's release strategy has also broken industry conventions.

According to Reuters, on March 11, an anonymous model named Hunter Alpha appeared on the world's largest API aggregation platform, OpenRouter. No brand endorsement, no product launch event, no technical blog. Its API pricing was extremely low, yet its performance was surprisingly strong.

The community began to speculate about its origins. According to Republic World and several other tech media reports, the most common guess was DeepSeek V4, in part because MiMo team leader Luo Fuli had previously done research at DeepSeek. API usage quickly skyrocketed: total usage during the anonymous period exceeded 1 trillion tokens, putting the model at the top of the OpenRouter weekly rankings.

Early on March 19, Xiaomi revealed that Hunter Alpha was in fact MiMo-V2-Pro. According to the same Reuters report, Xiaomi's Hong Kong-listed shares rose as much as 5.8% after the revelation.

This is the first time a domestic large model has proven itself on a global platform purely through blind testing. Without relying on brand or publicity, it took 8 days for developers to vote with their feet.
