MiniMax Enacts Peak-Time Rate Limiting as M2.7 Model Demand Surges
MiniMax's open platform will implement dynamic rate limiting during peak hours due to unexpectedly high demand for its newly launched M2.7 agent model. The company cites rapid traffic growth and concerns over automated batch tasks affecting shared compute resources as reasons for the policy, aimed at ensuring stable service for the majority of users.
MiniMax has announced it will begin dynamically limiting access to its services during peak traffic periods, a move prompted by overwhelming demand for its latest flagship AI model.
The company's open platform issued a service adjustment notice on March 20th, stating that traffic growth for the newly released MiniMax-M2.7 model had "exceeded team expectations." To guarantee service stability and availability for all users, the platform will enforce rate limits based on account usage during high-demand windows.
According to the announcement, the team observed that a significant portion of incoming requests originated from "ultra-high concurrency automated batch tasks or multi-user sharing patterns." The new policy is designed to prevent a small share of atypical traffic from monopolizing the public computing pool, so that resources are allocated fairly and most customers get a stable experience.
The M2.7 model, launched by MiniMax (Xiyu Technology) on March 18th, is billed as a new generation of agent-centric large language models. Its headline claim is a "model self-evolution" pathway: according to the company, an "Agent Harness" system allows the model to participate deeply in its own training and optimization processes.
MiniMax reports that, in internal use, M2.7 can handle approximately 30% to 50% of the workload in certain R&D scenarios, and that it achieved a roughly 30% improvement on internal evaluation benchmarks. On public benchmarks, the model reportedly posted a 56.22% pass rate on SWE-Pro (which covers multiple programming languages), matching GPT-5.3-Codex, and scored 55.6% on the repo-level code generation benchmark VIBE-Pro, a result nearly on par with Opus 4.6.
The need for immediate capacity management underscores the model's rapid adoption and the intense computational demands of advanced AI agent systems. This service adjustment highlights the scaling challenges faced by AI platforms as they deploy increasingly powerful and popular models.