Chinese AI startup MiniMax, perhaps best known in the West for its realistic AI video model Hailuo, has released its latest large language model, MiniMax-M1 – and in great news for enterprises and developers, it is fully open source under an Apache 2.0 license. That means businesses can take it, use it for commercial applications, and modify it to their liking without restriction or payment.
M1 is an open-weight offering that sets new standards in long-context reasoning, agentic tool use, and efficient compute performance. It is available today on the AI code-sharing community Hugging Face and on Microsoft's rival code-sharing community GitHub, the first release of what the company has dubbed "MiniMaxWeek" on its social account on X – with further product announcements expected.
MiniMax-M1 distinguishes itself with a context window of 1 million input tokens and up to 80,000 tokens of output, positioning it as one of the most expansive models available for long-context reasoning tasks.
The "context window" in large language models (LLMs) refers to the maximum number of tokens the model can process at one time – including both input and output. Tokens are the basic units of text, which may be whole words, parts of words, punctuation marks, or code symbols. These tokens are converted into numerical vectors that the model uses to represent and manipulate meaning through its parameters (weights and biases). They are, in essence, the LLM's native language.
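To make tokens concrete, here is a minimal sketch using the Hugging Face Transformers library. The GPT-2 tokenizer is a stand-in chosen for familiarity; MiniMax-M1 ships its own tokenizer, so its exact token counts will differ.

```python
# Minimal tokenization sketch using Hugging Face Transformers.
# GPT-2's tokenizer is a stand-in; MiniMax-M1 uses its own vocabulary,
# so the counts below are illustrative only.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "MiniMax-M1 supports a 1-million-token context window."
token_ids = tokenizer.encode(text)

print(len(token_ids))                               # how many tokens the sentence costs
print(tokenizer.convert_ids_to_tokens(token_ids))   # the tokens themselves
```

Every word of input and every word the model writes back is metered against the context window in exactly this way, which is why a 1 million token budget changes what a single session can hold.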
For comparison, OpenAI's GPT-4o has a context window of only 128,000 tokens – enough to exchange about a novel's worth of information between the user and the model in a single back-and-forth interaction. At 1 million tokens, MiniMax-M1 could exchange a small collection or book series' worth of information. Google's Gemini 2.5 Pro offers a 1 million token upper limit as well, with a 2 million token window reportedly in the works.
But M1 has another trick up its sleeve: it was trained with reinforcement learning using an innovative, resourceful, and highly efficient technique. The model uses a hybrid Mixture-of-Experts (MoE) architecture with a lightning attention mechanism designed to reduce inference costs.
According to the technical report, MiniMax-M1 consumes only 25% of the floating-point operations (FLOPs) required by DeepSeek-R1 at a generation length of 100,000 tokens.
Architecture and variants
The model comes in two variants – MiniMax-M1-40k and MiniMax-M1-80k – the names referring to their "thinking budgets," or maximum output lengths.
The architecture is built on the company's earlier MiniMax-Text-01 foundation and comprises 456 billion parameters, with 45.9 billion activated per token.
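That gap between total and active parameters is the hallmark of a Mixture-of-Experts design: a router picks a small subset of expert sub-networks for each token, so only a fraction of the weights runs in any forward pass. Below is a minimal, illustrative sketch of top-k expert routing in PyTorch; the dimensions, expert count, and k are toy values, not MiniMax-M1's actual configuration.

```python
# Illustrative top-k Mixture-of-Experts routing in PyTorch.
# Sizes and expert count are toy values, not MiniMax-M1's real config.
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # scores every expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                              # x: (tokens, d_model)
        scores = self.router(x)                        # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)     # keep only k experts per token
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e               # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(1) * expert(x[mask])
        return out

moe = TopKMoE()
print(moe(torch.randn(10, 512)).shape)  # torch.Size([10, 512]); only 2 of 8 experts ran per token
```

The payoff is exactly what M1's numbers show: model capacity scales with the total parameter count, while per-token compute scales only with the active subset.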
The model's training cost is a standout feature of the release. MiniMax reports that M1 was trained using large-scale reinforcement learning (RL) at an efficiency rarely seen in this domain, with total costs of $534,700.
This efficiency is credited to a custom RL algorithm called CISPO, which clips importance sampling weights rather than token updates, and to the hybrid attention design, which helps streamline scaling.
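To make that distinction concrete, here is a hypothetical PyTorch sketch contrasting a PPO-style objective, which clips the per-token update itself, with a CISPO-style objective that instead clips (and detaches) the importance sampling weight so every token keeps contributing a gradient. The tensor names and epsilon values are illustrative; consult MiniMax's technical report for the exact formulation.

```python
# Hypothetical contrast between PPO-style clipping and CISPO-style clipping
# of the importance sampling (IS) weight. Epsilon values are illustrative;
# see MiniMax's technical report for the exact loss.
import torch

def ppo_loss(logp_new, logp_old, advantages, eps=0.2):
    ratio = torch.exp(logp_new - logp_old)            # per-token IS weight
    clipped = torch.clamp(ratio, 1 - eps, 1 + eps)
    # Clipping the update itself: tokens far outside the trust region
    # can have their gradient contribution zeroed out entirely.
    return -torch.min(ratio * advantages, clipped * advantages).mean()

def cispo_style_loss(logp_new, logp_old, advantages, eps=0.2):
    ratio = torch.exp(logp_new - logp_old)
    # Clip the IS weight and detach it, so it acts only as a bounded
    # coefficient: every token still gets a gradient through logp_new.
    coeff = torch.clamp(ratio, 1 - eps, 1 + eps).detach()
    return -(coeff * advantages * logp_new).mean()
```

Keeping a gradient signal on every token rather than discarding outliers is one plausible reason the approach stretches a modest training budget further.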
That is an astonishingly "cheap" figure for a frontier LLM: DeepSeek trained its hit R1 reasoning model at a reported cost of $5–6 million, while the training cost of OpenAI's GPT-4 – a model now more than two years old – was said to exceed $100 million. These costs come from the price of graphics processing units (GPUs), the massively parallel computing hardware produced primarily by companies such as Nvidia, which can cost $20,000–$30,000 or more per module, and from the energy required to run those chips at scale in data centers.
Benchmark performance
MiniMax-M1 has been evaluated across a series of established benchmarks that test advanced reasoning, software engineering, and tool-use capabilities.
On AIME 2024, a mathematics competition benchmark, the M1-80k model scores 86.0% accuracy. It also delivers strong performance in coding and long-context tasks, achieving:
- 65.0% on LiveCodeBench
- 56.0% on SWE-bench Verified
- 62.8% on TAU-bench
- 73.4% on OpenAI MRCR (4-needle version)

These results place MiniMax-M1 ahead of other open-weight competitors such as DeepSeek-R1 and Qwen3-235B-A22B on several complex tasks.
While closed-weight models such as OpenAI's o3 and Google's Gemini 2.5 Pro still hold the edge on a few benchmarks, MiniMax-M1 narrows the performance gap considerably while remaining freely accessible under an Apache 2.0 license.
For deployment, MiniMax recommends vLLM as the serving backend, citing its optimization for large model workloads, memory efficiency, and batched request handling. The company also provides deployment options using the Transformers library.
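A minimal sketch of what serving the model through vLLM might look like is below. The Hugging Face repository ID MiniMaxAI/MiniMax-M1-80k and the parallelism setting are assumptions; a model of this size needs a multi-GPU node, and MiniMax's published deployment guide should be treated as authoritative for the actual recommended settings.

```python
# Hedged vLLM serving sketch. The repo ID "MiniMaxAI/MiniMax-M1-80k" and
# tensor_parallel_size are assumptions; check MiniMax's deployment guide
# for recommended settings. A model this size requires a multi-GPU node.
from vllm import LLM, SamplingParams

llm = LLM(
    model="MiniMaxAI/MiniMax-M1-80k",
    tensor_parallel_size=8,        # shard weights across 8 GPUs (hardware-dependent)
    trust_remote_code=True,        # custom architectures ship their own model code
)

params = SamplingParams(temperature=1.0, max_tokens=4096)
outputs = llm.generate(["Summarize the key risks in this contract: ..."], params)
print(outputs[0].outputs[0].text)
```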
MiniMax-M1 also includes structured function calling capabilities and is packaged with a chatbot API featuring online search, video and image generation, speech synthesis, and voice cloning tools. These features aim to support broader agentic behavior in real-world applications.
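Structured function calling typically means the model emits a JSON object naming a tool and its arguments instead of free text. The sketch below uses the OpenAI-compatible client pattern common with self-hosted endpoints (vLLM exposes one in server mode); the endpoint URL, model name, and tool schema are illustrative assumptions, not MiniMax's documented API.

```python
# Hypothetical function-calling sketch against an OpenAI-compatible endpoint
# (e.g. vLLM's server mode). URL, model name, and the tool schema are
# illustrative assumptions, not MiniMax's documented API.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

tools = [{
    "type": "function",
    "function": {
        "name": "search_web",  # hypothetical tool
        "description": "Search the web for current information.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="MiniMaxAI/MiniMax-M1-80k",
    messages=[{"role": "user", "content": "What did MiniMax announce this week?"}],
    tools=tools,
)

# If the model chose the tool, its structured arguments arrive as JSON.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```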
Implications for technical decision-makers and enterprise buyers
MiniMax-M1's open access, long-context capabilities, and compute efficiency address several recurring challenges for technical professionals responsible for managing AI systems at scale.
For engineering leads responsible for the full lifecycle of LLMs – from optimizing model output to deploying under tight timelines – MiniMax-M1 offers a lower operational cost profile while supporting advanced reasoning tasks. Its long context window could significantly reduce preprocessing effort for enterprise documents or log data that span tens or hundreds of thousands of tokens.
For those managing AI orchestration pipelines, the ability to fine-tune and deploy MiniMax-M1 using established tools like vLLM or Transformers supports easier integration into existing infrastructure. The hybrid-attention architecture may help simplify scaling strategies, and the model's strong performance on multi-step reasoning and software engineering benchmarks offers a high-capability base for internal copilots or agent-based systems.
From a data platform perspective, teams responsible for maintaining efficient, scalable infrastructure can benefit from M1's support for structured function calling and its compatibility with automated pipelines. Its open-source nature allows teams to tailor performance to their stack without vendor lock-in.
Security leads may also find value in evaluating M1's potential for secure, on-premises deployment of a high-capability model that doesn't rely on transmitting sensitive data to third-party endpoints.
Taken together, MiniMax-M1 presents a flexible option for organizations looking to experiment with or scale advanced AI capabilities while managing costs, staying within operational limits, and avoiding proprietary restrictions.
The release signals MiniMax's continued focus on practical, scalable AI models. By combining open access with advanced architecture and compute efficiency, MiniMax-M1 may serve as a foundational model for developers building next-generation applications that demand both reasoning depth and long-range input understanding.
We'll be following the rest of MiniMax's releases throughout the week. Stay tuned!