Chinese firms continue to release AI models that rival the capabilities of systems developed by OpenAI and other US-based AI companies.
This week, MiniMax, an Alibaba- and Tencent-backed startup that has raised about $850 million in venture capital and is valued at more than $2.5 billion, debuted three new models: MiniMax-Text-01, MiniMax-VL-01 and T2A-01-HD. The MiniMax-Text-01 is a text-only model, while the MiniMax-VL-01 can understand both images and text. The T2A-01-HD, meanwhile, generates audio, specifically speech.
MiniMax claims that MiniMax-Text-01, which has 456 billion parameters, outperforms models like Google’s recently unveiled Gemini 2.0 Flash on benchmarks like MATH and SimpleQA, which measure a model’s ability to answer math problems and fact-based questions. Parameters roughly correspond to a model’s problem-solving abilities, and models with more parameters generally perform better than those with fewer parameters.
As for the MiniMax-VL-01, MiniMax says it rivals Anthropic’s Claude 3.5 Sonnet on evaluations that require multimodal understanding, such as ChartQA, which tasks models with answering chart- and diagram-related questions (e.g., “What is the maximum value of the orange line in this graph?”). Granted, the MiniMax-VL-01 does not best Gemini 2.0 Flash on many of these tests. OpenAI’s models and Meta’s Llama 3.1 beat it on some, too.
Notably, MiniMax-Text-01 has an extremely large context window. A model’s context, or context window, refers to the input (for example, text) that a model considers before generating output (additional text). With a context window of 4 million tokens, MiniMax-Text-01 can parse about 3 million words in one go, or just over five copies of War and Peace.
For context (no pun intended), MiniMax-Text-01’s context window is roughly 31 times the size of those of GPT-4o and Llama 3.1.
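The comparisons above can be checked with some quick arithmetic. The sketch below is only a rough illustration and not something MiniMax published: it reads the 4 million figure as tokens, and it assumes about 0.75 English words per token, a 128,000-token window for GPT-4o and Llama 3.1, and roughly 587,000 words in War and Peace.

```python
# Back-of-the-envelope check of the context-window figures quoted above.
# Every constant except the 4 million figure is an assumption, not MiniMax data.

MINIMAX_CONTEXT_TOKENS = 4_000_000   # from the article, read here as tokens
WORDS_PER_TOKEN = 0.75               # assumed rule of thumb for English text
OTHER_CONTEXT_TOKENS = 128_000       # assumed window for GPT-4o and Llama 3.1
WAR_AND_PEACE_WORDS = 587_000        # assumed approximate word count


def context_summary(context_tokens: int) -> dict:
    """Convert a context window in tokens into rough comparisons."""
    words = context_tokens * WORDS_PER_TOKEN
    return {
        "words": words,
        "war_and_peace_copies": words / WAR_AND_PEACE_WORDS,
        "ratio_to_other_models": context_tokens / OTHER_CONTEXT_TOKENS,
    }


summary = context_summary(MINIMAX_CONTEXT_TOKENS)
for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

Under those assumptions, the window works out to about 3 million words, a little over five copies of the novel, and about 31 times the assumed 128,000-token window.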
The third MiniMax model released this week, the T2A-01-HD, is an audio generator optimized for speech. The T2A-01-HD can generate a synthetic voice with adjustable cadence, pitch and tenor in about 17 different languages, including English and Chinese, and clone a voice from just 10 seconds of an audio recording.
MiniMax did not publish benchmark results comparing the T2A-01-HD to other audio-generating models. But to this reporter’s ear, the T2A-01-HD’s outputs sound on par with audio models from Meta and startups like PlayAI.
With the exception of the T2A-01-HD, which is available exclusively through MiniMax’s API and its Hailuo AI platform, the new MiniMax models can be downloaded from GitHub and the Hugging Face AI development platform.
However, just because the models are “open” does not mean they are unrestricted. MiniMax-Text-01 and MiniMax-VL-01 are not truly open source in the sense that MiniMax has not released the components (e.g., training data) needed to recreate them from scratch. Additionally, they are under MiniMax’s restrictive license, which prohibits developers from using the models to improve rival AI models and requires platforms with more than 100 million monthly active users to request a separate license from MiniMax.
MiniMax was founded in 2021 by former employees of SenseTime, one of China’s largest AI firms. The company’s projects include apps like Talkie, an AI-powered role-playing platform along the lines of Character AI, and text-to-video models that MiniMax has released on Hailuo.
Some of MiniMax’s products have been the subject of minor controversy.
Talkie, which was pulled from Apple’s App Store in December for unspecified “technical” reasons, features AI avatars of public figures including Donald Trump, Taylor Swift, Elon Musk and LeBron James, none of whom appear to have consented to being featured in the app.
In December, Broadcast magazine reported that MiniMax’s video generators could reproduce the logos of British TV channels, suggesting that MiniMax’s models were trained on content from those channels. And MiniMax is reportedly being sued by iQIYI, a Chinese video streaming service that claims MiniMax illegally trained on iQIYI’s copyrighted recordings.
The new MiniMax models arrive days after the outgoing Biden administration proposed tougher export rules and restrictions on AI technologies for Chinese enterprises. Companies in China were already barred from buying advanced AI chips, but if the new rules go into effect as written, companies will face tighter restrictions on both the semiconductor technology and the designs needed to enable sophisticated AI systems.
On Wednesday, the Biden administration announced additional measures focused on keeping sophisticated chips out of China. Chip foundries and packaging companies that want to export certain chips will be subject to more extensive licensing requirements unless they exercise greater control and due diligence to prevent their products from reaching Chinese customers.