Alibaba's Qwen 2.5 Max: Challenging OpenAI's Dominance in the LLM Arena?

Source: The Register

Alibaba's recent release of Qwen 2.5 Max, a powerful new large language model (LLM), has sparked debate about the shifting landscape of AI development. While DeepSeek's advancements caught the attention of Silicon Valley, Alibaba's Qwen 2.5 Max appears to outpace not only DeepSeek's V3 but also leading models like GPT-4o, Claude 3.5 Sonnet, and Llama 3.1 405B in benchmark tests. This raises questions about the US lead in AI and the cost-effectiveness of developing advanced models.

The speed and efficiency at which DeepSeek claims to be training large language models (LLMs) competitive with America's best has been a reality check for Silicon Valley. However, the startup isn't the only Chinese model builder the US has to worry about.

This week, Chinese cloud and e-commerce goliath Alibaba unveiled a flurry of LLMs, including what appears to be a new model called Qwen 2.5 Max that it reckons not only outperforms DeepSeek's V3, which the reasoning-capable R1 is based on, but trounces America's top models.

As always, we recommend taking benchmarks with a grain of salt, but if Alibaba is to be believed, Qwen 2.5 Max – which can search the web and output text, video, and images from inputs – managed to outperform OpenAI's GPT-4o, Anthropic's Claude 3.5 Sonnet, and Meta's Llama 3.1 405B across the popular Arena-Hard, MMLU-Pro, GPQA-Diamond, LiveCodeBench, and LiveBench benchmark suites. Given the fervor around DeepSeek, we feel compelled to emphasize that Alibaba is drawing comparisons against V3 and not DeepSeek's latest model.

In any case, the announcement further fuels the perception that, despite ongoing Western efforts to stifle Chinese AI development, the US lead in AI may not be as large as previously thought. It also feeds the perception that the countless billions of dollars demanded by Silicon Valley to develop artificial intelligence look a little greedy.

Unfortunately, beyond performance claims, API access, and a web-based chatbot, Alibaba's Qwen team is being rather tight-lipped about its latest model release. Unlike DeepSeek, whose models are freely available to download and use if you don't want to rely on DeepSeek's apps or cloud, Alibaba has not released Qwen 2.5 Max. It's only available to access from Alibaba's servers.

What we do know so far is that Qwen 2.5 Max is a large-scale mixture-of-experts (MoE) model that was trained on a corpus of 20 trillion tokens before being further refined using supervised fine-tuning and reinforcement learning from human feedback. MoE models have become increasingly popular among model builders as a way to decouple parameter count from actual performance.

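
As a rough illustration of how an MoE model touches only part of its network per token, here is a toy sketch of top-k expert routing. Alibaba has not disclosed how Qwen 2.5 Max routes tokens, so everything here is an assumption made for illustration: the number of experts, the value of `k`, the router scores, and the function names `top_k_route` and `moe_forward` are all hypothetical.

```python
import math


def top_k_route(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    # Softmax over the selected experts only, so their weights sum to 1.
    peak = max(gate_scores[i] for i in top)
    exps = [math.exp(gate_scores[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, gate_scores, k=2):
    """Run one input through only the k selected experts and mix their outputs."""
    routed = top_k_route(gate_scores, k)
    output = sum(weight * experts[i](x) for i, weight in routed)
    return output, [i for i, _ in routed]


# Eight toy "experts": each is a scalar function standing in for a sub-network.
experts = [lambda x, scale=scale: scale * x for scale in range(1, 9)]

# Hypothetical router scores for one token (a real router computes these).
gate_scores = [0.1, 2.0, -1.0, 0.3, 1.5, -0.5, 0.0, 0.2]

output, used = moe_forward(3.0, experts, gate_scores, k=2)
active_fraction = len(used) / len(experts)
print(f"experts used: {used}, fraction of experts active: {active_fraction:.0%}")
```

In this toy setup only two of the eight experts run for the token, so adding more experts grows the total parameter count without growing the per-token work.
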
Because only a portion of the model is active for any given request – there's no need to activate the entire neural network to tackle a query, just the 'expert' parts relevant to the question – it's now possible to increase parameter count without compromising throughput. That is to say, rather than running an input query through the entire multi-billion-parameter network and performing all those calculations per token, only the query-relevant expert sub-networks are used, meaning outputs are generated faster.

We reached out to Alibaba for comment; we'll let you know if we hear back. In the meantime, we asked Qwen 2.5 Max, via its web-based chatbot, to share its specs, and it doesn't appear to know much about itself either. But even if it did spit out a number, we're not sure we'd believe it.

It's possible we may never get hold of Qwen 2.5 Max's neural network weights. On the Alibaba Cloud website, the model is listed as being proprietary, which might explain why the Chinese super-corp is sharing so little about the model. Not disclosing parameter counts and other key details is par for the course for many model builders, and Alibaba has been similarly tight-lipped with regard to its proprietary Qwen Turbo and Qwen Plus models.

The lack of details makes evaluating model performance somewhat challenging, as performance has to be weighed against cost. A model may outperform another in benchmarks, but if it costs 3-4x more to run, it may not be worth the hassle. This certainly appears to be the case with Qwen 2.5 Max, which costs $10 per million input tokens and $30 for every million tokens generated. Compare that to GPT-4o, for which OpenAI charges $2.50 per million input tokens and $10 per million output tokens, or half that if you opt for its batch processing. With that said, Qwen 2.5 Max is still cheaper than OpenAI's flagship o1 model, which will run you $15 per million input tokens and $60 per million output tokens generated.

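
The 3-4x figure can be checked against the prices quoted above. The short sketch below does that arithmetic. The helper names `request_cost` and `price_ratio` and the 2,000-input, 500-output token workload are assumptions made for illustration, and the batch-processing discount is not modeled.

```python
# Per-million-token prices in USD, as quoted in the article.
PRICES = {
    "Qwen 2.5 Max": {"input": 10.00, "output": 30.00},
    "GPT-4o": {"input": 2.50, "output": 10.00},
    "o1": {"input": 15.00, "output": 60.00},
}


def request_cost(model, input_tokens, output_tokens):
    """Cost in USD of one request at the quoted per-million-token prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def price_ratio(model, baseline, kind):
    """How many times more `model` charges than `baseline` for `kind` tokens."""
    return PRICES[model][kind] / PRICES[baseline][kind]


# Hypothetical workload: 2,000 input tokens and 500 output tokens per request.
for name in PRICES:
    print(f"{name}: ${request_cost(name, 2_000, 500):.4f} per request")

print("input price vs GPT-4o:", price_ratio("Qwen 2.5 Max", "GPT-4o", "input"))
print("output price vs GPT-4o:", price_ratio("Qwen 2.5 Max", "GPT-4o", "output"))
```

At these list prices, Qwen 2.5 Max works out to four times GPT-4o's input rate and three times its output rate, while still undercutting o1 on both.
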
As mentioned, Qwen 2.5 Max is only the latest in a string of LLMs released by the Chinese mega-biz since 2023. Its most recent generation of models bears the Qwen 2.5 name. Alibaba claimed that the largest of these models, pitted against its contemporaries, could go toe-to-toe with, and in some cases best, Meta's far larger 405B Llama model. But again, we recommend taking these claims with a grain of salt.

Alongside its general-purpose models, Alibaba also released the weights for several math- and code-optimized LLMs and extended access to a pair of proprietary models called Qwen Plus and Qwen Turbo, which allegedly boasted performance within spitting distance of GPT-4o and GPT-4o mini. There is also its OpenAI o1-style 'thinking' model, called QwQ. And then this week, leading up to the Qwen 2.5 Max launch, the cloud provider released a trio of open vision language models (VLMs) weighing in at 3, 7, and 72 billion parameters.




