Alibaba Releases and Open-Sources QwQ-32B Reasoning Model, Challenging Deepseek R1
At 3:00 AM on March 6th, Alibaba released and open-sourced its new reasoning model, Tongyi Qianwen QwQ-32B. Alibaba claims this 32-billion-parameter model's performance rivals that of DeepSeek-R1, which has 671 billion parameters (37 billion activated).
It is reported that QwQ-32B was evaluated across a range of benchmarks covering mathematical reasoning, programming, and general capabilities, and was compared against OpenAI's o1-mini as well as both the full and distilled versions of DeepSeek-R1.
On AIME24, which tests mathematical ability, and LiveCodeBench, which evaluates coding proficiency, Qianwen QwQ-32B performed on par with DeepSeek-R1, significantly outperforming o1-mini and the distilled R1 model of the same size.
Qianwen QwQ-32B also outscored DeepSeek-R1 on three further benchmarks: LiveBench, the 'most difficult LLMs evaluation chart' led by Meta's chief scientist Yann LeCun; IFEval, Google's evaluation set for instruction-following ability; and BFCL, a test proposed by the University of California, Berkeley, and others that evaluates accurate function and tool invocation.
According to the official announcement, this result highlights the effectiveness of applying reinforcement learning to powerful foundation models that have undergone large-scale pre-training. The Alibaba team also integrated agent capabilities into the reasoning model, enabling it to think critically while using tools and to adjust its reasoning process based on environmental feedback.
Beyond the performance gains, another highlight of QwQ-32B is its substantially lower deployment and usage cost: developers and businesses can deploy it locally on consumer-grade hardware.
Since 2023, Alibaba's Tongyi team has open-sourced more than 200 models, including the Qianwen (Qwen) large language models and the Wanxiang (Wan) visual generation models, spanning parameter sizes from 0.5B to 110B and covering all modalities and model sizes.
Recent rankings from the open-source community Hugging Face showed that Alibaba's Wanxiang model, just six days after being open-sourced, had overtaken DeepSeek-R1 to top both the model popularity chart and the model space chart, making it the most popular large model in the global open-source community. According to the latest figures, Wanxiang 2.1 (Wan2.1) has exceeded one million downloads across Hugging Face and the ModelScope community, and has garnered over 6,000 stars on GitHub.
Following the release and open-sourcing of Tongyi Qianwen's latest reasoning model, Alibaba's stock price surged. Its U.S.-listed shares closed up 8.61% overnight at $141.03, and as of this writing, its Hong Kong shares have risen more than 7%. Year-to-date, Alibaba's stock has gained nearly 70%.
SEE ALSO: Chinese Team Unveils AI Agent, Manus