
Zhipu AI Launches GLM‑4.5, an Open-Source 355B AI Model Aimed at AI Agents
Chinese startup Zhipu AI (now rebranded as Z.ai) has open-sourced its new flagship model GLM‑4.5, a 355-billion-parameter foundation AI model. The July 28 announcement touts GLM-4.5’s state-of-the-art performance in reasoning, coding, and autonomous “agent” tasks, positioning it as a homegrown challenger to the likes of OpenAI’s GPT-4.
A New Flagship Model Born for AI Agents
Zhipu’s GLM-4.5 is explicitly designed for agent applications – AI systems that carry out autonomous tasks by interacting with tools and environments. It is billed as the first “SOTA-level” (state-of-the-art) native agent model in China. Unlike standard chatbots, GLM-4.5 natively integrates complex logical reasoning, code generation, and interactive decision-making capabilities within a single model. This marks the first time Zhipu has fused multiple core abilities into one AI system, a technical breakthrough aimed at autonomous AI agents.
Technically, GLM-4.5 adopts a Mixture-of-Experts (MoE) architecture, enabling an enormous model scale with improved efficiency. The full GLM-4.5 has 355 billion parameters in total, of which about 32 billion are active for any given query. Zhipu also provides a lighter GLM-4.5-Air variant (106 billion total, 12 billion active parameters) for high-efficiency deployments. This MoE design lets the model achieve high performance without the runtime cost of using all parameters at once, effectively doubling parameter efficiency compared to previous models.
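The key MoE idea described above – many total parameters, few active per token – can be illustrated with a toy routing sketch. This is not Zhipu’s code; the expert count, top-k value, and dimensions are made up for illustration.

```python
# Toy Mixture-of-Experts routing: a router picks the top-k experts per token,
# so only a fraction of total parameters run on each forward pass.
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS = 8   # hypothetical number of experts
TOP_K = 2       # experts activated per token
D = 16          # hidden dimension

# One tiny linear "expert" per slot, plus a router that scores experts.
experts = [rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(N_EXPERTS)]
router = rng.standard_normal((D, N_EXPERTS)) / np.sqrt(D)

def moe_forward(x):
    """Route a token vector x to its top-k experts and mix their outputs."""
    logits = x @ router
    top = np.argsort(logits)[-TOP_K:]          # indices of the chosen experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                   # softmax over chosen experts only
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(D)
out = moe_forward(token)
print(out.shape, f"{TOP_K / N_EXPERTS:.0%} of experts active per token")
```

With GLM-4.5’s reported numbers, the same ratio is roughly 32B active out of 355B total – about 9% of the parameters doing the work on any single query.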
The model was trained on an unprecedented dataset – 15 trillion tokens of general text pre-training, followed by 8 trillion tokens of targeted fine-tuning data for code, reasoning, and agent behavior, with additional reinforcement learning to enhance those skills. GLM-4.5 also supports an extended 128,000-token context window, enabling it to handle very long inputs or multi-step interactions, far exceeding the typical context lengths of models like GPT-4 (8k–32k) or even Claude 2 (100k). Zhipu introduced a dual-mode operation: a “thinking mode” for complex reasoning and tool use, and a “fast response” mode for immediate answers. Developers can toggle between the two to balance depth of reasoning against speed, which is particularly useful for agent use cases that sometimes require chain-of-thought planning.
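The dual-mode toggle might look something like the following in an API request. The `thinking` parameter name and its values here are assumptions for illustration, not Zhipu’s documented API; check the official BigModel documentation for the real field names.

```python
# Hypothetical request payloads illustrating the "thinking" vs "fast response"
# toggle described above. Field names are illustrative assumptions.
import json

def build_request(prompt: str, thinking_enabled: bool) -> dict:
    return {
        "model": "glm-4.5",
        "messages": [{"role": "user", "content": prompt}],
        # Deep reasoning/tool use vs. immediate answers.
        "thinking": {"type": "enabled" if thinking_enabled else "disabled"},
    }

deep = build_request("Plan a multi-step web research task.", True)
fast = build_request("What is 2 + 2?", False)
print(json.dumps(deep, ensure_ascii=False, indent=2))
```

An agent framework could enable thinking mode for planning steps and fall back to fast mode for simple lookups, trading latency for reasoning depth per call.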
Benchmark Performance: Among the World’s Best
Initial evaluations place GLM-4.5 among the top tier of AI models globally. On a suite of 12 representative benchmarks covering knowledge tests (MMLU Pro), mathematics (MATH500), coding (LiveCodeBench), tool-using agent tasks (TAU-Bench), and more, GLM-4.5 achieved an overall score ranking in the top three worldwide. This makes it the highest-ranking model among Chinese-developed and open-source models, leaving a gap only to the very best proprietary models. Zhipu even reports that by the average of these tests, GLM-4.5 is #2 globally, effectively on par with the world’s strongest AI systems like GPT-4. Chinese media noted that the model’s performance is comparable to the world’s top flagship AI models.
Notably, GLM-4.5 shines in coding and “agentic” tasks. In Zhipu’s internal tests simulating software development scenarios, GLM-4.5 was benchmarked against leading coding-focused systems such as Anthropic’s Claude Code, Moonshot’s Kimi K2, and Alibaba’s Qwen-3 Coder. GLM-4.5 demonstrated superior task completion and more reliable tool usage than the other open models, and came very close to Anthropic’s latest Claude 4 Sonnet on many programming challenges. Zhipu acknowledges that Claude 4 still holds a slight edge on certain dimensions, but says GLM-4.5 can already serve as a viable drop-in alternative for most real development scenarios. To back this up, the company publicly released the 52 coding tasks and agent interaction logs used in the evaluation for community scrutiny.
Parameter for parameter, GLM-4.5 is highly efficient. The model uses roughly half the parameters of DeepSeek-R1 and one-third of Kimi K2, yet outperforms both on many benchmarks, indicating better use of model capacity. Zhipu’s smaller 12B-active variant GLM-4.5-Air even surpassed certain larger models – Google’s Gemini 2.5 Flash, Alibaba’s Qwen-3 235B, and Anthropic’s Claude 4 Opus – on some reasoning benchmarks. On a performance-per-parameter plot, the GLM-4.5 series lies on the Pareto frontier, meaning it delivers best-in-class performance for its model size. This emphasis on efficiency could appeal to researchers and enterprises looking to maximize AI capability while managing compute costs.
Advancements Over GLM-4 and Industry Peers
GLM-4.5 represents a significant leap from Zhipu’s previous-generation GLM-4 models. Whereas GLM-4 (released earlier in 2024) offered versions up to 32B parameters and separate variants tuned for different tasks, the new GLM-4.5 consolidates multiple skillsets into one powerhouse model. Zhipu calls it their “first complete implementation” of merging diverse capabilities without sacrificing any of them. In practical terms, GLM-4.5 can handle tasks that previously required a combination of models – from complex problem solving and mathematical reasoning to writing code and controlling tools – all within a single unified system. This reduces the need for task-specific models (like the prior GLM-4-Plus vs. GLM-4-Air variants) and simplifies deployment.
Thanks to the MoE design and optimization, GLM-4.5 achieves double the effective model capacity of GLM-4 for a given computational cost. The inference speed and cost improvements are dramatic: Zhipu advertises GLM-4.5’s API at just ¥0.8 CNY per million input tokens and ¥2 per million output tokens, an order of magnitude cheaper than Claude’s API pricing. (For comparison, Claude 2’s 100k-context API costs about ¥30 per million tokens.) This ~90% cost reduction lowers the barrier for businesses to utilize a GPT-4-class model. Moreover, a high-throughput version of GLM-4.5 can generate over 100 tokens per second, supporting low-latency, high-concurrency applications. By contrast, GLM-4’s API had been priced higher (around ¥5 per million tokens) and could output only a few dozen tokens per second in standard settings. GLM-4.5 thus delivers a major upgrade in both performance and cost-efficiency over its predecessor.
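To make the quoted prices concrete, here is a worked cost calculation using the ¥0.8/¥2 per-million-token rates above. The workload figures are invented for illustration.

```python
# Worked cost example with the prices quoted above:
# ¥0.8 per million input tokens, ¥2 per million output tokens.
INPUT_PRICE = 0.8   # CNY per million input tokens
OUTPUT_PRICE = 2.0  # CNY per million output tokens

def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """API spend in CNY for a given monthly token volume."""
    return (input_tokens / 1e6) * INPUT_PRICE + (output_tokens / 1e6) * OUTPUT_PRICE

# A hypothetical workload: 500M input tokens and 100M output tokens per month.
cost = monthly_cost(500_000_000, 100_000_000)
print(f"¥{cost:,.0f} per month")  # prints: ¥600 per month
```

At the older GLM-4 rate of roughly ¥5 per million tokens for the same 600M-token volume, the bill would be about ¥3,000 – consistent with the ~90% reduction claimed above.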
When stacked against Western peers, GLM-4.5 is aimed squarely at closing the gap. Zhipu’s launch messaging underscores a challenge to OpenAI – the model’s release is timed as a direct bid to rival GPT-4 and Google’s Gemini in capability. While GPT-4 remains proprietary, Zhipu claims GLM-4.5 attains comparable prowess on many benchmarks. The Chinese press also highlighted GLM-4.5’s 355B total parameter count, though thanks to MoE only ~32B parameters are active per query – far fewer than the scale commonly estimated for GPT-4 – yet the model punches above its weight in performance. Likewise, Anthropic’s Claude models have been lauded for long context and strong reasoning; GLM-4.5’s 128k-token context exceeds Claude 2’s 100k window, and Zhipu’s tests indicate GLM-4.5 can nearly replace Claude 4 Sonnet for coding-assistant duties in enterprise settings. As for Google’s Gemini line, Zhipu’s benchmarking suggests GLM-4.5 already competes, with the smaller Air model beating Gemini 2.5 Flash on some reasoning tasks. These comparisons illustrate China’s accelerating progress – open models like GLM-4.5 are closing in on the capabilities of the best closed models from US tech giants.
Enterprise Use Cases and Commercial Potential
By open-sourcing GLM-4.5 under the permissive MIT License, Zhipu has made it available for commercial use by any developer or enterprise. Companies can download the model weights from platforms like Hugging Face and ModelScope, or utilize it via API on Zhipu’s cloud. This broad accessibility positions GLM-4.5 as an attractive foundation model for a variety of real-world applications.
AI Agents and Automation: GLM-4.5’s hallmark is enabling autonomous AI agents. During the launch, Zhipu showcased interactive demos of the model acting as an AI agent across different scenarios. In one demo, GLM-4.5 served as a web research assistant – given a topic, it could perform live searches, analyze the results, and compile an aggregated answer with sources. Another demo featured a simulated social media platform where the model generated content and even manipulated the interface (e.g. posting updates), demonstrating UI control abilities. Impressively, the team built a small Flappy Bird game and had GLM-4.5 power a bot to play it, showing the model handling front-end animation generation and game logic control autonomously. Perhaps the most enterprise-relevant example was an AI tool for automatic slide-deck generation: GLM-4.5 would take a topic, search for relevant materials and images, then dynamically generate a formatted PowerPoint in HTML – complete with well-laid-out text and graphics. The model decided what information to include and how to style the slides, indicating potential for automating content creation and knowledge work.
Software Development: Given its strong coding skills, GLM-4.5 can function as a powerful coding co-pilot or backend code-generation service. It supports code completion, debugging assistance, and multi-step tool usage (e.g. calling an API, then using the result in code). Zhipu reports that GLM-4.5 performed exceptionally in multi-turn coding tasks that involve reading error messages, planning a fix, writing code, and verifying outputs. This makes it suitable for building AI-assisted development tools, from chatbots that help engineers write code to fully automated coding agents that take on defined programming tasks. The model’s reliable tool use (the ability to invoke external functions or APIs when needed) and high task-completion rate were cited as key advantages in this domain. Zhipu even made GLM-4.5 compatible with Anthropic’s Claude Code framework, meaning organizations with workflows or agent systems designed for Claude can easily plug in GLM-4.5 as an alternative AI engine. This interoperability lowers the friction for enterprises to experiment with GLM-4.5 in place of existing solutions.
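The Claude Code compatibility mentioned above typically works by redirecting an Anthropic-style client to a different endpoint. The sketch below shows the general pattern; the endpoint URL is a placeholder, and the exact environment-variable names and base URL should be taken from Zhipu’s own documentation.

```python
# Sketch of pointing an Anthropic-compatible tool at a GLM-4.5 endpoint.
# The base URL below is a placeholder (not a real Zhipu endpoint); the
# variable names follow Claude Code's environment-override conventions.
import os

def glm_env(api_key: str,
            base_url: str = "https://example.invalid/api/anthropic") -> dict:
    """Environment overrides an Anthropic-style client would read."""
    return {
        "ANTHROPIC_BASE_URL": base_url,   # route requests to the GLM-4.5 host
        "ANTHROPIC_AUTH_TOKEN": api_key,  # credential for that host
    }

env = glm_env("your-api-key")
os.environ.update(env)
print(os.environ["ANTHROPIC_BASE_URL"])
```

Because only environment variables change, existing Claude-based agent workflows can be pointed at GLM-4.5 without code modifications – which is what makes the drop-in claim plausible for enterprises.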
Business Intelligence and Long Documents: With its 128k context window, GLM-4.5 can ingest and analyze very large documents – or even sets of documents – in one go. Enterprises could leverage it for summarizing lengthy reports, extracting insights from massive logs or datasets, or conducting in-depth research by feeding the model entire knowledge bases. The model also supports structured output formats like JSON for easy integration into business systems, and it features a “knowledge base retrieval” tool interface to augment its responses with factual references. These features indicate a focus on enterprise needs such as accuracy, interpretability, and integration.
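A minimal sketch of how a business pipeline might consume the structured JSON output described above. The reply string is hand-written stand-in data, not actual model output, and the field names are invented for illustration.

```python
# Consuming a structured JSON reply in a downstream pipeline.
# `raw_reply` is a hand-written stand-in for a model response.
import json

raw_reply = '{"summary": "Q2 revenue grew 12%", "risks": ["supply chain", "FX"]}'

record = json.loads(raw_reply)

# Schema sanity check before handing off to a BI system.
required = {"summary", "risks"}
missing = required - set(record)
if missing:
    raise ValueError(f"model reply missing fields: {missing}")

print(record["summary"], "|", len(record["risks"]), "risks flagged")
```

Validating the parsed structure at the boundary like this keeps malformed model output from propagating into dashboards or databases.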
Cost and Deployment: A major selling point is the low usage cost and flexible deployment options. The official pricing of roughly ¥2.8 CNY per million tokens (input + output combined) is significantly cheaper than OpenAI’s or Anthropic’s APIs for comparable models. For companies that prefer self-hosting due to data privacy or customization needs, GLM-4.5’s open-source nature allows deployment on private infrastructure (though the resource requirements for the 32B-active model are non-trivial). The smaller GLM-4.5-Air lowers that bar and could potentially run on more affordable hardware. Additionally, Zhipu’s high-speed inference optimizations (100+ tokens/sec generation) mean the model can serve interactive applications to many users concurrently without major latency. All of these factors broaden the range of enterprise use cases – from powering consumer-facing chat services and virtual assistants to backend data-analysis pipelines – where GLM-4.5 could be employed with manageable cost and performance.
Strategic Position: China’s AI Ecosystem and Global Competition
Zhipu AI’s GLM-4.5 launch underscores China’s strategic embrace of open-source AI as a means to compete globally. By releasing a model of this scale and sophistication under an open license, Zhipu is “democratizing” advanced AI capabilities, in line with a broader trend among Chinese AI firms. In recent months, several Chinese startups – often dubbed the “AI Six Tigers” (六小虎) – have open-sourced large models to improve and iterate rapidly. For example, Beijing-based Moonshot AI open-sourced its Kimi K2 (a 1-trillion-parameter MoE model) earlier in July, and Step AI (Jieyue Xingchen) released an open version of its latest Step-3 reasoning model. Zhipu’s GLM-4.5 is among the largest and most capable open models to date and positions the company as a front-runner in this open-source wave.
This strategy is partly a response to U.S. players like OpenAI, whose cutting-edge models are closed-source and expensive to access. By offering a free-to-download alternative that approaches GPT-4-level performance, Chinese companies hope to build developer communities and set technical standards in the AI arena. The open-source approach also serves as a “competitive weapon,” allowing up-and-coming firms like Zhipu to gain global traction despite having fewer resources than Big Tech incumbents. It encourages adoption by researchers and enterprises worldwide, potentially tilting network effects in favor of these models over proprietary ones. As COAI News notes, free high-quality models put pressure on established players to reconsider their pricing and access policies, and help Chinese firms capture mindshare among the developers who influence enterprise choices.
Within China, Zhipu is solidifying its role in the national AI ecosystem. The company – a spin-off from Tsinghua University – was one of the first in China to develop a 100-billion-parameter model (GLM-130B in 2022) and has since been recognized as a leading innovator. It even drew attention from OpenAI’s chief Sam Altman; Chinese media noted that Zhipu was “named by OpenAI” as a notable Chinese contender. Backed by major tech giants Alibaba and Tencent as investors, as well as funding from state-backed venture capital, Zhipu carries significant support. Earlier this month, Shanghai’s Pudong VC and Zhangjiang Group invested ¥1 billion (~$140M) in Zhipu as a strategic stake. The company has raised multiple rounds from local governments in Hangzhou, Zhuhai, Chengdu and elsewhere in 2025 alone, aligning with China’s push to foster domestic AI champions. Zhipu is reportedly preparing for an IPO, shifting its listing plan to Hong Kong with a potential ~$300 million raise on the horizon. This infusion of capital would further fuel the compute and research needed to compete with Western AI leaders.
Industry analysts see GLM-4.5’s launch as a pivotal moment in China’s AI race. It comes on the heels of the World Artificial Intelligence Conference 2025 (WAIC) in Shanghai, where the divergence in strategies among China’s AI startups was a hot topic. While some of the “six tigers” have pivoted – e.g. Baichuan Intelligent focusing on medical AI, and ZeroOne stepping back from base-model R&D to concentrate on enterprise services – Zhipu has doubled down on the core foundation-model approach. By open-sourcing GLM-4.5, Zhipu aims to assert technological leadership and ensure its model becomes a platform upon which others build. The move also supports China’s national goal of AI self-reliance; open models can be widely adopted domestically without concern over foreign restrictions. As one Chinese commentary put it, open source has become the “corner overtaking” strategy for China’s AI – a way to leapfrog by harnessing community collaboration and transparency.
The global AI community is taking note as well. GLM-4.5’s release on Hugging Face and ModelScope means international researchers can readily evaluate and fine-tune it. Its performance on English benchmarks (like MMLU and coding tasks) suggests it is competitive not just for Chinese-language applications but globally. If developers around the world begin adopting GLM-4.5 for novel applications, it could increase China’s influence in AI software, much as open-source projects have done in other domains.
Launch Highlights and Outlook
Zhipu unveiled GLM-4.5 in a live online event on the evening of July 28, following WAIC. The launch included live demonstrations of the model’s agent capabilities (from coding to web browsing), which impressed observers with their complexity. GLM-4.5 is immediately accessible: interested users can test the full model for free via Zhipu’s web portals, Zhipu Qingyan (chatglm.cn) and the z.ai platform, which host a chat interface. For developers, the model’s API is available on Zhipu’s BigModel platform, and the complete trained weights can be downloaded from Hugging Face and ModelScope for experimentation. No specific commercial partnerships were announced at the launch, but Zhipu’s low-cost API and compatibility with existing AI-agent frameworks strongly hint at enterprise integrations in the near future.
The reaction in China’s tech circles has been largely positive, with many seeing GLM-4.5 as a sign that local innovation is reaching a new level. Some early Chinese testers even described its performance as “离谱” (ridiculously strong) on certain tasks. Of course, competition is fierce and evolving: Baidu, Huawei, Alibaba, and others are all developing advanced models (many open-sourced as well) to compete in both domestic and international markets.
For Zhipu, GLM-4.5 is a major milestone but also a stepping stone. The company’s roadmap, supported by fresh funding and an anticipated IPO, likely includes further scaling of model capabilities (GLM-5 and beyond) and specialization in domains like multimodal understanding (it recently open-sourced a vision-language model, GLM-4.1V). By staking a claim with an open model that rivals the best from Silicon Valley, Zhipu AI has firmly positioned itself at the vanguard of China’s AI advancement. GLM-4.5 not only gives Chinese enterprises an indigenous, cost-effective alternative for cutting-edge AI, but also signals to the world that China’s open-source AI efforts are accelerating. The global AI community can expect more to come as these models continue to improve and drive competition – ultimately benefiting users and developers with more choices and faster innovation.