Table flip! After the "siege," Zhipu open-sources its mobile Agent, letting everyone build an AI phone

Wallstreetcn
2025.12.09 03:39

The "smart driving moment" in the mobile phone industry is accelerating. As the technological barriers are leveled, giants may be able to encircle a "Doubao phone," but it is difficult to encircle millions of personalized agents built on open-source frameworks. Zhipu AI stated, "It is not enough for just one company to do this. The original intention of AutoGLM being open-sourced is to turn this capability into a public foundation that the entire industry can jointly own and refine."

One "Doubao" may be besieged, but countless "Doubaos" are on their way.

"Every phone can become an AI phone."

On the morning of December 9th, Zhipu AI, a leading Chinese large-model developer, officially announced that it was open-sourcing its core AI Agent model, AutoGLM. The agent framework has been in development for 32 months and is capable of "Phone Use," that is, operating a phone on the user's behalf.

The industry has read Zhipu's decision to open-source at this moment as a technical "table flip": it means the vision that "everyone can build a Doubao phone" has become technically possible.

This may be the mobile phone industry's "smart driving moment." CITIC Securities commented that AI Agents are to mobile phones what autonomous driving is to cars.

Open-sourcing it means that hardware makers, phone manufacturers, and developers can use AutoGLM to build, on their own devices or systems, an AI assistant that can "understand" the screen and simulate human actions such as tapping, typing, and swiping. AutoGLM currently supports core scenarios in more than 50 high-frequency Chinese apps, including WeChat, Taobao, and Douyin, with automation capabilities similar to those shown in the widely discussed "Doubao phone" demonstration.

The move comes at an extremely delicate and tense moment. Just a week earlier, the "Doubao phone" launched by ByteDance in collaboration with Nubia shook the entire tech world and triggered a collective "stress response" from the internet giants.

Breaking the Wall: From the "Doubao Siege" to "Agents for Everyone"

The catalyst for the event can be traced back to a week ago.

On December 1st, ByteDance and ZTE's Nubia brand launched the nubia M153, equipped with the "Doubao phone assistant" and priced at 3,499 yuan. With system-level permissions, the phone can simulate human operations and cross the isolated islands of individual apps to carry out complex tasks such as ordering takeout, sending WeChat messages, and comparing prices while shopping. The novelty quickly ignited the market, and the first batch sold out instantly. On the Xianyu platform, unopened units were even bid up to 7,999 to 9,999 yuan.

However, this "God-like" cross-application capability quickly struck a nerve in the internet industry. WeChat, Taobao, and several banking apps subsequently activated their defense mechanisms. Users reported that when the Doubao assistant attempted to take over WeChat or Taobao, it ran into abnormal exits, risk warnings, and even account bans. The reason the major companies gave was "security and privacy," but the industry generally believes that this is in essence a life-and-death battle over traffic entry points and control of data.

Just when the industry thought AI phones would go briefly quiet under the tech giants' blockade, Zhipu AI dropped a bombshell.

According to Zhipu's official press release, the AutoGLM project has officially launched on GitHub, open-sourcing the trained core model, Phone Use capability framework and toolchain, as well as a runnable demo covering over 50 high-frequency Chinese apps. Zhipu clearly stated:

“It is not enough for just one company to do this. The primary intention of open-sourcing AutoGLM is to turn this capability into a public foundation that the entire industry can jointly own and refine.”

Market analysts pointed out that the lethality of this move lies in the fact that it transforms a technology originally regarded as a "big company's nuclear weapon" into a tool readily available to all developers. When the technical barriers are flattened, while the giants may be able to encircle a "Doubao phone," it will be difficult to encircle the countless personalized agents built on open-source frameworks.

Source: Zhipu AI official public account, same below

Deconstruction: A "Dimensionality Reduction Strike" at the Technical Level, 32 Months of "Low-Level Breakthrough"

Why is this open-source move described as "flipping the table"? The core reason is that Zhipu chose a technical path that the giants will find difficult to defend against.

According to the technical details released by Zhipu and in-depth analysis within the industry, the technical implementation of AutoGLM has the following disruptive characteristics:

1. Evolution from "Chaos" to "Control":

According to Zhipu, research and development on AutoGLM began in April 2023. The early system often "got lost" during mobile operations, but after 32 months of refinement the team built a complete Phone Use capability framework that abstracts taps, swipes, text input, and interface understanding. In November 2024, by Zhipu's account, AutoGLM sent the first red envelope ever sent from a phone by an AI, not through an API but by the AI truly "understanding" the interface and completing the operation itself.

2. "Dimensionality Reduction Strike" of Visual Large Model + ADB:

Unlike traditional scripts, which rely on accessibility services and are easily banned, AutoGLM issues ADB (Android Debug Bridge) commands at a low level and combines them with a visual large model (AutoGLM-Phone-9B). Its operating logic is "take a screenshot -> analyze it with the large model -> simulate a finger tap." This vision-based, human-like operation makes it extremely difficult for app makers to block through simple code-level detection. As long as the human eye can understand the interface, the AI can operate it.

3. An "ultimate solution" to privacy concerns that bypasses easily blockaded interfaces:

In response to the reasons the major companies gave for the "siege," namely privacy and security, Zhipu offered a clear answer in its open-source announcement: "Technology is open to the entire ecosystem, and data and privacy will always remain on the user's side." The model is trained in cloud-based virtual phones using reinforcement learning algorithms such as MobileRL, while in actual operation AutoGLM supports private deployment, so enterprises and developers can keep data within their own compliant environments. It also supports a local deployment mode in which model inference and data processing are completed on the user's device, so that data does not leave the phone. This architecture directly undermines the legitimacy of the internet giants' siege under the pretext of "privacy leakage."
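The "screenshot -> model analysis -> simulated tap" loop described in point 2 above can be illustrated with a minimal Python sketch. It is not code from the Open-AutoGLM repository: the names Tap, Swipe, TypeText, to_adb_args, and run_task are hypothetical, and the visual model is replaced by a caller-supplied decide function. Only the adb subcommands themselves (exec-out screencap -p, shell input tap / swipe / text) are standard Android tooling.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, List, Optional, Union


# Hypothetical action types standing in for the "taps, swipes, text input"
# abstraction; the real framework's action schema may differ.
@dataclass(frozen=True)
class Tap:
    x: int
    y: int


@dataclass(frozen=True)
class Swipe:
    x1: int
    y1: int
    x2: int
    y2: int
    duration_ms: int = 300


@dataclass(frozen=True)
class TypeText:
    text: str


Action = Union[Tap, Swipe, TypeText]


def to_adb_args(action: Action) -> List[str]:
    """Translate one abstract action into an `adb shell input ...` command."""
    if isinstance(action, Tap):
        return ["adb", "shell", "input", "tap", str(action.x), str(action.y)]
    if isinstance(action, Swipe):
        coords = [action.x1, action.y1, action.x2, action.y2, action.duration_ms]
        return ["adb", "shell", "input", "swipe"] + [str(v) for v in coords]
    if isinstance(action, TypeText):
        # `input text` reads %s as a space. Simplification: other shell
        # metacharacters are not escaped here.
        return ["adb", "shell", "input", "text", action.text.replace(" ", "%s")]
    raise TypeError(f"unsupported action: {action!r}")


def capture_screen() -> bytes:
    """Grab the current screen as PNG bytes from the connected device."""
    result = subprocess.run(
        ["adb", "exec-out", "screencap", "-p"], check=True, capture_output=True
    )
    return result.stdout


def run_adb(args: List[str]) -> None:
    """Execute one adb command, raising if it fails."""
    subprocess.run(args, check=True)


def run_task(
    goal: str,
    decide: Callable[[str, bytes], Optional[Action]],
    capture: Callable[[], bytes] = capture_screen,
    execute: Callable[[List[str]], None] = run_adb,
    max_steps: int = 20,
) -> int:
    """Loop: screenshot -> decide -> act. Returns the number of actions run.

    `decide` is a placeholder for the visual model: it receives the goal and
    the screenshot and returns the next action, or None when the task is done.
    """
    for step in range(max_steps):
        screenshot = capture()
        action = decide(goal, screenshot)
        if action is None:
            return step
        execute(to_adb_args(action))
    return max_steps
```

Because the loop only sees pixels and only emits input events, nothing in it depends on an app's internal APIs, which is the property the article credits for making this approach hard to block. Passing capture and execute as parameters also lets the loop be exercised without a device attached.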

Impact: The "smart driving moment" in the mobile industry

From the perspective of investors and industry development, Zhipu's open-sourcing of AutoGLM is a milestone, because it concerns not only technology but also the reconstruction of business models.

1. The "new competition point" for hardware manufacturers

In a research report dated December 5, CITIC Securities pointed out that AI Agents are to mobile phones what autonomous driving is to cars. Phone manufacturers have long lacked a super entry point that could connect all apps, and the open-sourcing of AutoGLM gives manufacturers such as Honor, Xiaomi, and OPPO, as well as small and medium-sized hardware developers, a ready-made technical foundation. By analogy with the automotive industry's "Huawei + Seres" Smart Selection model, deep partnerships between large-model developers and phone manufacturers are expected to become the norm, and AI phones may see explosive growth similar to that of new energy vehicles.

2. A "forcing mechanism" for the internet ecosystem

For internet giants like Tencent and Alibaba, their moats are facing severe challenges. When users can use AI Agents to bypass an app's homepage recommendations and ad placements and go straight to core services (such as booking tickets or comparing prices), the traffic distribution logic of super apps stops working. Zhipu's open-sourcing makes this capability widely available, leaving the giants with only two choices: keep building higher walls at the risk of a worse user experience, or come to the negotiating table, open their APIs, and co-build a new ecosystem with AI Agents.

3. Empowerment of Individual Developers

Just as the open-sourcing of Linux helped popularize operating systems and the open-sourcing of Stable Diffusion ignited AI painting, the open-sourcing of AutoGLM marks the entry of mobile agents into the "programmable" era. In the future, this framework may give rise to public welfare agents that specifically serve visually impaired people, efficiency agents focused on specific workflows, and even fully personalized private assistants.

Conclusion: The Transition of New and Old Orders

In December 2025, from the "charge" of the Doubao phone to the open-sourcing of Zhipu's AutoGLM, China's internet went through a dramatic upheaval over entry points, traffic, and control in just ten days.

The open-sourcing of AutoGLM effectively returns the choice to users and developers. It signals that the era of relying solely on closed ecosystems and traffic monopolies is coming to an end. Although the current experience may still suffer from delays or instability, the arrival of the agent era is unstoppable.

For the market, this is not just a piece of technical news but a clear signal: the interaction logic of smart terminals is undergoing a fundamental reversal, and a new trillion-dollar market, the on-device intelligent agent economy, has burst open.

Open source address: https://github.com/zai-org/Open-AutoGLM