Alibaba Open-Sources MAI-UI GUI Agent: From 2B to 235B Models with End-Cloud Collaboration Boosting Success Rate 33%

Published: December 31, 2025

Alibaba’s Tongyi Lab has open-sourced the MAI-UI GUI agent in sizes from 2B to 235B. End-cloud collaboration lifts the 2B edge model’s task success rate by 33% and supports cross-app, privacy-focused interaction on AI devices.

Alibaba’s Tongyi Lab has open-sourced MAI-UI, a GUI agent framework, releasing the paper, the code, and models in four sizes (2B, 8B, 32B, and 235B-A22B) that span edge to cloud deployment. The release targets cross-app collaboration and privacy-protected interaction on AI devices.

MAI-UI addresses two limitations of traditional GUI agents. First, it actively asks the user for missing details instead of guessing. Second, it can call external APIs to shorten a task without manual app switching, for example using the Amap API to compare commutes, or the GitHub API to extract commits and email them. Its end-cloud (device-cloud) system assigns tasks dynamically: privacy-sensitive operations stay on the device, and complex ones go to the cloud. According to the report, this raises the 2B edge model’s success rate by 33% and cuts cloud calls by more than 40%, with more than 40% of tasks handled entirely on the device.
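The routing rule described above can be sketched in a few lines of Python. This is a minimal illustration, not MAI-UI's actual implementation: the `Task` fields, the `complexity` score, the 0.6 threshold, and the `route` function are all hypothetical names and values introduced here, and the real system presumably decides with a model, not a fixed rule.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    DEVICE = "device"
    CLOUD = "cloud"


@dataclass(frozen=True)
class Task:
    description: str
    privacy_sensitive: bool   # hypothetical flag: task touches private data
    complexity: float         # hypothetical difficulty score in [0, 1]


def route(task: Task, complexity_threshold: float = 0.6) -> Target:
    """Pick where a task runs (illustrative rule, not MAI-UI's own logic)."""
    # Privacy comes first: sensitive tasks never leave the device,
    # no matter how complex they are.
    if task.privacy_sensitive:
        return Target.DEVICE
    # Non-sensitive tasks above the (assumed) threshold go to the cloud model.
    if task.complexity > complexity_threshold:
        return Target.CLOUD
    # Everything else is handled by the small on-device model.
    return Target.DEVICE


def local_share(tasks: list[Task]) -> float:
    """Fraction of tasks that the rule keeps on the device."""
    if not tasks:
        return 0.0
    local = sum(1 for t in tasks if route(t) is Target.DEVICE)
    return local / len(tasks)


tasks = [
    Task("read a banking SMS code", privacy_sensitive=True, complexity=0.9),
    Task("toggle Wi-Fi", privacy_sensitive=False, complexity=0.1),
    Task("compare commutes across apps", privacy_sensitive=False, complexity=0.8),
]
for t in tasks:
    print(f"{t.description}: {route(t).value}")
```

Checking the privacy flag before the complexity score is the design choice that matters here: a task that is both sensitive and hard still stays local, which matches the article's statement that privacy-sensitive operations do not go to the cloud.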

The reported benchmark results are described as record-setting: a 76.7% success rate on AndroidWorld phone navigation (surpassing Gemini-2.5-Pro), 91.3% accuracy on MMBench-GUI L2, and 73.5% on ScreenSpot-Pro element grounding, all well ahead of comparable models. Even the smallest model, the 2B edge variant, reaches 49.1% navigation success, a 75% improvement over traditional edge models.
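As a quick check on the last figure: the 75% improvement can only be a relative gain, since subtracting 75 percentage points from 49.1% would give a negative baseline. The article does not state the baseline, so the value computed below is inferred from the two published numbers, not a reported result. The function name is a hypothetical helper.

```python
def implied_baseline(new_rate: float, relative_gain: float) -> float:
    """Baseline b such that b * (1 + relative_gain) == new_rate."""
    return new_rate / (1.0 + relative_gain)


# 49.1% success after a 75% relative improvement
baseline = implied_baseline(49.1, 0.75)
print(f"implied baseline success rate: {baseline:.1f}%")
```

Under that reading, earlier edge models would sit at roughly 28% navigation success.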

MAI-UI is now openly available on GitHub and arXiv, so developers can deploy it and build more human-like interaction into AI phones and smart devices.

Source: QbitAI