Tencent Hunyuan and Xiamen University Introduce JarvisEvo, an AI Image Editing Agent

Published: December 26, 2025

Tencent Hunyuan and Xiamen University unveil JarvisEvo, an AI image editor that uses visual feedback (iMCoT) and a self-improving framework (SEPO) to edit photos like a human designer, integrating 200+ professional tools.

Tencent’s Hunyuan large model team, in collaboration with Xiamen University, has released JarvisEvo, an intelligent image-editing agent designed to edit images the way human designers do: looking at the result while adjusting it.

JarvisEvo operates using an Interactive Multimodal Chain-of-Thought (iMCoT) mechanism: it first generates an editing plan, then invokes professional tools (more than 200 are integrated, including Adobe Lightroom), observes the visual results, and decides whether to proceed, revise, or correct its approach. This workflow addresses a major limitation of text-only reasoning chains, which never inspect intermediate results and therefore often lead to “blind editing” and instruction hallucinations.
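
The plan, act, observe, decide cycle described above can be illustrated with a toy sketch. This is not JarvisEvo's code: the "image" is reduced to a single brightness value, and the `TOOLS` registry, the `exposure` tool, the `observe` function, and the step-halving rule are all hypothetical stand-ins chosen only to show how visual feedback lets an agent proceed with, revise, or stop an edit.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Toy stand-in for an image: one brightness value in [0, 1] (assumption).
Image = Dict[str, float]

# Hypothetical tool registry with a single illustrative tool.
TOOLS: Dict[str, Callable[[Image, float], Image]] = {
    "exposure": lambda img, amt: {
        **img,
        "brightness": min(1.0, max(0.0, img["brightness"] + amt)),
    },
}


@dataclass
class Step:
    tool: str
    amount: float
    observed: float  # what the agent "saw" after applying the tool
    decision: str    # "proceed", "revise", or "stop"


def observe(img: Image) -> float:
    """Stand-in for inspecting the intermediate visual result."""
    return img["brightness"]


def edit_loop(
    img: Image, target: float, tolerance: float = 0.02, max_steps: int = 20
) -> Tuple[Image, List[Step]]:
    """Apply a tool, look at the result, then decide what to do next."""
    trace: List[Step] = []
    step = 0.3  # initial plan: a coarse adjustment
    for _ in range(max_steps):
        error = target - observe(img)
        if abs(error) <= tolerance:
            break
        amount = step if error > 0 else -step
        candidate = TOOLS["exposure"](img, amount)
        new_error = target - observe(candidate)
        if abs(new_error) <= tolerance:
            decision, img = "stop", candidate      # result looks right
        elif abs(new_error) < abs(error):
            decision, img = "proceed", candidate   # closer: keep the edit
        else:
            decision = "revise"                    # worse: discard the edit
            step /= 2                              # and plan a finer one
        trace.append(Step("exposure", amount, observe(candidate), decision))
        if decision == "stop":
            break
    return img, trace


if __name__ == "__main__":
    result, steps = edit_loop({"brightness": 0.2}, target=0.7)
    for s in steps:
        print(s)
```

A text-only chain would correspond to committing to the initial plan without ever calling `observe`, which is why it cannot detect an overshoot.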

To enable self-improvement, the research team introduced a Synergistic Editing–Evaluation Policy Optimization (SEPO) framework. The model uses self-evaluation scores as intrinsic rewards while incorporating human-annotated data to calibrate its aesthetic judgment, preventing biased or self-deceptive optimization.

In evaluations conducted on the team’s proprietary ArtEdit dataset, JarvisEvo outperformed baseline models across multiple metrics and received higher scores in human subjective assessments.

Source: liangziwei