Tencent Open-Sources Its Hunyuan Text-to-Image Large Model

On May 14, Tencent announced that its Hunyuan text-to-image large model has been fully upgraded and open-sourced. The release is available on Hugging Face and GitHub and covers the complete model, including model weights, inference code, and model algorithms, all free for commercial use by enterprises and individual developers.

It is understood that this is the industry's first open-source text-to-image model with a Chinese-native DiT architecture. It supports bilingual input and understanding in Chinese and English and has 1.5 billion parameters. The upgraded Hunyuan text-to-image large model adopts the same DiT architecture as Sora, so it not only supports text-to-image generation but can also serve as the foundation for multimodal visual generation such as video.

The upgraded Hunyuan text-to-image model adopts a Transformer-based diffusion architecture (Diffusion Transformer, or DiT), which scales better: as the parameter count grows, performance improves, which helps raise both the quality and the efficiency of visual generation. DiT is also the key technology behind Sora, which attracted wide attention earlier.
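
As a rough illustration of what "Transformer-based diffusion" means in practice, the sketch below shows the patchify step that DiT-style models generally use to turn an image latent into a sequence of tokens for a Transformer. It is not taken from Tencent's released code. The names patchify and token_count, the nested-list data layout, and all sizes are assumptions chosen for illustration.

```python
def patchify(latent, patch):
    """Split an H x W x C latent (nested lists) into flattened patch tokens.

    Illustrative only: a DiT-style model feeds tokens like these to a
    Transformer instead of running a convolutional U-Net over the latent.
    """
    height, width = len(latent), len(latent[0])
    if height % patch or width % patch:
        raise ValueError("latent height and width must be divisible by patch")
    tokens = []
    # Walk the latent in non-overlapping patch x patch windows, row-major.
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            token = []
            for row in range(top, top + patch):
                for col in range(left, left + patch):
                    token.extend(latent[row][col])  # append all channels
            tokens.append(token)
    return tokens


def token_count(height, width, patch):
    """Number of tokens the Transformer sees for a given latent size."""
    return (height // patch) * (width // patch)


# Hypothetical 4 x 4 latent with 2 channels per position.
demo_latent = [[[r * 4 + c, -(r * 4 + c)] for c in range(4)] for r in range(4)]
demo_tokens = patchify(demo_latent, patch=2)
```

For a hypothetical 128 x 128 latent and a patch size of 2, token_count returns 4096 tokens. Raising the patch size to 4 cuts that to 1024, trading spatial detail for a shorter sequence.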

Lu Qinglin, head of Tencent’s text-to-image team, said: “Tencent’s development philosophy for Hunyuan text-to-image is pragmatic: it starts from practice and returns to practice. This time we have fully open-sourced the latest-generation model in the hope of sharing Tencent’s practical experience and research results in text-to-image generation with the industry, jointly building an open-source ecosystem for Chinese text-to-image models, and accelerating the development of large models across the industry.”

Tencent’s Hunyuan text-to-image model now has 1.5 billion parameters. Evaluation data shows that the latest model improves on its predecessor by more than 20% and far exceeds the open-source Stable Diffusion model. Among currently open-sourced text-to-image models, it has the best overall performance and reaches an internationally leading level.
