Alibaba Tongyi Qianwen tops the global open-source model rankings

Zhitong
2025.04.02 06:33

Alibaba's Tongyi Qianwen recently released the multimodal large model Qwen2.5-Omni, which has topped the model rankings on Hugging Face, the global open-source AI community, making it the top-ranked open-source model worldwide. This is the first time Chinese technology companies have swept the top three spots on the list, underscoring Hangzhou's leadership in AI innovation. Qwen2.5-Omni accepts multiple types of input and generates natural speech, and its small size makes widespread deployment feasible. Close behind are DeepSeek's V3-0324 and Qunhe's SpatialLM-Llama-1B.

According to the Zhitong Finance APP, Hugging Face, the world's largest open-source AI community, updated its large model rankings on April 2. Qwen2.5-Omni, the end-to-end multimodal large model recently open-sourced by Alibaba's (09988) Tongyi Qianwen, topped the overall list, followed closely by DeepSeek-V3-0324 and Qunhe's SpatialLM-Llama-1B. This is the first time Chinese technology companies have swept the top three positions in the global open-source model rankings, highlighting Hangzhou's status as a hub for AI innovation.

The top-ranked Qwen2.5-Omni is an end-to-end multimodal large model that can process text, image, audio, and video inputs simultaneously and generate both text and natural speech output in real time. Compared with closed-source large models that have hundreds of billions of parameters, Qwen2.5-Omni has only 7B parameters, which makes widespread industrial application of multimodal large models feasible. It can even be deployed and run on mobile phones.

SpatialLM, developed independently by Qunhe Technology, is a spatial understanding model that can generate physically accurate 3D scene layouts from a single video. It overcomes the limitations that traditional large language models face in understanding the geometry and spatial relationships of the physical world, and it plays a significant role in giving machines such as robots spatial cognition and analysis capabilities.

DeepSeek's V3-0324, meanwhile, is an update to V3. Although DeepSeek officially describes it as a "minor version upgrade," its tested capabilities are reported to be close to those of a version 3.5, and it performs particularly well in complex logic and multimodal understanding.