
Supports 33 languages! Tencent Hunyuan launches an ultra-quantized compressed translation model
On April 29, Tencent launched the ultra-quantized compressed translation model Hy-MT1.5-1.8B-1.25bit, which supports 33 languages. At only 440MB, the model can run locally on mobile phones without internet access, with translation quality surpassing Google Translate. It is based on Hy-MT1.5, a 1.8-billion-parameter model whose output rivals commercial translation APIs. Tencent also introduced two quantization compression schemes, 2-bit and 1.25-bit, suited to different classes of phones, preserving translation quality while shrinking model size.
According to Zhitong Finance APP, on April 29 Tencent Hunyuan launched the ultra-quantized compressed translation model Hy-MT1.5-1.8B-1.25bit, compressing a translation large model that supports 33 languages down to 440MB, so it can run directly on mobile devices without internet access, with translation quality surpassing Google Translate.
Built on the Hunyuan translation large model Hy-MT1.5, with quality rivaling commercial translation systems
Hy-MT1.5 is a professional translation large model developed by Tencent's Hunyuan team, natively supporting 33 languages, 5 dialects and minority languages, and 1056 translation directions. From everyday Chinese-English translation to French, Japanese, Arabic, and Russian, and even minority languages such as Tibetan and Mongolian, it handles them all with ease.
With only 1.8B parameters, Hy-MT1.5 achieves translation quality comparable to commercial translation APIs and 235B-parameter large models. In rigorous benchmark evaluations, its translation quality not only surpasses mainstream systems such as Google Translate but also demonstrates that, with efficient optimization, lightweight models can deliver impressive translation capability.

Extreme quantization compression: fitting the model onto a phone
Quantization compression, simply put, converts model parameters originally stored as 16-bit numbers into lower-bit representations. It is like compressing a high-resolution photo into a thumbnail: the file is much smaller, but the content is still clearly recognizable. For different classes of phones, Tencent has launched two extreme quantization compression schemes: 2-bit and 1.25-bit.
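As a rough sanity check on the bit widths discussed here, the raw weight storage implied by each can be computed directly. This is back-of-envelope arithmetic only: real checkpoints also carry quantization scales, higher-precision embeddings, and metadata, which is why the on-disk sizes quoted in the article are somewhat larger than these raw figures.

```python
# Back-of-envelope storage arithmetic for a 1.8B-parameter model
# at different bit widths (raw weights only, no scales or metadata).

def packed_size_mb(num_params, bits_per_param):
    """Raw weight storage in megabytes."""
    return num_params * bits_per_param / 8 / 1e6

PARAMS = 1.8e9  # Hy-MT1.5 parameter count

for bits in (16, 2, 1.25):
    print(f"{bits:>5} bits/param -> {packed_size_mb(PARAMS, bits):.0f} MB")
# 16 bits -> 3600 MB, 2 bits -> 450 MB, 1.25 bits -> 281 MB
```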

Chart: Chinese-English translation scores on FLORES-200 for models of different sizes
2-bit model: balancing performance and quality (for mid-to-high-end phone models)
The 2-bit model adopts industry-leading stretching elastic quantization (SEQ), quantizing model parameters to the levels {-1.5, -0.5, 0.5, 1.5}, and combines this with quantization-aware distillation, keeping translation quality nearly lossless while compressing the model to 574MB and outperforming large models hundreds of gigabytes in size. On mobile devices supporting Arm's SME2 technology, the 2-bit model runs even faster and more efficiently.
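A minimal sketch of what 4-level quantization with a shared scale might look like. This is illustrative only: the per-tensor mean-absolute scale and nearest-level rounding used here are assumptions for the sketch, not the published SEQ algorithm.

```python
# Illustrative 4-level quantization: snap each scaled weight to the
# nearest of {-1.5, -0.5, 0.5, 1.5}, so each weight needs only a
# 2-bit code plus one shared scale. Not Tencent's actual SEQ code.
import numpy as np

LEVELS = np.array([-1.5, -0.5, 0.5, 1.5])

def quantize_4level(w):
    """Return 2-bit codes (indices into LEVELS) and a shared scale."""
    scale = np.abs(w).mean() + 1e-12          # assumed per-tensor scale
    codes = np.abs(w[:, None] / scale - LEVELS[None, :]).argmin(axis=1)
    return codes.astype(np.uint8), scale

def dequantize_4level(codes, scale):
    return LEVELS[codes] * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(8).astype(np.float32)
codes, scale = quantize_4level(w)
w_hat = dequantize_4level(codes, scale)       # 4 levels -> 2 bits per weight
```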
1.25-bit model: Sherry extreme compression (for all phone models)
To push lightweighting to the extreme, Tencent launched the 1.25-bit model based on Sherry (sparse efficient ternary quantization). The technique has been accepted at ACL 2026, a top NLP academic conference. The core of the Sherry scheme is a "fine-grained sparsity" strategy: in every group of 4 model parameters, the 3 most important are each stored with 1 bit, while the remaining 1 is set to zero, averaging only 1.25 bits per parameter.

Combined with the STQ kernel that Tencent designed specifically for mobile CPUs, the scheme maps cleanly onto SIMD instruction sets. Ultimately, the original 3.3GB model was compressed to 440MB, running easily in the background and letting ordinary smartphones with tight memory perform high-quality offline translation smoothly.
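To see why such layouts suit SIMD kernels, here is an illustrative 2-bit packing that stores four codes per byte, giving the contiguous byte-aligned buffer a kernel can unpack in bulk. The layout is assumed for illustration; the actual STQ kernel format is not described in the article.

```python
# Illustrative 2-bit packing: four 2-bit codes per byte, little-end first.
# A SIMD kernel can load such bytes in wide registers and shift/mask
# many codes out at once. Layout is an assumption, not the STQ format.
import numpy as np

def pack_2bit(codes):
    """Pack a multiple-of-4-length sequence of 2-bit codes into bytes."""
    c = np.asarray(codes, dtype=np.uint8).reshape(-1, 4)
    return (c[:, 0] | (c[:, 1] << 2) |
            (c[:, 2] << 4) | (c[:, 3] << 6)).astype(np.uint8)

def unpack_2bit(packed):
    """Inverse of pack_2bit: recover the 2-bit codes."""
    p = np.asarray(packed, dtype=np.uint8)
    return np.stack([(p >> s) & 0b11 for s in (0, 2, 4, 6)], axis=1).ravel()
```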
This open-source release includes not only the model weights but also a practical Tencent Hunyuan translation demo, specially adapted for a "background word-lookup mode." Whether you are reading email or browsing the web, Hunyuan translation is available on demand. No internet connection is required and no subscription is needed; everything is processed locally, no personal information is collected or uploaded, and a single download can be used indefinitely.
