Realtime AI News

Qwen Releases Qwen3-ForcedAligner-0.6B Model for Speech Alignment

The Qwen team has published Qwen3-ForcedAligner-0.6B-hf on Hugging Face, a forced alignment model supporting Chinese, English, Cantonese, and French.

Published Jun 26, 2026, 16:42 Beijing time

The Qwen team released Qwen3-ForcedAligner-0.6B-hf on Hugging Face on June 26. The forced alignment model is listed under the token-classification pipeline, is tagged for the transformers library, and is distributed in safetensors format. It is designed for alignment tasks in speech recognition.

The model's tags include transformers, safetensors, qwen3_asr, and token-classification, with language support for Chinese (zh), English (en), Cantonese (yue), and French (fr), signaling Qwen's systematic expansion of its speech technology stack.

Forced alignment is a critical component in speech recognition and speech synthesis, precisely matching audio signals to text along a timeline. This release gives developers a specialized tool for tasks such as speech dataset annotation and speech synthesis training data preparation.
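
To make the idea concrete, here is a minimal Python sketch of forced alignment framed as a dynamic-programming problem. It is not Qwen's implementation, which this report does not describe. The function name force_align, the per-frame score matrix, and the 20 ms frame length are all illustrative assumptions. Given a score for every pair of audio frame and transcript token, the sketch finds the in-order assignment of frames to tokens with the highest total score, then reports when each token starts and ends.

```python
from typing import List, Tuple


def force_align(
    log_probs: List[List[float]], frame_ms: int = 20
) -> List[Tuple[int, int, int]]:
    """Toy forced aligner (illustrative only, not the Qwen model's method).

    log_probs[t][n] is an assumed score for transcript token n at audio frame t.
    Returns one (token_index, start_ms, end_ms) tuple per token.
    """
    num_frames = len(log_probs)
    num_tokens = len(log_probs[0])
    if num_tokens > num_frames:
        raise ValueError("need at least one frame per token")

    neg_inf = float("-inf")
    # best[t][n]: best total score for frames 0..t when frame t belongs to token n.
    best = [[neg_inf] * num_tokens for _ in range(num_frames)]
    # advanced[t][n]: True if the best path moved from token n-1 to n at frame t.
    advanced = [[False] * num_tokens for _ in range(num_frames)]
    best[0][0] = log_probs[0][0]

    for t in range(1, num_frames):
        for n in range(num_tokens):
            stay = best[t - 1][n]
            advance = best[t - 1][n - 1] if n > 0 else neg_inf
            if advance > stay:
                best[t][n] = advance + log_probs[t][n]
                advanced[t][n] = True
            else:
                best[t][n] = stay + log_probs[t][n]

    # Walk back from the last frame and last token to recover the frame labels.
    labels = [0] * num_frames
    n = num_tokens - 1
    for t in range(num_frames - 1, -1, -1):
        labels[t] = n
        if advanced[t][n]:
            n -= 1

    # Collapse the frame labels into one time span per token.
    spans = []
    for token in range(num_tokens):
        frames = [t for t, label in enumerate(labels) if label == token]
        spans.append((token, frames[0] * frame_ms, (frames[-1] + 1) * frame_ms))
    return spans
```

In a real system the scores would come from an acoustic model and the tokens from a known transcript. The sketch covers only the alignment step that turns those scores into timestamps, which is the kind of output that dataset annotation and TTS data preparation consume.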

Source: Qwen's official model page on Hugging Face. At the time of publication: 0 downloads, 1 like.

Why it matters

This model fills a gap in Qwen's speech ecosystem for forced alignment, offering practical value for developers working on speech data annotation and TTS pipelines.

Tags: Qwen, Model Release, ASR, Hugging Face

Sources

Source 1: https://huggingface.co/Qwen/Qwen3-ForcedAligner-0.6B-hf