AI Models

IBM’s Granite Speech Model Tops Open ASR Hugging Face Leaderboard

Granite was trained on a diverse set of public audio datasets representing multiple dialects and speech contexts.

The Left Shift Bureau

17 Jun 2025

IBM claims its Granite Speech 3.3 8B model took the top spot on Hugging Face’s Open ASR leaderboard, outperforming rivals like OpenAI’s Whisper and Meta’s speech models.

Granite Speech 3.3 8B is a speech recognition model built on IBM's Granite 3.3 8B Instruct base using modality alignment and LoRA fine-tuning.

It achieved an industry-leading average word error rate (WER) of 5.85%, despite being significantly smaller than competing models.
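For readers unfamiliar with the metric, WER is the word-level edit distance between a reference transcript and the model's output (substitutions, deletions, and insertions), divided by the number of words in the reference. The sketch below is a minimal illustration of that definition, not the leaderboard's scoring code; the function name `word_error_rate` is hypothetical, and real evaluations typically normalize casing and punctuation before scoring, which is skipped here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: no text normalization is applied, so casing and
    punctuation differences count as errors.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missed)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    hypothesis = "the cat sat on mat"
    print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
```

Under this definition, lower is better, and a score of 5.85% corresponds to roughly six word errors per hundred reference words, averaged across the benchmark datasets.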

"When tested on several different bodies of audio data, the Granite model had the lowest word error rate (or the highest accuracy), beating out several proprietary models as well," IBM said in a blog post.

Granite was trained on a diverse set of public audio datasets representing multiple dialects and speech contexts—from voicemails to earnings calls—making it highly adaptable to real-world use.

IBM enhanced training by injecting noise and cutting audio segments to boost resilience.
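The article does not describe how IBM implemented these augmentations. As a rough sketch of the general idea only, the example below adds Gaussian noise at a chosen signal-to-noise ratio and takes a random crop of a clip. The function names, the SNR parameter, and the use of plain Python lists for audio samples are all assumptions made for illustration.

```python
import math
import random


def add_noise(samples, snr_db, rng):
    """Return a copy of `samples` with Gaussian noise at roughly `snr_db` dB SNR.

    Hypothetical helper: IBM's actual noise-injection method is not described.
    """
    if not samples:
        return []
    signal_power = sum(s * s for s in samples) / len(samples)
    # SNR (dB) = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


def random_crop(samples, crop_len, rng):
    """Return a random contiguous segment of `crop_len` samples.

    If the clip is shorter than `crop_len`, the whole clip is returned.
    """
    if crop_len >= len(samples):
        return list(samples)
    start = rng.randint(0, len(samples) - crop_len)
    return samples[start:start + crop_len]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    sample_rate = 16_000
    # One second of a 440 Hz tone as a stand-in for real speech audio.
    clip = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
    noisy = add_noise(clip, snr_db=10.0, rng=rng)
    segment = random_crop(noisy, crop_len=8_000, rng=rng)
    print(len(noisy), len(segment))
```

Training on perturbed copies like these is a common way to make a recognizer less sensitive to background noise and truncated utterances.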

The model’s architecture incorporates state-of-the-art components like conformer-based encoders and window query transformers, helping it handle complex speech scenarios.

IBM credits its strong performance to balanced data sampling and acoustic encoder innovations. With this release, IBM underscores its commitment to building human-level speech AI in the coming decade.

Designed for enterprise use, the model excels at transcribing English speech and translating it into French, Spanish, German, Italian, Portuguese, Japanese, and Mandarin.

The model is fully open-source and available on Hugging Face.
