OpenAI Launches gpt-realtime: Its Most Advanced Speech-to-Speech AI Model Yet
To enhance personalisation, OpenAI has introduced two exclusive new voices, Cedar and Marin, available only with gpt-realtime.
 
OpenAI has unveiled gpt-realtime, its most advanced speech-to-speech AI model, promising a leap forward in natural and expressive voice interactions.
The model, now available through the Realtime API, is designed to power next-generation voice agents with unprecedented speed, accuracy, and fluency.
This follows the release of GPT-5, OpenAI's most advanced model so far. Around the same time, the San Francisco-based startup also released two open-weight models.
Unlike traditional systems that stitch together separate speech-to-text and text-to-speech pipelines, gpt-realtime directly processes and generates audio within a single integrated model, reducing latency and preserving nuance in speech.
This breakthrough allows conversations to feel more fluid and human-like, whether for customer support, real-time translation, or interactive voice assistants.
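The difference between the two designs can be made concrete with a toy sketch. The Python below is not OpenAI's implementation and calls no real API: the `Audio` type, the stub functions, and the "hop" count are all hypothetical stand-ins. It assumes only what the article describes, namely that a stitched pipeline passes plain text between stages while a single model works on audio end to end.

```python
from dataclasses import dataclass


@dataclass
class Audio:
    words: str  # what was said
    tone: str   # how it was said (toy stand-in for prosody and emotion)


# Cascaded pipeline: three separate stages with plain text in between.
def speech_to_text(audio: Audio) -> str:
    # Only the words survive this boundary; the tone is discarded.
    return audio.words


def text_model(text: str) -> str:
    # Hypothetical stub for a text-only language model.
    return f"reply to: {text}"


def text_to_speech(text: str) -> Audio:
    # The synthesizer never saw the caller's tone, so it uses a default.
    return Audio(words=text, tone="neutral")


def cascaded(audio: Audio) -> tuple[Audio, int]:
    hops = 3  # one model call per stage
    return text_to_speech(text_model(speech_to_text(audio))), hops


# Single speech-to-speech stage: audio in, audio out.
def speech_to_speech(audio: Audio) -> tuple[Audio, int]:
    hops = 1  # one model call in total
    # The tone is still available, so this toy reply simply mirrors it.
    return Audio(words=f"reply to: {audio.words}", tone=audio.tone), hops


if __name__ == "__main__":
    user = Audio(words="where is my order", tone="frustrated")
    print(cascaded(user))
    print(speech_to_speech(user))
```

In this sketch the cascaded path answers in a neutral tone after three model calls, while the single-stage path keeps the caller's tone and makes one call. Real latency and expressiveness depend on the actual models and network, which the toy does not measure.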
OpenAI Developers (@OpenAIDevs) posted on August 28, 2025:
"The Realtime API is officially out of beta and ready for your production voice agents!
We’re also introducing gpt-realtime—our most advanced speech-to-speech model yet—plus new voices and API capabilities:
🔌 Remote MCPs
🖼️ Image input
📞 SIP phone calling
♻️ Reusable prompts"
The model also demonstrates significant improvements in following complex instructions, interpreting system prompts, and switching seamlessly between languages mid-sentence.
It can handle precise tasks such as reading legal disclaimers verbatim or repeating alphanumeric strings—key features for enterprise-grade applications.
“Voice AI is moving beyond novelty to production-ready systems,” OpenAI said in its announcement. “gpt-realtime brings the naturalness, reliability, and low latency needed to deploy at scale.”