gpt-realtime-1.5 by OpenAI - Tighter instruction adherence in speech agents
Voice workflows just got stronger with gpt-realtime-1.5 in the Realtime API. The model offers more reliable instruction following, tool calling, and multilingual accuracy.
@rohanrecommends Solid update. Multilingual accuracy plus stronger dialog completion makes a real difference for support teams.
The team at @OpenAI shipped an interesting update!
gpt-realtime-1.5 is OpenAI's flagship audio model for voice agents and customer support.
Voice workflows just got stronger with gpt-realtime-1.5 in the Realtime API. The model offers more reliable instruction following, tool calling, and multilingual accuracy.
A +5% lift on Big Bench Audio and double-digit gains in alphanumeric transcription are not cosmetic improvements; they directly affect real-world reliability in production voice systems.
What stands out most from the early partner results (@Genspark, @Sendbird):
66% human connection rate (up from 43.7%)
97.9% perfect score across scored conversations
Problem case rate cut in half
Stronger dialog completion
Those numbers point to better instruction adherence, cleaner tool calls, and more stable turn-taking, exactly what voice agents have historically struggled with.
Low latency, stronger interruption handling, and improved multilingual accuracy make this feel less like a demo upgrade and more like infrastructure maturing for enterprise use.
Excited to see what builders ship on top of this.
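For a sense of scale on the connection-rate figure quoted above (66%, up from 43.7%), here is a quick sketch of the arithmetic. It assumes both numbers are rates measured in a comparable way, which the post does not confirm; the `lift` helper is a hypothetical name used only for illustration.

```python
def lift(before: float, after: float) -> tuple[float, float]:
    """Return (change in percentage points, relative change in percent)."""
    absolute = after - before           # difference between the two rates
    relative = absolute / before * 100  # change expressed relative to the old rate
    return absolute, relative


# Figures taken from the partner results in the comment above.
# Assumption: both are percentages measured on comparable samples.
points, percent = lift(43.7, 66.0)
print(f"{points:.1f} percentage points, {percent:.1f}% relative increase")
# prints "22.3 percentage points, 51.0% relative increase"
```

So the jump is roughly 22 percentage points, or about a 51% relative increase over the earlier rate. The two framings are easy to mix up when comparing launch numbers.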
@rohanrecommends This is super awesome. Congrats on your launch!
BrandingStudio.ai
Instruction adherence in real-time voice is the unsexy problem that actually determines whether voice agents ship to production or stay in demos. Good to see this getting serious attention.
The multilingual accuracy improvement is the one I'm most curious about. Does it hold up equally across languages or are some still significantly behind English? That gap tends to be what blocks voice AI from working in non-US markets.
Building AI-powered products myself and the tool calling reliability in voice workflows is genuinely one of the harder problems to solve well. A model that actually follows complex instructions mid-conversation without derailing is a big unlock. Congrats on the launch! 🎙️
How are you validating real user behavior at OpenAI right now?