LLM-as-a-judge based monitoring is not enough for Voice AI
Most teams scaling Voice AI think they can monitor quality with a simple LLM prompt. They are wrong.
An LLM can't hear a "crunchy" voice line, can't accurately measure a 500 ms "barge-in," and struggles with the nuances of real conversational flow.
When we built Cekura Monitoring, we realized we had to go beyond the LLM. We combined Heuristic and Statistical models with our Metric Optimizer to solve the "Scaling Wall."
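To make the "heuristic over LLM" point concrete, here is a minimal sketch of how barge-in latency could be measured directly from speech-event timestamps rather than by asking an LLM to infer it from a transcript. All names and the event shape are illustrative assumptions, not Cekura's actual implementation.

```python
# Hypothetical sketch: a heuristic barge-in check from call event
# timestamps. All field names here are illustrative, not Cekura's API.

from dataclasses import dataclass

@dataclass
class SpeechEvent:
    speaker: str   # "agent" or "user"
    start_ms: int  # when this speaker started talking
    end_ms: int    # when this speaker stopped talking

def barge_in_latencies(events, threshold_ms=500):
    """For each time the user starts speaking while the agent is still
    talking, return (latency_ms, within_threshold), where latency is how
    long the agent keeps talking after being interrupted."""
    results = []
    for e in events:
        if e.speaker != "user":
            continue
        # Find an agent turn still in progress when the user starts.
        for a in events:
            if a.speaker == "agent" and a.start_ms < e.start_ms < a.end_ms:
                latency = a.end_ms - e.start_ms
                results.append((latency, latency <= threshold_ms))
    return results

call = [
    SpeechEvent("agent", 0, 3200),
    SpeechEvent("user", 2500, 4000),  # user interrupts at 2500 ms
]
print(barge_in_latencies(call))  # agent kept talking for 700 ms: too slow
```

A timestamp comparison like this is deterministic and millisecond-accurate, which is exactly what a transcript-only LLM judge cannot provide.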
Vocera is now Cekura: Big Updates 🚀
Hi Product Hunt fam,
Cekura (formerly Vocera) just relaunched with some major upgrades:
- Chat AI Testing & Observability: covers both voice and chat flows
- Auto-Asserted Outcomes: every generated scenario knows its expected result
- Production Replay: simulate real call transcripts for regression checks
- Built-in Quality Metrics: e.g. Instruction Following and Hallucination detection, out of the box
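As a rough illustration of what an auto-asserted scenario might look like, here is a sketch where each generated scenario carries its own expected result. The field names and checker are hypothetical assumptions, not Cekura's actual schema.

```python
# Hypothetical shape of a generated test scenario that carries its own
# expected outcome; field names are illustrative, not Cekura's schema.
scenario = {
    "name": "caller_reschedules_appointment",
    "seed_utterance": "I need to move my Tuesday appointment to Friday.",
    "expected_outcome": {
        "intent": "reschedule",
        "booking_updated": True,
    },
}

def check_outcome(actual: dict, expected: dict) -> bool:
    # A scenario passes when every asserted field matches the agent's result.
    return all(actual.get(k) == v for k, v in expected.items())

actual = {"intent": "reschedule", "booking_updated": True, "latency_ms": 420}
print(check_outcome(actual, scenario["expected_outcome"]))  # True
```

Bundling the assertion with the scenario means regression checks need no per-test human review: replayed transcripts either satisfy the expected outcome or flag a failure.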