Garry Tan

Cekura - Observe and analyze your voice and chat AI agents

30+ predefined metrics out of the box for analysis of CX, accuracy, conversation, and voice quality. Build reliable LLM judges by annotating just ~20 conversations and auto-improve them in Cekura Labs. Real-time, segmented dashboards to identify trends in conversational AI. Smart statistical alerts, so you get notified only when metrics shift from historical baselines. Automated system pings to catch silent production failures.


Replies


Congratulations on the launch, team @Cekura!

Sidhant Kabra

Thanks @manmohit

Satvik Dixit

Thanks a lot @manmohit

Hamoodie Ali

Are the metrics customizable @kabra_sidhant?

Janhvi Nandwani

@humza_sheikh1 You can define Python-based custom metrics in Cekura with direct access to all processed call data, so you can measure exactly what matters to you. You can also define your own success criteria using a rubric-style setup tailored to your use case. The platform is fully modular, so you can go from full automation to fine-grained control depending on what you need.
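To make the idea concrete, here is a minimal sketch of what a rubric-style custom metric over processed call data could look like. The function name, call-data shape, and return format are all assumptions for illustration, not Cekura's actual interface.

```python
# Hypothetical sketch of a rubric-style custom metric.
# Assumes each processed call arrives as a dict with a list of
# transcript turns: {"transcript": [{"speaker": ..., "text": ...}, ...]}.

def insurance_info_collected(call: dict) -> dict:
    """Pass if the agent asked for both the caller's insurance
    provider and member ID at some point in the conversation."""
    required = {"insurance provider": False, "member id": False}
    for turn in call["transcript"]:
        if turn["speaker"] != "agent":
            continue
        text = turn["text"].lower()
        for phrase in required:
            if phrase in text:
                required[phrase] = True
    return {
        "metric": "insurance_info_collected",
        "passed": all(required.values()),
        "missing": [k for k, v in required.items() if not v],
    }
```

The `missing` list makes failures actionable: a dashboard can show which rubric item the agent skipped, not just that the call failed.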

Sidhant Kabra

@humza_sheikh1 @janhvi_nandwani1 Just to add to it - you can even use our predefined metrics, e.g. interruptions, to define your success criteria: if the agent interrupts the customer more than n times in the call, you can flag it as an interruption metric failure.
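As an illustration of that threshold-style rule, here is a small sketch that turns an interruption count into a pass/fail criterion. The event shape and function name are assumptions, not Cekura's API.

```python
# Hypothetical: each interruption event is a dict recording who
# interrupted, e.g. {"by": "agent"} or {"by": "customer"}.

def interruption_failure(events: list, max_allowed: int = 3) -> bool:
    """Return True (metric failure) when the agent interrupted the
    customer more than `max_allowed` times in the call."""
    agent_interruptions = [e for e in events if e["by"] == "agent"]
    return len(agent_interruptions) > max_allowed
```

The same pattern generalizes to any predefined metric with a numeric output (latency, silence duration, etc.): compare the measured value against a per-use-case threshold.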

Vishruth N

Love the speed at which this team ships! I was curious: do you also have plans to roll out observability for image/video agents?

Sidhant Kabra

@vishruth_n Currently we are focused only on the voice and chat modalities. Supporting simulations and observability across modalities is in our vision.

Auren Hoffman

this is super duper cool. future of voice

Sidhant Kabra

Thanks @auren

Satvik Dixit

@auren Thank you so much!!

Konstantin Sagachev

This is something we've been looking for. We deploy voice and chat AI agents for businesses (support, qualification, scheduling) and QA has always been the manual bottleneck — listening to call recordings, checking if the agent followed the script, catching edge cases.

The 30+ predefined metrics and CI/CD integration is exactly what's needed to ship agent updates with confidence. Do you support Vapi-based voice agents out of the box, or does it require custom integration?

Satvik Dixit

@ksagachev Yes, Vapi is supported out of the box, no custom integration needed. Takes <5 min to set up.

Shashij Gupta

@ksagachev We have a very deep integration with Vapi. It should feel seamless.

Sidhant Kabra

@ksagachev We have a native integration with Vapi for sending production conversations, tool calls and to run outbound simulations automatically

Randhir

@kabra_sidhant Congrats on the launch, and great to see how Cekura shifts the focus from "is the AI up?" to "is the AI behaving correctly?" for voice and chat agents. It was a missing layer for teams shipping real-world conversational AI at scale. But how do you handle wildly different voice/chat-agent use cases? Any approach?
Satvik Dixit

@kabra_sidhant @randhir_kumar7 We find that all conversational agents (chat or voice) need similar metrics to evaluate the content of the conversation - metrics like relevancy, hallucination, and customer satisfaction.
Voice agents add complexity, so we have metrics for interruption, latency, pronunciation, and voice quality.
For use-case-specific evaluation (did the agent book the appointment? collect insurance info?), teams can write custom LLM judge metrics in plain English.

Nimesh Chakravarthi

This is a massive launch for such a critical problem in conversational agents today. Curious, what are the most important metrics tracked by customers in the healthcare space?

Satvik Dixit

@nimeshmc Thanks! Healthcare is one of our most active verticals. Expected outcome is critical - did the agent follow required protocols like HIPAA disclaimers, consent, and verification steps?

Hallucination detection is equally important - the agent must not invent symptoms, dosages, or medical advice.

Kumar Abhishek

This feels like Datadog but for AI behavior instead of infrastructure. That's a good positioning. Congratulations!!

Sidhant Kabra

@zerotox Actually, we test both infrastructure (customers run cron jobs) and workflows, but yes, we are building Datadog for conversational AI.

Nuseir Yassin

How do you handle false positives in sentiment or hallucination detection?

Sidhant Kabra

@nuseir_yassin1 That's where our metric optimizer comes in. You can use it not only for your custom metrics but also to give feedback on our predefined metrics when false positives occur, and they auto-improve.

Mikita Aliaksandrovich

Congrats on the launch 🚀
Really important problem to solve!

Sidhant Kabra