Garry Tan

Cekura - Observe and analyze your voice and chat AI agents

30+ out-of-the-box predefined metrics for analysis of CX, accuracy, conversation, and voice quality. Compile accurate LLM judges by annotating just ~20 conversations, then auto-improve them in Cekura Labs. Real-time, segmented dashboards to identify trends in conversational AI. Smart statistical alerts so that you get notified only when metrics shift from historical baselines. Automated system pings to catch silent production failures.


Replies

Sidhant Kabra

Hi Product Hunt! 👋

We are excited to launch Cekura Monitoring for Voice and Chat AI companies. Most monitoring tools tell you if your AI is up. Cekura tells you if it is behaving.

When we first launched Cekura QA, we thought we had solved the problem for both testing and monitoring. But as our users scaled, we noticed a painful pattern: while pre-production QA was automated, teams were still spending dozens of hours manually listening to thousands of calls.

The two big blockers we saw were:

  1. The Scaling Wall: Defining and optimizing custom metrics was taking too long, forcing teams back into manual spot-checks.

  2. Production Blindspot: Standard LLM metrics miss the customer experience in Voice AI - things like agent tone and customer sentiment that actually define customer success.

We have rebuilt the monitoring layer from the ground up to solve this. Cekura Monitoring turns that "wall of noisy logs" into actionable signals.

🚀 What’s New in Cekura Monitoring:

  • 30+ Predefined Metric Suite: We track what actually breaks Voice and Chat agents across four critical categories:

    • Speech Quality: Voice clarity, pronunciation, and gibberish detection.

    • Conversational Flow: Silences, interruptions (barge-ins), and termination triggers.

    • Accuracy & Logic: Hallucinations, transcription accuracy, and relevancy.

    • Customer Experience: CSAT, Sentiment analysis, and drop-off points.

  • Metric Optimizer: Stop "vibes-based" prompt engineering. Define a metric (e.g., Successful User Authentication), tag 20 calls in our Labs interface, and our optimizer "compiles" a prompt that aligns with your specific feedback.

  • Statistical Intelligence: No more fixed, noisy thresholds. Our Alerting Engine learns your agent's baseline and only pings Slack when metrics shift from historical norms.

  • Automated Cron Jobs: Set up recurring health checks to simulate production conversations. Catch silent failures and regressions before your customers do.

  • Visual Dashboards: Real-time distribution charts for each metric, with views customized for each stakeholder.
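The statistical alerting idea above - learning a per-metric baseline and flagging only significant shifts - can be sketched with a simple z-score check. This is a minimal illustration, not Cekura's actual engine; `should_alert` and its parameters are hypothetical:

```python
from statistics import mean, stdev

def should_alert(history, current, z_threshold=3.0, min_history=20):
    """Alert only when `current` deviates sharply from the historical baseline.

    `history` is a list of past values of one metric (e.g. daily CSAT).
    Hypothetical logic for illustration, not Cekura's actual engine.
    """
    if len(history) < min_history:
        return False  # not enough data to establish a baseline yet
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current != mu  # any deviation from a flat baseline is notable
    z = abs(current - mu) / sigma  # how many standard deviations away?
    return z > z_threshold
```

The point of a learned threshold like this, versus a fixed one, is that a metric which naturally fluctuates (say, CSAT between 0.85 and 0.95) won't page you unless it moves well outside its own normal range.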

Who is this for?

Teams scaling Voice & Chat AI who are tired of listening to calls manually and need a way to prove their agents are actually working.

Sign up and try for free at cekura.ai or drop your questions below! We would love to hear how you’re currently handling Voice and Chat AI in production👇

Rohan Chaubey

@kabra_sidhant Many congratulations on the launch, Sidhant! I've been tracking it since the Vocera days - it's evolved impressively and keeps getting better. Thrilled to see the buzz in voice AI communities, especially on Reddit. Onwards and upwards! :)

Sidhant Kabra

@rohanrecommends Thanks Rohan - also for all your guidance on best practices for Product Hunt Launches!

Janhvi Nandwani

Blind spots in production voice agents are brutal — you don't know your agent is skipping verification steps or missing required disclosures until a compliance team surfaces it weeks later. Monitoring 100% of live calls at the session level rather than spot-checking is the only real fix. The P50/P90 latency tracking and interruption detection on production traffic is also underrated — that's where infrastructure regressions hide.

adarsh raj

We are thrilled to share Cekura Monitoring with the PH community!

Most teams focus solely on whether a voice AI agent reaches the 'correct' outcome, but they often overlook the nuances that actually define the user experience: tone, transcription accuracy, TTS quality, and pronunciation.

While working on scaling to thousands of parallel calls, we realized just how easily these small details can degrade at volume. Cekura was built to ensure your agents don’t just work, but also sound perfect.

Check out the product and let us know what you think!

Om Dahale

One of the most common issues we see voice agent builders run into is an agent that keeps interrupting the caller. It's frustrating for users and easy to miss during development. With our interruption metric, teams can catch this early and fix it before it reaches real users. And that's just one of the many predefined metrics we offer out of the box - try it now!

Nimish Gahlot

How are you different from tracing platforms like Braintrust and Galileo, aside from voice metrics?

Sidhant Kabra

@nimishg We are E2E conversational AI QA. Some of the big differences:

  • We run E2E multi-turn simulations instead of trace-level logging

  • These platforms do not offer a metric optimizer - without one, fine-tuning LLM-as-a-judge metrics takes a huge amount of time

  • We also offer replay of production conversations to ensure the fix is incorporated.

In short, we are very deep and verticalized in conversational AI evals - they are more horizontal, general-purpose agentic AI evals platforms.

Shashij Gupta

@nimishg Braintrust/Galileo are very horizontal, covering all LLM agents. We are specialised for conversations: our UI, metrics, and dashboards are all built specifically for conversational AI.

Ishita Dev

So excited to see this live! 🎉

Been working closely on Cekura's monitoring features and what makes this special is how much it closes the loop for conversational AI teams — you're not just testing in pre-prod and hoping for the best, you're getting visibility into what's actually happening in production calls.

This one's been a long time coming! 🚀

Rahul Hiragond

Really excited to see this out 🎉

Working on alerting and simulation quality made it clear how hard it is to catch subtle regressions early—this is a big step toward making that reliable in production.

Glad to finally have this live 🚀

Nikunj Agarwal

Congratulations on the launch!!

Do you guys also support on prem deployment to ensure privacy?

Sidhant Kabra

@nikunjagarwal321 We support VPC deployments in the customer's instance. Additionally:

  • We sign BAA and DPA with customers

  • We have PII redaction on our side for both audio and transcripts

Mihir Kanzariya

The "is it behaving" vs "is it up" distinction is spot on. We've had AI chat agents pass every health check while giving completely wrong answers to customers. Uptime metrics are useless if the AI is confidently hallucinating.

How granular does the sentiment tracking get? Like can it detect when an agent starts being passive aggressive or gives a technically correct but unhelpful response? That's the stuff that kills user trust slowly.

Sidhant Kabra

@mihir_kanzariya We are currently building turn-level sentiment tracking - it should be live within a week. Currently it gives an overall sentiment score, but not granular feedback on where sentiment turned negative.

We have a metric called relevancy which tests whether the agent's response is relevant to the user's question.

Shashij Gupta

@mihir_kanzariya Sentiment analysis can be made as specific as you want. Our pre-defined metric has 3 states: neutral, positive, negative. But it is straightforward to tune this metric and add many other states. You should be able to create a highly accurate custom metric within 5 minutes.

Rishav Mishra

Are the metrics customizable? For example, I need to define success criteria by peak latency rather than mean latency.

Atul Jain

@rishav_mishra3 Yes, Cekura is modular in a way that lets you go from full automation to full control, depending on your needs.

One of our key features is Python-based metrics with access to all processed data, so you can measure exactly what you care about - for example, peak latency instead of mean latency. We also support defining your own success criteria using a flexible rubric-style configuration.
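As a rough sketch of what such a Python-based latency metric could look like - the function shape and field names here are hypothetical, not Cekura's actual interface:

```python
def latency_metric(turn_latencies_ms):
    """Summarize per-turn response latencies for one conversation.

    Returns mean, P50, P90, and peak, so success criteria can target
    peak latency rather than mean. Hypothetical shape for illustration.
    """
    xs = sorted(turn_latencies_ms)
    n = len(xs)

    def pct(p):
        # nearest-rank percentile over the sorted latencies
        k = max(0, min(n - 1, round(p / 100 * (n - 1))))
        return xs[k]

    return {
        "mean_ms": sum(xs) / n,
        "p50_ms": pct(50),
        "p90_ms": pct(90),
        "peak_ms": xs[-1],  # the worst single turn in the conversation
    }
```

A conversation with one very slow turn illustrates why peak matters: `latency_metric([100, 200, 300, 400, 1000])` has a mean of 400 ms but a peak of 1000 ms, and it's the peak the caller actually feels.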

Shashij Gupta

@rishav_mishra3 yes they are customisable. We expose the code of our latency metric which you can customise to get peak latency instead.
