We asked what felt off about AI voices, you told us. We’re fixing it.

by Sourav Sanyal

Over the past few months, we’ve been talking to a lot of you using Velo.

Real conversations: people trying it out, sending clips, pointing things out.

And almost everyone said some version of the same thing: “It sounds like me, but something feels missing.”

At first, we thought it was about accuracy. Maybe the voice wasn’t close enough. But the more we listened, the clearer it became that accuracy wasn’t the issue.

The issue was how it felt. The tone stays a bit too samey. The emphasis doesn’t always land where you expect it to. And the little natural shifts that make your voice yours just aren’t fully there yet. It sounds right, but it doesn’t feel alive.

So we went back and started reworking how we think about voice cloning at Velo. Not just matching how you sound, but capturing how you express yourself. The way your voice changes when you’re explaining something, when you’re just talking casually, or when you actually care about what you’re saying.

That’s what we’re building now. The next version of Velo is focused on higher-fidelity voice cloning. More nuance. Better pacing. More natural expression.

Something that doesn’t feel like a generated voice reading your script, but closer to you actually speaking.

We’re still building it, but it’s coming together fast. We’re planning to ship this soon.

If you’ve used Velo before, we’d love to know: what do you think about Velo’s voice cloning or other workflows? What would make it feel right?

We’re listening.

Replies

Isaac Dominic

What signals define "feels alive" for you?

Sourav Sanyal
@isaac_dominic1 Our internal benchmark is that it feels authentic: exactly how you sound, but with your first take as your best take.
Dontell Levesque

@isaac_dominic1 I feel most alive when I’m learning something new and my curiosity is fully engaged. It’s like my brain lights up and I want to keep going deeper.

Cerca Hedgecock

@isaac_dominic1 I’ve noticed I feel alive when I’m completely absorbed in a task and lose track of time. It’s like my mind stops wandering and I’m just there, doing. No pressure, no distraction, just flow.

Martha S Bako

@isaac_dominic1 For me, feeling alive often comes in quiet moments, not loud ones. Like when I’m walking alone and suddenly realize how peaceful everything feels. It’s subtle but it makes me feel deeply connected to myself.

Morgan Nabors

@isaac_dominic1 I feel alive when I’m creating something, even if it’s imperfect. Writing, planning or building ideas gives me a sense of momentum. It feels like I’m expressing something that already exists inside me.

Victoire Mathieu

What specific moment makes the voice feel less natural to you?

Sourav Sanyal
@victoire_mathieu We’ve been testing this internally. Right now there’s a slight electronic scratch in the voice, and it’s not as emotive as we want it to be yet.
Tina Kim

What kind of content exposes voice limitations the most? @sourav_sanyal

Sourav Sanyal
@tina_kim2 Anything with numbers. LLMs are really bad at handling numbers or numerical operations.
Daisy Morgan

Is this more about data or model architecture?

Sourav Sanyal
@daisy_morgan2 Primarily model architecture.
Dontell Levesque

I think adding more variation in pacing and slight imperfections could make it feel more human and less like a polished recording.

Quico Benford

In what scenarios does the voice feel most alive to you? @sourav_sanyal

Sourav Sanyal
@quico_benford I actually don’t think we know that yet internally either.
Yara Simone

How do you avoid over-smoothing the voice?

Sourav Sanyal
@yara_simone We try to design the voice and add a lot of checkpoints while it is being generated.
Violet Amelia

Does this work across different languages?

Rakesh Gupta

This really captures the gap I feel with most AI voices. The sound is close, but the emotion and timing always feel slightly off.

Alex J Jemmy

Overall, focusing on how it feels instead of how it sounds feels like the right move 👍 If you get that right, it could change how people actually use voice tools.
