CY

Half your video isn’t being translated

We just launched on-screen text translation — not just subtitles.

As a quick example, we translated Jensen Huang’s GDC talk into Chinese:

  • audio → Chinese voice, tone preserved

  • on-screen content → localized, style preserved

👉 audio + visuals, fully aligned

https://youtu.be/fw82iPQUCU8

This is one of the core use cases behind our recent launch (#2 Product of the Month, right behind Google’s Stitch 2).

More examples coming — curious what you think.

Replies

Oliver Nathan

This feels way more complete than subtitle-only localization. How big a difference has on-screen text translation made in output quality?

CY

@oliver_nathan2 Bigger than expected.

On-screen text often carries info that never shows up in the audio—and when it’s not translated, it breaks immersion instantly.

Once that layer is localized, it finally feels complete.

Paige Lauren

Audio translation gets most of the attention, but visual text is usually what breaks immersion. Was that the main gap you wanted to solve?

CY

@paige_lauren1 Yeah—that was a big part of it. Audio gets most of the attention, but on-screen text often carries key info that never makes it into the voice.

Jade Melissa

"Audio + visuals fully aligned" is the part that stands out most here. What was the hardest part technically?

CY

@jade_melissa1 The alignment itself. Solving that meant treating video as a multimodal problem: not just voice, but visuals + layout together.

Ian Maxwell

This seems like one of those features that feels obvious once it exists. What made you prioritize it now?

CY

@ian_maxwell2 It felt obvious in hindsight, but it was much harder to actually solve.

We finally worked out an AI approach that can handle both audio and on-screen elements together—but getting there took longer than expected. The edge cases around layout + visuals are no joke.

Felt like the right time once the tech actually caught up.

CY

@chrismessina thanks again for the push to share more examples here — and again for hunting us 🙏

This Jensen video is one of the first concrete use cases from the launch. We’ll share more in the coming weeks.

Miles Anthony

Curious how you handle layouts. Does preserving style across different languages create a lot of edge cases?

CY

@miles_anthony2 Yeah, it does. Keeping style consistent across languages is surprisingly hard.

Oliver Nathan

This feels especially useful for global YouTube content. Are longer-form videos the strongest use case right now?

CY

@oliver_nathan3 Yes, YouTubers are one of the strongest use cases.

Sadie Charlotte

Translating on-screen content probably changes the perceived quality a lot more than people expect. Have users reacted more to that than the voice layer?

CY

@sadie_charlotte1 Great question — we’re still figuring that out. Early signals suggest people really notice the on-screen text layer more than expected, but it’s a bit hard to separate from the overall “everything feels more native” effect.

Gaurav Singh

This is solving a real pain point for performance marketers running international campaigns.

The problem with "translated" video ads right now: you dub the voice, add subtitles, and call it done — but the on-screen text still says "Limited Time Offer" in English while the voiceover is in Thai. The disconnect kills trust and conversion rates in non-English markets.

Building ad-vertly.ai, we work with brands running paid social across markets and the localization gap is consistently one of the top creative complaints. A "translated" ad that's only 70% translated isn't really localized — it's just accented.

Full-layer translation (audio + on-screen + subtitles all in sync) is table stakes for any brand taking APAC or LATAM seriously. The Jensen Huang example is a perfect demo because it shows how much on-screen text carries the actual meaning of a technical talk.

Curious: are you seeing more adoption from marketing/creative teams or from content creators? I'd expect the use case to be quite different between the two.

Farhad Asbaghipour

This is a big gap most people overlook.

Subtitles alone don’t carry tone, context, or on-screen meaning — especially for content-heavy videos.

Full localization (voice + visuals + intent) feels like the real unlock for global reach.