fmerian

TwelveLabs Marengo 3.0 - The most powerful embedding model for video understanding

Marengo 3.0 is TwelveLabs' most significant model to date, delivering human-like video understanding at scale. A multimodal embedding model, Marengo fuses video, audio, and text for holistic video understanding to power precise video search and retrieval.

Abdul Rehman

Congratulations to the team! This is a huge milestone for AI applied to real-world media.

Nick Vas

TwelveLabs is impressive in pushing the limits of video AI. It seems powerful and efficient. How does it handle complex scenes to ensure accurate context understanding across different video genres?

Savvas Konsta

Congrats!! We'll definitely be testing it for Oasi.ai too!

Allie Bernacchi

Hey Product Hunt! 👋 This is Allie from @TwelveLabs!

Today we’re launching Marengo 3.0 (M3) — our biggest upgrade yet in multimodal AI.

If you’ve ever tried to build on top of models that say they understand video but collapse on long content, sports, or anything beyond short clips… M3 is built for you.

🚀 What’s M3?

M3 is a unified multimodal foundation model powering our Search API and Embed API.
It understands video, audio, images, and text in a single space — fast, efficient, and built for production.

🔥 Highlights

  • Breakaway speed on long-form video processing — practical at massive scale

  • 💾 512-d embeddings → up to 6× more storage-efficient with top-tier accuracy

  • 🎥 True multimodality across video, audio, image, and text

  • 🌍 Native multilingual support (English, Korean, Japanese, and more)

  • 🏀 Elite sports intelligence: fine-grained action recognition, player tracking, and temporal reasoning

  • 🧠 Handles hour-long videos, long queries, and composed queries (image + text)
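
For a rough sense of what the 512-d storage bullet means in practice, here is a back-of-the-envelope sketch. It assumes float32 values (4 bytes each) and a hypothetical 3072-d baseline; the post does not say which model or dimensionality the "up to 6×" figure is measured against, so both numbers are illustrative assumptions, not TwelveLabs' published comparison.

```python
# Rough storage arithmetic for dense embeddings.
# Assumptions (not from the announcement): float32 values and a
# hypothetical 3072-d baseline to compare the 512-d vectors against.

BYTES_PER_FLOAT32 = 4


def embedding_storage_bytes(num_vectors: int, dims: int,
                            bytes_per_value: int = BYTES_PER_FLOAT32) -> int:
    """Raw bytes needed to store num_vectors dense vectors of size dims."""
    return num_vectors * dims * bytes_per_value


NUM_VECTORS = 1_000_000   # e.g. one embedding per indexed video segment
COMPACT_DIMS = 512        # dimensionality stated in the post
BASELINE_DIMS = 3072      # hypothetical wider embedding for comparison

compact = embedding_storage_bytes(NUM_VECTORS, COMPACT_DIMS)
baseline = embedding_storage_bytes(NUM_VECTORS, BASELINE_DIMS)
ratio = baseline / compact

print(f"512-d:  {compact / 1e9:.3f} GB")
print(f"3072-d: {baseline / 1e9:.3f} GB")
print(f"ratio:  {ratio:.1f}x")
```

Under these assumptions, a million 512-d vectors take about 2 GB of raw storage versus about 12 GB for the 3072-d baseline. Index overhead, metadata, and quantization would change the real numbers.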

💡 What you can build

Search platforms, AI agents that watch content, sports analytics tools, compliance systems, media workflows — anything that needs real video understanding.

🛠️ Try Marengo 3.0

Available via:

  • TwelveLabs SaaS (Search API + Embed API)

  • AWS Bedrock

I’m so proud of the research-first team behind this release — and excited to see what you build with M3.

Ask me anything below 👇

Chilarai M

Really awesome. You have a really cool website

Kshitij Mishra

loved it! keep building more products!

haiji

Hi, can I use it for my game promo video?

Hunter Carter

You couldn't come up with a better name than twelve labs? Surely you realize it will sound like a knockoff of 11 Labs to anyone who has heard of them.

Emily Kurze

@school_4_ants Fun fact: TwelveLabs has been around longer than ElevenLabs. ;) We're friends with them and even did a hackathon with them called 23Labs.

Mykyta Semenov 🇺🇦🇳🇱

Congratulations on the new release! We once made a similar service: we recognized text from videos, translated it, and generated videos with the translation. This way, YouTube bloggers could automatically create videos in 70+ languages. YouTube even officially recommended this service later.

Dana Ram

This is great!! Congrats on the launch

Would love to test this for auto-generating summaries of short films. Does it handle narrative structure well, or is it more optimized for action/object detection?