
Organizing the world's information
4.9•70 reviews•12K followers
TorchTPU






Launched on April 19th, 2026

Google just made TPUs a first-class target for PyTorch, and you barely need to change your code.
The problem: TPUs power Gemini, Veo, and the largest AI clusters on earth, but using them from PyTorch required workarounds, framework rewrites, and deep hardware expertise most teams don't have.
The solution: TorchTPU is a PyTorch-native backend that lets you change one line of initialization and run your existing training loop on TPU, no core logic changes required.
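The "one line of initialization" claim can be pictured as a standard training loop where only the device selection changes. Since TorchTPU is not yet public, the `"tpu"` device string below is an assumption; this sketch runs on CPU and the loop itself is untouched:

```python
import torch
import torch.nn as nn

# Hypothetical: on a TorchTPU install, the claimed one-line change would be
# selecting the TPU device here, e.g. device = torch.device("tpu")
# (device name assumed; the backend is not yet publicly released).
device = torch.device("cpu")

model = nn.Linear(8, 1).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(32, 8, device=device)
y = torch.randn(32, 1, device=device)

# The training loop needs no core logic changes.
for _ in range(5):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()
```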
What stands out:
⚡ Fused Eager mode: Auto-fuses ops on the fly for 50-100%+ speedups with zero user setup
🐛 Debug Eager: Catches shape mismatches, NaNs, and OOM errors one op at a time so you fix bugs faster
🔁 Strict Eager: Async single-op dispatch mirrors the default PyTorch experience for a flat learning curve
🔧 torch.compile via XLA: Peak performance with full-graph compilation, battle-tested for TPU topologies
📦 Custom kernels via Pallas & JAX: Write custom low-level TPU kernels without sacrificing performance
🌐 DDP, FSDPv2, & DTensor supported: Scale distributed training without rewrites
🔀 MPMD support: Divergent code across ranks works without breaking your stack
💾 Shared Compilation Cache: Reduces recompilation overhead across single & multi-host deployments
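Of the modes above, the torch.compile path uses standard PyTorch API today. A minimal sketch, using `backend="eager"` so it runs anywhere without a compiler toolchain; the XLA lowering on TPU is the product's claim and is not shown here:

```python
import torch

def step(x):
    # A toy forward pass; torch.compile captures and optimizes the full graph.
    return torch.relu(x) * 2.0 + 1.0

# backend="eager" keeps this sketch dependency-free; on a TorchTPU install the
# claim is that the captured graph is instead lowered through XLA for TPU.
compiled_step = torch.compile(step, backend="eager")

x = torch.randn(4)
out = compiled_step(x)
```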
On the roadmap for 2026:
- Public GitHub repo with docs and reproducible tutorials
- Dynamic shapes support via torch.compile
- vLLM and TorchTitan integrations
- Linear scaling validated up to full Pod-size TPU infrastructure
- Native multi-queue support for async codebases
Different because it isn't a wrapper or a fork: TorchTPU integrates at PyTorch's PrivateUse1 backend level, so you get ordinary PyTorch tensors on TPU hardware with no subclasses, no rewrites, and no friction.
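PrivateUse1 is PyTorch's documented hook for registering an out-of-tree device as a first-class device type, and TorchTPU reportedly plugs in there. A minimal sketch of the first registration step; the `"tpu"` name is illustrative, and a full backend would also register its kernels and a device module:

```python
import torch

# Rename PyTorch's reserved PrivateUse1 placeholder so the backend gets a
# first-class device-type name ("tpu" here is illustrative, not confirmed).
torch.utils.rename_privateuse1_backend("tpu")

# After renaming, the device string parses like any built-in device type,
# which is why existing code that passes devices around needs no subclasses.
dev = torch.device("tpu")
```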
Perfect for ML engineers and research teams running PyTorch workloads who want to leverage Google TPU infrastructure without abandoning their existing codebase.
P.S. I hunt the latest and greatest launches in tech, SaaS and AI, follow to be notified → @rohanrecommends
@rohanrecommends For a mid-sized research setup, what's the biggest gotcha you've hit when scaling from single-host to multi-pod, and how does the shared cache help there?
honestly the Fused Eager mode is what caught my attention here — getting 50-100% speedups without touching your training loop is pretty wild. been running some PyTorch fine-tuning jobs on A100s and the compile step is always where things get messy. curious how the debug eager mode handles mixed precision edge cases though, that's usually where I spend half my debugging time. the fact that this works at PrivateUse1 level instead of being a wrapper is a huge deal for anyone maintaining custom training pipelines
Running existing PyTorch workloads on TPUs with minimal code changes is compelling — what's the experience like for jobs that depend on custom CUDA kernels? That's typically where XLA/TPU migration breaks down for large training pipelines.