Mercury Edit 2 is a coding-focused diffusion LLM built specifically for next-edit prediction. It uses your recent edits and codebase context to suggest the next change, with higher suggestion-acceptance rates and lower latency than typical code-edit models.
Mercury 2 ditches sequential, token-by-token decoding for parallel refinement. As the first reasoning diffusion LLM, it generates multiple tokens simultaneously to reach 1,000+ tokens/sec. This delivers reasoning-grade quality within the tight latency budgets of agentic loops.
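To make the contrast between sequential decoding and parallel refinement concrete, here is a toy sketch. It is not Mercury's actual algorithm: the known `target` sequence stands in for a model's predictions, and the names `sequential_decode`, `parallel_refine`, and `tokens_per_step` are hypothetical. It only illustrates why filling several positions per step needs fewer steps than committing one token at a time.

```python
import random

MASK = "_"

def sequential_decode(target):
    """Autoregressive-style toy: commit one token per step, left to right."""
    out, steps = [], 0
    for tok in target:
        out.append(tok)
        steps += 1
    return out, steps

def parallel_refine(target, tokens_per_step=4, seed=0):
    """Diffusion-style toy: start fully masked, fill several positions per step.

    Assumption: a real model would predict (and possibly revise) tokens;
    here we simply copy them from `target` to count steps.
    """
    rng = random.Random(seed)
    out = [MASK] * len(target)
    masked = list(range(len(target)))
    rng.shuffle(masked)  # positions are filled in arbitrary order, not left to right
    steps = 0
    while masked:
        batch, masked = masked[:tokens_per_step], masked[tokens_per_step:]
        for i in batch:
            out[i] = target[i]
        steps += 1
    return out, steps

target = "def add ( a , b ) : return a + b".split()
seq_out, seq_steps = sequential_decode(target)
par_out, par_steps = parallel_refine(target)
print(f"sequential: {seq_steps} steps, parallel: {par_steps} steps")
```

Both functions produce the same 12-token sequence; the toy differs only in how many steps it takes to get there, which is the property the throughput claim rests on.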
Mercury, from Inception Labs, is the first commercial diffusion LLM. It is up to 10x faster than autoregressive models, with comparable or better quality on coding tasks.