Over the last few months, we've been working on a problem we kept seeing in production AI systems:
LLM costs don't scale linearly with usage; they scale with context. As teams add RAG, tool calls, long chat histories, memory, and guardrails, prompts balloon and token spend quickly becomes the main cost bottleneck.
So we built a token compression layer designed to run before inference.
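To make the idea concrete, here is a minimal sketch of what a pre-inference compression pass can look like: a greedy assembler that spends a fixed token budget on the highest-value context before the request reaches the model. The `compress_prompt` function, the `budget` parameter, and the relevance-scored chunks are illustrative stand-ins, not our actual API; only `tiktoken` is a real library here.

```python
# Illustrative sketch of a budget-based prompt compressor.
# compress_prompt() and its heuristics are hypothetical, not the product API.
import tiktoken

ENC = tiktoken.get_encoding("cl100k_base")  # tokenizer for OpenAI-style models

def count_tokens(text: str) -> int:
    return len(ENC.encode(text))

def compress_prompt(
    system: str,
    history: list[str],
    chunks: list[tuple[float, str]],  # (relevance_score, text) pairs from RAG
    budget: int = 4000,
) -> str:
    """Assemble a prompt under a fixed token budget.

    Keeps the system prompt intact, then spends the remaining budget on
    retrieved chunks in descending relevance order, then on the most
    recent chat turns. Anything that does not fit is dropped.
    """
    parts = [system]
    remaining = budget - count_tokens(system)

    # Highest-relevance retrieved chunks first.
    for _score, chunk in sorted(chunks, key=lambda c: c[0], reverse=True):
        cost = count_tokens(chunk)
        if cost <= remaining:
            parts.append(chunk)
            remaining -= cost

    # Walk history newest-first so recent turns win the budget,
    # then re-insert the kept turns in chronological order.
    kept: list[str] = []
    for turn in reversed(history):
        cost = count_tokens(turn)
        if cost <= remaining:
            kept.append(turn)
            remaining -= cost
    parts.extend(reversed(kept))

    return "\n\n".join(parts)
```

Greedy truncation like this is only the baseline: a real compression layer can also deduplicate, summarize older turns, or rewrite retrieved chunks. But the shape is the same, and it runs entirely before inference, so the model only ever sees the compressed prompt.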