Venkat Arvapally's profile on Product Hunt

About

COO and Co-Builder at AI20 Labs, specializing in scaling AI-native architectures and distributed infrastructure. Architect of agentic workflows and enterprise-grade systems, leveraging a deep stack in TypeScript, Node.js, and Next.js. Strategic technology partner dedicated to developing scalable, high-performance solutions for healthcare and electric mobility. Expert in distributed AI inference, focused on driving operational efficiency through agentic automation and Small Language Models (SLMs). Passionate entrepreneur and mentor committed to building the next generation of intelligent, distributed software ecosystems.

Forums

p/edgee • 3mo ago

Token Compression for LLMs: How to reduce context size without losing accuracy

Hey, I'm Sacha, co-founder at @Edgee

Over the last few months, we've been working on a problem we kept seeing in production AI systems:

LLM costs don't scale linearly with usage; they scale with context.
As teams add RAG, tool calls, long chat histories, memory, and guardrails, prompts become huge and token spend quickly becomes the main bottleneck.
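To make the scaling claim concrete, here is a rough back-of-the-envelope sketch. It assumes a stateless chat API where every request resends the full conversation history, plus a fixed per-request overhead for the system prompt, retrieved documents, and tool definitions. The function name and the token figures are illustrative assumptions, not numbers from Edgee.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int, fixed_overhead: int = 0) -> int:
    """Total input tokens sent over a conversation when each request
    resends the whole history (hypothetical helper, illustrative only)."""
    total = 0
    history = 0
    for _ in range(turns):
        # Each turn appends new text to the history...
        history += tokens_per_turn
        # ...and the request carries the fixed overhead plus the full history.
        total += fixed_overhead + history
    return total


# Assumed figures: 200 tokens added per turn, no fixed overhead.
ten_turns = cumulative_input_tokens(10, 200)      # 11000
twenty_turns = cumulative_input_tokens(20, 200)   # 42000

# Same conversation with an assumed 1,500-token overhead per request
# (system prompt, RAG passages, tool schemas).
ten_turns_with_overhead = cumulative_input_tokens(10, 200, fixed_overhead=1500)  # 26000
```

Under these assumptions, doubling the number of turns from 10 to 20 raises input tokens from 11,000 to 42,000, roughly 3.8x rather than 2x, because the history term grows quadratically with conversation length.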

So we built a token compression layer designed to run before inference.
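The post does not describe how the compression layer works, so the following is only a minimal sketch of the general idea of a step that runs before inference. It is not Edgee's method. It assumes the context arrives as a list of text chunks ordered oldest to newest, and it uses a character budget as a crude stand-in for a token budget. The name `compress_context` and its parameters are hypothetical.

```python
import re


def compress_context(chunks: list[str], max_chars: int) -> list[str]:
    """Naive pre-inference compression (illustrative, not Edgee's approach):
    1. collapse runs of whitespace,
    2. drop empty chunks and exact duplicates (first occurrence wins),
    3. keep the most recent chunks that fit in the character budget."""
    seen: set[str] = set()
    cleaned: list[str] = []
    for chunk in chunks:
        text = re.sub(r"\s+", " ", chunk).strip()
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)

    # Walk from newest to oldest and stop once the budget is exhausted.
    kept: list[str] = []
    used = 0
    for text in reversed(cleaned):
        if used + len(text) > max_chars:
            break
        kept.append(text)
        used += len(text)

    kept.reverse()  # restore oldest-to-newest order
    return kept


context = [
    "Refund policy:   30 days.",
    "Refund policy: 30 days.",       # duplicate once whitespace is normalized
    "User asks\nabout refunds.",
]
compressed = compress_context(context, max_chars=100)
```

Here `compressed` holds two chunks instead of three. Whitespace normalization and exact-duplicate removal do not discard information, but the budget cut-off in step 3 is lossy: it drops older chunks outright, so a real system aiming to preserve accuracy would need something more careful than this.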
