Launched this week

Edgee
The AI Gateway that TL;DR tokens
210 followers
Edgee compresses prompts before they reach LLM providers and reduces token costs by up to 50%. Same code, fewer tokens, lower bills.
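In practice the drop-in pattern looks roughly like this; a minimal sketch assuming an OpenAI-compatible gateway endpoint (the base URL and key below are hypothetical placeholders, not Edgee's documented values):

```python
from openai import OpenAI

# Point the existing client at the gateway instead of the provider directly.
# Both values here are illustrative placeholders.
client = OpenAI(
    base_url="https://gateway.example.com/v1",  # hypothetical gateway endpoint
    api_key="YOUR_GATEWAY_KEY",
)

long_prompt = "...your usual, verbose prompt..."

# Application code is unchanged: the gateway compresses the prompt before
# forwarding it to the upstream LLM provider, and the response comes back as usual.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": long_prompt}],
)
print(response.choices[0].message.content)
```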
Typeform
As an indie hacker, I'm always afraid of receiving an expensive bill because my AI feature suddenly saw a lot of usage. Anything that can help reduce costs and give me insight into what's going on is welcome.
It's a no-brainer to use it from day one and see value right away.
Congrats to the @sachamorard team for building this!
Thanks a lot @picsoung for the support!
And totally agree! That "unexpected AI bill" fear is real, especially for indie hackers and small teams where one spike can ruin the month.
That's exactly why we built Edgee: so you can get cost visibility + optimizations (like token compression) from day one, before things get out of control.
Really appreciate you hunting and sharing this. Excited to hear what you build with it! š
@sachamorard @picsoung We've heard this from pretty much every CTO and CEO we've talked to in Europe and the US. The end-of-month bill can be a real shock!
Inyo
As a product guy in the agentic platform space, I'm definitely going to keep a close eye on this one. Good luck with the launch!
@yannick_mthy The agentic space is exactly where we're seeing things get interesting (and complex) fast, especially with growing context sizes, tool calls, and multi-model orchestration.
Would love to hear how you're currently handling cost + routing on the agent side. Always keen to learn from teams building in this space. Thanks!
Product Hunt
@curiouskitty Great question, and a totally valid concern!
We're edge-native, so we avoid adding a centralized bottleneck and keep network hops minimal. Edgee is running on more than 100 points of presence around the world, on more than 10k servers, and we already process 3B+ requests a month ;)
Streaming is first-class, and pre-inference workloads run before the model call, so they don't block token streaming.
On reliability: we don't do blind retries. Routing is health-aware, with bounded retries, circuit-breaker behavior, and dynamic deprioritization during brownouts to avoid traffic amplification.
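To make that concrete, here's a minimal sketch of bounded retries with a circuit breaker (illustrative only, not Edgee's actual routing code; class names, thresholds, and cooldowns are assumptions):

```python
import time

class Breaker:
    """Minimal circuit breaker: trip after N consecutive failures,
    then skip the provider until a cooldown elapses."""
    def __init__(self, failure_threshold=5, cooldown_s=30.0):
        self.failures = 0
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.opened_at = None  # set when the breaker trips

    def available(self):
        if self.opened_at is None:
            return True
        return time.monotonic() - self.opened_at >= self.cooldown_s

    def record(self, ok):
        if ok:
            self.failures, self.opened_at = 0, None
        else:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()

def route(providers, request, max_attempts=3):
    """Bounded retries across healthy providers only: unhealthy ones are
    deprioritized (skipped) instead of hammered during a brownout, which
    avoids traffic amplification."""
    attempts = 0
    for send, breaker in providers:  # providers: list of (callable, Breaker)
        if attempts >= max_attempts:
            break
        if not breaker.available():
            continue  # dynamically deprioritized while unhealthy
        attempts += 1
        try:
            result = send(request)
            breaker.record(ok=True)
            return result
        except Exception:
            breaker.record(ok=False)
    raise RuntimeError("no healthy provider responded")
```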
To summarize, Edgee will be to AI what CDNs were to the web.
Happy to go deeper if helpful!
Plezi
Congrats on the launch!
We're stuck on how to attribute LLM costs back to specific features. Does Edgee tag requests so we can track cost per feature?
Hello @benoit_collet, thanks for your interest!
Good question. It's a pain we've experienced ourselves: when cost is only analyzable per API key, you can end up juggling 50 different keys just for the sake of cost categorization.
We built the "tags" feature, which allows you (via API headers or via our SDKs) to define categories automatically. Tags are visible in your analytics dashboard, so you can see exactly where you're spending the most!
You can learn more in our documentation: https://www.edgee.ai/docs/integrations/langchain#tags
The page I linked is part of our LangChain SDK docs and goes deeper into what tags really are.
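For example, tagging a request via headers could look like the sketch below (the header name "x-edgee-tags" and the tag format are assumptions for illustration; the linked docs have the real names):

```python
from openai import OpenAI

client = OpenAI(base_url="https://gateway.example.com/v1", api_key="YOUR_KEY")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this support ticket..."}],
    # Tags flow through to the analytics dashboard, so spend can be broken
    # down per feature instead of per API key. Header name is hypothetical.
    extra_headers={"x-edgee-tags": "feature:ticket-summary,team:support"},
)
```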
PhotoRoom
Congrats on the launch! Will follow closely, as the topic is complex and moves fast!
@olivier_lemarie1 Thank you! Indeed, it's a very exciting and challenging topic, with so many things to explore and improve :D We'll soon publish a series of blog posts going through all the details and the research around compression, so stay tuned!
Batch
@sachamorard Token costs are definitely becoming a real problem once prompts get large (RAG, tools, agents…).
Curious how you handle compression without breaking output quality, especially for structured outputs?
@sachamorard @virtualgoodz Yeah, alignment is a big issue when doing any prompt transformation!
In general, tracking performance across a mix of semantic-preservation metrics (BERTScore, cosine similarity, ROUGE) and making sure they don't degrade below a certain threshold is a good proxy.
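For illustration, that kind of quality gate could look like this (a sketch, not Edgee's evaluation pipeline; the metric mix and thresholds here are assumptions):

```python
from sentence_transformers import SentenceTransformer, util
from rouge_score import rouge_scorer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
rouge = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

def preserves_meaning(original, compressed, cos_floor=0.92, rouge_floor=0.60):
    """Compare original vs. compressed prompts on semantic-preservation
    metrics; reject compressions that fall below the floors."""
    emb = embedder.encode([original, compressed])
    cosine = util.cos_sim(emb[0], emb[1]).item()
    rouge_l = rouge.score(original, compressed)["rougeL"].fmeasure
    # Caller can fall back to the uncompressed prompt if this returns False.
    return cosine >= cos_floor and rouge_l >= rouge_floor
```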
For structured output, things are trickier: the compression shouldn't be "generative", in the sense of re-expressing the content with other tokens. Instead it's deterministic, through a more compact re-encoding of the structure: crushing whitespace, factorizing repetitions, and so on!
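As a toy illustration of that deterministic re-encoding (a sketch of the general idea, not Edgee's encoder): repeated keys in a JSON array of records can be factorized into a header-plus-rows layout, fully reversibly, with no re-wording of any value:

```python
import json

def crush_records(records):
    # [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    #   -> {"cols": ["id", "name"], "rows": [[1, "a"], [2, "b"]]}
    cols = list(records[0].keys())
    return {"cols": cols, "rows": [[r[k] for k in cols] for r in records]}

def expand_records(packed):
    # Exact inverse: rebuild the original list of dicts.
    return [dict(zip(packed["cols"], row)) for row in packed["rows"]]

records = [{"id": i, "name": f"user{i}", "plan": "pro"} for i in range(100)]
packed = crush_records(records)
assert expand_records(packed) == records
print(len(json.dumps(records)), "->", len(json.dumps(packed)))  # far fewer tokens
```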
Glad to discuss this further if need be :D
We're experimenting with cheaper models to control costs, but quality suffers.
Can Edgee help us stay on expensive models but reduce token usage instead?
@pierregodret Yes, that's exactly what Edgee does.
Edgee optimizes your prompts at the edge using intelligent token compression, removing redundancy while preserving meaning, then forwards the compressed request to your LLM provider of choice. You can also tag requests with metadata to track usage/costs and get alerts when spend spikes.
Happy to discuss this further if you'd like.
Absolutely @pierregodret. With our token-compression model, the LLM bill mechanically decreases, so it's actually a good opportunity to afford a slightly more expensive model... for the same price ;)
@sachamorard But how do you ensure that critical context is not lost after compression?
How do you evaluate your model?
This would be a huge gain, but I am sceptical about quality, because two pieces of text might be semantically similar but not mean the same thing.