We built Agent-Corex after hitting 'context bloat hell' with 200+ tools

by•11d ago

Hey everyone! 👋

We just shipped Agent-Corex, and I want to share the story of why we built it.

The Problem We Faced:

Six months ago, we were building an LLM agent system that had access to ~200 different tools. We did what seemed logical: we dumped all of them into the system prompt.

It was a disaster.

Our API costs exploded (30K tokens per request 😱)
Inference was slow (2.3 seconds per response)
The LLM kept getting confused about which tool to use
We were burning through context windows like crazy

We realized we had a problem: how do you intelligently select which tools to include without manually curating for every scenario?

The Solution:

We built a hybrid ranking system that:

Keyword matches your query against tool names/descriptions (<1ms)
Understands semantics using embeddings to find related tools (50-100ms)
Scores everything using a smart blend (30% keyword + 70% semantic)

Result? Only 5-10 tools per query instead of 200.

The impact:

✅ 68% reduction in API costs
✅ 4.6x faster inference
✅ Same capability (the LLM still has access to everything, just smarter selection)
✅ 95%+ test coverage, production ready

Why Open Source:

We realized this is a problem every team building LLM agents faces. So we open-sourced it (MIT license) with zero dependencies for basic usage.

What We're Looking For:

Early adopters - Try it, break it, tell us what sucks
Use cases - How are you using it? What edge cases are we missing?
Contributions - Better ranking algorithms? Different embedding models? We're all ears
Feedback - Before we build the enterprise version, what features would actually help?

Quick Start:

pip install agent-corex

Then:

from agent_core import rank_tools

# One line to get smart tool selection
relevant_tools = rank_tools(
    query="your task here",
    tools=all_your_tools,
    method="hybrid",
    top_k=5
)

We're at v1.0.1 and this is just the beginning. Would love to hear what you think, especially if you're already dealing with tool selection headaches.

Ask us anything:

How does it compare to your current approach?
Are there use cases we're not thinking about?
What would make this 10x better for your workflow?

Looking forward to building this with the community! 🚀

8 views

Replies

Best

Very interesting approach 👏 Tool selection is a big pain point in AI workflows. Would love to see how it scales with larger systems.

Report

11d ago

@dipjyoti_sharma
Appreciate it 🙌 — that’s exactly the problem we ran into as well.

Scaling is where things really start to break:

tool selection quality drops
token usage spikes
latency grows with every additional tool

What we’ve seen so far is that once you cross ~20–30 tools, naive approaches don’t hold up anymore.

With Agent-Corex, the focus is on keeping the toolset minimal per request using a retrieval + ranking layer, so even if you have 100s of tools overall, the model only sees a small, relevant subset at runtime.

Still early, but initial tests with larger tool sets are promising — especially in keeping both token usage and latency under control.

Would love to hear how you’re handling this on your side if you’ve worked with larger systems 👀

Report

10d ago

@chillbaba That makes a lot of sense 👏 Keeping the toolset minimal per request is a smart approach. Curious — how does it perform for beginner-level workflows or smaller projects? Also, do you see this being useful for content creators using AI tools (like blogging, SEO, etc.)?

Report

10d ago

@dipjyoti_sharma
Great question — and honestly this is something I was curious about too early on.

For smaller / beginner workflows, the benefits are still there, just less obvious at first:

fewer tools → less confusion for the model
more consistent outputs (especially for simple tasks)
lower token usage even on basic setups

Where it really starts to shine is when workflows grow a bit:

combining multiple tools (APIs, scrapers, content generators, etc.)
or chaining steps like research → draft → optimize

For content creators (blogging, SEO, etc.), I actually think it’s quite relevant:
instead of exposing every possible tool (keyword research, SERP analysis, writing, editing, publishing), the system can just bring in what’s needed per step.

So:

“write blog intro” → only writing tools
“optimize for SEO” → keyword + SEO tools

That keeps things faster and more predictable, even if the overall system has a lot of capabilities under the hood.

You don’t really feel the complexity, but you benefit from it as things scale.

Curious — are you using AI more for content workflows or something else?

Report

10d ago