ANKIT AGARWAL

We built Agent-Corex after hitting 'context bloat hell' with 200+ tools


Hey everyone! 👋

We just shipped Agent-Corex, and I want to share the story of why we built it.

The Problem We Faced:

Six months ago, we were building an LLM agent system that had access to ~200 different tools. We did what seemed logical: we dumped all of them into the system prompt.

It was a disaster.

  • Our API costs exploded (30K tokens per request 😱)

  • Inference was slow (2.3 seconds per response)

  • The LLM kept getting confused about which tool to use

  • We were burning through context windows like crazy

We realized we had a problem: how do you intelligently select which tools to include without manually curating for every scenario?

The Solution:

We built a hybrid ranking system that:

  1. Keyword matches your query against tool names/descriptions (<1ms)

  2. Understands semantics using embeddings to find related tools (50-100ms)

  3. Scores everything using a smart blend (30% keyword + 70% semantic)

Result? Only 5-10 tools per query instead of 200.
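The three steps above can be sketched roughly like this. This is a toy illustration, not Agent-Corex's actual implementation: the "semantic" part here uses a bag-of-words cosine similarity as a stand-in for real embedding similarity, and the tool dicts are hypothetical examples.

```python
from collections import Counter
from math import sqrt

def keyword_score(query, tool):
    """Fraction of query words that appear in the tool's name/description."""
    words = set(query.lower().split())
    text = f"{tool['name']} {tool['description']}".lower()
    return sum(w in text for w in words) / len(words)

def bow_vector(text):
    """Bag-of-words vector; stands in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)  # Counter returns 0 for missing words
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank(query, tools, top_k=5, kw_weight=0.3, sem_weight=0.7):
    """Blend keyword and 'semantic' scores (the 30/70 split from the post), keep top_k."""
    qv = bow_vector(query)
    scored = []
    for tool in tools:
        kw = keyword_score(query, tool)
        sem = cosine(qv, bow_vector(tool["description"]))
        scored.append((kw_weight * kw + sem_weight * sem, tool))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:top_k]]

tools = [
    {"name": "web_search", "description": "search the web for pages"},
    {"name": "calculator", "description": "evaluate arithmetic expressions"},
    {"name": "send_email", "description": "send an email message"},
]
print(rank("search the web", tools, top_k=1)[0]["name"])  # web_search ranks first
```

The nice property of the blend is that exact name matches still win fast, while the semantic score catches tools whose descriptions are related but share no literal keywords.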

The impact:

  • ✅ 68% reduction in API costs

  • ✅ 4.6x faster inference

  • ✅ Same capability (the LLM still has access to everything, just smarter selection)

  • ✅ 95%+ test coverage, production ready

Why Open Source:

We realized this is a problem every team building LLM agents faces. So we open-sourced it (MIT license) with zero dependencies for basic usage.

What We're Looking For:

  1. Early adopters - Try it, break it, tell us what sucks

  2. Use cases - How are you using it? What edge cases are we missing?

  3. Contributions - Better ranking algorithms? Different embedding models? We're all ears

  4. Feedback - Before we build the enterprise version, what features would actually help?

Quick Start:

pip install agent-corex

Then:

from agent_core import rank_tools

# One line to get smart tool selection
relevant_tools = rank_tools(
    query="your task here",
    tools=all_your_tools,
    method="hybrid",
    top_k=5
)

We're at v1.0.1 and this is just the beginning. Would love to hear what you think, especially if you're already dealing with tool selection headaches.

Ask us anything:

  • How does it compare to your current approach?

  • Are there use cases we're not thinking about?

  • What would make this 10x better for your workflow?

Looking forward to building this with the community! 🚀


Replies

Dipjyoti sharma
Very interesting approach 👍 Tool selection is a big pain point in AI workflows. Would love to see how it scales with larger systems.
ANKIT AGARWAL

@dipjyoti_sharma
Appreciate it 🙌 — that's exactly the problem we ran into as well.

Scaling is where things really start to break:

  • tool selection quality drops

  • token usage spikes

  • latency grows with every additional tool

What we've seen so far is that once you cross ~20–30 tools, naive approaches don't hold up anymore.

With Agent-Corex, the focus is on keeping the toolset minimal per request using a retrieval + ranking layer, so even if you have 100s of tools overall, the model only sees a small, relevant subset at runtime.

Still early, but initial tests with larger tool sets are promising — especially in keeping both token usage and latency under control.

Would love to hear how you're handling this on your side if you've worked with larger systems 👀

Dipjyoti sharma
@chillbaba That makes a lot of sense 👍 Keeping the toolset minimal per request is a smart approach. Curious — how does it perform for beginner-level workflows or smaller projects? Also, do you see this being useful for content creators using AI tools (like blogging, SEO, etc.)?
ANKIT AGARWAL

@dipjyoti_sharma
Great question — honestly, this is something I was curious about early on too.

For smaller / beginner workflows, the benefits are still there, just less obvious at first:

  • fewer tools → less confusion for the model

  • more consistent outputs (especially for simple tasks)

  • lower token usage even on basic setups

Where it really starts to shine is when workflows grow a bit:

  • combining multiple tools (APIs, scrapers, content generators, etc.)

  • or chaining steps like research → draft → optimize

For content creators (blogging, SEO, etc.), I actually think it's quite relevant:
instead of exposing every possible tool (keyword research, SERP analysis, writing, editing, publishing), the system can just bring in what's needed per step.

So:

  • "write blog intro" → only writing tools

  • "optimize for SEO" → keyword + SEO tools

That keeps things faster and more predictable, even if the overall system has a lot of capabilities under the hood.

You don't really feel the complexity, but you benefit from it as things scale.
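That per-step routing can be sketched in a few lines. The tool names and trigger keywords below are hypothetical examples, and plain keyword overlap stands in for the real hybrid ranking:

```python
# Toy per-step tool routing for a content workflow.
# Tool names and trigger keywords are made-up examples for illustration.
TOOLS = {
    "outline_writer": {"outline", "structure"},
    "intro_writer":   {"write", "intro", "blog"},
    "seo_optimizer":  {"seo", "optimize", "keywords"},
    "serp_analyzer":  {"serp", "rank", "competitors"},
    "publisher":      {"publish", "schedule"},
}

def tools_for_step(step):
    """Return only the tools whose trigger keywords overlap the step description."""
    words = set(step.lower().split())
    return sorted(name for name, triggers in TOOLS.items() if triggers & words)

print(tools_for_step("write blog intro"))   # ['intro_writer']
print(tools_for_step("optimize for seo"))   # ['seo_optimizer']
```

Each step only ever sees the handful of tools it can actually use, even though the full system knows about all of them.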

Curious — are you using AI more for content workflows or something else?