Raptor: Hot-patch, cache, and protect your LLM API. Built in Rust.
Rust-powered AI gateway that actually slaps. Semantic caching: 500ms → 8ms. Semantic firewall: catches jailbreaks/malicious actors by intent, not keywords. Hot-patch: fix hallucinations without redeploying.
One line change. Free tier. Your API bill will thank you.
Maker
We built the "Stripe for Ingestion" so you can stop writing parsers.
Hey Product Hunt!
I’m the maker of Raptor Data.
We built Raptor Data because "Day 2" of building RAG apps is miserable. You build the prototype, but then users start updating documents. Suddenly you're writing complex Python scripts to parse PDFs and manage versions, and racking up huge AWS bills to re-embed 500-page files just because a date changed.
We wanted a "Stripe-like" experience for this. One line of code to handle the dirty work.
What is Raptor?
It's a lightweight, Edge-ready TypeScript SDK (other language support coming soon) that gives you:
1. Git-like Versioning: We hash every chunk and sentence. If you upload v2.pdf, we diff it against v1.pdf and only return the chunks that changed.
2. Universal Parsing: PDF and DOCX. We handle the edge cases, zip bombs, and encoding nightmares.
3. Cost Savings: By only embedding the "diffs," our users see ~90% reduction in Vector DB and OpenAI costs.
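The hash-and-diff idea behind point 1 can be sketched in a few lines of TypeScript. This is a simplified illustration of the technique, not the SDK's actual implementation:

```typescript
import { createHash } from "node:crypto";

// Hash each chunk's normalized text; identical content yields an identical hash.
function hashChunk(text: string): string {
  return createHash("sha256").update(text.trim()).digest("hex");
}

// Compare v2's chunks against the hashes already stored for v1 and
// return only the chunks that are new or changed.
function diffChunks(v1Chunks: string[], v2Chunks: string[]): string[] {
  const seen = new Set(v1Chunks.map(hashChunk));
  return v2Chunks.filter((c) => !seen.has(hashChunk(c)));
}

// Example: only the amended clause needs re-embedding.
const v1 = ["Term: 12 months.", "Fee: $1,000/mo.", "Governing law: NSW."];
const v2 = ["Term: 12 months.", "Fee: $1,200/mo.", "Governing law: NSW."];
console.log(diffChunks(v1, v2)); // → [ 'Fee: $1,200/mo.' ]
```

Anything that survives the hash check is skipped; only the filtered chunks go on to embedding.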
The Stack:
We built the SDK to be isomorphic. It runs in Node, the browser, or Cloudflare Workers. The backend is a heavily optimized FastAPI service tuned for parsing speed.
We have a Free Tier (1k pages). I'd love for you to npm install @raptor-data/ts-sdk and tell me if the DX lives up to the promise.
Happy shipping and thanks for your support!
Maker
The Math That Made Us Build This
I want to share the numbers that convinced us this problem was worth solving.
We analyzed the embedding costs of a typical contract management system:
The Scenario:
- 500 contracts
- Each contract averages 3 versions (negotiation rounds)
- ~1,000 chunks per contract

Traditional RAG approach: every version is re-chunked and re-embedded in full.
What actually changed between versions? Typically only a small fraction of the chunks.
With version-aware processing: you embed v1 once, then only the diffs.
That's 62% waste eliminated. At scale, we've seen teams hit 90%+ savings.
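The arithmetic can be sketched as follows. The ~50 changed chunks per negotiation round is an assumption for illustration; the exact change rate behind the 62% figure will vary by workload:

```typescript
// Back-of-the-envelope embedding counts (assumed workload, not measured data).
const contracts = 500;
const versionsPerContract = 3;
const chunksPerContract = 1000;
const changedPerRevision = 50; // assumption: ~5% of chunks change each round

// Traditional: every version is re-embedded in full.
const naive = contracts * versionsPerContract * chunksPerContract;

// Version-aware: embed v1 once, then only the diffs for later versions.
const diffed =
  contracts * chunksPerContract +
  contracts * (versionsPerContract - 1) * changedPerRevision;

const saved = Math.round((1 - diffed / naive) * 100);
console.log(`${naive} vs ${diffed} embeddings (~${saved}% fewer)`);
// → 1500000 vs 550000 embeddings (~63% fewer)
```

Under these assumptions the sketch lands at ~63%, in the same ballpark as the 62% figure; the exact number depends on how much each negotiation round touches.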
The crazier part? Most teams don't even know they're doing this. They see embedding costs rise linearly with document count and assume that's normal. It's not.
Why This Happens
Most document pipelines treat each upload as isolated. They have no concept of:
"This is version 2 of that contract"
"These 950 chunks are identical to what we already have"
"Only these 50 chunks are actually new"
So they re-process everything. Every time.
We built Raptor to track document lineage automatically. When you upload `contract_v2.pdf`, we:
1. Detect it's related to `contract_v1.pdf`
2. Diff at the chunk level
3. Return only what changed

You only embed the diff.
The Other Problem We Solve
Cost is one thing. Data quality is another.
PyPDF extracts this financial table:
Revenue
$1,000,000
$1,200,000
COGS
$200,000
Your AI sees flat text and has to guess "$1,000,000" means Q1 Revenue. It guesses wrong. You get hallucinations.
Raptor preserves table structure:
| Metric | Q1 2024 | Q2 2024 |
|---------|------------|--------------|
| Revenue | $1,000,000 | $1,200,000 |
Now your AI knows exactly what each number means.
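Structure-preserving extraction boils down to serializing parsed cells together with their headers. A minimal sketch of that serialization step (not Raptor's actual parser):

```typescript
// Render a parsed table as Markdown so every value keeps its row label
// and column header when the text is later chunked for RAG.
function toMarkdownTable(header: string[], rows: string[][]): string {
  const line = (cells: string[]) => `| ${cells.join(" | ")} |`;
  return [line(header), line(header.map(() => "---")), ...rows.map(line)].join("\n");
}

console.log(
  toMarkdownTable(
    ["Metric", "Q1 2024", "Q2 2024"],
    [["Revenue", "$1,000,000", "$1,200,000"]]
  )
);
// → | Metric | Q1 2024 | Q2 2024 |
//   | --- | --- | --- |
//   | Revenue | $1,000,000 | $1,200,000 |
```

The hard part in practice is the upstream step of recovering cell boundaries from the PDF; once you have them, keeping header context in every chunk is cheap.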
Try It
If you're building RAG and want to see what you're currently wasting:
1. Process a document
2. Process an updated version
3. Check the dedup stats
The free tier is 1K pages/month. That's enough to run real tests.
Would love feedback on the SDK experience. We're optimizing for "Stripe-like" DX and want to know if we're hitting the mark.
Maker
Here from Australia 🦘 Let me know if you have any questions!