The "Day 2" Problem in RAG: Why don't we treat documents like code?

by Taylor Moore

We’ve all built the "Hello World" RAG app. You upload a PDF, chunk it, embed it, and chat with it. It works great.

But what happens on Day 2 when the user uploads Contract_v2.pdf with a single typo fix?

In 90% of pipelines I see, the logic is:

  1. Delete all old vectors.

  2. Re-parse the file.

  3. Re-embed the entire 500-page document.
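The three steps above can be sketched in a few lines. Everything here is a stand-in (the `parse`/`chunk`/`embed` stubs and the in-memory `store` are hypothetical, not any real RAG library) but it makes the cost structure obvious: the number of embedding calls scales with document size, not edit size.

```python
# Sketch of the naive "Day 2" flow: wipe everything, re-parse, re-embed.
# parse/chunk/embed are illustrative stubs, not a real library API.

def parse(text: str) -> str:
    return text

def chunk(text: str, size: int = 20) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(c: str) -> list[float]:
    # Stub "embedding"; in reality this is the expensive API call.
    return [float(sum(map(ord, c)))]

def reindex_naive(doc_id: str, text: str, store: dict) -> int:
    store.pop(doc_id, None)                      # 1. delete all old vectors
    chunks = chunk(parse(text))                  # 2. re-parse the file
    store[doc_id] = [(c, embed(c)) for c in chunks]  # 3. re-embed everything
    return len(chunks)                           # embedding calls paid for

store = {}
v1 = "page one text ... " * 5
calls_v1 = reindex_naive("contract.pdf", v1, store)
v2 = v1.replace("one", "One")                    # a single typo-level edit
calls_v2 = reindex_naive("contract.pdf", v2, store)
# calls_v2 == calls_v1: the full document is re-embedded for one edit
```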

This feels insane to me. We wouldn't re-compile an entire OS just to change one line of code. We use Git. We use incremental builds. Yet in AI, we are burning massive amounts of compute/money re-indexing unchanged data.

The "Stripe for Ingestion" Thesis
We realised that proper document versioning requires serious engineering discipline:

  • Structure-Aware Chunking (to prevent "boundary shifts" where one edit ripples through the whole file).

  • Fuzzy Deduplication (to catch typo fixes without re-embedding).

  • Zero-Retention Security (processing in memory so sensitive docs aren't stored).
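To make the first two bullets concrete, here is a minimal diffing sketch using exact content hashes. It assumes chunk boundaries are stable across versions (which is exactly what structure-aware chunking buys you); the names are illustrative, not Raptor Data's actual API, and real fuzzy dedup would normalize text or compare similarity scores rather than hash exact bytes.

```python
import hashlib

def chunk_hash(c: str) -> str:
    # Exact-match dedup via content hashing; fuzzy dedup would normalize
    # whitespace/case first, or compare embedding similarity instead.
    return hashlib.sha256(c.encode("utf-8")).hexdigest()

def diff_reindex(old_index: dict, new_chunks: list[str], embed):
    """Reuse vectors for unchanged chunks; embed only what changed."""
    new_index, embed_calls = {}, 0
    for c in new_chunks:
        h = chunk_hash(c)
        if h in old_index:
            new_index[h] = old_index[h]   # unchanged chunk: reuse vector
        else:
            new_index[h] = embed(c)       # changed chunk: pay to embed
            embed_calls += 1
    return new_index, embed_calls

# v2 of a document where one chunk out of three was edited.
old = {chunk_hash(c): [0.0] for c in ["alpha", "beta", "gamma"]}
index, embed_calls = diff_reindex(old, ["alpha", "Beta", "gamma"],
                                  lambda c: [1.0])
# embed_calls is 1: only the edited chunk is re-embedded
```

Note the failure mode this exposes: with naive fixed-size chunking, one early edit shifts every later boundary, so every hash changes and you are back to re-embedding the whole file. That is the "boundary shift" problem the first bullet is about.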

We've spent the last few months packaging this into an API (Raptor Data): basically "Git for Embeddings."

The Question:
For those running RAG in production: How are you handling document updates?
Are you eating the cost of re-embedding? Or did you build your own custom "diffing" logic?

I’d love to hear how others are solving this infrastructure bottleneck.
