Spencer Heckathorn

What breaks when you run long projects through AI agents?

I've been using AI agents for multi-session projects (not one-shot tasks...actual work with phases, dependencies, and handoffs between sessions). Curious what patterns other people are seeing.

Some things that keep breaking for me:

- The agent forgets what happened last session and redoes work.
- It invents new scope nobody asked for.
- It says "done" but never actually verified anything.
- Rules from early in the project silently get dropped as context grows.

Has anyone found good solutions to any of these?

I've been experimenting with externalizing project state to files the agent re-reads each session, which has helped a lot...but I'm curious what approaches others have tried.
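For the curious, the pattern is roughly this (a minimal sketch; the filename and fields here are my own made-up example, not a library or product):

```python
import json
from pathlib import Path

STATE_FILE = Path("project_state.json")  # hypothetical filename

def load_state() -> dict:
    """Read persisted project state at the start of a session."""
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    # Fresh project: seed with the invariants the agent must re-read.
    return {"phase": "planning", "completed": [], "rules": []}

def save_state(state: dict) -> None:
    """Persist state at the end of a session so the next one can resume."""
    STATE_FILE.write_text(json.dumps(state, indent=2))

# End of session 1: record only work that was actually verified.
state = load_state()
state["phase"] = "implementation"
state["completed"].append("schema design (verified: migrations run clean)")
state["rules"].append("never touch the billing module without tests")
save_state(state)

# Start of session 2: the agent's first action is re-reading this file,
# so scope, rules, and completed work survive the context reset.
print(load_state()["phase"])  # → implementation
```

The key bit is that the agent's instructions tell it to read the file before doing anything and write it back before ending, which directly attacks the "forgets last session" and "rules silently dropped" failure modes.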

(Full disclosure: I'm building something in this space and launching here soon. But genuinely want to hear what's working and what's not for people running real projects through agents.)
