Sonny Macaravey

Everyone talks about perfect prompts, but the real problem is memory - say hello to thredly!

I’ve noticed something strange when working with AI models like ChatGPT and Gemini. You can craft the most elegant prompt in the world, but once the conversation runs long, the model quietly forgets what was said earlier. It starts bluffing, filling gaps with confidence, like someone trying to recall a story they only half remember.

That made me rethink what prompt engineering even is. Maybe it’s not just about how you start a conversation, but how you keep it coherent once the context window starts collapsing.

I began testing ways to summarise old messages mid-conversation, compressing them just enough to preserve meaning. When I fed those summaries back in, the model continued as if it had never forgotten a thing.
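The loop described above can be sketched in a few lines. This is a minimal illustration, not thredly's actual code: `summarize()` is a hypothetical stand-in for a real model call, and the message format just mimics the common role/content chat shape.

```python
def summarize(messages):
    # Hypothetical stand-in for an LLM summarization call.
    # Here we naively keep the first sentence of each old message.
    return " ".join(m["content"].split(".")[0] + "." for m in messages)

def compact_history(messages, keep_recent=4):
    """Replace older messages with a single summary message,
    keeping the most recent `keep_recent` messages verbatim."""
    if len(messages) <= keep_recent:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = {
        "role": "system",
        "content": "Summary of earlier conversation: " + summarize(old),
    }
    # Feed the compacted history back in; the model continues
    # as if it still had the full thread in context.
    return [summary] + recent

# Example: a 10-message thread compacts to 1 summary + 4 recent messages.
msgs = [{"role": "user", "content": f"Point {i}. More detail."} for i in range(10)]
compacted = compact_history(msgs, keep_recent=4)
```

The key design choice is where to draw the line between "summarize" and "keep verbatim": too aggressive and you lose the details the model needs, too lax and the context window fills up again.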

That experiment eventually became thredly, a small tool I launched here recently that automates that process, turning long chats into structured memory hand-offs you can reuse anytime.

It turns out memory might be the most underrated part of prompt design. The best prompt isn’t always the one that gets the smartest answer; it’s the one that helps the AI remember what it’s already learned.

Has anyone else tried building their own memory systems or prompt loops to maintain long-term context?


Replies

Esther George
This is such a smart take, memory is definitely the underrated side of AI conversations. I’ve noticed the same thing: no matter how perfect the prompt, once the context window stretches, things start drifting. Sometimes I get so mad and say 'stop hallucinating and follow the rules given here' 😭😭 it pisses me off for real (that's why I hardly ask it to do stuff anymore). Automating the summarization/hand-off like Thredly does seems like a huge time-saver. Have you tested how it performs over really long threads, like 100+ messages? @thredly
Sonny Macaravey

@george_esther Yeah exactly! The longer the thread, the more it starts bending context instead of building on it 😅
That’s the main thing I wanted to fix. thredly keeps structure and reasoning intact well past 100 messages; the summaries stay hierarchical, so when you restart a chat, it re-injects the logic without reloading the noise.
Still fine-tuning how it handles multi-branch threads, but for straight-line ones it’s rock solid.