Alexander Tibbets

Keystone - Teach your repo how to run itself

Keystone configures a working dev container for any git repo, all on its own. Give it a repo and get back a Dockerfile, a devcontainer.json, and a passing test runner. It runs a coding agent inside a sandboxed Modal environment, so your machine is never touched. It’s open source, works with Claude Code and Codex, and the dev containers it produces work in VS Code and GitHub Codespaces. Install it with: pip install imbue-keystone
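
For readers who haven't used dev containers, here is a rough sketch of the kind of devcontainer.json that can sit next to a Dockerfile. It is illustrative only and is not Keystone's actual output: the repo name and the post-create command are made-up placeholders, and the helper function is hypothetical. Only the field names ("name", "build", "dockerfile", "postCreateCommand") come from the dev container format itself.

```python
import json

# Hypothetical values for illustration; a real config depends on the repo.
devcontainer = {
    "name": "example-repo",
    # Path to the Dockerfile, relative to the devcontainer.json file.
    "build": {"dockerfile": "Dockerfile"},
    # Command run once after the container is created (placeholder).
    "postCreateCommand": "pip install -e .",
}


def render_devcontainer(config: dict) -> str:
    """Serialize a dev container config dict to pretty-printed JSON text."""
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Print the JSON that would be saved as .devcontainer/devcontainer.json
    print(render_devcontainer(devcontainer))
```

VS Code and GitHub Codespaces look for this file under .devcontainer/ in the repo and build the container from the referenced Dockerfile.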

Alexander Tibbets
At Imbue, we think code repos should self-describe their ideal environment. That’s why we built Keystone, to help agents automatically configure their own perfect Docker playground. Excited to hear your feedback!
swati paliwal

@mrtibbets How accurate has Keystone been so far at generating correct Docker configs for complex repos with multiple deps, like ML projects?

Thad Hughes

@mrtibbets @swati_paliwal Lead developer here -- we've spent a fair bit of time tuning Keystone to handle a wide range of repositories, from simple Python projects to complex polyglot repos like SciPy, OpenCV, PyTorch, and TensorFlow.

There's definitely variation in agent performance, and this is something we're characterizing carefully and will dive into more in a subsequent research report, so stay tuned!

The short answer to your question: we see Keystone getting a project's tests to pass inside a Docker container at least 95% of the time with Claude Opus.

Natalia Iankovych

Can I specify my preferences before the process starts, or does it automatically choose the best option?

Thad Hughes

@natalia_iankovych Right now, the Keystone CLI offers some configuration flags to specify which agent is used (Claude, Codex, or OpenCode), and some constraints like a deadline and a maximum inference cost.

We don't currently support preferences related to Dockerfile configuration, but these would be relatively easy to add. Feel free to leave us a feature request here if what we've already got doesn't meet your needs. Pull requests are also very welcome!

Ehsan Noursalehi

@thad_hughes_imbue It has been great to see you evolve this work on container setup over the last year, from something just for us into something anyone can use.

Kanjun Qiu

Proud of you @thad_hughes_imbue for the work shipping this! I've learned so much from using Keystone as a benchmark to understand the cost, performance, and failure modes of Claude Code vs. Codex vs. OpenCode.