Currently, most AI browser agents send screenshots to the model. Each screenshot costs thousands of tokens. Over a multi-step task, that means high latency and high API cost.
This package takes a different approach: it renders pages as ASCII wireframes with numbered elements. The agent sees [12]Sign Up instead of a 1280x720 image. Same information, far fewer tokens.
It started as a way to make my own agents cheaper to run. Then I built a package around it. Fully open source and open to feedback!
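To make the numbered-element idea concrete, here is a minimal sketch in Python. It is not the package's actual API: `Element`, `render_wireframe`, and `resolve_ref` are hypothetical names, and the bracketed input box is an assumed drawing style. It only shows how a page could be flattened into text lines like `[12]Sign Up`, and how a ref number picked by the agent could be mapped back to an element.

```python
from dataclasses import dataclass


@dataclass
class Element:
    ref: int    # number the agent uses to address this element
    kind: str   # hypothetical kinds: "link", "input", "button"
    label: str  # visible text or placeholder


def render_wireframe(elements: list[Element]) -> str:
    """Render each interactive element as one text line with a numbered ref."""
    lines = []
    for el in elements:
        if el.kind == "input":
            # Assumed style: draw inputs as a bracketed box to set them apart.
            lines.append(f"[{el.ref}][ {el.label} ]")
        else:
            lines.append(f"[{el.ref}]{el.label}")
    return "\n".join(lines)


def resolve_ref(elements: list[Element], ref: int) -> Element:
    """Map a ref number chosen by the agent back to the element it names."""
    by_ref = {el.ref: el for el in elements}
    return by_ref[ref]


# Hypothetical page contents, for illustration only.
page = [
    Element(10, "link", "Pricing"),
    Element(11, "input", "Email"),
    Element(12, "button", "Sign Up"),
]
wireframe = render_wireframe(page)
print(wireframe)
```

The model would receive the three printed lines as plain text and could answer with something like "click 12", which `resolve_ref(page, 12)` turns back into the Sign Up button.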
ASCII wireframes are a cool idea. Maybe a small visual preview option could help too.
Great idea, best of luck with the launch :) It has a very 1984 hacker vibe :) I love it, and the practical applications seem very real, especially for scraping.
If someone is already using a Playwright-based MCP server (or a screenshot/vision-based computer-use setup), what’s the specific breaking point that typically makes them switch to Agent Browser, and what do they usually have to give up—if anything—in return for the token savings?
@curiouskitty The real breaking point is scale. With existing approaches, a single page can cost around 10k tokens, while Agent Browser typically uses only 1k–3k. Browser agents also rarely perform just one action; they usually run multi-step workflows, so the difference adds up at every step. This means Agent Browser can reduce browser operation costs by roughly 70–90%. If a full screenshot is ever needed, the agent can still take one, so switching methods doesn't mean giving anything up.
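The 70–90% range follows directly from the per-page figures quoted above. A small worked calculation in Python, assuming a hypothetical 20-step workflow (the step count is not from the thread; the 10k and 1k–3k figures are the maker's):

```python
BASELINE_TOKENS = 10_000   # per page with existing approaches, as quoted above
STEPS = 20                 # assumed workflow length, for illustration only


def percent_saved(baseline: int, wireframe: int) -> int:
    """Whole-number percentage of tokens saved per page."""
    return (baseline - wireframe) * 100 // baseline


def tokens_saved(baseline: int, wireframe: int, steps: int) -> int:
    """Total tokens saved over a workflow with one page view per step."""
    return (baseline - wireframe) * steps


results = {}
for wireframe_tokens in (1_000, 3_000):  # the quoted 1k-3k range
    results[wireframe_tokens] = (
        percent_saved(BASELINE_TOKENS, wireframe_tokens),
        tokens_saved(BASELINE_TOKENS, wireframe_tokens, STEPS),
    )
    pct, saved = results[wireframe_tokens]
    print(f"{wireframe_tokens} tokens/page: {pct}% saved, {saved} tokens over {STEPS} steps")
```

The percentage does not depend on the number of steps, but the absolute number of tokens saved grows linearly with it, which is why the gap matters most for long workflows.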
Replies
Agent Browser
@reid_anderson3 The library still exposes screenshot tooling, so the agent can take a screenshot if needed.
Agent Browser
@lev_kerzhner Thanks! It can also fill inputs, click buttons, and so on, using the numbered refs.