Connect AI agents to browser through raw CDP

OpenBrowser-AI - Connect AI agents to browser through raw CDP

OpenBrowser-AI

24d ago

OpenBrowser connects AI agents to the browser through raw CDP, with no abstraction layer in between. The LLM writes Python in a persistent namespace and batches several operations per call. Page state compresses to ~450 characters. Benchmarked against 3 other frameworks on 6 real tasks: 100% accuracy across the board, 2.6x fewer tokens, 59% lower inference costs. The methodology is public and reproducible. MIT licensed. Ships as a CLI and an MCP server, with 15 LLM providers supported. Two published RL studies train open-source models for browser control.
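To illustrate what "Python in a persistent namespace, batching operations per call" can look like, here is a minimal sketch. It is not OpenBrowser's actual implementation: the `PersistentNamespace` class and its `run` method are hypothetical names, and the only mechanism assumed is Python's built-in `exec` running against one shared dict.

```python
import contextlib
import io


class PersistentNamespace:
    """Runs code snippets against one shared dict so variables survive between calls."""

    def __init__(self):
        # Hypothetical stand-in for whatever a real tool would pre-load
        # (browser handles, helper functions, and so on).
        self.ns = {"__name__": "agent_session"}

    def run(self, code: str) -> str:
        """Execute one batch of statements and return whatever it printed."""
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            exec(code, self.ns)
        return buf.getvalue()


session = PersistentNamespace()

# Call 1: several steps batched into one snippet; results are stored, not printed.
session.run(
    "prices = {}\n"
    "for vendor, price in [('a', 10.0), ('b', 12.5)]:\n"
    "    prices[vendor] = price\n"
)

# Call 2: a later snippet reuses `prices` without re-fetching anything.
summary = session.run("print(min(prices, key=prices.get))")
```

Only the short printed string goes back to the model after each call, while the bulky intermediate data stays in `session.ns`. That is the general idea behind keeping per-call context small.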

Replies

OpenBrowser-AI

Maker

We built OpenBrowser because we wanted to see what happens when an AI agent talks directly to Chrome. Most browser automation goes through an abstraction layer between the LLM and the browser. We took a different path: raw CDP (Chrome DevTools Protocol). The LLM writes Python code that executes in a persistent namespace, batching multiple browser operations into a single tool call. Page state compresses to ~450 characters. The architecture is simple, and it turns out simplicity saves tokens.

We wanted numbers, not intuition. We benchmarked 4 CLI frameworks head-to-head on 6 real browser tasks using Claude Sonnet 4.6 on AWS Bedrock, N=3 runs with randomized order, 10,000-sample bootstrap confidence intervals. All four achieved 100% accuracy. OpenBrowser used 2.6x fewer tokens on average and won 5 of 6 tasks on token efficiency. Every framework in the benchmark is a good tool. Ours just found a way to do the same work with less. Full methodology and reproducible scripts: https://docs.openbrowser.me/cli-...

Then we went further. We are post-training open-source models specifically for browser control. Two published studies: SFT + GRPO reinforcement learning on Qwen3-8B for web form filling, and a cross-paradigm comparison with diffusion language models. Both papers and all trained models are public on ResearchGate and HuggingFace.

What ships today:
- CLI tool and MCP server (pip install openbrowser-ai)
- 15 LLM providers (OpenAI, Anthropic, Google, Bedrock, Azure, Ollama, and more)
- Cloud platform with saved auth profiles and scheduled browser workflows
- Raw CDP engine with code batching and persistent variable namespace

This started as a capstone project at the University of Toronto. Four students, four months, one architectural question that turned into a framework, a cloud platform, and two research papers.

We would love your feedback. What browser automation tasks would you throw at this?
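For readers unfamiliar with the bootstrap confidence intervals mentioned in the benchmark description, here is a rough sketch of a percentile bootstrap over per-run token counts. It is not taken from the published benchmark scripts: the function name `bootstrap_ci`, the choice of the mean as the statistic, and the three token counts are all made up for illustration.

```python
import random
import statistics


def bootstrap_ci(samples, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `samples`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = []
    for _ in range(n_resamples):
        # Resample with replacement, same size as the original sample.
        resample = rng.choices(samples, k=len(samples))
        means.append(statistics.fmean(resample))
    means.sort()
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Made-up token counts for three runs of one task (illustrative only).
runs = [9_800, 10_400, 10_100]
low, high = bootstrap_ci(runs)
print(f"95% CI for mean tokens: [{low:.0f}, {high:.0f}]")
```

With only three runs per task the resampled means take few distinct values, so an interval like this is best read as a rough spread indicator and not a tight estimate.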

25d ago

@billy_enrizky One task I'd love to see it crush is multi-tab research: pulling pricing from 5 vendor sites, comparing it in a table, and flagging outliers, all without exploding tokens. How does the persistent namespace shine there?

24d ago

OpenBrowser-AI

Maker

@swati_paliwal

Great question, this is exactly where the architecture pays off. The persistent namespace lets the agent open 5 vendor tabs and extract pricing with server-side JavaScript, returning maybe 200 tokens of clean JSON per site instead of the 124K-token full-page dumps other frameworks send. All 5 results stay as Python variables across LLM calls, so the final comparison and outlier-flagging step is just a few lines of pandas that never touch the browser at all. In our benchmarks the total context for a multi-page extraction task stays under 10K tokens where competitors hit 50K+ for the same work, with identical 100% accuracy.
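As a rough sketch of that final comparison step: the reply mentions pandas, but the version below uses only the standard library so it runs without extra dependencies. The vendor names, prices, the `flag_outliers` helper, and the 25% tolerance are all hypothetical, and the data is assumed to be already sitting in the namespace after extraction.

```python
import statistics

# Hypothetical per-vendor results, as they might sit in the namespace after extraction.
vendor_prices = {
    "vendor_a": 49.0,
    "vendor_b": 52.0,
    "vendor_c": 51.0,
    "vendor_d": 120.0,
    "vendor_e": 50.0,
}


def flag_outliers(prices, tolerance=0.25):
    """Return (vendor, price, is_outlier) rows sorted by price.

    A price counts as an outlier when it differs from the median
    by more than `tolerance` as a fraction of the median.
    """
    median = statistics.median(prices.values())
    rows = []
    for vendor, price in sorted(prices.items(), key=lambda kv: kv[1]):
        is_outlier = abs(price - median) / median > tolerance
        rows.append((vendor, price, is_outlier))
    return rows


table = flag_outliers(vendor_prices)
for vendor, price, is_outlier in table:
    print(f"{vendor:<10} {price:>8.2f} {'OUTLIER' if is_outlier else ''}")
```

Nothing in this step talks to the browser. It only reads variables that earlier calls left behind, which is why it adds so little to the context.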

24d ago

OpenBrowser-AI

Maker

Check out our website at https://openbrowser.me/

24d ago

OpenBrowser-AI

Maker

Check out our GitHub here:

https://github.com/billy-enrizky...

24d ago

OpenBrowser-AI

Maker

Check out all 12 trained models here:

https://huggingface.co/billyenrizky

24d ago

OpenBrowser-AI

Maker

Check out our documentation here:

https://docs.openbrowser.me

24d ago

OpenBrowser-AI

Maker

Concentrate or Collapse: When Reinforcement Learning Meets Diffusion Language Models for Web Planning:
https://www.researchgate.net/profile/Muhammad-Enrizky-Brillian/research
Browser-in-the-Loop: Reinforcement Fine-Tuning LLM Agents for Web Form Filling:
https://www.researchgate.net/profile/Muhammad-Enrizky-Brillian/research

24d ago