
SCRAPR
The data layer for the agentic web
367 followers
SCRAPR is a new approach to web data extraction. Instead of relying on fragile DOM selectors or heavy browser automation, SCRAPR looks at how modern websites actually load their data and extracts structured responses directly from those sources. The goal is to make web data pipelines faster, more reliable, and easier to maintain. SCRAPR is currently an early MVP, and we're looking for developers, data teams, and AI builders who need clean, structured data from websites.

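To illustrate the idea in the description above (this is a minimal sketch, not SCRAPR's actual implementation): many sites populate their pages from a JSON endpoint that the frontend calls itself, and querying that endpoint directly yields structured data with no HTML parsing at all. The endpoint URL and field names below are hypothetical.

```python
import json
import urllib.request

# Hypothetical endpoint the page's own frontend calls via XHR -- the
# kind of upstream data source this approach targets instead of the DOM.
API_URL = "https://example.com/api/v1/products?page=1"

def extract_items(payload: dict) -> list[dict]:
    """Keep only the fields we care about from the JSON the site already serves."""
    return [
        {"name": p["name"], "price": p["price"]}
        for p in payload.get("items", [])
    ]

def fetch_products(url: str = API_URL) -> list[dict]:
    """Fetch and reduce the structured payload; no CSS selectors involved."""
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return extract_items(json.load(resp))
```

Because the extraction works on the JSON payload rather than rendered markup, a site redesign that only touches layout leaves this pipeline untouched.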
Looks cool — but how well does it actually handle hard targets like Cloudflare, JS-heavy sites, proxies, and rate limits in the real world?
SCRAPR
How does this engine handle JavaScript-heavy or dynamic content without a browser, and what mechanisms ensure data accuracy when the source website changes its layout?
SCRAPR
rtrvr.ai
So what happens when the API changes?
Sites like LinkedIn also use server-side rendering and hydration for their pages, so won't this approach fail on most websites?
SCRAPR
@arjun_chintapalli Good question. If an API changes, SCRAPR isn’t tied to just one extraction path. It can re-analyze how the page delivers its data and adjust instead of relying on a fixed endpoint or selector.
And you’re right that some sites use server-side rendering or hydration. In those cases the content still exists in the initial page response or in subsequent requests, so SCRAPR can fall back to extracting it from the page structure when needed.
AutonomyAI
This is such a clean solution to a problem that's been annoying developers forever. Rooting for you!
SCRAPR
The network-call interception approach is genius. Most scrapers fight against the rendered HTML, which is a losing battle: sites redesign constantly, and JS-rendered content is a nightmare.
Going upstream to the actual data source (API calls) means you're getting the same clean data the site itself uses. Much more stable.
How do you handle authentication-required data? Like scraping my own logged-in dashboards to aggregate data from various services I use?
This is such a smart pivot from the usual DOM-parsing headaches! As a dev who's spent way too many hours fixing scrapers because of a tiny CSS change, focusing on the data responses directly sounds like a lifesaver. How do you handle sites with heavy anti-bot protections or obfuscated API endpoints?