Scraping Zillow for the Little Guys

5 points · karagenit · 3 comments · 5/22/2025, 10:56:05 PM · caleb.software

Comments (3)

bob1029 · 15h ago
My go-to for scraping "like a human" is headless browser automation. Dump the DOM after a few seconds to let the React jank settle, then parse the HTML document for the properties you want.

If you do things correctly, this approach should be mostly indistinguishable from an actual user. The automated browser instance is just a normal build of Chrome. The biggest trick seems to be patience. If you only make a request once every 60 seconds or so, it's very unlikely that someone or something is going to care about it.
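For anyone who wants to try it, here is a rough sketch of that workflow in Python with Playwright's sync API. The helper names (`dump_dom`, `extract_meta_properties`, `scrape_slowly`), the 5-second settle time, and the choice to pull `<meta property=...>` tags are all my own placeholders, not anything from the article. Swap in whatever properties you actually need. Note that `p.chromium.launch()` starts Playwright's bundled Chromium rather than your installed Chrome.

```python
import time
from html.parser import HTMLParser


class MetaPropertyParser(HTMLParser):
    """Collects <meta property="..." content="..."> pairs from a document."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name, content = attr_map.get("property"), attr_map.get("content")
        if name and content is not None:
            self.properties[name] = content


def extract_meta_properties(html):
    """Parse an HTML string and return its meta properties as a dict."""
    parser = MetaPropertyParser()
    parser.feed(html)
    parser.close()
    return parser.properties


def dump_dom(url, settle_ms=5000):
    """Load a page in headless Chromium, wait, and return the rendered HTML."""
    # Imported lazily so the parsing helpers work without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        # Give client-side rendering a few seconds to settle (assumed value).
        page.wait_for_timeout(settle_ms)
        html = page.content()
        browser.close()
    return html


def scrape_slowly(urls, delay_seconds=60):
    """Yield (url, properties) pairs, pausing between page loads."""
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(delay_seconds)  # roughly one request per minute
        yield url, extract_meta_properties(dump_dom(url))
```

Usage would be something like `for url, props in scrape_slowly(my_urls): print(url, props)`, where `my_urls` is your own short list of pages. Whether this gets past any particular site's bot detection is something you would have to test yourself.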

karagenit · 15h ago
Yeah, I'd probably go that route if I wanted to scrape more than the handful of pages I needed for this project. I wonder if it would work on Zillow or not. Even my simple workflow of "click the next button, save the request in devtools, repeat" was suspicious enough to trigger a captcha-type "are you a bot?" challenge. Maybe it was just too many requests too quickly, like you mentioned, or maybe they're doing something more advanced like mouse movement tracking.

bob1029 · 14h ago
If you use something like Playwright, you can inject arbitrary events during the scraping session to simulate things a human might do.

https://playwright.dev/docs/api/class-mouse#mouse-move
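
As a minimal sketch of what that could look like in Python: generate a slightly wobbly path between two points and feed it to `page.mouse.move`. The function names, the smoothstep easing, the jitter size, and the pause lengths are all assumptions picked for illustration. I have no idea what a given site's detection actually looks at, so treat this as a starting point rather than a recipe.

```python
import random


def human_like_path(start, end, steps=25, jitter=3.0, rng=None):
    """Return `steps` points from start toward end with easing and jitter."""
    rng = rng or random.Random()
    (x0, y0), (x1, y1) = start, end
    points = []
    for i in range(1, steps):
        t = i / steps
        # Smoothstep easing: slow start, faster middle, slow finish.
        eased = t * t * (3 - 2 * t)
        x = x0 + (x1 - x0) * eased + rng.uniform(-jitter, jitter)
        y = y0 + (y1 - y0) * eased + rng.uniform(-jitter, jitter)
        points.append((x, y))
    # Land exactly on the target with no jitter.
    points.append((float(x1), float(y1)))
    return points


def move_like_a_human(page, start, end):
    """Drive a Playwright page's mouse along a jittered path."""
    for x, y in human_like_path(start, end):
        page.mouse.move(x, y)
        # Short, uneven pauses between moves (assumed range, in milliseconds).
        page.wait_for_timeout(random.randint(5, 25))
```

With an open Playwright `page`, you would call something like `move_like_a_human(page, (100, 100), (640, 420))` before clicking the next button.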