Ask HN: How to build services for FOSS projects when ToS forbids scraping?

3 points · ATechGuy · 6 comments · 8/20/2025, 9:50:10 PM
I'm working on a service for FOSS developers to help enforce code license compliance and make projects more sustainable.

The challenge: many websites' Terms of Service explicitly prohibit scraping, crawling, or automation. At the same time, the information needed (repos, dependencies, metadata) is often available only through those sites.

For those who've built tools around open source ecosystems:

* How do you navigate ToS restrictions while still delivering value to users?

* Do you focus on official APIs only, even if they're limited?

* Are there established legal/technical best practices for this situation?

* How do you balance compliance with ToS against the mission of supporting FOSS?

Curious to hear what others have done (or seen work) in this space.

Comments (6)

like_any_other · 3h ago
The law around scraping is unfortunately complex, but it's plausible that your use case (license compliance) is legal regardless of ToS ("you're not allowed to use automated tools to check if we're breaking your license" is quite fishy):

https://legalclarity.org/is-web-scraping-legal-a-look-at-the...

https://en.wikipedia.org/wiki/HiQ_Labs_v._LinkedIn

Or you could base your business in Denmark: https://en.wikipedia.org/wiki/Web_scraping#European_Union

ATechGuy · 3h ago
Thanks for sharing the links! You're right, the legality around scraping is nuanced :( Use cases like license compliance might fall on the defensible side.

toomuchtodo · 3h ago
I’m unsure what your jurisdiction is, but if you’re in the US, speak with an attorney and get an attorney opinion letter. You’ll at least be able to demonstrate that you were acting in good faith.

ATechGuy · 3h ago
Thanks!

bigyabai · 3h ago
You're making a business, right? Pay for official API access.

ATechGuy · 3h ago
Right now the cost is a blocker (I'm bootstrapped, no funding yet). On top of that, sometimes official APIs either don't exist or don't expose the full dataset needed for license checks, which makes things tricky.