I created a nationwide dataset of 155M land parcels using two GPUs and a 30TB NVMe drive.
Because I don't have $100K+ to buy the US parcel dataset from Regrid or ReportAll, I bought a pair of L40s and a 30TB NVMe drive, and used them to collect and harmonize 155M parcels from over 3,100 US counties into a single dataset.
And because I don't have a couple dozen employees to feed like ReportAll, Regrid, and CoreLogic, my goal is to resell this dataset at much lower prices than the incumbents and make the data accessible to smaller projects and smaller budgets.
I ended up with close to 99% coverage of the United States.
The backend stack is a single server running Postgres, gemma3 on Ollama, and a big pile of Python and PL/pgSQL. The website runs on Firebase with PMTiles as the mapping layer. Parcel file exports are served from Google Cloud Storage.
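To give a rough idea of what "harmonize" means here, below is a minimal Python sketch of mapping one county's raw attribute names onto a shared schema. This is not my actual pipeline code: the target fields, the alias table, and the `harmonize_record` function are all hypothetical, made up for illustration, and a real version has to handle far messier inputs than a static lookup can cover.

```python
from typing import Any

# Hypothetical target schema with a hand-written alias table.
# Real county data uses many more spellings than are listed here.
FIELD_ALIASES: dict[str, set[str]] = {
    "parcel_id": {"parcel_id", "apn", "pin", "parcelno", "parcel_num"},
    "owner": {"owner", "owner_name", "ownername", "own_name"},
    "address": {"address", "situs", "situs_addr", "site_address"},
    "acres": {"acres", "acreage", "gis_acres", "calc_acres"},
}

# Reverse lookup: normalized source column name -> target field.
ALIAS_TO_FIELD: dict[str, str] = {
    alias: field for field, aliases in FIELD_ALIASES.items() for alias in aliases
}


def _normalize_key(name: str) -> str:
    """Lowercase a column name and unify separators so lookups are stable."""
    return name.strip().lower().replace(" ", "_").replace("-", "_")


def harmonize_record(raw: dict[str, Any], county_fips: str) -> dict[str, Any]:
    """Map one raw county record onto the shared schema.

    Unrecognized columns are kept under "extra" instead of being dropped,
    so nothing is lost if the alias table is incomplete.
    """
    out: dict[str, Any] = {field: None for field in FIELD_ALIASES}
    out["county_fips"] = county_fips
    extra: dict[str, Any] = {}
    for key, value in raw.items():
        field = ALIAS_TO_FIELD.get(_normalize_key(key))
        if field is None:
            extra[key] = value
        elif out[field] is None:
            # First matching column wins if several aliases are present.
            out[field] = value
    out["extra"] = extra
    return out
```

For example, `harmonize_record({"APN": "123-45", "Owner Name": "DOE JANE", "ZONING": "R1"}, "06037")` fills `parcel_id` and `owner`, leaves `address` and `acres` as `None`, and keeps `ZONING` under `extra`.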
My plan is to open-source a big portion of this system once I can clean it up, but my first priority was getting a product on the market and trying to make this self-sustaining.
If anyone is interested in any of the technical details or if you want to try to do this yourself, I'm happy to share anything you want to know.