Show HN: Extract Tables from Any Website – Images to JSON via OCR

1 valliappanr 0 6/22/2025, 7:28:34 PM github.com ↗
I built a two-step open-source tool that extracts tables from any website, including the hard cases where the table is rendered dynamically or laid out with CSS.

Step 1: Capture tables as images using a headless browser.
Step 2: Run OCR to convert them into structured JSON.
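The two steps above could be sketched roughly as follows. This is not the code from the repository: it assumes Playwright for the headless browser and pytesseract for OCR, assumes the tables are real `<table>` elements, and every function name and threshold here (`capture_tables`, `ocr_words`, `words_to_rows`, `rows_to_json`, `row_tolerance`, `column_gap`) is hypothetical.

```python
import json


def capture_tables(url, out_prefix="table"):
    """Step 1 (assumed approach): screenshot each <table> on the rendered page."""
    # Imported lazily so the pure-Python helpers below work without Playwright.
    from playwright.sync_api import sync_playwright

    paths = []
    with sync_playwright() as p:
        browser = p.chromium.launch()  # headless by default
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")  # let JS-rendered content settle
        tables = page.locator("table")  # assumption: tables are <table> elements
        for i in range(tables.count()):
            path = f"{out_prefix}_{i}.png"
            tables.nth(i).screenshot(path=path)
            paths.append(path)
        browser.close()
    return paths


def ocr_words(image_path):
    """Step 2a (assumed approach): OCR the image into (text, left, top, width) words."""
    import pytesseract
    from PIL import Image

    data = pytesseract.image_to_data(
        Image.open(image_path), output_type=pytesseract.Output.DICT
    )
    fields = zip(data["text"], data["left"], data["top"], data["width"])
    return [(t, l, y, w) for t, l, y, w in fields if t.strip()]


def words_to_rows(words, row_tolerance=10, column_gap=30):
    """Step 2b: group words into rows by vertical position, then into cells.

    Words whose tops lie within row_tolerance pixels of the row's first word
    share a row. Within a row, a horizontal gap wider than column_gap pixels
    starts a new cell. Both thresholds are illustrative guesses.
    """
    rows = []  # each entry: [top of the row's first word, [words in the row]]
    for word in sorted(words, key=lambda w: (w[2], w[1])):
        if rows and abs(word[2] - rows[-1][0]) <= row_tolerance:
            rows[-1][1].append(word)
        else:
            rows.append([word[2], [word]])

    table = []
    for _, row_words in rows:
        row_words.sort(key=lambda w: w[1])  # left to right
        cells, prev_right = [], None
        for text, left, _top, width in row_words:
            if prev_right is not None and left - prev_right <= column_gap:
                cells[-1] += " " + text  # same cell, e.g. a two-word header
            else:
                cells.append(text)
            prev_right = left + width
        table.append(cells)
    return table


def rows_to_json(table):
    """Treat the first row as the header and emit one JSON object per data row."""
    header, *body = table
    return json.dumps([dict(zip(header, row)) for row in body], indent=2)
```

In this sketch, `rows_to_json(words_to_rows(ocr_words(path)))` would be called on each path returned by `capture_tables(url)`. Merged cells and multi-line cells would need more careful layout analysis than this simple gap heuristic provides.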

This works well where traditional HTML parsers fail, for example on complex styling, merged cells, or JS-rendered content.

GitHub: https://github.com/enterpriseqa/extract_tables_from_websites

Examples are included. Feedback and contributions are welcome!
