Form16x – Simplify tax season: JSON output and regime comparisons from Form 16

2 taxedo 1 9/12/2025, 1:53:16 PM
I got tired of manually copying numbers from Form 16 PDFs into India’s tax filing portal every year. So I built *Form16x*, a Python CLI + library that parses these PDFs into structured JSON.

Beyond extraction, it can:
- Consolidate multiple Form 16s if you switched jobs
- Calculate taxes under both regimes and recommend the better one
- Show salary/deduction breakdowns directly in the terminal (tree view, colored output)
- Suggest tax optimizations (80C, 80D, NPS, etc.)
- Provide a Python API (`TaxCalculationAPI`) with multi-year tax rules (AY 2020–2025)
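
To make the regime comparison concrete, here is a minimal sketch of the idea. It is not Form16x's actual code or its `TaxCalculationAPI`. The function names are hypothetical, the slab tables are illustrative values only, and it ignores cess, surcharge, rebates, and the standard deduction.

```python
from typing import Dict, List, Tuple

# Each slab is (upper limit of the band, marginal rate).
# ASSUMPTION: illustrative slab tables, not authoritative for any given year.
Slab = Tuple[float, float]

OLD_SLABS: List[Slab] = [
    (250_000, 0.0),
    (500_000, 0.05),
    (1_000_000, 0.20),
    (float("inf"), 0.30),
]
NEW_SLABS: List[Slab] = [
    (300_000, 0.0),
    (600_000, 0.05),
    (900_000, 0.10),
    (1_200_000, 0.15),
    (1_500_000, 0.20),
    (float("inf"), 0.30),
]


def slab_tax(taxable_income: float, slabs: List[Slab]) -> float:
    """Sum the tax owed in each band that the income reaches."""
    tax = 0.0
    lower = 0.0
    for upper, rate in slabs:
        if taxable_income <= lower:
            break
        tax += (min(taxable_income, upper) - lower) * rate
        lower = upper
    return tax


def compare_regimes(gross_income: float, old_regime_deductions: float) -> Dict[str, object]:
    """Hypothetical helper: deductions (80C, 80D, ...) apply to the old regime only."""
    old_taxable = max(gross_income - old_regime_deductions, 0.0)
    new_taxable = max(gross_income, 0.0)
    old_tax = slab_tax(old_taxable, OLD_SLABS)
    new_tax = slab_tax(new_taxable, NEW_SLABS)
    return {
        "old": old_tax,
        "new": new_tax,
        "recommended": "old" if old_tax < new_tax else "new",
    }


if __name__ == "__main__":
    print(compare_regimes(1_200_000, 200_000))
```

With these illustrative slabs, an income of 1,200,000 and 200,000 in deductions works out to about 112,500 under the old regime and 90,000 under the new one, so the sketch recommends the new regime. A real calculator would also need per-year rule tables, which is presumably what the multi-year support covers.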

*Repo:* https://github.com/ri-sh/Form16x

Form 16 is similar to a W-2 in the US or a T4 in Canada: a semi-structured PDF with inconsistent layouts. Filing usually means typing its numbers in by hand. Form16x tries to turn that data into something structured so the process can be automated.
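
As a rough illustration of what "structured" could look like, here is a sketch that merges records from two employers into one JSON document. The `Form16Record` class, its field names, and the `consolidate` helper are all hypothetical. They are not Form16x's real schema or API.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class Form16Record:
    # ASSUMPTION: a tiny made-up subset of fields, for illustration only.
    employer: str
    gross_salary: float
    tds_deducted: float


def consolidate(records: List[Form16Record]) -> Dict[str, object]:
    """Sum the numeric fields and keep the per-employer records for traceability."""
    return {
        "employers": [r.employer for r in records],
        "gross_salary": sum(r.gross_salary for r in records),
        "tds_deducted": sum(r.tds_deducted for r in records),
        "sources": [asdict(r) for r in records],
    }


records = [
    Form16Record("Employer A", 600_000.0, 30_000.0),
    Form16Record("Employer B", 500_000.0, 25_000.0),
]
summary = consolidate(records)

if __name__ == "__main__":
    print(json.dumps(summary, indent=2))
```

Keeping the original per-employer records under `sources` is one way to let a user check the totals against each PDF before filing.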

Would love feedback from HN, both on the technical approach (PDF parsing + structured extraction) and on whether it could extend to other countries’ tax forms.

Comments (1)

taxedo · 5h ago