Show HN: A Tool to Summarize Kenya's Parliament with Rust, Whisper, and LLMs

63 collinsmuriuki 7 6/22/2025, 5:33:04 PM github.com ↗
Bunge Bits summarizes long parliamentary sessions from the Kenyan National Assembly and Senate. Built with Rust, Whisper v3, and GPT-4o.

Sessions are typically 3–7 hours long, mixing English and Swahili. This tool transcribes, chunks, and summarizes them to make political content more accessible and searchable for the public.
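
For readers wondering what the "chunks" step might look like: below is a minimal sketch, not the project's actual code. It assumes the transcript is already plain text and splits it into overlapping word windows, so that each piece fits in a model's context and a sentence cut at a boundary still appears whole in one chunk. The function name `chunk_transcript` and its parameters are hypothetical.

```rust
/// Split a transcript into chunks of at most `max_words` words.
/// Consecutive chunks share `overlap` words so that context cut at a
/// boundary is still visible in the next chunk.
/// (Hypothetical helper for illustration; not taken from Bunge Bits.)
fn chunk_transcript(text: &str, max_words: usize, overlap: usize) -> Vec<String> {
    assert!(max_words > 0, "max_words must be positive");
    assert!(overlap < max_words, "overlap must be smaller than max_words");

    let words: Vec<&str> = text.split_whitespace().collect();
    let step = max_words - overlap; // how far the window advances each time
    let mut chunks = Vec::new();
    let mut start = 0;

    while start < words.len() {
        let end = (start + max_words).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break; // the last window reached the end of the transcript
        }
        start += step;
    }
    chunks
}
```

Each chunk could then be summarized on its own and the partial summaries merged in a final pass. Splitting on words rather than model tokens is a simplification made here for brevity.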

https://bungebits.ke/summaries

Comments (7)

sodality2 · 1h ago
If you want to host the transcription locally to save on OpenAI costs, you can use my crate, mutter [0], which makes self-hosting the Whisper model super easy :)

[0]: https://crates.io/crates/mutter

otherayden · 47m ago
Would you be willing to share usage data on this? This is an interesting case of an LLM product that seems really useful, but I’m wondering if there’s a big market for it.

dr_kretyn · 2h ago
That's a great idea and a great use of LLMs. I'm not sure about Kenya specifically, but many countries pass tiny updates that make significant changes, and discussion rarely mentions them. There's a lot of obfuscation by design. Highlighting some of these details, even if only the ones that get discussed, is great :)

jpmonette · 1h ago
Oh wow, that's pretty cool! I've been working on something similar for my local assembly!

alexanderameye · 2h ago
Nice! Love to see initiatives like these.

I've been working on something in the same space for the Belgian federal parliament. The Belgian parliament livestreams sessions and publishes a single (long, bloated, dual-language) PDF report[0] for each session and that's it.

This means no search across sessions, no details of which parties voted how, no API, etc. The only view you get is that of a single session, which is not very useful when you're trying to figure out who to vote for.

I made 'zij werken voor u' (TheyWorkForYou[1] in Dutch) by automatically scraping the PDF files and parsing them with a Rust script.

The scraped data (votes, questions, topics, dossiers) is written to .parquet files. I also compute some additional things, like voting patterns, attendance, and which topics interest specific MPs the most.
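
To illustrate the kind of derived statistics mentioned here, this is a rough sketch that tallies votes per party and computes a member's attendance rate. It is an assumption-heavy example, not the site's code: the `VoteRecord` layout, the `Vote` variants, and both function names are hypothetical, and the records are held in memory rather than read from .parquet files.

```rust
use std::collections::BTreeMap;

// Hypothetical record shape; the real parquet schema is not known here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Vote {
    Yes,
    No,
    Abstain,
    Absent,
}

struct VoteRecord {
    member: String,
    party: String,
    vote: Vote,
}

#[derive(Debug, Default, PartialEq, Eq)]
struct Tally {
    yes: u32,
    no: u32,
    abstain: u32,
    absent: u32,
}

/// Count how each party voted across all records.
fn tally_by_party(records: &[VoteRecord]) -> BTreeMap<String, Tally> {
    let mut tallies: BTreeMap<String, Tally> = BTreeMap::new();
    for record in records {
        let tally = tallies.entry(record.party.clone()).or_default();
        match record.vote {
            Vote::Yes => tally.yes += 1,
            Vote::No => tally.no += 1,
            Vote::Abstain => tally.abstain += 1,
            Vote::Absent => tally.absent += 1,
        }
    }
    tallies
}

/// Fraction of votes for which `member` was present, or `None` if the
/// member has no records at all.
fn attendance_rate(records: &[VoteRecord], member: &str) -> Option<f64> {
    let mut total = 0u32;
    let mut present = 0u32;
    for record in records.iter().filter(|r| r.member == member) {
        total += 1;
        if record.vote != Vote::Absent {
            present += 1;
        }
    }
    if total == 0 {
        None
    } else {
        Some(f64::from(present) / f64::from(total))
    }
}
```

A `BTreeMap` keeps parties in sorted order, which makes the output stable from one run to the next; that matters when the results feed a static site and you want clean diffs.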

These parquet files are then fed into a static site generator, and a search index is built. I also sprinkle in some summarization using Mistral[2].

The result is https://zijwerkenvooru.be/nl/votes/ (in Dutch), which lets you look at the data from multiple viewpoints, such as:

- what questions did member X ask?

- how did party Y vote?

- what is happening around topic Z?
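
For anyone curious how lookups like "what is happening around topic Z?" could work on a static site, here is a minimal sketch of an inverted index built at generation time. This is only one possible approach and an assumption on my part; the post does not say how the actual search index is built. The names `build_index` and `search` are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Map each lowercased word to the set of document ids that contain it.
/// (Hypothetical sketch; the real site may use a different index.)
fn build_index(docs: &[&str]) -> BTreeMap<String, BTreeSet<usize>> {
    let mut index: BTreeMap<String, BTreeSet<usize>> = BTreeMap::new();
    for (id, doc) in docs.iter().enumerate() {
        // Split on anything that is not a letter or digit, skip empty pieces.
        let tokens = doc
            .split(|c: char| !c.is_alphanumeric())
            .filter(|t| !t.is_empty());
        for token in tokens {
            index.entry(token.to_lowercase()).or_default().insert(id);
        }
    }
    index
}

/// Return the ids of all documents containing `term`, in ascending order.
fn search(index: &BTreeMap<String, BTreeSet<usize>>, term: &str) -> Vec<usize> {
    index
        .get(&term.to_lowercase())
        .map(|ids| ids.iter().copied().collect::<Vec<usize>>())
        .unwrap_or_default()
}
```

Because the index is plain data, it can be serialized once during the build and shipped alongside the generated pages, so no server is needed at query time.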

I also post new votes/questions on Bluesky[3]. The whole process (downloading, scraping, publishing, posting) is automated to run through GitHub Actions. I literally have to do nothing now.

I'm hoping the Belgian government will step up and improve their archaic and almost unusable site[4].

Thanks for sharing this project, I'm already getting inspired by it to improve zijwerkenvooru.be!

Edit: I’m thinking it might be good to have an overview of initiatives like these somewhere? Public initiatives to help with political transparency for each country?

[0]: https://www.dekamer.be/doc/PCRI/html/56/ip052x.html

[1]: https://www.theyworkforyou.com/

[2]: https://mistral.ai/

[3]: https://bsky.app/profile/zijwerkenvooru.be

[4]: https://www.dekamer.be/kvvcr/index.cfm

collinsmuriuki · 2h ago
This is fantastic! Love the automation and structure behind it, especially the .parquet approach and GitHub Actions pipeline. Super inspiring.

On my end, it’s a bit frustrating that our Parliament still only shares PDF reports weeks after sessions happen, likely compiled manually. No API, no transcript archive, and no structured metadata around bills, speakers, or topics.

That’s partly why I started building Bunge Bits: to sidestep the bottlenecks and make the information usable.

Appreciate you sharing zijwerkenvooru.be, bookmarking it for inspiration as I figure out what’s next.

arecsu · 3h ago
Looks good!