Show HN: I implemented an RNN from scratch by reading a dense neural network book
Some time ago, I read a book called "Neural Network Design" by M. Hagan, and I found its explanations to be quite good (even though the book is not new). It explains things clearly enough for you to build everything yourself and does not hand-wave anything. When I got to the chapter on RNNs, I noticed that the book covers backprop for RNNs with arbitrary connections, not just a single RNN layer, which I think is something not many sources online show. It also derives the equations for different delay conditions, which I think is skipped entirely in other sources.
So I decided to go ahead and implement it. The link goes to my implementation, which includes:
- Full BPTT for RNN networks with arbitrary recurrent connections and delays (a simplified sketch follows this list)
- Comprehensive gradient checking using finite differences
- Bayesian regularization and multiple optimization algorithms
- Extensive numerical validation throughout
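To give a flavour of the core recursion, here is a minimal sketch I wrote for this post (not the repo's actual code): a single tanh recurrent layer with a one-step delay, trained with a squared-error loss. The repo generalizes this to arbitrary connections and delay lengths.

    import numpy as np

    def bptt_single_layer(x_seq, y_seq, W_in, W_rec, b):
        # Minimal BPTT sketch: one tanh recurrent layer, one-step delay,
        # squared-error loss summed over the sequence.
        T, n = len(x_seq), b.shape[0]
        a = np.zeros((T + 1, n))        # a[t] = output at step t; a[0] is the initial state
        for t in range(1, T + 1):       # forward: a[t] = tanh(W_in x[t] + W_rec a[t-1] + b)
            a[t] = np.tanh(W_in @ x_seq[t - 1] + W_rec @ a[t - 1] + b)
        dW_in = np.zeros_like(W_in)
        dW_rec = np.zeros_like(W_rec)
        db = np.zeros_like(b)
        d_next = np.zeros(n)            # gradient arriving from step t+1 through the delay
        for t in range(T, 0, -1):       # backward through time
            e = a[t] - y_seq[t - 1]     # dLoss/da[t] from the error at step t
            d = (e + d_next) * (1 - a[t] ** 2)   # chain through tanh
            dW_in += np.outer(d, x_seq[t - 1])
            dW_rec += np.outer(d, a[t - 1])
            db += d
            d_next = W_rec.T @ d        # send gradient to step t-1 via the recurrent weight
        return dW_in, dW_rec, db

The d_next term is what makes this "through time": the gradient at step t picks up a contribution from step t+1 through the transpose of the recurrent weight.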
I learned a lot during the implementation, both about how to implement a neural network and about how to structure my program. I tried to be systematic and included tests for the correctness of backprop using symmetric finite differences (you know, the [f(x+delta)-f(x-delta)]/(2*delta) thing). This also pushed me to learn Einstein summation (via NumPy's einsum), which really helped. Along the way, I found that equation (14.39) in the book has a slight error, which is fixed in later equations (confirmed in private emails with the authors). The gradient checking was essential for debugging subtle mathematical issues like that.
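For concreteness, the symmetric-difference check looks roughly like this (my own sketch, not the repo's actual test code; loss_fn is assumed to be any function from a flat parameter array to a scalar loss):

    import numpy as np

    def check_gradient(loss_fn, params, analytic_grad, delta=1e-6):
        # Compare an analytic gradient against the central difference
        # [f(x+delta) - f(x-delta)] / (2*delta), one parameter at a time.
        numeric = np.zeros_like(params)
        for i in range(params.size):
            p_plus, p_minus = params.copy(), params.copy()
            p_plus.flat[i] += delta
            p_minus.flat[i] -= delta
            numeric.flat[i] = (loss_fn(p_plus) - loss_fn(p_minus)) / (2 * delta)
        # relative error; around 1e-7 or smaller in float64 usually means backprop is right
        return np.abs(numeric - analytic_grad).max() / (np.abs(numeric).max() + 1e-12)

A delta around 1e-6 in float64 is a reasonable middle ground: much smaller and round-off dominates, much larger and truncation error dominates.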
Key lessons:
- Systematic software development techniques, coupled with mathematical rigour, help catch ML bugs more effectively.
- Implementing from first principles helps solidify your understanding and reveals the inner workings that frameworks hide.
- Einstein summation makes the maths much cleaner (see the one-liner after this list).
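As an illustration of the einsum point (my own example, not code from the repo): the per-timestep outer products accumulated in the BPTT sketch above collapse into one line whose index notation reads exactly like the maths.

    import numpy as np

    # stack of per-timestep pre-activation gradients D (T, n) and inputs X (T, m);
    # the weight gradient sum_t outer(D[t], X[t]) in one readable line:
    T, n, m = 50, 5, 3
    D = np.random.randn(T, n)
    X = np.random.randn(T, m)
    dW_in = np.einsum('ti,tj->ij', D, X)   # same as D.T @ X, but the index
                                           # notation scales to higher-rank tensors

The 'ti,tj->ij' string is a direct transcription of the summation indices, which is why the backprop equations from the book map onto the code so cleanly.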
Even though RNNs are not the latest architecture, I think there is value in grounding yourself in the fundamentals before jumping to more complex models.