Show HN: I implemented an RNN from scratch by reading a dense neural network book

dangmanhtruong | 8/12/2025, 5:58:12 PM | github.com
Hi everyone. I have been learning about deep learning for some time, and I've tried implementing CNNs, plain neural networks, U-Net, transformers, etc. to understand them better and to get my hands dirty outside of the frameworks. However, I've noticed that many online tutorials are not very detailed: concepts are not explained clearly, so readers come away with only a shallow understanding. On the other hand, many sources such as books show equation after equation without highlighting the main points, so readers get lost in mathematical details, which hampers learning. When I tried reading about RNNs and LSTMs, I noticed that many tutorials do not fully explain them. Some show pictures to make visualization easier, and some show the forward equations, but the backward equations are not discussed. One thing I don't think is talked about much: many tutorials, even when they do show the backpropagation, limit it to a single RNN layer (the same is true for LSTM/GRU).

Some time ago I read the book "Neural Network Design" by M. Hagan, and I found its explanations quite good (even though the book is not new). It explains things clearly enough for you to build everything yourself and does not hand-wave anything. When I got to the part about RNNs, I noticed that the book explains how to do backprop for RNNs with arbitrary recurrent connections, not just a single RNN layer, which I think is something few online sources show. The book also derives the case of different delays, which I think is skipped entirely in other sources.

So I decided to go ahead and implement it. The link above points to my implementation, which includes:

- Full BPTT for RNN networks with arbitrary recurrent connections and delays (see the short sketch after this list for what a delayed recurrent connection looks like)

- Comprehensive gradient checking using finite differences

- Bayesian regularization and multiple optimization algorithms

- Extensive numerical validation throughout
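
To make "recurrent connections and delays" a bit more concrete, here is a minimal sketch (not the repo's code, just an illustration with made-up names) of a single layer whose output is fed back to itself through a tap delay of d time steps; the actual implementation handles arbitrary connections between layers.

    import numpy as np

    def rnn_forward(x_seq, W_in, W_rec, b, delay=1):
        """One recurrent layer whose own output is fed back through a tap
        delay of `delay` steps: a[t] = tanh(W_in x[t] + W_rec a[t-delay] + b).
        x_seq has shape (T, n_in); returns the (T, n_hidden) output sequence."""
        T = x_seq.shape[0]
        n_hidden = b.shape[0]
        # The first `delay` rows hold the zero initial conditions a[-delay..-1].
        a = np.zeros((T + delay, n_hidden))
        for t in range(T):
            net = W_in @ x_seq[t] + W_rec @ a[t] + b   # a[t] is the output from step t - delay
            a[t + delay] = np.tanh(net)
        return a[delay:]

    # Hypothetical shapes: a 10-step sequence with 3 inputs, 5 hidden units, delay of 2.
    rng = np.random.default_rng(0)
    out = rnn_forward(rng.normal(size=(10, 3)),
                      W_in=rng.normal(size=(5, 3)),
                      W_rec=rng.normal(size=(5, 5)),
                      b=np.zeros(5), delay=2)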

I think I learned a lot during the implementation, both about how to implement a neural network and about how to structure my program. I tried to be systematic and included tests for the correctness of backprop using central differences (you know, the [f(x+delta)-f(x-delta)]/(2*delta) thing). This also pushed me to learn about Einstein summation (via NumPy's einsum), which really helped. Along the way I also found that equation (14.39) in the book has a slight error, which is fixed in the later equations (this was confirmed in private emails with the authors). The gradient checking was essential for debugging these subtle mathematical issues.
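
In case it's useful, this is roughly what that check looks like (a minimal sketch with made-up names, not the code from the repo): perturb each parameter in both directions and compare the central-difference estimate against the analytic gradient from BPTT.

    import numpy as np

    def numerical_gradient(loss_fn, params, delta=1e-5):
        # Central differences: [f(x+delta) - f(x-delta)] / (2*delta), one entry at a time.
        grad = np.zeros_like(params)
        it = np.nditer(params, flags=["multi_index"])
        while not it.finished:
            idx = it.multi_index
            saved = params[idx]
            params[idx] = saved + delta
            loss_plus = loss_fn(params)
            params[idx] = saved - delta
            loss_minus = loss_fn(params)
            params[idx] = saved               # restore the original value
            grad[idx] = (loss_plus - loss_minus) / (2.0 * delta)
            it.iternext()
        return grad

    # Then compare with the analytic gradient using a relative error,
    # e.g. |g_bptt - g_num| / (|g_bptt| + |g_num| + 1e-12), and flag anything large.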

Key lessons:

- Systematic software development techniques, coupled with mathematical rigour, help catch ML bugs more effectively.

- Implementing from first principles helps solidify your understanding and reveals the inner workings that frameworks hide.

- Einstein summation makes the maths much cleaner (see the small example after this list).

Even though the RNN is not the latest architecture, I think there is value in grounding yourself in the fundamentals first before jumping to more complex models.
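
As a small illustration of the einsum point (made-up variable names, not from the repo): the usual loop of outer products that accumulates a weight gradient over the time steps of BPTT collapses into a single call, and the index string doubles as documentation.

    import numpy as np

    T, n_out, n_in = 10, 5, 3
    rng = np.random.default_rng(0)
    delta = rng.normal(size=(T, n_out))    # backpropagated sensitivities at each time step
    a_prev = rng.normal(size=(T, n_in))    # (delayed) inputs to this weight at each time step

    # Loop of outer products: dW = sum_t delta[t] a_prev[t]^T
    dW_loop = sum(np.outer(delta[t], a_prev[t]) for t in range(T))

    # The same contraction as one einsum: sum over the shared time index t
    dW_einsum = np.einsum("ti,tj->ij", delta, a_prev)

    assert np.allclose(dW_loop, dW_einsum)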
