Show HN: I felt lost in a codebase, so I built tools for AI to explore it
I want to share a recent project that was born out of a personal struggle.
Roughly two months ago, I was trying to contribute to a large open-source project to gain more experience. The repo was just too big for a first-timer, and I felt completely overwhelmed. I wanted to use an AI assistant to help me navigate, but I had no good way to provide the necessary context.
So I built a tool to solve this: a self-hosted MCP server that gives an AI client a set of tools to explore and understand GitHub repos on its own.
My first attempt was naive: I used a library to flatten the whole codebase into a single text file and fed that to the AI. As you can guess, this immediately exceeded the context window for any reasonably sized project.
The better approach, and the one this tool uses, is to let the AI explore the code on demand. I gave it tools to fetch a directory tree, get the content of any file or folder, diff any two branches/commits/tags, and pull context from issues/PRs. This works surprisingly well, even on a massive repo like VS Code's (>27M tokens).
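To make "on demand" concrete, here is a rough sketch of the kind of GitHub REST API requests such tools could map to. It is not taken from the project's code: the function names and the `octocat/hello-world` repo are placeholders, and only the endpoint shapes (Git trees, compare, contents) come from GitHub's public REST API.

```python
from urllib.parse import quote

API = "https://api.github.com"  # public GitHub REST API base URL


def tree_url(owner: str, repo: str, ref: str) -> str:
    """URL for the full recursive Git tree at a branch, tag, or commit SHA."""
    return f"{API}/repos/{owner}/{repo}/git/trees/{quote(ref, safe='/')}?recursive=1"


def compare_url(owner: str, repo: str, base: str, head: str) -> str:
    """URL comparing any two refs, using the three-dot BASE...HEAD form."""
    base_q = quote(base, safe="/")
    head_q = quote(head, safe="/")
    return f"{API}/repos/{owner}/{repo}/compare/{base_q}...{head_q}"


def contents_url(owner: str, repo: str, path: str, ref: str) -> str:
    """URL for one file or folder at a given ref."""
    return f"{API}/repos/{owner}/{repo}/contents/{quote(path, safe='/')}?ref={quote(ref, safe='')}"


# Hypothetical repo and refs, for illustration only.
print(tree_url("octocat", "hello-world", "main"))
print(compare_url("octocat", "hello-world", "main", "v1.0"))
print(contents_url("octocat", "hello-world", "src/app.py", "main"))
```

The point is that each tool call pulls only the slice the AI asked for, so the context stays small no matter how big the repo is.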
Ironically, I only recently learned that GitHub has its own official MCP server for this! Luckily, my version has some capabilities that theirs doesn't (yet):
1. Get a clean directory tree: It uses the Git tree object to create a readable tree structure, which helps the AI (and humans) understand the project layout faster.
2. Get a diff between any two points: This isn't limited to PRs; you can compare any two branches, commits, or tags.
3. Get raw repo contents: It can fetch file contents as plain text or folder contents in the tree format, ready for AI to process directly.
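For the curious, here is a minimal sketch of how point 1 could work: turning the flat list of paths that a recursive Git tree listing gives you into a readable tree. This is my illustration of the idea, not the project's actual code; `render_tree` and the sample paths are made up.

```python
def render_tree(paths: list[str]) -> str:
    """Render slash-separated repo paths as an indented directory tree."""
    # Build a nested dict; a node with children is treated as a directory.
    root: dict = {}
    for path in paths:
        node = root
        for part in path.split("/"):
            node = node.setdefault(part, {})

    lines: list[str] = []

    def walk(node: dict, prefix: str) -> None:
        # Directories first, then files, each group sorted by name.
        names = sorted(node, key=lambda name: (not node[name], name))
        for i, name in enumerate(names):
            last = i == len(names) - 1
            suffix = "/" if node[name] else ""
            lines.append(prefix + ("└── " if last else "├── ") + name + suffix)
            walk(node[name], prefix + ("    " if last else "│   "))

    walk(root, "")
    return "\n".join(lines)


# Sample paths, as they might appear in a recursive tree listing.
sample = ["README.md", "src", "src/app.py", "src/util", "src/util/io.py"]
print(render_tree(sample))
# ├── src/
# │   ├── util/
# │   │   └── io.py
# │   └── app.py
# └── README.md
```

A compact layout like this costs far fewer tokens than a raw JSON listing, which matters when the tree itself has thousands of entries.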
Any feedback is highly appreciated!
Try it out: https://github.com/baonguyen09/github-second-brain
Demo Video: https://github.com/user-attachments/assets/ecb05256-8bb0-427...