Stop fine-tuning LLMs for docs, use RAG

2 points · MarineCG40 · 1 comment · 9/16/2025, 9:54:50 PM · intlayer.org ↗

Comments (1)

MarineCG40 · 4h ago
I keep seeing people fine-tune LLMs for use cases where they probably don’t need to. In most doc/product scenarios, you don’t need another fine-tuned model—you just need retrieval-augmented generation (RAG). Why I think RAG wins in most cases: Fine-tuning is expensive, slow, and brittle. Most use cases don’t require “teaching” the model, just giving it the right context. With RAG, you keep things fresh: update your docs → update your embeddings → done. I built a small proof-of-concept to test this: a documentation assistant where docs are chunked + embedded, user queries are matched with cosine similarity, and GPT answers with the relevant context injected. Every query is logged, which turns out to be valuable—surfacing missing docs, common user struggles, and even feature requests. Demo: https://intlayer.org/doc/chat Write-up + code: https://intlayer.org/blog/rag-powered-documentation-assistan... My question: Do you see fine-tuning + RAG coexisting for these types of tasks? Or is RAG simply the obvious solution for 80% of real-world doc/product use cases?