Sharing my walkthrough on fine-tuning LLMs with LoRA using NVIDIA's NeMo microservices. The result is a llama-3.2-1b-instruct model fine-tuned to be really good at function calling, making it ideal for agentic use.
It was a ton of fun to figure it out and it brought back some nostalgia from the days of training ML models, tweaking learning rates, dropout, and watching loss charts in W&B.
Final performance was way better than any 1-3B parameter LLM I tried with agentic workflows in the past.
Can you point to a public version of this model you trained? I'd like to test it with an agentic framework I'm working on.