Train a 70b language model at home (2024)

71 points | amrrs | 13 comments | 7/24/2025, 7:13:14 PM | answer.ai ↗

Comments (13)

doctoboggan · 1d ago
Maybe I misunderstand, but it seems like they are using LoRA, which is a fine-tuning method. That requires an already existing trained LLM. If that's true, I think the title of this submission is inaccurate, as this doesn't let you train a model from scratch with 2 consumer GPUs.
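
For reference, a minimal sketch of what QLoRA-style fine-tuning looks like in practice, assuming the Hugging Face transformers, peft, and bitsandbytes libraries; the model id and hyperparameters are illustrative placeholders, not the article's exact setup:

```python
# QLoRA-style fine-tuning sketch: load a 4-bit quantized base model and
# attach LoRA adapters, so only the small adapter weights are trained.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-2-70b-hf"  # placeholder 70B base model

# 4-bit quantization of the frozen base weights (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapters: small trainable matrices injected into attention projections
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # a tiny fraction of the 70B total
```

The point of the sketch: the full pre-trained base model is a prerequisite and stays frozen; only the adapters are learned, which is what makes this fine-tuning rather than training from scratch.
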
botro · 1d ago
Yes, they put this in footnote 1: "Throughout this article “training” can refer to either pre-training, or fine-tuning." But the article is just talking about fine-tuning.
oceanplexian · 1d ago
"The thing the word actually means isn't the way we're using it" isn't how I would use a footnote.
underlines · 1d ago
I hand-curate github.com/underlines/awesome-ml, so I read a ton about the latest trends in this space. When I started to read the article, a lot of the information felt weirdly familiar and almost outdated.

The space is moving fast, after all. They just seem to be explaining QLoRA fine-tuning (yes, a great achievement, and all the folks involved are heroes), but reading it as a trending article on HN, it felt off.

Turns out I was too dumb to check the date: 2024. And the title is mixing up quantized-adapter fine-tuning with base-model training. Thanks lol

darkbatman · 1d ago
Would be nice to see some benchmarks.

Also, from my experience you need more compute power to get a significant result. Fine-tuning mostly works when the base model is already very close to what you are trying to achieve, and even then you won't be very happy with the results.

Also, context length becomes an issue when trying to fit everything on a GPU with less RAM.
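
A rough sketch of the usual VRAM-saving knobs when long contexts won't fit: truncate sequences, checkpoint activations, and trade batch size for gradient accumulation. This assumes Hugging Face transformers and a dataset with a "text" field; the model id and values are illustrative, not benchmarked:

```python
from transformers import AutoTokenizer, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-70b-hf")  # placeholder

def tokenize(example):
    # Cap context length: activation memory grows quickly with sequence length
    return tokenizer(example["text"], truncation=True, max_length=1024)

args = TrainingArguments(
    output_dir="qlora-out",
    per_device_train_batch_size=1,     # smallest micro-batch that fits
    gradient_accumulation_steps=16,    # recover the effective batch size
    gradient_checkpointing=True,       # recompute activations instead of storing them
    bf16=True,
    learning_rate=2e-4,
    num_train_epochs=1,
)
```
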

lostmsu · 1d ago
Clickbait. They fine-tune. Still sounds potentially useful.
nvtop · 1d ago
March 2024
mawadev · 1d ago
[flagged]
dang · 1d ago
"Don't be snarky."

"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."

https://news.ycombinator.com/newsguidelines.html

WesolyKubeczek · 1d ago
Better do it in winter, when you could use extra heat anyway.
mawadev · 1d ago
Thank you for your advice; I will take it into account when I train my 70B language model at home on winter days.
WesolyKubeczek · 1d ago
Everyone trains their 70B language model at home, even if they won't admit it. It's our dirty little secret.
smnplk · 1d ago
winter is coming