Train a 70b language model at home (2024)

71 points | amrrs | 13 comments | 7/24/2025, 7:13:14 PM | answer.ai ↗

Comments (13)

doctoboggan · 1d ago
Maybe I misunderstand, but it seems like they are using LoRA, which is a fine-tuning method. That requires an already existing trained LLM. If that's true, I think the title of this submission is inaccurate, as this doesn't let you train a model from scratch with 2 consumer GPUs.
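
For reference, a minimal sketch of what QLoRA-style fine-tuning looks like in practice, assuming the Hugging Face transformers, peft, and bitsandbytes libraries; the model id and hyperparameters are illustrative placeholders, not the article's exact setup:

```python
# QLoRA-style fine-tuning sketch: load a 4-bit quantized base model and
# attach LoRA adapters, so only the small adapter weights are trained.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-2-70b-hf"  # placeholder 70B base model

# 4-bit quantization of the frozen base weights (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapters: small trainable matrices injected into attention projections
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # a tiny fraction of the 70B total
```

The point of the sketch: the full pre-trained base model is a prerequisite and stays frozen; only the adapters are learned, which is what makes this fine-tuning rather than training from scratch.
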
botro · 1d ago
Yes, they put this in footnote 1: "Throughout this article “training” can refer to either pre-training, or fine-tuning." But the article is just talking about fine-tuning.
oceanplexian · 1d ago
"The thing the word actually means isn't the way we're using it" isn't how I would use a footnote.
underlines · 1d ago
I hand-curate github.com/underlines/awesome-ml, so I read a ton about the latest trends in this space. When I started to read the article, a lot of the information felt weirdly familiar and almost outdated.

The space is moving fast, after all. They just seem to be explaining QLoRA fine-tuning (yes, a great achievement, and all the folks involved are heroes), but reading it as a trending article on HN, it felt off.

Turns out I was too dumb to check the date: 2024. And the title is mixing up quantized-adapter fine-tuning with base-model training. Thanks lol

darkbatman · 1d ago
Would be nice to see some benchmarks.

Also, from my experience you need more compute power to get a significant result. Fine-tuning mostly works when the base model is already very close to what you are trying to achieve, and even then you won't be very happy with the results.

Also, context length becomes an issue when trying to fit everything on a GPU with less RAM.
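
A rough sketch of the usual VRAM-saving knobs when long contexts won't fit: truncate sequences, checkpoint activations, and trade batch size for gradient accumulation. This assumes Hugging Face transformers and a dataset with a "text" field; the model id and values are illustrative, not benchmarked:

```python
from transformers import AutoTokenizer, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-70b-hf")  # placeholder

def tokenize(example):
    # Cap context length: activation memory grows quickly with sequence length
    return tokenizer(example["text"], truncation=True, max_length=1024)

args = TrainingArguments(
    output_dir="qlora-out",
    per_device_train_batch_size=1,     # smallest micro-batch that fits
    gradient_accumulation_steps=16,    # recover the effective batch size
    gradient_checkpointing=True,       # recompute activations instead of storing them
    bf16=True,
    learning_rate=2e-4,
    num_train_epochs=1,
)
```
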

lostmsu · 1d ago
Clickbait. They fine-tune. Still sounds potentially useful.
nvtop · 1d ago
March 2024
mawadev · 1d ago
[flagged]
dang · 1d ago
"Don't be snarky."

"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."

https://news.ycombinator.com/newsguidelines.html

WesolyKubeczek · 1d ago
Better do it in winter, when you could use extra heat anyway.
mawadev · 1d ago
Thank you for your advice; I will take it into account when I train my 70B language model at home on winter days.
WesolyKubeczek · 1d ago
Everyone trains their 70B language model at home, even if they won't admit it. It's our dirty little secret.
smnplk · 1d ago
winter is coming