Pretraining an LLM with a less than $50 budget that outperforms Google BERT
6 mrxhacker99 1 9/1/2025, 8:02:32 PM medium.com
Comments (1)
spindump8930 · 6h ago
The title makes it sound nice, but the reported results are worse than random baselines on several benchmarks, including the ones used to claim superiority over BERT. At a glance, HellaSwag, BoolQ, and WinoGrande are all at or below random guessing. At best this is a fun model with a broken evaluation. At worst it's Medium spam for clout farming, which won't work on anyone who can read the tables.
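(A quick way to see the commenter's point is to compare a reported accuracy against the chance level of each benchmark. The sketch below is a minimal illustration assuming the standard number of answer choices per task; the scores passed in are placeholders, not the article's actual numbers.)

```python
# Hedged sketch: chance-level accuracy for the benchmarks named above.
# The answer-choice counts are the standard task formats; the example scores
# below are hypothetical placeholders, not the article's reported results.
chance_levels = {
    "HellaSwag": 1 / 4,   # 4-way multiple choice -> 25% by random guessing
    "BoolQ": 1 / 2,       # yes/no questions -> 50% by coin flip
    "WinoGrande": 1 / 2,  # binary cloze choice -> 50% by coin flip
}

def verdict(benchmark: str, reported_accuracy: float) -> str:
    """Label a reported accuracy relative to the benchmark's chance level."""
    chance = chance_levels[benchmark]
    if reported_accuracy <= chance:
        return f"{benchmark}: {reported_accuracy:.1%} is at or below chance ({chance:.0%})"
    return f"{benchmark}: {reported_accuracy:.1%} beats chance ({chance:.0%})"

# Placeholder scores for illustration only:
for name, score in [("HellaSwag", 0.25), ("BoolQ", 0.49), ("WinoGrande", 0.50)]:
    print(verdict(name, score))
```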