Reproducing the deep double descent paper

stpn · 6/5/2025, 6:34:23 PM · stpn.bearblog.dev

Comments (4)

lcrmorin · 3h ago
Do you change the regularisation?
davidguetta · 21h ago
Is this not because the longer you train, the more neurons 'die' (not utilized anymore because the gradient is flat on the dataset), so you effectively get a smaller model as training goes on?
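
A hypothesis like this can be checked directly by counting units that never fire. Below is a minimal PyTorch sketch, not from the post: `model` and `data_loader` are placeholders, and it assumes the nonlinearities are `nn.ReLU` modules (models that call `F.relu` functionally would need a different hook point).

```python
import torch

@torch.no_grad()
def dead_relu_fraction(model, data_loader, device="cpu"):
    """Fraction of ReLU units that never produce a positive output on the data."""
    alive = {}  # module name -> bool tensor, True where the unit fired at least once

    def make_hook(name):
        def hook(module, inputs, output):
            # Per-unit "did it fire?" over the batch. This counts individual
            # activation elements; for conv layers you may prefer per-channel stats.
            fired = (output > 0).flatten(1).any(dim=0).cpu()
            prev = alive.get(name)
            alive[name] = fired if prev is None else (prev | fired)
        return hook

    handles = [
        module.register_forward_hook(make_hook(name))
        for name, module in model.named_modules()
        if isinstance(module, torch.nn.ReLU)
    ]

    model.eval()
    for x, _ in data_loader:
        model(x.to(device))

    for h in handles:
        h.remove()

    dead = sum((~a).sum().item() for a in alive.values())
    total = sum(a.numel() for a in alive.values())
    return dead / max(total, 1)
```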
rsfern · 18h ago
I don't think so? The double descent phenomenon also occurs in linear models under the right conditions. My understanding is that when the effective model capacity exactly matches the information in the dataset, there is only one solution that interpolates the training data perfectly, but when the capacity increases far beyond this there are many such interpolating solutions. Apply enough regularization and you are likely to find an interpolating solution that generalizes well.
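
As a concrete illustration of the linear-model case, here is a generic toy sketch (none of these numbers or names come from the paper or the post): fit minimum-norm least squares on random ReLU features and sweep the feature count past the number of training points. The test error typically spikes near the interpolation threshold (features ≈ training points) and descends again as the feature count keeps growing.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, d, noise = 50, 2000, 10, 0.5

# Fixed linear teacher with noisy training labels.
w_star = rng.normal(size=d)
X_train = rng.normal(size=(n_train, d))
X_test = rng.normal(size=(n_test, d))
y_train = X_train @ w_star + noise * rng.normal(size=n_train)
y_test = X_test @ w_star

def relu_features(X, W):
    """Random ReLU feature map: phi(x) = max(x W, 0)."""
    return np.maximum(X @ W, 0.0)

for n_features in [5, 10, 25, 45, 50, 55, 75, 150, 500, 2000]:
    W = rng.normal(size=(d, n_features)) / np.sqrt(d)
    Phi_train = relu_features(X_train, W)
    Phi_test = relu_features(X_test, W)
    # pinv gives the least-squares fit when under-parameterized and the
    # minimum-norm interpolating solution once n_features >= n_train.
    beta = np.linalg.pinv(Phi_train) @ y_train
    test_mse = np.mean((Phi_test @ beta - y_test) ** 2)
    print(f"features={n_features:5d}  test MSE={test_mse:8.3f}")
```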
stpn · 15h ago
(post author here)

I was curious about this since it kind of makes sense, but here are a few reasons why I think this isn't the case:

- In the 10% noise case at least, the second descent eventually finds a minimum that's better than the original local minimum, which suggests to me the model really is finding a better fit rather than just reducing itself to a similar smaller model

- If that were the case, I think we'd also expect the error for larger models to converge to the performance of the smaller models. But instead they converge to a lower, better error

- I checked the logged gradient histograms I had for the runs (a sketch of the kind of per-layer gradient check I mean is below). While I'm still learning how to interpret the results, I didn't see signs of vanishing gradients where dead neurons later in the model prevented earlier layers from learning. Gradients do get smaller over time, but that seems expected, and we don't see big waves of neurons dying, which is what I'd expect if the larger network were converging on the size of the smaller one.
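
For concreteness, a check along these lines can be approximated without full histograms by logging per-layer gradient norms. This is a generic PyTorch sketch, not the post's actual logging code; `model`, `criterion`, `x`, and `y` are placeholders.

```python
import torch

def layer_grad_norms(model: torch.nn.Module) -> dict[str, float]:
    """L2 norm of each parameter's gradient; call after loss.backward().

    A rough vanishing-gradient check: if norms for early layers collapse
    toward zero while later layers stay healthy, the early layers have
    effectively stopped learning.
    """
    return {
        name: param.grad.norm().item()
        for name, param in model.named_parameters()
        if param.grad is not None
    }

# Usage sketch:
#   loss = criterion(model(x), y)
#   loss.backward()
#   for name, g in layer_grad_norms(model).items():
#       print(f"{name:40s} {g:.3e}")
```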