From Multi-Head to Latent Attention: The Evolution of Attention Mechanisms

73 points by mgninad | 15 comments | 8/30/2025, 5:45:24 AM | vinithavn.medium.com

Comments (15)

attogram · 4h ago
"Attention Is All You Need" - I've always wondered if the authors of that paper used such a casual and catchy title because they knew it would be groundbreaking and massively cited in the future....
sivm · 1h ago
Attention is all you need for what we have. But attention is a local heuristic. We have brittle coherence and no global state. I believe we need a paradigm shift in architecture to move forward.
treyd · 48m ago
Has there been research into some hierarchical attention model that has local attention at the scale of sentences and paragraphs that feeds embeddings up to longer range attention across documents?
adastra22 · 3h ago
Definitely. I always assumed that, having been involved in writing similarly groundbreaking papers… or so we thought at the time. All my coauthors spent significant time thinking about what the best title would be, and strategies like that were common. (It ended up not mattering for us.)
iLoveOncall · 1h ago
I recommend reading this article which explains how you can get your papers accepted, and explains that a catchy title is the #1 most important thing: https://maxwellforbes.com/posts/how-to-get-a-paper-accepted/ (not a plug, I just saved it because it was interesting)
hyperbovine · 1h ago
It sounds like a typical NeurIPS paper to me. And no, they didn't know what a big deal it would be, else Google never would have given the idea away.
JSR_FDED · 4h ago
Any way to read this without making an account?
djoldman · 3m ago
just turn off JS.
qcnguy · 3h ago
Just click the x at the top right of the interstitial?
iLoveOncall · 1h ago
That only works for a few articles per month, but usually opening in incognito does the trick.
mrtesthah · 5h ago
Do we know if any of these techniques are actually used in the so-called "frontier" models?
gchadwick · 1h ago
Who knows what the closed-source models use, but certainly going by what's happening in open models, all the big changes and corresponding gains in capability are in training techniques, not model architecture. Things like GQA and MLA, as discussed in this article, are important techniques for getting better scaling, but they are relatively minor tweaks versus the evolution in training techniques.

I suspect closed models aren't doing anything too radically different from what's presented here.
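For anyone who hasn't read the article: the "minor tweak" of GQA is just sharing one key/value head across a group of query heads, which shrinks the KV cache by the group factor. A minimal NumPy sketch under toy, illustrative shapes (function and variable names are my own, not from the article):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def grouped_query_attention(q, k, v):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d).
    Each group of n_q_heads // n_kv_heads query heads shares one
    K/V head, so the KV cache shrinks by that same factor."""
    n_q_heads, seq, d = q.shape
    n_kv_heads = k.shape[0]
    group = n_q_heads // n_kv_heads
    # Broadcast each K/V head across its group of query heads.
    k = np.repeat(k, group, axis=0)  # (n_q_heads, seq, d)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    return softmax(scores) @ v       # (n_q_heads, seq, d)

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4, 16))  # 8 query heads
k = rng.standard_normal((2, 4, 16))  # only 2 KV heads -> 4x smaller KV cache
v = rng.standard_normal((2, 4, 16))
out = grouped_query_attention(q, k, v)
print(out.shape)  # (8, 4, 16)
```

With n_kv_heads equal to n_q_heads this reduces to standard multi-head attention, and with a single KV head it reduces to multi-query attention; GQA sits anywhere in between.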

vinithavn01 · 4h ago
The model names are mentioned under each type of attention mechanism.