From Multi-Head to Latent Attention: The Evolution of Attention Mechanisms

86 points · mgninad · 8/30/2025, 5:45:24 AM · vinithavn.medium.com

Comments (18)

attogram · 5h ago
"Attention Is All You Need" - I've always wondered if the authors of that paper used such a casual and catchy title because they knew it would be groundbreaking and massively cited in the future....
sivm · 2h ago
Attention is all you need for what we have. But attention is a local heuristic. We have brittle coherence and no global state. I believe we need a paradigm shift in architecture to move forward.
ACCount37 · 8m ago
Plenty of "we need a paradigm shift in architecture" going around - and no actual architecture that would beat transformers at their strengths as far as the eye can see.

I remain highly skeptical. I doubt that transformers are the best architecture possible, but they set a high bar. And it sure seems like people who keep making the suggestion that "transformers aren't the future" aren't good enough to actually clear that bar.

treyd · 1h ago
Has there been research into some hierarchical attention model that has local attention at the scale of sentences and paragraphs that feeds embeddings up to longer range attention across documents?
mxkopy · 39m ago
There’s the hierarchical reasoning model https://arxiv.org/abs/2506.21734 but it’s very new and largely untested

Though honestly I don’t think new neural network architectures are going to get us over this local maximum, I think the next steps forward involve something that’s

1. Non lossy

2. Readily interpretable

miven · 4m ago
The ARC Prize Foundation ran extensive ablations on HRM for their slew of reasoning tasks and noted that the "hierarchical" part of their architecture is not much more impactful than a vanilla transformer of the same size with no extra hyperparameter tuning:

https://arcprize.org/blog/hrm-analysis#analyzing-hrms-contri...

adastra22 · 4h ago
Definitely. I always assumed that, having been involved in writing similarly groundbreaking papers… or so we thought at the time. All my coauthors spent significant time thinking about what the best title would be, and strategies like that were common. (It ended up not mattering for us.)
iLoveOncall · 2h ago
I recommend reading this article which explains how you can get your papers accepted, and explains that a catchy title is the #1 most important thing: https://maxwellforbes.com/posts/how-to-get-a-paper-accepted/ (not a plug, I just saved it because it was interesting)
hyperbovine · 2h ago
It sounds like a typical NeurIPS paper to me. And no, they did not know what a big deal it would be, else Google never would have given the idea away.
JSR_FDED · 5h ago
Any way to read this without making an account?
djoldman · 1h ago
just turn off JS.
qcnguy · 4h ago
Just click the x at the top right of the interstitial?
iLoveOncall · 2h ago
That only works for a few articles per month. But usually opening in incognito does the trick.
mrtesthah · 6h ago
Do we know if any of these techniques are actually used in the so-called "frontier" models?
gchadwick · 2h ago
Who knows what the closed-source models use, but going by what's happening in open models, all the big changes and corresponding gains in capability are in training techniques, not model architecture. Things like GQA and MLA, as discussed in this article, are important techniques for getting better scaling, but they're relatively minor tweaks vs the evolution in training techniques.

I suspect closed models aren't doing anything too radically different from what's presented here.
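To make the "minor tweak" concrete: the core idea of GQA is just that several query heads share one key/value head, which shrinks the KV cache by the sharing factor. A minimal sketch (not from the article; the function name and shapes are illustrative assumptions, using plain NumPy rather than a real framework):

```python
import numpy as np

def grouped_query_attention(q, k, v, n_kv_heads):
    """Illustrative GQA sketch: q has more heads than k/v; each group
    of query heads shares one KV head, shrinking the KV cache."""
    n_q_heads, seq_len, d = q.shape
    group = n_q_heads // n_kv_heads      # query heads per KV head
    # Repeat each KV head so it lines up with its query-head group
    k = np.repeat(k, group, axis=0)      # -> (n_q_heads, seq_len, d)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)   # softmax over keys
    return weights @ v                   # (n_q_heads, seq_len, d)

# 8 query heads sharing 2 KV heads -> KV cache is 4x smaller than MHA
q = np.random.randn(8, 16, 64)
kv = np.random.randn(2, 16, 64)
out = grouped_query_attention(q, kv, kv, n_kv_heads=2)
print(out.shape)  # (8, 16, 64)
```

With `n_kv_heads` equal to the query head count this degenerates to standard multi-head attention, and with `n_kv_heads=1` to multi-query attention; MLA goes further by caching a low-rank latent instead of full K/V heads.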

vinithavn01 · 5h ago
The model names are mentioned under each type of attention mechanism.