Atlas: Learning to Optimally Memorize the Context at Test Time
36 points by og_kalu | 5/31/2025, 2:13:00 PM | arxiv.org
Comments (2)
cgearhart · 13h ago
It seems like there’s been a lot of progress here, but there’s also an elephant in the room: RNNs will _always_ have worse memory than self-attention, since the latter has complete access to the full context. We pay for that access in other ways, but the unstated hypothesis behind RNN work seems to be that in the long run RNNs will be “good enough” and their other performance benefits will eventually prevail. I’m not convinced humanity will ever sink resources into optimizing this family of models comparable to what has gone into making Transformers practical at the scale they run today.
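To make the tradeoff concrete, here is a minimal NumPy sketch (all dimensions and weight matrices are made up for illustration, not taken from any paper): attention’s KV cache grows with the context and supports exact lookup over every position, while an RNN folds the same tokens into a fixed-size, lossy state.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 1024, 64                      # context length, model dim
X = rng.standard_normal((T, d))      # stand-in token representations

# Self-attention: the KV cache keeps a key and value for all T tokens, so
# a query can retrieve any position -- at O(T * d) memory per layer.
Wk, Wv = rng.standard_normal((d, d)), rng.standard_normal((d, d))
K, V = X @ Wk, X @ Wv
q = rng.standard_normal(d)
w = np.exp(K @ q / np.sqrt(d))
attn_out = (w / w.sum()) @ V         # weighted lookup over the full context

# RNN: the same T tokens get folded into one d-sized state, so memory is
# O(d) no matter how long the context -- but the summary is lossy.
Wh = rng.standard_normal((d, d)) * 0.05
Wx = rng.standard_normal((d, d)) * 0.05
h = np.zeros(d)
for x in X:                          # stream the context one token at a time
    h = np.tanh(Wh @ h + Wx @ x)

print(K.nbytes + V.nbytes, h.nbytes) # KV cache grows with T; h stays fixed
```

The “elephant” is exactly this last line: the attention state scales with T, the recurrent state does not, and the question is whether the lossy summary can be made good enough.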
arjvik · 15h ago
The Titans paper and the Test-Time Training paper (https://arxiv.org/abs/2407.04620) share the same premise: models should "learn" from their context rather than memorize it. Very promising direction!
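Roughly, the shared idea is that the memory is itself the weights of a small model, and processing the context means taking gradient steps on those weights at test time. Here is a toy sketch in that spirit (a linear associative memory trained with the delta rule; it is not either paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 32                        # memory dim, number of context "tokens"

# Hypothetical per-token key/value pairs extracted from the context.
keys = rng.standard_normal((n, d))
keys /= np.linalg.norm(keys, axis=1, keepdims=True)  # unit-norm keys
vals = rng.standard_normal((n, d))

M = np.zeros((d, d))                 # memory = learnable weights
lr = 1.0                             # with unit-norm keys this is the delta rule
for _ in range(50):                  # a few passes over the context
    for k, v in zip(keys, vals):
        err = M @ k - v              # gradient of 0.5 * ||M k - v||^2 w.r.t. M k
        M -= lr * np.outer(err, k)   # one SGD step per context token

# Querying the memory is just a forward pass through the adapted weights.
print(np.linalg.norm(M @ keys[0] - vals[0]))  # small: the pair was "learned"
```

The contrast with a KV cache is that nothing here is stored verbatim; recall works only to the extent that the gradient updates encoded the association into the weights.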