Diffusion-based alternative to self-attention

3 deepGem 1 9/3/2025, 8:17:14 PM github.com ↗

Comments (1)

deepGem · 2h ago
I spent a few weeks trying to build an alternative to self attention that scales memory linearly. I I got surprisingly good results. While in principle this makes a lot of sense, I am struggling to push the test accuracy above 86%.
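To make that concrete, here is a minimal, self-contained sketch of one way diffusion-style token mixing with linear memory can look: a few explicit steps of the discrete heat equation along the sequence, with a learnable per-channel diffusion rate. This is illustrative only, not the code in the repo, and it mixes bidirectionally (no causal masking), so a next-token setup would need a one-sided stencil.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiffusionMixer(nn.Module):
    """Token mixing via a few steps of discrete heat diffusion along
    the sequence. No n x n attention map is ever materialized, so
    activation memory is O(n * d) instead of O(n^2)."""

    def __init__(self, dim: int, steps: int = 4):
        super().__init__()
        self.steps = steps
        # Learnable per-channel diffusion rate, squashed into (0, 0.5)
        # so each explicit Euler step stays numerically stable.
        self.rate = nn.Parameter(torch.zeros(dim))
        # Fixed 1D Laplacian stencil [1, -2, 1], applied depthwise.
        kernel = torch.tensor([1.0, -2.0, 1.0]).view(1, 1, 3).repeat(dim, 1, 1)
        self.register_buffer("laplacian", kernel)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim); conv1d wants (batch, dim, seq_len).
        h = x.transpose(1, 2)
        alpha = 0.5 * torch.sigmoid(self.rate).view(1, -1, 1)
        for _ in range(self.steps):
            lap = F.conv1d(h, self.laplacian, padding=1, groups=h.shape[1])
            h = h + alpha * lap  # one explicit step of the heat equation
        return h.transpose(1, 2)
```

Stacked in place of attention blocks, something like this keeps memory flat in sequence length, but each layer only spreads information a few tokens per step, which is the weakness the alternatives below are meant to address.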

Some alternatives I am considering next:

1. Diffusion with sparse attention layers (rough sketch below).
2. Hierarchical diffusion: next-token diffusion combined with higher-order chunk diffusion (rough sketch below).
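For option 1, the sparse piece could be ordinary attention restricted to fixed local windows, interleaved between diffusion layers so most of the stack stays linear in memory. A sketch, reusing the imports above (the window size and head count are placeholders):

```python
class LocalWindowAttention(nn.Module):
    """Self-attention restricted to non-overlapping local windows.
    Each window attends only within itself, so memory grows as
    O(n * window) rather than O(n^2). Assumes dim % heads == 0."""

    def __init__(self, dim: int, window: int = 64, heads: int = 4):
        super().__init__()
        self.window, self.heads = window, heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape
        w = self.window
        pad = (w - n % w) % w  # pad so the length divides into windows
        if pad:
            x = F.pad(x, (0, 0, 0, pad))
        q, k, v = self.qkv(x).chunk(3, dim=-1)

        def split(t):  # fold windows into the batch dimension
            t = t.reshape(b, -1, w, self.heads, d // self.heads)
            return t.flatten(0, 1).transpose(1, 2)

        out = F.scaled_dot_product_attention(split(q), split(k), split(v))
        out = out.transpose(1, 2).reshape(b, -1, d)[:, :n]
        return self.proj(out)
```

For option 2, one simple reading of combining next-token diffusion with higher-order chunk diffusion is a two-level mixer: diffuse at the token level, mean-pool fixed-size chunks, diffuse again over the chunk summaries, then combine the two paths. Again only a sketch of the shape of the idea, reusing DiffusionMixer from above (the chunk size is a placeholder):

```python
class HierarchicalDiffusion(nn.Module):
    """Two-level mixing: a fine diffusion pass over tokens plus a
    coarse diffusion pass over mean-pooled chunk summaries, so
    information travels further than a few local smoothing steps."""

    def __init__(self, dim: int, chunk: int = 32):
        super().__init__()
        self.chunk = chunk
        self.token_mixer = DiffusionMixer(dim)
        self.chunk_mixer = DiffusionMixer(dim)
        self.combine = nn.Linear(2 * dim, dim)  # merge fine + coarse paths

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape
        c = self.chunk
        pad = (c - n % c) % c
        h = F.pad(x, (0, 0, 0, pad))
        # Coarse path: mean-pool each chunk, diffuse over the summaries,
        # then broadcast each summary back to its tokens.
        coarse = self.chunk_mixer(h.reshape(b, -1, c, d).mean(dim=2))
        coarse = coarse.repeat_interleave(c, dim=1)[:, :n]
        fine = self.token_mixer(x)  # fine path: token-level diffusion
        return self.combine(torch.cat([fine, coarse], dim=-1))
```

Both sketches are bidirectional for simplicity; a causal variant would mask the attention windows and use one-sided stencils and pooling.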

I am still figuring out the code, and I would love any feedback on these approaches.