TREAD: Token Routing for Efficient Architecture-Agnostic Diffusion Training

Comments (5)

platers · 2h ago

I'm struggling to understand where the gains are coming from. What is the intuition for why DiT training was so inefficient?

joshred · 2h ago

This is the high-level explanation of the simplest diffusion architecture. The model trains by taking an image and iteratively adding noise to the image until there is only noise. Then they take that sequence of noisier and noisier images and they reverse it. The result is that they start with only noise, and they predict the removal of noise at step until they get to the final step (which should be the original image (or training input)).

That process means they may require a hundred or more training iterations on a single image. I haven't digested the paper, but it sounds like they are proposing something conceptually similar to skip layers (but significantly more involved).

arjvik · 1h ago

Isn't this just Mixture-of-Depths but for DiTs?

If so, what are the DiT specific changes that needed to be made?

lucidrains · 1h ago

very nice, will have to try it out! this is the same research group from which Robin Rombach (of stable diffusion fame) originated from

earthnail · 2h ago

Wow, Ommer’s students never fail to impress. 37x faster for a generic architecture, ie no domain specific tricks. Insane.

Ashby (YC W19) Is Hiring Design Engineers in AMER and EMEA (ashbyhq.com)

EasyPost (YC S13) Is Hiring (easypost.com)

Tesorio (YC S15) Is Hiring a Senior GenAI Engineer (100% Remote) (tesorio.com)

OneSignal (YC S11) Is Hiring Engineers (onesignal.com)

Axle (YC S22) is hiring product engineers (ycombinator.com)

Mbodi AI (YC X25) Is Hiring a Founding Research Engineer (Robotics) (ycombinator.com)

ReadMe (YC W15) Is Hiring a Developer Experience PM (readme.com)

Weave (YC W25) is hiring a founding AI engineer (ycombinator.com)

Depot (YC W23) Is Hiring a Community and Events Manager (Remote) (ycombinator.com)

CoLoop (YC S21) Is Hiring AI Engineers in London

Trellis (YC W24) Is Hiring: Automate Prior Auth in Healthcare (ycombinator.com)

Type (YC W23) is hiring a founding engineer to build an AI-native doc editor (ycombinator.com)

Foundry (YC F24) is hiring staff-level product engineers (ycombinator.com)

GoGoGrandparent (YC S16) Is Hiring Back End and Full-Stack Engineers

Kyber (YC W23) is hiring enterprise account executives (ycombinator.com)

Converge (YC S23) well-capitalized New York startup seeks product developers (runconverge.com)

Great Question (YC W21) Is Hiring a VP of Engineering (Remote) (ycombinator.com)

Coverage Cat (YC S22) Is Hiring a Senior, Staff, or Principal Engineer (coveragecat.com)

Kaizen (YC X25) is hiring engineers to build browser agents that work (kaizenautomation.com)

Infracost (YC W21) hiring first PM to shift $600B cloud spend to proactive (ycombinator.com)

Sei (YC W22) Is Hiring a Full Stack Engineer in Chennai, India (ycombinator.com)

Artie (YC S23) Is Hiring Founding AEs (ycombinator.com)

Cedana (YC S23) Is Hiring a Systems Engineer (ycombinator.com)

CodeCrafters (YC S22) is hiring first Marketing Person (ycombinator.com)

PAX Markets (YC W25) is hiring a founding principal hardware (RTL) engineer (ycombinator.com)

Sendblue (YC S23) is hiring senior engineers (ycombinator.com)

Thunder Compute (YC S24) Is Hiring a C++ Systems Engineer (ycombinator.com)

Optery (YC W22) Is Hiring in Engineering, Legal, Sales, Marketing (U.S., Latam) (optery.com)

QuestDB (YC S20) Is Hiring a Technical Content Lead (questdb.com)

Depot (YC W23) Is Hiring a Technical Content Writer (Remote) (ycombinator.com)

Firebender (YC W24) Is Hiring (ycombinator.com)

TREAD: Token Routing for Efficient Architecture-Agnostic Diffusion Training

Comments (5)