Setol: SemiEmpirical Theory of (Deep) Learning

Comments (2)

charleshmartin · 1d ago

We present a SemiEmpirical Theory of Learning (SETOL) that explains the remarkable performance of State-Of-The-Art (SOTA) Neural Networks (NNs). We provide a formal explanation of the origin of the fundamental quantities in the phenomenological theory of Heavy-Tailed Self-Regularization (HTSR): the heavy-tailed power-law layer quality metrics, alpha and alpha-hat. In prior work, these metrics have been shown to predict trends in the test accuracies of pretrained SOTA NN models, importantly, without needing access to either testing or training data. Our SETOL uses techniques from statistical mechanics as well as advanced methods from random matrix theory and quantum chemistry. The derivation suggests new mathematical preconditions for ideal learning, including a new metric, ERG, which is equivalent to applying a single step of the Wilson Exact Renormalization Group. We test the assumptions and predictions of SETOL on a simple 3-layer multilayer perceptron (MLP), demonstrating excellent agreement with the key theoretical assumptions. For SOTA NN models, we show how to estimate the individual layer qualities of a trained NN by simply computing the empirical spectral density (ESD) of the layer weight matrices and plugging this ESD into our SETOL formulas. Notably, we examine the performance of the HTSR alpha and the SETOL ERG layer quality metrics, and find that they align remarkably well, both on our MLP and on SOTA NNs.
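As a rough illustration of the "compute the ESD of a layer weight matrix" step mentioned above, here is a minimal Python sketch. It is not the SETOL or HTSR implementation: the function names are hypothetical, the normalization X = WᵀW / N (with N the larger dimension of W) is an assumed convention, and the Hill-type maximum-likelihood estimator is swapped in as a simple stand-in for whatever power-law fit the authors actually use. The tail size `k` is a free parameter chosen by hand here.

```python
import numpy as np


def empirical_spectral_density(weight):
    """Return the eigenvalues of X = W^T W / N, sorted ascending.

    Assumption: N is taken as the larger dimension of W.
    The eigenvalues of W^T W are the squared singular values of W.
    """
    n_rows = max(weight.shape)
    singular_values = np.linalg.svd(weight, compute_uv=False)
    return np.sort(singular_values ** 2 / n_rows)


def hill_alpha(eigenvalues, k):
    """Estimate the exponent alpha of a density p(x) ~ x^(-alpha).

    Uses the k largest values, with the (k+1)-th largest as x_min
    (continuous power-law MLE, i.e. a Hill-type estimator).
    """
    tail = np.sort(eigenvalues)[-(k + 1):]
    x_min = tail[0]
    gamma = np.mean(np.log(tail[1:] / x_min))
    return 1.0 + 1.0 / gamma


if __name__ == "__main__":
    # Hypothetical stand-in for a trained layer: a random Gaussian matrix.
    rng = np.random.default_rng(0)
    layer = rng.normal(size=(512, 256))
    esd = empirical_spectral_density(layer)
    print("alpha estimate:", hill_alpha(esd, k=50))
```

A Gaussian random matrix is not heavy-tailed, so the number printed for it is only a smoke test; the estimate becomes meaningful only for weight matrices whose ESD actually has a power-law tail.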

aprp1 · 23h ago

Amazing work Charles!
