Lago (Open-Source Usage Based Billing) is hiring for ten roles (ycombinator.com)

1 points by AnhTho_FR 23h ago 0 comments

Spark AI (YC W24) is hiring a full-stack engineer in SF (founding team) (ycombinator.com)

1 points by juliawu 1d ago 0 comments

Bitmovin (YC S15) Is Hiring a Junior Solutions Engineer in Denver (bitmovin.com)

1 points by slederer 1d ago 0 comments

SigNoz (YC W21, Open Source Datadog) Is Hiring DevRel Engineers (Remote)(US) (ycombinator.com)

1 points by pranay01 2d ago 0 comments

AccessOwl (YC S22) is hiring an Elixir Engineer to connect 100s of SaaS (ycombinator.com)

1 points by mathiasn 3d ago 0 comments

FurtherAI (YC W24) Is Hiring for Software and AI Roles (ycombinator.com)

1 points by sgondala_ycapp 3d ago 0 comments

Yarn (YC W24) is hiring engineers in NYC (ycombinator.com)

1 points by jasperstory 3d ago 0 comments

Expand.ai (YC S24) is hiring a founding engineer

1 points by timsuchanek 4d ago 0 comments

Optifye.ai (YC W25) is hiring a back end engineer

1 points by Vivaan_Baid 6d ago 0 comments

Kastle (S24) is hiring an engineer (ycombinator.com)

1 points by rishi443 6d ago 0 comments

Weave (YC W25) is hiring a founding AI engineer (ycombinator.com)

1 points by adchurch 7d ago 0 comments

Qfex (YC X25) – Back End Engineer for a 24/7 Stock Exchange (ycombinator.com)

1 points by NPDW 9d ago 0 comments

Attimet (YC F24) – Quant Trading Research Lab – Is Hiring Founding Engineer (ycombinator.com)

1 points by kbanothu 10d ago 0 comments

Jiga (YC W21) Is Hiring Software Engs to Make Life of Mech Engs Easier (workatastartup.com)

1 points by grmmph 10d ago 0 comments

Foundry (YC F24) Hiring Early Engineer to Build Web Agent Infrastructure (ycombinator.com)

1 points by lakabimanil 11d ago 0 comments

Blaze (YC S24) Is Hiring (ycombinator.com)

1 points by faiyamrahman 12d ago 0 comments

Infracost (YC W21) is hiring software engineers (GMT+2 to GMT-6) (infracost.io)

1 points by aliscott 12d ago 0 comments

Solidroad (YC W25) Is Hiring (solidroad.com)

1 points by pjfin 14d ago 0 comments

Kyber (YC W23) Is Hiring a Technical Account Manager (ycombinator.com)

1 points by asontha 15d ago 0 comments

Roundtable (YC S23) Is Hiring a President / CRO (ycombinator.com)

1 points by timshell 16d ago 0 comments

Roame (YC S23) Is Hiring (ycombinator.com)

1 points by zman0225 16d ago 0 comments

GauntletAI (YC S17): All expenses paid AI training and guaranteed $200k+ job (gauntletai.com)

1 points by austenallred 16d ago 0 comments

SchemeFlow (YC S24) Is Hiring an Engineer (London) to Speed Up Construction (ycombinator.com)

1 points by andrewkinglear 17d ago 0 comments

Shaped (YC W22) Is Hiring (ycombinator.com)

1 points by tullie 17d ago 0 comments

Spice Data (YC S19) is hiring a software engineer – back end (ycombinator.com)

1 points by richard_pepper 17d ago 0 comments

Onlook (YC W25) Is Hiring an engineer in SF

1 points by D_R_Farrell 18d ago 0 comments

OneText (YC W23) Is Hiring a DevOps/DBA Lead Engineer (jobs.ashbyhq.com)

1 points by bluepnume 21d ago 0 comments

Gander (YC F24) Is Hiring Founding Engineers and Interns (ycombinator.com)

1 points by arjanguglani 21d ago 0 comments

Ziina (YC W21) the Series A fintech is hiring product engineers (ziina.notion.site)

1 points by faisaltoukan 22d ago 0 comments

Onyx (YC W24) – AI Assistants for Work Hiring Founding AE (ycombinator.com)

1 points by yuhongsun 22d ago 0 comments

Universal pre-training by iterated random computation

31 points by liamdgray · 6 comments · 6/29/2025, 1:12:32 AM · arxiv.org

Comments (6)

visarga · 4h ago

Results are modest, maybe 20-30% fewer training steps to reach target performance. This won't solve the problem of organic data exhaustion. We need 100x more data.

They didn't test against actual language model pretraining, only tested against a random init.

- A: Pre-trained on their synthetic LSTM data -> fine-tuned on Wikipedia

- B: Pre-trained on different natural language corpus -> fine-tuned on Wikipedia

- C: Random initialization -> fine-tuned on Wikipedia

They only test A vs C, not A vs B.
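A rough sketch of how the "fewer training steps to reach target performance" comparison across arms A, B, and C could be computed. The function names, the loss curves, and the target value are all invented for illustration; none of the numbers come from the paper.

```python
from typing import Optional, Sequence


def steps_to_target(losses: Sequence[float], target: float) -> Optional[int]:
    """Return the first 1-indexed step whose loss is at or below target, else None."""
    for step, loss in enumerate(losses, start=1):
        if loss <= target:
            return step
    return None


def relative_saving(baseline_steps: int, treated_steps: int) -> float:
    """Fraction of training steps saved relative to the baseline arm."""
    return (baseline_steps - treated_steps) / baseline_steps


# Toy fine-tuning loss curves, made up purely to exercise the functions.
curves = {
    "A_synthetic_pretrain": [3.0, 2.4, 2.0, 1.7, 1.5, 1.4, 1.3, 1.25, 1.2, 1.2],
    "B_natural_pretrain": [2.5, 1.9, 1.5, 1.3, 1.2, 1.15, 1.1, 1.1, 1.1, 1.1],
    "C_random_init": [4.0, 3.2, 2.6, 2.2, 1.9, 1.7, 1.6, 1.5, 1.45, 1.4],
}
target = 1.5  # hypothetical target loss

steps = {name: steps_to_target(curve, target) for name, curve in curves.items()}
saving_a_vs_c = relative_saving(steps["C_random_init"], steps["A_synthetic_pretrain"])
saving_a_vs_b = relative_saving(steps["B_natural_pretrain"], steps["A_synthetic_pretrain"])
```

With real curves, a negative `saving_a_vs_b` would mean natural-language pre-training still reaches the target sooner, which is the A vs B comparison that is missing.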

impossiblefork · 33m ago

20-30% isn't modest. I think there is a big problem, though: it's character-level prediction.

It's not obvious how to generate this kind of good synthetic data when it's to be fed to a tokenized model.

WithinReason · 3h ago

This paper addresses the problem of running out of data. You can't do B when you've run out of data, so it's irrelevant.

bionhoward · 7h ago

This is a cool concept, but I can’t help wishing there were more comparison between the treatment group and a control group that doesn’t see any universal pretraining data.

It’s good to compare various model sizes, evaluation tasks, and random data generators. I just think the paper would prove its point more effectively if it showed that models of the same size that see this random data learn better from evaluation data later on.

Could even compare the initial checkpoint of the model before universal pretraining against the pretrained checkpoint. If the method works, the one that did UP will win.

Maybe I’m way off, I’ll admit I only skimmed it so far. Seems promising, just wishing for some controls.

yorwba · 4h ago

In figures 2, 4, and 6, the top left end of the training curves represents models that have not seen any pretraining data. In figure 5, they're represented by dashed curves.

liamdgray · 9h ago

Abstract: "We investigate the use of randomly generated data for the sake of pre-training a model. We justify this approach theoretically from the perspective of algorithmic complexity, building on recent research that shows that sequence models can be trained to approximate Solomonoff induction. We derive similar, but complementary theoretical results. We show empirically that synthetically generated data can be used to pre-train a model before the data is seen. We replicate earlier results that models trained this way show zero-shot in-context learning across a variety of datasets, and that this performance improves with scale. We extend earlier results to real-world data, and show that finetuning a model after pre-training offers faster convergence and better generalization."
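For anyone trying to picture "iterated random computation", here is a minimal sketch under loud assumptions: start from uniform random tokens and repeatedly pass them through freshly sampled, untrained recurrent networks, sampling an output token at each step. The thread mentions synthetic LSTM data; this sketch swaps in a plain tanh RNN to stay short, and every name and parameter (`random_rnn_pass`, `iterated_random_data`, `vocab`, `hidden`, `depth`) is hypothetical rather than taken from the paper.

```python
import math
import random
from typing import List


def random_matrix(rng: random.Random, rows: int, cols: int, scale: float) -> List[List[float]]:
    """Sample a rows x cols matrix of Gaussian weights."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def random_rnn_pass(tokens: List[int], vocab: int, hidden: int, rng: random.Random) -> List[int]:
    """Feed tokens through a freshly sampled random RNN, sampling one output token per step."""
    w_in = random_matrix(rng, hidden, vocab, 1.0)
    w_rec = random_matrix(rng, hidden, hidden, 1.0 / math.sqrt(hidden))
    w_out = random_matrix(rng, vocab, hidden, 2.0)
    state = [0.0] * hidden
    out = []
    for tok in tokens:
        # One-hot input: column `tok` of w_in plus the recurrent contribution.
        state = [
            math.tanh(w_in[i][tok] + sum(w_rec[i][j] * state[j] for j in range(hidden)))
            for i in range(hidden)
        ]
        logits = [sum(w_out[v][i] * state[i] for i in range(hidden)) for v in range(vocab)]
        peak = max(logits)
        weights = [math.exp(x - peak) for x in logits]  # unnormalized softmax
        out.append(rng.choices(range(vocab), weights=weights, k=1)[0])
    return out


def iterated_random_data(length: int, vocab: int = 16, hidden: int = 8,
                         depth: int = 3, seed: int = 0) -> List[int]:
    """Uniform noise passed through `depth` independently sampled random RNNs."""
    rng = random.Random(seed)
    tokens = [rng.randrange(vocab) for _ in range(length)]
    for _ in range(depth):
        tokens = random_rnn_pass(tokens, vocab, hidden, rng)
    return tokens


sample = iterated_random_data(32)
```

With `depth=0` the output is plain uniform noise; each extra pass adds structure that a sequence model could, in principle, learn to predict.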