To be clear, this is not a model trained on zero data: it's a pretrained model (Qwen 2.5, trained on 18 trillion tokens) finetuned on self-generated data grounded by a Python interpreter.
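For anyone unclear on what "grounded by a Python interpreter" means in practice, here's a rough sketch of the idea (my own simplification, not the paper's actual pipeline; the function names are made up): the model proposes a program and an input, the interpreter executes them, and the observed output becomes the verified label the solver is later trained against.

    # Sketch of interpreter grounding: a proposed program is executed, and the
    # execution result, not a human label, supplies the ground truth for a
    # self-generated training example. Hypothetical helper names throughout.

    def run_in_interpreter(program_src: str, input_value):
        """Execute a proposed program and return its output."""
        namespace = {}
        exec(program_src, namespace)          # define the proposed function f
        return namespace["f"](input_value)    # ground truth comes from execution

    def make_training_triplet(program_src: str, input_value):
        """Build a (program, input, output) triplet for the solver to reproduce."""
        try:
            output = run_in_interpreter(program_src, input_value)
        except Exception:
            return None                        # reject proposals that don't run
        return {"program": program_src, "input": input_value, "output": output}

    if __name__ == "__main__":
        proposed = "def f(x):\n    return sorted(set(x))"
        print(make_training_triplet(proposed, [3, 1, 3, 2]))
        # {'program': ..., 'input': [3, 1, 3, 2], 'output': [1, 2, 3]}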
scotty79 · 2h ago
I think at this point the initial process of exposing the empty model to all the available domain data in bulk is no longer interesting to many people. It's an obvious first step so it's barely mentioned anymore. What's currently worked on is what you do afterwards to get a useful tool in the end.
macrolime · 4h ago
Pretty sure OpenAI and/or DeepMind have already been doing something very similar for a while, just without publishing it.
FieryTransition · 3h ago
Agreed, it's a pretty obvious solution once you are immersed in the problem space. I think it's much harder to set up a training pipeline for this that gets every little detail right while remaining efficient.
gitroom · 41m ago
Sometimes I feel like the whole self-play thing is kinda the obvious path now, but it's still nuts seeing it actually work better than huge data dumps. Do you ever wonder how much of progress is just crazy good pipelines versus actual breakthroughs?
squillion · 3h ago
Warning: abuse of this technique may cause the model to go blind.
ogogmad · 1h ago
Is this a joke about wanking?
QuadmasterXLII · 2h ago
For everyone who says “modern incentives forbid publishing negative results,” let this stand as a counterexample!
fotcorn · 2h ago
Why do you think it's a negative result?
The table on page 9 shows great results.
ogogmad · 1h ago
I think it's a pun. AlphaZero? AlphaNegative.
andy_ppp · 6m ago
-273°C isn’t it?
Waterluvian · 2h ago
Related to this: has anyone seen a model respond with "oh wait, I was wrong…" when you follow up with "can you explain why this answer is right?"
I still find that GPT and the others struggle with a sort of tunnel vision.
mentalgear · 5h ago
"Despite using zero human-curated data, AZR achieves state-of-the-art results on diverse coding and math reasoning benchmarks, even outperforming models trained on large in-domain datasets. This demonstrates the potential for sophisticated reasoning skills to emerge purely through self-play without domain-specific supervision."