Launch HN: RunRL (YC X25) – Reinforcement learning as a service

20 points by ag8 · 9/17/2025, 4:13:00 PM · runrl.com · 3 comments
Hey HN, we’re Andrew and Derik at RunRL (https://runrl.com/). We've built a platform to improve models and agents with reinforcement learning. If you can define a metric, we'll make your model or agent better, without you having to think about managing GPU clusters.

Here's a demo video: https://youtu.be/EtiBjs4jfCg

I (Andrew) was doing a PhD in reinforcement learning on language models, and everyone kept...not using RL because it was too hard to get running. At some point I realized that someone's got to sit down and actually write a good platform for running RL experiments.

Once the platform existed, people started using it for antiviral design, formal verification, browser agents, and a bunch of other cool applications, so we decided to make a startup out of it.

How it works:

- Choose an open-weight base model (RL updates the weights directly, so open weights are required; Qwen3-4B-Instruct-2507 is a good starting point)

- Upload a set of initial prompts ("Generate an antiviral targeting SARS-CoV-2 protease", "Prove this theorem", "What's the average summer high in Windhoek?")

- Define a reward function, using Python, an LLM-as-a-judge, or both (a minimal sketch follows this list)

- For complex settings, you can define an entire multi-turn environment

- Watch the reward go up!
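
To make the reward step concrete, here's what a Python reward function might look like. The signature and heuristics below are illustrative assumptions, not RunRL's documented interface; the post only says rewards can be written in Python, with an LLM-as-a-judge, or both.

    # Hypothetical reward function for a theorem-proving task.
    # The signature and scoring heuristics are assumptions for illustration.
    def reward(prompt: str, completion: str) -> float:
        """Score one model completion; higher is better."""
        score = 0.0
        # Reward proofs that end with an explicit terminator.
        if completion.strip().endswith("QED"):
            score += 1.0
        # Penalize rambling: prefer proofs under ~300 words.
        if len(completion.split()) > 300:
            score -= 0.5
        # Penalize refusals that dodge the task entirely.
        if "i cannot" in completion.lower():
            score -= 1.0
        return score

RL then trains the model to push this scalar up, which is why exact prompt wording matters less here than in prompt-optimization approaches; a multi-turn environment generalizes the same idea by scoring whole interaction episodes instead of single completions.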

For most well-defined problems, a small open model + RunRL outperforms frontier models. (For instance, we've seen Qwen-3B do better than Claude Opus 4.1 on antiviral design.) This is because LLM intelligence is notoriously "spiky": models are often decent-but-not-great at common-sense knowledge, randomly good in a few domains, and error-prone on lots of other tasks. RunRL creates spikes precisely on the tasks where you need them.

Pricing: $80/node-hour. Most models up to 14B parameters fit on one node (0.6-1.2 TB of VRAM). We do full fine-tuning rather than parameter-efficient methods (with RL, people seem to care a lot about the last few percentage points of e.g. agent reliability).
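
As a rough sanity check on why a ~14B model wants a node that size for full fine-tuning (our back-of-envelope, not a RunRL figure): with bf16 weights and gradients plus fp32 master weights and Adam moments, training state alone is about 16 bytes per parameter.

    # Back-of-envelope: training state for full fine-tuning with Adam.
    # 2 B bf16 weights + 2 B bf16 grads + 4 B fp32 master weights
    # + 8 B fp32 Adam moments = 16 B/param (rule of thumb, not a RunRL figure).
    params = 14e9
    bytes_per_param = 2 + 2 + 4 + 8
    print(f"~{params * bytes_per_param / 1e9:.0f} GB training state")  # ~224 GB

Activations, KV caches, and rollout generation (often a separate inference copy of the model) come on top of that, so a 0.6-1.2 TB node is headroom rather than overkill.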

Next up: continuous learning; tool use. Tool use is currently in private beta, which you can join here: https://forms.gle/D2mSmeQDVCDraPQg8

We'd love to hear any thoughts, questions, or positive or negative reinforcement!

Comments (3)

nextworddev · 53m ago
Is there any credence to the view that these startups are basically DSPy wrappers?
-_- · 36m ago
DSPy is great for prompt optimization but not so much for RL fine-tuning (their support is "extremely EXPERIMENTAL"). The nice thing about RL is that the exact prompts don't matter so much. You don't need to spell out every edge case, since the model will get an intuition for how to do its job well via the training process.
nextworddev · 10m ago
Isn’t the latest trend in RL mostly about prompt optimization, as opposed to full fine-tuning?