Show HN: TextPolicy – reinforcement learning for text generation on a MacBook

4 teilom 0 8/30/2025, 4:34:08 PM github.com ↗

I built TextPolicy because I wanted a way to study reinforcement learning for text generation without needing a cluster or cloud GPUs. A MacBook is enough.

The toolkit is simple:

- Implements the GRPO and GSPO algorithms
- Provides a decorator interface for custom reward functions
- Includes LoRA and QLoRA utilities
- Runs on MLX, so it is efficient on Apple Silicon

It is not intended for production. The purpose is learning and experimentation: to understand algorithms, to test ideas, to see how reward shaping affects behavior.

Installation is through pip:

    pip install textpolicy

There is a minimal example in the README.

I am interested in feedback on: the clarity of the API, the usefulness of the examples, and whether this lowers the barrier for people new to RL.

Repository: github.com/teilomillet/textpolicy
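To give a rough idea of what a decorator interface for reward functions can look like, here is a minimal sketch in plain Python. It does not import TextPolicy and is not its API: the `reward` decorator, the `REWARDS` registry, and the `brevity` function are hypothetical names made up for illustration, so check the README for the real interface. The `group_advantages` helper shows the group-relative normalization at the core of GRPO, reduced to a single group of scalar rewards.

```python
from statistics import mean, pstdev
from typing import Callable, Dict, List

# Hypothetical registry, for illustration only (not TextPolicy's API).
REWARDS: Dict[str, Callable[[str, str], float]] = {}


def reward(name: str):
    """Register a reward function under a name (assumed interface)."""
    def register(fn: Callable[[str, str], float]):
        REWARDS[name] = fn
        return fn
    return register


@reward("brevity")
def brevity(prompt: str, completion: str) -> float:
    # Toy reward shaping: full reward up to 5 words, decaying after that.
    words = len(completion.split())
    return 1.0 if words <= 5 else 5.0 / words


def group_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """GRPO-style advantages: center and scale rewards within one group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Score a group of completions sampled for the same prompt.
completions = [
    "Yes.",
    "It depends on many different factors you should consider.",
]
scores = [REWARDS["brevity"]("Is it raining?", c) for c in completions]
advantages = group_advantages(scores)
```

The short completion ends up with a positive advantage and the long one with a negative advantage. Changing the threshold in `brevity` and re-running is a cheap way to see how reward shaping moves the training signal before any model is involved.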

Comments (0)