Ask HN: HN with Just the Tech?

I'd love to see just tech/entrepreneurship updates without any politics/war/catastrophe updates. Does anyone currently use a tool that helps with this?

Comments (8)

toomuchtodo · 1d ago

Nothing exists to my knowledge, but you could probably put together a browser extension that feeds the front page to an LLM and auto-hides anything that doesn't interest you.

Maybe take inspiration from https://www.mosaique.info/ | https://news.ycombinator.com/item?id=44172340
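The filtering idea above can be sketched in a few lines of Python. This is only an illustration of the shape of such a filter: `keyword_judge`, `BLOCKED_KEYWORDS`, and `filter_front_page` are hypothetical names, and the keyword check is a stand-in for the LLM call, which is not shown.

```python
from typing import Callable, Iterable

# Hypothetical blocklist; a real setup would replace keyword_judge with an
# LLM call or a trained classifier.
BLOCKED_KEYWORDS = ("war", "attack", "election", "crisis")


def keyword_judge(title: str) -> bool:
    """Stand-in for the LLM: return True if the title should stay visible."""
    words = title.lower().split()
    return not any(keyword in words for keyword in BLOCKED_KEYWORDS)


def filter_front_page(titles: Iterable[str], judge: Callable[[str], bool]) -> list[str]:
    """Keep only the titles the judge approves, preserving front-page order."""
    return [title for title in titles if judge(title)]


if __name__ == "__main__":
    front_page = [
        "The Israeli Attack Against Iran",
        "Phoenix contexts are simpler than you think",
        "Self-Adapting Language Models",
    ]
    for title in filter_front_page(front_page, keyword_judge):
        print(title)
```

Because the judge is passed in as a function, the keyword stub can be swapped for a model-backed one without touching the filtering code.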

PaulHoule · 1d ago

See https://ontology2.com/essays/HackerNewsForHackers/ and https://ontology2.com/essays/ClassifyingHackerNewsArticles/ noting you could do way better if you sucked down more text and used ModernBERT + SVM.

mieubrisse · 1d ago

Just finished reading both, Paul. How did you implement the reader that does the filtering?

PaulHoule · 22h ago

OK, those articles are from a thing I made long ago that worked on titles; I did it with some scripts and Jupyter notebooks.

I think you are talking, though, about my YOShInOn RSS reader, which uses the same scikit-learn library. The training and classification happen in a script that runs side by side with the web server I use to look at articles and make my judgements, which is also written in Python. It is research code, but it is also production in that I use it every day and I'm never afraid to demo it.

It uses the ArangoDB database, which has a terrible license, so I am making a library called "system of objects" that emulates some aspects of ArangoDB collections and documents over Postgres tables and columns. At some point I will move it to Postgres, put down that beast, and feel free to either open-source or commercialize it.

At the core of it is a classification model that predicts the probability for "will I like this item?", combined with sampling N documents by taking the top N/k documents from each of k clusters. I would also blend in 30% of randomly sampled documents to keep the training data representative. The batch job looks up my judgements in the database, writes them into a NumPy matrix, and uses scikit-learn to train a model; it then runs the inference and puts its recommendations into the database, where I can see them in my web front end, which is built with Flask and HTMX. I like this style for research/production code because you can build applications a "screen" at a time, where a "screen" is a few Python functions that answer a few URLs that make a web page work. The happy path of making judgements has to be very fast and easy, think TikTok or Tinder, because you will have to do that thousands of times to make good models.
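A minimal sketch of that selection step, not the actual YOShInOn code. It assumes the scores are already computed (for example by a scikit-learn classifier's `predict_proba`) and that each document already has a cluster label. It also assumes the 30% random share counts toward the N documents rather than being added on top. The function name `pick_batch` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def pick_batch(scores, clusters, n, random_share=0.3, seed=None):
    """Pick n document indices: top-scored per cluster plus a random blend.

    scores[i]   -- predicted probability that document i will be liked
    clusters[i] -- cluster label of document i
    """
    if not scores:
        return []
    rng = random.Random(seed)
    n_random = round(n * random_share)  # assumption: random share is part of n
    n_ranked = n - n_random

    # Group document indices by cluster label.
    by_cluster = defaultdict(list)
    for idx, label in enumerate(clusters):
        by_cluster[label].append(idx)

    # Take the top n_ranked / k documents from each of the k clusters.
    per_cluster = max(1, n_ranked // len(by_cluster))
    ranked = []
    for members in by_cluster.values():
        members.sort(key=lambda i: scores[i], reverse=True)
        ranked.extend(members[:per_cluster])
    ranked = ranked[:n_ranked]

    # Fill the rest of the batch with documents drawn uniformly at random
    # from everything not already chosen.
    chosen = set(ranked)
    leftovers = [i for i in range(len(scores)) if i not in chosen]
    extra = rng.sample(leftovers, min(n - len(ranked), len(leftovers)))
    return ranked + extra


if __name__ == "__main__":
    scores = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4, 0.5, 0.05]
    clusters = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    print(pick_batch(scores, clusters, n=6, seed=1))
```

Taking the top items per cluster keeps one topic from crowding out the rest, and the random picks mean the judgements also cover items the model scored low.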

As a classification problem it's boring because it is a fuzzy problem: I might like an article today and hate it tomorrow, so there is a ceiling on the accuracy.
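A small simulation makes the ceiling concrete. It is a toy model, not a measurement: it assumes each judgement flips with a fixed probability regardless of the article, and the 20% flip rate in the usage example is a made-up number. Under those assumptions, even a model that knows the underlying preference agrees with the recorded labels only about `1 - flip_prob` of the time.

```python
import random


def agreement_with_noisy_labels(flip_prob, n_items=100_000, seed=0):
    """Accuracy of a model that knows the underlying preference,
    scored against labels that flip with probability flip_prob."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_items):
        true_pref = rng.random() < 0.5        # underlying like / dislike
        flipped = rng.random() < flip_prob    # today's mood disagrees
        label = (not true_pref) if flipped else true_pref
        prediction = true_pref                # best possible model
        hits += prediction == label
    return hits / n_items


if __name__ == "__main__":
    # Hypothetical 20% flip rate: the result lands near 0.8.
    print(agreement_with_noisy_labels(0.2))
```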

So I am thinking about the centaur use case, where there is a stream of documents that you classify together with the model and the classification is something better defined, where a more complex model's power to understand the document can determine something like "was the author angry?", "is this an account of a sports game?", "did the home team win?", etc.

That has me thinking about a general-purpose text classification kit, which would have a small number of models chosen with practicality in mind, and about setting up some kind of benchmark against data sets from Kaggle.

I am not thinking seriously about better recommendations because the problem is so vast. It includes everything from "reject anything from YouTube out of hand" to a nuanced analysis of what exactly "quality" means, not least a real-time instead of batch system that will tell me about a sports game today as opposed to next week and also push it to the front of any outbound queues. Articles about carbon capture or video games or fast cars or circular economy or rural sociology, by contrast, can wait.

I'm more interested now in applying filtering based on people's emotional characteristics to social media. Maybe microblogging is dead, but it is just so much more fun if you can avoid the bottom 5% of bad behavior.

mieubrisse · 1d ago

Thank you so much for linking these! Exactly the sort of thing I'm looking for; still making my way through the first article but fingers crossed there's an easy-to-use implementation at the bottom.

toomuchtodo · 23h ago

Thanks Paul! TIL!

mieubrisse · 1d ago

Thank you for the pointers; the second article in particular looks exactly like what I'm trying to accomplish!

mech422 · 1d ago

lobste.rs ?

Highly curated, restricted posting, pretty much pure tech
