Semlib: LLM-Powered Data Processing

1 point by anishathalye 9/11/2025, 3:29:09 PM 1 comment (anishathalye.com)

Comments (1)

anishathalye · 7h ago

Hi HN!

I've been thinking a lot about semantic data processing recently. A lot of the attention in AI has been on agents and chatbots (e.g., Claude Code or Claude Desktop), and I think semantic data processing is not well-served by such tools (or frameworks designed for implementing such tools, like LangChain).

While working on some concrete semantic data processing problems, I found myself writing a lot of Python code: first calling LLMs in a for loop, then adding more and more code for things like I/O concurrency and caching. I wanted to figure out how to disentangle data processing pipeline logic from LLM orchestration. Functional programming primitives (map, reduce, etc.), common in data processing systems like MapReduce/Flume/Spark, seemed like a natural fit, so I implemented semantic versions of these operators. It's been pretty effective for the data processing tasks I've been trying to do.
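To make the idea concrete, here is a minimal sketch of what "semantic" map and reduce could look like when concurrency and caching live in one place instead of in every pipeline. This is not Semlib's actual API: `SemanticOps`, `fake_llm`, and the prompt templates are hypothetical names made up for illustration, and `fake_llm` is a deterministic stand-in that you would swap for a real model call.

```python
import asyncio
from typing import Awaitable, Callable


async def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call: upper-cases the text after the first colon."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return prompt.split(":", 1)[1].strip().upper()


class SemanticOps:
    """Hypothetical helper: owns the orchestration (concurrency limit, cache)
    so that pipeline code only has to say map/reduce."""

    def __init__(self, llm: Callable[[str], Awaitable[str]], max_concurrency: int = 4):
        self._llm = llm
        self._max_concurrency = max_concurrency
        self._cache: dict[str, str] = {}
        self.calls = 0  # number of prompts actually sent to the model

    async def _ask(self, prompt: str, sem: asyncio.Semaphore) -> str:
        # Serve repeated prompts from the cache instead of calling the model again.
        if prompt in self._cache:
            return self._cache[prompt]
        async with sem:  # bound the number of in-flight requests
            self.calls += 1
            answer = await self._llm(prompt)
        self._cache[prompt] = answer
        return answer

    async def map(self, items: list[str], template: str) -> list[str]:
        """Apply one prompt template to every item, concurrently, keeping input order."""
        sem = asyncio.Semaphore(self._max_concurrency)
        prompts = [template.format(item) for item in items]
        return list(await asyncio.gather(*(self._ask(p, sem) for p in prompts)))

    async def reduce(self, items: list[str], template: str) -> str:
        """Fold items left to right, asking the model to combine two values at a time."""
        sem = asyncio.Semaphore(self._max_concurrency)
        acc = items[0]
        for item in items[1:]:
            acc = await self._ask(template.format(acc, item), sem)
        return acc


if __name__ == "__main__":
    ops = SemanticOps(fake_llm)
    print(asyncio.run(ops.map(["apple", "pear"], "Uppercase: {}")))
    print(asyncio.run(ops.reduce(["a", "b", "c"], "Combine: {} + {}")))
```

With a structure like this, the pipeline itself reads as a couple of `map`/`reduce` calls, and changes to rate limiting or caching stay inside the helper.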

This blog post shares more details on the story here and elaborates on what I like about this approach to semantic data processing. It also covers some of the related work in this area (like DocETL from Berkeley's EPIC Data Lab, LOTUS from Stanford and Berkeley, and Palimpzest from MIT's Data Systems Group).

Like a lot of my past work, the software itself isn't all that fancy, but it might change the way you think!

The software is open-source at https://github.com/anishathalye/semlib. I'm very curious to hear the Hacker News community's thoughts!