Verification, the Key to AI (2001)

34 points · anjneymidha · 5 comments · 5/9/2025, 5:10:28 AM · incompleteideas.net

Comments (5)

jrvarela56 · 5h ago
This applies to coding agents. If the agent can't run the code, it's unlikely to produce working code. And beyond just running it: linting, tests, compiling, code review, and any other tool or process humans use to check whether software is 'good' or working.

If the agent can apply these processes to its output, then we're on our way to getting a good chunk of our work done for us. Even from the product POV, if the agent is allowed to experiment by making deployments and checking user-facing metrics, it could eventually build a software product - but we should solve the coding part first, since it seems easier to verify objectively and quickly.
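
A minimal sketch of that check loop in Python (the ruff/pytest commands and the generate_patch callback are placeholders for whatever tools and model call the agent actually uses):

    import subprocess

    def run_check(cmd, cwd):
        # Run one verification step (lint, tests, build, ...) and capture its output.
        proc = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
        return proc.returncode == 0, proc.stdout + proc.stderr

    def verify(repo_dir):
        # Each command stands in for whatever the project actually uses.
        checks = [
            ["ruff", "check", "."],            # lint
            ["python", "-m", "pytest", "-q"],  # tests
        ]
        feedback = []
        for cmd in checks:
            ok, output = run_check(cmd, repo_dir)
            if not ok:
                feedback.append(f"$ {' '.join(cmd)}\n{output}")
        return feedback  # empty list == every check passed

    def agent_loop(repo_dir, generate_patch, max_iters=5):
        # generate_patch(feedback) is the model call: it edits files in repo_dir
        # and sees the verifier's error output on the next round.
        feedback = []
        for _ in range(max_iters):
            generate_patch(feedback)
            feedback = verify(repo_dir)
            if not feedback:
                return True  # lints clean and tests pass
        return False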

jgalt212 · 3h ago
You're right, but actually running the code can be destructive (even when it runs as intended). You really need to be careful about dev environments: even the intended destructive operations will cost you time (and money) in resetting the dev environment.
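
One cheap partial mitigation, sketched in Python below, is to let the agent run its checks only against a throwaway copy of the project with a hard timeout, so a destructive run trashes nothing but the copy (the test command is just an example, and this protects the checkout only, not networks or shared databases; real isolation needs containers/VMs and scoped credentials):

    import shutil, subprocess, tempfile
    from pathlib import Path

    def run_in_sandbox(repo_dir, cmd=("python", "-m", "pytest", "-q"), timeout=120):
        # Copy the project into a temporary directory so anything the code
        # deletes or rewrites only affects the disposable copy.
        with tempfile.TemporaryDirectory() as tmp:
            sandbox = Path(tmp) / "work"
            shutil.copytree(repo_dir, sandbox)
            try:
                proc = subprocess.run(list(cmd), cwd=sandbox, capture_output=True,
                                      text=True, timeout=timeout)
                return proc.returncode, proc.stdout + proc.stderr
            except subprocess.TimeoutExpired:
                return -1, f"timed out after {timeout}s"
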
jbellis · 3h ago
Wow, 2001. Legitimately prescient.

And verification ("evaluation", as we call it now) really is the key, although most people working on "AI apps" haven't figured that out yet.

Follow Hamel to catch up on the state of the art: https://x.com/HamelHusain

a3w · 6h ago
Nice. LLMs can barely prove anything beyond citing some sources or reproducing pure math that already circulates. AFAICT, no novel ideas have been proven so far, i.e. the "these systems never invented anything" paradox has held for three years now.

Symbolic AI seems to prove everything it states, but it never produces novel ideas either.

Let's see if we get a neurosymbolic AI that can do something neither could do on its own. I doubt it; AI might just be a doom cult after all.

tasuki · 4h ago
You can use an external proving mechanism and feed the results to the LLM.

A sufficiently rich type system (think Idris rather than C) or a sufficiently powerful test suite (e.g. property-based tests) should do the trick.
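
For the test-suite route, property-based tests make a cheap external verifier: the model's code either survives a pile of generated inputs or the failing case goes straight back into the prompt. A minimal sketch with Python's hypothesis library (sort_numbers is a stand-in for whatever the model produced):

    from hypothesis import given, strategies as st

    def sort_numbers(xs):
        # Stand-in for model-generated code under verification.
        return sorted(xs)

    @given(st.lists(st.integers()))
    def test_sort_is_ordered_and_a_permutation(xs):
        out = sort_numbers(xs)
        # Property 1: output is non-decreasing.
        assert all(a <= b for a, b in zip(out, out[1:]))
        # Property 2: output is a permutation of the input.
        assert sorted(out) == sorted(xs)

Run it with python -m pytest; any counterexample hypothesis finds is exactly the kind of concrete failure worth feeding back to the LLM.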