How Often Do LLMs Snitch? Recreating Theo's SnitchBench with LLM
9 Philpax 4 5/31/2025, 10:53:39 PM simonwillison.net ↗
But this prompt literally overrides the model's values and tells it to snitch; how else could it be interpreted? The test doesn't measure the likelihood of snitching at all, and its results won't generalize.
Misleading tests like this are basically grist to Anthropic's mill. They are rooted in the AI doomsday cult and strongly biased towards finding evidence that LLMs are misbehaving (and need to be gatekept and controlled by the Good Guys, i.e. Anthropic themselves).
I don't think overwhelming public officials with alarmist machine-generated spam is helpful to anyone.
EDIT: The "benchmark" doesn't even seem to contain any negative examples. What a joke.