HN Reader
More capable models are better at in-context scheming
6
miles
1
6/20/2025, 9:28:19 PM
apolloresearch.ai ↗
Comments (1)
chiph2o
· 5h ago
in-context scheming = alignment red flag
More capability + low clarity on intent = low trust