LLM Evals Are Just Tests. Why Are We Making This So Complicated?

2 camwest 1 8/10/2025, 3:23:48 AM cameronwestland.com

Comments (1)

8organicbits · 7m ago
So, did the tests allow you to build a system that never confused existing features with new features? That seems like the problem statement, but I think I'm only seeing probabilistic testing.