Ask HN: Do cloud-provided LLMs get "dumber" during business hours?

3 dav43 1 8/28/2025, 5:17:26 AM
I have heard this claim mentioned both online and in casual conversation:

**During business hours these endpoints may be so heavily utilised that accuracy and quality decline (ignoring latency), perhaps because effective available token capacity is reduced, etc.**

Has anyone run studies or experiments showing, across a large dataset, whether the performance of these endpoints changes between peak-usage and low-usage hours? I assume some statistical significance test would be required.

I don't have the knowledge to answer this question or to judge whether it's a valid hypothesis. Anyone got resources on this? I could only find tangential mentions in the sources below.

[0] https://arxiv.org/html/2507.18007v1 "These challenges contribute to bottlenecks during peak workloads, ultimately affecting inference service quality, scalability, and responsiveness, which requires accurate resource profiling for LLM inference task."

[1] Asking Gemini - https://share.google/aimode/bWb5w9dpf2ZeggqSj
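
For what it's worth, here is a rough sketch of the kind of significance test I have in mind: send the same fixed set of prompts with known answers to an endpoint during peak and off-peak hours, count how many it gets right in each window, and compare the two accuracy rates with a two-sided two-proportion z-test. The function name and the counts in the usage lines are placeholders I made up for illustration, not measurements. This also assumes each prompt is scored independently as right or wrong, which may not hold in practice.

```python
import math


def two_proportion_z_test(correct_a, total_a, correct_b, total_b):
    """Two-sided z-test for a difference between two accuracy rates.

    Returns (z, p_value). Uses the pooled-proportion standard error,
    which is the usual choice under the null hypothesis of equal rates.
    """
    p_a = correct_a / total_a
    p_b = correct_b / total_b
    pooled = (correct_a + correct_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        # Both windows scored 0% or both scored 100%: no evidence of a difference.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Placeholder counts (hypothetical, NOT real results):
# 500 prompts per window, number answered correctly in each.
z, p = two_proportion_z_test(correct_a=412, total_a=500,   # peak hours
                             correct_b=431, total_b=500)   # off-peak hours
print(f"z = {z:.3f}, p = {p:.4f}")
```

A small p-value would only say the two windows differ, not why. You would also want to pin the model version and sampling settings (e.g. temperature 0 where the API allows it) and repeat across several days, since day-to-day variation could otherwise look like a time-of-day effect.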

Comments (1)

cranberryturkey · 3h ago
I've wondered this too actually.