Ask HN: Do cloud-provided LLMs get "dumber" during business hours?

3 dav43 1 8/28/2025, 5:17:26 AM

I have heard this mentioned both online and in casual conversation:

**During business hours, these endpoints may get so heavily utilised that accuracy and quality decline (ignoring latency), perhaps because effective available token capacity is reduced.**

Has anyone run studies or experiments showing, across a large dataset, whether the performance of these endpoints differs between peak-usage and low-usage hours? (I assume some statistical significance test would be required.)
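A minimal sketch of the kind of significance test such an experiment might use, assuming each response to a fixed benchmark can be scored as simply correct or incorrect and that the same prompts are sent during peak and off-peak hours. The function name `two_proportion_z_test` and the counts in the usage example are hypothetical placeholders, not measurements. It is a pooled two-proportion z-test, one of several reasonable choices.

```python
import math


def two_proportion_z_test(correct_a, n_a, correct_b, n_b):
    """Pooled two-proportion z-test (two-sided).

    correct_a / n_a: correct answers and total prompts in condition A (e.g. peak hours)
    correct_b / n_b: correct answers and total prompts in condition B (e.g. off-peak hours)
    Returns (z, p_value). Null hypothesis: both conditions have the same accuracy.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("sample sizes must be positive")

    acc_a = correct_a / n_a
    acc_b = correct_b / n_b

    # Pooled accuracy under the null hypothesis of no difference.
    pooled = (correct_a + correct_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))

    if std_err == 0:
        # Every answer was correct (or every answer was wrong) in both
        # conditions, so there is no observed difference to test.
        return 0.0, 1.0

    z = (acc_a - acc_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical placeholder counts, NOT real measurements.
    z, p = two_proportion_z_test(correct_a=830, n_a=1000, correct_b=850, n_b=1000)
    print(f"z = {z:.3f}, p = {p:.4f}")
```

A small p-value would only say the accuracy gap is unlikely to be noise. It would not by itself show that load is the cause, since model updates or sampling randomness between runs could also produce a difference.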

I don't have the knowledge to answer this question, or even to judge whether it's a valid hypothesis. Does anyone have resources on this? I could only find tangential mentions in the sources below.

[0] https://arxiv.org/html/2507.18007v1 "These challenges contribute to bottlenecks during peak workloads, ultimately affecting inference service quality, scalability, and responsiveness, which requires accurate resource profiling for LLM inference task."

[1] Asking Gemini - https://share.google/aimode/bWb5w9dpf2ZeggqSj

Comments (1)

cranberryturkey · 3h ago

I've wondered this too actually.

