Will supercapacitors come to AI's rescue? (spectrum.ieee.org)
45 points by mfiguiere on 5/6/2025, 7:30:58 PM | 60 comments
Oh god... I can see it now. Someone will try to capitalize on the hype of LLMs and the hype of cryptocurrency and build a combined LLM training and cryptocurrency mining facility that runs the mining between training spikes.
https://arxiv.org/abs/2504.13171
It should be noted that this type of inference is less useful for time-sensitive tasks, but most tasks truthfully don't require such time sensitivity (there is slack between when the task is given and when the answers are asked for). The batch endpoints below all work this way; a sketch of the flow follows the links.
1: https://platform.openai.com/docs/guides/batch
2: https://docs.anthropic.com/en/docs/build-with-claude/batch-p...
3: https://docs.parasail.io/parasail-docs/batch/batch-quickstar...
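As a minimal sketch of the flow behind link 1, assuming the official openai Python client (the file name and model are placeholders, not from the links above):

    from openai import OpenAI

    client = OpenAI()

    # Each line of requests.jsonl is one request, e.g.:
    # {"custom_id": "1", "method": "POST", "url": "/v1/chat/completions",
    #  "body": {"model": "gpt-4o-mini", "messages": [...]}}
    batch_file = client.files.create(
        file=open("requests.jsonl", "rb"),
        purpose="batch",
    )

    # Results come back any time within the completion window, which is
    # exactly the slack-tolerant workload described above.
    batch = client.batches.create(
        input_file_id=batch_file.id,
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )
    print(batch.id, batch.status)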
This looks a whole lot more like high-frequency load smoothing. Really it seems to me like a continuation of what a motherboard already does: even if you have a battery backup on your PC, you still have capacitors on the board to ride out voltage fluctuations.
edit: otherwise I'm not getting what the entire article is about. It's as contrary to what I know about datacenter design as it can get.
It's... just wrong.
Because if so, I have some nice Eastern European guys to teach them proper load balancing.
Then again, some data centers just use re-generation from giant flywheels, where the grid powers the flywheel to build up inertia, and load can be smoothed that way. The flywheels need to keep the output at 60 Hz for at least 30 seconds to give the diesel generators time to start, should the grid fail.
Run a flywheel at the main transformer, run batteries at your PDC and add supercaps at each point of load, and you may very well be able to show a much smoother load to the grid.
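For a rough sense of scale (my numbers, assumed for illustration, not from the comments above): a flywheel stores E = 0.5 * I * omega^2, and ride-through time is just stored energy divided by load.

    import math

    # Illustrative, assumed numbers: a rotor with moment of inertia I
    # spinning at 3600 RPM, protecting a 250 kW load.
    I = 500.0                          # kg*m^2, assumed rotor inertia
    omega = 3600 * 2 * math.pi / 60    # rad/s at 3600 RPM

    energy_j = 0.5 * I * omega**2      # stored kinetic energy, ~35 MJ

    load_w = 250_000                   # protected load, watts (assumed)
    # Only the energy above some minimum speed is usable if the output
    # must stay near 60 Hz; assume half is extractable.
    ride_through_s = (0.5 * energy_j) / load_w

    print(f"{energy_j/1e6:.1f} MJ stored, {ride_through_s:.0f} s ride-through")

Even with those conservative assumptions, that comfortably covers the ~30 second diesel start window.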
The only way to spread the spikes would be to make the training run slower, but that'd be a hard sell considering training can sometimes be measured in days.
Where do you get those fractions of seconds? Network? Storage?
[1] https://www.youtube.com/watch?v=vXsT6lBf0X4
[1] https://github.com/pytorch/pytorch/pull/132936/files#diff-98...
> Another solution is dummy calculations, which run while there are no spikes, to smooth out demand.
Otherwise a lot of expensive GPU capital is idle between bursts of computation.
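A minimal sketch of the dummy-calculation idea in PyTorch (my own illustration, not the linked PR; burn_until is a hypothetical helper and the sizes are arbitrary):

    import time
    import torch

    def burn_until(deadline: float, size: int = 4096) -> None:
        # Run throwaway matmuls until time.monotonic() passes `deadline`.
        # The results are discarded; the only point is to keep the GPU
        # drawing power during gaps in real work.
        x = torch.randn(size, size, device="cuda")
        while time.monotonic() < deadline:
            x = x @ x
            x = x / x.norm()  # keep values finite across iterations
        torch.cuda.synchronize()

    # Usage inside a training loop: if a step finishes early (e.g. while
    # waiting on gradient sync or a checkpoint), fill the gap instead of
    # idling:
    # burn_until(time.monotonic() + 0.05)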
Didn't DeepSeek do something like this to get more system level performance out of less capable GPUs?
There's probably something that could be done on the individual systems so that they don't modulate power use quite so fast, too; at some latency cost, of course. If you go all the way to the extremes, you might add a zero crossing detector and use it to time clock speed increases.
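One software-only way to slow the modulation down (a sketch assuming an NVIDIA GPU; the ramp helper is hypothetical): step the clock ceiling up gradually with nvidia-smi's lock-gpu-clocks option instead of letting it jump at once.

    import subprocess
    import time

    # Ramp the GPU clock cap up in steps, trading a little latency for a
    # gentler power ramp. Assumes `nvidia-smi -lgc` (lock GPU clocks) is
    # supported; typically needs admin privileges.
    def ramp_clock_cap(start_mhz: int, target_mhz: int,
                       step_mhz: int = 100, pause_s: float = 0.5) -> None:
        mhz = start_mhz
        while mhz < target_mhz:
            mhz = min(mhz + step_mhz, target_mhz)
            subprocess.run(
                ["nvidia-smi", "-lgc", f"{mhz},{mhz}"],  # min,max in MHz
                check=True,
            )
            time.sleep(pause_s)

    # ramp_clock_cap(600, 1800)  # ~6 s ramp instead of an instant jump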
I imagine common power rail systems in hyperscaler equipment help a bit with this, but for sure switching PSUs chop up the input voltage and smooth it out. And that leads to very strange power draws.
Ramp rate has been a generation-side thing for a century. Every morning, load increases from the pre-dawn low, and which generators can ramp up output at what speed matters. Ramp rate is usually measured in megawatts per minute. Big thermal plants, nuclear and coal, have the lowest ramp rates, a few percent per minute (worked numbers after the link below).
Ramp rate demand side, though, is a new thing. There are discussions about it [1], but it's not currently a parameter in electric energy bills.
[1] https://www.aceee.org/files/proceedings/2012/data/papers/019...
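To put numbers on "a few percent per minute" (my own arithmetic with assumed figures, not from the linked paper):

    # A 1,000 MW thermal plant that can ramp at 2% of nameplate per minute.
    nameplate_mw = 1000
    ramp_pct_per_min = 0.02

    ramp_mw_per_min = nameplate_mw * ramp_pct_per_min   # 20 MW/min
    swing_mw = 500                                       # 50% -> 100% output
    minutes = swing_mw / ramp_mw_per_min                 # 25 minutes

    print(f"{ramp_mw_per_min:.0f} MW/min -> {minutes:.0f} min for a {swing_mw} MW swing")

Training-load swings, by contrast, come and go in fractions of a second, far too fast for any generator to follow.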
The data center issue is not related to power factor.
If you want to smooth out data centers then you need hourly pricing to push their demand into periods when there is spare grid capacity, i.e., when it isn't needed to serve residential loads.
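From the data center side, that scheduling could be as simple as this sketch (the hourly prices and the helper name are hypothetical):

    # Given day-ahead hourly prices ($/MWh), find the cheapest contiguous
    # window for a deferrable batch job.
    def cheapest_window(prices: list[float], hours_needed: int) -> int:
        # Return the start hour of the cheapest contiguous window.
        costs = [
            sum(prices[h:h + hours_needed])
            for h in range(len(prices) - hours_needed + 1)
        ]
        return costs.index(min(costs))

    # 24 assumed day-ahead prices; overnight hours are cheapest.
    prices = [30, 28, 25, 24, 24, 26, 35, 50, 60, 55, 50, 48,
              45, 44, 46, 50, 65, 80, 75, 60, 50, 40, 35, 32]
    start = cheapest_window(prices, hours_needed=6)
    print(f"run the 6-hour job starting at hour {start}")  # overnight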
Drawing high intermittent loads at high frequency likely makes the utility upset and leads to over-building the supply to the customer to cope with peak load. If you can shave down those peaks, you can use a smaller (cheaper) supply connection. A smoother load will also make the utility happy.
Remember that electricity generation cannot ramp up and down quickly. Big transient loads can cause a lot of problems through the whole network.
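A toy model of that peak shaving (all numbers assumed): a local buffer, such as the article's supercapacitors, supplies the spikes and recharges in the troughs, so the grid connection only ever sees a capped draw.

    # The buffer covers anything above the grid cap and recharges below
    # it, so the utility sees at most `grid_cap_kw`. One sample = 1 s.
    load_kw = [200, 900, 200, 900, 200, 900]    # spiky GPU load
    grid_cap_kw = 550                            # draw presented to the grid
    buffer_kwh = 0.5                             # small buffer; spikes are short

    soc_kwh = buffer_kwh                         # state of charge
    for t, load in enumerate(load_kw):
        grid = min(load, grid_cap_kw)
        if load > grid_cap_kw:
            soc_kwh -= (load - grid) / 3600      # 1 s of discharge
        else:
            # Recharge with the headroom, without exceeding the cap.
            recharge_kw = min(grid_cap_kw - load, 3600 * (buffer_kwh - soc_kwh))
            grid = load + recharge_kw
            soc_kwh += recharge_kw / 3600
        print(f"t={t}s load={load}kW grid={grid:.0f}kW soc={soc_kwh:.3f}kWh")

With these numbers the grid draw settles at a flat 550 kW even though the load swings between 200 and 900 kW.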
Slightly related, you can actually hear this effect depending on your GPU. It’s called coil whine. When your GPU is doing calculations, it draws more power and whines. Depending on your training setup, you can hear when it’s working. In other words, you want it whining all the time.
I'm surprised it's not cheaper to modulate all those compressor motors they presumably already have.
Yes, just like the octopussies. /s