As there is ongoing drama with Zen 5 and power issues, there are people with the instruments and the motivation to investigate this. You should consider contacting Gamers Nexus, and help them to get your test suite running. They can measure power draw and do a thermal analysis of this CPU, and they'd likely be eager to do it, given the possibility of making a bunch of dramatic YouTube content about design flaws in widely used hardware. That's pretty much their whole schtick in recent years.
> Modern CPUs measure their temperature and clock down if they get too hot, don't they?
Yes. It's rather complex now and it involves the motherboard vendors firmware. When (not if) they get that wrong CPUs burn up. You're going to need some expertise to analyze this.
fxtentacle · 29m ago
He's a bit sensationalist, yes, but I am thankful that he saved us from buying affected Intel CPUs.
spookie · 18m ago
Not sure of sensationalist or just doing great reporting. I take him as one of the last good tech journalists on the platform.
db48x · 4m ago
> The so-called TDP of the Ryzen 9950X is 170W. The used heat sinks are specified to dissipate 165W, so that seems tight.
TDP numbers are completely made up. They don’t correspond to watts of heat, or of anything at all! They’re just a marketing number. You can't use them to choose the right cooling system at all.
The room temperature or precise way the paste was applied should not matter. Modern CPUs have very advanced dynamic voltage and frequency scaling (DVFS), which accounts for several sensors, including temperature.
These big x86 CPUs in stock configuration can throttle down to speeds where they can function with entirely passive cooling, so even if the cooler was improperly mounted, they'd only throttle.
All that to say, if GMP is causing the CPU to fry itself, something went very wrong, and it is not user error or the room being too hot.
secabeen · 8m ago
I would be interested to see if they had the same result with PTM7950 thermal material instead of paste. I've seen significantly better temps with these modern phase-change compounds, and they essentially eliminate application errors.
craftkiller · 35m ago
Looking at the AM5 pinout[0], it looks like those pins are VDDCR and VSS. There might be a little bit of PCIe sprinkled in towards the outer edges, but I'm not 100% on the orientation of this pinout vs the orientation of the CPU. I don't know anything about electricity so I've got nothing else to add.
I've heard some really wild noises coming out of my zen4 machine when I've had all cores loaded up with what is best described as "choppy" workloads where we are repeatedly doing something like a parallel.foreach into a single threaded hot path of equal or less duration as fast as possible. I've never had the machine survive this kind of workload for more than 48 hours without some kind of BSOD. I've not actually killed a cpu yet though.
bee_rider · 19m ago
Is that, like, an intentional stress-test for the hardware that you’ve come up with?
fxtentacle · 30m ago
"We suspect that GMP's extremely tight loops around MULX make the Zen 5 cores use much more power than specified, making cooling solutions inadequate."
I feel like if this was heat related, the overall CPU temperature should still somewhat slowly creep up, thereby giving everything enough time for thermal throttling. But their discoloration sure looks like a thermal issue, so I wonder why the safety features of the CPU didn't catch this...
jeffbee · 23m ago
Are we talking "slowly" in a relative sense? A silicon die of this size has a thermal mass (guessing) around 10⁻³ J/K but a power dissipation rate over 200W, so it can rise from room temperature to junction temperature limits almost instantly.
topspin · 18m ago
People without a background in electronics don't appreciate what modern CPUs and GPUs are doing: the amount of current flowing through these devices is just mind blowing. With adequate cooling, a Ryzen 9 9950X is handling somewhere in the neighborhood of 150-200 amps under high load.
tw04 · 26m ago
That looks like a combination of improperly mounting the heatsink and noctuna being wrong in their recommendation to offset it. I’d imagine for gaming cooling one side more makes sense but my completely uneducated guess is that GMP is working a different part of the CPU than gaming does.
toast0 · 16m ago
They had failures with standard mounting and offset mounting.
Also, take a look at a delidded 9950; the two cpu chiplets are to one side, the i/o chiplet is in the middle, and the other side is a handful of passives. Offsetting the heatsink moves the center of the heatsink 7mm towards the chiplets (the socket is 40mm x 40mm), but there's still plenty of heatsink over the top of the i/o chiplet.
> Modern CPUs measure their temperature and clock down if they get too hot, don't they?
Yes. It's rather complex now and it involves the motherboard vendors firmware. When (not if) they get that wrong CPUs burn up. You're going to need some expertise to analyze this.
TDP numbers are completely made up. They don’t correspond to watts of heat, or of anything at all! They’re just a marketing number. You can't use them to choose the right cooling system at all.
https://gamersnexus.net/guides/3525-amd-ryzen-tdp-explained-...
These big x86 CPUs in stock configuration can throttle down to speeds where they can function with entirely passive cooling, so even if the cooler was improperly mounted, they'd only throttle.
All that to say, if GMP is causing the CPU to fry itself, something went very wrong, and it is not user error or the room being too hot.
[0] https://upload.wikimedia.org/wikipedia/commons/2/2d/Socket_A...
I've heard some really wild noises coming out of my zen4 machine when I've had all cores loaded up with what is best described as "choppy" workloads where we are repeatedly doing something like a parallel.foreach into a single threaded hot path of equal or less duration as fast as possible. I've never had the machine survive this kind of workload for more than 48 hours without some kind of BSOD. I've not actually killed a cpu yet though.
I feel like if this was heat related, the overall CPU temperature should still somewhat slowly creep up, thereby giving everything enough time for thermal throttling. But their discoloration sure looks like a thermal issue, so I wonder why the safety features of the CPU didn't catch this...
Also, take a look at a delidded 9950; the two cpu chiplets are to one side, the i/o chiplet is in the middle, and the other side is a handful of passives. Offsetting the heatsink moves the center of the heatsink 7mm towards the chiplets (the socket is 40mm x 40mm), but there's still plenty of heatsink over the top of the i/o chiplet.
This article has some decent pictures of delidded processors https://www.tomshardware.com/pc-components/overclocking/deli...
Everything is offset towards one side and the two CPU core clusters are way towards the edge, offset cooling makes sense regardless of usage.
What kind of sh..t are they pushing on us?
Even if Raspberry seems slow it gets more and more attractive.