Telum II at Hot Chips 2024: Mainframe with a Unique Caching Strategy

74 rbanffy 19 5/19/2025, 10:27:34 AM chipsandcheese.com ↗

Comments (19)

jfindley · 57m ago
It's a shame the article chose to compare solely against AMD CPUs, because AMD and Intel have very different L3 architectures. AMD CPUs have their cores organised into groups, each called a CCX, and each CCX has its own small L3 cache. For example, the Turin-based 9755 has 16 CCXs, each with 32MB of L3 cache. That is far less cache per core than the mainframe CPU being described. In contrast, Intel uses an approach that's a little closer to the Telum II CPU being described: a Granite Rapids AP chip such as the 6960P has 432MB of L3 cache shared between 72 physical cores, each with its own 2MB L2 cache. This is still considerably less cache, but it's not quite as stark a difference as the picture painted by the article.

This doesn't really detract from the overall point - stacking a huge per-core L2 cache and using cross-chip reads to emulate L3 with clever saturation metrics and management is very different to what any x86 CPU I'm aware of has ever done, and I wouldn't be surprised if it works extremely well in practice. It's just that it'd have made a stronger article IMO if it had compared dedicated L2 + shared L2 (IBM) against dedicated L2 + shared L3 (Intel), rather than against dedicated L2 + sharded L3 (AMD).
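
To put rough numbers on the comparison above, here is a minimal sketch of the per-core arithmetic. The cache sizes and the Intel core count come from the comment; the 128-core figure for the 9755 is an assumption added here, since the comment does not state it.

```python
# Figures quoted in the comment above. The 128-core count for the 9755 is an
# assumption added for this sketch; the comment itself does not state it.
AMD_CCX_COUNT = 16
AMD_L3_PER_CCX_MB = 32
AMD_CORES = 128  # assumption

INTEL_L3_MB = 432
INTEL_CORES = 72

# Total and per-core L3 for each part.
amd_total_l3_mb = AMD_CCX_COUNT * AMD_L3_PER_CCX_MB
amd_l3_per_core_mb = amd_total_l3_mb / AMD_CORES
intel_l3_per_core_mb = INTEL_L3_MB / INTEL_CORES

# L3 a single core can use without leaving its own cache domain:
# one CCX slice on AMD, the whole shared L3 on Intel.
amd_local_l3_mb = AMD_L3_PER_CCX_MB
intel_local_l3_mb = INTEL_L3_MB

print(f"AMD:   {amd_total_l3_mb} MB total, {amd_l3_per_core_mb} MB/core, "
      f"{amd_local_l3_mb} MB local to one core")
print(f"Intel: {INTEL_L3_MB} MB total, {intel_l3_per_core_mb} MB/core, "
      f"{intel_local_l3_mb} MB local to one core")
```

Under these assumptions the per-core averages are close, and the larger gap is in how much L3 one core can reach locally, which is the sharded-versus-shared distinction the comment draws.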

jonathaneunice · 3h ago
Virtual L3 and L4 swinging gigabytes around to keep data at the hot end of the memory-storage hierarchy even post L2 or L3 eviction? Impressive! Exactly the kind of sophisticated optimizations you should build when you have billions of transistors at your disposal. Les Bélády's spirit smiles on.
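
A toy model may help picture the "virtual L3" idea of keeping a line on chip after it leaves its home L2. This is only a sketch under loud assumptions: the class names, the occupancy-based `saturation` metric, and the rule that a re-used line moves back to the requesting core are all hypothetical stand-ins, not IBM's actual policy.

```python
from collections import OrderedDict


class L2:
    """One core's private L2, modelled as a small LRU map (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> data, least recently used first

    def saturation(self):
        # Stand-in metric: fraction of capacity in use. The real metric is
        # not described in this thread.
        return len(self.lines) / self.capacity

    def insert(self, addr, data):
        """Insert a line and return the evicted (addr, data) pair, if any."""
        self.lines[addr] = data
        self.lines.move_to_end(addr)
        if len(self.lines) > self.capacity:
            return self.lines.popitem(last=False)
        return None


class Chip:
    """Several private L2s whose spare room acts as a shared 'virtual L3'."""

    def __init__(self, cores, capacity):
        self.l2 = [L2(capacity) for _ in range(cores)]

    def access(self, core, addr):
        own = self.l2[core]
        if addr in own.lines:
            own.lines.move_to_end(addr)
            return "l2-hit"
        # Miss in the local L2: look for the line in the other cores' L2s.
        for i, peer in enumerate(self.l2):
            if i != core and addr in peer.lines:
                data = peer.lines.pop(addr)
                self._fill(core, addr, data)
                return "virtual-l3-hit"
        self._fill(core, addr, f"mem[{addr}]")
        return "miss"

    def _fill(self, core, addr, data):
        victim = self.l2[core].insert(addr, data)
        if victim is not None:
            # Spill the evicted line into the least saturated peer L2.
            peers = [c for c in range(len(self.l2)) if c != core]
            target = min(peers, key=lambda c: self.l2[c].saturation())
            # Anything the peer evicts in turn is simply dropped here.
            self.l2[target].insert(*victim)


chip = Chip(cores=2, capacity=2)
results = [chip.access(0, addr) for addr in (1, 2, 3, 1)]
print(results)
```

In this run, address 1 is pushed out of core 0's L2 by the third access, lands in core 1's L2, and is found there on the fourth access instead of going back to memory.
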
exabrial · 3h ago
What languages are people still writing mainframe code in? In 2011 working for a prescription rx processor, COBOL was still the name of the game.
BugheadTorpeda6 · 1m ago
For applications or middlewares and systems?

For applications, COBOL and Java. For middleware and systems and utilities etc, assembly, C, C++, probably still some PL/X going on too.

rbanffy · 3h ago
There's also lots of Java, and IBM is making a big effort to port existing Unix utilities to z/OS (which is a certified UNIX). With Linux, the choices are the same as with other hardware platforms. I assume you'll find lots of Java and Python running on LinuxONE machines.

Running Linux, from a user's perspective, it feels just like a normal server with a fast CPU and extremely fast IO.

jiggawatts · 2h ago
> extremely fast IO.

I wonder how big a competitive edge that will remain in an era where ordinary cloud VMs can do 10 GB/s to zone-redundant remote storage.

Cthulhu_ · 2h ago
GB/s is one metric, but IOPS and latency are others that I'm assuming are Very Important for the applications that mainframes are being used for today.
FuriouslyAdrift · 1h ago
Latency is much more important than throughput...
RetroTechie · 1h ago
On-site.

Speed is not the only reason why some org/business would have Big Iron in their closet.

inkyoto · 2h ago
Guaranteed sustained write throughput is a distinguishing feature of mainframe storage.

Whilst cloud platforms are the new mainframe (so to speak), and they have all made great strides in improving the SLA guarantees, storage is still accessed over the network (plus extra moving parts – coordination, consistency etc). They will get there, though.

bob1029 · 2h ago
You can do a lot of damage with some stored procedures. SQL/DB2 capabilities often go overlooked in favor of virtualizing a bunch of Java apps that accomplish effectively the same thing with 100x the resource use.
exabrial · 2h ago
Hah, anecdote incoming, but 100x the resource usage is probably accurate. Granted, 100x of a human hair is still just a minuscule grain of sand, but those are the margins mainframe operators work in.

As one greybeard put it to me: Java is loosely typed and dynamic compared to COBOL, DB2 and PL/SQL. He was particularly annoyed that the smallest numerical type in Java, a ‘byte’, was, quote, “a waste of bits”, and that Java was full of “useless bounds checking”, both of which were causing “performance regressions”.

The way mainframe programs are written is: the entire thing is statically typed.

thechao · 33m ago
When I was being taught assembly at Intel, one of the graybeards told me that the greatest waste of an integer was to use it for a "bare" add, when it was a perfectly acceptable 64-wide vector AND. To belabor the point: he used ADD for the "unusual set of XORs, ANDs, and other funky operations it provided across lanes". Odd dude.
uticus · 5m ago
> ...it was a perfectly acceptable 64-wide vector AND.

sounds like "don't try to out-optimize the compiler."

PaulHoule · 1h ago
I knew mainframe programmers were writing a lot of assembly in the 1980s and they probably still are.
pjmlp · 2h ago
RPG, COBOL, PL/I, NEWP are the most used ones. Unisys also has their own Pascal dialect.

Other than that, there are Java, C, and C++ implementations for mainframes. For a while IBM even had a JVM implementation for IBM i (AS/400) that would translate JVM bytecodes into IBM i ones.

Additionally, all of them have POSIX environments (think WSL, but for mainframes), which here means anything that goes into AIX, or a few selected enterprise distros like Red Hat and SUSE.

specialist · 40m ago
Most impressive.

I would enjoy an ELI5 for the market differences between commodity chips and these mainframe grade CPUs. Stuff like design, process, and supply chain, anything of interest to a general (nerd) audience.

IBM sells 100s of Z mainframes per year, right? Each can have a bunch of CPUs, right? So Samsung is producing 1,000s of Telums per year? That seems incredible.

Given such low volumes, that's a lot more verification and validation, right?

Foundries have to keep running to be viable, right? So does Samsung bang out all the Telums for a year in one burst, then switch to something else? Or do they keep producing a steady trickle?

Not that this info would change my daily work or life in any way. I'm just curious.

TIA.

bell-cot · 43m ago
Interesting to compare this to ZFS's ARC / MFU vs MRU / Ghost / L2ARC / etc. strategy for (disk) caching. IIRC, those were mostly IBM-developed technologies.
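
For readers unfamiliar with the MRU/MFU split mentioned above, here is a minimal sketch of a two-list cache that separates entries seen once from entries seen repeatedly. It is deliberately not ARC proper: there are no ghost lists and no adaptive sizing, and the class and method names are hypothetical.

```python
from collections import OrderedDict


class TwoListCache:
    """Toy cache with a recency list (MRU side) and a frequency list (MFU side).

    Simplification: real ARC also keeps ghost lists of recently evicted keys
    and uses them to adapt the split between the two lists. None of that is
    modelled here.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.recent = OrderedDict()    # inserted, not yet read again
        self.frequent = OrderedDict()  # read at least once after insertion

    def get(self, key):
        if key in self.frequent:
            self.frequent.move_to_end(key)
            return self.frequent[key]
        if key in self.recent:
            # First re-use: promote from the recency list to the frequency list.
            value = self.recent.pop(key)
            self.frequent[key] = value
            return value
        return None

    def put(self, key, value):
        if key in self.frequent:
            self.frequent[key] = value
            self.frequent.move_to_end(key)
            return
        if key in self.recent:
            self.recent[key] = value
            self.recent.move_to_end(key)
            return
        if len(self.recent) + len(self.frequent) >= self.capacity:
            # Evict from the recency list first, so a one-off scan cannot
            # flush entries that have proven themselves hot.
            victims = self.recent if self.recent else self.frequent
            victims.popitem(last=False)
        self.recent[key] = value


cache = TwoListCache(capacity=3)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")            # "a" is re-used, so it moves to the frequency list
for key in ("c", "d", "e"):
    cache.put(key, 0)     # a short scan of one-off keys
print(cache.get("a"), cache.get("b"))
```

The scan of `c`, `d`, `e` pushes out `b`, which was never re-used, while `a` survives because it had been promoted.
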
msbah · 8m ago
co fcnc