That's from the period when there was no standardization of how the CPU talked to the graphics device. Triangles or quads? Shared memory or command queues? DMA from the CPU side or the graphics device side? Graphics as part of the CPU/memory system or as part of the display system? Can the GPU cause page faults which are serviced by the virtual memory system?
Now we have Vulkan. Vulkan standardizes some things, but has a huge number of options because hardware design decisions are exposed at the Vulkan interface. You can transfer data from CPU to GPU via DMA or via shared memory. Memory can be mapped for bidirectional transfer, or for one-way transfer in either direction. Such transfers are slower than normal memory accesses. You can ask the GPU to read textures from CPU memory because GPU memory is full, which also carries a performance penalty. Or you can be on an "integrated graphics" machine where CPU and GPU share the same memory. Most hardware offers some, but not all, of those options.
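(Not from the article, just an illustration: below is a minimal sketch of the memory-type query through which a Vulkan application discovers which of those options its particular hardware actually offers. The helper name and flag choices are my own, not anything the comment above prescribes.)

    #include <vulkan/vulkan.h>
    #include <stdint.h>

    /* Illustrative helper: scan the device's advertised memory types for one that
       is allowed by the resource (type_bits comes from VkMemoryRequirements) and
       has the properties we want, e.g. DEVICE_LOCAL for GPU-only memory, or
       HOST_VISIBLE | HOST_COHERENT for memory the CPU can map directly.
       Returns UINT32_MAX if this hardware simply doesn't offer that combination. */
    static uint32_t find_memory_type(VkPhysicalDevice phys,
                                     uint32_t type_bits,
                                     VkMemoryPropertyFlags wanted)
    {
        VkPhysicalDeviceMemoryProperties props;
        vkGetPhysicalDeviceMemoryProperties(phys, &props);

        for (uint32_t i = 0; i < props.memoryTypeCount; i++) {
            int allowed = (type_bits >> i) & 1u;
            int matches = (props.memoryTypes[i].propertyFlags & wanted) == wanted;
            if (allowed && matches)
                return i;
        }
        return UINT32_MAX;
    }

On a discrete card you typically stage through a HOST_VISIBLE type and copy into a DEVICE_LOCAL one; on integrated graphics a single type often carries both flags and the copy can be skipped.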
This is why a lot of stuff still uses OpenGL, which hides all that.
(I spent a few years writing AutoCAD drivers for devices now best forgotten, and later trying to get 3D graphics to work on PCs in the 1990s. I got to see a lot of graphics boards best forgotten.)
saltcured · 4h ago
And that was an evolution of earlier 2D cards where you had a potential mixture of CPU-addressable framebuffer and various I/O ports to switch modes between text and raster graphics, adjust video modes in DACs, adjust color palette lookup tables, load fonts for text modes, and maybe address some 2D coprocessors for things like "blitting" (kind of like rectangular 2D DMA), line drawing, or even some basic polygonal rendering with funny options like dithering or stipple shading...
mrandish · 7h ago
This kind of retrospective from key people who were involved is invaluable from a historical perspective. I find hearing first-hand accounts of the context, assumptions, thought processes, internal debates, technical limitations, business realities and even dumb luck a good way not only to understand how we got here but also how to do as well (or better) going forward.
While the nitty gritty detail of recollections captured when still fresh in memory can be fascinating, I especially appreciate reflections written a few decades later as it allows putting the outcomes of key decisions in perspective, as well as generally enabling more frank assessments thanks to fewer business and personal concerns.
rhdjsjebshjffn · 5h ago
I'm excited about this too, but it's a little concerning there's a brand in the title. There's no shortage of those to interview from ATI, Intel, AMD, Apple, IBM, the game gaggle, etc. The fact that nvidia succeeded where others failed is largely an artifact of luck.
mrandish · 2h ago
> largely an artifact of luck.
I disagree with "largely". Luck is always a factor in business success and there are certainly some notable examples where luck was, arguably, a big enough factor that "largely" would apply - like Broadcast.com's sale to Yahoo right at the peak of the .com bubble. However, I'm not aware of evidence luck was any more of a factor in NVidia's success than the ambient environmental constant it always is for every business. Luck is like the wind in competitive sailing - it impacts everyone, sometimes positively, sometimes negatively.
Achieving and then sustaining substantial success over the long run requires making a lot of choices correctly as well as top notch execution. The key is doing all of that so consistently and repeatedly that you survive long enough for the good and bad luck to cancel each other out. NVidia now has over 30 years of history through multiple industry-wide booms, downturns and fundamental technology transitions - a consistent track record of substantial, sustained success so long that good luck can't plausibly be a significant factor.
That said, to me, this article didn't try to explain NVidia's long-term business success. It focused on a few key architectural decisions made early on which were, arguably, quite risky in that they could have wasted a lot of development on capabilities which didn't end up mattering. However, they did end up paying off and, to me, the valuable insight was that key team members came from a different background than their competitors and their experiences with multi-user, multi-tasking, virtualized mini and mainframe architectures caused them to believe desktop architectures would evolve in that direction sooner rather than later. The takeaway being akin to "skate to where the puck is going, not where it is." In rapidly evolving tech environments, making such predictions is greatly improved when the team has both breadth and depth of experience in relevant domains.
jjtheblunt · 4h ago
Nvidia’s Cg language made developers prefer their hardware, I’d say.
No comments yet
jacobgorm · 5h ago
I remember sitting next to David Rosenthal at a conference reception (must have been FAST, which makes sense given his involvement with LOCKSS) in San Jose some time around 2010 or 2011, not knowing up front who he was. He explained some of the innovations he had made at NVIDIA around making the hardware more modular and easier for parallel teams to work on, and we chatted about the rumors I had heard about SUN thinking about licensing the Amiga hardware, which he confirmed but said would have been a bad idea, because the hardware didn't support address space protection. I guess I didn't know enough about him or NVIDIA to be sufficiently impressed at the time, but he was a very friendly and down to earth person.
cadamsdotcom · 4h ago
> all an application could do was to invoke methods on virtual objects ... the application could not know whether the object was implemented in hardware or in the resource manager's software. The flexibility to make this decision at any time was a huge advantage. As Kim quotes Michael Hara as saying:
> “This was the most brilliant thing on the planet. It was our secret sauce. If we missed a feature or a feature was broken, we could put it in the resource manager and it would work.”
Absolutely brilliant. Understand the strengths and weaknesses of your tech (slow/updateable software vs fast/frozen hardware) then design the product so a missed deadline won’t sink the company. A perfect combo of technically savvy management and clever engineering.
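A minimal sketch of the pattern being described (hypothetical C, purely illustrative names and structure, not NVIDIA's actual code): the application calls a method on an object and never learns whether the method runs on silicon or in the resource manager's software.

    #include <stdio.h>

    /* The application only ever sees this interface; it has no way to tell
       whether a method is implemented in hardware or emulated in software. */
    typedef struct gfx_object gfx_object;
    struct gfx_object {
        void (*draw_triangle)(gfx_object *self, int x0, int y0,
                              int x1, int y1, int x2, int y2);
    };

    /* Fast path: the chip implements this method, so just queue a command. */
    static void hw_draw_triangle(gfx_object *self, int x0, int y0,
                                 int x1, int y1, int x2, int y2) {
        (void)self;
        printf("hardware draws (%d,%d)-(%d,%d)-(%d,%d)\n", x0, y0, x1, y1, x2, y2);
    }

    /* Fallback: the feature is missing or broken in this chip revision, so
       the resource manager does the work on the CPU instead. */
    static void sw_draw_triangle(gfx_object *self, int x0, int y0,
                                 int x1, int y1, int x2, int y2) {
        (void)self;
        printf("resource manager emulates (%d,%d)-(%d,%d)-(%d,%d)\n",
               x0, y0, x1, y1, x2, y2);
    }

    int main(void) {
        int chip_has_triangles = 0;   /* decided once, at object-creation time */
        gfx_object obj = { chip_has_triangles ? hw_draw_triangle
                                              : sw_draw_triangle };
        obj.draw_triangle(&obj, 0, 0, 10, 0, 5, 8);  /* caller is identical either way */
        return 0;
    }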
pjmlp · 8h ago
I wanted to buy a Voodoo card, but due to an incompatible PCI version I had to trade it back for a Riva TNT.
Back then I was quite p***d about not being able to keep the Voodoo; little did I know how it was going to turn out.
stewarts · 5h ago
We hold you singularly responsible for the eventual failure of the Voodoo3/4/5 and Nvidia domination.
pjmlp · 4h ago
Sorry.... :)
ahartmetz · 3h ago
I had a Riva TNT at the time, and a friend's Voodoo... 2? sure ran Half-Life better than my card. The Voodoo 2 probably really was the better GPU at the time. Later games required later APIs (unsupported by older cards) - and much higher performance - anyway, so longevity wasn't much of an advantage either way.
hackyhacky · 7h ago
> At a time when PC memory maxed out at 640 megabytes,
Pretty sure the author meant to write 640 kilobytes.
usefulcat · 6h ago
It's hard to tell exactly what time frame the author is referencing there. For context, NV1 was released in '95, by which time it was not uncommon for a new PC to have 8-16 MB of memory (I had a 486 with 16 MB by '94). Especially if you planned to use it for gaming.
npalli · 5h ago
The sentence and paragraph which make it clear that this was megabytes and not kilobytes:
> At a time when PC memory maxed out at 640 megabytes, the fact that the PCI bus could address 4 gigabytes meant that quite a few of its address bits were surplus. So we decided to increase the amount of data shipped in each bus cycle by using some of them as data. IIRC NV1 used 23 address bits, occupying 1/512th of the total space. 7 of the 23 selected one of the 128 virtual FIFOs, allowing 128 different processes to share access to the hardware. We figured 128 processes was plenty.
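To make the arithmetic concrete, here's a purely hypothetical decode in C. The 23-bit window and the 7-bit FIFO field come straight from the quote; which bit positions were actually used is my assumption, not the real NV1 layout.

    #include <stdint.h>

    #define WINDOW_BITS 23u   /* 2^23 bytes claimed: 1/512 of the 4 GiB PCI space */
    #define FIFO_BITS    7u   /* 7 of those bits select one of 128 virtual FIFOs */

    typedef struct {
        uint32_t fifo;   /* which process's FIFO this access belongs to (0..127) */
        uint32_t data;   /* the remaining 16 bits, reusable as data per bus cycle */
    } nv_access;

    /* Assumption: the FIFO field sits in the top 7 bits of the 23-bit window. */
    static nv_access decode(uint32_t pci_addr) {
        uint32_t in_window = pci_addr & ((1u << WINDOW_BITS) - 1u);
        nv_access a;
        a.fifo = in_window >> (WINDOW_BITS - FIFO_BITS);
        a.data = in_window & ((1u << (WINDOW_BITS - FIFO_BITS)) - 1u);
        return a;
    }

Note that 23 - 7 leaves 16 bits, which may be the "16 bits for data" the article mentions next, though the quote doesn't spell that out.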
AStonesThrow · 5h ago
Okay but "640" is a completely fictitious number for installed RAM in any given PC.
PC memory was nearly always sold in powers of two. So you could have SIMMs in capacities of 1MiB, 2MiB, 4, 8, or 16MiB. You could usually mix-and-match these memory modules, and some PCs had 2 slots, some had 4, some had a different number of slots.
So if you think about 4 slots that can each hold some sort of maximum, 64MiB was a very common maximum for a consumer PC, whether as 2x32 or 4x16MiB. Lots of people ran up against that limit for sure.
640MiB is an absurd number if you think mathematically. How do you divide that up? If 4 SIMMs are installed, then their capacity is 160MiB each? No such hardware ever existed. IIRC, individual SIMMs were commonly maxed at 64MiB, and it was not physically possible to make a "monster memory module" larger than that.
Furthermore, while 64MiB requires 26 bits to address, 640MiB requires 30 address bits on the bus. If a hypothetical PC had 640MiB in use by the OS, then only 2 pins would be unused on the address bus! That is clearly at odds with their narrative that they were able to "borrow" several more!
This is clearly a typo and I would infer that the author meant to write "64 megabytes" and tacked on an extra zero, out of habit or hyperbole.
chadaustin · 4h ago
You are straight up wrong. The first computer I ever built was a Pentium 2, RivaTNT, and it had 640 MB RAM.
I can’t find the purchase receipts or specific board brand but it had four SDRAM slots, and I had it populated with 2x64 and 2x256.
Edit: Found it in some old files of mine:
I was wrong! Not four DIMM slots... three! One must have been 128 and the other two 256.
Pentium II 400, 512k cache
Abit BF6 motherboard
640 MB PC100 SDRAM
21" Sony CPD-G500 (19.8" viewable, .24 dot pitch)
17" ViewSonic monitor (16" viewable, .27 dot pitch)
RivaTNT PCI video card with 16 MB VRAM
Creative SB Live!
Creative 5x DVD, 32x CD drive
Sony CD-RW (2, 4, 24)
80 GB Western Digital ATA/100
40 GB Western Digital ATA/100
17.2 GB Maxtor UltraDMA/33 HDD
10.0 GB Maxtor UltraDMA/33 HDD
Cambridge SoundWorks FourPointSurround FPS2000 Digital
3Com OfficeConnect 10/100 EtherNet card
3 Microsoft SideWinder Gamepads
Labtec AM-252 Microphone
Promise IDE Controller card
Hauppauge WinTV-Theatre Tuner Card
Clamchop · 2h ago
The article is a touch confusing, but I'm pretty sure I agree that they meant the 640 kilobyte limit of the OG PC architecture. The Pentium II dates from 1997, the NV1 to 1995, and the new PCI bus with its whopping 32 bits to 1992. 640MiB would have been a prodigious amount of memory at the time of launch.
I don't think any mathematical relationship between the address bus and either 640KiB or 640MiB was intended, it was just the anchor point for how huge 4GiB of addressing was viewed at the time.
The article then goes on to say that the NV1 used 23 bits of the address bus but adds in the next paragraph that 16 bits remained to use for data. That math isn't working out for me.
Actually, I'm really struggling to understand how this scheme would work at all. It strongly implies open addressing with no other MMIO devices to conflict with, but that's just not how I thought PCI worked. Maybe someone who knows more can explain it to me.
npalli · 1h ago
My reading was that 640MiB was seen as some extraordinary upper bound that was unlikely to be breached in 1995, leaving a lot of bits of the address bus for NV1 for quite some time. 640 KiB was definitely not the limit, as even the IBM PC/AT, released in 1984(!), had an upper limit of 16MiB. So, as NV1's designer, you could not assume 640KiB was some sort of upper bound on PCs of 1995 when designing the scheme. As to why 640MiB and not something else, I believe Windows 95 could address 2GB in theory but would start becoming unstable around 512MiB, so maybe he chose 640MiB.
The whole thing is a bit ironic, since Bill Gates took great pains to say he never said 640KiB is all you need (or something like that). Apocryphal or not, given my example of the IBM PC/AT, 640KiB was definitely not the common understanding of the upper limit in 1995.
Clamchop · 1h ago
Yeah, I dunno. Besides being a lot for 1995, the address space stuff, if taken at face value, means you'd only have to nick 3 bits off before starting to eat into it. Shrug
accrual · 1h ago
Impressive build, probably sounded pretty wild upon startup with 4x ATA drives!
trentnelson · 2h ago
Oh man, the Abit motherboards! That takes me back. How much did this cost and at what time? Presume very late 90s.
AStonesThrow · 3h ago
Alright then! Humbly, I stand corrected about my poor speculation without research. It looks like 640MiB was a perfectly achievable configuration, especially with 2x256+2x64 or such. That is, I must say, a huge amount of RAM. Like, way more than any video game ever specified in HW requirements. What use cases of that era could use up 640MiB, I wouldn't know!
I remain a bit mystified about why it would be a hard maximum, though. Did such motherboards prevent the user from installing 4x256MiB for a cool 1GiB of DRAM? Was the OS having trouble addressing or utilizing it all? 640MiB is not a mathematical sort of maximum I was familiar with from the late 1990s. 4GiB is obviously your upper limit, with a 32-bit address bus... and again, if 640MiB were installed, that's only 2 free bits on that bus.
So I'm still a little curious about this number being dropped in the article. More info would be enlightening! And thank you for speaking up to correct me! No wonder it was down-voted!
chadaustin · 2h ago
I did a bunch of media and software development back then so RAM helped a lot. Why 640? Not sure. My particular board could have gone up to 768. I did some googling and found some boards that maxed out at 1 GB.
That was a weird time in computing. Things were getting fast and big quickly (not that many years later, I built a dual-socket Xeon at 2.8 GHz, and before that my brother had a dual socket P3 at 700 MHz.) but all the expansion boards were so special-purpose. I remember going out of my way to pick a board with something like seven expansion slots.
But I think your question about why the author said 640 is fair! Maybe they had a machine like mine around then. Or maybe it’s something NVIDIA was designing around?
rjsw · 6h ago
Maybe that is what they were thinking but anything designed to work with a PCI bus would have been introduced after PCs became capable of using more memory than that.
killme2008 · 7h ago
Really fascinating story—thanks for sharing! Graphics programming has been a major driving force behind the widespread adoption of object-oriented programming, and the abstraction of devices in this context is truly elegant.
artyom · 5h ago
This reads as one of the many engineering marvel stories (e.g. Bell Labs, Xerox) where revolutionary technology is created by a combination of (a) clever engineers with enough "free" time, and (b) no clueless managers around.
whyowhy3484939 · 4h ago
You can read in Kernighan's History of Unix that really good managers - "enlightened management" IIRC - were involved, and not just involved: some of them were absolutely crucial, or Unix wouldn't have existed. It's not like you can just let loose a couple of big brains and things will work out fine. They won't (and didn't).
trinsic2 · 3h ago
I stopped reading right here:
> Because Nvidia became one of the most valuable companies in the world, there are now two books explaining its rise and extolling the genius of Jensen Huang,
Yeah, he's a real genius. (Sarcasm.) He is a marketing guy; there is no genius behind this man.
The fact that Nvidia uses its market position to cause harm to the industry by strong-arming partners to toe the line makes this company a problem, just like all the others. They operate like any other predatory corporation.
tasty_freeze · 45s ago
> He is a marketing guy,
You would be wrong. He worked at AMD as a design engineer, and later went to LSI Logic, helping customers put out custom ASICs. One of his customers, a big customer, was Sun, helping with their SPARC processor and the GX graphics chips, and no doubt many others.
In 1989-1991 I did three ASICs at LSI Logic -- and Jensen was my liaison there on the latter two. He was incredibly smart, hard working, technically knowledgeable, kind, patient, and generous with his time despite being very busy.
The marketing stuff came later (or maybe said better: it was latent and it came out later).
jwmcq · 2h ago
Probably read the rest? I did not see Jensen's name on any of the patents that this key engineer discusses the detail and rationale of, and I feel that those names are listed fairly deliberately.