That's from the period when there was no standardization of how the CPU talked to the graphics device. Triangles or quads? Shared memory or command queues? DMA from the CPU side or the graphics device side? Graphics as part of the CPU/memory system or as part of the display system? Can the GPU cause page faults which are serviced by the virtual memory system?
Now we have Vulkan. Vulkan standardizes some things, but has a huge number of options because hardware design decisions are exposed at the Vulkan interface. You can transfer data from CPU to GPU via DMA or via shared memory. Memory can be mapped for bidirectional transfer, or for one-way transfer in either direction. Such transfers are slower than normal memory accesses. You can ask the GPU to read textures from CPU memory because GPU memory is full, which also carries a performance penalty. Or you can be on an "integrated graphics" machine where CPU and GPU share the same memory. Most hardware offers some, but not all, of those options.
This is why a lot of stuff still uses OpenGL, which hides all that.
(I spent a few years writing AutoCAD drivers for devices now best forgotten, and later trying to get 3D graphics to work on PCs in the 1990s. I got to see a lot of graphics boards best forgotten.)
saltcured · 1h ago
And that was an evolution of earlier 2D cards where you had a potential mixture of CPU-addressable framebuffer and various I/O ports to switch modes between text and raster graphics, adjust video modes in DACs, adjust color palette lookup tables, load fonts for text modes, and maybe address some 2D coprocessors for things like "blitting" (kind of like rectangular 2D DMA), line drawing, or even some basic polygonal rendering with funny options like dithering or stipple shading...
mrandish · 3h ago
This kind of retrospective from key people who were involved is invaluable from an historical perspective. I find hearing first-hand accounts of the context, assumptions, thought processes, internal debates, technical limitations, business realities and even dumb luck a good way to not only understand how we got here but how to do as well (or better) going forward.
While the nitty gritty detail of recollections captured when still fresh in memory can be fascinating, I especially appreciate reflections written a few decades later as it allows putting the outcomes of key decisions in perspective, as well as generally enabling more frank assessments thanks to fewer business and personal concerns.
rhdjsjebshjffn · 2h ago
I'm excited about this too, but it's a little concerning there's a brand in the title. There's no shortage of people from ATI, Intel, AMD, Apple, IBM, the game gaggle, etc. to interview. The fact that Nvidia succeeded where others failed is largely an artifact of luck.
jjtheblunt · 1h ago
Nvidia’s Cg language made developers prefer their hardware, I’d say.
cadamsdotcom · 1h ago
> all an application could do was to invoke methods on virtual objects .. the application could not know whether the object was implemented in hardware or in the resource manager's software. The flexibility to make this decision at any time was a huge advantage. As Kim quotes Michael Hara as saying:
> “This was the most brilliant thing on the planet. It was our secret sauce. If we missed a feature or a feature was broken, we could put it in the resource manager and it would work.”
Absolutely brilliant. Understand the strengths and weaknesses of your tech (slow/updateable software vs fast/frozen hardware) then design the product so a missed deadline won’t sink the company. A perfect combo of technically savvy management and clever engineering.
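A minimal sketch of the idea in the quote, assuming a dispatcher that prefers a hardware implementation and falls back to software. Every name here (`ResourceManager`, `invoke`, `hw_fill_rect`, `sw_draw_line`) is hypothetical and chosen for illustration; none of it is NVIDIA's actual interface.

```python
class ResourceManager:
    """Hypothetical dispatcher: the caller never learns which side ran the method."""

    def __init__(self, hardware_methods, software_methods):
        self._hardware = dict(hardware_methods)
        self._software = dict(software_methods)

    def invoke(self, method, *args):
        # Prefer the hardware implementation; fall back to software if it is missing.
        handler = self._hardware.get(method, self._software.get(method))
        if handler is None:
            raise NotImplementedError(f"no implementation for {method!r}")
        return handler(*args)


def hw_fill_rect(width, height):
    # Stand-in for a feature that made it into silicon.
    return width * height


def sw_draw_line(x0, x1):
    # Stand-in for a feature that missed the chip and was added in software.
    return abs(x1 - x0) + 1


rm = ResourceManager(
    hardware_methods={"fill_rect": hw_fill_rect},
    software_methods={"draw_line": sw_draw_line},
)

pixels_filled = rm.invoke("fill_rect", 4, 3)   # served by "hardware"
pixels_drawn = rm.invoke("draw_line", 2, 9)    # served by "software"
```

The application side calls `invoke` the same way in both cases, which is what would let a missing or broken feature move into software without the application changing.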
jacobgorm · 2h ago
I remember sitting next to David Rosenthal at a conference reception (must have been FAST, which makes sense given his involvement with LOCKSS) in San Jose some time around 2010 or 2011, not knowing up front who he was. He explained some of the innovations he had made at NVIDIA around making the hardware more modular and easier for parallel teams to work on, and we chatted about the rumors I had heard about SUN thinking about licensing the Amiga hardware, which he confirmed but said would have been a bad idea, because the hardware didn't support address space protection. I guess I didn't know enough about him or NVIDIA to be sufficiently impressed at the time, but he was a very friendly and down to earth person.
pjmlp · 5h ago
I wanted to buy a Voodoo card, and because of a PCI version incompatibility, had to trade it back for a Riva TNT.
Back then I was quite p***d not being able to keep the Voodoo, how little did I know how it was going to turn out.
ahartmetz · 29m ago
I had a Riva TNT at the time, and a friend's Voodoo... 2? sure ran Half-Life better than my card. The Voodoo 2 probably really was the better GPU at the time. Later games required later APIs (unsupported by older cards) - and much higher performance - anyway, so longevity was not much either way.
stewarts · 2h ago
We hold you singularly responsible for the eventual failure of the Voodoo3/4/5 and Nvidia domination.
pjmlp · 59m ago
Sorry.... :)
hackyhacky · 3h ago
> At a time when PC memory maxed out at 640 megabytes,
Pretty sure the author meant to write 640 kilobytes.
usefulcat · 3h ago
It's hard to tell exactly what time frame the author is referencing there. For context, NV1 was released in '95, by which time it was not uncommon for a new PC to have 8-16 MB of memory (I had a 486 with 16 MB by '94). Especially if you planned to use it for gaming.
rjsw · 3h ago
Maybe that is what they were thinking but anything designed to work with a PCI bus would have been introduced after PCs became capable of using more memory than that.
npalli · 2h ago
The sentence and paragraph that make it clear this was megabytes and not kilobytes:
> At a time when PC memory maxed out at 640 megabytes, the fact that the PCI bus could address 4 gigabytes meant that quite a few of its address bits were surplus. So we decided to increase the amount of data shipped in each bus cycle by using some of them as data. IIRC NV1 used 23 address bits, occupying 1/512th of the total space. 7 of the 23 selected one of the 128 virtual FIFOs, allowing 128 different processes to share access to the hardware. We figured 128 processes was plenty.
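The numbers in that paragraph can be checked with a few lines of arithmetic. This is only a sketch: the quote does not say which of the 23 bits select the FIFO, so placing them in the top 7 bits (as `split_offset` does) is an assumption, and the function name is hypothetical.

```python
PCI_ADDRESS_BITS = 32   # 4 gigabytes of addressable space
NV1_ADDRESS_BITS = 23   # from the quoted paragraph
FIFO_SELECT_BITS = 7    # from the quoted paragraph

pci_space = 2 ** PCI_ADDRESS_BITS
nv1_window = 2 ** NV1_ADDRESS_BITS

# The 23-bit window is 1/512th of the 32-bit space.
fraction_denominator = pci_space // nv1_window

# 7 bits select one of 128 virtual FIFOs.
fifo_count = 2 ** FIFO_SELECT_BITS

# The remaining 16 bits address locations within one FIFO's slice.
bytes_per_fifo = 2 ** (NV1_ADDRESS_BITS - FIFO_SELECT_BITS)


def split_offset(offset):
    """Split a 23-bit offset into (fifo_index, offset_within_fifo).

    Assumes the FIFO index occupies the top 7 bits; the source does not say.
    """
    low_bits = NV1_ADDRESS_BITS - FIFO_SELECT_BITS
    return offset >> low_bits, offset & ((1 << low_bits) - 1)
```

Under that assumed layout, `split_offset(0x7FFFFF)` gives FIFO 127 and the last offset in its slice.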
AStonesThrow · 2h ago
Okay but "640" is a completely fictitious number for installed RAM in any given PC.
PC memory was nearly always sold in powers of two. So you could have SIMMs in capacity of 1MiB, 2MiB, 4, 8, 16MiB. You could usually mix-and-match these memory modules, and some PCs had 2 slots, some had 4, some had a different number of slots.
So if you think about 4 slots that can hold some sort of maximum, we're thinking 64MiB is a very common maximum for a consumer PC, and that may be 2x32 or 4x16MiB. Lots of people ran up against that limit for sure.
640MiB is an absurd number if you think mathematically. How do you divide that up? If 4 SIMMs are installed, then their capacity is 160MiB each? No such hardware ever existed. IIRC, individual SIMMs were commonly maxed at 64MiB, and it was not physically possible to make a "monster memory module" larger than that.
Furthermore, while 64MiB requires 26 bits to address, 640MiB requires 30 address bits on the bus. If a hypothetical PC had 640MiB in use by the OS, then only 2 pins would be unused on the address bus! That is clearly at odds with their narrative that they were able to "borrow" several more!
This is clearly a typo and I would infer that the author meant to write "64 megabytes" and tacked on an extra zero, out of habit or hyperbole.
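For reference, the bit counts quoted above can be reproduced as follows. This is a quick sketch that only checks the arithmetic (how many address bits a given memory size needs on a 32-bit bus); it says nothing about which memory size the author intended. The helper name `address_bits` is made up for this example.

```python
MIB = 2 ** 20
BUS_BITS = 32  # 32-bit address space, i.e. 4 gigabytes


def address_bits(size_bytes):
    """Number of address bits needed to reach every byte of size_bytes."""
    return (size_bytes - 1).bit_length()


bits_64 = address_bits(64 * MIB)    # 26
bits_640 = address_bits(640 * MIB)  # 30

spare_64 = BUS_BITS - bits_64       # 6 bits left over
spare_640 = BUS_BITS - bits_640     # 2 bits left over
```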
chadaustin · 1h ago
You are straight up wrong. The first computer I ever built was a Pentium 2, RivaTNT, and it had 640 MB RAM.
I can’t find the purchase receipts or specific board brand but it had four SDRAM slots, and I had it populated with 2x64 and 2x256.
Edit: Found it in some old files of mine:
I was wrong! Not four DIMM slots... three! One must have been 128 and the other two 256.
Pentium II 400, 512k cache
Abit BF6 motherboard
640 MB PC100 SDRAM
21" Sony CPD-G500 (19.8" viewable, .24 dot pitch)
17" ViewSonic monitor (16" viewable, .27 dot pitch)
RivaTNT PCI video card with 16 MB VRAM
Creative SB Live!
Creative 5x DVD, 32x CD drive
Sony CD-RW (2, 4, 24)
80 GB Western Digital ATA/100
40 GB Western Digital ATA/100
17.2 GB Maxtor UltraDMA/33 HDD
10.0 GB Maxtor UltraDMA/33 HDD
Cambridge SoundWorks FourPointSurround FPS2000 Digital
3Com OfficeConnect 10/100 EtherNet card
3 Microsoft SideWinder Gamepads
Labtec AM-252 Microphone
Promise IDE Controller card
Hauppauge WinTV-Theatre Tuner Card
artyom · 2h ago
This reads as one of the many engineering marvel stories (e.g. Bell Labs, Xerox) where revolutionary technology is created by a combination of (a) clever engineers with enough "free" time, and (b) no clueless managers around.
whyowhy3484939 · 1h ago
You can read in Kernighan's History of Unix that really good managers - "enlightened management" IIRC - were involved, and not just involved: some of them were absolutely crucial, or Unix wouldn't have existed. It's not like you can just let loose a couple of big brains and things will work out fine. They won't (and didn't).
Really fascinating story—thanks for sharing! Graphics programming has been a major driving force behind the widespread adoption of object-oriented programming, and the abstraction of devices in this context is truly elegant.
rjsw · 4h ago
I still have a NV1 card.
christkv · 33m ago
Me too. I also have a Rendition Verite card, which I guess in some ways is the first real fully programmable consumer GPU, as it has a RISC processor.
rjsw · 28m ago
I also have an even older NEC ISA card with a TI TMS34010 chip [1], that was a programmable CPU/GPU.
[1] https://en.wikipedia.org/wiki/TMS34010