The Voodoo cards had no right to look as good as they did for their time. Someone rebuilding one from scratch is exactly the kind of project HN was made for.
I had a Voodoo3, can't remember the model number anymore, but my friend who had a TNT2 would often comment about how much worse the 3dfx's 16bit color looked vs the TNT2's 32bit. I could never tell a difference.
Thank you! These things do pack in a ridiculous amount of functionality for what they do. Probably why they look so good but also why it took 30 years for a hardware re-implementation.
Yes but the Nvidia NV-1 preceding the Voodoo was much more impressive. Using NURBS you could display perfectly round objects. Also it had forward texture mapping which significantly improves cache utilization and would be beneficial even today.
It was just way harder to program for. Triangles are much simpler to understand than bezier curves after all. And after Microsoft declared that DirectX only supports triangles the NV-1 was immediately dead.
> Also it had forward texture mapping which significantly improves cache utilization and would be beneficial even today.
Not really. Forward texture mapping simplifies texture access by making framebuffer access non-linear, reverse texture mapping has the opposite tradeoff. But that is assuming rectangular textures without UV mapping, like the Sega Saturn did; the moment you use UV mapping texture access will be non-linear no matter what. Besides that, forward texture mapping has serious difficulties the moment texture and screen sampling ratios don't match, which is pretty much always.
There is a reason why only the Saturn and the NV-1 used forward texture mapping, and the technology was abandoned afterwards.
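To make the sampling-ratio problem concrete, here is a one-dimensional toy in Python. It is only a sketch: the function names and the single `scale` factor are made up for illustration, and it does not model how the Saturn or the NV-1 actually rasterized. Forward mapping walks the texels and scatters them to the screen; inverse mapping walks the pixels and gathers from the texture.

```python
def forward_map(src, scale, out_len):
    """Forward mapping: visit texels in order, scatter each one to a screen pixel."""
    dst = [None] * out_len
    for u, texel in enumerate(src):
        x = int(u * scale)  # screen position this texel lands on
        if 0 <= x < out_len:
            dst[x] = texel  # may overwrite an earlier texel (overdraw)
    return dst


def inverse_map(src, scale, out_len):
    """Inverse mapping: visit pixels in order, gather one texel for each."""
    dst = [None] * out_len
    for x in range(out_len):
        u = min(int(x / scale), len(src) - 1)  # texel covering this pixel
        dst[x] = src[u]
    return dst


texture = ["a", "b", "c", "d"]

# Magnification (2 pixels per texel): forward mapping leaves gaps.
magnified_forward = forward_map(texture, 2.0, 8)
magnified_inverse = inverse_map(texture, 2.0, 8)

# Minification (2 texels per pixel): forward mapping writes pixels twice.
minified_forward = forward_map(texture, 0.5, 2)

print(magnified_forward)
print(magnified_inverse)
print(minified_forward)
```

The texture reads in `forward_map` are perfectly sequential, which is the cache argument, but every other pixel stays empty when magnifying and half the texel fetches are wasted when minifying. Any forward mapper has to spend extra work covering for that.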
I love the software look so much though! I never did like the blurring of textures :)
They're both beautiful in their own way, the darkness and glow in the hardware versions, some certain pixellated charm and roughness in the software version
A 3dfx Voodoo Banshee was the first graphics card I ever bought. I bought it to play the EverQuest beta, which also would have been around 1999. I remember logging into that game for the first time and it felt like a life-changing experience. And it kind of was.
I remember really liking the 3dfx splash screen[1] for some reason. Maybe because it was the only thing that actually ran smoothly on that card. But still, I was a loyal 3dfx user - probably because of their marketing which someone else mentioned in the comments - and was sad when it went out of business a couple years later.
I exhausted my teenage savings to buy the Voodoo 1 due to the Linux support. Granted, I was running Red Hat at the time so the installation consisted of installing what, two RPMs? Played a lot of Q3 and Unreal on that card.
Same here. I remember some kernel module or video driver named tdfx, and then, struggling to make X11 work with this DRI (Direct rendering infrastructure or something like that) setting on. It was very rewarding to see it enabled on glxinfo's output after days compiling half of your system and trying to figure out what was wrong, especially when the access to the internet was limited, and then being able to launch GLtron with hardware acceleration. Also remember playing Quake 3 and America's Army games around that time.
Fun times, now everything is straightforward on Linux but I somehow miss that era when you actually had to do everything by yourself.
Very cool! I am wondering one thing: how fast is it? Much of the "secret sauce" of the Voodoo is its high speed: a first-gen Verite or (God forbid) any ViRGE takes many more cycles for common operations like, say, Z-buffered pixels.
I'm guessing this isn't fully cycle-accurate, but is it at least somewhat "IPC-accurate"? I'm guessing yes? But much of that was also derived from Voodoo's (for the time) crazy high memory bandwidth AFAIK.
The Voodoo was fast but also expensive, and you needed an additional VGA card. I think it was around USD 300 back then, that's more than USD 600 today and you'll still need another card.
Are you sure this is AI? Normally when I read AI written stuff I zone out because it can go entire paragraphs without saying anything. The sentences here seem short and to the point.
Their previous posts published before ChatGPT seem similar enough. Although, they have way more em dashes and this one has none, almost like they were removed on purpose... lol
I'm fairly sure not because I have proof, but because of all the "not this, but that!" clauses.
If you spend time generating text with LLMs, there is a style that you learn to recognize pretty quickly.
Also, to be clear -- I'm not saying that we shouldn't use LLMs to help us produce the best text/prose we can -- but letting them just generate a lot of the text doesn't lead to the best outcome imo.
I find your (and my!) reaction to LLM generated text fascinating. It has a distinct smell, and I honestly can't really put words to why I find it repellent, I just know that I do.
I tend to feel the same way, although I'm actively trying to move past it. I'm OK at writing, but thanks to a combination of educational background and natural aptitude, I'm darned near illiterate at higher math. That puts me behind the 8-ball as an engineer, even though I've been reasonably successful at both hardware and software work. I tend to miss tricks that are obvious to my peers, but when I do manage to come up with something useful, I'm able to communicate with my peers and connect with my customers. While I don't need or want LLM assistance with writing, I can't deny that recent models have been a godsend for getting me out of trouble in the math department.
Now, here's somebody who's clearly strong on the quantitative side of engineering, but presumably bad at communicating the results in English. I consider both skill sets to be of equal importance, so what right do I have to call them out for using AI to "cheat" at English when I rely on it myself to cover my own lack of math-fu? Is it just that I can conceal my use of leading-edge tools for research and reasoning, while they can't hide their own verbal handicap?
That doesn't sound fair. I would like to adopt a more progressive outlook with regard to this sort of thing, and would encourage others to do the same. This particular article isn't mindless slop and it shouldn't be rejected as such.
Besides all that, before long it won't be possible to call AI writing out anyway. We can get over it now or later. Either way, we'll have to get over it.
> before long it won't be possible to call AI writing out anyway
Once we're there, we're there. Tree falling in a forest with no one around, etc. Once that happens then I'll stop reacting badly to it, but it hasn't yet (not without careful prompting anyway).
I find it odd the author adds all these extra semantics to their input registers, rather than keeping the FIFOs, "drain + FIFOs", "float to fixed point converting register", etc as separate components, separate from the task of being memory mapped registers. The central problem they were running into was one where they let the external controller asynchronously change state in the middle of the compute unit using it.
I'm noting down this conetrace for the future though, seems like a useful tool, and they seem to be doing a closed beta of sorts.
Maybe I'm misunderstanding, but that functionality is implemented in another component. The register bank only records the category of each register and implements the memory-mapped register functionality.
This list of registers and their categories is then imported in separate components which sit between incoming writes and the register bank. The advantage is that everything which describes the properties of the registers is in a single file. You don't have to look in three different places to find out how a register behaves.
Well still, why tie this kind of processing to the registers themselves? Sure having a shorthand to instantiate a queue of writes I could see, but float to fixed conversion has no place being part of a memory mapped register bank.
Wouldn't it be more sensible to have one module for converting the AXI-Lite (I presume?) memory map interface to the specific input format of your processor, and then have the processor pull data from this adaptor when it needs it?
That way still all handling of inputs is done in the same place.
Edit: maybe, what it comes down to is: Should the register bank be responsible for storing the state the compute unit is working on, or should the compute unit store that state itself? In my opinion, that responsibility lies with the compute unit. The compute unit shouldn't have to rely on the register bank not changing while it's working.
You do have a nice point here. Then the compute unit can simply stall the commands coming out of the register bank. Without this I need to stall the write FIFO, which feels less elegant and has given me some pain in terms of combinational loops. The drawback though is that you have to duplicate a significant amount of registers in the compute unit.
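For anyone following along, the "compute unit owns its state" idea can be sketched as a behavioural model in Python. This is not the project's HDL; `RegisterBank`, `ComputeUnit`, `start` and the `color` register are hypothetical names picked for illustration. The unit latches a private copy of the registers when a command starts, so later writes from the host cannot change work that is already in flight.

```python
from dataclasses import dataclass, field


@dataclass
class RegisterBank:
    """Memory-mapped registers: the host may write these at any time."""
    regs: dict = field(default_factory=dict)

    def write(self, name, value):
        self.regs[name] = value


@dataclass
class ComputeUnit:
    """Works only from its own shadow copy of the registers."""
    bank: RegisterBank
    shadow: dict = field(default_factory=dict)

    def start(self):
        # Latch a snapshot at command start; this is the duplicated state.
        self.shadow = dict(self.bank.regs)

    def step(self):
        # In-flight work reads the snapshot, never the live bank.
        return self.shadow["color"]


bank = RegisterBank()
bank.write("color", 0x11)

unit = ComputeUnit(bank)
unit.start()               # command 1 latches color = 0x11
bank.write("color", 0x22)  # host writes while command 1 is in flight
in_flight = unit.step()    # still sees the latched value

unit.start()               # command 2 latches the new value
next_cmd = unit.step()

print(hex(in_flight), hex(next_cmd))
```

The `shadow` dictionary is exactly the duplication cost mentioned above: every register the unit depends on exists twice, once in the bank and once in the unit.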
Tangentially related, that screenshot of Screamer 2 caught me off guard completely, I loved that game to death, and I feel I was the only one of my friends to have played it. Tremendous handling model and superb music.
Funny, I saw the news about the new game just before I saw this article. I didn't know it was a reboot at all, I'd never heard of the originals. It looks cool.
It’s been a while since I’ve struggled with Xilinx tools, but I can’t imagine there aren’t any hardware limitations these days. Does this run on a Spartan 6, or do you need the latest UltraScale for it?
This fits and runs in a DE-10 Nano without too much difficulty, uses around 70% of the fabric. I've been working on timing closure and just got it to 50 MHz.
Note that I also implemented cache components not present in the original Voodoo in order to be more flexible in terms of the memory that can be used. So it could be quite a bit smaller, maybe 50% of the fabric if you got rid of that.
That's quite impressive. 70% is obviously way too big for a MiSTer core, but I wonder if one day we will have an affordable FPGA board able to simulate a late '90s PC...
FPGA simulations are a naive attempt to guess at metastability problems by finding a "steady state" latency after a certain amount of simulation time. Clock domain crossing mitigation only gets folks so far, and state propagation issues often get worse with larger and faster chips.
Note, there are oversized hobby Voodoo cards that max out the original ASIC count and memory limits. There are also emulators like 86box that simulate the hardware just fine for old games.
I have such fond memories of my old Voodoo card. Surprised how much nostalgia those pictures evoked - its rendering really had a unique look that this (LLM-generated?) FPGA captured quite well.
IIRC, it was a gigantic (for the time) beast that barely fit in my chassis - BUT it had great driver support for ppc32/macos9 (which was already on its way out), and actually kept my machine going for longer than it had any right to.
And then, like a month after I bought it, NVidia bought 3dfx and immediately stopped supporting the drivers, leaving me with an extremely performant paperweight when I finally upgraded my machine. Thanks Jensen.
I agree, but can't tell if it's the nostalgia speaking. Like, I just went and tried to figure exactly what model of PowerMac my Voodoo card was plugged into, and just got a dangerous rush of nostalgia for model names like "PowerPC 8600" - which is an objectively very boring name but I think it meant something profound to me at one point in my life.
I guess it's cool because it could possibly produce a single board design able to emulate many designs with a flash update including SLI requiring 2 Voodoo cards plus a host 2D card that could all be placed onto said one card. I don't know how one engineers the analog DAC bandwidth to render SVGA faithfully at 1600x1200 @ 60 Hz from a FPGA frame buffer though.
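As a rough sense of scale for the DAC question, the required pixel clock is just total pixels per frame (including blanking) times the refresh rate. The sketch below assumes the frame totals of 2160 x 1250 that are commonly quoted for the VESA 1600x1200 @ 60 Hz mode; treat those two numbers as an assumption to check against the timing standard. The function name is made up for illustration.

```python
def pixel_clock_hz(h_total, v_total, refresh_hz):
    """Pixel clock = pixels per line (incl. blanking) * lines per frame * frames per second."""
    return h_total * v_total * refresh_hz


# Visible pixels only: the rate the frame buffer must be read at on average.
active_rate = 1600 * 1200 * 60

# Assumed totals including horizontal and vertical blanking.
dac_clock = pixel_clock_hz(2160, 1250, 60)

print(f"active pixel rate: {active_rate / 1e6:.1f} Mpixel/s")
print(f"DAC pixel clock:   {dac_clock / 1e6:.1f} MHz")
```

Under those assumptions the DAC has to run at 162 MHz even though only about 115 million visible pixels per second are scanned out.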
Btw, most 8 MiB vintage Voodoo 2 cards can be upgraded to 12 MiB by simply soldering on more RAM. I managed to snag a bunch of legit 125 MHz chips that work with every card produced.
If you want to see what it's supposed to look like, copy the screenshot into GIMP, go into "Color, Levels" and in the "Input Levels" section, there should be a textbox+spinner with a "1.00". Set that to 0.45.
Apparently Voodoo cards defaulted to 1.3 gamma instead of the standard 2.2. I wonder why that is, since in theory using a non-standard gamma would just reduce your color range with no real benefit.
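For anyone who would rather script the Levels tweak than click through GIMP, here is a minimal sketch of a power-law adjustment on one 8-bit channel. It assumes the Levels gamma field applies an exponent of 1/gamma to the normalized value, so 0.45 darkens the midtones; that mapping and the `adjust_gamma` name are my assumptions, not something taken from the project.

```python
def adjust_gamma(value_8bit, gamma):
    """Apply a Levels-style gamma to one 8-bit channel value.

    Assumes output = input ** (1 / gamma) on values normalized to 0..1.
    """
    normalized = value_8bit / 255.0
    return round(255.0 * normalized ** (1.0 / gamma))


# Black and white stay put; only the midtones move.
black = adjust_gamma(0, 0.45)
white = adjust_gamma(255, 0.45)
mid_grey = adjust_gamma(128, 0.45)

print(black, mid_grey, white)
```

With a gamma below 1 the mid grey comes out darker than it went in, which is the direction the screenshots need if they were captured with a brighter-than-standard curve.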
This is definitely fixable in the design though by looking at the DAC gamma register. I'll do so once I get to the scan-out implementation on the DE-10 Nano.
I recall Quake originally being super dark, like you were supposed to be playing this in some basement tomb. But we were 'testing' Pentium Pro workstations in a brightly lit windowed office, so we had to adjust the game's brightness. So I wonder if this was a "make Quake look good for demos" thing.
Nor did their marketing:
https://news.ycombinator.com/item?id=35027437
Getting it working in linux in ~1999 was really not easy, especially for a teenager with no linux experience.
My networking card wasn't working either, so I had to run to a friend's house for dial-up internet access, searching for help on Altavista.
Very cool project. Way above my head, still!
[1] https://www.youtube.com/watch?v=LanTZ_AnAso
I believe I tried redhat, but had issues with that as well. I never went back to it--moved to debian and never looked back.
Also had the issue with modem, paging through the manual figured out the initialisation string
AT&FX1
I don't know what is real anymore.
https://lockbooks.net/pages/overclocked-launch
https://store.steampowered.com/app/2814990/Screamer/
Or does this only run in simulation anyway?
https://www.youtube.com/watch?v=C4295RCp0GQ
>Or does this only run in simulation anyway?
If they are an LLM user, then it is 100% an April fools joke. =3