BFG GeForce 8800 GTS

Author: Geordan Hankinson, Tom K
Editor: Howard Ha
Publish Date: Wednesday, November 8th, 2006
Originally Published on Neoseeker (http://www.neoseeker.com)
Article Link: http://www.neoseeker.com/Articles/Hardware/s/bfg8800gts/
Copyright Neo Era Media, Inc. - please do not redistribute or use for commercial purposes.

Update: Power consumption numbers have now been made available on the last page!

Introduction

It is an odd situation indeed when technology is rendered entirely obsolete within eight months of release, but it is a pattern that NVIDIA are clearly promoting. The release of the 7800 GTX in June of 2005 was quite a groundbreaking leap in performance. By March of this year, however, a marginally faster 7900 GT could be had for nearly half the price at which the 7800 GTX originally launched. Arguably more than a simple refresh product, the 7900 GTX brought even greater levels of performance to the high-end segment, only to be overshadowed three months later by the 7950 GX2 (as far as single cards go).

One would assume that this rapid product release pace would eventually have to stop somewhere. Soon after the 7900 series release, however, the rumors began flying faster than ever that NVIDIA's first DirectX 10 product was expected to ship within the year! Along with these tidbits came rumblings of massive power requirements and separate power supplies designed specifically to handle the new DirectX 10 GPUs from NVIDIA as well as from ATI. These rumors reaffirmed that NVIDIA was not simply making a DirectX 10 compatible 7900, but was instead pushing for an entirely new, presumably immensely powerful, GPU architecture.

This brings us to the past month, in which most, if not all, of the details surrounding today's release hit the internet in one form or another. These details only solidified everyone's ideas of how new and improved this card was going to be. NVIDIA have spent four years developing the G80, which is being brought to retail in two forms: the GeForce 8800 GTX and the GeForce 8800 GTS. We will be taking a look at the all-new architecture to see what's changed, as well as looking at our card specifically, the GeForce 8800 GTS from BFG.



Before we dive into the details, however, here is a chart highlighting the specifications of the two new cards. There are likely a couple of perplexing numbers there, and we will be covering those over the next few pages, so keep reading!

Architecture

Though we received an 8800 GTS for today's review, we will be looking at the 8800 GTX's architecture as our primary example. We will talk more about the differences between the two GPUs later in the article.

We'll start with the most notable aspect of the new architecture - the unified shader pipeline. While news of a unified design from NVIDIA came as a surprise when it hit the rumor channels a number of weeks ago, the decision follows from the entire ethos behind DirectX 10. What's important to note about DirectX 10 is that, aside from the new geometry shader stage, which we will likely see implemented in future DX10-based games, Microsoft has not added anything drastically different from what DirectX 9 already offers. The central theme of DirectX 10 is optimization, and this extends from reduced CPU usage to the new push for unified architectures.

Instead of dividing the work between separate vertex and pixel shaders as in the past, NVIDIA has unified the entire shader pipeline. The result is what they are calling their GigaThread technology. It centers on a completely different approach to GPU building, and this NVIDIA-supplied graphic does a good job of contrasting the two concepts.

The Old Way: Vertex and Pixel Shaders

The diagram below shows the classic GPU architecture to which we have all grown quite accustomed. This design does not maximize efficiency: at any given moment, some of the vertex shaders may be sitting unused while all of the pixel shaders are under maximum load, or vice versa. This effectively leaves pipes idle, waiting for the other units to catch up before they receive more instructions.



The New Way: Unified Architecture!

NVIDIA's approach to unified architecture, as detailed in the diagram below, was to get rid of the vertex and pixel shader pipelines as we know them and replace them with completely decoupled "stream processors", as they are being dubbed. In the case of the 8800 GTX, the core clock speed is 575 MHz (500 MHz on the GTS). In a standard GPU this would imply that the vertex and pixel shading units also run at this speed. In the 8800 series architecture, however, these units (now bundled together as stream processors) run at a completely separate clock speed, which in the 8800 GTX's case is 1350 MHz (1200 MHz on the GTS).



If the diagram above does not make any sense, keep reading! The concept of completely decoupled pipelines within the GPU is an odd thing to grasp, but it is facilitated by a central dispatch processor (or arbiter, in ATI/Microsoft speak) that keeps the stream processors consistently utilised. The dispatch processor essentially sends the data it receives through the stream processors, which loop over that data multiple times until all the necessary operations are complete, before the result is output to the Raster Operations Pipeline (ROPs) and then to memory.
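To make the utilisation argument concrete, here is a minimal Python sketch that compares a fixed vertex/pixel split with a unified pool fed by an idealised dispatcher. Everything in it is an assumption for illustration only: the unit counts (8 vertex plus 24 pixel units versus 32 unified units), the workload sizes, and the function names are hypothetical and do not come from NVIDIA. The model also ignores scheduling overhead and the dependency between vertex and pixel stages, so it shows the idea rather than real hardware behaviour.

```python
def ceil_div(work: int, units: int) -> int:
    """Cycles needed to finish `work` unit-cycles on `units` parallel units."""
    return -(-work // units)


def fixed_cycles(vertex_work: int, pixel_work: int,
                 vertex_units: int, pixel_units: int) -> int:
    """Dedicated design: each pool only handles its own kind of work,
    so the frame takes as long as the slower pool."""
    return max(ceil_div(vertex_work, vertex_units),
               ceil_div(pixel_work, pixel_units))


def unified_cycles(vertex_work: int, pixel_work: int, units: int) -> int:
    """Unified design: an idealised dispatcher hands any work to any unit."""
    return ceil_div(vertex_work + pixel_work, units)


def utilisation(total_work: int, units: int, cycles: int) -> float:
    """Fraction of the available unit-cycles that did useful work."""
    return total_work / (units * cycles)


# Hypothetical chip: 8 vertex + 24 pixel units versus 32 unified units.
VERTEX_UNITS, PIXEL_UNITS = 8, 24
TOTAL_UNITS = VERTEX_UNITS + PIXEL_UNITS

# Hypothetical workloads, given as (vertex, pixel) work in unit-cycles.
workloads = {"pixel-heavy": (40, 960), "vertex-heavy": (600, 400)}

for name, (v, p) in workloads.items():
    fixed = fixed_cycles(v, p, VERTEX_UNITS, PIXEL_UNITS)
    unified = unified_cycles(v, p, TOTAL_UNITS)
    print(f"{name}: fixed {fixed} cycles "
          f"({utilisation(v + p, TOTAL_UNITS, fixed):.0%} busy), "
          f"unified {unified} cycles "
          f"({utilisation(v + p, TOTAL_UNITS, unified):.0%} busy)")
```

With these made-up numbers, the vertex-heavy workload takes 75 cycles on the fixed split because the 24 pixel units finish early and sit idle, while the unified pool finishes the same work in 32 cycles. Real gains depend on how lopsided actual game workloads are and on how much the dispatcher itself costs.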

The decoupling motif extends even further, to the decoupling of the shader pipelines (stream processors) from the texture units. In the past, shader pipes would often be held up while the texture units fetched and filtered, and thus a bottleneck would arise. Because these have been separated on 8800 series cards, the stream processors can perform other calculations while the texture units (which run at only 575 MHz) work through longer operations. The figure below shows an illustration of what this might look like in some instances.



All these design decisions come together to create the following diagram. You can see all 128 stream processors (96 in the case of the GTS) in their arrangements here.



You can see here the path that the render data takes as it enters the GPU and is processed through the new shader structure. A vertex is effectively run through multiple wash cycles as it moves through the dispatcher, through a stream processor, back through dispatch (depending on the nature of the data) and so on, before being output to the ROP.

Some have wondered about the effectiveness of these seemingly general-purpose stream processors in relation to their 'dedicated' vertex and pixel shader predecessors. Comparing pure shader versus shader performance, the stream processors in 8800 series cards should theoretically be able to do either operation just as fast as a dedicated unit. The real potential performance holdup would be the scheduling overhead introduced by having to dispatch multiple threads to different sub-processors. Fortunately, any inefficiencies in NVIDIA's Thread Processor design should be offset by the fact that the 8800 GTX has 128 pipelines available to perform either operation at any time, removing a major performance bottleneck. One final note about the unified design is that its performance benefits extend to current DirectX 9 games as well as future DirectX 10 games, which should mean tangible performance gains while we wait for DirectX 10 titles to hit with Vista next year.
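For a sense of scale on raw shader resources, the following Python sketch works through the clock and stream processor figures quoted in this article. It computes the shader-to-core clock ratio for each card and a crude "stream-processor cycles per second" figure (shader clock times unit count). The `Card` class and both function names are our own, and the cycle count is only a simplification: it says nothing about how much work a stream processor completes per clock.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    name: str
    core_mhz: int            # core clock quoted in the article
    shader_mhz: int          # stream processor clock quoted in the article
    stream_processors: int   # stream processor count quoted in the article


def shader_ratio(card: Card) -> float:
    """How many times faster the stream processors tick than the core."""
    return card.shader_mhz / card.core_mhz


def peak_sp_cycles(card: Card) -> int:
    """Stream-processor cycles per second, in millions (MHz x unit count)."""
    return card.shader_mhz * card.stream_processors


GTX = Card("GeForce 8800 GTX", core_mhz=575, shader_mhz=1350, stream_processors=128)
GTS = Card("GeForce 8800 GTS", core_mhz=500, shader_mhz=1200, stream_processors=96)

for card in (GTX, GTS):
    print(f"{card.name}: shader clock is {shader_ratio(card):.2f}x core, "
          f"{peak_sp_cycles(card):,} million SP cycles/s")

print(f"GTS relative to GTX: {peak_sp_cycles(GTS) / peak_sp_cycles(GTX):.0%}")
```

By this rough measure the GTS has two thirds of the GTX's raw stream-processor cycles (115,200 million per second against 172,800 million). Memory bandwidth, texture units and ROPs also differ between the cards, so this is not a performance prediction.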

The final point worth mentioning here is NVIDIA's new marketing speak for their physics processing. This will be implemented in some DirectX 10 games and will allow physics processing to be done directly on the GPU through the stream processors. This is obviously meant to encourage the purchase of two boards in SLI (or three, as may be the case with the GTX) to maximise graphics and physics performance.

Memory Arrangement

When details initially emerged on the G80 last month, there was much surprised discussion over the memory configuration and the wider bus. NVIDIA has spent virtually no time discussing this seemingly major enhancement (this is the first external memory bus on a GPU wider than 256 bits), and really, there isn't a whole lot to discuss. We would presume that the memory subsystem functions similarly to the memory bus on 7 series cards and has simply been expanded to accommodate more memory. NVIDIA have, however, mentioned future support for GDDR4, though GDDR3 is the memory used on current 8800 cards. As seen on the previous page, the 8800 GTX has a 384-bit wide memory bus and 768 MB of GDDR3 memory, while the GTS features a 320-bit memory bus and 640 MB of total memory. The memory clock speed on the GTX is 900 MHz, while the GTS runs slightly slower at 800 MHz.
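The bus widths and memory clocks above translate into peak bandwidth with simple arithmetic, sketched below in Python. Two things in the sketch are our assumptions rather than statements from this article: that GDDR3 moves two transfers per memory clock (it is a double data rate memory), and that the bus is built from 64-bit channels, which would explain the unusual 384-bit/768 MB and 320-bit/640 MB pairings. The function names are made up for illustration.

```python
def peak_bandwidth_gbs(bus_bits: int, mem_clock_mhz: int,
                       transfers_per_clock: int = 2) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    transfers_per_clock=2 assumes double data rate signalling.
    """
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * mem_clock_mhz * 1e6 * transfers_per_clock / 1e9


def memory_per_channel_mb(bus_bits: int, total_mb: int,
                          channel_bits: int = 64) -> int:
    """Memory behind each channel, assuming 64-bit channels (our assumption)."""
    channels = bus_bits // channel_bits
    return total_mb // channels


# (bus width in bits, memory clock in MHz, total memory in MB) from the article.
cards = {"8800 GTX": (384, 900, 768), "8800 GTS": (320, 800, 640)}

for name, (bus_bits, clock_mhz, total_mb) in cards.items():
    print(f"{name}: {peak_bandwidth_gbs(bus_bits, clock_mhz):.1f} GB/s peak, "
          f"{bus_bits // 64} channels x "
          f"{memory_per_channel_mb(bus_bits, total_mb)} MB")
```

Under these assumptions the GTX works out to 86.4 GB/s and the GTS to 64.0 GB/s, with 128 MB behind each 64-bit channel on both cards (six channels on the GTX, five on the GTS).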

Keep reading for a look at the card and an overview of the new image quality enhancements made!

The Board

Again, for today's article we received an 8800 GTS for review. I'm told that we were originally scheduled to get both a GTS and a GTX, but the GTX has not arrived as of the morning of this article. Check back, because we will be receiving the higher-specced card this week for further testing.

The GeForce 8800 series features an entirely new cooler that takes up as much of the available PCB as possible. The coolers on the GTS and GTX are identical; each is made of aluminum fins and features a single heatpipe. While not as outlandish as the coolers that first showed up on 7800 GTX 512 MB and later 7900 GTX cards, the coolers on 8800 boards are very large. Thankfully, the new cooler is at least as quiet as the one on 7900 GTX cards and does not cause much aural disturbance.






It's nice to see a black PCB here and it seems that most NVIDIA partners are going down this route as well. With many motherboards now being produced in black, the matching colors should be appreciated by those running windows in their side panels.

BFG's 8800 GTS

BFG package their board with their usual bundle, which includes nothing too far out of the ordinary: the usual connectors as seen below, as well as a shirt and a pack of their Teflon mouse feet.




On to image quality!

Image Quality

NVIDIA have put lots of effort into the 8800 to try to ensure that ATI has no arguments going for it anymore. Specifically, this boils down to the ability to render full HDR and anti-aliasing at the same time. This only applies to two games currently (Oblivion and Far Cry), but it was still something NVIDIA needed to address. With the 8800 series, NVIDIA have added full 128-bit precision, which allows proper HDR to be rendered along with AA.

Along with support for simultaneous AA and HDR, the 8800 series also comes packing new AA tech. It's called Coverage Sampling Antialiasing (CSAA) and is designed to provide high levels of AA at a minimal performance loss. This is taken from NVIDIA's press package: "New CSAA modes include 8x, 16x, and 16xQ. Each of these CSAA modes enhances built-in application antialiasing modes with much higher quality antialiasing. The fourth new mode, 8xQ, is standard 8x multisampling."

Here's the standard image they package with the card to show the difference:



Also new in the image quality category is NVIDIA's promise that their AF algorithm in the G80 does not take shortcuts. They are claiming that there are no optimizations being made and we're hoping that this turns out to be true in the course of things.




Test Setup

Our testing setup consisted of the following:

Intel Core 2 Duo E6400 @ 2.66 GHz (266 * 10)
Corsair PC2-8500 2 GB @ DDR2-800, 5-5-5-15/1T
ASUS P5N32-SLI Premium (nForce 590 SLI Intel Edition)

Cards tested were:

BFG GeForce 8800 GTS (duh)
BFG GeForce 7900 GTX OC
BFG GeForce 7950 GX2
ATI Radeon X1950 XTX
ATI Radeon X1900 XT

Drivers used:

NVIDIA Chipset 9.53 (
NVIDIA ForceWare 96.94
ATI CATALYST 6.10

3DMark 06







Call of Duty 2




The performance in these two tests is nothing outstanding: while the GTS manages to hold its own against all but the GX2 in 3DMark, Call of Duty 2 is a complete disaster. We're wondering whether this comes down to drivers that still need work, and we're hoping to see performance in this game go up.

Company of Heroes


Far Cry


Company of Heroes presented a performance anomaly with the 7950 GX2, which we put down to an SLI miscommunication between the game and the driver. The 8800 GTS has a very large lead here, and it's hard to say how the GX2 would compare if that card's performance were being represented properly.

Far Cry is a tight race with the GTS scraping just underneath the GX2.

F.E.A.R.



Prey


In both cases here, the GTS stacks in neatly underneath the GX2 and above the 7900 GTX and X1950.

Quake 4


Splinter Cell 3


X3


Quake 4 scales quite well across the board; however, the situation gets somewhat confused with Splinter Cell and X3. In Splinter Cell with filtering on, the X1950 XTX beats the 8800 GTS, and in X3 the 8800 GTS is beaten by all but the 7900 GTX.

Power Consumption


The 8800 GTS has the highest idling power consumption out of all the competitors here -- it draws a full 27W more than the X1900 XT. On the other hand, it has the second-lowest power consumption when it's actually doing something useful. Props to NVIDIA for packing 2.5x as many transistors into this GPU, but still keeping power consumption while gaming at very reasonable levels!

Conclusion

Naturally, it's quite difficult to judge a new video card technology in its entirety based on a card that isn't the best available in the series; we will be looking at the bigger picture when we receive our 8800 GTX. What this review does give us is a look into the underpinnings of the 8000 series, which will doubtless expand into the other price segments, presumably next year. NVIDIA have put together an extremely competitive package, and the value to be had in future-proofing through DX10 support, as well as in current DX9 performance, is really high.

The 8800 GTS is a very quick card and, for the money (expected to be around or over $450), does offer a lot of features. As you've seen in the charts, performance is quite good even against the best of class from ATI. You also saw that its performance in current games doesn't necessarily outgun the current heavyweights. It's worth noting, however, that at the intended price it's hard to imagine a person choosing an X1950 XTX over the 8800 GTS, given the performance and future-proofing offered by the GTS. This card is sure to shine in future DirectX 10 games, and anyone in the market at this price range would be silly not to consider it. Our verdict on 8000 series performance, however, will not come until we've had a chance to look at the 8800 GTX in the near future.

Retailers are expected to stock both cards today and despite rumors of a delay, there should be plenty of cards available on e-tail shelves.


Copyright Neo Era Media, Inc., 1999-2014.
All Rights Reserved.
