Author: Chris Ledenican
Editor: Howard Ha
Publish Date: Thursday, December 22nd, 2011
Originally Published on Neoseeker (http://www.neoseeker.com)
Article Link: http://www.neoseeker.com/Articles/Hardware/s/AMD_HD_7970/
Copyright Neo Era Media, Inc. - please do not redistribute or use for commercial purposes.
Last year AMD released the Northern Islands architecture starting with the Barts graphics processors, which was then followed by the Cayman architecture. The Northern Islands graphics cards have since become an extremely successful series, but the product range as we know it was an adaptation on AMD's part to compensate for TSMC cancelling its 32nm process in favor of 28nm. This forced AMD to release a second generation of 40nm parts. All those issues are behind AMD now, as its "Southern Islands" architecture is finally among us, bringing with it the first 28nm GPU, codenamed “Tahiti”.
The first AMD card to utilize this new architecture is the high-end Radeon HD 7970. This new card uses the “Tahiti XT” GPU, which aside from being built on a 28nm process also uses the brand new AMD “Graphics Core Next” architecture. Graphics Core Next (GCN) is a revolutionary new architecture that replaces the previous VLIW design with a non-VLIW SIMD engine. Additionally, the “Tahiti XT” core includes 2048 stream processors, 32 raster units, 128 texture units and has a transistor count of 4.31 billion. Just looking at the specifications alone, one can see that the HD 7970 is going to have no issues pushing pixels, but with the GCN architecture it also has up to 1.3 times the compute power of previous generation cards.
Along with the changes to the architecture, AMD has expanded the video output features of the graphics card and kicked up the thermal performance. On the video front, AMD has expanded upon the Eyefinity ecosystem by including features such as stereoscopic 3D support across three displays and support for up to five independent audio streams via the HDMI and DisplayPort outputs. Moving on over to the new thermal solution, AMD has kicked the heatsink up a notch by including their sixth generation vapor chamber design and an improved blower style fan that not only pushes more CFM than the previous generation design, but also features better acoustic levels.
The last few paragraphs give a small overview of what the HD 7970 brings to the table, so throughout our review we intend to cover as much of the architecture and features as we can. It is important to note that the HD 7970 is not yet available commercially at time of writing. While the official launch date for the HD 7970 is slated for January 9th 2012, AMD bumped up the review date to December 22nd 2011 so as of now this is a paper launch. The hard launch remains set for January 9th.
Graphics Core Next is a scalable architecture designed to be optimized for both graphics horsepower and compute power. The GPU is composed of 32 Compute Units, dual Geometry engines, 8 Render Back-Ends, 768KB of read/write L2 cache and a 384-bit memory bus, all running on a PCI Express 3.0 x16 bus interface and packing in a total of 4.31 billion transistors. Like the HD 6900 series, the GPU has dual Geometry engines that each have their own Rasterizer and Render Back-End units, but share the 32 Compute Units. In essence you can think of these Geometry engines as dual cores inside the GPU.
As mentioned earlier, the Southern Islands GPU is designed to work as both a graphics and compute engine. This is due to the ACE engines, which effectively present three devices in one GPU that are completely asynchronous from one another: the primary graphics pipeline and two parallel compute pipelines. Each compute pipeline runs independently of, and asynchronously to, the primary graphics pipeline. This allows the GPU to process an intensive compute workload while the graphics pipeline is running an intensive 3D application, and maximizes the utilization of the available 3.79 TFLOPS of compute power. Paired with this are the processing core and dual DMA engines, which allow the HD 7970 to saturate the Gen 3.0 PCI-Express interface in both directions while offering close to a TeraFLOP of double precision compute.
On top of this, all of the GDDR5 memory is protected by single error correction and double error detection when used in a compute environment. All the internal SRAM has the same capabilities, and also has ECC data protection.
While many are wondering exactly how the efficiency of GCN compares to VLIW4, AMD has stated we should see more performance per square millimeter with the GCN architecture than what was available in previous generations. According to Eric Demers from AMD, the theoretical peak improvement over the previous generation architecture is up to 7 to 7.5 times (compared to the Radeon HD 6970). The Tahiti architecture also includes an improved Gen 9 tessellator unit that increases the tessellation throughput over the previous generation via more efficient vertex re-use, larger parameter caches and improved off-chip buffering. All of this gives the GCN architecture up to 4 times the tessellation throughput in comparison to the Radeon HD 6900 series.
The basic compute unit includes all the instruction, wavefront and scheduling logic; essentially this unit can be thought of as its own core. Each compute unit has four sub-units that each run a 64-wide wavefront across their vector lanes over four cycles, and the four sub-units are completely independent of each other. This is a huge departure from the VLIW engine, which had 5 or 4 math units that executed the individual instructions in parallel. Since the new GCN architecture runs fully scalar, it eliminates the compilation issues of the VLIW design.
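The compute unit layout described above can be cross-checked against the 2048 stream processors quoted earlier. This is only a rough sketch: it assumes a wavefront holds 64 work-items and that one stream processor corresponds to one vector lane, and the variable names are ours.

```python
# Cross-check: compute unit layout versus the quoted stream processor count.
# Assumptions (not stated outright in the article): a wavefront is 64
# work-items wide, and one "stream processor" equals one vector lane.

COMPUTE_UNITS = 32          # from the article
SIMDS_PER_CU = 4            # the "four sub-units" in each compute unit
WAVEFRONT_SIZE = 64         # work-items per wavefront (assumed)
CYCLES_PER_WAVEFRONT = 4    # a wavefront is issued over four cycles

# A 64-wide wavefront spread over four cycles implies 16 lanes per sub-unit.
lanes_per_simd = WAVEFRONT_SIZE // CYCLES_PER_WAVEFRONT

# Total lanes across the whole GPU.
stream_processors = COMPUTE_UNITS * SIMDS_PER_CU * lanes_per_simd

print(f"Lanes per sub-unit: {lanes_per_simd}")
print(f"Total stream processors: {stream_processors}")
```

Under those assumptions the layout accounts for every one of the 2048 stream processors in the specification.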
In addition, the scalar unit runs in parallel to the vector units and can issue instructions of its own, allowing complex operations to be moved to the scalar unit for improved efficiency. The compute unit also includes a 16KB L1 cache that has both read and write functions, allowing the textures to run through the cache as opposed to entering the cache to be processed and then passed on to the back-end before being exported back into the cache. This function can now be handled exclusively through the L1 cache via the read/write functions.
Moving over to this form of computing doesn't necessarily translate into improved graphics performance, but since the VLIW compiling was not necessarily efficient, it will improve the total compute power, i.e. parallel processing of the GPU. The compute unit on the other hand has four independent wavefronts running in parallel as well as a scalar programming model at the lane level to ensure that all instructions running through the GPU automatically work. This eliminates all port conflicts, simplifying the compiler and instructions, thus improving the compute performance dramatically.
When it comes to the memory, each L1 cache has 64 bytes of bandwidth per clock, and the HD 7970 has a total of 32 of them, giving it nearly 2TB/s of L1 bandwidth. As mentioned before, each L1 cache has both read and write functions, something new to the Southern Islands architecture. Each L1 still has an L2 cache to fall back on, and the L2 now includes both read and write functions as well. In total the HD 7970 comes packed with twelve L2 caches that feed back into the L1 caches, each also at 64 bytes per clock. This gives the HD 7970 around 710GB/s of L2 cache bandwidth for both reads and writes at the default GPU clock speed of 925MHz.
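The cache figures above follow directly from the per-clock widths and the 925MHz core clock. Here is a minimal sketch of the arithmetic, assuming decimal units (1 GB/s = 10^9 bytes per second); the function name is ours, purely for illustration.

```python
# Aggregate cache bandwidth from the per-clock figures quoted in the article.
# Assumption: decimal units (1 GB/s = 1e9 bytes/s). Names are illustrative.

CORE_CLOCK_HZ = 925e6    # default GPU clock
BYTES_PER_CLOCK = 64     # per L1 cache and per L2 cache


def cache_bandwidth_gbs(num_caches, bytes_per_clock=BYTES_PER_CLOCK,
                        clock_hz=CORE_CLOCK_HZ):
    """Return the aggregate bandwidth in GB/s for a group of identical caches."""
    return num_caches * bytes_per_clock * clock_hz / 1e9


l1_total = cache_bandwidth_gbs(32)   # 32 L1 caches
l2_total = cache_bandwidth_gbs(12)   # twelve L2 caches

print(f"L1 aggregate: {l1_total:.1f} GB/s")
print(f"L2 aggregate: {l2_total:.1f} GB/s")
```

The L1 total works out to roughly 1.9TB/s, which is where the "2TB/s" figure comes from, and the L2 total lands on the quoted 710GB/s.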
The GPU also includes a 16KB instruction cache and a 32KB scalar data cache that are shared per four compute units, which as mentioned earlier are also backed by the L2 cache. Additionally, each compute unit has its own registers and local data share. There is also a global data share unit that works as a managed buffer on the chip to allow sharing between any wavefronts on the chip.
The Southern Islands architecture also includes a texture mapping technique called Partially Resident Textures, or PRT. Essentially, what this feature does is take advantage of all the memory hardware available and turn the local frame buffer into a local texture cache. So, what does this mean? First, the local graphics memory can behave like a hardware-managed cache where texture data can be streamed in on demand. This prevents stuttering as the pages are brought in, and lets texture streaming be handled more efficiently.
PRT also improves the memory efficiency and image quality with very large, detailed textures. This allows for texture sizes up to 32 TB (16k x 16k x 8k x 128-bit). This is done by splitting the textures into 64KB chunks and dynamically selecting which chunks need to be loaded into memory. So, essentially the textures that are not going to be displayed are not loaded. Every request is also translated through the page tables. This allows the data to be rendered if it is available, and if not the application can manage the textures. For example, it can dynamically opt to use lower resolution bitmaps, which will make the textures slightly blurry, but there will be no lag time in these situations.
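The 32 TB ceiling can be reproduced from the quoted dimensions. The sketch below assumes "k" means 1024 and that TB is meant in the binary sense; the resident-tile figure is only an upper bound that ignores everything else competing for the 3GB frame buffer.

```python
# Maximum PRT texture size, worked from the dimensions quoted in the article.
# Assumptions: "k" = 1024, and TB/GB/KB are binary units (2**40, 2**30, 2**10).

WIDTH = 16 * 1024
HEIGHT = 16 * 1024
DEPTH = 8 * 1024
BYTES_PER_TEXEL = 128 // 8        # 128-bit texels
TILE_BYTES = 64 * 1024            # PRT works in 64KB chunks
FRAME_BUFFER_BYTES = 3 * 2**30    # 3GB of GDDR5 on the HD 7970

total_bytes = WIDTH * HEIGHT * DEPTH * BYTES_PER_TEXEL
total_tib = total_bytes // 2**40

# Number of 64KB tiles in the full texture versus how many could
# ever be resident in local memory at once (an optimistic upper bound).
tile_count = total_bytes // TILE_BYTES
max_resident_tiles = FRAME_BUFFER_BYTES // TILE_BYTES

print(f"Maximum texture size: {total_tib} TB")
print(f"Tiles in the full texture: {tile_count:,}")
print(f"Tiles that fit in 3GB: {max_resident_tiles:,}")
```

Only a tiny fraction of such a texture could ever sit in local memory, which is why loading chunks on demand matters.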
Like the HD 6900 series, the Southern Islands graphics cards come equipped with AMD PowerTune technology. In essence, PowerTune is a means to enforce a predefined TDP by adjusting the clock speeds in real time. The way in which PowerTune is implemented is very different from the on-board regulation chips used on NVIDIA’s GTX 500 series. NVIDIA’s power management system monitors the power coming from the rails, while AMD’s technology instead relies on performance counters that are embedded throughout the GPU. These performance counters feed an internal algorithm that dynamically calculates how much power is being used, and adjusts the clocks accordingly. This allows PowerTune to maintain the power draw at the predefined level, effectively preventing huge surges in power. Since games operate at a lower peak power rate than benchmarking applications such as Kombustor, in-game performance will not be negatively affected.
AMD has also introduced a new feature called "Zero Core Power" which minimizes the idle power consumption of the board. This is a potentially exciting new addition to the AMD graphics card series, and one we will touch on in the next few pages.
Along with the changes to the architecture, AMD is also introducing Eyefinity 2.0. For the most part, the changes are being made at the driver level, but there is one new feature being added to the Southern Islands graphics cards like the HD 7970.
This new feature gives the HD 7970 the ability to simultaneously output multiple, independent audio streams. Essentially this means each video source that has the ability to support audio will have its own dedicated audio signal. This allows a single HD 7970 to connect to multiple displays, with each having its own audio signal. In total the graphics card can support up to five audio signals. You can be fragging people on one monitor while watching your favorite show on a separate TV that is also connected to the Southern Islands graphics card. The technology also follows the video so if the source changes, the audio seamlessly switches to the other device as well. This is actually an interesting feature that really pushes the expansion options of the Radeon series forward.
The next feature is one that we have been hoping would come along for some time now. This is of course the merging of Eyefinity and HD3D. Unlike the previous feature, having a Southern Islands graphics card is not required to run Eyefinity in 3D. Instead this is a simple driver update that will enable the feature for all graphics cards that already support both technologies. Stereoscopic 3D technology from both AMD and NVIDIA is still niche at best, but we are glad AMD is moving forward and taking Eyefinity to its next logical step.
Another driver fix is the addition of flexible bezel compensation. The image below should give you a good idea of what this is all about. Essentially, it allows anyone to pair three non-identical monitors together and not have to worry about the images not lining up. Instead, the user can adjust the display to have the on screen image align perfectly across the displays.
AMD has also added a task bar positioning feature. Anyone that uses Eyefinity knows that previous drivers pushed the main desktop display to the leftmost screen. With this new positioning feature, the user can now pick which display the task bar is set to. Again, this is a huge improvement over the previous generation Eyefinity, and will make using this technology more seamless and convenient as the main desktop can now be manually configured to fit the individual needs of any Eyefinity user.
Eyefinity 2.0 also includes a custom resolution feature that allows the display resolution to be manually set to best fit the users' needs. While most gamers will be happy simply setting the resolution to 5760x1080, there are a handful of people that prefer greater customization. The new Eyefinity 2.0 also adds support for 5x1 Landscape with 1920x1200 and 2560x1600 monitors. This means Eyefinity is no longer limited to monitors at or below 1080p, increasing the available display real estate even further.
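To put the display configurations above in perspective, the combined desktop size is simply the panel resolution multiplied across the grid. This is a small illustrative sketch; the helper name is ours, and it ignores bezel compensation, which would add extra hidden pixels between panels.

```python
# Combined desktop resolution for a grid of identical panels.
# Illustrative helper only; bezel compensation is deliberately ignored.


def eyefinity_resolution(columns, rows, panel_width, panel_height):
    """Return (width, height) of the combined desktop surface."""
    return columns * panel_width, rows * panel_height


triple_1080p = eyefinity_resolution(3, 1, 1920, 1080)
five_wide_1200p = eyefinity_resolution(5, 1, 1920, 1200)
five_wide_1600p = eyefinity_resolution(5, 1, 2560, 1600)

for name, (w, h) in [("3x1 1080p", triple_1080p),
                     ("5x1 1920x1200", five_wide_1200p),
                     ("5x1 2560x1600", five_wide_1600p)]:
    print(f"{name}: {w}x{h} ({w * h / 1e6:.1f} megapixels)")
```

A 5x1 wall of 2560x1600 panels pushes more than three times the pixels of the common 5760x1080 setup.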
The "Tahiti XT" GPU is designed to fit into the enthusiast market and sits above all other single GPU graphics cards currently available. AMD is also going to release the "Tahiti Pro" graphics processor at a later date, but for the time being it is still being kept behind closed doors. Later down the line, AMD is also going to be releasing "Pitcairn" (to replace the HD 6800 series) and "Cape Verde", which is the Southern Islands budget GPU.
Aesthetically, the HD 7970 is best described as a hybrid between the HD 5000 and HD 6000 series graphics cards. As you can see, the HD 7970 still sports the classic AMD red and black color scheme and uses a rear mounted blower style fan, but the shroud has a more rounded design. The new visual style is actually quite appealing in comparison to the VHS design of the HD 6000 series. Along with the improved look, the rounded design of the back-end of the shroud improves ventilation when the graphics card is used in CrossFireX. As far as the dimensions go, the HD 7970 is roughly the same size as the HD 6970, so even with the reduced die size the PCB is still 10.5-inches in length.
The "Tahiti XT" core itself is built on a 28nm process with a die size of 365mm² and has roughly 4.31 billion transistors. With the smaller node, AMD was able to turn up the engine clock to 925MHz, while still including 2048 stream processors, 32 ROPs and 128 texture units, giving the card a total compute power of 3.79 TFLOPs peak single precision FP and 947 GFLOPs of double precision. Additionally, the HD 7970 comes with a massive 3GB GDDR5 frame buffer clocking in at 1375MHz (5.5Gbps QDR) and running on a 384-bit memory bus. This gives the HD 7970 substantial memory bandwidth, but since most games are not memory limited, the biggest impact on performance should be seen when using the graphics card's Eyefinity or HD3D technologies.
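The throughput numbers above can be derived from the clock speeds and unit counts. The sketch below rests on two assumptions that the article does not spell out: each stream processor retires one fused multiply-add (counted as two floating point operations) per clock, and double precision runs at a quarter of the single precision rate.

```python
# Peak compute and memory bandwidth from the quoted specifications.
# Assumptions: 2 FLOPs per stream processor per clock (one fused
# multiply-add) and a 1/4 double precision rate. Names are illustrative.

STREAM_PROCESSORS = 2048
CORE_CLOCK_GHZ = 0.925
FLOPS_PER_SP_PER_CLOCK = 2
DP_RATE = 1 / 4

single_tflops = STREAM_PROCESSORS * FLOPS_PER_SP_PER_CLOCK * CORE_CLOCK_GHZ / 1000
double_gflops = single_tflops * 1000 * DP_RATE

BUS_WIDTH_BITS = 384
DATA_RATE_GBPS = 5.5     # effective rate per pin (1375MHz, quad data rate)

# Bits per second across the bus, divided by 8 to get bytes.
mem_bandwidth_gbs = BUS_WIDTH_BITS * DATA_RATE_GBPS / 8

print(f"Single precision: {single_tflops:.2f} TFLOPs")
print(f"Double precision: {double_gflops:.0f} GFLOPs")
print(f"Memory bandwidth: {mem_bandwidth_gbs:.0f} GB/s")
```

Under those assumptions the results match the quoted 3.79 TFLOPs and 947 GFLOPs, and the 384-bit bus at 5.5Gbps works out to 264GB/s of memory bandwidth.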
The back of the PCB is clean for a high-end graphics card, but all the solder points give away some of the card's secrets. Looking at the back of the board, we can see twelve soldering points for the memory that surround the GPU in a C-shaped pattern. Interestingly enough, the board includes solder points for two full 8-pin power connectors. We aren't entirely sure what AMD was working on there, as that would have given the board a total power rating of 375W, which is far higher than any single GPU graphics card currently on the market.
The back also includes two CrossFireX connectors and a PCIe Generation 3.0 x16 interface. By using the PCIe 3.0 interface, the board has double the maximum data rate of Gen 2.0, giving the card up to 32 GB/s of bi-directional bandwidth on an x16 connector. It is going to be hard for a single graphics card to saturate the PCIe Gen 3 interface with so much bandwidth, so the benefit will most likely only be noticeable when scaling multiple graphics cards together in a CrossFireX configuration.
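For reference, the PCIe numbers can be worked out from the per-lane signalling rates. The rates and line encodings below come from the PCI Express specifications rather than from AMD's material (5 GT/s with 8b/10b encoding for Gen 2.0, 8 GT/s with 128b/130b for Gen 3.0); the function name is ours.

```python
# Usable PCIe bandwidth per direction for an x16 link.
# Signalling rates and encodings are taken from the PCI Express
# specifications; the helper itself is only an illustrative sketch.


def pcie_bandwidth_gbs(transfer_rate_gt, payload_bits, encoded_bits, lanes=16):
    """Return usable bandwidth in GB/s for one direction of the link."""
    per_lane = transfer_rate_gt * payload_bits / encoded_bits / 8
    return per_lane * lanes


gen2_one_way = pcie_bandwidth_gbs(5.0, 8, 10)      # 8b/10b encoding
gen3_one_way = pcie_bandwidth_gbs(8.0, 128, 130)   # 128b/130b encoding

print(f"Gen 2.0 x16: {gen2_one_way:.1f} GB/s each way, "
      f"{gen2_one_way * 2:.1f} GB/s bi-directional")
print(f"Gen 3.0 x16: {gen3_one_way:.1f} GB/s each way, "
      f"{gen3_one_way * 2:.1f} GB/s bi-directional")
```

The Gen 3.0 result comes to about 31.5GB/s in both directions combined, which is usually rounded up to the 32GB/s figure, and is just under double what Gen 2.0 offers.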
Like the 6900 series, the HD 7970 comes equipped with PowerTune technology. PowerTune is basically a power management system that maximizes the performance of the board via dynamic power adjustment. It does this by increasing the GPU clock speed in real time when the GPU detects power headroom, and throttling the clocks when a certain power limit is exceeded. This allows the board to adjust the clock speeds on a microsecond level. The maximum PowerTune rating for the HD 7970 is around 220W at load, but even with a power envelope lower than the HD 6970, AMD has included an 8+6 pin power configuration for up to 300W of power. This gives the board plenty of headroom which should come in handy during overclocking, and also ensures the board is fully stable during peak power consumption.
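The throttle-and-recover behaviour described above can be pictured with a toy control loop. To be clear, this is not AMD's algorithm, which has not been published in detail: the power model, the 260W full-load figure, the clock floor and the 25MHz step are all invented for illustration, and the "activity" value merely stands in for whatever the on-chip performance counters report.

```python
# Toy model of a PowerTune-style clock governor. Everything except the
# 220W limit and 925MHz default clock is a made-up illustrative value.

POWER_LIMIT_W = 220.0    # PowerTune limit quoted in the article
MAX_CLOCK_MHZ = 925.0    # default engine clock
MIN_CLOCK_MHZ = 300.0    # hypothetical floor
STEP_MHZ = 25.0          # hypothetical adjustment step


def estimate_power(clock_mhz, activity):
    """Hypothetical power estimate: linear in clock and activity (0..1)."""
    return 260.0 * activity * (clock_mhz / MAX_CLOCK_MHZ)


def next_clock(clock_mhz, activity):
    """One control step: throttle when over the limit, otherwise recover."""
    if estimate_power(clock_mhz, activity) > POWER_LIMIT_W:
        return max(MIN_CLOCK_MHZ, clock_mhz - STEP_MHZ)
    return min(MAX_CLOCK_MHZ, clock_mhz + STEP_MHZ)


def simulate(activities, start_clock=MAX_CLOCK_MHZ):
    """Run the governor over a sequence of activity samples."""
    clock = start_clock
    trace = []
    for activity in activities:
        clock = next_clock(clock, activity)
        trace.append(clock)
    return trace


game_trace = simulate([0.7] * 5)      # typical game load stays under the cap
stress_trace = simulate([1.0] * 10)   # stress test load exceeds the cap

print("Game:  ", game_trace)
print("Stress:", stress_trace)
```

In this toy model the game workload never leaves the default clock, while the stress workload is stepped down until the estimated draw hovers around the limit.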
With the 28nm node we were expecting the power consumption to be slightly lower than 220W, but even so AMD has still managed to impress us with the board's idle power features. What AMD has done is add a feature called "Zero Core Power Technology" that reduces the idle power consumption by up to 95%. This allows the graphics card to reduce the power down to around 3 Watts when the system is idle for long periods of time, whereas a standard idle period has a power consumption rating of 15W. At both idle and long idle, the board utilizes GPU clock deep sleep and DRAM stuttering features, along with compressing the contents of the frame buffer. This gives the HD 7970 roughly a 45% improvement in idle power consumption in comparison to the HD 6970.
One of the most interesting features of "Zero Core Power" is seen when using CrossFireX. Traditionally, anyone using multiple GPUs in a single system had to deal with a high power idle state, simply because each card was still actively drawing system power; each graphics card could draw 30+ watts even when the system wasn't under load. With "Zero Core Power", the extra graphics cards in a CrossFireX system are disabled, shutting down the fans and cutting off voltage to the core. Since PowerTune works on a microsecond level, "Zero Core Power" will not interfere with gaming as all the GPUs can become active again in microseconds.
The default video outputs on the HD 7970 are similar to the 6000 series. In total there are two Mini-DP connectors, a single HDMI 1.4a connector and a Dual-Link DVI connector. Noticeably missing however is the stacked DVI port that was first introduced with the 6800 series graphics cards. The reason behind removing the port is not due to a substantial change in the video configuration, or even the MST Hubs becoming available. Instead AMD explains that removing the second DVI port actually reduces the level of turbulence being created, thus decreasing the overall acoustic levels during operation. Each HD 7970 will also ship with an HDMI to DVI dongle, and mini-DP to DVI dongle that allow the card to support up to three DVI connections out of the box.
The two on-board Mini-DP ports use the 1.2 standard which allows them to support up to three monitors per port (via MST Hub) and also support AMD HD3D technology. The middle HDMI 1.4a connector also supports 3GHz speeds with frame packing. Essentially this allows the connection to run the frames faster, thus creating a smoother gaming experience. The HDMI and DP ports can also be paired together to support HD3D Surround which increases both the depth and field of view in games, making for a truly unique gaming experience.
One area of the HD 7970 to which AMD paid particular attention (due to user feedback) is the acoustic and thermal levels. The previous Northern Islands graphics cards performed well on the thermal front, as none of the GPUs (besides the HD 6990) ran excessively hot. However, the included fan and ventilation design made graphics cards such as the HD 6970 and HD 6950 louder than what was needed to properly cool the GPU. This time around, AMD has tweaked the design and is now using its sixth generation vapor chamber design, along with a redesigned blower style fan and heatsink shroud.
The overall design of the heatsink is similar to that used on the 6970, but the sixth generation vapor chamber includes a multi-step vapor chamber technology. This gives the vapor chamber two steps with three discrete levels. One of the levels comes down and sits directly on top of the GPU, while another sits over the board and the third is located toward the back-end of the PCB. Additionally, AMD uses a second generation design of its phase changing thermal paste which improves the thermal performance, and according to AMD the TIM alone can affect temperatures by as much as 2°C to 3°C.
While AMD has decided to stick with a blower style fan like the one seen in the previous generation cards, it actually uses an optimized fan blade technology. All this means is that the fan has larger and wider blades that allow it to push higher CFM with a lower decibel rating than the fans used on the HD 6900 series. The specs for the fan show it runs at 1.7A @ 12V DC and has a 2 ball bearing design, but unfortunately we don't have the exact CFM or dBA rating for the fan. Another feature we already touched on was the removal of the stacked DVI connector to both improve the exhaust rate and lower the turbulence. According to AMD, removing the stacked DVI port not only improved the acoustics, but also reduced the temperature by as much as 7°C.
Here we have the HD 7970 stripped down to the PCB level. As you can see, the board utilizes a robust power management system at the back-end that includes a 6-phase power design, all solid Japanese capacitors, and a CHL8228G voltage controller from the CHiL Semiconductor Corporation. The CHL8228G is a dual-loop digital multi-phase controller that can drive up to 8 phase units, and features Input Voltage Management to allow up to 3 input voltages to be monitored. This will ensure the card is adequately powered, and improves the overall power efficiency. Additionally, the HD 7970 has dual power connectors, which along with the PCIe slot provide the board with up to 300W of power. This leaves roughly 80W of untouched power that can be tapped into when overclocking, which gives AMD's manufacturing partners and consumers plenty of headroom to play around with.
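The power budget figures used throughout this page add up as follows. The per-connector limits (75W from the slot, 75W from a 6-pin and 150W from an 8-pin connector) are the standard PCI Express ratings rather than numbers supplied by AMD.

```python
# Board power budget from the standard PCI Express connector ratings.

PCIE_SLOT_W = 75
SIX_PIN_W = 75
EIGHT_PIN_W = 150
POWERTUNE_LIMIT_W = 220    # maximum PowerTune rating quoted in the article

# Shipping configuration: slot + one 8-pin + one 6-pin connector.
shipping_budget = PCIE_SLOT_W + EIGHT_PIN_W + SIX_PIN_W

# Power left over for overclocking above the PowerTune limit.
headroom = shipping_budget - POWERTUNE_LIMIT_W

# What the unused solder points for two full 8-pin connectors would allow.
dual_eight_pin_budget = PCIE_SLOT_W + 2 * EIGHT_PIN_W

print(f"8+6 pin budget: {shipping_budget}W")
print(f"Headroom above PowerTune limit: {headroom}W")
print(f"8+8 pin budget: {dual_eight_pin_budget}W")
```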
The HD 7970 includes the dual BIOS toggle switch that was first introduced with the Cayman architecture. Essentially this switch allows the user to toggle back and forth between a protected and unprotected BIOS. By default, the switch is set to the protected setting which runs the card at the default settings of the manufacturer. However, when the switch is set in the "unprotected" position, the BIOS can be flashed and the settings will be internally stored. This means the graphics card can boot either the default or custom BIOS depending on the position of the switch. It is best to think of this feature as insurance for the card. If you flash the BIOS and the card runs into an issue, the BIOS can be reverted to the default settings by simply toggling the switch.
When it came to overclocking we had high expectations for the HD 7970. First off, the 28nm node should allow us to push the chip farther than the Cypress GPU, and since AMD also gave the card plenty of additional power headroom, we expected to easily hit over a gigahertz in clock speed. To our surprise, we easily scaled the GPU core upward to 1125MHz, which is the limit available in the Catalyst Control Center. Getting to this speed required no additional voltage on our part, but most impressive was that we didn't observe a single crash or pixel error while scaling to 1125MHz. In total this is a 200MHz increase, or roughly a 21.6 percent overclock over the default clock speed of 925MHz, which is quite good considering how high the default clock already is.
The memory was also able to scale quite well and once again in our testing we pushed the GDDR5 memory to the limits imposed by the Catalyst Control Center. By default the memory frequency is set at 1375MHz (5.5Gbps effectively), and in our labs we were able to increase the frequency by 200MHz, giving us a final clock speed of 1575MHz which is a quad-data rate of 6.3Gbps. Again this is a decent margin of overclocking, and we are expecting the overclocked performance to dominate all the other single GPU graphics cards currently available.
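For anyone wanting to double-check the overclocking margins, here are the gains measured against the stock clocks. The helper name is ours, and the effective memory rate simply assumes the quad data rate mentioned above.

```python
# Overclocking margins relative to the stock clocks.


def percent_gain(stock, overclocked):
    """Percentage increase of the overclocked value over the stock value."""
    return (overclocked - stock) / stock * 100


core_gain = percent_gain(925, 1125)       # engine clock in MHz
memory_gain = percent_gain(1375, 1575)    # memory clock in MHz

# GDDR5 transfers four bits per pin per memory clock (quad data rate).
effective_gbps = 1575 * 4 / 1000

print(f"Core overclock: {core_gain:.1f}%")
print(f"Memory overclock: {memory_gain:.1f}%")
print(f"Effective memory rate: {effective_gbps} Gbps")
```

Measured against stock, the core gain is about 21.6 percent and the memory gain about 14.5 percent.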
||AMD Radeon HD 7970||AMD Radeon HD 5830||AMD Radeon HD 5870||AMD Radeon HD 6950||AMD Radeon HD 6970|
||3GB GDDR5||1GB GDDR5||1GB GDDR5||2GB GDDR5||2GB GDDR5|
||Nvidia GTX 460||Nvidia GTX 470||Nvidia GTX 480||Nvidia GTX 570||Nvidia GTX 580|
||1GB GDDR5||1.25GB GDDR5||1.5GB GDDR5||1.25GB GDDR5||1.5GB GDDR5|
Futuremark's latest 3DMark 2011 is designed for testing DirectX 11 hardware running on Windows 7 and Windows Vista. The benchmark includes six all new benchmark tests that make extensive use of all the new DirectX 11 features including tessellation, compute shaders and multi-threading.
In our first synthetic benchmark the HD 7970 easily surpassed all the single GPU graphics cards from our test bed, including the GTX 580 and HD 6970. Looking at the scores in terms of percentages, the HD 7970 was around 12 to 16 percent faster than the GTX 580, and 27 to 30 percent faster than the HD 6970. As you can see, the HD 7970 definitely has stronger DX11 performance than the previous generation cards, but the NVIDIA Fermi architecture stands up to the challenge as well.
With the GPU and memory overclocked, the HD 7970 picked up a good amount of steam. At 1125MHz, it got pretty darn close to the GTX 590 when running the benchmark at the highest settings.
3DMark Vantage is the stunning sequel to 3DMark 06. Futuremark's benchmarking programs have always been at the center of every bragging match. The best way to show that you've got the greatest gaming rig is to show that you've got the highest 3DMark score. Vantage does just that. It puts your system through a series of strenuous tests, and provides you with a score to brag about!
The HD 7970 basically trounced all other single GPU graphics cards in 3DMark Vantage, and actually placed ahead of both the HD 6990 and GTX 590 in certain tests. This is of course just a synthetic test, so it's not yet a clear indication of the HD 7970 packing real world performance; the card could just as easily have been optimized for this specific benchmark.
Aliens vs Predator is a DX11 Benchmark that runs through a scene straight out of the classic 80’s movie, Aliens. Since it uses DX11, it can often be more than a graphics card can handle.
Aliens vs Predator has been on the market for a while, so AMD had time to optimize their hardware for this title. This means we should see the true gaming power of the HD 7970 through this benchmark as it shouldn't be limited by any performance issues with the launch day drivers.
The results from the HD 7970 were actually quite impressive. At stock, the Tahiti XT based card was able to perform around 26 percent better than the GTX 580, and 30 percent better than the HD 6970. Once we overclocked the graphics card, the performance difference between the HD 7970 and GTX 580 grew to 37 percent.
Batman: Arkham City is the sequel to the smash hit, Batman: Arkham Asylum. The game was created with Unreal Engine 3, and includes areas with extreme tessellation, high res textures and dynamic lighting. Batman also includes native support for PhysX and is optimized for Nvidia 3DVision technology.
Batman Arkham City is another game where the HD 7970 performed around 25 percent faster than the GTX 580, and over 30 percent better when compared against the HD 6970. So, in comparison to the HD 6970, the HD 7970 is a huge leap forward, but it has a hard time pushing its lead over the GTX 580 any higher. Of course this could just be an issue with the release drivers, but at this point the HD 7970 is consistently 25 percent faster than the GTX 580.
Battlefield 3 is designed to deliver unmatched visual quality by including large scale environments, massive destruction and dynamic shadows. Additionally, BF 3 also includes character animation via ANT technology, which is also being utilized in the EA Sports franchise. All of this is definitely going to push any system to its threshold, and is the reason so many gamers around the world are currently asking if their current system is up to the task.
Unlike the other games we benchmark, the performance of Battlefield 3 is tested during online game play. We ensure our results are accurate by running through each resolution four times before averaging the results.
The performance in Battlefield 3 isn't in keeping with what we've seen from the HD 7970 thus far. The performance difference between the GTX 580 and HD 7970 in this benchmark was below 10 percent. Our best guess would be a driver issue but even if this is the case, AMD would be hard pressed to garner a much higher performance lead over the GTX 580 simply by updating the drivers.
Crysis 2 is a first-person shooter developed by Crytek and is built on the CryEngine 3 engine. While the game was lacking in graphical fidelity upon its release, Crytek has since added features such as DX11 and high quality textures. This improved the in-game visuals substantially, which in turn pushes even high-end hardware to the max.
Crysis 2 was another game where the performance difference between the HD 7970 and GTX 580 was less than 10 percent. The results in this benchmark were so close that we spent hours re-running it, but with our settings of MSAA x4 and AF x16 the results kept turning out the same. Again, this could be related to the drivers, but regardless it is not a good sign when we see a new architecture scaling so closely with a graphics card that has been out on the market for a year.
Overclocking did help a lot, but the performance difference between the GTX 580 and HD 7970 was still less than 20 percent even with the GPU clock at 1125MHz.
DiRT 3 is the third installment in the DiRT series and like its predecessor incorporates DX11 features such as tessellation, accelerated high definition ambient occlusion and full floating point high dynamic range lighting. This makes it a perfect game to test the latest DX11 hardware.
The HD 7970 ranged from 12 to 18 percent better in comparison to the GTX 580, which again was not quite what we were expecting. However, the difference between the HD 6970 and HD 7970 was over 20 percent. Again, overclocking improved the frame rates dramatically, and in this benchmark the HD 7970 was able to scale above the GTX 590 at 2560x1600.
Metro 2033 puts you right in the middle of post apocalyptic Moscow, battling mutants, rivals and radioactive fallout. The game is very graphics intensive and utilizes DX11 technology, making it a good measure of how the latest generation of graphics cards perform under the latest standard.
The results in Metro 2033 saw the HD 7970 go back up to a 30 percent performance increase, making it the only single GPU graphics card to achieve a frame rate of over 40FPS at 2560x1600. When the GPU was clocked at 1125MHz, we continued to see excellent scaling as the difference between the overclocked HD 7970 and the GTX 580 at 2560x1600 climbed to nearly 40 percent.
Total War: Shogun 2 is a game that creates a unique gameplay experience by combining both real-time and turn-based strategy. The game is set in 16th-century feudal Japan and gives the player control of a warlord battling various rival factions. Total War: Shogun 2 is the first in the series to feature DX11 technologies to enhance the look of the game, but with massive on-screen battles it can stress even the highest-end graphics cards.
In this benchmark the HD 7970 outperformed the GTX 580 by around 20 to 25 percent, while hovering around the same level with the HD 6970. Overclocking pushed the HD 7970 into overdrive, as it was able to again compete with the GTX 590 at certain resolutions.
To measure core GPU temperatures, we run three in-game benchmarks and record the idle and load temperature according to the min and max temperature readings recorded by MSI Afterburner. The games we test are Crysis 2, Lost Planet 2 and Metro 2033. We run these benchmarks for 15 minutes each. This way we can give the included thermal solution and GPU time to reach equilibrium.
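The idle and load figures described above are simply the lowest and highest readings in the logged temperature data. As a rough illustration only, the reduction could be sketched as follows; the function name and the sample readings are hypothetical and are not measurements from this review, and the sketch does not reproduce MSI Afterburner's actual log format.

```python
def idle_and_load(samples_c):
    """Return (idle, load) as the min and max of logged core temperatures in °C."""
    if not samples_c:
        raise ValueError("no temperature samples logged")
    return min(samples_c), max(samples_c)


# Hypothetical readings for illustration only, not data from this review.
readings = [40, 42, 55, 68, 70, 69]
idle, load = idle_and_load(readings)
print(f"idle {idle}°C, load {load}°C")
```

Running each benchmark for 15 minutes matters here because the maximum is only meaningful once the cooler and GPU have reached equilibrium.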
From a thermal standpoint we can see that AMD really took the Tahiti XT GPU cooling seriously. When running the HD 7970 at the default settings, the temperature never peaked above 78°C. This wasn't the case for all the other high-end graphics cards we tested, as the majority of GPUs sat in the 85°C range by comparison. Even when overclocked, the GPU core stayed slightly cooler than the GTX 580, but with the core sitting at 1125MHz the total temperatures did increase by 8°C.
As mentioned before, the new blower style fan was quieter than the fans used in previous generation cards, but it was still noticeably audible when the RPM level was above 30 percent. However, the sound coming from the fan blades is purely the sound of air being pushed through the heatsink, and not the high-pitched tone that the fans on the HD 6900 series produced.
To measure power usage, a Kill A Watt P4400 power meter was used. Note that the numbers represent the power drain for the entire benchmarking system, not just the video cards themselves. For the 'idle' readings we measured the power drain from the desktop, with no applications running; for the 'load' situation, we took the sustained peak power drain readings after running the system through the same in-game benchmarks we used for the temperature testing. This way we are recording real-world power usage, as opposed to pushing a product to its thermal threshold.
At the default settings we were actually surprised to see the HD 7970 use more power than the HD 6970, even if it was only by 7 watts. However, while the HD 7970 does use slightly more power under load, the percentage of additional power is well under the performance increase in comparison to the HD 6970. So, overall the HD 7970 provides more performance per watt than the previous generation cards.
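The performance-per-watt argument above is a ratio of two ratios: the performance gain divided by the power increase. A minimal sketch of that arithmetic follows. The 7 watt difference comes from our load readings and the 30 percent gain is the lower end of the average lead over the HD 6970 reported in this review, but the 350 watt baseline system draw is a hypothetical placeholder, and the function name is ours.

```python
def perf_per_watt_gain(perf_gain_pct, base_watts, extra_watts):
    """Percent change in performance per watt versus the baseline card."""
    if base_watts <= 0:
        raise ValueError("baseline power draw must be positive")
    perf_ratio = 1 + perf_gain_pct / 100
    power_ratio = (base_watts + extra_watts) / base_watts
    return (perf_ratio / power_ratio - 1) * 100


# base_watts=350 is an assumed whole-system load figure, not a measured value.
gain = perf_per_watt_gain(perf_gain_pct=30, base_watts=350, extra_watts=7)
print(f"performance per watt change: {gain:.1f}%")
```

Because the meter reads whole-system draw, this describes system-level efficiency; the improvement for the card alone would depend on how much of the total draw belongs to the GPU.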
We finally have our first 28nm GPU and while the performance was not exactly earth shattering, the HD 7970 is still a strong graphics card that easily outperforms all other single-GPU solutions, and has an improved performance per watt ratio over the previous generation. Breaking it down on paper, the HD 7970 is on average 20% to 25% faster than NVIDIA's GeForce GTX 580 and 30% to 35% faster than AMD's Radeon HD 6970. Of course there are variations to this, on almost a game for game basis. In games such as Crysis 2 the HD 7970 performed nearly identically to the GTX 580, while the performance lead in Metro 2033 was at times over 30%. So, for anyone expecting the HD 7970 to come out of the gate with a 50% performance increase, this is simply not the case.
However, the HD 7970 has plenty of additional overclocking headroom, and when the GPU clock speed is increased over the 1GHz barrier, the GPU really comes alive. In our labs we were able to increase the GPU clock speed to 1125MHz, and at this speed the performance was substantially better. In games such as Aliens vs. Predator and Metro 2033 we observed huge increases in performance, boosting the frame rate to levels where it was running nearly 40% better than the GTX 580 and at times 45% better than the HD 6970, giving it the performance numbers most of us were expecting from Southern Islands parts. With this being the case, it seems it will really be up to AMD partners to push the Tahiti XT core to its limits and produce a graphics card that is capable of achieving a nearly 40% performance increase out-of-the-box.
In addition, the HD 7970 is built to be a computing powerhouse. With this new architecture, AMD has brought to the market their Graphics Core Next design, which eliminates the VLIW engine in favor of a design that is more efficient at running parallel processes. This gives the HD 7970 a total compute power of 3.79TFLOPS. With this amount of compute power and an improved architecture the HD 7970 can perform parallel computing much quicker, and some of the demonstrations that AMD showed us at the launch event were very impressive, to say the least. So, expect more companies to start taking advantage of the Compute architecture in the future.
The HD 7970 also hit the right chords in terms of power management and thermal performance. With the new “Zero Core Power” technology, AMD was able to get the HD 7970 down to less than 3 watts during long idle periods. With the 28nm node and PowerTune, the HD 7970 has a low standard idle rating of just 15W. This is a huge step forward when it comes to power management, and ensures that a system is efficient regardless of the number of GPUs being used.
The thermal performance was another area we were impressed with. During testing the HD 7970 maintained a low core temperature for the amount of current that was flowing through the GPU. The increased efficiency is due to the improved vapor chamber design, which features multiple chambers at the base of the heatsink, and better ventilation for improved heat dissipation. The redesigned fan also lowered the overall acoustic level, but we can’t say that it was whisper quiet. During load, the fan would spin upward of 36%, and at this RPM level the fan sound was noticeable, and still a bit loud. However, the high-pitched tone which plagued AMD's Cayman graphics cards is gone, and the overall acoustics are much better.
Our largest complaint with the HD 7970 is that the drivers don't yet appear ready for prime time. In the course of a week AMD has sent us three drivers that each fixed various issues, or improved performance in a game. Since the card doesn't have consistent performance across the board, we have to conclude that the drivers are just not yet optimized. This leads to the card underperforming in games such as Crysis 2 and Battlefield 3, as our results showed less than a 10 percent difference between the HD 7970 and GTX 580. As mentioned before, overclocking improved performance, but in a handful of games the frame rates were just not where they should have been for a card of this caliber.
Also, based on the performance, we would have liked to see the HD 7970 come in at $499 and not $549. Adding to this, we expect most (if not all) retailers will tack on an additional premium. Still, this is not the first graphics card to weigh in at $549 at launch and unless NVIDIA has something to counter with, AMD will keep the price at $549 simply because they can. The launch price of the HD 7970 is also similar to what the 3GB GTX 580 currently retails for, but since the HD 7970 outperforms even the 3GB GTX 580, it is hard for us to fault AMD for choosing such a high price tag. Also, the current pricing is only $50 higher than the 1.5GB GTX 580.
Overall, the performance of the HD 7970 is good, if not great. It offers a ton of new features, expands upon the Eyefinity ecosystem and walks away with the single-GPU performance crown. However, since the 3D performance cannot quite hit 50% over previous generation cards, we think it is slightly overpriced. Keep in mind, however, that this card has plenty of overclocking headroom, so if you are looking to achieve the best possible performance levels it is essential to either overclock the graphics card manually, or keep an eye out for factory overclocked models.