Graphics Core Next is a scalable architecture designed to be optimized for both graphics horsepower and compute power. The GPU is composed of 32 Compute Units, dual Geometry engines, 8 Render Back-Ends, 768KB of read/write L2 cache and a 384-bit memory bus, all running on a PCI Express 3.0 x16 bus interface and packing in a total of 4.31 billion transistors. Like the HD 6900 series, it has dual geometry engines that each have their own Rasterizer and Render Back-End units but share the 32 Compute Units. In essence, you can think of these Geometry engines as dual cores inside the GPU.
As mentioned earlier, the Southern Islands GPU is designed to work as both a graphics and a compute engine. This is due to the Asynchronous Compute Engines (ACE), which present three devices in one GPU that are completely asynchronous from one another: the graphics pipeline and two parallel compute pipelines. Each compute pipeline runs independently of, and asynchronously to, the primary graphics pipeline. This allows the GPU to process an intensive compute workload while it is simultaneously running an intensive 3D application, and maximizes the utilization of the available 3.79 TFLOPS of compute power. Paired with this are the processing core and dual DMA engines, which allow the HD 7970 to saturate the PCI Express 3.0 interface in both directions, alongside close to a teraflop of double-precision compute.
On top of this, all of the GDDR5 memory is protected by single error correction and double error detection (SECDED) when used in a compute environment. All of the internal SRAM has the same ECC data protection.
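To make "single error correction, double error detection" concrete, here is a minimal sketch of a toy SECDED code: an extended Hamming code that protects 4 data bits with 4 check bits. The word size, bit layout and the `encode`/`decode` names are assumptions chosen for illustration; the article does not describe the code AMD's hardware actually uses, and real memory protects much wider words.

```python
def encode(data_bits):
    """Encode 4 data bits as an 8-bit SECDED word.

    Layout (index = bit position): [p0, p1, p2, d1, p4, d2, d3, d4],
    where p0 is an overall parity bit and p1/p2/p4 are Hamming parity bits.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    word = [0, p1, p2, d1, p4, d2, d3, d4]
    word[0] = sum(word) % 2  # make the total number of 1 bits even
    return word


def decode(word):
    """Return (status, data_bits); status is 'ok', 'corrected' or 'double_error'."""
    w = list(word)
    # XOR of the positions holding a 1 is zero for a valid word;
    # after a single flip it equals the position of the flipped bit.
    syndrome = 0
    for pos in range(1, 8):
        if w[pos]:
            syndrome ^= pos
    overall = sum(w) % 2

    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd parity means one bit flipped; syndrome 0 means it was p0 itself.
        w[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: two bits flipped, not correctable.
        return "double_error", None
    return status, [w[3], w[5], w[6], w[7]]


data = [1, 0, 1, 1]
word = encode(data)

flipped_once = list(word)
flipped_once[5] ^= 1          # simulate a single-bit memory error
print(decode(flipped_once))   # single error is corrected

flipped_twice = list(flipped_once)
flipped_twice[2] ^= 1         # a second error in the same word
print(decode(flipped_twice))  # double error is detected, not corrected
```

The extra overall parity bit is what separates the two cases: a single flip leaves the word with odd parity, while a double flip keeps parity even but still produces a non-zero syndrome.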

While many are wondering exactly how the efficiency of GCN compares to VLIW4, AMD has stated we should see more performance per square millimeter with the GCN architecture than was available in previous generations. According to Eric Demers from AMD, the theoretical peak improvement over the previous-generation architecture (the Radeon HD 6970) is up to 7 to 7.5 times. The Tahiti architecture also includes an improved Gen 9 tessellator unit that increases tessellation throughput over the previous generation via more efficient vertex re-use, larger parameter caches and improved off-chip buffering. All of this gives the GCN architecture up to 4 times the tessellation throughput of the Radeon HD 6900 series.

The basic compute unit includes all the instruction, wavefront and scheduling logic; essentially, this unit can be thought of as its own core. Each compute unit has four sub-units that each run a 64-wide wavefront over four cycles, completely independently of each other. This is a huge departure from the VLIW engine, which had 5 or 4 math units that executed individual instructions in parallel. Since the new GCN architecture runs fully scalar, it eliminates the compilation issues of the VLIW design.
In addition, the scalar unit runs in parallel to the vector units and can issue instructions of its own, allowing complex operations to be moved to the scalar unit for improved efficiency. The compute unit also includes a 16KB L1 cache that supports both reads and writes. Previously, textures had to enter the cache to be processed, be passed on to the back-end and then be exported back into the cache; this can now be handled exclusively through the L1 cache via its read/write functions.
Moving over to this form of computing doesn't necessarily translate into improved graphics performance, but since VLIW compiling was not necessarily efficient, it will improve the total compute power, i.e. the parallel processing, of the GPU. The compute unit has four independent wavefronts running in parallel as well as a scalar programming model at the lane level to ensure that all instructions running through the GPU automatically work. This eliminates all port conflicts, simplifying the compiler and instructions, and thus improves compute performance dramatically.
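As a rough sanity check on the compute unit layout described above, the peak single-precision throughput can be estimated from the unit counts. This is a back-of-the-envelope sketch, not AMD's own calculation: the 16 lanes per sub-unit and the 2 FLOPs per lane per clock (one fused multiply-add) are assumptions, while the 32 compute units, four sub-units and 925MHz clock come from the article.

```python
COMPUTE_UNITS = 32             # from the article
SIMDS_PER_CU = 4               # the "four sub-units" in each compute unit
LANES_PER_SIMD = 16            # assumption: 16 lanes per sub-unit
FLOPS_PER_LANE_PER_CLOCK = 2   # assumption: a fused multiply-add counts as 2 FLOPs
CLOCK_HZ = 925e6               # default GPU clock from the article


def wavefront_cycles(wavefront_size, lanes):
    """Cycles needed to push one wavefront through a sub-unit."""
    return wavefront_size // lanes


def peak_flops(cus, simds, lanes, flops_per_lane, clock_hz):
    """Theoretical peak single-precision FLOPS for the whole GPU."""
    return cus * simds * lanes * flops_per_lane * clock_hz


total_lanes = COMPUTE_UNITS * SIMDS_PER_CU * LANES_PER_SIMD
tflops = peak_flops(COMPUTE_UNITS, SIMDS_PER_CU, LANES_PER_SIMD,
                    FLOPS_PER_LANE_PER_CLOCK, CLOCK_HZ) / 1e12

print(f"Cycles per 64-wide wavefront: {wavefront_cycles(64, LANES_PER_SIMD)}")
print(f"Total vector lanes: {total_lanes}")
print(f"Peak throughput: {tflops:.2f} TFLOPS")
```

Under these assumptions the estimate lands close to the TFLOPS figure quoted earlier, and a 64-wide wavefront on 16 lanes takes exactly the four cycles the article mentions.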

When it comes to the memory, each L1 cache has 64 bytes of bandwidth per clock, and the HD 7970 has a total of 32 of them, giving it up to 2TB/s of bandwidth. As mentioned before, each L1 cache has both read and write functions, something new to the Southern Islands architecture. Each L1 still has an L2 cache to fall back on, and the L2 now also includes both read and write functions. In total, the HD 7970 comes packed with twelve L2 caches that feed back into the L1 caches, each also at 64 bytes per clock. This gives the HD 7970 around 710GB/s of L2 cache bandwidth at the default GPU clock speed of 925MHz for both reads and writes.
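The cache bandwidth figures above follow directly from the cache counts, the per-clock width and the GPU clock. A short sketch of that arithmetic, using only numbers stated in the article (the function and variable names are ours):

```python
CLOCK_HZ = 925e6        # default GPU clock
BYTES_PER_CLOCK = 64    # per cache, per clock
L1_CACHES = 32          # one per compute unit
L2_CACHES = 12


def aggregate_bandwidth(num_caches, bytes_per_clock, clock_hz):
    """Total bytes per second across all caches of one level."""
    return num_caches * bytes_per_clock * clock_hz


l1_bw = aggregate_bandwidth(L1_CACHES, BYTES_PER_CLOCK, CLOCK_HZ)
l2_bw = aggregate_bandwidth(L2_CACHES, BYTES_PER_CLOCK, CLOCK_HZ)

print(f"L1: {l1_bw / 1e12:.2f} TB/s")  # prints "L1: 1.89 TB/s"
print(f"L2: {l2_bw / 1e9:.1f} GB/s")   # prints "L2: 710.4 GB/s"
```

The L1 total works out to roughly 1.9TB/s, which the article rounds to "up to 2TB/s", and the L2 total matches the quoted 710GB/s.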
The GPU also includes a 16KB instruction cache and a 32KB scalar data cache that are shared per four compute units and which, as mentioned earlier, are also backed by the L2 cache. Additionally, each compute unit has its own registers and local data share. There is also a global data share unit that works as a managed buffer on the chip to allow sharing between any wavefronts on the chip.

The Southern Islands architecture also includes a texture mapping technique called Partially Resident Textures, or PRT. Essentially, what this feature does is take advantage of all the memory hardware available and turn the local frame buffer into a local texture cache. So, what does this mean? First, the local graphics memory can behave like a hardware-managed cache where texture data can be streamed in on demand. This prevents stuttering as the pages are brought in, and lets texture streaming handle the process more efficiently.
PRT also improves memory efficiency and image quality with very large, detailed textures, allowing texture sizes of up to 32 TB (16k x 16k x 8k x 128-bit). This is done by splitting textures into 64KB chunks and dynamically selecting which chunks need to be loaded into memory. So, essentially, the textures that are not going to be displayed are not loaded. Every request is also translated through the page tables. This allows the data to be rendered if it is available, and if not, the application can manage the textures. The application can then dynamically opt to use a lower-resolution version of the texture, which will make it slightly blurry, but there will be no lag time in these situations.
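The numbers in this section are easy to check, and the fallback behaviour can be sketched in a few lines. The size arithmetic below uses only figures from the article. The `sample_mip` helper is purely hypothetical: it models residency as a set of `(mip, x, y)` tile keys and assumes tile coordinates halve at each coarser level, which is a simplification for illustration rather than how the hardware page tables work.

```python
TILE_BYTES = 64 * 1024  # 64KB chunks, from the article


def texture_bytes(width, height, depth, bits_per_texel):
    """Total size in bytes of a fully populated texture."""
    return width * height * depth * bits_per_texel // 8


max_bytes = texture_bytes(16 * 1024, 16 * 1024, 8 * 1024, 128)
max_tib = max_bytes / 2**40
tile_count = max_bytes // TILE_BYTES

print(f"Maximum texture size: {max_tib:.0f} TB")
print(f"64KB tiles needed to back it fully: {tile_count}")


def sample_mip(resident, tile, mip, coarsest_mip):
    """Return the finest resident mip level at or above `mip` for a tile.

    `resident` is a set of (mip, x, y) keys for tiles currently in memory.
    Returns None if nothing is resident, leaving the application to decide.
    """
    x, y = tile
    level = mip
    while level <= coarsest_mip:
        if (level, x, y) in resident:
            return level
        # Assumption: each coarser level covers twice the area per tile.
        x //= 2
        y //= 2
        level += 1
    return None


resident = {(0, 5, 5), (2, 0, 0)}
print(sample_mip(resident, (5, 5), 0, 2))  # tile is resident at full detail
print(sample_mip(resident, (3, 1), 0, 2))  # falls back to a blurrier level
```

The second lookup is the "slightly blurry but no lag" case: the detailed tile is not in memory, so a coarser resident level is used instead of stalling.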

Like the HD 6900 series, the Southern Islands graphics cards come equipped with AMD PowerTune technology. In essence, PowerTune is a means to enforce a predefined TDP by adjusting the clock speeds in real time. The way in which PowerTune works is very different from the on-board regulation chips used on NVIDIA’s GTX 500 series. NVIDIA’s power management system monitors the power coming from the rails, while AMD’s technology instead relies on performance counters that are embedded throughout the GPU. These performance counters feed an internal algorithm that dynamically calculates how much power is being used and adjusts the clock accordingly. This allows PowerTune to maintain the power draw at the predefined level, effectively eliminating huge power surges. Since games operate at a lower peak power rate than stress-testing applications such as Kombustor, in-game performance will not be negatively affected.
AMD has also introduced a new feature called "Zero Core Power" which minimizes the idle power consumption of the board. This is a potentially exciting new addition to the AMD graphics card series, and one we will touch on in the next few pages.
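The idea of estimating power from activity counters rather than measuring the rails can be sketched as a simple control loop. Everything numeric here is hypothetical: the per-block weights, the idle power, the 250W limit, the 25MHz step and the assumption that dynamic power scales linearly with clock are all made up for illustration. AMD's actual algorithm is not described in the article.

```python
# Hypothetical weights: estimated watts per block at 100% activity and 925MHz.
BLOCK_WEIGHTS_W = {"shader": 150.0, "memory": 60.0, "raster": 40.0}
IDLE_POWER_W = 30.0  # hypothetical baseline draw


def estimate_power(activity, clock_mhz, ref_clock_mhz=925.0):
    """Estimate board power from activity counters (0.0 to 1.0 per block).

    No rail measurement is involved; the estimate is purely counter-based.
    Assumption: dynamic power scales linearly with clock speed.
    """
    dynamic = sum(BLOCK_WEIGHTS_W[b] * activity[b] for b in BLOCK_WEIGHTS_W)
    return IDLE_POWER_W + dynamic * (clock_mhz / ref_clock_mhz)


def next_clock(activity, clock_mhz, tdp_w,
               step_mhz=25.0, min_mhz=300.0, max_mhz=925.0):
    """Step the clock down when over the limit, back up when there is headroom."""
    if estimate_power(activity, clock_mhz) > tdp_w:
        return max(min_mhz, clock_mhz - step_mhz)
    higher = min(max_mhz, clock_mhz + step_mhz)
    if estimate_power(activity, higher) <= tdp_w:
        return higher
    return clock_mhz


game = {"shader": 0.7, "memory": 0.6, "raster": 0.5}
stress = {"shader": 1.0, "memory": 1.0, "raster": 1.0}

for name, load in (("game", game), ("stress test", stress)):
    clock = 925.0
    for _ in range(20):  # let the loop settle
        clock = next_clock(load, clock, tdp_w=250.0)
    print(f"{name}: settles at {clock:.0f} MHz")
```

With these made-up numbers the game load never reaches the limit and stays at full clock, while the stress load is throttled until the estimate fits under the cap, which mirrors the behaviour the article describes.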

The only thing I have to disagree with you on is that it's overpriced... or rather, it's not 'relatively' overpriced compared to Nvidia's GTX 580. For having an average of 25.8% higher performance in 2560x1600 (With only a single score under 18% difference) I think the $50 (10%) more you pay over the GTX 580 is worth it if you're spending that much to begin with. On a whole though both cards are overpriced.
Add on to that all the new or improved features and it's a pretty solid package imo, even though I was hoping for a bit more from the 28nm node.
ATI is trying to take advantage of its new GPU as the "fastest single-core chip" before Kepler is out. The price will be lowered afterwards plus some performance improvements through new drivers.
Was going to opt for one of the new Sapphire Dual Fan 6970s but since in a month's time the 7xxx series will be out, will probably wait until these get on the market.
Only issue is that I can't fit a full 275mm GPU in my case. Ideally needs to be less than 250mm.
TBH after the failings of the FX/Bulldozer CPUs, AMD does need this to hit the market strong, since for the past years nVidia have been in front of AMD on performance. Seems AMD is starting to go for better price/performance instead of trying to compete head on with Intel.
Mind you, if the Piledrivers improve the Bulldozer architecture and fix its issues (by having 8 true cores instead of modules), and manage to fit an AM3+ socket, then I might be tempted to go for one of them as well.
The 6970 was only around $300 at launch, so I honestly expected this one to be around $400. AMD is not going to sell many cards with this price point, because to be honest, it's NOT worth it. A 15% increase over a 580 in most cases is awful, not only because it's 28nm, but because it's a whole new Architecture.
Overpriced, underperforming, not worth it. I'll wait for Kepler.
Also, the HD 6970 had an MSRP of closer to $400 at launch, so it was expected this card would retail higher due to the better performance.
Anyway, this should be at $400, if it were $400, it would be reasonable for sure. $350 would be the sweet spot that would really just destroy Nvidia. The thing I love about AMD/ATI's cards were the fact that they offered the best price/performance. $550 is absolutely overpriced, you can't even argue it. The GTX 580 is also overpriced, and while this does beat out the 580 for a similar price ($50 more than the 3GB 580), this is a standard reference next-gen card. It should be around the same price as their standard current gen cards, or around $50 more. It's nothing amazing in the performance department, either. If it had a solid 40%-45% increase over the GTX 580, I could see $550.
That said... I was still expecting a bit more wow factor, I'm guessing the 7990 when that comes around will do the big leap though much like the 6990 did. Maybe we can get a MARS version...
the price is high, but what do you expect? its new tech. they are always overpriced initially. sure, the initial price is high, even by those standards, but im sure if you wait a month or two, it will drop considerably. in any case, i believe that the extra price is partly justified with all the features offered, especially considering that it has some better power management technology added in, as well as the improvements to eyefinity with the audio and the 3d features. i myself wouldnt use such a feature (the 3d), but hey, its there for others if they want it. ive never really understood the hype behind 3d anyway.
i say let people wait until its cheaper, and then im sure it would be a great value card.
I believe anandtech did a test with pci-2.1 and there was no difference in gaming. In GPGPU calculations there was something like a 7-10% performance loss. Don't have the exact numbers in front of me though.
^Fanboy squeal amirite?
Anyway, although the performance is great on this card, I'm particularly interested in the cooling. I hope NVIDIA takes a leaf out of AMD's book and improves the cooling solutions on their future cards, as I'm not terribly interested in going back to AMD in the future (NVIDIA is just a more logical choice, considering its feature set and software support).
I look forward to a review of a 670 or 680 card (or equivalent, though I don't see them changing it).
My jaw drops at the thought of a 690 GPU though, I can only imagine how epic that card will be (performance and price wise XD).
Normally I would've gone for an Intel+nVidia build but to build it to the spec I would've wanted it would've cost me around £1000 at least and I don't have that kinda money. AMD seems to have moved to competing on price, hence why I've gone for an AMD build at a little over £500. However I've noticed a lot of games are displaying nVidia logos.
Granted I might not have explored every type of card on there but I'm going by what I've seen so far.