EVGA GeForce GTX 660 Ti 2GB SC Edition Review

Author: Roger Cantwell
Editor: Howard Ha
Publish Date: Friday, August 17th, 2012
Originally Published on Neoseeker (http://www.neoseeker.com)
Article Link: http://www.neoseeker.com/Articles/Hardware/Reviews/evga_gtx660ti_sc/
Copyright Neo Era Media, Inc. - please do not redistribute or use for commercial purposes.

A little over four months ago, NVIDIA turned the gaming world upside down with the launch of the GeForce GTX 680, the card that introduced us to the new Kepler architecture. While enthusiasts were still marveling at the GTX 680's performance and scrambling to get their hands on one, NVIDIA dropped another bomb with the launch of the GTX 670, also built on the GK104. Not slowing down one bit, NVIDIA has now gone for the hat trick with the new GTX 660 Ti!

The GTX 660 Ti we will be reviewing is the newest card to feature the Kepler architecture, and like its predecessors it features a 28nm core, GPU Boost, TXAA and Adaptive VSync. Sitting third in line behind NVIDIA's GTX 680, the fastest single-GPU graphics card available, the GTX 660 Ti was expected to be trimmed in a few areas; for instance, the total number of CUDA cores has been reduced to 1344 from the GTX 680's 1536, and the texture unit count has been lowered from 128 to 112. This doesn't mean the GTX 660 Ti is unable to deliver on performance, but it certainly allowed NVIDIA to reduce the cost of the card, giving it a more attractive price tag of $309.99.

NVIDIA's GeForce GTX 660 Ti was built using the same GK104 GPU as the GTX 680, though its implementation naturally differs in several areas. As mentioned earlier, the GTX 660 Ti sees its total CUDA core count reduced from 1536 to 1344, and it features a leaner memory subsystem: three 64-bit memory controllers for a 192-bit interface (down from 256-bit on the GTX 680), seven next-generation Streaming Multiprocessors (SMX) instead of eight, and 24 ROP units.

NVIDIA equipped the Kepler series with a single GigaThread Engine, which retrieves the required data from system memory and places it in the frame buffer. From there, the engine creates threads and forwards them to the Graphics Processing Clusters, where they are dispatched to the execution units. The GigaThread Engine feeds a total of four Graphics Processing Clusters, where most of the work is carried out. Each Graphics Processing Cluster has its own raster engine, along with all the resources required for shading, texturing and compute.

NVIDIA has also revamped the memory subsystem of the Kepler series, allowing for an overall boost to the memory clock speed. With the redesigned memory interface, NVIDIA was able to push the memory to a 6008MHz effective data rate (1502MHz actual). NVIDIA has also reduced the number of memory controllers on the GTX 660 Ti to three, giving it a 192-bit interface. However, the GTX 660 Ti's memory controller supports mixed-density memory modules, allowing NVIDIA to outfit it with a 2GB frame buffer running on a 192-bit GDDR5 interface, for a total bandwidth rating of 144.2GB/s.
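To put that 144.2GB/s figure in context, it falls straight out of the bus width and the effective data rate. The short Python sketch below is simply our own arithmetic, not anything from NVIDIA's documentation:

# Peak GDDR5 bandwidth = (bus width in bytes) x (effective data rate)
def memory_bandwidth_gbs(bus_width_bits, effective_rate_mhz):
    bytes_per_transfer = bus_width_bits / 8                 # 192-bit bus -> 24 bytes
    return bytes_per_transfer * effective_rate_mhz / 1000   # MB/s -> GB/s

print(memory_bandwidth_gbs(192, 6008))   # ~144.2 GB/s for the GTX 660 Ti
print(memory_bandwidth_gbs(256, 6008))   # ~192.3 GB/s for the GTX 680, for comparison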



Each of the four Graphics Processing Clusters contains up to two Streaming Multiprocessor (SMX) units that have been optimized for the best performance per watt. NVIDIA accomplished this by running the shaders at the same frequency as the GPU clock instead of at double the GPU clock. This approach gives the Kepler series roughly twice the performance per watt of the Fermi architecture, and allowed NVIDIA to pack more CUDA cores into a single SMX unit. Each SMX contains 192 CUDA cores, for a total of 1344 across the GTX 660 Ti's seven SMX units. Because the CUDA cores now run at the GPU clock, per-core throughput is lower, but the 1:1 clock design still lets the GTX 660 Ti achieve the same overall throughput while remaining inside a lower power envelope.
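The trade-off is easy to see with a bit of back-of-the-envelope math. The comparison below uses the GTX 580's published 772MHz core / 1544MHz shader clocks against the GTX 660 Ti's figures; the calculation itself is just our own illustration of why the 1:1 clock design needs more cores:

# Peak shader rate is proportional to (CUDA cores x shader clock)
def relative_shader_rate(cores, shader_clock_mhz):
    return cores * shader_clock_mhz

fermi_gtx580 = relative_shader_rate(512, 2 * 772)   # Fermi: shaders run at 2x the core clock
kepler_660ti = relative_shader_rate(1344, 915)      # Kepler: shaders run at the GPU clock

print(7 * 192)                        # 7 SMX units x 192 cores = 1344 CUDA cores
print(kepler_660ti / fermi_gtx580)    # ~1.56x the raw shader rate of a GTX 580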

Examining the execution units, NVIDIA has designed the CUDA cores to perform pixel, vertex and geometry shading, along with physics and compute calculations. The texture units handle texture filtering, the load/store units retrieve and save data to and from memory, and the Special Function Units (SFUs) take care of transcendental and graphics interpolation instructions. The PolyMorph Engine is responsible for tasks such as vertex fetch, tessellation, viewport transform, attribute setup and stream output.



NVIDIA's new GPU Boost is another one of the enhancements made to the Kepler series. In a way it works a lot like Intel's Turbo Boost, automatically adjusting clock speeds as needed to increase performance on demand. Where it differs is that the rated Boost Clock is not necessarily where the GPU clock will stop during graphically intensive applications. GPU Boost functions at both the hardware and software level to automatically raise the GPU clock speed, and in most cases it will push the clock well above the Boost rating.

Since the GTX 660 Ti's board power is only 150W, GPU Boost can only raise the clock speed if the card remains inside that power limit while under load. Another advantage of GPU Boost is that it works independently, so there is no need to fuss with game profiles and no action is required by the user; it provides an on-the-fly performance boost for gamers. The technology operates on a microsecond timescale, repeatedly checking the GPU's voltage, power and thermal conditions to determine whether the clocks should be increased, or dropped back toward the base 3D clock. In addition, the GTX 660 Ti's maximum thermal threshold is 98 degrees C.
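NVIDIA has not published the internals of GPU Boost, but the behaviour described above can be sketched as a simple control loop. The Python below is purely a conceptual illustration under those assumptions (the 13MHz step size is hypothetical), not NVIDIA's actual algorithm:

BASE_CLOCK_MHZ = 915      # GTX 660 Ti base 3D clock
BOOST_TARGET_MHZ = 980    # rated Boost clock (can be exceeded)
POWER_LIMIT_W = 150       # board power limit
THERMAL_LIMIT_C = 98      # maximum thermal threshold
STEP_MHZ = 13             # hypothetical clock bin, for illustration only

def next_clock(current_mhz, board_power_w, gpu_temp_c):
    # Headroom left: keep boosting, possibly well past the 980MHz rating.
    if board_power_w < POWER_LIMIT_W and gpu_temp_c < THERMAL_LIMIT_C:
        return current_mhz + STEP_MHZ
    # Out of headroom: back the clock off toward the base 3D clock.
    return max(BASE_CLOCK_MHZ, current_mhz - STEP_MHZ)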

With the launch of the Kepler series, NVIDIA has introduced us to a load of new technologies. Some are exclusive to the Kepler series, while others affect current NVIDIA hardware through driver updates.

First to be added was Adaptive VSync. As you likely know, VSync works to prevent screen tearing, which happens when the frame rate is not synchronized with your monitor's refresh rate. The downside to enabling VSync is the occasional stutter that occurs when frame rates drop below the locked VSync rate, and again as they climb back up to that locked rate.



With the release of the 301.42 WHQL drivers, NVIDIA added Adaptive VSync, a technique that automatically turns VSync off when the frame rate falls below the locked rate and re-enables it once the frame rate recovers, greatly reducing stutter while still preventing tearing. The graph below should give you a good idea of how Adaptive VSync works.
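The decision the driver makes is simple to express. The snippet below is only a simplified illustration of the logic described above; the real implementation lives inside NVIDIA's driver:

def vsync_enabled(current_fps, refresh_rate_hz):
    # Below the refresh rate, forced VSync would snap to the next divisor
    # (60 -> 30), which is the visible stutter, so VSync is released; at or
    # above the refresh rate it is enabled again to prevent tearing.
    return current_fps >= refresh_rate_hz

print(vsync_enabled(45, 60))   # False: VSync off, avoiding the stutter
print(vsync_enabled(75, 60))   # True: VSync on, preventing tearing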



When NVIDIA released the R300 drivers, they added FXAA to the NVIDIA Control Panel, paving the way for hundreds of games to use the technology since developers no longer need to implement it in their titles. FXAA was developed by NVIDIA to reduce the visible aliasing seen in games, and it is applied alongside other post-processing techniques such as motion blur and bloom. Because FXAA is a post-processing shader technique rather than a hardware multisampling method like MSAA, it helps increase performance while decreasing the load on memory.
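The reason a post-process technique is so cheap is that it operates on the finished frame rather than on extra samples per pixel. The toy sketch below captures that idea, blending only where luminance contrast suggests an edge; it is our own simplified illustration of the concept, not NVIDIA's FXAA shader:

import numpy as np

def toy_post_aa(frame, threshold=0.1):
    """frame: float32 array of shape (H, W, 3) with values in [0, 1]."""
    luma = frame @ np.array([0.299, 0.587, 0.114])            # per-pixel luminance
    neighbours = (np.roll(luma, 1, 0) + np.roll(luma, -1, 0) +
                  np.roll(luma, 1, 1) + np.roll(luma, -1, 1)) / 4.0
    edge = np.abs(luma - neighbours) > threshold               # high-contrast pixels
    blurred = (np.roll(frame, 1, 0) + np.roll(frame, -1, 0) +
               np.roll(frame, 1, 1) + np.roll(frame, -1, 1)) / 4.0
    out = frame.copy()
    out[edge] = 0.5 * frame[edge] + 0.5 * blurred[edge]        # blend only along edges
    return out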



NVIDIA's new TXAA, meanwhile, is an all-new anti-aliasing technique introduced with the Kepler series of cards. It has been designed specifically to reduce temporal aliasing, the crawling and flickering seen on edges in motion. TXAA combines several techniques: hardware AA, a custom AA resolve, and a temporal filter. It filters each pixel using a contribution of samples from both inside and outside of the pixel, along with samples from prior frames, to offer the highest quality filtering possible.
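The temporal half of that description amounts to blending the current, spatially resolved frame with an accumulated history of previous frames. The sketch below is a generic temporal-AA style blend under that assumption, not NVIDIA's actual TXAA resolve:

import numpy as np

def temporal_blend(current_frame, history, alpha=0.1):
    # A small alpha keeps most of the history, which is what suppresses the
    # frame-to-frame crawling and flickering; the value here is our assumption.
    return alpha * current_frame + (1.0 - alpha) * history

history = np.zeros((1080, 1920, 3), dtype=np.float32)   # accumulated result
# each frame: history = temporal_blend(new_frame, history)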



One last feature exclusive to Kepler is the updated 3D Vision Surround technology. Kepler and Fermi based hardware both support 3D Vision and Surround, but with Kepler you can now run both on a single card. Fermi based hardware required two graphics cards to run more than two displays, while Kepler can drive up to four displays with no need for adapters.

NVIDIA's GTX 660 Ti has been designed to offer best-in-class performance while fitting into a more attractive price range than the GTX 680 and AMD's Radeon HD 7970. In Max Payne 3, the GTX 660 Ti is twice as fast as the GTX 470, and in The Secret World it again provided double the performance of the GTX 470. The GTX 660 Ti should offer gamers more bang for the buck, and is quite possibly the best GPU available in its price range.

NVIDIA's GTX 660 Ti is equipped with the same GK104 GPU as the GTX 680, albeit slimmed down. The GPU is built on the same 28nm fabrication process, has a die size of 295mm², and is loaded with a total of 3.54 billion transistors. The GTX 660 Ti features base and Boost clock speeds of 915MHz and 980MHz, respectively, making it the first Kepler graphics card whose rated Boost clock does not exceed 1GHz (unofficially, the card is quite capable of boosting past that mark). NVIDIA's GTX 660 Ti includes four Graphics Processing Clusters with seven SMX units, giving it a total of 1344 CUDA cores, 24 ROP units and 112 texture units. NVIDIA has also tweaked the memory specifications; the GTX 660 Ti has a 192-bit memory interface and is equipped with a 2GB frame buffer clocked at 1502MHz (6008MHz effective).

NVIDIA's GTX 660 Ti is enclosed in the typical plastic cover with the EVGA logo located by the exhaust port.

NVIDIA has significantly shortened the PCB on the GTX 660 Ti to only seven inches, with the remaining 2.5 inches taken up by the enclosure for the exhaust fan. That is a lot of power for such a small package. NVIDIA has placed the memory chips on both sides of the PCB: six on the front and two on the back, leaving room for four more that AIB partners can use to increase the overall memory capacity.

The layout of the PCB also demonstrates how the onboard circuitry has been optimized to make the best use of the available space. Looking at the PCB, you can see NVIDIA has mounted the power circuitry on the front alongside the dual SLI connectors, the memory and the GPU. NVIDIA's GTX 660 Ti uses the PCIe Gen 3.0 interface, doubling the maximum data rate over Gen 2.0 and giving the card up to 32GB/s of bidirectional bandwidth on a x16 connector.
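That 32GB/s figure is straightforward to reconstruct from the per-lane signalling rates; the arithmetic below is our own, with the results rounded:

def pcie_bandwidth_gbs(lanes, gt_per_s, encoding_efficiency):
    per_direction = lanes * gt_per_s * encoding_efficiency / 8   # GB/s each way
    return per_direction, 2 * per_direction                      # and bidirectional

print(pcie_bandwidth_gbs(16, 8.0, 128 / 130))   # ~(15.8, 31.5) GB/s for Gen 3.0 x16
print(pcie_bandwidth_gbs(16, 5.0, 8 / 10))      # (8.0, 16.0) GB/s for Gen 2.0 x16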

NVIDIA's new GPU design is voltage-limited when it comes to clock speeds, and the voltage limit is set by the TDP; this means the Boost clock can scale higher than the target rating, but only within the 150W TDP. Another interesting aspect of the power design is that, because the PCB is so short, the power connectors sit in the middle of the card rather than at the rear. It's going to be interesting to see what NVIDIA's partners do with the board layout. Having only seen the sample I received, I suspect some will use a larger heatsink, although I have read that single-slot designs are in the works. We may even see a shorter PCB down the line.

The video outputs on the GTX 660 Ti have been reworked to support the expanded 3D Vision and Surround technologies, with a total of two DVI ports, a single HDMI port and a full-sized DisplayPort. Both the DisplayPort and DVI connections can support resolutions of up to 2560x1600, while the HDMI port supports resolutions of up to 1080p and comes with native support for all the latest HDMI 1.4a features.

With a TDP of 150W, the GTX 660 Ti didn't require a large heatsink. The thermal solution used on the GPU consists only of a medium-sized fin stack with heatpipes attached to the cooler. The PCB does, however, have a larger aluminum heatsink on the front-mounted power circuitry to help improve the thermal performance of the onboard power components.

The heatsink is just a square fin stack with a large copper base. When installed on the GPU, the fin stack is positioned with its opening facing toward the back of the PCB. This allows the air from the fan to be pushed through the fin stack and exhausted out the rear of the card, so the heat is blown outside of the PC instead of being trapped inside it.

The fan used on the GTX 660 Ti is similar to the one found on the GTX 680, sharing the same custom design that maximizes airflow while still running at a very low acoustic level. NVIDIA achieved this by building the fan with a special acoustic dampening material that reduces noise output by up to 5dBA without affecting airflow. The result is a much quieter graphics card that is still more thermally efficient than any of NVIDIA's prior high-end graphics cards.

NVIDIA reduced the board size of the GTX 660 Ti by almost three inches compared to its higher-end graphics cards. With the smaller PCB, NVIDIA had much less room to work with, so everything on the board had to be very compact. NVIDIA placed the power circuitry to one side of the GPU and rotated the core to improve power integrity and efficiency, and the power components were moved to the front of the board to maximize the available space. The PCB also houses the 295mm² GK104 GPU, eight memory chips with room for four more, and the dual six-pin power connectors. The VRM has added cooling to help improve efficiency.

GPU Boost & Overclocking:

The rated GPU Boost clock of the GTX 660 Ti is 980MHz, but since this is just a target, the actual Boost clock typically runs higher. The GTX 660 Ti we received was no exception: our observed Boost speed was 1084MHz, roughly 10% above NVIDIA's GPU Boost target. These clocks were hit without any adjustments to the power target or GPU offset, meaning the Boost feature dynamically raised the clock speed by roughly 10% without any work on the user's part.

When it came to overclocking, we again used the EVGA Precision X software utility. In our labs, the GTX 660 Ti was able to reach a stable clock speed of 1215MHz. To reach this frequency we increased the Power Target to 123% and raised the voltage as well, for a total of 1175mV, which was the main factor in achieving such a high overclock. The final frequency works out to roughly 24% above the rated Boost clock and 12% above the observed Boost clock. The memory also overclocked substantially, which was surprising considering it already runs at 6008MHz effective. Our end result was a final memory clock of 1603MHz (roughly 6400MHz effective), up from 1502MHz. At this speed, memory bandwidth increases to over 150GB/s.
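For the curious, the percentages and the bandwidth figure above come out of a few lines of arithmetic (our own, based on the clocks we recorded):

rated_boost, observed_boost, oc_core = 980, 1084, 1215
print((oc_core / rated_boost - 1) * 100)      # ~24% over the rated Boost clock
print((oc_core / observed_boost - 1) * 100)   # ~12% over the observed Boost clock

oc_mem_mhz = 1603                              # x4 for the effective GDDR5 rate
print(192 / 8 * oc_mem_mhz * 4 / 1000)         # ~154 GB/s of bandwidth after the overclock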

Hardware Configuration:

Test Setup:

Drivers:

Benchmarks DX 11:

Test Settings:

All benchmarks will be performed at a resolution of 1920x1080 when using a single display, while a resolution of 5760x1080 will be used for the Eyefinity and Surround benchmarks. VSync will be disabled in the control panel, AA will be set to 4x with AF set to 16x, and all in-game settings will be set to high or very high.

Usage:

Specifications:

(Note: not all models listed may be included in this review. The tables below are provided for comparison.)

Model | Processing Cores | Core Clock | Memory Clock | Memory Interface | Memory Type | Fabrication Process | MSRP
AMD Radeon HD 7970 | 2048 | 925MHz | 1375MHz | 384-bit | 3GB GDDR5 | 28nm | $479
AMD Radeon HD 7950 | 1792 | 850MHz | 1250MHz | 384-bit | 3GB GDDR5 | 28nm | $379
AMD Radeon HD 7870 GHz Edition | 1280 | 1000MHz | 1200MHz | 256-bit | 2GB GDDR5 | 28nm | $329
AMD Radeon HD 6950 | 1408 | 800MHz | 1250MHz | 256-bit | 2GB GDDR5 | 40nm | $249
AMD Radeon HD 6970 | 1536 | 880MHz | 1375MHz | 256-bit | 2GB GDDR5 | 40nm | $349

NVIDIA Specifications

Model | Processing Cores | Core Clock / Boost Clock | Memory Clock | Memory Interface | Memory Type | Fabrication Process | MSRP
NVIDIA GTX 660 Ti | 1344 | 915MHz / 980MHz | 1504MHz | 192-bit | 2GB GDDR5 | 28nm | $309
NVIDIA GTX 680 | 1536 | 1006MHz / 1058MHz | 1504MHz | 256-bit | 2GB GDDR5 | 28nm | $499
NVIDIA GTX 670 | 1344 | 915MHz / 980MHz | 1504MHz | 256-bit | 4GB GDDR5 | 28nm | $399
NVIDIA GTX 570 | 480 | 742MHz | 1250MHz | 320-bit | 1.25GB GDDR5 | 40nm | $299
NVIDIA GTX 580 | 512 | 782MHz | 1002MHz | 384-bit | 1.5GB GDDR5 | 40nm | $389

Batman: Arkham City is the sequel to the smash hit Batman: Arkham Asylum. The game was created with Unreal Engine 3 and includes areas with extreme tessellation, high-resolution textures and dynamic lighting. Batman also includes native support for PhysX and is optimized for NVIDIA's 3D Vision technology. The top graph reflects our results at 1920x1080, while the lower graph reflects our Eyefinity and Surround results at 5760x1080.

The GTX 660 Ti is hot on the heels of the HD 7970 in Batman: Arkham City.

Crysis 2 is a first-person shooter developed by Crytek and built on the CryEngine 3 engine. While the game was lacking in graphical fidelity upon its release, Crytek has since added features such as DX11 support and high-quality textures. This improved the in-game visuals substantially, which in turn pushes even high-end hardware to the max. The top graph reflects our results at 1920x1080, while the lower graph reflects our Eyefinity and Surround results at 5760x1080.

Looking at the graph below, the GTX 660 Ti was able to keep up with the HD 7950, and easily outperformed the HD 7850.

Futuremark's latest 3DMark 11 is designed for testing DirectX 11 hardware running on Windows 7 and Windows Vista. The benchmark includes six all-new tests that make extensive use of DirectX 11 features, including tessellation, compute shaders and multi-threading.

Unigine Heaven became very popular very quickly because it was one of the first major DirectX 11 benchmarks. It makes extensive use of tessellation to create a visually stunning scene.

Unigine narrows the gap between AMD and NVIDIA; here the GTX 660 Ti was only 2FPS slower than the HD 7970.

Total War: Shogun 2 is a game that creates a unique gameplay experience by combining both real-time and turn-based strategy. The game is set in 16th-century feudal Japan and gives the player control of a warlord battling various rival factions. Total War: Shogun 2 is the first in the series to feature DX11 technologies to enhance the look of the game, but with massive on-screen battles it can stress even the highest-end graphics cards.

Eyefinity and Surround have a way of bringing the big boys down, but as you can see the GTX 660 Ti had no problem keeping pace even at 5760x1080.

Temperature:

To measure GPU temperatures, we ran game benchmarks and recorded the idle and load temperatures according to the minimum and maximum values reported by MSI Afterburner. The games tested were Crysis 2 and Lost Planet 2, with each benchmark run for 15 minutes to give the included thermal solution and GPU time to reach equilibrium.

Power Consumption:

To measure power usage, a Kill A Watt power meter was used. Note that the numbers represent the power drain for the entire benchmarking system, not just the video cards themselves. For the idle readings we measured the power drain from the desktop, with no applications running. For load testing, we took the sustained peak power drain readings after running the system through the same in-game benchmarks we used for the temperature testing. This way we are recording real world power usage, as opposed to pushing a product to its thermal threshold.

I have to admit I was a little surprised at the power usage of the GTX 660 Ti; it was higher than I expected it to be.

The Kepler architecture continues to meet and exceed everyone's expectations, and looking at the performance numbers it is easy to see why. The GTX 660 Ti outperformed both the Radeon HD 7850 and HD 7950 with only a few exceptions, while drawing only a little more power than either card, which was very surprising. The GTX 660 Ti is priced at $309, placing it right between AMD's HD 7850 and HD 7950. By slotting the GTX 660 Ti in between the two, NVIDIA is effectively taking out both the HD 7850 and HD 7950 at the same time; why pay more for a graphics card that offers less performance?

One advantage AMD still holds over the Kepler cards is that its higher-end Southern Islands models are equipped with 3GB of GDDR5 memory on a 384-bit interface, giving AMD the edge in some games and synthetic benchmarks. Even so, AMD's Southern Islands lineup still lacks the overall performance that NVIDIA's Kepler lineup is capable of delivering. In today's market, gamers are always on the lookout for the best bang for the buck, and that makes the GTX 660 Ti the best card available for anyone looking to stay within the $300 to $350 price range.

NVIDIA really brought its A-game with the new Kepler technology, and the GTX 660 Ti is a grand example of this. The GTX 660 Ti has a long list of features, including the new TXAA film-style anti-aliasing with its custom CG film-style AA resolve, improvements to standard AA, the ability to run NVIDIA's 3D Vision Surround from a single graphics card, Adaptive VSync, and NVIDIA's new GPU Boost, which dynamically adjusts the clocks for an instant boost to performance.

With all of these new features it is no surprise gamers are having a hard time finding the Kepler series of graphics cards on the shelf. The Kepler cards are like a tsunami headed straight for the Southern Islands, and AMD had best break out the life vests! Anyone have some shark repellant handy?

 
