The RV870 Story: AMD Showing up to the Fight
by Anand Lal Shimpi on February 14, 2010 12:00 AM EST - Posted in GPUs
The Payoff: How RV740 Saved Cypress
For its first 40nm GPU, ATI chose the biggest die that made sense in its roadmap. That was the RV740 (Radeon HD 4770):
The first to 40nm - The ATI Radeon HD 4770, April 2009
NVIDIA, however, picked a smaller die. While the RV740 was a 137mm² GPU, NVIDIA’s first 40nm parts were the G210 and GT220, which measured 57mm² and 100mm² respectively. The G210 and GT220 were OEM-only for the first months of their lives, and I’m guessing the G210 made up a good percentage of those orders. Note that it wasn’t until the release of the GeForce GT 240 that NVIDIA made a 40nm die equal in size to the RV740. The GT 240 came out in November 2009, while the Radeon HD 4770 (RV740) debuted in April 2009 - 7 months earlier.
NVIDIA's first 40nm GPUs shipped in July 2009
When it came time for both ATI and NVIDIA to move their high performance GPUs to 40nm, ATI had more experience and exposure to the big die problems with TSMC’s process.
David Wang, ATI’s VP of Graphics Engineering at the time, had concerns about TSMC’s 40nm process that he voiced to Carrell early on in the RV740 design process. David was worried that the metal handling in the fabrication process might lead to via quality issues. Vias are tiny connections between the different metal layers on a chip, and the thinking was that the via failure rate at 40nm was high enough to impact the yield of the process. Even if a via didn’t fail completely, its poor quality would degrade the signal passing through it.
The second cause for concern with TSMC’s 40nm process was about variation in transistor dimensions. There are thousands of dimensions in semiconductor design that you have to worry about. And as with any sort of manufacturing, there’s variance in many if not all of those dimensions from chip to chip. David was particularly worried about manufacturing variation in transistor channel length. He was worried that the tolerances ATI were given might not be met.
A standard CMOS transistor. Its dimensions are usually known to fairly tight tolerances.
TSMC led ATI to believe that the variation in channel length was going to be relatively small. Carrell and crew were nervous, but there was nothing that could be done.
The problem with vias was easy (but costly) to get around. David Wang decided to double up on vias with the RV740. At any point in the design where there was a via that connected two metal layers, the RV740 called for two. It made the chip bigger, but that was better than having chips that wouldn’t work. The issue of channel length variation, however, had no immediate solution - it was a worry of theirs, but perhaps an irrational fear.
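A rough back-of-the-envelope model shows why doubling up helps. Assume each via fails independently with some probability p (a simplification; real via defects can be correlated). A connection built from two vias is only lost if both fail, so its failure probability drops from p to p². The function name and the numbers below are made up purely for illustration and are not ATI or TSMC data:

```python
def chip_via_yield(p_via_fail, connections, vias_per_connection=1):
    """Probability that every connection has at least one working via.

    Assumes via failures are independent, which is a simplification.
    """
    p_connection_fail = p_via_fail ** vias_per_connection
    return (1.0 - p_connection_fail) ** connections


# Hypothetical values, chosen only to illustrate the effect.
P_FAIL = 1e-9              # chance that a single via is bad
CONNECTIONS = 500_000_000  # number of via sites on the chip

single = chip_via_yield(P_FAIL, CONNECTIONS, vias_per_connection=1)
double = chip_via_yield(P_FAIL, CONNECTIONS, vias_per_connection=2)
print(f"single vias: {single:.3f}, doubled vias: {double:.3f}")
```

The sketch only covers outright via failures. It says nothing about the signal degradation from marginal vias, or about the extra die area the second via costs.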
TSMC went off to fab the initial RV740s. When the chips came back, they were running hotter than ATI expected them to run. They were also leaking more current than ATI expected.
Engineering went to work, tearing the chips apart, looking at them one by one. It didn’t take long to figure out that transistor channel length varied much more than the initial tolerance specs allowed. With that much channel length variance, some parts will run slower than expected, while others will leak tons of current.
Engineering eventually figured out a way to fix most of the leakage problem through some changes to the RV740 design. The performance was still a problem and the RV740 was mostly lost as a product because of the length of time it took to fix all of this stuff. But it served a much larger role within ATI. It was the pipe cleaner product that paved the way for Cypress and the rest of the Evergreen line.
As for how all of this applies to NVIDIA, it’s impossible to say for sure. But the rumors all seem to support that NVIDIA simply didn’t have the 40nm experience that ATI did. Last December NVIDIA spoke out against TSMC and called for nearly zero via defects.
The rumors surrounding Fermi also point at the same problems ATI encountered with the RV740. Low yields, the chips run hotter than expected, and the clock speeds are lower than their original targets. Granted we haven’t seen any GF100s ship yet, so we don’t know any of it for sure.
When I asked NVIDIA why it was so late with Fermi/GF100, the company pointed to parts of the architecture - not manufacturing. Of course, I was talking to an architect at the time. If Fermi/GF100 was indeed NVIDIA’s learning experience for TSMC’s 40nm I’d expect that its successor would go much more smoothly.
It’s not that TSMC doesn’t know how to run a foundry, but perhaps the company made a bigger jump than it should have with the move to 40nm:
Process | 150nm | 130nm | 110nm | 90nm | 80nm | 65nm | 55nm | 40nm |
Linear Scaling | - | 0.866 | 0.846 | 0.818 | 0.888 | 0.812 | 0.846 | 0.727 |
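The scaling row is just each node's feature size divided by the size of the node before it. A minimal sketch that recomputes it (the names are illustrative, and the node list is taken straight from the table):

```python
# Process nodes from the table, in nanometers, oldest first.
NODES_NM = [150, 130, 110, 90, 80, 65, 55, 40]


def linear_scaling(nodes):
    """Return each node's size divided by the previous node's size."""
    return [new / old for old, new in zip(nodes, nodes[1:])]


ratios = linear_scaling(NODES_NM)
for node, ratio in zip(NODES_NM[1:], ratios):
    print(f"{node}nm: {ratio:.3f}")

# The smallest ratio marks the most aggressive shrink in the list.
biggest_jump = NODES_NM[1:][ratios.index(min(ratios))]
print(f"largest single shrink: {biggest_jump}nm")
```

Printed to three places, a couple of values round up in the last digit where the table appears to truncate (130/150 and 80/90), but the picture is the same: 55nm to 40nm is the largest single step in the list.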
You’ll remember that during the Cypress discussion, Carrell was convinced that TSMC’s 40nm process wouldn’t be as cheap as it was being made out to be. Yet very few others, whether at ATI or NVIDIA, seemed to believe the same. I asked Carrell why that was, and how he was able to know what many others didn’t.
Carrell chalked it up to experience and recounted a bunch of stuff that I can’t publish here. Needless to say, he was more skeptical of TSMC’s ability to deliver what it was promising at 40nm. And it never hurts to have a pragmatic skeptic on board.
132 Comments
devene - Sunday, February 14, 2010
Just like many others, I've been a long time reader and I just couldn't carry on without leaving a comment: This has been an article, just like the RV770 one. It may not reveal many facts but is tremendously insightful and inspiring. Thank you for bringing this deeply hidden information out to the public and to the "fans". Please do everything in your power to continue this trend.
Once again, thank you Anand,
devene
medi01 - Sunday, February 14, 2010
Germans say "lange Rede kurzer Sinn" [literally, "long speech, short meaning"]. So many pointless sentences that do not tell anything even remotely interesting.
TGressus - Sunday, February 14, 2010
Even the home team could not be sold on Eyefinity...
William Gaatjes - Sunday, February 14, 2010
Fantastic article.
"
First, it massively increased the confidence level of the engineering team. There’s this whole human nature aspect to everything in life, it comes with being human. Lose confidence and execution sucks, but if you are working towards a realistic set of goals then morale and confidence are both high. The side effect is that a passionate engineer will also work to try and beat those goals.
"
Finally, someone accepting and using human nature.
And see it works out...
The fun part is that a requested functionality that is desired but cannot make it within the expected timeframe can still be worked on and can be ready for the next "bulge" in the market. This way you relieve your engineers from stress, you have the time to sort errors and bugs out, you have time to solve unforeseen consequences that always happen (people can get sick, a bug in software, machines breaking down) and you have a feature for the marketing department to market to the consumer for the next iteration of the product. This way you can use the free market to build an in the end perfect device. It is all about balance. If you have to invest too much energy in situation a, you will have less energy for situation b in a certain timeframe. We are bound by laws of nature, meaning there is no "perpetuum mobile" in this universe. Nothing comes for free...
aegisofrime - Sunday, February 14, 2010
Anand, you have taken an article that is really technical in nature, and turned it into something entertaining to read and yet informative for non-engineer types. My hat's off to you. This is really the right balance of information and readability. If only all the Scientific Papers I have to read were written like this!
dukeariochofchaos - Sunday, February 14, 2010
i wonder if you will give fermi the same drama queen touch? i hope so.
Jamahl - Sunday, February 14, 2010
I don't think anyone wants to read nvidia's marketing department tell us how awesome PhysX and CUDA is again tbh.
TGressus - Sunday, February 14, 2010
I suspect Fermi will be able to stand on its technological innovation.
RJohnson - Sunday, February 14, 2010
...and its exorbitant price/die size will exclude mere mortals from owning one.
Spoelie - Sunday, February 14, 2010
That depends entirely on the openness of NVIDIA on the subject, historically not one of their strong points. In fact ATi's take on NVIDIA's design process has been more informative than what has come out of NVIDIA itself.
But here's to hoping..