Tessellation: Because The GS Isn't Fast Enough
Microsoft and AMD tend to get the most excited about tessellation whenever the topic of DX11 comes up. AMD jumped on the tessellation bandwagon long ago, and perhaps it does make sense for consoles like the Xbox 360. Adding fixed function hardware to quickly and efficiently handle a task that reduces memory footprint has major advantages in the living room. We still aren't sold on the need for a tessellator on the desktop, but who's to argue with progress?
Or is it really progressive? The tessellator itself is fixed function rather than programmable. Sure, the input to and output of the tessellator can be manipulated a bit through the Hull Shader and Domain Shader, but the heart of the beast is just not that flexible. The Geometry Shader is the programmable block in the pipeline that is capable of tessellation as well as much more, but it just doesn't have the power to do tessellation on any useful scale. So while most everything has been moving towards programmability in the rendering pipe, we have sort of a step backward here. But why?
The argument between fixed function and programmable hardware is always one of performance versus flexibility and usefulness. In the beginning, fixed function was necessary to get the desired performance. As time went on, it became clear that adding more fixed function hardware to graphics chips just wasn't feasible. The transistors put into specialized hardware go unused if developers don't program to take advantage of it. That made it much more attractive to shift toward architectures built around a growing pool of compute resources that can be shared and used for many different tasks, at least in the general case. But that doesn't mean that fixed function hardware doesn't have its place.
We do still have the problem that all the transistors put into the tessellator are worthless unless developers take advantage of the hardware. But the reason it makes sense is that the ROI (return on investment: what you get for what you put in) on those transistors is huge if developers do take advantage of it. It's much easier to get huge tessellation performance out of a fixed function tessellator than to put the necessary resources into the Geometry Shader to allow it to reach the same tessellation performance programmatically. This doesn't mean we'll start to see a renaissance of fixed function blocks in our graphics hardware, just that significantly advanced features going forward may still require the sacrifice of programmability in favor of early adoption of a feature. The majority of tasks will continue to be enabled in a flexible programmable way, and in the future we may see more flexibility introduced into the tessellator until it becomes fully programmable as well (or ends up just being merged into some future version of the Geometry Shader).
Now don't let this technical assessment of fixed function tessellation make you think we aren't interested in reaping the benefits of the tessellator. Currently, artists need to create different versions of their objects for different LODs (Level of Detail: reducing or increasing complexity as the object moves farther from or nearer to the viewer), and geometry simulation through texturing at each LOD needs to be done by pixel shaders. This requires extra work from both artists and programmers and costs a good bit in terms of performance. There are also some effects that can only be done with more geometry.
Tessellation is a great way to get that geometry in there for more detail, shadowing, and smooth edges. High geometry also allows really cool displacement mapping effects. Currently, much geometry is simulated through textures and techniques like bump mapping or parallax occlusion mapping. Even with high geometry, we will want to have large normal maps for our lighting algorithms to use, but we won't need to do so much work to make things like cracks, bumps, ridges, and small detail geometry appear to be there when they aren't, because we can just tessellate and displace in a single pass through the pipeline. This is fast, efficient, and can produce very detailed effects while freeing up pixel shader resources for other uses. With tessellation, artists can create one subdivision surface that can have a dynamic LOD free of charge; a simple hull shader and a displacement map applied in the domain shader will save a lot of work, increase quality, and improve performance quite a bit.
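To make the tessellate-and-displace idea a little more concrete, here is a rough CPU-side sketch in Python. It is not HLSL and not the DX11 API; it only mimics the division of labor described above. Every name in it (`tess_factor`, `barycentric_grid`, `height`, `domain_stage`) is hypothetical, the `near`, `far`, and `max_factor` values are arbitrary assumptions, and a single uniform factor per triangle is a simplification of what real hardware exposes.

```python
import math

def tess_factor(distance, near=1.0, far=50.0, max_factor=16):
    """Hull-shader-style decision: subdivide more when the patch is close.
    The near/far/max_factor defaults are illustrative assumptions."""
    t = (distance - near) / (far - near)
    t = min(max(t, 0.0), 1.0)
    return max(1, round(max_factor * (1.0 - t)))

def barycentric_grid(n):
    """Tessellator-style step: uniform barycentric sample points for factor n."""
    return [(i / n, j / n, (n - i - j) / n)
            for i in range(n + 1) for j in range(n + 1 - i)]

def height(u, v):
    """Stand-in for a displacement map lookup (hypothetical procedural bumps)."""
    return 0.05 * math.sin(8.0 * u) * math.cos(8.0 * v)

def domain_stage(tri, normal, bary):
    """Domain-shader-style step: interpolate a position on the triangle,
    then push it along the normal by the sampled height."""
    a, b, c = bary
    p = [a * tri[0][k] + b * tri[1][k] + c * tri[2][k] for k in range(3)]
    h = height(p[0], p[1])
    return tuple(p[k] + h * normal[k] for k in range(3))

def tessellate(tri, normal, distance):
    """One coarse triangle in, a distance-dependent cloud of displaced vertices out."""
    n = tess_factor(distance)
    return [domain_stage(tri, normal, w) for w in barycentric_grid(n)]
```

Calling `tessellate` on the same coarse triangle with a small `distance` yields many displaced vertices, while a large `distance` yields only the three original corners, which is the "one asset, dynamic LOD" behavior in miniature. A factor of n here produces (n+1)(n+2)/2 vertices.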
If developers adopt tessellation, we could see cool things, and with the move to DX11-class hardware both NVIDIA and AMD will be making parts with tessellation capability. But we may not see developers start using tessellation (or the compute shader for that matter) right away. DirectX 11 will run on down level hardware, and at its release there will already be a huge number of cards on the market capable of running a subset of DX11. That subset brings with it a better, more refined programming language in the new version of HLSL, along with seamless parallelization optimizations. As a result, we will very likely see the first DX11 games implementing only features that can run completely on DX10 hardware.
Of course, at that point developers can be fully confident of exploiting all the aspects of DX10 hardware, which they still aren't completely taking advantage of. Many people still want and need a DX9 path because of Vista's failure, which means DX10 code tends to be more or less an enhanced DX9 path rather than something fundamentally different. So when DirectX 11 finally debuts, we will start to see what developers could really do with DX10.
Certainly there will be developers experimenting with tessellation, but at first this will probably just be simple amplification to get rid of those jagged edges around curved surfaces. It will take time for the really advanced tessellation techniques everyone is excited about to come to fruition.
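For a feel of what even simple amplification buys on a curved silhouette, consider a circle approximated by straight segments. This is a small Python sketch under that simplifying assumption; the function name `silhouette_error` is made up for illustration. The largest gap between the true circle and an inscribed regular polygon is r(1 - cos(pi/n)) for n segments.

```python
import math

def silhouette_error(radius, segments):
    """Largest gap between a true circle and the regular polygon
    inscribed in it, measured at the midpoint of an edge."""
    return radius * (1.0 - math.cos(math.pi / segments))

# Doubling the segment count cuts the visible gap by roughly a factor of four.
for segments in (8, 16, 32, 64):
    print(segments, silhouette_error(1.0, segments))
```

Because the gap shrinks with the square of the segment count, a modest amount of extra geometry along an edge goes a long way toward hiding the facets.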
109 Comments
Hrel - Sunday, February 1, 2009 - link
This is one of the most poorly written articles I've ever read on anandtech. It's like the author couldn't organize his thoughts properly. Also, the speculation was riddled with subjective assumptions. I'm not sure if the author just doesn't know this topic very well or if he hadn't slept in 3 days, but this could have been done much better. Great topic though, and interesting subject matter.
GourdFreeMan - Sunday, February 1, 2009 - link
Derek, the DX10 geometry shader was never really intended to do tessellation, and really should not be thought of as a generalized tessellator. It was designed to offer a generalized hardware implementation of vertex effects such as skinning, vertex blending and tweening (see the dolphin demo in the DX SDK for what I am referring to here).
If it becomes desirable at some point in the future to offer fully programmable tessellation, then vertex shader, hull shader, tessellator, domain shader and geometry shader could all be merged into another compute shader earlier in the pipeline to do generalized vertex manipulation.
Of course, it is also possible that the existing tessellator will prove more efficient as fixed function hardware, and only minor functionality improvements will be added.
eXistenZ - Sunday, February 1, 2009 - link
Hello
I just wanted to add that ATi graphics cards have had a tessellator included since the Radeon 8500, but I could be wrong...
I remember the "Truform" technology, which worked in Serious Sam, Return to Castle Wolfenstein, and Counter Strike 1.6 (it no longer works in Counter Strike).
I want to know if the author of this article forgot about it, or if I'm wrong about this technology.
Sorry for my English, I'm from Slovakia :)
haukionkannel - Sunday, February 1, 2009 - link
There has been a tessellation unit in ATI cards for some time. It's not the same as what is required in DX11, but quite near. I think that it was mentioned in the article...
From what I know, DX10 has been slow because in most games it's just dx9 with some dx10 features glued on top. With a pure dx10 codepath it would have been faster, but that would have left all those XP customers out, and would not have been sound economically...
The author hopes that Win7 will encourage the transfer from XP, so there will be a larger number of DX10 and DX11 platforms. So it would become economically possible to make DX11 based games (just leaving out some pure DX11 features so that older dx10 cards could handle the games). So when dx11 games come out, they would actually be the first to make use of all dx10 features...
Well, there are so many dx9 machines in the world that even that will take time. So we will see poor dx10 and dx11 performance until the XP customers are no longer needed by game companies, and even then there are those pure console transfers without any optimization, like GTA...
I hope that "Shattered Horizon" from Futuremark shows what DX10 can really do. It is going to be a pure DX10 game, so it can use the advantages that dx10 can offer. On the other hand, it could be the next Crysis, which looks really good but makes your hardware moan for more power. We will see...
yyrkoon - Sunday, February 1, 2009 - link
"On the flipside, DX11 will be able to run on down level hardware."
Um . . . Eh ? English ?
"This may not significantly speed up the graphics subsystem (especially if we are already very GPU limited), but this does increase the ability to more easily explicitly massively thread a game and take advantage of the increasing number of CPU cores on the desktop. "
... and significantly slow things down even further.
" These code resources are huge and can be hard to manage without OOP (Object Oriented Programming) constructs. But there are some differences to how things work in other OOP languages. "
I think you would find many experienced programmers who would say that OOP is a way of programming, not necessarily a language type, and I would have to agree with them. Now if you mean languages that *support* OOP, then sure, I can live with that.
Also, one other minor thing that kind of bothers me. You speak of Directx 6, but was Directx 6 an actual redistributable ? I definitely do not remember it, but I *do* remember Directx 5, Directx 7, 8, . . . and even that thing MS claims never existed . . . WinG.
DerekWilson - Friday, February 6, 2009 - link
down level hardware == hardware that meets a lower DX spec (like DX10 hardware).
allowing games to be more multithreaded using a fine grained synchronization scheme ala DX11 should not slow things down if developers take advantage of it correctly (which will be much easier than doing your own management here).
yes i did mean languages that support the OOP model.
DX6 was a Win98 thing ... it existed and actually was (iirc) the first version of DX to be hardware accelerated ... at least that's how I remember it.
DX4, on the other hand, never existed -- MS skipped from DX3 to DX5.
frozentundra123456 - Sunday, February 1, 2009 - link
I was initially unhappy with both Vista and DX10. However, I have come to accept Vista, but don't know if it is that much improved over WinXP. I only have Vista because I bought a new computer with that OS installed. I don't really know of anything I do with Vista that could not be done with XP. The only advantage to Vista is that it is supposedly more secure than XP, but I never had any major security problems with XP, nor have I had any with Vista.
DX10 is still more of a disappointment to me. It requires too many resources and does not seem to offer corresponding improvements in visual quality. Nearly every game I have that is DX10 compliant, I run in DX9 mode because the performance improvement in DX9 more than makes up for the slight visual improvement with DX10. (Yes, I know I need a better graphics card.) I have an HD2600 pro, which was supposedly a "mid range" DX10 card when it came out, and it is virtually worthless for trying to play in DX10 mode, as I stated above.
I wonder if DX9 will still be supported when DX11 comes out. If not, they had better make DX11 run better on low to midrange hardware than DX10, or there will be a lot of unhappy users.
epyon96 - Saturday, January 31, 2009 - link
Since Derek claims that Direct X 11 is simply a superset of Dx10, why doesn't Microsoft simply release it as 10.2 instead? I am curious what makes a Direct X version and what determines an incremental move forward.
ltcommanderdata - Saturday, January 31, 2009 - link
I'd like to know that too. To me, DX9.0c (SM3.0) seems to have been a pretty major step forward from DX9.0 (SM2.0), even a whole new shader model, yet it was only given a letter subscript. It should have at least been DX9.1.
My cynical view? It's all marketing and Microsoft appeasing hardware vendors for their own benefit. For example, DX8.1 was supposed to be a decent step forward, going from SM1.1 to SM1.4 with longer shaders and other features. Yet nVidia refused to support SM1.4 and managed to convince Microsoft to call SM1.3 DX8.1 compliant even though it's closer to SM1.1 than SM1.4. My suspicion is that Microsoft agreed with nVidia because at that time nVidia was making the GPU for the Xbox and Microsoft needed them.
A similar situation occurred with DX9.0c and SM3.0. This time ATI wasn't going to offer immediate support for SM3.0 in their GPUs. So in order for ATI's X8xx generation to not look so far behind, SM3.0 was only marketed as DX9.0c instead of DX9.1 or something more major. Why would Microsoft appease ATI? Conveniently, ATI was making the GPU for Microsoft's next-gen Xbox 360, so Microsoft needed them.
This might not actually be true, but it's interesting that the swings in Xbox GPU choice correspond with Microsoft's degree of emphasis on DirectX capability.
In the case of DX11, I think there are sufficient new capabilities with Tessellation and Compute Shaders to justify a major number increase. I believe what Derek means is that DX11 is a superset of DX10 in the same way DX9 is a superset of DX8. They both offer backwards compatibility. In contrast, DX10 is not compatible with DX9, and Vista actually has separate DX10 and DX9 APIs (and a third, Vista-specific DX9.0L), while DX8, DX7, etc can run on the DX9 API.
GourdFreeMan - Sunday, February 1, 2009 - link
Microsoft originally had some soft guidelines in this respect. Letter releases were to represent minor changes in the API such as the range and precision allowed for constants, max number of loop iterations in pixel and vertex shaders, etc. Point releases would permit added functionality to stages of the rendering pipeline. Version releases could include changes to the rendering pipeline itself. In practice, point and letter releases have been to support vendor-specific functionality, and version releases have set a baseline for all vendors.
Microsoft's guidelines fit all DirectX changes except 9.0c, which was really a vendor-specific change to fit the nVIDIA 6000 series hardware. (ATi did not have SM3.0 cards until its next hardware generation).