Drilling Down: DX11 And The Multi-Threaded Game Engine
Despite the fact that multi-threaded programming has been around for decades, mainstream programmers didn't start focusing on parallelism until multi-core CPUs came along. Much general-purpose code is straightforward to write as a single thread; extracting performance through parallel programming can be difficult and isn't always obvious. Even with talented programmers, Amdahl's Law is a bitch: the speedup you get from parallelization is limited by the fraction of your code that is necessarily sequential.
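As a quick refresher (this is just the standard formula, not anything specific to DX11): with a parallelizable fraction P of the work and N cores, Amdahl's Law caps the speedup at

Speedup(N) = 1 / ((1 - P) + P / N)

So if 90 percent of a frame's work can run in parallel (P = 0.9), eight cores buy you only about a 4.7x speedup, and even an infinite number of cores can never do better than 10x.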
Currently, in game development, rendering is one of those "necessarily" sequential tasks. DirectX 10 isn't set up to appropriately handle multiple threads all throwing commands at the GPU. That doesn't mean parallelization of renderers can't happen, but it does limit the potential speedup, because costly synchronization techniques or management threads need to be implemented to make sure nothing steps out of line. All this limits the benefit of parallelization and discourages programmers from trying too hard. After all, it's a better idea to put your effort into areas where performance can be improved more significantly. (John Carmack put it really well once, but I can't remember the quote... and I'm doing too much benchmarking to go look for it now. :-P)
No matter what anyone does, some stuff in the renderer will need to be sequential. Shader programs, textures, and other resources must be loaded before they are used; geometry processing happens before pixel processing; draw calls intended to execute while a certain state is active must have that state set first and left unchanged until they complete. Even on such a massively parallel machine, order must be maintained for many things. But order doesn't always matter.
By making more of the API thread-safe through an extended device interface with multiple contexts, and by making much of the synchronization overhead the responsibility of the API and/or the graphics driver, Microsoft has enabled game developers to more easily thread not only their rendering code but their game code as well. These features will also work on DX10 hardware running on a system with DX11, though some missing hardware optimizations will reduce the performance benefit. But the fundamental ability to write code differently will go a long way toward getting programmers more used to, and better at, parallelization. Let's take a look at the tools DX11 provides to accomplish this.
First up is free-threaded asynchronous resource loading. That's a bit of a mouthful, but this feature gives developers the ability to upload shader programs, textures, state objects, and other resources in a thread-safe way and, if desired, concurrently with the rendering process. This doesn't mean all of this data gets pushed to the GPU in parallel with rendering (the driver still manages what gets sent to the GPU and when, based on priority), but it does mean the developer no longer has to think about synchronizing or manually prioritizing resource loading. Multiple threads can start loading whatever resources they need whenever they need them. The fact that this can be done concurrently with rendering could improve performance for games that stream in data for massive open worlds, in addition to enabling multi-threaded opportunities.
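To make this concrete, here is a minimal sketch of a worker thread creating a texture through the ID3D11Device interface while the render thread keeps drawing. The loading helper and texture parameters are made up for illustration; the relevant point is that D3D11 resource creation calls are free-threaded, so no application-side lock is needed around CreateTexture2D.

```cpp
#include <windows.h>
#include <d3d11.h>
#include <vector>

// Stand-in for real disk I/O: returns a solid-color 1024x1024 RGBA image.
static std::vector<unsigned char> LoadPixelsFromDisk(const char* /*path*/)
{
    return std::vector<unsigned char>(1024 * 1024 * 4, 0xFF);
}

// Intended to run on a worker thread. In D3D11, resource creation through
// ID3D11Device is free-threaded, so this can execute concurrently with the
// render thread's draw calls on the immediate context.
void LoadTextureAsync(ID3D11Device* device, ID3D11Texture2D** outTexture)
{
    std::vector<unsigned char> pixels = LoadPixelsFromDisk("rock_albedo.raw");

    D3D11_TEXTURE2D_DESC desc = {};
    desc.Width            = 1024;
    desc.Height           = 1024;
    desc.MipLevels        = 1;
    desc.ArraySize        = 1;
    desc.Format           = DXGI_FORMAT_R8G8B8A8_UNORM;
    desc.SampleDesc.Count = 1;
    desc.Usage            = D3D11_USAGE_IMMUTABLE;
    desc.BindFlags        = D3D11_BIND_SHADER_RESOURCE;

    D3D11_SUBRESOURCE_DATA initData = {};
    initData.pSysMem     = pixels.data();
    initData.SysMemPitch = desc.Width * 4; // 4 bytes per RGBA8 texel

    // Safe to call from any thread; the runtime and driver handle the locking.
    device->CreateTexture2D(&desc, &initData, outTexture);
}
```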
In order to enable this and other threading, the D3D device interface is now split into three separate interfaces: the Device, the Immediate Context, and the Deferred Context. Resource creation is done through the Device. The Immediate Context is the interface for setting device state, issuing draw calls, and performing queries. There can only be one Device and one Immediate Context. The Deferred Context is another interface for state and draw calls, but many of them can exist in one program, and they can be used as per-thread interfaces (Deferred Contexts themselves are not thread-safe, though). Deferred Contexts and free-threaded resource creation through the Device are where DX11 gets its multi-threaded benefit.
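In code, that split looks roughly like the sketch below (error handling omitted; creating one deferred context per worker thread is an application-level choice, not something the API mandates):

```cpp
#include <windows.h>
#include <d3d11.h>
#pragma comment(lib, "d3d11.lib")

ID3D11Device*        g_device       = nullptr; // resource creation (free-threaded)
ID3D11DeviceContext* g_immediateCtx = nullptr; // the one path that actually reaches the GPU
ID3D11DeviceContext* g_deferredCtx  = nullptr; // per-thread recording context

void CreateDeviceAndContexts()
{
    // One Device and one Immediate Context per application.
    D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, 0,
                      nullptr, 0, D3D11_SDK_VERSION,
                      &g_device, nullptr, &g_immediateCtx);

    // Any number of Deferred Contexts; typically one per worker thread,
    // since a single Deferred Context is not itself thread-safe.
    g_device->CreateDeferredContext(0, &g_deferredCtx);
}
```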
Multiple threads submit state and draw calls to their Deferred Context, which compiles a display list that is eventually executed by the Immediate Context. Games will still need a render thread, and this thread will use the Immediate Context to execute state and draw calls and to consume the display lists generated by Deferred Contexts. In this way, the ultimate destination of all state and draw calls is the Immediate Context, but fine-grained synchronization is handled by the API and the display driver so that parallel threads can better contribute to the rendering process. Limitations on Deferred Contexts include the fact that they cannot query the device and cannot download or read back anything from the GPU. Deferred Contexts can, however, consume the display lists generated by other Deferred Contexts.
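As a rough sketch of that flow (the actual draw recording is stubbed out; in the D3D11 API the "display list" is exposed as an ID3D11CommandList):

```cpp
#include <windows.h>
#include <d3d11.h>

// Worker thread: record state and draw calls on its own Deferred Context.
ID3D11CommandList* RecordScenePortion(ID3D11DeviceContext* deferredCtx)
{
    // ... set shaders, bind buffers, and issue Draw calls on deferredCtx here ...

    // Bake everything recorded so far into a command list (the "display list").
    ID3D11CommandList* commandList = nullptr;
    deferredCtx->FinishCommandList(FALSE, &commandList);
    return commandList;
}

// Render thread: only the Immediate Context actually talks to the GPU,
// so it consumes the lists the worker threads produced.
void SubmitScenePortion(ID3D11DeviceContext* immediateCtx, ID3D11CommandList* commandList)
{
    immediateCtx->ExecuteCommandList(commandList, FALSE);
    commandList->Release();
}
```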
The end result of all this is that the future will be more parallel-friendly. As two- and four-core CPUs become more and more common, and CPUs with 8 and 16 logical cores appear on the horizon, we need all the help we can get when trying to extract performance from parallelism. This is a good move for DirectX, and we hope it will help push game engines to more fully utilize more than two or even four cores when the time comes.
109 Comments
epyon96 - Sunday, February 1, 2009 - link
That's very insightful. Can you go into more detail? I am confused because there appeared to be significant differences between Dx9C and Dx9B, since NVidia made it sound like the difference was like the difference between Dx8.1/2/3 and Dx8.4, which did seem very significant if memory serves me right.
The difference between 8.4 and 9 seemed minimal in quality of the final output.
GourdFreeMan - Monday, February 2, 2009 - link
The guidelines I spoke of were mentioned on the MSDN Forums circa 2003 regarding how changes to Direct3D would affect DirectX versioning, but seem to have been abandoned in favor of the bimonthly SDK updates following the DX 9.0c release. Bimonthly updates led to faster bug fixes, which in prior versions of DirectX sometimes required a letter update. If you are interested in the exact technical changes between DirectX versions, I suggest downloading the old SDK versions prior to the move to bimonthly updates and looking at the Changes section of the documentation.
Regarding the move between DirectX 8 and DirectX 9, Shader Model 2.0 was introduced making way for games such as Far Cry (admittedly Far Cry was a DX 9.0b game, but the changes from 9.0 to 9.0b mainly involved SM 2.0a and SM 2.0b which for Far Cry meant enhanced performance on ATi and nVIDIA cards). Far Cry would later be patched to support DX 9.0c and SM 3.0, adding features like HDR, but I would argue that the unpatched game still looked considerably better than DX8 titles.
(Incidentally there is no DirectX 8.3 and 8.4 -- there was 8.1a and 8.1b in the progression instead).
epyon96 - Saturday, January 31, 2009 - link
I wish the article had more background on what you just hypothesized (obviously with some substantiated facts) instead of the unnecessary Vista bashing. It would satisfy an actual curiosity. I remember that's one of the reasons why the in-depth analysis of the development cycle of the RV770 was so well liked.
gamerk2 - Saturday, January 31, 2009 - link
The issue with DX11 is this: you need to supply a DX10 codepath for those who won't update their graphics cards (you can't release a game no one has hardware for), but you would also need a DX9 codepath for XP. Why would anyone release a game with three separate graphics code paths? It's for that reason I see slow adoption of DX11, as long as XP holds 15-20% market share.
ltcommanderdata - Saturday, January 31, 2009 - link
If I remember those OS market share reports correctly, as of the end of last year Windows XP had about 65% market share, Vista had about 20% after 2 years, and the Mac is nearing 10%. Even if Windows 7 is a roaring success, XP just has too much built-up market share to disappear overnight, so XP and DX9 compatibility will be required for at least another 2 years. The other thing that works against Windows 7 is that even if it isn't released until next year, its introduction looks to land right in the middle of this economic recession, since things probably won't really pick up until late 2010 or 2011. When the economy does pick up again, there will be huge demand as companies finally switch from XP, which would be 10 years old by then, but the first year of Windows 7 sales will probably be slow.
bobvodka - Saturday, January 31, 2009 - link
Well, to be fair, you don't have to have a DX10 path and a DX11 path as such. A few important features work on DX10 cards anyway, such as the multi-threaded rendering stuff, so you need a DX11 and a DX9 path at most; you just have to do some feature detection to find out if you are on a DX11, DX10, or DX10.1 card. Still a slight pain, but not as much as 3 real code paths.
DarkMadMax - Saturday, January 31, 2009 - link
And the main reason is consoles. There are practically no PC exclusives anymore among large-budget titles (e.g. the ones that concentrate on graphics). So all games target Xbox 360 hardware (if they don't, they are PS3 exclusives). So until a new generation of consoles appears there will be no progress in graphics. Period.
haukionkannel - Saturday, January 31, 2009 - link
To me, this article mostly talks about the new features of DX11 and how some fundamental features can also benefit DX10 and DX10.1 hardware... To me it seems the Vista part was only there to explain why there aren't any real DX10 games now, even though the features are there. I didn't read it as Vista hate the way many people here seem to.
All in all it was a very good article about how DX11 can allow the things DX10 promised to flourish better this time.
scruffypup - Saturday, January 31, 2009 - link
Though this article was supposed to be about DirectX 11, Derek's bias and opinion about Vista overshadowed the subject of the article... This article shows poor writing at its finest. After all, doesn't Writing 101 teach one to make the article about the subject you are writing about and not something else?
Again I say... Derek does a disservice to AnandTech with this bias. If you want to put in your bias towards an unrelated subject, at least clearly show the links (relevance) to your intended subject material and how you come to a conclusion to support that claim, rather than just spouting off needlessly... for that is what you have done, essentially, as it held no relevance to the subject material the way you wrote the article.
chizow - Saturday, January 31, 2009 - link
Article summary:
1) DX11 offers nothing new over DX10; as quoted in the article, it's just a strict superset that builds on and adds features to DX10 capability.
2) Vista and DX10 sucked because no one wanted to use them.
Derek, like many others I disagree with your assessment of Vista's importance in the overall OS hierarchy; here's just a quick list:
1) First OS to bring 64-bit support to the mainstream.
2) First OS to offer multi-threaded driver improvements. Look at Rel 180 and 8.12 Hot Fix, where multi-threaded drivers are all the rage.
3) First OS to offer DX10 support. We're finally seeing some of the performance benefits we were promised in DX10 with multi-threaded drivers and improved AA with reading of the multi-sample depth buffer.
4) Much better OS stability compared to XP. It wasn't always the case, but contrary to your article, most of the problems were fixed in July/August with the various video hot fixes (Ryan Smith can probably confirm or deny this).
I think Win7 just emphasizes how good Vista is, and how many light years ahead both are compared to XP. You could say Win7 is like Mojave SE, not Vista SE, as you can clearly see all the Vista-haters who are running Win7 glowing about all the features and stability they've missed out on for at least a year (since Vista SP1).