NVIDIA's Back with NV35 - GeForceFX 5900 Ultra
by Anand Lal Shimpi on May 12, 2003 8:53 AM EST - Posted in GPUs
Stage 2: Vertex Processing
At the front of the 3D pipeline sit what are commonly referred to as one or more vertex engines. These "engines" are essentially collections of pipelined execution units, such as adders and multipliers. The execution units are heavily replicated, with multiple adders, multipliers, and so on, in order to exploit the fact that most of the data they work on is highly parallel in nature.
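To see why those units are replicated, consider the canonical vertex-engine workload: transforming a vertex by a 4x4 matrix. This is a hypothetical sketch, not either vendor's actual datapath; each output component is a 4-wide dot product whose four multiplies (and the adds that follow) are independent, which is exactly the parallelism extra multipliers and adders can absorb.

```python
# Sketch of the core vertex-engine workload: a 4x4 matrix times a 4-vector.
# Each output component is an independent 4-wide multiply-add chain, so a
# vertex engine with four multipliers can start all four products at once.

def transform_vertex(matrix, vertex):
    # matrix: four rows of four floats; vertex: (x, y, z, w)
    return [sum(m * v for m, v in zip(row, vertex)) for row in matrix]

# Identity transform leaves the vertex unchanged.
identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
print(transform_vertex(identity, [1.0, 2.0, 3.0, 1.0]))  # -> [1.0, 2.0, 3.0, 1.0]
```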
The functional units that make up these vertex engines are all 32-bit floating point units, regardless of whether we're talking about an ATI or NVIDIA architecture. In terms of the efficiency of these units, ATI claims that it rarely processes fewer than 4 vertex operations per clock cycle, while NVIDIA says the NV35 executes at least 3 vertex operations per clock, quoting a range of 3 to 4 ops per clock.
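Those per-clock figures translate into raw throughput once you multiply by the core clock. A quick back-of-the-envelope calculation, assuming the GeForce FX 5900 Ultra's 450 MHz core clock (the clock figure is taken from the card's published specification, not from this page):

```python
# Peak vertex-op throughput at NVIDIA's quoted 3-4 ops per clock,
# assuming a 450 MHz core clock for the 5900 Ultra.
core_clock_hz = 450_000_000

for ops_per_clock in (3, 4):
    ops_per_sec = ops_per_clock * core_clock_hz
    print(f"{ops_per_clock} ops/clock -> {ops_per_sec / 1e9} billion vertex ops/sec")
```

At 3 ops per clock that works out to 1.35 billion vertex operations per second, and 1.8 billion at 4 ops per clock.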
It's difficult to figure out why the discrepancy exists without examining both architectures at a very low level, which, as we mentioned at the beginning of this article, is practically impossible since both manufacturers keep those details closely guarded.
An interesting difference between the graphics pipeline and the CPU pipeline is the prevalence of branches in the code each executes. As you will remember from our articles detailing the CPU world, branches occur quite commonly in CPU code (e.g. roughly 20% of all instructions in x86 integer code are branches). A branch is any point in code where a decision must be made, and its outcome determines which instruction executes next. For example, a general branch looks like this:
If "Situation A" then begin executing the following code
Else, if "Situation B" then execute this code
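The pseudocode above can be made concrete with a hypothetical example of where such a branch shows up in vertex work: a per-vertex lighting test. The function name and values here are illustrative only; the point is that the path taken depends on the data, so the hardware cannot know the next instruction until the comparison resolves.

```python
# Hypothetical data-dependent branch in per-vertex lighting: which code
# path runs depends on the dot product of the normal and light vectors,
# a value the hardware only knows at execution time.

def light_vertex(normal_dot_light):
    if normal_dot_light > 0.0:      # "Situation A": surface faces the light
        return normal_dot_light     # lit path
    else:                           # "Situation B": surface faces away
        return 0.0                  # unlit path

print(light_vertex(0.75))   # -> 0.75
print(light_vertex(-0.3))   # -> 0.0
```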
As you can guess, branches are far less common in the graphics world. Extremely complex lighting algorithms, and vertex processing in general, are the code most likely to contain them. Obviously, wherever branches do exist, you want to predict their outcome before they are evaluated in order to avoid costly pipeline stalls. Luckily, thanks to the high-bandwidth memory subsystems GPUs are paired with, as well as the limited role branches play in graphics code to begin with, the branch predictors in these GPUs don't have to be especially accurate. Whereas in the CPU world you need to predict branches with ~95% accuracy, the requirements are nowhere near as stringent in the GPU world. NVIDIA insists that its branch predictor is significantly more accurate and efficient than ATI's; however, it is difficult to back up those claims with hard numbers.
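Since neither vendor discloses its predictor design, here is a textbook two-bit saturating-counter predictor, sketched purely to illustrate the idea: even a simple scheme predicts a mostly-one-sided branch well, and on a GPU the occasional miss costs far less than it would on a deep, latency-sensitive CPU pipeline.

```python
# Textbook two-bit saturating-counter branch predictor (not either
# vendor's actual design). States 0-1 predict not-taken, 2-3 predict
# taken; two wrong outcomes in a row are needed to flip the prediction.

class TwoBitPredictor:
    def __init__(self):
        self.state = 2  # start weakly predicting "taken"

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Saturate the counter toward the observed outcome.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

# A mostly-taken branch with one exceptional not-taken outcome.
predictor = TwoBitPredictor()
outcomes = [True] * 8 + [False] + [True] * 8
hits = 0
for taken in outcomes:
    hits += (predictor.predict() == taken)
    predictor.update(taken)
print(f"{hits}/{len(outcomes)} correct")
```

The single not-taken outcome causes exactly one misprediction, so even this minimal predictor gets 16 of 17 branches right on such one-sided code.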