Digging Deeper: Galloping Horses Example
Rather than pull out a bunch of math and traditional timing diagrams, we've decided to put together a more straightforward presentation. The diagrams we will use show the frames of an actual animation that would be generated over time as well as what would be seen on the monitor for each method. Hopefully this will help illustrate the quantitative and qualitative differences between the approaches.
Our example is a fabricated one (based on an animation courtesy of Wikipedia) of a "game" rendering a horse galloping across the screen. The basics of this timeline are that our game is capable of rendering at 5 times our refresh rate (it can render 5 different frames in the time between one refresh and the next). The perfectly consistent frame rate is not realistic, as in practice some frames will take longer than others. We cut down on these and other variables for simplicity's sake. We'll talk about timing and lag in more detail based on a 60Hz refresh rate and 300 FPS performance, but we didn't want to clutter the diagram too much with times and labels. Obviously this is a theoretical example, but it does a good job of showing the idea of what is happening.
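For those who do want the numbers, the arithmetic behind the timeline can be sketched in a few lines of Python. This is only an illustration of the idealized example (60Hz refresh, a perfectly steady 300 FPS); the constant names are our own and do not come from any graphics API.

```python
from fractions import Fraction

REFRESH_HZ = 60   # monitor refresh rate used in the example
RENDER_FPS = 300  # idealized, perfectly steady render rate

# Exact intervals in milliseconds (Fractions avoid rounding noise).
refresh_interval_ms = Fraction(1000, REFRESH_HZ)  # 16.67 ms between refreshes
frame_time_ms = Fraction(1000, RENDER_FPS)        # 3.33 ms per rendered frame
frames_per_refresh = RENDER_FPS // REFRESH_HZ     # frames finished per refresh

print(f"refresh interval: {float(refresh_interval_ms):.2f} ms")
print(f"frame time:       {float(frame_time_ms):.2f} ms")
print(f"frames rendered per refresh: {frames_per_refresh}")
```

Each step on the 0 to 15 timeline is one frame time, so three refresh intervals cover the whole diagram.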
First up, we'll look at double buffering without vsync. In this case, the buffers are swapped as soon as the game is done drawing a frame. This immediately preempts what is being sent to the display at the time. Here's what it looks like in this case:
Good performance but with quality issues.
The timeline is labeled 0 to 15, and for those keeping count, each step is 3 and 1/3 milliseconds. The timeline for each buffer has a picture on it in the 3.3 ms interval during which a frame is completed, corresponding to the position of the horse and rider at that moment in real time. The large pictures at the bottom of the image represent the image displayed at each vertical refresh on the monitor. The only images we actually see are the frames that get sent to the display. The benefit of all the other frames is to minimize input lag in this case.
We can certainly see, in this extreme case, what bad tearing could look like. For this quick and dirty example, I chose only to composite three frames of animation, but there could be more or fewer tears in reality. The number of different frames drawn to the screen corresponds to the length of time it takes for the graphics hardware to send the frame to the monitor. This will happen in less time than the entire interval between refreshes, but I'm not well versed enough in monitor technology to know how long that is. I sort of threw my dart at about half the interval being spent sending the frame for the purposes of this illustration (and thus parts of three completed frames are displayed). If I had to guess, I think I overestimated the time it takes to send a frame to the display.
For the above, the framerate reported by FRAPS would be 300 FPS, but the number of full images that actually get flashed up on the screen can never exceed the refresh rate (in this example, 60 frames every second). The latency between when a frame is finished rendering and when it starts to appear on screen (this is input latency) is less than 3.3ms.
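The tearing in this diagram can be reproduced with a very rough model. The sketch below assumes, as the illustration does, that sending a frame to the display takes half the refresh interval; that figure is a guess, not a measured value, and the function name is ours. Frame number k simply means "the k-th frame the game has finished", completed at k times the frame time.

```python
import math
from fractions import Fraction

FRAME_TIME = Fraction(1000, 300)       # ms per rendered frame (300 FPS)
REFRESH_INTERVAL = Fraction(1000, 60)  # ms between refreshes (60 Hz)
SCANOUT_TIME = REFRESH_INTERVAL / 2    # ASSUMPTION: half the interval is spent sending the frame

def frames_in_refresh(refresh_index):
    """Frames that contribute part of the image during one refresh, no vsync."""
    start = refresh_index * REFRESH_INTERVAL
    end = start + SCANOUT_TIME
    # Newest frame already finished when scanout begins.
    first = start // FRAME_TIME
    # Every frame finished strictly before scanout ends swaps in mid-scan (a tear).
    last = math.ceil(end / FRAME_TIME) - 1
    return list(range(first, last + 1))

for n in range(1, 4):
    print(f"refresh {n}: parts of frames {frames_in_refresh(n)}")
```

With these assumptions each refresh is stitched together from three frames, which matches the three-way composite in the diagram. A shorter scanout time would mean fewer tears per refresh.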
When we turn on vsync, the tearing goes away, but our real performance goes down and input latency goes up. Here's what we see.
Good quality, but bad performance and input lag.
If we consider each of these diagrams to be systems rendering the exact same thing starting at the exact same time, we can see how far "behind" this rendering is. There is none of the tearing that was evident in our first example, but we pay for that with outdated information. In addition, both the actual framerate and the reported framerate are 60 FPS. The computer ends up doing a lot less work, of course, but that comes at the expense of realized performance, even though we cannot actually see more than the 60 images the monitor displays every second.
Here, the price we pay for eliminating tearing is an increase in latency from a maximum of 3.3ms to a maximum of 13.3ms in this example. In general, with vsync on a 60Hz monitor, the maximum latency between when a frame is finished rendering and when it is displayed is a full 1/60 of a second (16.67ms), but the effective latency that can be incurred will be higher. Since no more drawing can happen after the next frame to be displayed is finished until it is swapped to the front buffer, the real effect of latency when using vsync will be more than a full vertical refresh when rendering takes longer than one refresh to complete.
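To put numbers on this, here is a minimal sketch of double buffering with vsync. It assumes rendering starts right at a refresh and that every frame takes the same time to draw; the function name and return values are ours, purely for illustration.

```python
import math
from fractions import Fraction

REFRESH_INTERVAL = Fraction(1000, 60)  # ms between refreshes (60 Hz)

def vsync_double_buffer(render_time_ms):
    """Return (wait_ms, total_ms, displayed_fps) for a steady render time.

    ASSUMPTION: rendering begins exactly at a refresh, and the game
    stalls after finishing a frame until that frame is swapped.
    """
    render_time = Fraction(render_time_ms)
    # The swap can only happen on the first refresh after the frame is done.
    refreshes_needed = math.ceil(render_time / REFRESH_INTERVAL)
    swap_time = refreshes_needed * REFRESH_INTERVAL
    wait = swap_time - render_time   # finished frame sits idle this long
    fps = 1000 / swap_time           # one new frame per swap
    return wait, swap_time, fps

for label, t in [("300 FPS capable", Fraction(1000, 300)), ("20 ms per frame", 20)]:
    wait, total, fps = vsync_double_buffer(t)
    print(f"{label}: waits {float(wait):.1f} ms, "
          f"{float(total):.1f} ms from start to display, {float(fps):.0f} FPS shown")
```

The first case gives the 13.3ms wait and 60 FPS from the diagram. The second shows what happens once rendering takes longer than one refresh: the frame is more than 33ms old by the time it appears, and the displayed rate drops to 30 FPS.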
Moving on to triple buffering, we can see how it combines the best advantages of the two double buffering approaches.
The best of both worlds.
And here we are. We are back down to a maximum of 3.3ms of input latency, but with no tearing. Our actual performance is back up to 300 FPS, but this may not be reported correctly by a frame counter that only monitors front buffer flips. Again, only 60 frames actually get pasted up to the monitor every second, but in this case, those 60 frames are the most recent frames fully rendered before the next refresh.
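The triple buffering rule is simple enough to sketch as well: the game never stalls, and at each refresh the monitor gets the newest frame that is completely finished. The model below assumes a perfectly steady render rate and ignores everything else; the function name is ours.

```python
from fractions import Fraction

def triple_buffer_schedule(render_fps, refresh_hz, refreshes):
    """List (frame_number, age_ms) shown at each refresh with triple buffering.

    ASSUMPTION: frame k is finished at exactly k * frame_time, and
    rendering never waits on the display.
    """
    frame_time = Fraction(1000, render_fps)
    refresh_interval = Fraction(1000, refresh_hz)
    schedule = []
    for n in range(1, refreshes + 1):
        t = n * refresh_interval
        newest = t // frame_time        # most recent fully rendered frame
        age = t - newest * frame_time   # how stale it is at the swap
        schedule.append((newest, age))
    return schedule

for frame, age in triple_buffer_schedule(300, 60, 3):
    print(f"show frame {frame}, finished {float(age):.2f} ms before the refresh")
```

For the 300 FPS example this shows frames 5, 10 and 15, the same positions that mark each refresh on the timeline. Trying a rate that is not a multiple of 60 (250 FPS, say) gives nonzero ages, but they always stay below one frame time, which is where the "maximum of 3.3ms" comes from.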
While there may be parts of the frames in double buffering without vsync that are "newer" than corresponding parts of the triple buffered frame, the price that is paid for that is potential visual corruption. The real kicker is that, if you don't actually see tearing in the double buffered case, then those partial updates are not different enough from the previous frame(s) to have really mattered visually anyway. In other words, only when you see the tear are you really getting any useful new information. But how useful is that new information if it only comes with tearing?
184 Comments
davidri - Sunday, July 26, 2009 - link
I'll just stick to vsync on with whatever the default frame buffer is in the Nvidia control panel. I get very good performance (most games I play run at 60fps) and no vertical tearing. I don't care to deal with third-party apps.
griffhamlin - Thursday, July 16, 2009 - link
You can force triple buffering in DX. Some apps exist for that. D3DOverrider, for one...
Muhammed - Wednesday, July 8, 2009 - link
I must say, excellent article up to page 2; after that, things started to get REAL messy. I don't consider myself too stupid, nor too much of a genius, but I am confident I am smart, and everything was fine until the second page, where you explained the principles of the idea. I quickly understood it from one concentrated read, but the horses example is simply HORRIBLE. I understand you didn't want to waste 9 pages on a simple thing like V-Sync, so you wrapped up the concept quickly, but this has left us readers really confused.
Firstly, you started slow (page 2), elaborating on every little detail, then you provided an example that should have made the picture even clearer, but on the contrary, you put a lot of possibilities and new concepts into this example, and you successfully made it MORE COMPLEX instead of SIMPLER.
Secondly, horrible elaboration in the example made it even more convoluted; adding the complexity into the equation = HORRIBLE example.
I am waiting for a follow-up article, even one with 18 pages. I will read them all, every last letter, for this is the price of knowledge. Just remember: SIMPLIFY and ELABORATE.
Thank you for your understanding.
quarup - Tuesday, July 7, 2009 - link
The following seems confusing; could you please clarify or reword it:
"In double buffering, this happens with every frame even if the next frames done after the monitor is finished receiving and drawing the current frame (meaning that it might not be displayed at all if another frame is completed before the next refresh)."
It sounds like it says two contradicting things about double-buffering + vsync:
1. a swap buffer happens once per frame
2. a frame might be skipped if we're rendering frames too fast (this sounds more like triple buffering?)
Also:
"With triple buffering, front buffer swaps only happen at most once per vsync."
Isn't this true with double buffer + vsync, too?
pakotlar - Sunday, July 5, 2009 - link
t.buffering is great, but tighter integration between the abstraction layer and developer tools, along with general programming protocols on GPUs, should (maybe with the use of a dynamic LOD system like SPVO) kill the need for t.buffering or vsync. There has to be a better solution in place today for homogeneous hardware.
happymanz - Saturday, July 4, 2009 - link
Hi, I mostly play games at 1024x768@120Hz if they are non-competitive, and 800x600\640x480@160Hz if they are competitive. (I am not able to notice any tearing at 160Hz.)
What settings are recommended for gamers using CRT or 120Hz LCD monitors? (Most people will not notice any tearing even at 120Hz.)
A lot of older (and still popular) games run various versions of the Quake engine, where physics are affected by the FPS. (I'm no expert on the matter.)
(I have yet to see any LCD monitor getting close in terms of image quality, and so far it seems you can't have your cake and eat it as well when it comes to different types of panels.)
urebelscum - Thursday, July 2, 2009 - link
Nice article; I loved seeing the example. However, from reading all the comments, I think a follow-up article is needed. First, another example is needed: when rendering is slower than monitor refresh. I thought I pictured what would happen, but now I'm not sure. Maybe another example, covering what happens when a game drops below the 60 fps threshold, but if the other example is clear, maybe not. The rest of the follow-up should include more info: basically add render ahead to the first example, and a list of which games use true triple buffering, and which use misnamed render ahead. The last is why I still don't use "triple buffering" all the time. It seems most games I play are calling render ahead by the wrong name, so I leave the wrongly called "triple buffering" "disabled".
Two things that probably are beyond the scope are: how to tell if a game is using true triple buffering or if it's using render ahead, and what devs need to do to use true triple buffering. (I'm following a couple open source games that say they support triple buffering, but might be using render ahead.)
DerekWilson - Thursday, July 2, 2009 - link
Thanks for the feedback. I'm looking into the possibility of a follow up and appreciate your suggestions of things to look into.
castanza - Tuesday, June 30, 2009 - link
Enabling triple buffering whenever possible is not the right idea.
Again, the choices are:
1) double buffer w/o vsync
2) double buffer w/ vsync
3) triple buffer w/ vsync
Suppose your screen refreshes @ 60 Hz (pretty common now).
The key question is: can your machine pump out 60 fps consistently for the game in question?
If it can, then you probably want double buffer w/ vsync. I enabled this setting in L4D and I can enjoy NO TEARING with minimum lag because L4D gives very nice frame rates on my machine, not often dipping below 60 fps.
If it can't, then your choice depends on how well you can tolerate lag in this particular game. In some games you may not notice it, in others you will. I find it less noticeable in driving games than fps games for example. I tried this setting in L4D, and I found the added lag unacceptable. Anyway if you don't mind a bit of extra input lag in this particular game, then you want triple buffer w/ vsync. Otherwise, if getting rid of lag is more important than eliminating tearing, you'll choose double buffer w/o vsync.
That's my $0.02 on this subject :)
Hrel - Monday, June 29, 2009 - link
Sounds to me like the best option would be a triple buffer with a rendering queue, coupled with a rendered frames management feature. From this, I THINK that to get the best image a COMPLETED frame should be shown every single refresh of the monitor.
And in order to reduce lag as much as possible, the most recent fully rendered frame should be put out to the monitor, and all the older frames should be thrown out. Which means skipping could occur, but with proper management it should be so minor that we really won't notice.
Especially as monitor refresh rates go up to 120Hz and beyond.
Comments anyone???