Wrapping It Up
So there you have it. Triple buffering gives you all the benefits of double buffering without vsync, plus all the benefits of enabling vsync. We get smooth, full frames with no tearing. These frames are swapped to the front buffer only on refresh, but at the start of output to the monitor they have just as little input lag as double buffering with no vsync. Even though "performance" doesn't always get reported correctly with triple buffering, the graphics hardware is working just as hard as it does with double buffering and no vsync, and the end user gets all the benefit without the potential downside. Triple buffering does take up some extra memory on the graphics hardware (one additional frame buffer), but on modern hardware this is not a significant issue.
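To put a rough number on that memory cost, here is a minimal sketch. The function name, the 1920x1200 resolution, and the 4 bytes per pixel (32-bit color) are illustrative assumptions, and the estimate ignores multisampling, padding, and any driver overhead.

```python
def framebuffer_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Approximate size of one color buffer (hypothetical helper).

    Assumes a plain 32-bit color buffer with no multisampling or padding.
    """
    return width * height * bytes_per_pixel


# Cost of the one extra buffer triple buffering adds at an assumed 1920x1200.
extra_bytes = framebuffer_bytes(1920, 1200)
print(f"{extra_bytes} bytes, about {extra_bytes / 2**20:.1f} MiB")
```

Under these assumptions the third buffer costs a little under 9 MiB, which is small next to the memory on current graphics cards.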
Just to recap, here is how the three frames we looked at rendering in our previous example stack up side by side.
[Images: Triple Buffering | Double Buffering | Double Buffering with vsync]
We've presented the qualitative argument and the quantitative argument in support of triple buffering. So, now the question is: does this data change things? Are people going to start looking for that triple buffering option more now than without this information? Let's find out.
The major difference in the technique we've described here is the ability to drop frames when they are outdated. Render ahead forces older frames to be displayed. Queues can improve smoothness and reduce stuttering, as a few really quick frames followed by a slow frame end up being evened out and spread over more refreshes. But the price you pay is in lag (the more frames in the queue, the longer it takes to empty the queue and the older the frames are that are displayed).
In order to maintain smoothness and reduce lag, it is possible to hold on to a limited number of frames in case they are needed but to drop them if they are not (if they get too old). This requires a little more intelligent management of already rendered frames and goes a bit beyond the scope of this article.
Some game developers implement a short render ahead queue and call it triple buffering (because it uses three total buffers). They certainly cannot be faulted for this, as there has been a lot of confusion on the subject and under certain circumstances this setup will perform the same as triple buffering as we have described it (but definitely not when framerate is higher than refresh rate).
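The difference between the two approaches can be sketched with a small simulation. This is a simplified model, not any driver's actual behavior: it assumes every frame takes exactly the same time to render, a 60Hz display, three total buffers, and that "age" means the time between the end of rendering and the refresh that first shows the frame. All function and parameter names are hypothetical.

```python
import math
from collections import deque
from statistics import mean


def page_flip_ages(frame_time_ms, refresh_hz=60.0, refreshes=200):
    """Age (ms) of each newly shown frame with page-flipping triple buffering.

    The GPU never stalls: frames finish at frame_time_ms, 2 * frame_time_ms,
    and so on, and each refresh shows the newest finished frame. Older
    finished frames are simply dropped.
    """
    interval = 1000.0 / refresh_hz
    ages, last_shown = [], 0.0
    for k in range(1, refreshes + 1):
        now = k * interval
        newest_done = math.floor(now / frame_time_ms) * frame_time_ms
        if newest_done > last_shown:  # a newer frame exists, so flip to it
            ages.append(now - newest_done)
            last_shown = newest_done
    return ages


def render_ahead_ages(frame_time_ms, refresh_hz=60.0, queue_depth=2, refreshes=200):
    """Age (ms) of each newly shown frame with a FIFO render-ahead queue.

    queue_depth=2 back buffers plus the front buffer gives three buffers.
    The GPU stalls when the queue is full, and no frame is ever dropped.
    """
    interval = 1000.0 / refresh_hz
    queue = deque()                   # finish times of frames waiting to be shown
    in_progress_done = frame_time_ms  # finish time of the frame being rendered
    ages = []
    for k in range(1, refreshes + 1):
        now = k * interval
        # Complete every frame that finishes before this refresh.
        while in_progress_done is not None and in_progress_done <= now:
            queue.append(in_progress_done)
            if len(queue) < queue_depth:
                in_progress_done += frame_time_ms  # a buffer is free, keep going
            else:
                in_progress_done = None            # queue full, GPU stalls
        if queue:
            ages.append(now - queue.popleft())     # always show the oldest frame
            if in_progress_done is None:
                in_progress_done = now + frame_time_ms  # buffer freed, resume
    return ages


for frame_time in (20.0, 5.0):  # 50 FPS and 200 FPS on a 60Hz display
    print(frame_time,
          round(mean(page_flip_ages(frame_time)), 2),
          round(mean(render_ahead_ages(frame_time)), 2))
```

In this model the two methods produce the same frame ages when the frame time is longer than the refresh interval, since the queue never holds more than one frame. When frames render faster than the display refreshes, the queue fills and every displayed frame is more than one refresh old, while page flipping always shows a frame that finished less than one frame time earlier.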
Both techniques allow the graphics card to continue doing work while waiting for a vertical refresh when one frame is already completed. With double buffering (and no render queue) and vertical sync enabled, nothing else can be rendered once a frame is completed until the next refresh, which can cause stalling and degrade actual performance.
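The cost of that stall is easy to work out. The following is a minimal sketch assuming a constant frame time and a renderer that waits for the next vertical refresh after every frame; the function name and its parameters are hypothetical.

```python
import math


def vsync_double_buffer_fps(frame_time_ms: float, refresh_hz: float = 60.0) -> float:
    """Effective frame rate with double buffering and vsync enabled.

    Assumes every frame takes exactly frame_time_ms to render and that
    rendering of the next frame cannot start until the swap at a refresh.
    """
    refresh_interval_ms = 1000.0 / refresh_hz
    # A finished frame waits for the next refresh, so each frame occupies
    # a whole number of refresh intervals.
    intervals_per_frame = max(1, math.ceil(frame_time_ms / refresh_interval_ms))
    return refresh_hz / intervals_per_frame


for frame_time in (10.0, 16.0, 17.0, 34.0):
    print(frame_time, vsync_double_buffer_fps(frame_time))
```

A frame time of 17 ms just misses the 16.67 ms refresh interval at 60Hz, so each frame occupies two intervals and the frame rate falls to 30 FPS, even though the hardware could otherwise deliver almost 59 FPS.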
When vsync is not enabled, nothing more than double buffering is needed for performance, but a render queue can still be used to smooth framerate, at the cost of keeping a few old frames around. This can keep instantaneous framerate from dipping in some cases, but will (even with double buffering and vsync disabled) add lag and input latency. Even without vsync, render ahead is required for multi-GPU systems to work efficiently.
So, this article is as much for gamers as it is for developers. If you are implementing render ahead (aka a flip queue), please don't call it "triple buffering," as that should be reserved for the technique we've described here in order to cut down on the confusion. There are games out there that list triple buffering as an option when the technique used is actually a short render queue. We do realize that this can cause confusion, and we very much hope that this article and discussion help to alleviate this problem.
184 Comments
Schmide - Friday, June 26, 2009 - link
Nice analysis.
"In my explanation I'm going to refer to any buffer swapping as copying from one buffer to another; how it is implemented by the hardware is irrelevant."
I think you have to elaborate on what's a copy and what's a swap as they are very different operations. Copying locks both surfaces preventing the use of both and takes processing power, while swapping locks nothing and takes no processing power.
In your explanation, you would have very different results if it was a copy or a swap.
PrinceGaz - Friday, June 26, 2009 - link
I understand if the card is actually copying blocks of memory between fixed buffers that it would have a performance impact. Whether it is copying or swapping is irrelevant as to what I was trying to explain, which is why I said how it is implemented by the hardware isn't relevant. It's best just to assume that in reality it is swapping buffers, which is equivalent to an instantaneous copy/move.
The only thing is I've now realised there is a third way triple-buffering could work, which is roughly halfway between the two methods I proposed; if both B and C have been filled at the vertical-refresh meaning the card has been stalled, B is discarded, C is moved to A to be displayed, and the card now starts rendering to B, then to C. Again that makes no difference when the framerate is below the refresh-rate but it allows the card to render at up to double the refresh-rate to reduce the lag to a minimum of one instead of two refreshes when conditions allow.
Schmide - Friday, June 26, 2009 - link
Ok I get it.
3 buffers A, B and C all fully renderable. The rendering goes as so:
A is rendered and queued to become the primary on the refresh after its completion.
B begins rendering right after A is completed then queued to become primary the refresh after A or the next refresh after B completes.
C begins rendering after B finishes rendering and queues as B was queued to A.
repeat.
The advantage being, when vsync is on, instead of starting the rendering of the next surface after the swap, you start rendering immediately on the 3rd surface rather than wait for the current primary to become available after the swap.
Makes sense.
DerekWilson - Saturday, June 27, 2009 - link
This is a good explanation of render ahead ... it's different than using triple buffering for page flipping.
PrinceGaz - Friday, June 26, 2009 - link
ooops, made a slight error in my lag calculations when framerate can exceed refresh-rate.
Two frames ago (my method) is correct - 0.033 seconds always because that is how long two refreshes take at 60Hz.
Your method (constant back-buffer updating) will be between one and two frames ago, not zero to one frames like I said, so at 100fps the lag would actually be between 0.010 and 0.020 seconds, not 0 and 0.010 seconds like I said earlier.
Sorry about that.
DerekWilson - Friday, June 26, 2009 - link
Actually, this is precisely why I wanted to write this article.
The technique you describe here:
"The way I understand triple-buffering works is that once B and C both have frames rendered to them waiting to be displayed, the graphics-card then pauses until the vertical-refresh, at which point B is copied to A to be displayed, C is moved to B, and the card is free to start work on rendering a new frame to fill the now empty C. No frames are thrown away, and the card is not constantly churning out frames which won't be displayed."
While it uses 3 buffers, is actually called render ahead. I'm talking about a page flipping method (which can actually be combined with render-ahead, but that's beyond the scope of this piece).
The benefit of the DirectX render ahead approach (which can be up to 8 frames iirc), is that it incurs a higher potential latency in order to achieve much smoother action.
If my framerate fluctuates a lot between high and low rates, it's possible that the game will feel "jerky." But using render ahead, I can essentially cache up to X (the default is 3) quickly rendered frames so that the next long frametime I hit doesn't cause the monitor to keep showing the exact same image for multiple frames while it waits -- instead, if i have my 3 frames rendered ahead, if the frametime of my next frame is anything up to 50ms, my rendered ahead frames can be spat out, one after the other, sequentially until my 4th frame is ready. if frame rate goes back up after this, then i've successfully smoothed things out.
the price is, of course, that high framerate means we will see lag because no frames are dropped.
but this is not triple buffering.
DirectX does not support actual triple buffering out of the box -- it has to be programmed by the developer. OpenGL does actually support triple buffering inherently as I described.
I promise that the description I gave in the article is what triple buffering is supposed to be. calling render ahead triple buffering because the default uses 3 buffers has definitely caused a lot of confusion (especially when the wikipedia page cites this as a reason that render ahead is "an implementation" of triple buffering ... which is disingenuous in my mind).
we don't, after all, call anything that uses 2 buffers double buffering -- like if I use 2 MRTs while I'm rendering something, am I double buffering then? in the same sense that 3 frame render ahead is triple buffering then sure ... but not if we are talking about page flipping.
... so ...
also, the lag between 0 and 1 frames that i was talking about is lag between the END of rendering the frame and when it is displayed. If frametime is taken into account, the input that generated that frame will have happened, at a maximum, (frametime + 16.67ms) ago ... so it could be longer ago than just one frame but not /because/ of triple buffering.
GreyMulkin - Friday, June 26, 2009 - link
VSYNC for life!
This article has convinced me that I only want to use double buffering + vsync. I despise tearing - your example of "only seeing it on fast turns" is disingenuous. When vsync is off, even games with simple graphics tear, everywhere. I find it distracting and undesirable. But I also don't want the input lag associated with being 1 more frame away from the action.
"While enabling vsync does fix tearing, it also sets the internal framerate of the game to, at most, the refresh rate of the monitor (typically 60Hz for most LCD panels)."
I see no problem with this. I'd much rather have the game running near a constant framerate than to have it bounce back and forth between high and low performance. Consistency is what I want.
"This can hurt performance even if the game doesn't run at 60 frames per second as there will still be artificial delays added to effect synchronization. Performance can be cut nearly in half in cases where every frame takes just a little longer than 16.67 ms (1/60th of a second). In such a case, frame rate would drop to 30 FPS despite the fact that the game should run at just under 60 FPS."
Well, if your frames are taking longer than 16.67ms to render, then you're not actually rendering 60 fps. Duh! Longer rendering times mean lower framerates. If you don't like the framerate you're seeing, turn down some settings (resolution, quality, AA, AF, etc). Triple buffering won't fix the problem of render times which are too long for the desired framerate.
GreyMulkin - Friday, June 26, 2009 - link
To elaborate on the "while enabling vsync..." thing. Vsync off usually makes the game try to run as fast (high fps) as possible which is generally recognized as a good thing. But because scenes differ in complexity, the frames rendered per second will vary wildly and that will affect input processing which is usually tied directly to fps. So what I don't want is the *feel* of the game, the smoothness of mouse movements, etc, to be affected.
MadMan007 - Friday, June 26, 2009 - link
Depends upon the game. For online competitive FPS I disable vsync, for single player games or less twitchy ones I enable it. There's no poll option for this.
TonyB - Friday, June 26, 2009 - link
I still use my 21" CRT, i run at 100Hz refresh rate with vsync on w/ triple buffering.
at 100Hz, your input lag is only 10ms (1/100) compared to 16.6ms (1/60) not to mention i'm capped at 100 fps instead of 60fps.
this is why i'm still using CRT instead of LCD.