Realworld Testing w/ High Speed Video
Inspired by Mythbusters, we wanted to use a high speed camera to really measure what's happening with millisecond resolution. We were disappointed when we first looked into this, as "real" high speed cameras cost in excess of 10k USD. But then we stumbled upon the Casio Exilim EX-F1 and it's horrific quality but hugely fast video capability (actually, quality isn't that bad in VERY high light situations). At 1200 frames per second, we get video output at a resolution of 336x96, which is freaking tiny. But it's enough. All we need to do is count the frames between two events and multiply it by 0.833 (the number of milliseconds per frame) and we can assess the duration of incredibly short term events.
First up, we're looking at response time for the Dell 3007WFP display. Rather than relying on the manufacturer reported response time (which look at a limited number of cases and likely don't include worst case performance), we're using our camera to watch a few frames go by and observe how long it takes for pixels to transition from one color to the next. As we can see in the video, it takes nearly a full frame (~16ms) for some colors to change (pay attention to the yellow and black stripes with the orange lettering at the bottom of the screen). We timescaled this video down 10x over the already very slow 2.5% realtime speed to 0.25% realtime to help in seeing what's going on.
Getting to the actual game testing, we wanted to look at two titles: one a twitch shooter, and another with notoriously bad input lag. We chose Team Fortress 2 for our twitch shooter and Fallout 3 for our laggy game. We also did further testing on a CRT with TF2 just to see how low we could actually get input latency (with some pretty impressive results). Our test methodology was to set up the camera to capture both input generation (either a mouse click or move) and the display. The resolution of our camera was such that we had to really work to cram everything in the frame, but we got enough to be useful. We ran multiple tests and counted frames between when the hand hit the mouse and when we could see the result on the monitor.
For Team Fortress 2 we looked at three scenarios: no vsync, vsync enabled, and vsync enabled with our flip queue (render ahead / pre-rendered frames) set to zero. Our frametime at 2560x1600 was typically 9ms give or take small amount (but we never dropped below 100 FPS let alone refresh rate) with our GTX 280 remaining the performance bottleneck (CPU time was still significantly less than frametime).
First we'll look at the case with no vsync. Look for when the finger stops moving to when the shotgun blast appears on the wall (Valve makes sure to calculate the hit before it even starts the gun firing animation; sometimes hit and gun fire happen in the same frame, sometimes they are one frame separated). This certainly isn't frame by frame, but you should be able to download the clip from youtube and step through it yourself if you are so inclined.
This test took about 45 frames: between 37ms and 38ms from input generation to display. This is very good considering what we predicted as a best case. Average over multiple runs was a little higher, resulting in 51ms of typical input lag plus or minus about 12ms (our maximum being 63ms). This fluctuation is due to how all the factors we talked about either line up or don't.
When we turn on vsync, we see a lot more delay.
We can see even better how the hit happens before the gunfire animation again. This is very key in any twitch shooter. The lag is significantly longer with vsync enabled. This example shows about 94 or 95 frames (~79ms) input lag, which was our lowest input lag time in this test. Our average was about 89ms ranging up to 96ms at maximum. In this case, we lose the input latency advantage of no vsync but we gain more consistent input lag times and no tearing.
But we also decided to check and make sure that we weren't getting stuck with any penalties due to flip queuing as the average latency increase seemed a little high. So we set maximum pre-rendered frames to zero in the NVIDIA control panel (formerly maximum render ahead) to see what happened. Whenever your framerate is always higher than your refresh rate you never want any flip queueing going on (unless you are using a multiGPU configuration). So let's check it out.
And we see 70ms as our new best. Our worst case is 85ms, which is better than our previous average. Average latency here is 76ms +/- about 6ms (with the one 85ms exception). From this data, it seems as though Valve uses a 1 frame flip queue (1 frame render ahead) when vsync is enabled unless it is forced off in the driver. When the game has framerates of less than the refresh rate then this is fine, but when framerate is always higher than refresh it will absolutely incur a performance penalty along the lines of what we are seeing here.
For those who want vsync in TF2 and have consistently >60fps performance without multiGPU, absolutely set the flip queue to zero in the driver.
Next up is our Fallout 3 test and we'll compare a notoriously laggy game with a notoriously responsive game. Our framerate in this game was consistently between 38 and 45 frames per second during the testing, meaning that frametime will play a bigger role as it will add at something between 22ms and 27ms of latency.
And we have 136ms. And this test was much more consistent with our average latency being 136ms as well. Our low was 130ms and our high was 142ms, giving us the same tight spread of +/- 6ms we saw with TF2 and vsync, only vsync was not enabled here (which suggests some internal rate monitoring to me). Of course, this variability is now a much lower percentage of the overall input lag; not that that offers much consolation.
The number of complaints we've seen on the net about Fallout 3 and input lag (though notably a subjective improvement over Oblivion) combined with our own experience lead us to believe that somewhere between TF2's snappy 89ms worst case input latency (which we still couldn't feel) and Fallout 3's ~50% longer latency we cross the point of perceptibility and distraction. That 100ms number certainly looks like a good point for developers to keep input latency below.
Let's take a look at what happens when we turn on vsync.
Once again, 136ms. Our input lag stats were actually the same. And yes we rechecked to make sure vsync was enabled here and disabled there. Since Oblivion benefited from reducing the maximum number of frames rendered ahead, we tested this out as well. When set to 1 or 0 we saw no change in performance with Fallout 3 at least in this test. It's unclear whether different settings or performance levels would benefit, but for now it seems like Fallout 3 does really well at producing a consistently high input lag.
So that does it for our LCD testing. But we did want to learn whether or not the LCD panel added any extra delay over a CRT display. So we pulled out a Sony GDM F-500 encased in the purple and grey of a Sun monitor from an age gone by. We tested at lower resolutions as the display couldn't muster anything closer to 2560x1600, but at 1600x1200@60Hz our numbers still made sense comparatively: we would expect them to be a little lower but not by much.
So, our average here is about 43ms, which is a lower average than on the Dell LCD. Our minimum is 35ms and max is 52ms. In the best case, 35ms (for the CRT) isn't much better than 37ms (for the LCD) at all. Our worst case in this test was much better (as was our average) than the LCD. But it's clear that the LCD is capable of input latency as low as the test with this CRT. We could (and likely do) have an advantage on the CRT side due to resolution. We do have to consider that framerate was higher for the CRT test. It is likely that this difference reduced our worst case and average performance.
But let's take a look at the CRT with vsync.
In this test our best case is 78ms and worst case is 88ms. The average is 84ms. This is, again, very similar in worst case to the LCD test, but the average case is less better this time around. This makes sense as vsync normalizes performance to 60FPS but there may still be a slight advantage for the lower resolution. It seems as if the Dell 3007WFP isn't significantly disadvantaged when compared to a CRT.
But there is one more advantage over LCD panels: refresh rate. We tested 120Hz, though we did have to lower resolution once more to make it work. This will increase performance on its own, but the results are impressive.
With an average of 25ms input latency, minimum of 17ms and maximum of 29ms, the results here show an incredible impact on input latency when running at a higher framerate and refresh rate. These numbers come in really low. The 1152x864 resolution does provide about 200 FPS which will also have an impact on latency, but the fact that 1600x1200 had a higher framerate than 2560x1600 but showed similar latencies, we suspect that the limit was refresh rate. Although it could be that the higher refresh rate was able to allow the advantages of the higher framerate to shine through.
Either way, this is incredibly low input lag.
85 Comments
View All Comments
PrinceGaz - Friday, July 17, 2009 - link
Just to add"For input lag reduction in the general case, we recommend disabling vsync"
It is rather ironic that you used that phrase, when in the your previous article you were strongly stating the case for v-sync always being used (preferably with triple-buffering).
Unless you are certain that nVidia's OpenGL implementation of triple-buffering works how you think it does (and not how most people think it does), posting articles may be unwise.
DerekWilson - Sunday, July 19, 2009 - link
In the "general case" we mean /when triple buffering is not available/ (real triple buffering is not available in the majority of games), in order to reduce input lag, vsync should be disabled.Here's a better break down of what we recommend:
Above 60FPS:
triple buffering > no vsync > vsync > flip queue (any at all)
Always EXACTLY 60 FPS (very unlikely to be perfectly consistent naturally)
triple buffering == vsync == flip queue (1 frame) > no vsync
Below 60FPS (non-twitch shooter):
triple buffering == flip queue (1 frame) > no vsync > vsync
Below 60FPS (twitch shooter or where lag is a big issue):
no vsync >= triple buffering == flip queue (1 frame) > vsync
... to us the fact that at less than 60 FPS you start on the exact same frame with either triple buffering, no vsync or with a 1 frame flip queue is enough -- you will get less than 1 frame of difference in the lag and the difference you get with no vsync is only on part of the frame anyway.
...
And ...
NVIDIA has absolutely CONFIRMED that they do OpenGL triple buffering in the way that I described it in the previous article. This confirmation is directly from their OGL driver team.
I was just able to sit down with AMD two days ago and, while they don't do things in exactly the way I describe them, their OpenGL triple buffering does do something quite interesting at > 60FPS ... which I'll talk about in an upcoming article. They also said they are looking into what it would take to do what I was talking about -- but that they haven't yet because their workstation customers tend to care more about what happens at < 60fps (which is a case where a 1 frame flip queue == triple buffering).
The situation with DirectX is a little more restrictive with both NVIDIA and AMD... I will also address this issue in an upcoming article.
BoFox - Friday, July 17, 2009 - link
Anandtech does yet another ground-breaking article, thanks to Derek Wilson and the use of a high-speed camera this time!It is rather interesting how you mention the human reaction time! To detect and process the image signal in your brain, and to react to it (usually faster if through reflex), are all part of the entire reaction time of 200-300 ms, correct? Well, we should be testing it on the professional gamers like Jonathan 'Fatal1ty' Wendel, to see how much he has tuned his reaction time in making it so reflexive that it's as low as 150ms or even lower. Why not try interviewing him and use your high-speed camera to test that on him, and on an average gamer?!? That would be a SUPER-awesome article here to read!!! :D
About the rendering ahead, some games are designed to render ahead by as much as 3 frames, and changing that to 0 could incur a large performance hit--sometimes by as much as 50%. I have experienced this with some games a while back, cannot remember which ones exactly, and that was on a single video card (not when I was using SLI). The performance penalty would usually bring down the frame rates to below the refresh rate, thus making it very undesirable. There are some other articles cautioning against it, but as long as it does not make a difference in the game that you are playing, then great. At least, as long as it does not drop the frame rates to below the refresh rate, then great!
DerekWilson - Sunday, July 19, 2009 - link
If your framerate is above refresh rate (except for the multiGPU case) you should definitely not use anything more than 0 frames of render ahead. Triple buffering as I described in my previous article (and as NVIDIA does in OpenGL) will be a good option, but double buffered w/ vsync or no vsync will definitely be better than any sort of render ahead.If your performance is below refresh rate, you'll want to either use no vsync, exactly 1 frame render ahead, or triple buffering. double buffered vsync (0 flip queue/pre-rendered frames with vsync) will give you a significant drop in performance and increase in input lag when performance is below refresh. This is as you say -- by as much as 50%. But only when performance is already below refresh.
nvmarino - Friday, July 17, 2009 - link
Hey Derek, thanks for the great article. Nice job addressing many of the BS theories and comments about lag going on in the threads of your triple buffering article...And now I've got even more fuel for the fire of my lust for someone to release a 1920x1080 resolution display with support for 120Hz refresh rate at the INPUT and good black levels. If only I could set my PC to spit out 120Hz to my display, my 24Fps Blu-ray and 60Fps TV content would play without judder, AND my games would have zero tearing (well as long as they don't run above 120Fps) and minimal lag. Not only that, but I'd also have an ideal setup for shutter-based steroscopic 3d, and it would open the door for some of the new stuff like smoothing video via frame interpolation of 24P content to 120P currently going on in the PC videophile community. Can you say HTPC nirvana! Where's my 120Hz!!!!
DerekWilson - Friday, July 17, 2009 - link
yeah, i want a true 120hz 1920x1080 display too :-) mmmm that would be tasty.Belinda fox - Friday, July 17, 2009 - link
If someones reaction times are up to 220ms and your results are around the 100ms mark surely this says something about the validity of results and conclusion.Sorry i never read all the article just first and last one, due to reaction times being mentioned as part of lag.
BoFox - Friday, July 17, 2009 - link
Just read the article first.If you had better common sense, you would figure out that the human response time of >200ms comes AFTER seeing what is on the monitor, which takes 100ms to display in the first place.
Let's say that I appear just around the corner right in front of you, in a FPS game, like Unreal Tournament 3, for example. First, it takes like 50ms ping time for your computer to receive that information. Second, it takes your monitor 100ms to display the information. Third, it takes you 200ms to react to what you're seeing.
lyeoh - Sunday, July 19, 2009 - link
And AFTER you react to what you are seeing you press a key/button or move your mouse and the relevant portion of the input lag takes into effect (input device, game, network).With a 100+ms input lag handicap even on the LAN you could be "dead" before you even see anything or be able to do anything about it.
So would be good if have some figures on wireless keyboards and mice vs wired (usb and ps2).
I personally think wireless stuff will add a lot more lag due to the modulation/demodulation, and other "tricks" they have to do.
DerekWilson - Friday, July 17, 2009 - link
That is definitely an issue ...i didn't get into network play as it gets really sticky though ... the local client does do limited prediction based on last known info about other players -- if this prediction data is accurate enough at small intervals and if ping isn't bad then it isn't a huge issue ... but at the same time, servers may resolve hits (as apparent to the client) as hits even if they are misses (in reality as per what the player did) or it may not (the difference results in either increased or reduced effectiveness of techniques like bunny hopping). but it is way more complex and depends on how the client predicts things between server updates and how the server resolves differences between clients and all sorts of funky stuff.
that's why i wanted to stick with mouse-to-display issues here ...
but you are right that there are other factors (including human reaction time) that add up to make it so input lag can be an important slice of a whole chain of events and can affect the gaming experience.