CausticOne and the Initial Strategy
Out of the gate, Caustic isn't going after gaming. They aren't even going after the consumer market, which we think is quite a wise move. Ageia showed very clearly how difficult it is to get games to support a hardware feature that is not widely adopted. Game developers only have so many resources, and they can't spend their time on aspects of game programming that completely ignore the vast majority of their users. Thus, the target for the first hardware will be film, video and other offline rendering or simulation areas.
The idea isn't to displace the CPU or the GPU in rendering or even raytracing specifically, but to augment and assist them in an application where rays need to be cast and evaluated.
The CausticOne is a board built using FPGAs (field programmable gate arrays) and 4GB of RAM. Two of the FPGAs (the ones with heatsinks on them) make up SIMD processing units that handle evaluation of rays. We are told that the hardware provides about a 20x speedup over modern CPU based raytracing algorithms. And since this hardware can be combined with CPU based raytracing techniques, this extra speed is added on top of the speed current rendering systems already have. Potentially, we could integrate processing with CausticOne into GPU based raytracing techniques, but this has not yet been achieved. Certainly, if a single PC could make use of CPU, GPU and raytracing processor, we would see some incredible performance.
Caustic Graphics didn't go into much detail on the processor side of things, as they don't want to give away their special sauce. We know it's SIMD, and we know that it is built to handle secondary incoherent rays very efficiently. One of the difficulties in building a fast raytracing engine is that as we look at deeper and deeper bounces of light, we find less coherence between rays that have been traced back from the eye. Essentially, the more bounces we look at, the more likely it is that rays near each other will diverge.
Speeding up raytracing on traditional hardware requires building packets of rays to shoot. In packets with high coherence, we see a lot of speed up because we reuse a lot of the work we do. Caustic Graphics tells us that their hardware makes it possible to shoot single rays without using packets and without hurting performance. Secondary incoherent rays also don't show the same type of performance degradation we see on CPUs and especially GPUs.
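To make the idea of coherence a little more concrete, here is a minimal sketch of one way to put a number on it. This is our own illustration, not anything Caustic has described: the packet_coherence function and the sample ray directions are hypothetical, and the metric (the smallest cosine between any ray in a packet and the packet's average direction) is just one simple choice among many.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def packet_coherence(directions):
    """Smallest cosine between any ray and the packet's mean direction.

    1.0 means every ray is parallel; lower values mean the packet diverges.
    (Hypothetical metric chosen for illustration only.)
    """
    unit = [normalize(d) for d in directions]
    mean = normalize(tuple(sum(d[i] for d in unit) / len(unit) for i in range(3)))
    return min(sum(a * b for a, b in zip(d, mean)) for d in unit)

# Primary rays through four neighbouring pixels: nearly parallel.
primary = [(0.00, 0.00, 1.0), (0.01, 0.00, 1.0), (0.00, 0.01, 1.0), (0.01, 0.01, 1.0)]

# Made-up directions standing in for the same rays after a diffuse bounce.
secondary = [(0.9, 0.1, 0.4), (-0.8, 0.2, 0.5), (0.1, 0.9, 0.3), (-0.2, -0.7, 0.6)]

primary_score = packet_coherence(primary)
secondary_score = packet_coherence(secondary)
print(primary_score, secondary_score)
```

The primary packet scores very close to 1, while the scattered packet scores far lower. A packet tracer gets to share work only in the first case, which is why hardware that doesn't depend on packets is interesting for secondary rays.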
The CausticOne has a huge amount of RAM on board because, unlike with the GPU, the entire scene needs to be fully maintained in the memory of the card in order to maintain performance. Every ray shot needs to be checked against all the geometry in a scene, and then secondary rays shot from the first intersection need to have information about every other object and light source. With massive amounts of RAM and two FPGAs, we know beyond a shadow of a doubt that Caustic Graphics' hardware must be very fast at branching and very adept at computing rays once they've been traced back to an object.
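As a rough illustration of why every ray potentially touches the whole scene, here is a minimal brute-force sketch that tests one ray against every object and keeps the nearest hit. It is our own example and says nothing about how Caustic's hardware works internally: the scene of spheres, the hit_sphere and closest_hit names and the 1e-6 self-intersection epsilon are all assumptions. Production raytracers normally use an acceleration structure rather than testing every object, but the scene data still has to be reachable for any ray.

```python
import math

def hit_sphere(origin, direction, center, radius):
    """Distance along a unit-length ray to its nearest hit on a sphere, or None."""
    oc = tuple(o - c for o, c in zip(origin, center))
    b = sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None  # the ray misses the sphere entirely
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t > 1e-6:  # ignore hits behind or exactly at the ray origin
            return t
    return None

def closest_hit(origin, direction, spheres):
    """Check the ray against every sphere; return (index, distance) of the nearest hit."""
    best = None
    for index, (center, radius) in enumerate(spheres):
        t = hit_sphere(origin, direction, center, radius)
        if t is not None and (best is None or t < best[1]):
            best = (index, t)
    return best

# Hypothetical scene: (center, radius) pairs.
scene = [((0.0, 0.0, 5.0), 1.0), ((0.0, 0.0, 10.0), 2.0), ((3.0, 0.0, 5.0), 1.0)]

hit = closest_hit((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), scene)
miss = closest_hit((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), scene)
print(hit, miss)
```

A secondary ray would start from the hit point and run through the same search again, and because it can head off in any direction, none of the scene can safely be left out of memory.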
Development is ongoing, and the CausticOne is slated to go to developers and those who run render farms. This will not be an end user product, but it will be available to those who could use it now, such as movie studios with render farms or operators of High Performance Computing (HPC, or big iron) systems. Developers of all kinds will also have access to the hardware so they can start developing for it now, before the consumer version hits the streets.
With CausticOne, their business model will be service and support for those who want and need it. Caustic Graphics has extended OpenGL ES 2.0 with GLSL to include support for shooting rays from shaders. They hope that their extensions will eventually become part of OpenGL, which may actually be useful in the future, especially if hybrid rasterizing and raytracing rendering engines start to take off. They went with the ES spec, as it's less encumbered by legacy elements present in OpenGL.
On top of OpenGL ES 2.0 with Caustic's extensions, developers can use the CausticRender package, which is a higher level set of tools built on top of CausticGL (which is what they're calling their extended GL API). This allows developers to either dig down to the low level or start writing engines more intuitively. There are more tools that Caustic is working on, and they do hope to see content creation ISVs and others start building their own tools as well. They want to make it easy for anyone who already has a raytracing engine to port their software to a hardware accelerated version, and they also want people who need to render raytraced scenes to have software that can take advantage of their hardware.
Focusing on developers and render farms first is a great way to go, as it sort of mirrors the way NVIDIA used the HPC space to help get CUDA exposure. There are applications out there that never have enough power, and offering something that can provide an order of magnitude speed up is very attractive in that space. Getting software support in the high end content creation and rendering packages would definitely help get Caustic noticed and possibly enable them to trickle down into more consumer oriented markets. But that brings us to next year and the CausticTwo.
48 Comments
HelToupee - Tuesday, April 21, 2009
Go outside. Look around. Real-time raytracing is here today! The future is now!! :)
MrPickins - Monday, April 20, 2009
The FPGA implementation surprised me as well. It's impressive that they can get such performance out of a pair of them.
SonicIce - Monday, April 20, 2009
I'll give them 12 months...
Harbinger - Monday, April 20, 2009
I'm pretty sure they will succeed. Just make a working prototype and prove to Pixar/Dreamworks/Disney/whatever that this thing will hugely accelerate they're rendering. They don't have to appeal to masses that expect a wide variety of features on a wide variety of platforms and software. They target a very very specific segment and if they can convince that segment they'll gonna be fine.
DerekWilson - Tuesday, April 21, 2009
You are right, except if Larrabee competes with this in terms of speeding up raytracing ... but we'll have to wait and see on that one. If they focus on a niche market, they could succeed.
RamarC - Monday, April 20, 2009
agreed, unless they get a mainstream rendering app to sign on and can get some royalties out of the software end. if not, nvidia will just implement a similar api and they'll promote using quadros as render accelerators.
ssj4Gogeta - Monday, April 20, 2009
Unlike Ageia PhysX, this is not about the API, but the hardware.
smartalco - Monday, April 20, 2009
Except, given that this is /custom hardware/, nvidia can't just role out a CUDA update