The Chip

One benefit of tile-based rendering is its efficiency: less raw power is needed for a fast chip. This is reflected in the KYRO's core design, which houses a relatively small number of transistors, 12 million to be exact, on a 0.25 micron process. That puts the KYRO close in transistor count and manufacturing process to the VSA-100 chip (14 million transistors on an enhanced 6-layer 0.25 micron process) used by 3dfx on the Voodoo4 and Voodoo5. Although its manufacturing process is larger, the KYRO is noticeably smaller than the behemoth GeForce 2 GTS, with its 25 million transistors on a 0.18 micron process. Despite the similar specifications, the KYRO is also much smaller than a single VSA-100.

Unfortunately, this smaller size does not necessarily mean that the KYRO packs the same punch as the GeForce or a Voodoo4/5 in a smaller package. The KYRO lacks hardware T&L, a feature that the video card and gaming industries have only recently embraced and that accounts for a large portion of the GeForce's transistors. Although no game currently on the market uses hardware T&L to any great extent, future games are expected to push polygon counts that only a hardware T&L unit can process. Imagination Technologies / STMicro are betting that such games will not arrive until their next product, the PowerVR Series 4, is released with its own hardware T&L unit. With the KYRO, they are following 3dfx's lead by leaving T&L to the CPU for now, with future products destined to provide it in hardware.

The main advantage of reducing chip size as drastically as Imagination Technologies / STMicro have is that the chip costs significantly less to produce. Fewer transistors mean less silicon per chip, which results in more chips per wafer. More chips per wafer, together with the better yields that smaller dies tend to enjoy, means a lower cost per chip. In addition, for a given manufacturing process, fewer transistors result in lower heat output. Thus, we were not surprised when our reference board arrived at the lab with an extremely small 3.8 x 3.8 cm heatsink, nearly 1 cm smaller on each side than those on reference GeForce cards. The fan/heatsink combination looked more like a fan mounted on a metal plate than a fan actively cooling the fins of a heatsink.
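The cost argument can be sketched with a little arithmetic. The Python below is only an illustration: the wafer diameter, die areas and defect density are hypothetical values chosen for the example, not KYRO or GeForce figures, and the Poisson yield model is a common textbook simplification rather than anything STMicro has described.

```python
import math


def gross_dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Upper bound on dies per wafer: wafer area / die area, ignoring edge loss."""
    wafer_area_mm2 = math.pi * (wafer_diameter_mm / 2) ** 2
    return int(wafer_area_mm2 // die_area_mm2)


def poisson_yield(die_area_mm2: float, defects_per_mm2: float) -> float:
    """Expected fraction of defect-free dies under a simple Poisson defect model."""
    return math.exp(-defects_per_mm2 * die_area_mm2)


def good_dies(wafer_diameter_mm: float, die_area_mm2: float,
              defects_per_mm2: float) -> float:
    """Expected number of working dies from one wafer."""
    return (gross_dies_per_wafer(wafer_diameter_mm, die_area_mm2)
            * poisson_yield(die_area_mm2, defects_per_mm2))


# All values below are hypothetical, for illustration only.
WAFER_MM = 200.0        # assumed wafer diameter
DEFECTS = 0.005         # assumed defects per square millimeter
SMALL_DIE_MM2 = 60.0    # assumed area of a "small" die
LARGE_DIE_MM2 = 120.0   # assumed area of a die twice as large

small = good_dies(WAFER_MM, SMALL_DIE_MM2, DEFECTS)
large = good_dies(WAFER_MM, LARGE_DIE_MM2, DEFECTS)
print(f"good dies per wafer, small die: {small:.0f}")
print(f"good dies per wafer, large die: {large:.0f}")
```

Under these assumptions, halving the die area more than doubles the number of working chips per wafer, because the smaller die gains both in count and in yield.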

As a result of the KYRO's small size, the chip produces very little heat. This was especially the case with our KYRO card, which was running at a 115 MHz core clock. Although the final shipping version of the card is slated for a core clock of 125 MHz, the extra heat from an additional 10 MHz should be minimal. With this in mind, it seems odd that PowerVR would choose a fan-cooled metal plate over a passive cooling solution. The reason is most likely cosmetic. When people see a video card without an active cooling element, they assume that the card is not very powerful. Since processors have been running hot for many years now, the lack of a heatsink/fan stack would be frowned upon, so an active cooling element was used to improve the card's image. This choice will leave some consumers pleased that their card has a heatsink and fan on it, while others will be left wondering why it is so small.

It should be noted that our card is a pre-production reference board running pre-production silicon. The final design of the board and cooling solution will be left up to the individual card manufacturers.

The RAM

While the die size of the KYRO has decreased, the available memory options have increased. KYRO is set to support 16, 32 and 64 MB of SDRAM. Our reference board arrived at the lab outfitted with eight 8 MB SDRAM chips produced by IBM and rated at 7 ns / 143 MHz, for a total of 64 MB of RAM. Our board did not come close to using the 143 MHz these chips are rated for; the core and memory clocks are synchronous, meaning that our test board was running with a 115 MHz memory clock. Even the shipping card, with a memory clock of 125 MHz, is not set to take advantage of the 143 MHz RAM, making this choice an odd one. Perhaps there are plans for a card with a higher memory clock, or maybe they just got a good deal on IBM memory. Shipping boards may also feature different memory; only time will tell. Besides, even if the card ships with faster memory than necessary, that could help when it comes time to overclock ;)
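The relationship between the 7 ns rating and the 143 MHz figure, and the headroom left at each clock speed, can be checked with a few lines of Python. This is a minimal sketch: the function names are ours, and it assumes the usual convention that rated frequency is the reciprocal of the rated cycle time.

```python
def rated_mhz(cycle_time_ns: float) -> float:
    """Convert a memory cycle-time rating in nanoseconds to a clock in MHz."""
    return 1000.0 / cycle_time_ns


def headroom_mhz(cycle_time_ns: float, actual_clock_mhz: float) -> float:
    """Unused clock margin between the memory's rating and the clock it runs at."""
    return rated_mhz(cycle_time_ns) - actual_clock_mhz


RATED_NS = 7.0  # rating of the memory on the reference board

# 1000 / 7 is about 142.9 MHz, which is rounded to 143 MHz.
print(f"rated clock: {rated_mhz(RATED_NS):.1f} MHz")

# Core and memory clocks are synchronous, so these are also the memory clocks.
for label, clock in (("reference board", 115.0), ("shipping card", 125.0)):
    print(f"{label}: {clock:.0f} MHz, headroom {headroom_mhz(RATED_NS, clock):.1f} MHz")

# Eight chips of 8 MB each give the 64 MB total.
total_mb = 8 * 8
print(f"total memory: {total_mb} MB")
```

Even at the shipping clock, roughly 18 MHz of the memory's rated speed goes unused, which is the margin an overclocker would have to play with.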

It is natural to ask why, in this world of DDR memory graphics cards, the KYRO uses standard SDR SDRAM chips. One reason is that it lowers production costs even more, as SDR memory chips have been falling in price ever since the introduction of DDR. The other is that the tile-based rendering architecture of the KYRO places far less emphasis on the memory bus. The deferred texturing of the KYRO dramatically decreases the amount of data being sent to and from memory. With less data traveling over the memory bus, the memory bottleneck we are used to seeing in video cards is all but gone. There is sufficient time for data to be written to and read from memory because there is less of it to move.

This is analogous to cars entering a freeway from a town. Think of a traditional video card, such as the GeForce. In this case, the houses in the town act like the video card's processor, producing cars that travel to the freeway. Imagine a town of 1 million people all trying to use the same on-ramp to the same freeway. There would be no problem getting from the houses to the on-ramp, but once 1 million cars converge on that one on-ramp, a bottleneck instantly forms.

As stated earlier, the deferred texturing technique of the KYRO passes only the information that is actually going to be used to memory. This is like our model town on Labor Day. Rather than all 1 million cars heading out, only the people vital to the town's operation travel. This reduces the number of cars to, let's say, 200,000, meaning that the backup and delay encountered on the freeway on a normal day are all but gone. It is for this reason that the KYRO does not need the enormous bandwidth DDR memory would provide: since the KYRO passes less information, no bottleneck forms.
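The saving from deferred texturing can be illustrated with a toy model. The Python below is a sketch under stated assumptions, not the KYRO's actual pipeline: the 32-pixel tile size, the rectangle-based scene and the function names are all hypothetical. It counts how many pixels get textured when each surface is textured as soon as it passes the depth test, versus when visibility is resolved for the whole tile first and only the surviving pixels are textured.

```python
from typing import List, Tuple

# (x0, y0, x1, y1, depth): rectangle covering [x0, x1) x [y0, y1).
# A smaller depth value means the surface is closer to the viewer.
Rect = Tuple[int, int, int, int, float]

TILE = 32  # hypothetical tile edge in pixels
FAR = float("inf")


def immediate_texture_fetches(rects: List[Rect]) -> int:
    """Texture every fragment that passes the depth test, in submission order."""
    depth = [[FAR] * TILE for _ in range(TILE)]
    fetches = 0
    for x0, y0, x1, y1, z in rects:
        for y in range(y0, y1):
            for x in range(x0, x1):
                if z < depth[y][x]:
                    depth[y][x] = z
                    fetches += 1  # textured now, possibly overwritten later
    return fetches


def deferred_texture_fetches(rects: List[Rect]) -> int:
    """Resolve visibility for the whole tile first, then texture each pixel once."""
    depth = [[FAR] * TILE for _ in range(TILE)]
    for x0, y0, x1, y1, z in rects:
        for y in range(y0, y1):
            for x in range(x0, x1):
                if z < depth[y][x]:
                    depth[y][x] = z  # visibility only, no texturing yet
    # Only pixels covered by some surface are textured, exactly once each.
    return sum(1 for row in depth for d in row if d != FAR)


# Three full-tile layers submitted back to front: the worst case for overdraw.
scene: List[Rect] = [
    (0, 0, TILE, TILE, 3.0),
    (0, 0, TILE, TILE, 2.0),
    (0, 0, TILE, TILE, 1.0),
]

immediate = immediate_texture_fetches(scene)
deferred = deferred_texture_fetches(scene)
print(f"immediate: {immediate} textured pixels, deferred: {deferred}")
```

In this contrived scene the immediate approach textures three times as many pixels as the deferred one. Real scenes and real hardware will differ, but the ratio tracks the scene's overdraw, which is the traffic the freeway analogy describes.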
