History and Theory

Tile-based renderers have been trying to find their place in the 3D accelerator market for a couple of years. Although they have not succeeded so far, the new attempts look more impressive. This matters all the more given the memory bandwidth problems of modern accelerators built on the classical architecture. Imagination Technologies has been developing PowerVR tile chips successfully for a long time (there are three families of chips already, and another is coming soon); the chips are physically manufactured by STMicroelectronics. By the way, tens of millions of tile accelerator chips have already been produced, but until now the majority of them have been installed in various game machines:

Note that the chips of the second series (PowerVR Series 2) were used in two widespread game consoles and in PCs. We remember well the failure of the PVR250 (Neon) PC chip, caused mainly by production problems. After that, Imagination Technologies dropped NEC as a partner in favor of STM. Well, that's history, and today the situation is much more interesting. Series 3 is going to occupy automobile computers and even (!) mobile applications, and there it is chiefly intended not for notebooks but for cell phones and PDAs. Besides PC chips, there are set-top boxes and an arcade console. In the world of top 3D consoles the KYRO can hardly compete with the frightening power of the PSX2 or the Xbox, but maybe the next generation (Series 4) will be successful in this field.

So, here are the KYRO II specifications:
Looking at the specs, it becomes clear that the absence of the KYRO II from the scheme above changes nothing - the KYRO II is just a tweaked and slightly redesigned variant of the KYRO I. A finer process technology (0.18 micron instead of 0.25) has made it possible to raise the clock frequency and reduce power consumption. The transistor count is also higher, perhaps not only to raise the pipeline clock rate by making its stages simpler, but also to implement an improved memory controller (caching scheme). But these changes are cosmetic - the rendering architecture has not undergone any essential changes. T&L is still absent, and in terms of features the chip still sits somewhere between DX6 and DX7 (closer to DX7, if the absence of T&L is disregarded). Fans of revolutions should wait for the next chip generation (Series 4), where we shall at last see hardware T&L, hardware binning of polygons into tiles and possibly overbright lighting (lighting and fill calculations with higher color precision and the ability to handle values greater than 1).

Now let's focus on what we have today - the KYRO II and its interpretation of the tile method. There is no Z-buffer. Well, there is one, but only inside the chip, and its size is equal to one tile (32x16 pixels). The chip also stores the colors of all the tile's pixels at 32-bit precision. Naturally, access to the on-chip tile buffer is organized optimally - not only at maximum speed, but also simultaneously for several tasks. The image construction is organized as follows:
This flexible texture combination mechanism makes multipass rendering unnecessary, but the majority of modern games cannot make effective use of the ability to combine up to 8 textures at once and continue to construct the image in several passes, combining two textures per pass. Constructing a tile completely in the internal buffer (where color precision is always 32 bits) greatly improves the visual quality of an image rendered in 16-bit mode: conventional accelerators truncate all intermediate data to 16 bits during multipass rendering, while here only the final values are truncated. Still, tile rendering requires the classical approach to image construction to be rethought. The arrival of DX8 with its stage setup, effect files and pixel shaders gives some hope that multipass construction will gradually be replaced by these technologies.

The tile architecture greatly reduces memory bandwidth requirements. Even so, the KYRO II specifications look modest: only two pipelines with one texture unit each, SDR memory and a 175 MHz chip. With large textures filtered using many samples (trilinear or anisotropic filtering), even the tile architecture will not help this chip, as it will be limited by texture memory bandwidth. Trilinear filtering is far less costly with S3TC enabled (read more about S3TC below). How strict these limitations prove to be will determine whether the chip can hold its own against classical accelerators; we'll discuss this later in the review. Given appropriate marketing, the KYRO II will be a serious threat to inexpensive chips from NVIDIA and ATI.

The features of the first Series 4 chip are even more interesting: it is going to be a new tile accelerator with hardware binning of polygons into tiles, T&L, and maybe even pixel shaders. The announcement will most likely follow soon.
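To illustrate the point made above about 16-bit rendering, here is a minimal sketch of why truncating after every pass loses more precision than truncating once at the end. It is not a model of the actual KYRO II pipeline: the single color channel, the 5-bit and 8-bit depths (one channel of a 16-bit and a 32-bit pixel), the modulate-only blending and the sample values are all assumptions chosen for illustration.

```python
def quantize(value, bits):
    """Round a channel value in [0.0, 1.0] to the nearest level that fits in `bits` bits."""
    levels = (1 << bits) - 1
    return round(value * levels) / levels


def blend_passes(base, factors, bits):
    """Modulate `base` by each factor in turn, storing the result at `bits` precision after every pass."""
    value = base
    for factor in factors:
        value = quantize(value * factor, bits)
    return value


# Hypothetical scene: one base color channel darkened by four texture layers.
base = 0.6
factors = [0.9, 0.8, 0.7, 0.9]

# Reference result with no intermediate rounding at all.
exact = base
for factor in factors:
    exact *= factor

# Classical accelerator in 16-bit mode: every pass is written back at 5 bits per channel.
classic = blend_passes(base, factors, bits=5)

# Tile buffer: passes are accumulated at 8 bits per channel, truncated to 5 bits only once.
tiled = quantize(blend_passes(base, factors, bits=8), 5)

print(f"exact   = {exact:.5f}")
print(f"classic = {classic:.5f} (error {abs(classic - exact):.5f})")
print(f"tiled   = {tiled:.5f} (error {abs(tiled - exact):.5f})")
```

With these sample values the per-pass truncation ends up one 5-bit level away from the value obtained by rounding the exact result, while the tile-style accumulation lands on it; with more passes the gap tends to grow.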
The Series 4 chip will probably be able to become a serious competitor to accelerators of the classical architecture. By the way, the drivers released together with the KYRO II have brought a substantial performance increase even for the first chip.

The Card

The KYRO II prototype from Imagination Technologies has an AGP 2x interface (the chip itself also supports AGP 4x, but the manufactured cards have only AGP 2x) and 32 MBytes of SDR SDRAM in 4 chips on the front side of the PCB. The memory chips are manufactured by Samsung and have a 5 ns access time, which corresponds to a frequency of 200 MHz. However, both the memory and the graphics core operate at 175 MHz.

The design of the card (as you have probably noticed in the uppermost photo) is very similar to that of the mass-produced VideoLogic Vivid! based on the first KYRO chip (its review is going to appear on our site soon). There are no extras in general: just the chip, the memory modules and a TV-out circuit, traditionally equipped with an S-Video output. The PCB is bright green, and the card is rather small. A conventional cooler with a fan is attached to the processor.

Tweaking

Although the latest PowerStrip (3.0 beta123) can work with the KYRO II, overclocking beyond 180 MHz is impossible because very noticeable artifacts appear (let me remind you that the chip and memory frequencies are rigidly synchronized); the chip is perhaps the main limiting factor. There's no sense in overclocking this card.

Drivers and Installation

Below are the testbed configurations on which the KYRO II based card was tested:
The ViewSonic P810 (21") and ViewSonic P817 (21") monitors were used. The tests were run with STM drivers version 1.00.7.056 and with Vsync disabled.

As we can see from the screenshot, the driver settings include anti-aliasing options, anisotropic filtering activation, forced trilinear filtering and texture compression. The actual benefit these features give an end user is described below.

The results of the following video cards were used for the comparative analysis: the ATI RADEON 32 MBytes SDR, ATI RADEON 32 MBytes DDR, Leadtek WinFast GeForce2 MX/DVI, VideoLogic Vivid! (KYRO) and AOpen PA256 Deluxe (GeForce2 GTS).

Test Results

In 2D graphics everything is fine. The quality matches that of NVIDIA GeForce2 cards from top manufacturers and, perhaps, of ATI's RADEON-based cards. The first KYRO had no problems in 2D either.

Now for the 3D performance of the video card. The following programs were used as the toolkit:
The following games were additionally used to estimate the quality parameters:
Quake3 Arena

The Standard Test

The test was conducted in two modes: Fast (shows the card's operation in 16-bit color) and High Quality (shows the card's operation in 32-bit color).

In 16-bit color the KYRO II holds a stable second place after the GeForce2 GTS, and only at 1600x1200 does the RADEON DDR catch up with it.

In 32-bit color the architecture of the RADEON DDR, optimized for heavy load on the memory bus, comes into play. The KYRO II, still lagging behind the GeForce2 GTS, runs on a par with the RADEON DDR, and only at 1600x1200 does the SDR memory hold back the KYRO II's results, giving the RADEON DDR the lead.

What does it mean? The two-pipeline KYRO II (one texture unit per pipeline) has outrun the RADEON, which also has two pipelines but with 2 working texture units on each (leaving aside the inoperable third one). The RADEON has a slightly lower clock frequency but faster DDR memory, and yet the KYRO II easily does the same amount of work with only SDR memory. And though the GeForce2 GTS takes the lead with its 4 pipelines (2 texture units each) and excellent OpenGL drivers, it is notable that the KYRO II, with a quarter of the pipelines and texture units, is not too far behind it. That is the advantage of the tile architecture.

Texture Compression

The test was conducted in two modes: 16-bit color and 32-bit color with the highest possible quality. When S3TC is activated (the variable r_ext_compress_textures 1), Quake3 is known to work in autocompression mode. This has its pros and cons. The pro is that the total size of the compressed textures is smaller, which saves video memory bandwidth. The con is that the Quake3 textures were not compressed in advance and were not designed with compression losses in mind, so some quality deterioration will occur. It differs from card to card, as we have already written in our 3Digest.
Below I'll show you what happens with the KYRO II in this situation; for now, let's look at the diagrams (16-bit color).

The speed boost from S3TC is minimal for the RADEON cards because their memory bandwidth is fairly unconstrained, thanks in part to the HyperZ technology that saves it. The boost for the GeForce2 MX/GTS is also small, which is understandable (memory bandwidth is sufficient in 16-bit color). It is also necessary to underline the outstanding image quality of the KYRO II in 16-bit mode, which comes from blending the different rendering passes internally at 32-bit precision. This quality is unavailable to accelerators with a traditional architecture.

As for testing at the best quality, remember that trilinear filtering also takes part in this test. We know that the KYRO/KYRO II implements true trilinear filtering rather than an approximation. It is also well known that this kind of pipeline architecture (2 pipelines with one TMU each) does not save the NVIDIA Riva TNT2 from disastrous performance falloffs when real trilinear filtering is enabled. Therefore let's consider the situation with trilinear filtering disabled in 32-bit color:

What do we see? The performance boost for the KYRO II without trilinear filtering is so great that this card has easily overtaken even the NVIDIA GeForce2 GTS! Here's the percentage:

Just look at these falloffs of the KYRO II with trilinear filtering enabled! The speed of the GeForce2 GTS, by contrast, hardly depends on this factor. Here the classical accelerator benefits not only from its larger number of texture units and pipelines, but also from its wider bandwidth - you shouldn't forget that the volume of texture data transferred increases when trilinear filtering is used. With S3TC enabled, trilinear filtering is 'easier':

The falloffs are still present, though they are not so catastrophic. But they are even worse for the others.
The falloff is almost resolution independent for the KYRO II (unlike the GeForce2), from which we may conclude that the chip has an insufficient fillrate - a larger number of pipelines would have allowed the KYRO II to use its potential entirely. It's interesting that the well-balanced RADEON has outdone the tile architecture of the KYRO II in this respect.

Now let's get back to S3TC. Above we have already mentioned the speed increase on S3TC activation. The KYRO II has surprised us with a decent performance increase:

The KYRO is good as well. Just look at this striking difference from the other cards! I won't draw any conclusions yet and will proceed to 32-bit color.

You can see that the difference in performance with and without S3TC has become more noticeable for all the cards! Almost all of them run short of video memory bandwidth in this case, and S3TC, by saving it, helps to raise the speed. The KYRO II in 32-bit color with S3TC enabled has easily caught up with the GeForce2 GTS (and has even slightly overtaken it), though the performance increase of the latter is also quite appreciable!

What an impressive increase! Note that at 1600x1200 the memory bandwidth situation has become so critical for many cards that even S3TC has largely ceased to help, and the increase has shrunk. The KYRO and KYRO II, however, remain at a high level. The main advantage of the tile architecture shows here - the amount of data transferred to and from the frame buffer is several times lower, and that plays a vital role at high resolutions.

In addition, compare the performance of the KYRO II in 16-bit and 32-bit color with S3TC enabled. The difference is insignificant. With the help of S3TC, the tile architecture almost completely frees the chip from the chains of video memory bandwidth (as the KYRO chip is known to operate with 32-bit color in all cases), leaving it limited only by the chip's clock rate and the number of pipelines and texture units.
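The bandwidth saving from texture compression discussed above can be put into rough numbers. The sketch below assumes the DXT1 variant of S3TC, which stores each 4x4 block of texels in 8 bytes; the review does not say which S3TC format the drivers or Quake3 actually use, and the 256x256 texture size is only an example.

```python
BLOCK_SIDE = 4        # S3TC compresses textures in blocks of 4x4 texels
DXT1_BLOCK_BYTES = 8  # assumption: DXT1 variant, 64 bits per block


def uncompressed_bytes(width, height, bits_per_texel):
    """Size of an uncompressed texture level in bytes."""
    return width * height * bits_per_texel // 8


def dxt1_bytes(width, height):
    """Size of the same texture level stored as DXT1 blocks (partial blocks are rounded up)."""
    blocks_x = (width + BLOCK_SIDE - 1) // BLOCK_SIDE
    blocks_y = (height + BLOCK_SIDE - 1) // BLOCK_SIDE
    return blocks_x * blocks_y * DXT1_BLOCK_BYTES


# Example texture size, chosen only for illustration.
width, height = 256, 256
packed = dxt1_bytes(width, height)
for bits in (16, 32):
    raw = uncompressed_bytes(width, height, bits)
    print(f"{bits}-bit texture: {raw} bytes raw, {packed} bytes as DXT1, ratio {raw / packed:.0f}:1")
```

Under this assumption a compressed texel costs 4 bits, so the same texture fetches move noticeably less data over the memory bus, which is consistent with the larger gains seen in 32-bit color.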
And what about quality losses? Let's see:

S3TC disabled

S3TC enabled

We see the losses, but they are not fatal. In our 3Digest we compared the quality of texture compression in autocompression mode, and NVIDIA's cards were the worst.

Giants

S3TC - The Quality Question

Up to this point we have spoken about a game that controls texture autocompression itself, and the textures have suffered no fatal quality losses. But it is also possible to force S3TC for all games via the driver settings. Let's take Giants as an example and see what happens when S3TC is enabled for a game that was not designed for this technology:

S3TC disabled:

S3TC enabled:

In this case the losses are catastrophic. All the light spots on the water, produced by means of lightmaps, have turned into ugly squares. This is a huge drawback of autocompression. We are forced to state that the S3TC automatic mode is desirable only in games that were designed for texture compression from the start. Below are screenshots from other games (Expendable and Unreal) literally filled with autocompression artifacts:

Such artifacts will prevent the KYRO II from running many games with autocompression, even though autocompression could noticeably increase its performance. Therefore we try to show the results both with and without S3TC, leaving the choice to our readers.

Quake3 - Anti-aliasing

The screenshot of the driver settings shows that the KYRO II supports several AA levels using the super-sampling method (SSAA). The 2x2 level gives the best visual effect, but the performance falls disastrously:

We can see a great speed falloff which spoils the gameplay. In fairness, we should note that a similar result can be seen on all the other accelerators; only the GeForce3 with its Quincunx is playable at 1024x768x32, and also the GeForce2 Ultra in 16-bit mode.
Even the new architecture of the KYRO II does not help here - the fourfold increase in the number of (effectively) filled pixels at AA 2x2 takes its toll. A two-pipeline chip with an SDR memory bus is simply not capable of processing such an enormous array at an acceptable speed. The RADEON looks even a bit worse. Only the GeForce2 GTS in 16-bit color with AA enabled looks like a true leader. In 32-bit color with AA enabled all the cards are almost equal, but the KYRO II is slightly ahead, as it doesn't have to store oversized frame buffers in memory.

Let's see what happens when S3TC is activated:

The results have changed, but the fillrate shortage that restrains the competitors is still present. You may notice that the KYRO II is better than the GeForce2 with S3TC enabled. But the FPS value in 4x AA mode has not yet reached the 'playable' level of 60 frames. With the GeForce3 in mind, we understand that the future belongs to MSAA.

And what about the AA quality of the KYRO II? Let's look at the best AA possible (2x2), for now without the comparative results of other cards.

FAKK2

Anti-aliasing - The Quality Question

Below are the screenshots from FAKK2:

Unreal

Anti-aliasing - The Quality Question

Let's compare the AA quality of the three competitors in this game. As mentioned above, the performance of all three accelerators is almost equal at FSAA 2x2 (4x) in 32-bit color mode.

SSAA disabled (this picture is the same on any of the three cards, so we won't repeat it for each board)

SSAA 1x2/2x1 (2x), i.e. an intermediate AA level

The KYRO II: The RADEON: The GeForce2 GTS:

Some enlarged fragments from the screenshots above

The KYRO II: The RADEON: The GeForce2 GTS:

Note that the AA quality of the KYRO II and the GeForce2 GTS is identical, and it's almost the same for the RADEON, differing only because of a slightly different LOD value.
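The fillrate cost of 2x2 super-sampling described in the Quake3 anti-aliasing section can be illustrated with a small sketch. It assumes a plain box filter (each output pixel is the average of its 2x2 samples) and a single grayscale channel; the review does not state which downsampling filter any of the tested chips actually uses, and the function names are made up for this example.

```python
def supersampled_size(width, height, factor_x=2, factor_y=2):
    """Resolution the chip really has to fill for a given output resolution."""
    return width * factor_x, height * factor_y


def downsample(samples, factor_x=2, factor_y=2):
    """Average every factor_x by factor_y group of samples into one output pixel (box filter)."""
    out_height = len(samples) // factor_y
    out_width = len(samples[0]) // factor_x
    image = []
    for y in range(out_height):
        row = []
        for x in range(out_width):
            total = 0
            for dy in range(factor_y):
                for dx in range(factor_x):
                    total += samples[y * factor_y + dy][x * factor_x + dx]
            row.append(total / (factor_x * factor_y))
        image.append(row)
    return image


# 1024x768 output with 2x2 SSAA: four times as many pixels to fill.
ss_width, ss_height = supersampled_size(1024, 768)
print(ss_width, ss_height, (ss_width * ss_height) // (1024 * 768))

# A hard diagonal edge rendered at 4x4 samples becomes a softened 2x2 image.
samples = [
    [0, 0, 255, 255],
    [0, 255, 255, 255],
    [0, 0, 0, 255],
    [0, 0, 0, 0],
]
smoothed = downsample(samples)
print(smoothed)
```

The intermediate gray values along the edge are what smooths the stair-steps, and the fourfold sample count is what the two-pipeline chip has to pay for it.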
SSAA 2x2 (4x), maximum AA level

The KYRO II: The RADEON: The GeForce2 GTS: The GeForce3 - Quincunx:

Some enlarged fragments from the screenshots above

The KYRO II: The RADEON: The GeForce2 GTS: The GeForce3:

And again we can see that the AA quality of the KYRO II is equal to that of the GeForce2 GTS and roughly matches that of the RADEON. Certainly, the quality standard here is the Quincunx mode of the GeForce3 (mind the price).

Now we are going to review another interesting feature of the KYRO II: anisotropic filtering. Let's take Unreal (Direct3D) as an example. Unfortunately, the drivers do not allow anisotropy to be forced in OpenGL on the KYRO II (it is available only when the application requests it), so we can compare quality in Direct3D only, leaving out the RADEON, which has no Direct3D anisotropy.

As we see, the performance falloff is great. Let's look at the quality now:

The first example

Trilinear filtering

Anisotropic filtering

The KYRO II (up to 16-tap): The GeForce2 GTS (8-tap): The GeForce3 (32-tap):

And some fragments from the screenshots above

The KYRO II: The GeForce2 GTS: The GeForce3:

The second example

Trilinear filtering

Anisotropic filtering

The KYRO II (up to 16-tap): The GeForce2 GTS (8-tap): The GeForce3 (32-tap):

And some fragments from the screenshots above

The KYRO II: The GeForce2 GTS: The GeForce3:

The third example

Trilinear filtering

Anisotropic filtering

The KYRO II (up to 16-tap): The GeForce2 GTS (8-tap): The GeForce3 (32-tap):

And some fragments from the screenshots above

The KYRO II: The GeForce2 GTS: The GeForce3:

We can see almost equal Direct3D anisotropy quality on the GeForce2 GTS and the KYRO II. However, you shouldn't forget about the striking performance falloff of the latter! Read more about our study of anisotropic filtering on all the chipsets that support it in our 3Digest. Certainly, the 32-sample anisotropy of the GeForce3 is beyond competition in terms of quality; however, even the (very overpriced) GeForce3 cannot deliver it with a small speed loss.
Expendable

This game is already so out of date as a benchmark that we use it only to demonstrate Direct3D performance; it is hard to judge quality with it, as the scenes in Expendable are simple.

You can see that the KYRO II copes very well with the effects and scene complexity in this benchmark: its results are not only comparable to those of the more powerful GeForce2 GTS, but in 32-bit color the KYRO II has even overtaken it.

Some Aspects of 3D-graphics Quality

In the KYRO II performance analysis we have already considered a number of questions about how the major 3D features of this video card are implemented. Now I'll speak about bump mapping. Let's remember that even the first KYRO supported EMBM well. In March, in our 3Digest, we observed the cards' operation in 3DMark2001 and paid attention to the EMBM quality. The KYRO II also supports Dot Product 3, and you can judge its quality in the 3Digest and also in Giants:

The KYRO II

The GeForce2/3

I should note that Dot3 works on the KYRO II only after the game is patched to version 1.396. Therefore it is necessary to watch for new patches for all modern games that support up-to-date features.

In conclusion I'll present a couple of screenshots from the new hit game Black&White:

The KYRO II

The GeForce3

Despite the fact that this game has long been advertised as hardware TCL oriented, it runs very well on the KYRO II, which doesn't have a hardware TCL block! From this point of view, NVIDIA's well-known attacks on the KYRO II with arguments like B&W are not justified. The graphics quality is almost the same on both the KYRO II and the GeForce3.

The Conclusions

So, we've got a tweaked KYRO I here with excellent performance, even in FSAA mode, an excellent implementation of S3TC and an advanced capability of combining up to 8 textures in one rendering pass.
Its market position is not very good - it would help to decrease the price slightly (which will probably be done soon) or to increase the fillrate somewhat (say, with DDR memory and four pipelines). In that case we would have a strong opponent even to the much more expensive GeForce2 Ultra.

As we see, the KYRO II is not intended to take over the market. It is intended to win its share of the 3D accelerator market in the price range below $150. If we regard this card as a way to show the advantages of the tile rendering architecture, which has good potential, and as a way to establish a presence in the market (without fighting for leadership yet), the emotional criticism loses much of its force. Moreover, from this point of view the KYRO II is even encouraging.

We impatiently await the KYRO III (whatever its name will be), which, we hope, will be capable of competing with the GeForce3 not only in terms of memory bandwidth, but also in the field of DX8 (pixel and vertex shaders, etc.). Meanwhile, the KYRO II offers a level of performance that was previously inaccessible in this price range, and this is a true reason to praise it, especially given its innovative PowerVR architecture.

Pros:
Cons:
Copyright © Byrds Research & Publishing, Ltd., 1997–2011. All rights reserved.