Authors:
Ph.D. Claudio Augusto Delrieux – Associate Professor - Universidad Nacional del Sur
Lic. Federico Andrés Lois – Consultant - Huddle Group (Computer Graphics Division)
The importance of graphics and visual reasoning can hardly be overestimated, as has been shown many times and in several research areas. As a biological species, we depend on visual information for most of our critical tasks and, because of that, more than 60% of the active areas of our brain are devoted to the processing of visual stimuli. As a culture, we rely on visual information as our most important means of communication, and many of our linguistic conventions appeal to the strong, rich visual environment that we experience in our everyday lives. As technology producers and consumers, we have always given visually oriented products a paramount place. Computer technology is no exception.
Computer Graphics and its associated disciplines in Computer Science have been around for nearly 40 years. Even though the goals and purposes of graphics computing have been clear from the very beginning, technological limitations constrained the full expression of what can now be considered one of the most attractive, widespread, and socially relevant sides of computing technology. The sustained growth of the share of computer graphics products in the entertainment market, together with the recent spectacular breakthroughs in graphics hardware development, is leading to revolutionary new ways of designing and programming systems. Graphics processors are becoming widespread, and imaginative uses of their processing power in non-graphics applications are steadily emerging.
The computing capabilities of graphics processing units (GPUs) are increasing at an astounding rate, and the quality achieved by current off-the-shelf graphics products exceeds even the most optimistic expectations of just a few years ago. The current generation of GPUs is about to reach the teraflop throughput mark, with a fully parallel architecture that streams together 128 processors. In addition, runtime support provides the programmer with an adequate level of abstraction over the full programmability of the pipeline. Consequently, the most diverse kinds of applications can take advantage of the computational power of GPUs when used as generic processors.
It would be short-sighted to think that the solutions GPUs may provide are relevant only to graphics. Counterexamples include the upsurge of plug-ins that take advantage of GPUs in general-purpose software (for instance, matrix algebra accelerators for scientific software), and applications in DSP, data-parallel algorithms, and database management, to name just a few. All these examples show how a card costing less than US$ 400 may perform several times faster than a US$ 2000, 3 GHz dual-core CPU when adequately programmed. In fact, GPU acceleration has been successfully used in applications that range from research-oriented software (for example gene sequencing, protein folding, and climate models) to industrial systems, including chemical reactor simulation, video surveillance and analysis, and augmented reality for task instruction.
However, this good news has a problematic side, which is due to the influence of hardware evolution on the life cycle of an application. In conventional hardware platforms (e.g., CPUs), the application life cycle is only influenced by major changes in the underlying architecture (for instance, upgrading from 32-bit to 64-bit processors). These changes are likely to occur seldom in the lifetime of a typical application. With GPUs this is hardly the case, since what can be considered a “major change” in the underlying architecture happens on a yearly basis, sometimes even twice a year. Therefore, any reasonable design methodology should assume that the hardware available when the application is specified will be obsolete during its development phase, and will be two or three generations old at the moment of delivery. Consider how your product would look today if its binaries had been designed for 8-bit or 16-bit architectures.
This situation is rather unprecedented in the traditional Software Engineering discipline, and it puts software factories in a difficult quandary: either disregard the possibility of GPU-accelerated applications, or face the challenge of redefining the whole life cycle of their processes and products. The fact is that disregarding the advent of inherent parallelism in the underlying hardware platform can be suicidal for software companies, given the current trend in hardware development. Even what we know today as a CPU is probably going to evolve into architectures with several parallel processors on a chip. Examples of this are IBM’s Cell processors, Sun’s multi-core architecture, and the new Intel and AMD releases. However, programming for those architectures has proven to be very difficult and error-prone. The complexities arising from the subtle interaction between processors and processes are daunting even for world-class software teams. Even though some products, such as IIS or database engines, are able to abstract part of the concurrency problem, unexpected performance hits and subtle, difficult-to-diagnose errors are known to be frequent in almost every application.
The software industry is constantly struggling to find creative ways of handling API complexity and obsolescence problems. It always pays to invest in technologies that increase the support for higher abstraction and reduce the overall burden of a product’s engineering and development. In particular, such technologies make the final products more competitive and, above all, shorten release times. In this respect, GPU programming is subject to the strong evolutionary pressure currently at work in the games and entertainment industry. For this reason, hardware and software providers are beginning to offer general-purpose programming tools that enable rapid application development. To mention just a few, NVidia CUDA and Microsoft Research’s Accelerator are among the most remarkable examples.
NVidia CUDA provides a data-parallel execution model for developing real-time mathematical simulation and modelling. Programming is simplified through a special-purpose C-like language that supports an adequate level of abstraction from the underlying parallelism and synchronization primitives. Runtime libraries allow for dynamic compiling and binding of GPU programs with the application program. Microsoft Research’s Accelerator, on the other hand, gives a .NET API the power to handle vectorized math procedures on the GPU, dynamically generating and loading the programs on the GPU.
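To make the idea of a data-parallel execution model more concrete, the following is a minimal sketch in plain Python, not actual CUDA or Accelerator code. It assumes the usual shape of such models: the programmer writes a small "kernel" describing the work for a single data element, and the runtime applies it to every element. The names saxpy_kernel and launch are hypothetical and chosen only for illustration; here the launch is a sequential loop on the CPU, whereas on a GPU the invocations would be spread across many processors.

```python
# Plain-Python sketch of a data-parallel execution model.
# This runs sequentially on the CPU. The names saxpy_kernel and launch
# are hypothetical illustrations and are not part of CUDA or any GPU API.

def saxpy_kernel(i, a, x, y, out):
    """Work for one logical thread: compute element i of a*x + y."""
    out[i] = a * x[i] + y[i]


def launch(kernel, n, *args):
    """Stand-in for a kernel launch: invoke the kernel once per index.

    A GPU runtime would execute these invocations concurrently, so the
    kernel must not depend on the order in which indices are processed.
    """
    for i in range(n):
        kernel(i, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)

launch(saxpy_kernel, len(x), 2.0, x, y, out)
print(out)  # prints [12.0, 24.0, 36.0, 48.0]
```

The point of the sketch is the division of labour: the kernel contains no loop and no synchronization, since each invocation touches only its own element. That independence is what allows a runtime to map the same program onto however many processors the current hardware generation provides.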
The change is inevitable. Regardless of the specific shape that these technological breakthroughs take, decision makers and developers alike should be aware that they will certainly be forced into a major change in how they specify, design, and implement software.