Parallel Programming in Native Code

Parallel programming using C++ AMP, PPL and Agents libraries.

“Hello world” using Textures in C++ AMP

So far, we have shown you how to use textures in C++ AMP in a series of blog posts. In this post, inspired by our “Hello world in C++ AMP” post, I will share a “Hello World” example demonstrating the main ingredients of using textures in C++ AMP.

“Hello world” code

Using the same preparation steps as in that post, you should have an empty “Source.cpp” code file, where you can type in (or copy and paste) the following lines of code.

   1: #include <iostream>
   2: #include <amp.h>
   3: #include <amp_graphics.h>
   4: using namespace concurrency;
   5: using namespace concurrency::graphics;
   6:  
   7: int main()
   8: {
   9:    char v[11] = {'G', 'd', 'k', 'k', 'n', 31, 'v', 'n', 'q', 'k', 'c'};
  10:  
  11:    const texture<int, 1> tex_in(11, v, 11U, 8U);
  12:    texture<int, 1> tex_out(11, 8U);
  13:    writeonly_texture_view<int, 1> tex_out_view(tex_out);
  14:    parallel_for_each(tex_in.extent, [=, &tex_in](index<1> idx) restrict(amp) { 
  15:        tex_out_view.set(idx, tex_in[idx] + 1);
  16:    });
  17:    copy(tex_out, v, 11U);
  18:  
  19:    for(int i = 0; i < 11; i++) 
  20:       std::cout << v[i];   
  21:    return 0;
  22: }

… and this is the “Hello world” output when you compile and run the code from the command line.

[Image: command-line window showing the “Hello world” output]

Code walk through

Now let’s walk through the code. I will only focus on the texture specific code.

First of all, to start using textures you need to include the amp_graphics.h header (line 3) and use the concurrency::graphics namespace (line 5).

At line 11, I create a texture<int, 1> with bits_per_scalar_element set to 8, which matches the size of a char in bits. The same constructor call also initializes the texture by copying the contents of v directly into tex_in; the size to be copied is 11 bytes. Recall that if a texture is constructed with bits_per_scalar_element not equal to 32, it is effectively read-only inside restrict(amp) code.
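The 11-byte figure follows from 11 elements at 8 bits each. Here is a minimal sketch of that arithmetic in plain C++. The helper texture_byte_size is hypothetical and is not part of C++ AMP. It assumes one scalar per texel, as in texture<int, 1>, and a whole number of bytes per element.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not a C++ AMP API: the number of bytes needed for
// `element_count` scalar elements stored at `bits_per_scalar_element` bits each.
// Assumes one scalar per texel (as in texture<int, 1>).
inline std::size_t texture_byte_size(std::size_t element_count,
                                     unsigned bits_per_scalar_element)
{
    // This sketch only handles element sizes that are whole bytes.
    assert(bits_per_scalar_element % 8U == 0U);
    return element_count * (bits_per_scalar_element / 8U);
}
```

With the values from the listing, texture_byte_size(11, 8U) gives 11, which matches the 11U passed at lines 11 and 17. The same 11 elements at 32 bits each would need 44 bytes.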

Since tex_in serves as the input, I need another texture for the output, so I create tex_out at line 12.

Because I want to write to tex_out, I also create a writeonly_texture_view on top of it at line 13.

At line 14, tex_out_view is captured by value using the default capturing clause “=”, and tex_in is explicitly captured by reference.
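The capture clause [=, &tex_in] is ordinary C++ lambda syntax, so its effect can be tried without an accelerator. The sketch below is a CPU-only analogy, not C++ AMP code. The int_view struct and the add_one function are hypothetical stand-ins, with std::vector playing the role of the textures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a lightweight view: it points at storage it does
// not own, so copying the view does not copy the underlying data.
struct int_view {
    std::vector<int>* target;
    void set(std::size_t i, int value) const
    {
        assert(i < target->size());
        (*target)[i] = value;
    }
};

// CPU-only analogy of the listing's lambda, with vectors in place of textures.
inline std::vector<int> add_one(const std::vector<int>& input)
{
    std::vector<int> output(input.size());
    int_view out_view{&output};

    // Same capture shape as line 14: everything by value, `input` by reference.
    auto body = [=, &input](std::size_t i) {
        // out_view is a copy inside the lambda, yet its writes still reach `output`.
        out_view.set(i, input[i] + 1);
    };

    for (std::size_t i = 0; i < input.size(); ++i)
        body(i);
    return output;
}
```

The view is cheap to copy because it only refers to the data, while the container that owns the data is captured by reference so that it is not copied.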

At line 15, inside the lambda passed to the parallel_for_each, I read the value from tex_in, add 1 to it, and write the result to tex_out_view.

At line 17, I copy the content from tex_out to the host-side array v using the global copy function. The size to be copied is 11 bytes.
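To check what the program should print without running it on an accelerator, the same per-element computation can be written as a plain CPU loop. This is a minimal reference sketch that assumes an ASCII execution character set. The names decode_plus_one and hello_from_post are hypothetical, and nothing here uses amp.h.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// CPU reference for what the kernel computes per element: out[i] = in[i] + 1.
inline std::string decode_plus_one(const char* encoded, std::size_t count)
{
    assert(encoded != nullptr || count == 0);
    std::string decoded(count, '\0');
    for (std::size_t i = 0; i < count; ++i)
        decoded[i] = static_cast<char>(encoded[i] + 1);
    return decoded;
}

// The array from line 9 of the listing: each character of the message minus 1.
// The value 31 becomes 32, which is the ASCII space.
inline std::string hello_from_post()
{
    const char v[11] = {'G', 'd', 'k', 'k', 'n', 31, 'v', 'n', 'q', 'k', 'c'};
    return decode_plus_one(v, sizeof v);
}
```

Calling hello_from_post() returns the string “Hello world”, which is what the texture version writes back into v.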

In Closing

Comparing the two “Hello world” examples is left as an exercise for the reader. Your questions and feedback are always welcome below or at our MSDN forum!
