Parallel Programming in Native Code

Parallel programming using C++ AMP, PPL and Agents libraries.

Copying Textures in C++ AMP

Earlier posts covered how to construct textures, including how to specify the bits_per_scalar_element property, and how to read from and write to textures in restrict(amp) code. In this post, I’m going to cover how to copy data to and from texture objects, and between texture objects.

Copy-in during construction

First, let’s review how to copy data into a texture during its construction. Two kinds of constructors can initialize a texture<T, N> object with data from host memory: those that take a pair of iterators, and those that take a raw pointer.

By passing two iterators you can specify the range of input data that you wish to copy to texture memory. Here is an example:

std::vector<float_2> src1(16 * 32);  // code to initialize "src1" is elided
texture<float_2, 2> tex1(16, 32, src1.begin(), src1.end());

Note that the iterator-based constructors do not let you specify bits_per_scalar_element, so the default bits_per_scalar_element for the element type is used. Also note that these constructors are not available if the element type T is norm, unorm, or a norm/unorm-based short vector type (because such types do not have a default bits_per_scalar_element).

You can also construct a texture object by passing a raw pointer (void *) to the host data, the size of host data in bytes, and the bits_per_scalar_element. The pointer and the size specify the range of the input data. For example,

float src2[1024 * 2]; // code to initialize "src2" is elided
texture<float_2, 1> tex2(1024, src2, 
                         (unsigned int)sizeof(src2) /* size in bytes */, 
                         32U /* bits_per_scalar_element */);
 
char src3[16 * 16]; // code to initialize "src3" is elided
texture<int, 2> tex3(16, 16, src3, 
                     (unsigned int)sizeof(src3) /* size in bytes */, 
                     8U /* bits_per_scalar_element */);

For both kinds of constructors, if the amount of input data is not adequate to initialize the texture, a runtime exception will be thrown.
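To check ahead of time whether a host buffer is large enough, the expected size can be estimated from the extents, the number of scalars per element, and bits_per_scalar_element. The following is a minimal sketch in plain standard C++; required_bytes is a hypothetical helper, not part of C++ AMP, and it assumes tightly packed data with no padding. The data_length property of an existing texture remains the authoritative value.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Hypothetical helper (not part of C++ AMP): number of bytes of host data
// needed to fill a texture, assuming tightly packed elements with no padding.
//   extents                 - size of each dimension, e.g. {16, 32}
//   channels                - scalars per element: 1 for int, 2 for float_2, ...
//   bits_per_scalar_element - 8, 16, 32 or 64
inline std::size_t required_bytes(std::initializer_list<std::size_t> extents,
                                  std::size_t channels,
                                  std::size_t bits_per_scalar_element)
{
    assert(bits_per_scalar_element % 8 == 0 && "whole bytes per scalar expected");

    std::size_t elements = 1;
    for (std::size_t e : extents) {
        elements *= e;  // total number of texture elements
    }
    return elements * channels * (bits_per_scalar_element / 8);
}
```

Under these assumptions, required_bytes({1024}, 2, 32) matches sizeof(src2) for tex2 above, and required_bytes({16, 16}, 1, 8) matches sizeof(src3) for tex3.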

Copy between host data and texture using global copy functions

We have previously discussed the rich set of copy functions (in the concurrency namespace) for transferring data to and from array and array_view objects. For textures, we also provide several global copy functions in the concurrency::graphics namespace. They copy host data to a texture, and vice versa. To copy from host data to a texture, specify the range of input data with a pointer and a size in bytes.

template <typename T, int N>
void copy(const void * src, unsigned int src_byte_size, texture<T, N>& dst);

You can also copy to a writeonly_texture_view. The data is copied to the texture object that the view was created upon.

template <typename T, int N>
void copy(const void * src, unsigned int src_byte_size, 
          writeonly_texture_view<T, N>& dst);

For both, if the amount of source data is not sufficient to fill the texture, a runtime exception will be thrown.

You can also copy the data out from the texture by specifying the range of output host memory with a pointer and the size.

template <typename T, int N>
void copy(const texture<T, N>& src, void * dest, unsigned int dest_byte_size);

If the data_length of the source texture is larger than dest_byte_size, a runtime exception will be thrown.

Naturally, you cannot copy from a writeonly_texture_view, since the view is write-only. However, you can always copy from the underlying texture upon which the writeonly_texture_view was created. Note that the global copy functions are the only way to copy data from a texture to host memory.

Below are some examples of copy-in and copy-out using global copy functions.

float src4[1024 * 2]; // code to initialize "src4" is elided
texture<float_2, 1> tex4(1024);
copy(src4, (unsigned int)sizeof(src4)/* size in bytes */, tex4); //copy-in
 
char src5[16 * 16]; // code to initialize "src5" is elided
texture<int, 2> tex5(16, 16, 8U /* bits_per_scalar_element */);
copy(src5, (unsigned int)sizeof(src5), tex5); // copy-in
 
char dst5[16 * 16];
copy(tex5, dst5, (unsigned int)sizeof(dst5)/* size in bytes */); // copy-out

For each copy function, there is a corresponding copy_async function that performs the copy asynchronously. The copy_async function returns a completion_future object that you can wait on later. For example,

auto f = copy_async(tex5, dst5, (unsigned int)sizeof(dst5)/* size in bytes */); 
// some other work
f.wait(); // wait for the completion of the copy

The copy_to member method

The texture<T, N> class has two member methods called copy_to:

void copy_to(texture<T, N>& dest) const;
void copy_to(writeonly_texture_view<T, N>& dest) const;

They copy the contents of this texture object to another texture object. The two textures may have been created on different accelerator_views. When copying to a writeonly_texture_view, the data is copied to the texture object that the view was created upon.

The source and destination textures must have the same bits_per_scalar_element and exactly the same extents. If either requirement is not met, a runtime exception will be thrown.
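The two requirements can be modeled as a simple precondition check. The sketch below is plain standard C++ rather than C++ AMP code: texture_shape and copy_to_compatible are hypothetical names that stand in for the properties copy_to compares, and the check only illustrates the rule stated above, not how the runtime implements it.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the properties of a texture<T, N> that matter
// to copy_to: its extents and its bits_per_scalar_element.
template <std::size_t N>
struct texture_shape {
    std::array<std::size_t, N> extent;   // size of each dimension
    unsigned bits_per_scalar_element;    // 8, 16, 32 or 64
};

// Returns true when copying from src to dst would satisfy both requirements:
// identical extents and identical bits_per_scalar_element.
template <std::size_t N>
bool copy_to_compatible(const texture_shape<N>& src, const texture_shape<N>& dst)
{
    return src.extent == dst.extent &&
           src.bits_per_scalar_element == dst.bits_per_scalar_element;
}

// Usage sketch: "a" has the shape of tex5 from the earlier example.
inline void copy_to_compatible_demo()
{
    texture_shape<2> a{{16, 16}, 8};
    texture_shape<2> same{{16, 16}, 8};
    texture_shape<2> other_extent{{16, 32}, 8};
    texture_shape<2> other_bits{{16, 16}, 32};

    assert(copy_to_compatible(a, same));           // would be accepted
    assert(!copy_to_compatible(a, other_extent));  // extents differ
    assert(!copy_to_compatible(a, other_bits));    // bits_per_scalar_element differs
}
```

In real code, the equivalent mismatch surfaces as a runtime exception from copy_to, so a check like this is only useful when you want to detect the problem before attempting the copy.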

Summary

In this post, I showed that

  • you can copy from host data to a texture either at the time of construction or using the global copy function later;
  • you can copy data from a texture to host memory using the global copy function;
  • you can copy from one texture to another using the copy_to member method of the source texture object.

As always, your feedback is welcome below or in our MSDN Forum.
