Parallel Programming in Native Code

Parallel programming using C++ AMP, PPL and Agents libraries.

Writing to Textures in C++ AMP

We have already talked about textures, including how to read from them, and we have shown how to read from textures packed with sub-words. In this post, we'll show how to write to textures.

Writing to a texture object

Unlike concurrency::array, which assumes contiguous storage and offers pointer semantics for its interior, texture does not. As a result, we cannot provide the write operation via the subscript [] operator (on a texture, the [] operator returns a const value rather than a reference), so we introduced the set member method for assigning a value to a texel:

void set(const index<_Rank>& idx, const value_type& value) restrict(amp)

Note that, like the [] operator, the set method is annotated with restrict(amp), so it can only be called from amp-restricted code. Here is an example:

texture<int, 1> tex1(16);
parallel_for_each(tex1.extent, [&tex1] (index<1> idx) restrict(amp) {
   tex1.set(idx, 0); // write to tex1
});

Read, and/or write

A texture object is not always both readable and writable. In Direct3D, only a few DXGI_FORMATs support both read and write. In most cases, a texture used in a kernel (corresponding to a parallel_for_each launch in C++ AMP) can be either read-only or write-only, but not both.

In C++ AMP, inside a parallel_for_each, you can perform both read and write to a texture object of type texture<T, N> only if the following three rules are all satisfied:

  1. T has only one scalar element, and
  2. T is not double, norm, or unorm, and
  3. texture is constructed with bits_per_scalar_element of value 32.

Otherwise, only reads are allowed and writes are disallowed; the texture object is effectively read-only.

Violations of rules 1 and 2 are checked and reported at compile time via a static assertion in the set method of the texture class. When either rule is violated, a call to set does not compile, so the write is disallowed. For example, if you have code like:

texture<int_2, 1> tex2(16); // int_2 has more than one scalar element
parallel_for_each(extent<1>(16), [&tex2] (index<1> idx) restrict(amp) {
   tex2.set(idx, int_2(1, 1)); // error: attempt to write
});

You will get the following compilation error for the write:

     error C2338: Invalid value_type for set method.
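
To picture how such a check can sit inside set without preventing the texture itself from being declared or read, here is a minimal sketch in standard C++. It is a model only, not the C++ AMP headers: texture_model, int2_model and is_rw_value_type are hypothetical names, and the trait simply lists the single-scalar types that satisfy rules 1 and 2 (int, unsigned int, float). It relies on the fact that a member function of a class template is instantiated only when it is used.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a two-component short vector type such as int_2.
struct int2_model { int x; int y; };

// Hypothetical trait: true only for single-scalar types that are not
// double, norm, or unorm. In this model that means int, unsigned int, float.
template <typename T>
struct is_rw_value_type
    : std::integral_constant<bool,
          std::is_same<T, int>::value ||
          std::is_same<T, unsigned int>::value ||
          std::is_same<T, float>::value> {};

// Simplified 1-D texture model backed by host memory.
template <typename T>
class texture_model {
public:
    explicit texture_model(std::size_t n) : data_(n) {}

    // Reads are always allowed and return a value, not a reference.
    T operator[](std::size_t i) const { return data_[i]; }

    // The assertion is evaluated only if set() is instantiated, i.e. called.
    void set(std::size_t i, const T& value) {
        static_assert(is_rw_value_type<T>::value,
                      "Invalid value_type for set method.");
        data_[i] = value;
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<T> data_;
};

// Usage: writing works for int; the int2_model texture can only be read.
inline int demo_static_check() {
    texture_model<int> tex(16);
    for (std::size_t i = 0; i < tex.size(); ++i) {
        tex.set(i, static_cast<int>(i));
    }

    texture_model<int2_model> tex2(16);  // declaring is fine
    int2_model v = tex2[3];              // reading is fine (zero-initialized)
    // tex2.set(3, int2_model{1, 1});    // uncommenting this fails to compile
    return tex[5] + v.x + v.y;
}
```

In this model, uncommenting the tex2.set call triggers the static assertion, which mirrors the behavior described for tex2 above.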

Violations of rule 3 are reported through runtime exceptions, because bits_per_scalar_element is a constructor argument whose value is not known at compile time. For example,

texture<int, 1> tex3(16, 8U /* bits_per_scalar_element: 8 */); 
parallel_for_each(extent<1>(16), [&tex3] (index<1> idx) restrict(amp) {
   tex3.set(idx, tex3[idx] + idx[0]); // OK for compilation
});

That code can be compiled successfully. However, when it executes, an unsupported_feature exception will be thrown. The exception error message is:

    unsupported_feature: Both read and write are detected on a texture with bits-per-scalar-element not equal to 32.

Note that the runtime detection is lenient, as it fires only if you perform both read and write on such a texture object.
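
The runtime rule can be summarized as a small decision function. The sketch below is a standard C++ model of the behavior just described, not the C++ AMP runtime: check_texture_access, is_rejected and their parameters are hypothetical names, and std::runtime_error stands in for the unsupported_feature exception.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical model of the runtime check for rule 3.
// reads/writes state whether the kernel reads from / writes to the texture.
inline void check_texture_access(unsigned bits_per_scalar_element,
                                 bool reads, bool writes) {
    // Lenient: complain only when the same texture is both read and written
    // and it was not created with 32 bits per scalar element.
    if (reads && writes && bits_per_scalar_element != 32) {
        throw std::runtime_error(
            "Both read and write are detected on a texture with "
            "bits-per-scalar-element not equal to 32.");
    }
}

// Helper: true if the given access pattern would be rejected.
inline bool is_rejected(unsigned bits, bool reads, bool writes) {
    try {
        check_texture_access(bits, reads, writes);
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```

Under this model, an 8-bit texture that is only read, or only written, passes the check; the tex3 kernel above is rejected because it does both.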

Write to a writeonly_texture_view object

Now, you are probably asking, “How can I write into a texture<T, N> object if the write is disallowed due to one of the aforementioned rules?” The answer is to use the writeonly_texture_view class, which creates a view of a texture object and provides write-only access to it.

You can construct a writeonly_texture_view object from a texture object. Then you can call the set member method (which has no static assertion to check T) to perform the write. For example,

texture<int_2, 1> tex4(16); // int_2 has more than one scalar element
writeonly_texture_view<int_2, 1> wo_tv4(tex4); // create a writeonly view
parallel_for_each(extent<1>(16), [=] (index<1> idx) restrict(amp) {
   wo_tv4.set(idx, int_2(1, 1)); // write
});

The above code compiles without error. Just like array_view, a writeonly_texture_view (e.g. “wo_tv4”) is a view of, or wrapper around, the texture container (e.g. “tex4”); it does not hold the data or storage. Note also that when authoring the lambda, a writeonly_texture_view object needs to be captured by value, just like array_view. You can also construct a writeonly_texture_view<T, N> inside restrict(amp) code, as long as the source texture object allows both read and write (that is, it satisfies rules 1 through 3):

texture<int, 1> tex5(16); 
texture<int_2, 1> tex6(16); 
parallel_for_each(extent<1>(16), [&tex5, &tex6] (index<1> idx) restrict(amp) {
   tex5.set(idx, idx[0] + tex5[idx]); // OK, tex5 is both readable and writable
   writeonly_texture_view<int, 1> wo_tv5(tex5); // OK
   wo_tv5.set(idx, tex5[idx] + 1); // OK, read from tex5, can write via wo_tv5
   writeonly_texture_view<int_2, 1> wo_tv6(tex6); // Error, tex6 is not writable
}); 

The construction and use of “wo_tv5” is fine. The creation of “wo_tv6” triggers a static assertion because the T of “tex6” is int_2, which has more than one scalar element (violating rule 1):

   error C2338: Invalid value_type for the constructor.
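
The point made earlier in this section, that a view captured by value still writes into the underlying container, can be illustrated with an analogy in standard C++. This is only a sketch: write_only_view_model and demo_view_capture are hypothetical names, a std::vector stands in for the texture storage, and nothing here uses the C++ AMP headers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical write-only view: a non-owning handle to storage owned elsewhere.
template <typename T>
class write_only_view_model {
public:
    explicit write_only_view_model(std::vector<T>& storage)
        : storage_(&storage) {}

    // const: writing through the view changes the storage, not the view.
    void set(std::size_t i, const T& value) const { (*storage_)[i] = value; }

    // No read accessor is offered, so the view is write-only.

private:
    std::vector<T>* storage_;  // the view does not hold the data
};

inline std::vector<int> demo_view_capture() {
    std::vector<int> storage(4, 0);
    write_only_view_model<int> view(storage);

    // Capturing by value copies the small handle, not the data it refers to.
    auto kernel = [=](std::size_t i) {
        view.set(i, static_cast<int>(i) * 10);
    };
    for (std::size_t i = 0; i < storage.size(); ++i) {
        kernel(i);
    }
    return storage;  // the writes made through the copied view are visible here
}
```

Copying the view is cheap and every copy refers to the same storage, which is why capturing it by value in the lambda loses nothing.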

Summary

In this post, we have covered how to write to texture objects. One topic that we have not discussed yet is how to copy the content of a texture object (the results of the writes) back to host memory. In a future blog post, we will talk about texture copying. Stay tuned! As always, your feedback is welcome below or in our MSDN Forum.

 
