Parallel Programming in Native Code

Parallel programming using C++ AMP, PPL and Agents libraries.

Interop with Direct3D11 textures in C++ AMP

We have seen how to create and use textures in C++ AMP in this post on concurrency::graphics. We have also seen how to interop between C++ AMP and the Direct3D API for buffers and devices. In this blog post we will learn how to create a C++ AMP texture from an existing Direct3D11 texture object, and vice versa, using the C++ AMP interop APIs for textures.

Reasons to Interop

There are various reasons why you may need to Interop with a Direct3D texture object. Here are some:

1. Your app needs to render the results on the texture object associated with the UI. Using Interop, you can get the texture object of the back buffer associated with the window using the IDXGISwapChain and update it directly in the C++ AMP kernel.

2. The texture class in C++ AMP lets us create textures with the most common DXGI_FORMATs. Using Interop, you can create a C++ AMP texture with the specific format required by your app.

3. The texture class in C++ AMP has access to only one mipmap level. If your app needs multiple mipmap levels, you can create the texture using Interop. Only the first mipmap level is usable from C++ AMP, but the other mipmap levels can be accessed using DirectX.

4. C++ AMP doesn’t support the sampling/filtering functionality of textures. If you need to use this in your app, you can get the underlying Direct3D texture from C++ AMP and use sampling in HLSL.

Interop APIs in C++ AMP

C++ AMP supports the following Direct3D Interop APIs in the concurrency::graphics::direct3d namespace:

1. Make texture<T, N> from a Direct3D texture object

template <typename T, int N>
texture<T,N> make_texture(const concurrency::accelerator_view & av, const IUnknown* pTexture);

2. Get the underlying Direct3D texture object from a texture<T, N>

template <typename T, int N>
IUnknown * get_texture(const texture<T, N>& texture);

3. Get the underlying Direct3D texture object from a writeonly_texture_view<T, N>

template <typename T, int N>
IUnknown * get_texture(const writeonly_texture_view<T, N>& textureView);

The type of the IUnknown* object depends on the rank of the texture. The table below shows the mapping between the rank and the Direct3D resource interface you need to use.

  C++ AMP texture    Direct3D resource interface
  texture<T, 1>      ID3D11Texture1D
  texture<T, 2>      ID3D11Texture2D
  texture<T, 3>      ID3D11Texture3D


Creating textures with specific DXGI_FORMAT

If you create a C++ AMP texture directly via the provided constructors, it maps the combination of the texture type and bits_per_scalar_element to an exact DXGI format. The table below summarizes this mapping. Your app may need a format which is not part of this table, or perhaps the default mapping doesn’t work for you. For those cases you can use the Interop API from the previous section to create a texture<T, N> with a DXGI_FORMAT of your choice.

  Value type   Bits per scalar element   DXGI_FORMAT
  int          8                         DXGI_FORMAT_R8_SINT
  int          16                        DXGI_FORMAT_R16_SINT
  int          32                        DXGI_FORMAT_R32_SINT
  int_2        8                         DXGI_FORMAT_R8G8_SINT
  int_2        16                        DXGI_FORMAT_R16G16_SINT
  int_2        32                        DXGI_FORMAT_R32G32_SINT
  int_4        8                         DXGI_FORMAT_R8G8B8A8_SINT
  int_4        16                        DXGI_FORMAT_R16G16B16A16_SINT
  int_4        32                        DXGI_FORMAT_R32G32B32A32_SINT
  uint         8                         DXGI_FORMAT_R8_UINT
  uint         16                        DXGI_FORMAT_R16_UINT
  uint         32                        DXGI_FORMAT_R32_UINT
  uint_2       8                         DXGI_FORMAT_R8G8_UINT
  uint_2       16                        DXGI_FORMAT_R16G16_UINT
  uint_2       32                        DXGI_FORMAT_R32G32_UINT
  uint_4       8                         DXGI_FORMAT_R8G8B8A8_UINT
  uint_4       16                        DXGI_FORMAT_R16G16B16A16_UINT
  uint_4       32                        DXGI_FORMAT_R32G32B32A32_UINT
  float        8                         -
  float        16                        DXGI_FORMAT_R16_FLOAT
  float        32                        DXGI_FORMAT_R32_FLOAT
  float_2      8                         -
  float_2      16                        DXGI_FORMAT_R16G16_FLOAT
  float_2      32                        DXGI_FORMAT_R32G32_FLOAT
  float_4      8                         -
  float_4      16                        DXGI_FORMAT_R16G16B16A16_FLOAT
  float_4      32                        DXGI_FORMAT_R32G32B32A32_FLOAT
  double       8                         -
  double       16                        -
  double       32                        -
  double       64                        DXGI_FORMAT_R32G32_UINT
  double_2     8                         -
  double_2     16                        -
  double_2     32                        -
  double_2     64                        DXGI_FORMAT_R32G32B32A32_UINT
  unorm        8                         DXGI_FORMAT_R8_UNORM
  unorm        16                        DXGI_FORMAT_R16_UNORM
  unorm        32                        -
  unorm_2      8                         DXGI_FORMAT_R8G8_UNORM
  unorm_2      16                        DXGI_FORMAT_R16G16_UNORM
  unorm_2      32                        -
  unorm_4      8                         DXGI_FORMAT_R8G8B8A8_UNORM
  unorm_4      16                        DXGI_FORMAT_R16G16B16A16_UNORM
  unorm_4      32                        -
  norm         8                         DXGI_FORMAT_R8_SNORM
  norm         16                        DXGI_FORMAT_R16_SNORM
  norm         32                        -
  norm_2       8                         DXGI_FORMAT_R8G8_SNORM
  norm_2       16                        DXGI_FORMAT_R16G16_SNORM
  norm_2       32                        -
  norm_4       8                         DXGI_FORMAT_R8G8B8A8_SNORM
  norm_4       16                        DXGI_FORMAT_R16G16B16A16_SNORM
  norm_4       32                        -

Interop Example aka “Show me the Code”

Let us suppose your application processes an RGBA image. Further, in some kernels the underlying data is used as 8-bit unorm values and elsewhere it is used as 8-bit integers. You can implement this by creating a Direct3D texture with a TYPELESS format such as DXGI_FORMAT_R8G8B8A8_TYPELESS and then creating two different C++ AMP textures using Interop: one to interpret the data as integers and another to interpret the data as unorms.

First we will create a Direct3D device on which to create the texture.

D3D_FEATURE_LEVEL featureLevels[] ={D3D_FEATURE_LEVEL_11_1, D3D_FEATURE_LEVEL_11_0};
UINT numlevels = sizeof(featureLevels) / sizeof(D3D_FEATURE_LEVEL);
ID3D11Device* dx_device = nullptr;
ID3D11DeviceContext* immediateContext = nullptr;

HRESULT hr = D3D11CreateDevice(nullptr /* default adapter */,
                               D3D_DRIVER_TYPE_HARDWARE,
                               nullptr,
                               0,
                               featureLevels,
                               numlevels,
                               D3D11_SDK_VERSION,
                               &dx_device,
                               nullptr,
                               &immediateContext);

if(hr != S_OK)
{
     cout << "Failed to create direct3d device: " 
          << std::hex 
          << hr
          << std::endl;
     return 1;
}

Next we will create the D3D11_TEXTURE2D_DESC describing the properties of the texture object such as its height, width, format etc.

//Initialize the properties of the underlying texture object
const int height = 10;
const int width = 10;

D3D11_TEXTURE2D_DESC desc;
ZeroMemory(&desc, sizeof(desc));
desc.Height = height;
desc.Width = width;
desc.MipLevels = 1;
desc.ArraySize = 1;
desc.Format = DXGI_FORMAT_R8G8B8A8_TYPELESS;
desc.SampleDesc.Count = 1;
desc.SampleDesc.Quality = 0;
desc.Usage = D3D11_USAGE_DEFAULT;
desc.BindFlags = D3D11_BIND_UNORDERED_ACCESS | D3D11_BIND_SHADER_RESOURCE;
desc.CPUAccessFlags = 0;
desc.MiscFlags = 0;

We are now ready to create a Direct3D texture using the D3D11_TEXTURE2D_DESC and the target device. The CreateTexture2D API can be used to create a Direct3D texture of rank 2.

// Now create the Direct3D texture object (reuses hr declared above)
ID3D11Texture2D *dx_texture = nullptr;
hr = dx_device->CreateTexture2D(&desc, nullptr, &dx_texture);

if(hr != S_OK)
{
    cout << "Failed to create ID3D11Texture2D. Exit code: "
         << std::hex
         << hr
         << std::endl;
    return 1;
}

Let’s create the C++ AMP textures from the Direct3D texture using the make_texture API.

// get the accelerator_view associated with the direct3d texture
accelerator_view acc_view = create_accelerator_view(dx_device);
std::wcout << acc_view.accelerator.description << std::endl;

// texture with uint data
texture<uint4, 2> amp_texture_int = make_texture<uint4, 2>(
                                        acc_view,
                                        dx_texture);

// texture with unorm data
texture<unorm4, 2> amp_texture_unorm = make_texture<unorm4, 2>(
                                        acc_view, 
                                        dx_texture);

Finally, let’s look at how the properties of the Interop texture are used by C++ AMP:

1. The Width, Height (for 2D and 3D textures) and Depth (for 3D textures) are used to determine the extent of the C++ AMP texture.

2. MipLevels specifies the number of mipmaps in the texture. C++ AMP textures have access only to the first mipmap.

3. Format is used to determine the size of each texel. The number of components and their type are deduced from the type of the C++ AMP texture created. In the code above, the format R8G8B8A8_TYPELESS tells C++ AMP that each texel is 32 bits long. The type texture<uint4, 2> tells C++ AMP to interpret each texel as a short vector of four uint components.

4. BindFlags are used to determine whether the texture will be used for reading, writing or both. D3D11_BIND_SHADER_RESOURCE allows for reading from the texture and D3D11_BIND_UNORDERED_ACCESS allows for writing to the texture.

These two textures can now be used as needed depending on the data type used by the kernel. In the code snippets below, we set each pixel to blue, first using integer values and then using unorm values.

// use as integer
writeonly_texture_view<uint4, 2> amp_texture_view(
                                 amp_texture_int);

parallel_for_each(amp_texture_int.extent,
                 [amp_texture_view](index<2> idx)
                 restrict(amp)
{
    // Set each pixel to blueish color
    uint4 color(0, 0, 0x7f, 0xff);
    amp_texture_view.set(idx, color);
});

vector<char> color_from_int(amp_texture_int.data_length);
copy(amp_texture_int,
     color_from_int.data(),
     amp_texture_int.data_length);

// use as unorm
writeonly_texture_view<unorm4, 2> amp_texture_norm_view(
                                  amp_texture_unorm);

parallel_for_each(amp_texture_unorm.extent, 
                 [amp_texture_norm_view](index<2> idx) 
                 restrict(amp)
{
    // Set each pixel to blueish color
    // 0.498f = 0x7f/0xff
    unorm4 color(0,0,0.498f, 1.0f);
    amp_texture_norm_view.set(idx, color);
});

vector<char> color_from_unorm(amp_texture_unorm.data_length);
copy(amp_texture_unorm,
     color_from_unorm.data(),
     amp_texture_unorm.data_length);
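
The comment `0.498f = 0x7f/0xff` in the unorm kernel can be checked on the host. The sketch below is plain C++ and not part of C++ AMP; `float_to_unorm8` is a hypothetical helper that assumes the usual float-to-UNORM rule (clamp to [0, 1], scale by 255, round to nearest), which is what we would expect the hardware to apply when the unorm4 value is stored in an 8-bit-per-channel texture.

```cpp
#include <cstdint>

// Hypothetical helper (not a C++ AMP or Direct3D API): encode a float as an
// 8-bit unorm by clamping to [0, 1], scaling by 255 and rounding to nearest.
inline std::uint8_t float_to_unorm8(float value)
{
    if (!(value > 0.0f)) return 0;      // negative values and NaN map to 0
    if (value > 1.0f) value = 1.0f;     // clamp to the top of the unorm range
    return static_cast<std::uint8_t>(value * 255.0f + 0.5f);
}
```

Under that assumption, `float_to_unorm8(0.498f)` yields 0x7f and `float_to_unorm8(1.0f)` yields 0xff, the same bytes the integer kernel writes.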

If we copy back the textures and compare the values set by each kernel, we find that the values are equivalent. Here is the code that prints the first pixel of each image, followed by its output:

cout << std::hex << endl;
cout << "Color from int texture: " 
     << (unsigned int)color_from_int[0] << " "
     << (unsigned int)color_from_int[1] << " "
     << (unsigned int)color_from_int[2] << " "
     << (unsigned int)color_from_int[3] << " " 
     << endl;

cout << "Color from unorm texture: "
     << (unsigned int)color_from_unorm[0] << " "
     << (unsigned int)color_from_unorm[1] << " "
     << (unsigned int)color_from_unorm[2] << " "
     << (unsigned int)color_from_unorm[3] << " "
     << endl;

Output:

Color from int texture: 0 0 7f ffffffff
Color from unorm texture: 0 0 7f ffffffff
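
The alpha byte stored in the texture is 0xff. It prints as ffffffff because the host buffers are `vector<char>`, and where `char` is signed (as it is by default with Visual C++) the cast to `unsigned int` sign-extends the byte. The sketch below is plain C++ with two hypothetical helper names; it reproduces the effect and shows that going through `unsigned char` avoids it.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only.

// Widen a byte via signed char, as the printing code does when char is signed.
inline unsigned int widen_via_signed_char(std::uint8_t byte)
{
    return static_cast<unsigned int>(static_cast<signed char>(byte));
}

// Widen a byte via unsigned char, which preserves the value 0..255.
inline unsigned int widen_via_unsigned_char(std::uint8_t byte)
{
    return static_cast<unsigned int>(static_cast<unsigned char>(byte));
}
```

With a 32-bit `unsigned int`, `widen_via_signed_char(0xff)` gives 0xffffffff while `widen_via_unsigned_char(0xff)` gives 0xff. Bytes below 0x80, such as 0x7f, are unaffected either way.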

Now let us look at some of the properties of one of these interop textures.

// std::hex set earlier is sticky, so switch back to decimal first
cout << std::dec;

cout << "Extent: "
     << amp_texture_int.extent[0]
     << ","
     << amp_texture_int.extent[1]
     << endl;

cout << "Data Length: "
     << amp_texture_int.data_length
     << endl;

cout << "Bits per Scalar Element: "
     << amp_texture_int.bits_per_scalar_element
     << endl;

Output:

Extent: 10,10
Data Length: 400
Bits per Scalar Element: 0

The extent is the same as the height and width specified: [10, 10]. The data length follows from the format used, DXGI_FORMAT_R8G8B8A8_TYPELESS. Each texel has 4 scalar elements of 8 bits each, which gives (10 * 10) texels * (4 scalar elements * 8 bits per scalar element) = (10 * 10) * 4 bytes = 400 bytes.
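
The same arithmetic as a small host-side function, in plain C++. `texture_data_length` is a hypothetical helper, not a C++ AMP API. It assumes tightly packed texels with no row padding and a total bit count per texel that is a multiple of 8.

```cpp
#include <cstddef>

// Hypothetical helper: byte size of a tightly packed 2D texture.
inline std::size_t texture_data_length(std::size_t width,
                                       std::size_t height,
                                       std::size_t components_per_texel,
                                       std::size_t bits_per_component)
{
    // Bytes occupied by one texel, e.g. 4 components * 8 bits = 4 bytes.
    const std::size_t bytes_per_texel =
        components_per_texel * bits_per_component / 8;
    return width * height * bytes_per_texel;
}
```

For the 10 x 10 R8G8B8A8 texture above, `texture_data_length(10, 10, 4, 8)` returns 400, matching the data_length reported by C++ AMP.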

However, bits_per_scalar_element is not set to a valid value; it is 0. This is because C++ AMP does not try to map the underlying DXGI_FORMAT to a valid bits_per_scalar_element, and in some cases such a mapping may not even be possible. C++ AMP also provides no extra checks to make sure that the underlying data can be safely interpreted as the type of the texture. For example, suppose we change the previous example to create the Direct3D texture with the DXGI_FORMAT_R8G8B8A8_UINT format and then create an Interop texture<float, 2> in C++ AMP. The texture creation will be successful. However, when launching the application in debug mode (compiled with /MDd or /MTd), we will see the following error from the DirectX runtime. In release mode, this error will not be seen and the behavior is undefined.

Failed to dispatch kernel.
ID3D11DeviceContext::Dispatch: The resource return type for component 0 declared in the shader code (FLOAT) is not compatible with the resource type bound to Unordered Access View slot 0 of the Compute Shader unit (UINT). This mismatch is invalid if the shader actually uses the view (e.g. it is not skipped due to shader code branching).

Other invalid format bindings may result in runtime_exception from C++ AMP such as the one below.

Caught runtime exception: Invalid D3D texture argument. Error code: 80070057

Block compressed formats can be created using Interop and used in kernels in C++ AMP but copying data from these textures to/from a host container or to another texture using C++ AMP is not supported and will result in undefined behavior. Video resource formats are not supported in the Visual Studio 2012 release of C++ AMP.

After you have finished using the texture, you need to call Release() to decrement the reference count on the interface object. This is because C++ AMP implicitly calls AddRef() on the underlying texture resource in the interoperability APIs.
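
The reference-count bookkeeping can be modeled without Direct3D. The sketch below is plain C++ with two hypothetical stand-in classes, `FakeTexture` and `FakeAmpTexture`; neither is a real COM or C++ AMP type. It models only the counting, and it assumes that the C++ AMP texture gives up its own reference when it is destroyed.

```cpp
// Hypothetical stand-in for a COM object; only the reference count is modeled.
class FakeTexture
{
public:
    unsigned long AddRef()  { return ++ref_count_; }
    unsigned long Release() { return --ref_count_; }
    unsigned long count() const { return ref_count_; }

private:
    unsigned long ref_count_ = 1;   // creation hands the caller one reference
};

// Hypothetical stand-in for an interop texture: it takes a reference on
// construction and (by assumption) gives it up on destruction.
class FakeAmpTexture
{
public:
    explicit FakeAmpTexture(FakeTexture& resource) : resource_(resource)
    {
        resource_.AddRef();
    }
    ~FakeAmpTexture() { resource_.Release(); }

    FakeAmpTexture(const FakeAmpTexture&) = delete;
    FakeAmpTexture& operator=(const FakeAmpTexture&) = delete;

private:
    FakeTexture& resource_;
};
```

In this model, creating a `FakeTexture` leaves the count at 1, wrapping it in a `FakeAmpTexture` raises it to 2, the caller's own `Release()` brings it back to 1, and the count reaches 0 only once the wrapper is destroyed as well. If the caller skips its `Release()`, the count never reaches 0 and the resource leaks.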

In closing…

That concludes the introduction to creating C++ AMP textures using Interop. If you have any questions, please ask them below or in our MSDN forum. If this is an area of interest, look out for a sample app that we will publish showing how to manipulate images using C++ AMP and texture interop. Stay tuned.
