Parallel Programming in Native Code

Parallel programming using C++ AMP, PPL and Agents libraries.

concurrency::array_view - array_views on staging arrays


The previous posts in this series on C++ AMP array_view covered:

  1. Introduction to array_view and some of its key semantic aspects
  2. Implicit synchronization on destruction of array_views
  3. array_view discard_data function
  4. Caching and coherence policies underlying array_view implementation

In this post we will look at using array_views with staging arrays.

array_views with a staging array as data source

As described in a previous post, C++ AMP provides staging arrays for efficient data transfers between the host and accelerators. A staging array can be accessed only on the accelerator_view where it is allocated. It also has an associated accelerator_view (returned by the get_associated_accelerator_view method of concurrency::array), to and from which it can be copied efficiently. When a staging array is the host memory data source for an array_view, implicit data transfers between the staging array and its associated accelerator_view are faster than for an array_view on regular (non-staging) host memory, where each transfer makes an extra intermediate copy through a temporary staging buffer.

Staging arrays have certain limitations that you must be aware of if you choose to use them as the data source for array_views. It is NOT safe to access a staging array while a copy from (or to) that staging array is in progress. Hence, for an array_view with a staging array as its data source, any operation that may transfer data from the staging array to its associated accelerator_view (or vice versa) must not run concurrently with another operation that accesses the array_view on the CPU, or on another accelerator_view where the array_view is not already cached. Any such concurrent operations have undefined behavior (for example, they may cause an access violation).

Guidelines for using a staging array as an array_view data source

 Guideline A: Consider using staging arrays as your array_view data source if the view is to be accessed only on the host plus exactly one accelerator_view.

accelerator_view cpuAv = accelerator(accelerator::cpu_accelerator).default_view;
 
// Guideline A: Use a staging array as the data source for an array_view
// to be used in a parallel_for_each computation, for faster transfer of data
// between the CPU and the accelerator
// Staging array on the CPU, associated with the default accelerator_view
// (it replaces a regular host buffer such as std::vector<float> sourceVec(size))
concurrency::array<float> sourceArray(size, cpuAv, accelerator().default_view);
float *hostPtr = sourceArray.data();

std::generate(hostPtr, hostPtr + size, rand);

// Using a staging array as the data source for the array_view
// results in faster transfer of data from the CPU to the accelerator_view
// where the parallel_for_each kernel executes
array_view<float> dataView(sourceArray);
parallel_for_each(dataView.extent, [=](index<1> idx) restrict(amp) {
    dataView(idx) = fast_math::cos(dataView(idx));
});
 
// Using a staging array as the data source for the array_view
// also results in faster transfer of data from the accelerator_view
// to the CPU
dataView.synchronize();

 

Guideline B: Exercise extreme caution when using array_views over staging arrays in multi-threaded CPU code that can access such array_views concurrently from multiple threads. As described earlier, an access that runs concurrently with a data transfer to or from the staging array has undefined behavior and may result in fatal errors.

accelerator_view cpuAv = accelerator(accelerator::cpu_accelerator).default_view;
concurrency::array<float> sourceArray(size, cpuAv, accelerator().default_view);
float *hostPtr = sourceArray.data();
std::generate(hostPtr, hostPtr + size, rand);
 
array_view<const float> sourceView(sourceArray);
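// A named array is needed here: array_view<float> outputView(array<float>(size));
// would be parsed as a function declaration rather than a variable
concurrency::array<float> outputArray(size);
array_view<float> outputView(outputArray);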
 
std::vector<float> sourceCopy(size);
concurrency::task<void> t([&]() {
    for (int i = 0; i < size; ++i) {
        sourceCopy[i] = sourceView[i];
    }
});
 
// Guideline B violation: An array_view over a staging array should
// not be concurrently accessed on the CPU as in the concurrency::task above
// (or another accelerator_view) with an operation that transfers data from
// the staging array to the associated_accelerator_view of the staging array
// (the parallel_for_each invocation results in such a transfer here)
parallel_for_each(sourceView.extent, [=](index<1> idx) restrict(amp) {
    outputView(idx) = fast_math::cos(sourceView(idx));
});
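
The violation above comes down to ordering: the CPU-side read and the operation that triggers the transfer are allowed to overlap. One way to avoid it is to let the CPU read finish before the transfer starts. The sketch below models that ordering in portable standard C++ so it compiles without amp.h. It is only an illustration under stated assumptions: StagingBuffer, transfer_and_compute, CopyAndOutput and copy_then_compute are hypothetical names, the "staging array" is ordinary host memory, and the "transfer" is a plain CPU loop standing in for the parallel_for_each.

```cpp
#include <algorithm>
#include <cmath>
#include <future>
#include <utility>
#include <vector>

// Hypothetical stand-in for a staging array: ordinary host memory.
using StagingBuffer = std::vector<float>;

// Hypothetical stand-in for the operation that triggers the implicit transfer
// (the parallel_for_each in the AMP code). Here it is an ordinary CPU loop.
std::vector<float> transfer_and_compute(const StagingBuffer& staging) {
    std::vector<float> output(staging.size());
    std::transform(staging.begin(), staging.end(), output.begin(),
                   [](float x) { return std::cos(x); });
    return output;
}

struct CopyAndOutput {
    std::vector<float> sourceCopy;  // what the CPU-side reader copied
    std::vector<float> output;      // what the "kernel" produced
};

CopyAndOutput copy_then_compute(const StagingBuffer& staging) {
    // CPU-side reader on another thread, playing the role of the
    // concurrency::task in the example above.
    std::future<std::vector<float>> reader =
        std::async(std::launch::async, [&staging] {
            return std::vector<float>(staging.begin(), staging.end());
        });

    // Block until the reader is done, so the CPU access and the
    // transfer never overlap.
    std::vector<float> sourceCopy = reader.get();

    // Only now start the operation that reads from the staging buffer.
    std::vector<float> output = transfer_and_compute(staging);
    return CopyAndOutput{std::move(sourceCopy), std::move(output)};
}
```

In the AMP snippet above, the corresponding change would be to call t.wait() on the concurrency::task before invoking parallel_for_each. That gives up any overlap between the CPU copy and the kernel launch, which is the price of keeping accesses to the staging array serialized.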

 

In closing

In this post we looked at some key aspects of using array_views with staging arrays as their data source. Subsequent posts will dive into other functional and performance aspects of array_view - stay tuned!

I would love to hear your feedback, comments and questions below or in our MSDN forum.
