concurrency::array_view – discard_data

08/06/2012

Earlier posts in this series covered other aspects of the C++ AMP array_view. This post describes the discard_data function on array_views.

discard_data

Often, an algorithm uses an array_view purely as an output, so the current contents of the array_view are inconsequential. When such an output array_view is accessed on an accelerator_view for a computation, copying the current contents underlying the array_view to the accelerator is wasted work. C++ AMP provides the discard_data method so that you can tell the runtime that the contents of the portion of data underlying the array_view are not interesting and need not be copied when the array_view is accessed on an accelerator_view where it is not already cached. Calling this method can be thought of as trashing the existing contents of the portion of data referenced by the array_view. The discard applies to that array_view and to all other array_views that are its sections or projections.

It is important to note that the effect of discard_data is transient: as soon as a discarded array_view is written to (on the host or any accelerator_view), the contents of the array_view become valid and are no longer considered discarded. Hence, if an array_view is discarded, then written to on accelerator_view “av1”, and then accessed at another location, the new contents of the array_view are transferred from “av1” to the location where it is next accessed.
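The transient behavior described above can be illustrated with a small host-only model. This is a sketch in standard C++, not C++ AMP code: ToyView, its integer locations (0 for the host, 1 and 2 for two accelerator_views), and the copy counter are all hypothetical names invented for illustration, and the real runtime's caching is more involved than a single "valid at" location.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy host-side model, NOT the C++ AMP API. Locations are plain integers:
// 0 stands for the host, 1 and 2 for two hypothetical accelerator_views.
class ToyView {
public:
    explicit ToyView(std::size_t n) : data_(n, 0.0f) {}

    // Marks the current contents as not worth copying.
    void discard_data() { discarded_ = true; }

    // Writing at a location first brings the contents there (unless they
    // were discarded), then makes them valid again at that location.
    void write(int location, float value) {
        bring_to(location);
        for (float &x : data_) x = value;
        discarded_ = false; // the discard is transient: a write ends it
    }

    float read(int location) {
        bring_to(location);
        return data_.front();
    }

    int copies() const { return copies_; }

private:
    void bring_to(int location) {
        if (!discarded_ && location != valid_at_) ++copies_; // transfer needed
        valid_at_ = location;
    }

    std::vector<float> data_;
    int valid_at_ = 0;       // contents start out valid on the host
    bool discarded_ = false;
    int copies_ = 0;
};

// Copies needed when accelerator 1 uses the view purely as an output.
int copies_for_output_write(bool discard_first) {
    ToyView view(8);
    if (discard_first) view.discard_data();
    view.write(1, 1.0f);
    return view.copies();
}

// Discard, write on accelerator 1, then read on accelerator 2.
int copies_for_discard_write_read() {
    ToyView view(8);
    view.discard_data();
    view.write(1, 1.0f);       // no copy: the old contents were discarded
    float seen = view.read(2); // the new contents travel from 1 to 2
    assert(seen == 1.0f);
    return view.copies();
}
```

In this model, discarding before the output write saves the one transfer that would otherwise happen, while the later read on a second location still costs a transfer because the write made the contents valid again.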

Another use of the discard_data method is to avoid unwanted implicit synchronization upon destruction of the last array_view of a data source. A typical scenario is an algorithm that stores intermediate results in temporary data sources with array_views over them, where the final contents need not be synchronized back to the data source.
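The effect on destruction can be sketched with another host-only model. Again, this is standard C++ rather than C++ AMP: ToyTempView, its separate "device copy", and the sync counter are hypothetical stand-ins, and the model assumes a single view over the data source, so that view is always the last one.

```cpp
#include <cassert>
#include <vector>

// Toy model, NOT the C++ AMP API: one view over a host container, with a
// separate buffer standing in for the copy cached on an accelerator.
class ToyTempView {
public:
    ToyTempView(std::vector<float> &source, int &sync_count)
        : source_(source), device_copy_(source), sync_count_(sync_count) {}

    // A write on the "accelerator" leaves modifications pending for the host.
    void write_on_device(float value) {
        for (float &x : device_copy_) x = value;
        pending_ = true;
    }

    // Discarding drops the pending modifications instead of syncing them.
    void discard_data() { pending_ = false; }

    // Destruction of the (only) view synchronizes pending changes back.
    ~ToyTempView() {
        if (pending_) {
            source_ = device_copy_;
            ++sync_count_;
        }
    }

private:
    std::vector<float> &source_;
    std::vector<float> device_copy_;
    int &sync_count_;
    bool pending_ = false;
};

// Returns how many copies back to the host the view's destruction caused.
int syncs_on_destruction(bool discard_before_destruction) {
    std::vector<float> scratch(4, 0.0f);
    int sync_count = 0;
    {
        ToyTempView view(scratch, sync_count);
        view.write_on_device(3.0f);
        if (discard_before_destruction) view.discard_data();
    } // view is destroyed here
    return sync_count;
}
```

Under these assumptions, destroying the view without a discard costs one copy back to the temporary container, and discarding first costs none.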

Guidelines on discard_data

Guideline A: For an array_view that is exclusively used for output in your CPU code or a parallel_for_each invocation, call discard_data on the array_view before capturing the array_view in the parallel_for_each kernel or accessing it in your CPU code.

Because the effect of a discard_data call is transient, C++ AMP programmers are encouraged to call discard_data on the output array_view just before the parallel_for_each invocation. Doing so serves two purposes: it documents the output-only nature of the array_view in the parallel_for_each, and it ensures that the effect of the discard is not inadvertently lost to intermediate writes to the array_view between the discard_data call and the parallel_for_each invocation that uses the array_view for output.

Guideline B: Call discard_data on array_views created on temporary data sources after they are no longer needed, before array_views on such data sources are destructed.

This prevents any pending modifications to such array_views (on locations other than the data source’s home location) from being implicitly synchronized to the data source when the array_views are destroyed. The following example applies both guidelines.

 #include <amp.h>
 #include <vector>
 using namespace concurrency;

 // Reduces one row to a single value using func, which must be callable from
 // restrict(amp) code. As written, the row length must be a power of two (>= 2).
 template <typename BinaryFunction>
 float ReduceRow(const array_view<const float> &rowView,
                 const BinaryFunction &func)
 {
     const int halfCount = rowView.extent[0] / 2;
     std::vector<float> tempVec(halfCount);
     array_view<float> tempView(halfCount, tempVec);

     // Guideline A: Call discard_data on an array_view that is to be used
     // purely as an output, to avoid unnecessary copying to the accelerator_view
     tempView.discard_data();

     // First pass: combine the two halves of the row into tempView
     parallel_for_each(tempView.extent, [=](index<1> idx) restrict(amp) {
         float a = rowView(idx);
         float b = rowView(idx[0] + halfCount);
         tempView(idx) = func(a, b);
     });

     // Remaining passes: halve the active range until one element is left
     for (int stride = halfCount / 2; stride > 0; stride /= 2)
     {
         parallel_for_each(extent<1>(stride), [=](index<1> idx) restrict(amp) {
             float a = tempView(idx);
             float b = tempView(idx[0] + stride);
             tempView(idx) = func(a, b);
         });
     }

     // Copy only the first element (the final result) back to the host
     float result;
     copy(tempView.section(0, 1), &result);

     // Guideline B: Call discard_data on the temporary view before it goes
     // out of scope to prevent the view from being synchronized to the CPU on destruction
     tempView.discard_data();

     return result;
 }

In closing

In this post we looked at the discard_data function for array_views. Subsequent posts will dive into other functional and performance aspects of array_view - stay tuned!

I would love to hear your feedback, comments and questions below or in our MSDN forum.
