We are pleased to announce the new book C++ AMP: Accelerated Massive Parallelism with Microsoft Visual C++ by Kate Gregory and Ade Miller. The C++ AMP code library lets you capitalize on the fast GPUs in today’s computers, bringing massive parallelism to your project. Experienced C++ developers will learn parallel programming fundamentals with C++ AMP through detailed examples, code snippets, and case studies.

The case studies include:

· An “N-body” case study that implements the classic n-body problem, which models particle movement under gravity, in several different ways. It is intended to show you how to use C++ AMP to get the most out of your GPU hardware in a computational application.

· A “Cartoonizer” case study that demonstrates braided parallelism, using both the available cores on the CPU and any available GPU(s). This project processes video into simpler “cartoonized” images, using two different approaches to solve the problem.

· A “Reduction” case study that shows twelve different implementations of the reduce algorithm. The book shows and discusses each implementation’s performance characteristics and the trade-offs associated with each.
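For readers unfamiliar with the reduce algorithm mentioned above, here is a rough sketch of the pairwise ("tree") summation pattern that parallel reductions are commonly built on. It is written in plain standard C++ and runs serially on the CPU; it does not use C++ AMP and is not taken from the book. The name `tree_reduce` is a hypothetical helper chosen for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (an illustration, not code from the book).
// Sums the values by repeatedly folding the upper half of the active
// range onto the lower half until a single element remains.
float tree_reduce(std::vector<float> data) {
    if (data.empty()) {
        return 0.0f;
    }
    std::size_t n = data.size();
    while (n > 1) {
        // Size of the lower half, rounded up so an odd element is kept.
        std::size_t half = (n + 1) / 2;
        // Each iteration touches a distinct pair of elements, so on
        // parallel hardware these additions could run concurrently.
        // Here they simply run one after another.
        for (std::size_t i = 0; i + half < n; ++i) {
            data[i] += data[i + half];
        }
        n = half;
    }
    return data[0];
}
```

Each pass halves the number of active elements, so the sum takes roughly log2(n) passes rather than n sequential additions. That property is what makes the pattern attractive for hardware with many threads.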

You’ll discover how to:

· Gain huge performance improvements in your code by using graphics processing units (GPUs)

· Choose accelerators that enable you to write code for GPUs

· Program code using the Microsoft DirectX platform

· Apply thread tiles, tile barriers, and tile static memory

· Debug C++ AMP code with Microsoft Visual Studio

· Use profiling tools to track the performance of your code

The full table of contents appears below. You can purchase the book here: https://www.microsoftpressstore.com/store/c-plus-plus-amp-9780735664739.

Chapter 1 Overview and C++ AMP Approach

Why GPGPU? What Is Heterogeneous Computing?
Technologies for CPU Parallelism
The C++ AMP Approach
Summary

Chapter 2 NBody Case Study

Prerequisites for Running the Example
Running the NBody Sample
Structure of the Example
CPU Calculations

Chapter 3 C++ AMP Fundamentals

array<T, N>
accelerator and accelerator_view
index<N>
extent<N>
array_view<T, N>
parallel_for_each
Functions Marked with restrict(amp)
Copying between CPU and GPU
Math Library Functions
Summary

Chapter 4 Tiling

Purpose and Benefit of Tiling
tile_static Memory
tiled_extent
tiled_index<N1, N2, N3>
Modifying a Simple Algorithm into a Tiled One
Effects of Tile Size
Choosing Tile Size
Summary

Chapter 5 Tiled NBody Case Study

How Much Does Tiling Boost Performance for NBody?
Tiling the n-body Algorithm
Using the Concurrency Visualizer
Choosing Tile Size
Summary

Chapter 6 Debugging

First Steps
GPU Debugging Basics
Seeing Threads
Taking More Control
Summary

Chapter 7 Optimization

An Approach to Performance Optimization
Analyzing Performance
Optimizing Memory Access Patterns
Optimizing Computation
Summary

Chapter 8 Performance Case Study—Reduction

The Problem
Case Study Structure
CPU Algorithms
C++ AMP Algorithms
Summary

Chapter 9 Working with Multiple Accelerators

Choosing Accelerators
Using More Than One GPU
Swapping Data among Accelerators
Dynamic Load Balancing
Braided Parallelism
Falling Back to the CPU
Summary

Chapter 10 Cartoonizer Case Study

Prerequisites
Running the Sample
Structure of the Sample
The Pipeline
The Pipeline Cartoonizing Stage
Using Multiple C++ AMP Accelerators
Cartoonizer Performance
Summary

Chapter 11 Graphics Interop

Fundamentals
Using Textures and Short Vectors
HLSL Intrinsic Functions
DirectX Interop
Summary

Chapter 12 Tips, Tricks, and Best Practices

Dealing with Tile Size Mismatches
Initializing Arrays
Function Objects vs. Lambdas
Atomic Operations
Additional C++ AMP Features on Windows 8
Time-Out Detection and Recovery
Double-Precision Support
Debugging on Windows 7
Additional Debugging Functions
Deployment
C++ AMP and Windows 8 Windows Store Apps
Using C++ AMP from Managed Code
Summary

Appendix Other Resources

More from the Authors
Microsoft Online Resources
Download C++ AMP Guides
Code and Support
Training