I appreciate this. It’s a good overview of what it means to be a productive part of a larger context.
I prefer the terms “throughput” for “worker productivity” and “latency” for “work-unit productivity”, but I can see why they chose their own terms.
I assume that they mean that OpenCL, which is a traditional GPGPU language, is a very restrictive subset of either C or C++ (both are options) plus some annotations.
In fact, many OpenCL toolchains already use the Clang frontend and the LLVM backend, so the experience of writing and compiling OpenCL code is very close to C++.
The talk mentions all of this; it says that a benefit of using full C++ on the GPU over using OpenCL is that you don’t have to deal with all the annoying restrictions and annotations.
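For anyone who hasn't seen OpenCL code, here is a rough sketch of what those annotations look like. This is my own illustration, not something from the talk: the SAXPY kernel, the two empty macros, the host-side `get_global_id` stand-in, and the `run_saxpy` helper are all hypothetical, and exist only so that the kernel compiles and runs as ordinary C++ on the CPU.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// In OpenCL C the kernel below would be written roughly as:
//   __kernel void saxpy(float a, __global const float* x, __global float* y) {
//       size_t i = get_global_id(0);
//       y[i] = a * x[i] + y[i];
//   }
//
// Assumption: these empty macros stand in for the OpenCL qualifiers so the
// same shape compiles as plain host C++. In real OpenCL C they are keywords
// understood by the device compiler.
#define KERNEL_QUALIFIER /* __kernel */
#define GLOBAL_QUALIFIER /* __global */

// Hypothetical host-side stand-in for OpenCL's get_global_id(0).
static std::size_t g_current_id = 0;
static std::size_t get_global_id(unsigned /*dim*/) { return g_current_id; }

// OpenCL-style kernel: each work-item updates exactly one element.
KERNEL_QUALIFIER void saxpy(float a,
                            GLOBAL_QUALIFIER const float* x,
                            GLOBAL_QUALIFIER float* y) {
    const std::size_t i = get_global_id(0);
    y[i] = a * x[i] + y[i];
}

// Hypothetical serial "launch": runs the kernel once per index on the CPU.
// A real OpenCL host program would enqueue the kernel on a device instead.
std::vector<float> run_saxpy(float a,
                             const std::vector<float>& x,
                             std::vector<float> y) {
    const std::size_t n = std::min(x.size(), y.size());
    for (g_current_id = 0; g_current_id < n; ++g_current_id) {
        saxpy(a, x.data(), y.data());
    }
    return y;
}
```

With the qualifiers stripped out, the kernel body is just a normal function, which is more or less the point about writing full C++ on the GPU instead.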