giuseppebilotta, Whenever I can, I try to end each lesson with a provocation.
When we finished our first trivial #OpenCL program, I showed them how the kernel runtime plus the data transfer time actually made GPUs “not convenient”, as a prelude to illustrating how memory pinning and buffer (un)mapping can improve data transfer efficiency, and how to avoid transfers altogether when possible.
We're still working on that trivial program, so today I showed them how the number of elements affects performance.
1/2