Here's a short CUDA demo from NVIDIA that adds two arrays of a million numbers
each, elementwise. The line that actually launches the add is
add<<<1, 1>>>(N, x, y);
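Given enough threads, all N adds are conceptually done in parallel, and each add is independent of the others, with no side effects. In practice, hundreds or thousands of adds run simultaneously, depending on the available hardware. Note that the two numbers inside <<<...>>> are the number of thread blocks and the number of threads per block, so <<<1, 1>>> as written runs the kernel on a single GPU thread. The parallelism comes from launching more blocks and threads and having each thread handle its own elements.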
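
To make the launch line concrete, here is a minimal sketch of what a complete program around it might look like. It is not necessarily the code from the demo itself: the grid-stride loop, the block size of 256, the use of unified memory via cudaMallocManaged, and the helper name run_add are all assumptions chosen for illustration. The sketch also launches with many blocks and threads rather than <<<1, 1>>>, so that the adds really are spread across the GPU.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

// Each thread starts at its own global index and steps through the array by
// the total number of launched threads (a grid-stride loop), so any launch
// configuration, including <<<1, 1>>>, still covers all n elements.
__global__ void add(int n, const float *x, float *y)
{
    int index  = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = blockDim.x * gridDim.x;
    for (int i = index; i < n; i += stride)
        y[i] = x[i] + y[i];
}

// Hypothetical helper (not part of any NVIDIA API): fills x with 1.0f and
// y with 2.0f, adds them on the GPU, and returns the largest deviation of
// y[i] from 3.0f. Returns -1.0f if allocation or the launch fails.
float run_add(int n)
{
    float *x = nullptr;
    float *y = nullptr;

    // Unified memory is an assumption here; it is accessible from CPU and GPU.
    if (cudaMallocManaged(&x, n * sizeof(float)) != cudaSuccess)
        return -1.0f;
    if (cudaMallocManaged(&y, n * sizeof(float)) != cudaSuccess) {
        cudaFree(x);
        return -1.0f;
    }

    for (int i = 0; i < n; i++) {
        x[i] = 1.0f;
        y[i] = 2.0f;
    }

    // Assumed configuration: 256 threads per block, enough blocks to cover n.
    int blockSize = 256;
    int numBlocks = (n + blockSize - 1) / blockSize;
    add<<<numBlocks, blockSize>>>(n, x, y);

    // Kernel launches are asynchronous: check the launch, then wait for the
    // GPU to finish before reading y on the host.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();

    float maxError = -1.0f;
    if (err == cudaSuccess) {
        maxError = 0.0f;
        for (int i = 0; i < n; i++)
            maxError = fmaxf(maxError, fabsf(y[i] - 3.0f));
    }

    cudaFree(x);
    cudaFree(y);
    return maxError;
}

int main()
{
    const int N = 1 << 20;  // roughly a million elements
    printf("Max error: %f\n", run_add(N));
    return 0;
}
```

Assuming a CUDA toolkit and a supported GPU, this should build with something like nvcc add.cu -o add. Replacing numBlocks and blockSize with 1 and 1 gives the same result, but with a single thread walking the whole array, which is a simple way to see how much the launch configuration matters.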