
I'm talking about realistic kernels, which aren't that simple. Sure, we can also say that multiplying two arrays doesn't need it. But everything from gemm to FFT does.


How are atomics absolutely essential for gemm?


>> Ironic, a few days ago, I argued for the use of Fork-join parallelism in most cases (aka: Kernel launch / synchronized kernel exits). Now I find myself arguing the opposite now that we have a topic here with missing atomics. Like... atomics need to be used very, very rarely, but those rare uses are incredibly important.

Did you read my post? Or did you just start arguing against me without reading my full statement?

Most people dipping down into a GPU for parallelism will probably run into a need for globally consistent reads/writes across some data structure. Especially because things like gemm already have high-performance libraries, and there's no damn point writing yet another gemm (unless you're some kind of super-performance expert, the standard libraries are probably way faster than what you can do).

EDIT: If you are going to rely upon global kernel synchronization, chances are your code would work with CUDA Thrust (aka: GPU-accelerated data-structures) rather than dipping down to CUDA directly.


We are talking about shared memory.


That makes less sense, since there is literally no algorithm that can't be implemented without shared memory.


I'm not sure you were reading this thread. Obviously anything can be written without shared memory, but it will be much, much slower, and using a GPU becomes less appealing. The entire purpose of the article and project is that it's fast, but it can't be anywhere near as fast as most CUDA apps until it supports shared memory (not worth arguing about atomics).


If you were reading this thread, you know I responded to your assertion that shared memory is "absolutely essentialy to have" (sic). It was your words, literally. I wasn't arguing that shared memory has no advantages.



