I'm not really sure what he means either. If he was talking about kernel launches, there'd be no way he was getting 200k/sec, since there's a bit of launch latency (tens of microseconds). The EngineYard program that one of the members posted on the CUDA forum can do over 200 million SHA hashes per second on a good card, and one member said that he was getting over a billion hashes/sec on his multi-GPU setup.
By iteration, I mean one loop of my algorithm. I'm not brute forcing. There are a few other calculations I'm doing in addition to the hamming distance and the hash. I'm relying heavily on the random characters you get to append.
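For reference, the Hamming distance between two SHA hashes is just the number of bits in which the two digests differ. A minimal sketch in Python, assuming SHA-1 and plain byte strings as input; this is an illustration, not the code discussed in this thread, and `sha1_hamming` is a hypothetical name:

```python
import hashlib

def sha1_hamming(a: bytes, b: bytes) -> int:
    """Count the bits that differ between the SHA-1 digests of a and b."""
    # Treat each 160-bit digest as one big integer.
    da = int.from_bytes(hashlib.sha1(a).digest(), "big")
    db = int.from_bytes(hashlib.sha1(b).digest(), "big")
    # XOR leaves a 1 wherever the digests disagree; count those bits.
    return bin(da ^ db).count("1")
```

The result is always between 0 (identical digests) and 160 (every bit differs).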
Hamming distance? Lowest looks like 31, and I've seen a bunch around there, but to save on iterations it's only dumping out every millionth comparison, or ones below 20.
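The dump rule described above could look roughly like this. It's only a sketch: `should_dump` and its parameters are made-up names, and it assumes comparisons are counted starting from 1.

```python
def should_dump(count: int, distance: int,
                every: int = 1_000_000, threshold: int = 20) -> bool:
    """Decide whether to print the result of one comparison.

    count    -- 1-based index of the comparison (assumed convention)
    distance -- Hamming distance found by this comparison
    """
    # Print a periodic progress line, or anything unusually close.
    return count % every == 0 or distance < threshold
```

With these defaults a distance of 31 only shows up when it happens to land on a millionth comparison, while anything below 20 is printed right away.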
What's notable is that my average is in the mid 30s. Seems like these JS-based computers come out higher. Not sure why that is.