
It's my understanding that fp16 (available on the previous-generation P100) and mixed precision (the major innovation of the V100) are different things, and that the speedup from TensorCores is entirely missing from this benchmark. Unlike the general-purpose P100, the TPU is a heavily optimized chip built for deep learning, hence its performance increase. However, the V100 is also heavily optimized for deep learning (arguably NVIDIA's first non-GPU chip). I'm in no position to defend NVIDIA here, haha, but if this is indeed the case, the benchmark seems to miss the point.
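
To make the fp16 vs. mixed-precision distinction concrete, here is a rough CPU-side sketch, not a model of the actual hardware. It assumes the usual description of mixed precision: inputs are stored in fp16, but products are accumulated in fp32. The helper names (`to_fp16`, `to_fp32`, `dot`) are made up for illustration, and rounding is simulated with Python's `struct` module rather than run on a GPU.

```python
import struct

def to_fp16(x):
    # Round a Python float to the nearest IEEE half-precision value.
    return struct.unpack("e", struct.pack("e", x))[0]

def to_fp32(x):
    # Round a Python float to the nearest IEEE single-precision value.
    return struct.unpack("f", struct.pack("f", x))[0]

def dot(a, b, round_fn):
    # Dot product where every product and every partial sum is rounded
    # by round_fn, simulating an accumulator of that precision.
    acc = 0.0
    for x, y in zip(a, b):
        acc = round_fn(acc + round_fn(x * y))
    return acc

# Hypothetical data: 4096 small fp16 values times 1.0.
n = 4096
a = [to_fp16(0.01)] * n
b = [to_fp16(1.0)] * n

reference = sum(x * y for x, y in zip(a, b))  # float64 reference
pure_fp16 = dot(a, b, to_fp16)                # fp16 inputs, fp16 accumulator
mixed = dot(a, b, to_fp32)                    # fp16 inputs, fp32 accumulator

print("reference :", reference)
print("pure fp16 :", pure_fp16, "error", abs(pure_fp16 - reference))
print("mixed     :", mixed, "error", abs(mixed - reference))
```

With a pure fp16 accumulator the running sum stops growing once the addend falls below half the spacing between adjacent fp16 values, while the fp32 accumulator stays close to the reference. That is only the numerical side of the argument; the speedup side depends on whether the benchmark actually exercises the V100's dedicated units.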


It was my understanding that the TensorFlow benchmarks do make use of TensorCores on the V100. We'll verify and update accordingly.



