It's "weird" in the ways that matter: there's no commodity hardware in existence that replicates what a TPU does. The only place to get TPUs is through Google's cloud services.
CPUs are basically von Neumann architecture. GPUs (Nvidia and AMD) are basically SIMD / SIMT systems.
Google's TPU is just something dramatically different, optimized, yes, for matrix multiplication, but it's not something you can buy and use offline.
> but it's not something you can buy and use offline.
But you will. The entire point is to put this in a phone, so you can distribute a trained neural net in a way that people can actually use without a desktop and $500-$4,000 GPU.
> But you will. The entire point is to put this in a phone, so you can distribute a trained neural net in a way that people can actually use without a desktop and $500-$4,000 GPU.
As far as I can tell, they put a microphone on your phone and then relay your voice to Google's servers for analysis.
Or Amazon's servers, in the case of the Echo.
I don't see any near-term future where Google's TPUs become widely available to consumers, be it on a phone or a desktop. And I'm not aware of any product from the major hardware manufacturers that even attempts to replicate Google's TPU architecture.
Nvidia and AMD are sorta going in the opposite direction: they're making their GPUs more and more flexible (which will be useful in a wider variety of problems), while Google's TPUs specialize further and further into low-precision matrix multiplication.
Is that the point? I ask because the "weird" in the TPU is mostly its scale. It's not like you can't do matrix multiplies with the vector units on a CPU or with a GPU. It's really the scale: it has more processing elements than existing hardware gives you, but it's also lower precision, appears less flexible, and is bolted to a heavyweight memory subsystem.
So, in that regard it's no more "weird" than other common accelerators/coprocessors for things like compression.
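To make "low-precision matrix multiplication" concrete: here's a minimal sketch of the general idea in NumPy, quantizing float inputs to int8 and accumulating in int32, which is the style of arithmetic the TPU specializes in. The per-tensor scaling scheme here is a simplified assumption for illustration, not the TPU's actual quantization pipeline.

```python
import numpy as np

# Quantize a float tensor to int8 with a simple per-tensor scale
# (an illustrative scheme, not Google's actual implementation).
def quantize(x):
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 8)).astype(np.float32)
b = rng.standard_normal((8, 3)).astype(np.float32)

qa, sa = quantize(a)
qb, sb = quantize(b)

# Multiply int8 operands, accumulating in int32 to avoid overflow,
# then rescale back to float.
acc = qa.astype(np.int32) @ qb.astype(np.int32)
approx = acc * (sa * sb)

exact = a @ b
print(np.abs(approx - exact).max())  # small quantization error
```

The point is that the individual multiplies are cheap 8-bit operations; you trade a small amount of precision for far higher throughput per watt, which is exactly the trade-off an inference accelerator wants.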
So, in the end, what would show up in a phone doesn't really look anything like a TPU. I would maybe expect a lightweight piece of matrix-acceleration hardware, which due to power constraints isn't going to be able to match what a "desktop"-level FPGA or GPU is capable of, much less a full-blown TPU.