The Nvidia Jetson Nano costs the same and is less likely to be killed-off after you’ve invested in the platform.
Jetson is actually an important product for Nvidia and Google tends to kill off this type of pet project.
Google/Alphabet might have more success with their side bets if they spun them out as separate companies, as Xiaomi and Haier (both Chinese) seem to do.
They might be great for inference with TensorFlow, but from what I can tell from Google's documentation, Coral doesn't support training at all.
I'm sure an ML accelerator that doesn't support training will be great for applications like mass-produced self-driving cars. But for hobbyists - the kind of people who care about the difference between a $170 dev board and a $100 dev board - being unable to train is a pretty glaring omission.
You wouldn't want to use it for training: this chip can do 4 INT8 TOPS at 2 watts. A Tesla T4 can do 130 INT8 TOPS at 70 watts, plus 8.1 FP32 TFLOPS.
Assuming that ratio holds, you'd maybe get ~250 GFLOPS for training. The Nvidia 9800 GTX that I bought in 2008 gets 432 GFLOPS according to a quick Google search.
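As a sanity check on that extrapolation, here is the scaling in a few lines of Python. It uses only the T4 and Edge TPU figures quoted in this thread; it's a rough, hand-wavy ratio, not a benchmark:

```python
# Rough extrapolation: scale the Edge TPU's INT8 throughput by the
# T4's INT8-to-FP32 ratio to estimate hypothetical FP32 performance.
t4_int8_tops   = 130.0   # Tesla T4, INT8
t4_fp32_tflops = 8.1     # Tesla T4, FP32
edge_int8_tops = 4.0     # Edge TPU, INT8

est_gflops = edge_int8_tops * (t4_fp32_tflops / t4_int8_tops) * 1000
print(f"~{est_gflops:.0f} GFLOPS")  # ~249 GFLOPS
```

Either way, it's an order of magnitude below even a 2008-era desktop GPU.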
Hobbyists don't care about power efficiency for training, so buy any GPU made in the last 12 years instead, train on your desktop, and transfer the trained model to the board.
On the other hand, it would be useful for people experimenting with low-compute online learning. Also, those types of projects tend to have novel architectures that benefit from the generality of a GPU.
You can get pretty much any GPU at pre-COVID prices right now, except for the newest generation NVIDIA GPUs that just came out to higher-than-expected demand.
If you want to train yet another convnet, sure, but there could be applications where you want to train directly on a robot with live data, as in interactive learning.
Google is pretty invested in TPUs for their own workloads, but I fail to see any durable commitment to them as an external product. At best they're there to encourage standalone development of applications/frameworks to be deployed on Google Cloud (IMHO, of course).
AFAIK, apart from toy dev boards like this, you can't buy a TPU, you can only rent access to them in the cloud. I wouldn't want my company to rely on that. What if Google decides to lock you out? If you've adapted your workload to rely on TPUs, you'd be fucked.
They're nothing alike at all. It's similar to how a low-end laptop GPU differs from a top-of-the-line NVIDIA datacenter offering. Google's cloud TPU offering is among the strongest ML training hardware that exists; the edge devices simply support the same API.
Yeah, I've been wondering about charts I've seen comparing TPU-trained model quality to GPU-trained model quality, like here [1], and whether the difference could be due to error correction. At the same time, training on gaming GPUs like the 1080 Ti or 2080 Ti is widely popular, though they lack the ECC memory of the "professional" Quadro cards or the V100. I did think conventional DL wisdom said "precision doesn't matter" and "small errors don't matter", though.
I've noticed this difference in quality between TPU- and gaming-GPU-trained models in my own experiments, but I don't know for sure what the cause is. I never did notice a difference between gaming-GPU-trained and Quadro-trained models. Have more info/links?
Once you want to use PyTorch or another non-TensorFlow framework, the support drops off dramatically. The Jetson Nano supports more frameworks out of the box quite well, and it ends up being the same CUDA code you run on your big Nvidia cloud servers.
That benchmark appears to compare full-precision FP32 inference on the Nano with UINT8 inference on the Coral; that floor-wiping comes with a lot of caveats.
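For anyone curious what the uint8 side of that comparison does to the numbers, here's a toy affine quantization round-trip in plain Python. The scale/offset scheme has the same shape as TFLite-style quantization, but this is purely an illustration, not the Coral pipeline:

```python
# Toy affine quantization: map floats to uint8 and back, and measure the
# rounding error introduced. Illustrative only, not the Coral pipeline.
vals = [-1.0, -0.37, 0.0, 0.42, 1.0]
lo, hi = min(vals), max(vals)
scale = (hi - lo) / 255            # size of one uint8 step in float units

quantized = [round((v - lo) / scale) for v in vals]   # values in 0..255
restored  = [q * scale + lo for q in quantized]

max_err = max(abs(v - r) for v, r in zip(vals, restored))
print(f"worst-case round-trip error: {max_err:.4f}")  # bounded by scale/2
```

Whether that per-value error bound translates into a visible accuracy drop depends entirely on the model, which is exactly why comparing FP32 and UINT8 runs head-to-head needs caveats.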
jetson's X1 core (note: not Arm's new X1 architecture!) is already 5 years old. once upon a time that would scare me, but now it seems almost comedically safe to say "I guess it's not going anywhere!"
"The fastest SBC at CPU tasks priced below $100 is the Raspberry Pi."
The Odroid N2+ costs $79 and is over twice as fast as the Pi4.
The Khadas Vim3 costs $100 and is about 30-40% faster than the Pi4.
The number of SBC boards out there is becoming huge; although the Pi's price has dropped significantly relative to performance and features (especially RAM), there's a lot of competition, and it's growing.
It uses an Amlogic S922X aka the G12B. Support is generally pretty good, there's a dedicated community that has been very active pushing upstream[1].
Except for the ARM Bifrost GPU, which has only recently started to see viability[2] thanks to one hacker's reverse engineering. If you want to read a lot of words, there's a year-old status report from the LibreELEC Kodi-based media-player distribution that lays out a lot of what needs to be done, from a very video-intense perspective[3]; this predates the recent reverse-engineering efforts and largely discusses using closed proprietary blobs, but it's still interesting. Most recently, and very interestingly, there are signs that ARM itself may be willing to start helping out the reverse-engineered development[4], which would be a potentially interesting new state of affairs.
According to the Armbian (one distro to support them all:^) page, mainline kernel support is complete, although they say there still could be some network problems. From what I read on their forum, the Hardkernel Ubuntu-based image is currently more stable than the Armbian one.
The $63 N2+ has the latest "C" rev of the S922X, which is a quad 2.4GHz A73 + 2x A53 and an "MP6" variety of Bifrost GPU, the G52. The C4 has the newer S905X3, which has 4x 2GHz A55 cores and a smaller G31 Bifrost GPU. Those A55s, while improved over the A53s, are going to be significantly outmatched by the A73 cores on the N2+.
The H2+ has an Intel Celeron J4115 running 2.3GHz all-core, which I expect would trounce these ARM chips. It's also $120.
Alas, there hasn't been any update to the excellent Exynos 5422 that started HardKernel's/Odroid's ascent as the XU4. A lovely 2GHz 4x Cortex-A15 + 4x Cortex-A7 with (2x! wow! thanks!) USB3 root hosts and on-package RAM: really an amazing chip, way ahead of its time. These days it's way outgunned, but this chip really led the way for SBCs with its bigger-for-the-time cores, USB3, and on-package RAM (which we really need to see make a comeback).
Worth noting that the A73 on the N2/N2+ is ARM's Artemis core, which hails from 2016 (the RPi4's A72 is even older). Maybe some year SBCs won't all be running half-decade-old architectures, but at least we're at the point where half a decade ago we were doing something right. ;) Still, one can't help but imagine what a wonder it would be if a chip & SBC were to launch with an ARM X1 core available.
it's an A57 on the X1 (an architecture from 2012, but a big core), so this Coral Mini's A35 (newer but quite small) sits very significantly below it.
the attraction of the Coral is supposed to be its inference engine. 4 TOPS at 2 watts is... impressive. Jetson takes 10 or 15 watts & tops out a little under 0.5 TOPS. those are much more flexible GPU cores, but that's a ~60x efficiency gain, & it's centered on a chip that is much easier to integrate into consumer products.
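spelling out the perf-per-watt arithmetic behind that 60x figure (both inputs are the rough numbers above, taking the Jetson at its higher 15 W mode):

```python
# perf-per-watt for the two boards, using the rough figures above
coral_tops_per_watt  = 4.0 / 2.0    # Edge TPU: 4 TOPS at ~2 W
jetson_tops_per_watt = 0.5 / 15.0   # Jetson Nano: ~0.5 TOPS at ~15 W

print(f"~{coral_tops_per_watt / jetson_tops_per_watt:.0f}x")  # ~60x
```

at the Jetson's 10 W mode the gap shrinks to ~40x, so the exact ratio depends on which power budget you compare against.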
If you're looking for an even easier & cheaper way to start experimenting with an Edge TPU, the $59 Coral USB Accelerator has been out for a while now: https://coral.ai/products/accelerator
I’m not versed in this field, so I’m curious: in that real-life application, why not just use the computer you’re broadcasting from to run the camera-switching client and the ML inference? Does the USB accelerator do something that a standard desktop can’t?
Real-world use cases wouldn’t typically have a powerful computer broadcasting video. Rather, this board would be used at the edge to drive a camera and offer on-device, low-power inference.
Useful for robotics/drones, surveillance cameras, vehicles, consumer devices etc.
I put it on my Raspberry Pi 4 with some mobile-resnet version (88 objects) to do real time object detection from a Pi camera. It's quite a small package.
This is why my Beagleboard is gathering dust. The proprietary GPU drivers only worked with particular kernels that were usually out of date. Steer clear of PowerVR would be my advice.
Building software for new Google platforms is like farming on the river delta. Yeah, soil is good ... but any day the river will change its path and wash all your work into the fucking ocean.
The MIPI-DSI display is nice for prototyping actual devices I suppose, but I don't think this is a great fit for hobbyists. The only thing hobbyists might use this for is the machine learning chip and with a $70 price difference between this and a Raspberry Pi (or similar) with much more normal processing power, I don't think many people will take this deal.
Not for anything I'd wanna use it for. USB 2 makes it hard to get data in and out. Video is 720p single stream. I'm grasping at straws here.
Seems like a total loser compared to the NVidia Jetson line. It does have a few more TOPs than the Nano, I guess? And it costs a bit less than the Xavier series?
Jetson Nano doesn't have dedicated AI hardware... so it's pretty much the GPU's 472 GFLOPS that you can use there. The same applies to the TX2.
NV hardware with dedicated AI starts with Xavier: the NX has 21 TOPS at $399 and the AGX has 32 TOPS at $699... that might leave a space for Google's Edge TPUs.
I don’t think the Jetson nano even has a DL accelerator, so I could see the Google SBC achieving much better INT8 inference performance. The Jetson has a CUDA capable GPU, so the comparison is kind of apples and oranges.
The Google unit will definitely achieve better DL performance than the Nano. But the set of applications where:
* The Nano's 0.5 TFLOP of CUDA is insufficient; AND
* Google's 4 TOP of DL performance IS sufficient; BUT
* You're not bottlenecked by things like USB2; AND
* You can't afford a higher-end Jetson with dedicated ML
Well, let me just say I can't think of one among the things I've actually done. I could come up with some in the abstract, but it seems like Google is aiming for 1% of the market at best. With 1% of the market, NVidia will win on community, tooling, R&D budget for v2, etc., so in the end that means 0.1% of the market.
And that's if the two businesses were starting on even footing. They're not. Google has a horrible reputation for leaving customers high and dry, while NVidia's reputation is pretty good for B2B. Even if the boards were equivalent, most would pick NVidia based on that alone. Plus, NVidia is coming in with established users; they have a first-mover advantage.
Getting late to the game with a generally inferior product with backing from a less reliable company? It seems like a loser to me.
Paired with a $25 Coral M.2 Accelerator [1] this is actually an interesting alternative. Twice the price of the Coral Dev Board Mini, but much better hardware.
Neat. Looks pretty similar to the ODROID H2+. Some differences I see: the Odyssey has soldered-on RAM and eMMC, the extra M.2 slot, fewer USB3 ports, and 1 GbE rather than 2.5 GbE. I think they're similar in price (once you add the missing RAM and storage to the ODROID H2+).
Is 110x110mm a standard form factor? Are there cases available that could fit one or two 3.5" HDDs?
the problem with these boards is that there are not enough general-purpose cores, and the TPU is only really usable for a very small niche of computing problems. only 0.00001% of developers are really interested in doing anything with AI, which is mostly a fad hyped up by the media anyway. what we need is a 16-core A17 or A72, without peripherals that are only useful to 0.01% of developers out there. those cores could then be applied to a variety of real-world use cases where 4 or 8 cores is simply not enough.
You did not even mention Google. Your comment reminds me of people who said GPU is not an important development a decade or more back. I was in high-school back then and can see why TPUs can be super relevant 10 years from now. Someone has to start...
the problem with GPUs is the same - really only applicable to graphics and, for a few years now, AI, shoehorned onto something that was never meant to deal with AI workloads. meanwhile nothing has been done to increase the number of cores, which is really important for running a wider variety of computationally intensive applications on embedded/mobile devices. what most SoC companies are doing is simply slapping on a variety of irrelevant peripherals instead of focusing on what's important - more cores.
I'd rather think your comment has been downvoted because of the baseless claims therein. If you don't like that device, then, well, that's like your opinion man.
it's not about me not liking the device, read what I wrote - and I can add that I've designed and shipped hardware products with SoCs in them, so I know what I'm talking about.
Low-power and very fast inference for a relatively constrained set of model architectures. Think robots, drones, surveillance cameras, consumer devices.
So the Edge TPU has been out for a while now; can anyone point out a large enterprise customer that is using this for real, for something new we couldn't do before? Or is the "edge" just made up by salespeople? Like, there need to be super-low-latency requirements before someone decides they can't use an internet API, right?
Not an actual existing product, but imagine something like a fish trap that only closes when a certain species of fish goes in. Not everything can be readily hooked up to the internet. And the cell bandwidth of constant upload video stream isn't readily affordable even in things like cars.
I feel like this would be a dramatically better device if they worked with the Raspberry Pi Foundation instead of heaping it into a board with a PowerVR GPU, one of the few remaining major mobile/embedded GPU lineups with no meaningful upstream support, nor any acceleration through Mesa.
You can get that better device by attaching the $60 Coral USB Accelerator to one of the Raspberry Pi 4's USB3 ports. You can even install more than one USB Accelerator per host machine if you want. (But if you go far down that line maybe you should get a nVidia Jetson board instead.)
I don't know why the Coral folks bother with the dev boards; I think their USB accelerators and M.2 cards are better buys. The Pi4 + USB accelerator + a SD card costs roughly the same as this dev board and is more powerful with better community support.
I'm hoping they release a newer, faster version of the Edge TPU itself. There are some newer, faster AI accelerators from other vendors (the Gyrfalcon Lightspeeur 2803S, the Hailo-8), but the Coral USB Accelerator is reasonably priced, actually available to hobbyists, (as of recently) has an open-source driver for TensorFlow, etc.
Note: I work for Google, but I don't have any inside knowledge about Coral.
I have a desktop with an excellent processor, RAM, and decent gaming GPU in it. Is this thing remotely worth it for me, or should I stick with what I've got? Who is the target market for these? People who only have laptops with shitty GPUs?
Does anyone make Beowulf clusters of these? I could see that being more cost-effective if you're resource-limited even by what higher end GPUs can do.
The target market seem to be people who want something to integrate ML in their robot or RC car. It features decent inference performance with (comparatively) low power demand. But they can't even do training.
Unless you're building something with space and/or power constraints you are much better off with a laptop or desktop.
It's a variant of their development board for their AI accelerator chip for embedded setups (think quality control, licence-plate detection, analysing visitors via webcam, etc). This variant is probably just to make it more attractive for hobbyists, increasing mindshare and community size. It doesn't need a huge market, and maybe people come up with cool ideas nobody has thought of yet.
It's much slower than your desktop, or than higher-powered embedded options from NVIDIA, for example (4 TOPS of AI inference perf... and no training support, by design).
Note that the Cortex-A35 is a CPU tier _below_ the A53 (80% of the perf at 32% lower power).
I hate the use of SBC for Single Board Computer. First, the acronym is already in wide use for Session Border Controllers. Second, most PCs are single board computers (just a motherboard), so the name is meaningless.
The "single board computer" meaning of SBC dates back to the mid-1970s when microprocessors first started coming out. Whatever a session border controller is, it's unlikely to pre-date that.