What is a Cloud TPU?



Google has developed a family of ASICs, boards, and entire supercomputers to accelerate machine learning. Initially, Google needed these for its own internal purposes.

Several years ago, there was concern that if every Android phone user spoke to their phone for a few minutes a day, then with the algorithms of the time running on big fleets of CPUs, Google might need to double its number of data centres. That didn’t make sense.

So there was a crash program to develop a chip specialized to accelerate some of these machine learning workloads. That was Google’s first TPU, which was announced at Google I/O in 2016. Then, in 2017, Google revealed its second-generation system, the TPU v2, which is also available in Cloud as a Cloud TPU. It has just recently gone to beta. It supports both training and inference, at 180 teraflops per device. These devices can be connected together into TPU pods that go up to 11 and a half petaflops.
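As a rough sanity check on those two figures, dividing the pod's peak throughput by the per-device peak gives an estimate of how many devices a full pod connects. The sketch below is only back-of-the-envelope arithmetic on the quoted numbers; the function name `devices_per_pod` is made up for illustration, and it assumes peak throughput simply adds up across devices.

```python
# Peak figures quoted in the text above.
DEVICE_TFLOPS = 180.0  # teraflops per Cloud TPU device
POD_PFLOPS = 11.5      # petaflops for a full TPU pod


def devices_per_pod(pod_pflops: float, device_tflops: float) -> int:
    """Estimate the device count of a pod from peak throughput figures.

    Assumes peak throughput is additive across devices (an illustrative
    simplification, not a statement about the hardware layout).
    """
    pod_tflops = pod_pflops * 1000.0  # 1 petaflop = 1000 teraflops
    return round(pod_tflops / device_tflops)


n = devices_per_pod(POD_PFLOPS, DEVICE_TFLOPS)
print(n)  # 11500 / 180 is about 63.9, which rounds to 64
```

Under that assumption, the quoted numbers are consistent with a pod of about 64 devices, since 64 x 180 teraflops is 11.52 petaflops.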

Say you’re training an image recognition model such as ResNet-50, which is the standard benchmark right now and was state of the art not too long ago. If you want to train it to 75% or 76% accuracy, which is what you’d expect from publications on the subject, that might have taken you days a few years ago when the paper was published. Now that’s down to about 12 and a half hours on one of these Cloud TPUs. And on the full TPU pod, you can do it in less than 12 and a half minutes. So I bet you’re all wondering: how can I get my hands on one of these things?

Well, they’re in beta right now, so you have to request quota at the moment. But soon, Google will lift that requirement, and you’ll be able to just fire up a Cloud TPU as infrastructure, just like you would a virtual machine.

You go to cloud.google.com/tpu and fill out the quota form. Or if you’re already in touch with Google Cloud in some other way, just talk to your contact there.

Google also has the TensorFlow Research Cloud, which it is making available at no cost, through an application process, to top ML researchers in the community. What Google is really trying to do here is accelerate the rate of progress in machine learning, and Google knows that performance is one of the main gating factors. A lot of the amazing results you’ve seen across vision, speech, language, robotics, and other areas have really been driven by the massive increase in the amount of available computation and the reduction in its cost. So Google is trying to make ML acceleration universally accessible and useful: affordable for everybody and available at scale in the Cloud.
