Transcript
00:00What are they going to tell us in Las Vegas this week?
00:02What is the future of the TPU program?
00:05So what we're reporting is that they're probably going to announce an inference chip, for running AI models after they've been trained.
00:14Thus far, they've been doing training and inference on one chip.
00:17We're expecting and reporting that they're probably going to announce something separate just for inference.
00:22Google chief scientist Jeff Dean told me in an interview that, the way inference demand is growing, quote, "it now becomes sensible to specialize chips more for training and more for inference workloads," and they're looking at a bunch of things.
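To make that split concrete, here is a minimal JAX sketch, with toy shapes and names that are purely hypothetical, of how the two workloads stress hardware differently. It illustrates the general idea, not any actual TPU design.

```python
# A toy two-layer model; all names and shapes here are illustrative.
import jax
import jax.numpy as jnp

def forward(params, x):
    h = jnp.tanh(x @ params["W1"])
    return h @ params["W2"]

def loss(params, x, y):
    return jnp.mean((forward(params, x) - y) ** 2)

key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)
params = {"W1": jax.random.normal(k1, (512, 512)),
          "W2": jax.random.normal(k2, (512, 512))}

# Training step: forward pass PLUS backward pass plus a weight update.
# It wants large batches, memory for activations, higher-precision
# accumulation, and fast chip-to-chip links for gradient exchange.
grads = jax.grad(loss)(params, jnp.ones((256, 512)), jnp.ones((256, 512)))
params = jax.tree_util.tree_map(lambda p, g: p - 1e-3 * g, params, grads)

# Inference step: forward pass only. What dominates instead is latency
# per request, memory bandwidth for streaming weights, and cost per
# token, which is what an inference-only chip can optimize for.
y = forward(params, jnp.ones((1, 512)))
```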
00:35Their chip chief, Amin Vahdat, declined to tell me specifically whether they're going to announce that this week, but said we'd be hearing more soon.
00:42And this continues a broader trend.
00:45NVIDIA announced a fast inference chip from what they'd acquired from Groq.
00:50You mentioned the Cerebras IPO before.
00:52That's also, at this point, really a low-latency, fast-inference play.
00:57Meanwhile, the TPU play has been extraordinary, with adoption by companies many would call rivals, Meta likely wanting in on the Google-made chips.
01:08So how are they broadening, and where do you think they're going to be getting supply from?
01:12There's a lot of reporting in the market, for example, that maybe they turn to Marvell versus
01:16Broadcom.
01:17I know you can't comment on that directly, but how are they thinking about their own supply
01:20chain?
01:21Look, I think for them, supply is a problem.
01:25I was talking to Google DeepMind CEO Demis Hassabis, and he mentioned that as well.
01:29Look, you have Meta, which signed what they told us is a multi-billion-dollar, multi-year deal to use TPUs.
01:35They're just getting their first big tranche of them.
01:38They're trying to figure out what they're going to do with them.
01:40Anthropic has a huge deal.
01:42Citadel is going to talk about how they're using TPUs at the Google Next conference this
01:46week.
01:47And what Demis was telling me was that, contrary to what Jensen Huang said last week on the Dwarkesh Patel podcast, namely that it's really just Anthropic that wants them, they actually have a lot of people who are interested.
02:00They don't have enough supply.
02:03Demis was saying to me, look, what they end up doing is prioritizing the top-of-the-line frontier-lab customers, because those are the customers who are most capable of taking advantage of what the TPU has to offer.
02:15More broadly, I think the appeal of the TPU play to these big frontier labs is that Google is the only maker of one of the top frontier models that also makes AI accelerator chips in large volume.
02:34OpenAI has said they will as well.
02:36I think that's why it's worth lingering for a minute on why the TPU is useful.
02:41So you explained really well that, to this point, the TPU, or Tensor Processing Unit, has basically been a general-purpose accelerator for training or inference.
02:50But when you put it side by side against NVIDIA's latest GPU or other inference-specific chips, as many people try to do, what is it about Google owning the architecture, about designing it specifically for the inference use case, that makes it better?
03:04Is it money?
03:05Is it power?
03:06What do we need to know?
03:07Google's argument is that they know what you need for training and running a top-of-the-line model.
03:16There are two things in the last couple of months that have really increased the interest
03:21in Google TPU.
03:22One is the Anthropic deal, which is a validation of the technology.
03:27The other was the release of the latest version of Gemini, which was trained, and is running its inference, on Google TPUs, and the strong reviews that it's gotten.
03:36Google uses data, requests, and information from their own AI model teams to figure out what they need to prioritize and, frankly, what they need to fix in the chip business.
03:48Working together, they figured out that, for example, utilization was too low on those chips when you were using them for reinforcement learning.
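To see why a reinforcement learning workload can leave an accelerator underused, consider a hedged JAX sketch of the pattern; the model and sizes below are made up for illustration.

```python
# Illustrative only: why an RL loop can leave an accelerator idle.
import jax
import jax.numpy as jnp

W = jax.random.normal(jax.random.PRNGKey(0), (4096, 4096))  # toy "model"

def decode_one_token(x):
    # Autoregressive generation is one (1, d) x (d, d) matvec per token.
    # Arithmetic intensity is low: the chip mostly waits on memory
    # traffic for the weights, so its math units sit underutilized.
    return jnp.tanh(x @ W)

def update_step(batch):
    # The learning phase is one large (B, d) x (d, d) matmul: high
    # arithmetic intensity, so the same chip is suddenly well used.
    return jnp.tanh(batch @ W)

x = jnp.ones((1, 4096))
for _ in range(128):              # generation phase: 128 sequential matvecs
    x = decode_one_token(x)

out = update_step(jnp.ones((1024, 4096)))  # update phase: one big matmul
```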
03:58Demis was telling me they're using that data to figure out how precise the chips have to be, as opposed to where they can save money, and that's a set of data that flows into the Google TPU design team that other chipmakers don't necessarily have.
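The precision-versus-cost trade described here can be sketched in a few lines: cast weights to a cheaper format and measure how far the output moves. The toy below assumes nothing about Google's actual methodology.

```python
# Toy precision experiment; real studies would use production models.
import jax
import jax.numpy as jnp

k0, k1 = jax.random.split(jax.random.PRNGKey(0))
W32 = jax.random.normal(k0, (1024, 1024), dtype=jnp.float32)
x = jax.random.normal(k1, (8, 1024), dtype=jnp.float32)

ref = x @ W32                                   # full-precision reference
W16 = W32.astype(jnp.bfloat16)                  # half the bytes to move
approx = (x.astype(jnp.bfloat16) @ W16).astype(jnp.float32)

rel_err = jnp.linalg.norm(approx - ref) / jnp.linalg.norm(ref)
print(f"relative error from bfloat16 matmul: {rel_err:.2e}")
# If the error stays inside the model's quality tolerance, the lower
# precision is a direct saving in silicon area, memory, and power.
```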
04:13NVIDIA does have a very solid model team, but among the big three frontier-model makers, Anthropic, OpenAI, and Google, Google is the only one making AI accelerator chips at volume right now.