00:00Great reporting, interesting story. It does feel like, though, increasingly, we're in this world
00:04where these big tech giants want to make and are making their own semiconductors. Tell us about
00:12Google, because they've been, what, designing chips for over a decade. But what's interesting
00:16about what they are doing right now? So first of all, you've mentioned in a matter of months,
00:22let's talk about what's likely to happen in a matter of days. Later this week is Google's
00:27Next conference in Las Vegas. And as we reported in our story, they are likely to announce at that
00:34conference that they are releasing a version of their chip, a TPU, for inferencing workloads. So
00:40inferencing is when you've already got an AI model that's trained, and you need to run it. You need
00:46to enable it to answer people's questions quickly. So we're reporting they're probably going to announce
00:51a TPU for inferencing. And that would be, sort of, as you mentioned, individual companies
00:57doing their own things. You know, some of what's gone on in the last couple of months also is this focus on,
01:03you know, low latency, or very fast inferencing. So if we're getting into a world where people are
01:08looking at agents and things that need to be more conversational, you want to be able to answer
01:12those queries quickly. And so last month, we had NVIDIA, which is obviously the giant, the dominant
01:19player, releasing a fast inference chip based on the technology they acquired from McGraw. I know,
01:27you know, Cerebras just filed their IPO. They're also in this fast inferencing space. And so now we're
01:33expecting that Google is going to get more fully into that space with a chip that's specific for that
01:38task. Whereas up until now, as NVIDIA had in the past, Google has been, you know, using one set of
01:45TPU models for both training and inferencing. So that's what we're expecting, you know, later this
01:52week. Google's chief scientist, Jeff Dean, you know, told me in an interview, look, you know,
01:56there's all this demand growing for fast inferencing. And he said, quote, it now becomes sensible to
02:01specialize chips more for training or more for inference workloads. You know, that said, the head of
02:08their chip division, Amin Vahdat, declined to say specifically what they're going to do later this
02:12week. But, you know, he did say that more will be shared, you know, quote, in the relatively near
02:16future. Okay. So before we get to the relatively near future, we're here in the now and we have what
02:21you've reported, which is just absolutely amazing. And I'm curious about NVIDIA and the competition
02:27that this could or potentially would pose to NVIDIA, which, as you mentioned, and we all know,
02:32is the market leader when it comes to this stuff.
02:35So I think you have to remember that it's not just that NVIDIA is the market leader. Everyone
02:39uses NVIDIA, and that everyone includes Google. You know, Google's latest Gemini model, however,
02:46which has been very well reviewed and very, you know, successful, was trained on TPU and uses
02:53TPU for inference. Now, last week, there was a podcast with NVIDIA CEO Jensen Huang and Dwarkesh Patel,
03:00in which Jensen talked about the TPU threat and said, look, the growth in TPU is all Anthropic,
03:06which is, obviously, a huge Google TPU customer. But, you know, Jensen's argument was,
03:12look, this is not a phenomenon, it's a one-off customer. You know, I spoke to Demis Hassabis, the Google
03:18DeepMind CEO, and, you know, he pushed back on that. He said a lot of customers do want to do
03:22what Google does, which is use both GPU and TPU. And in fact, the problem for Google is that they don't
03:28quite have enough supply to meet that. We talk in the story about some other customers, about
03:33Meta, about Citadel Securities. But, you know, Demis' comment is that they don't have enough
03:38supply to meet all the interests. And so for now, at least, they're prioritizing the really high-end,
03:44you know, frontier model teams who he said really are, you know, best able, best skilled to take
03:51advantage of the, you know, the advantages that TPU brings. And then, you know, we talked a lot about
03:55what those advantages are. I think from Google's point of view, what helps them is the fact that
04:01they co-design the TPU chips with both the hardware team and the folks at Google who work on those
04:07leading AI models. And that gives them a view into what you need to successfully, you know,
04:14design a chip that trains and inferences great models. It also helps them see problems more quickly,
04:19you know, they argued to me. Does NVIDIA need to be worried here, Dina?
04:26NVIDIA is still the dominant player. You know, everyone that I talk to expects them to be the
04:34dominant player for the foreseeable future. I think there's a different
04:40metric of success for Google TPU and NVIDIA. NVIDIA's success looks like continuing to dominate the
04:48market, starting to pick up some of these newer growing workloads in a significant way, such as
04:54the fast inference. TPU is coming from a place of being so much smaller in terms of share. It's
05:00really more a question of, you know, can they pick up some share, but also, even more than that,
05:05they have to decide what they want to do and what they want to be. NVIDIA sells chips for
05:11companies to use in their own data centers in whatever way that they want. Google has thus far almost
05:17exclusively focused on giving you access to TPUs through their own cloud. Now they're starting to
05:23experiment with some other models. You know, Anthropic, for example, has been able to, you know,
05:29do some experiments with Google where they're running TPUs for Anthropic in data centers that
05:34Anthropic controls rather than through Google Cloud. But Google has to figure out, do they want to be a
05:39chip seller like NVIDIA that's not tied to their cloud? And then also with that limited supply,
05:45how do they allocate between customers who want a certain amount and not everyone can get what they
05:50want, as well as all of the Google workloads that are using the TPUs, you know, including Gemini,
05:57including Google search. You know, that's also a question. They sort of have to figure out where
06:01they want the business to go. It's not quite the same thing as NVIDIA. Why? And just, we only have
06:05about 30 seconds left, Dina, but why, sort of, this identity question, like why wouldn't Google
06:11try to do both things really well?
06:14I think it's just a question of where they want to go and where they want to invest. NVIDIA is,
06:19you know,
06:20a chip company that also does some AI models, and some of NVIDIA's AI models are
06:25increasingly excellent. They're very open source focused, but they're a chip company coming at that
06:29space. Google is a company in a lot of different areas, including, you know, cloud and search and AI
06:35models before they're a chip company. I think they have to figure out what the chip portion is,
06:40you know, where that fits in and how it helps them.
06:43Sounds like TBD, and let's see what we get out of the announcement. Just real quickly,
06:4710 seconds: is the devil in the details here, or what?
06:52I think we'll have to see what they say and when they're available. And, you know, again,
06:57this is, you know, Next starts on Wednesday. So keep your eyes peeled for what they say.