Transcript
30 hours straight is how long it can code on its own. What are the technical feats needed to go that long, where humans definitely could not, unless there's a whole lot of caffeine involved?

Good morning. One of the main advances we made was around memory and what we call context management. If you think about how a human works over longer periods of time, you're writing things down, making sure you can always pick up where you left off when you come back the next day. With Claude Sonnet 4.5, we did a lot of work on that memory management. The model, if you think about it, writes down what it's doing, keeps track of its state, and if it needs to backtrack, it's able to keep going. That's how it stays coherent for a much longer period of time than any of our other models.

How much have you managed to lower, therefore, those inaccuracies, or more broadly the sense that these models make things up? Hallucination has always been the key issue here, and it has limited agentic AI adoption.

Yeah, this model, besides being our most powerful, is also our safest and most coherent. It has the lowest hallucination rate and is the least susceptible to things like jailbreaks. And I think that matters a lot. I tell my product team all the time: it's no use going for 20 or 30 hours if you're making mistakes along the way. Having it be accurate and produce good code is the prerequisite, and then you can focus on scaling up the time horizon.
Mike, can we talk a bit about the audience for Claude Sonnet 4.5? The real emphasis from Anthropic, from the early days, was enterprise customers as opposed to a direct-to-consumer play. But this field of tools for developers is expanding, and it's arguably more competitive. Who are you hoping uses this?

We've taken a business focus, but that also manifests in the prosumer space. We have a lot of what we call power users, who might be developers or might just be early adopters who want to bring AI to their work. One of the things Sonnet 4.5 can do, along with writing code, is create really professional-looking Word, PowerPoint, and Excel documents. It actually uses the same coding capabilities under the hood, not to write code but to produce documents. And that sort of capability means we're starting to see adoption in the enterprise as well.
Mike, you and I have discussed this in the past. We know Mike Krieger as the former CTO of Instagram. And what I'm seeing right now among your peers is the reporting on OpenAI, and what Meta is doing in social media and video. Is that a direction you want the product team to take Claude in?

We're focused much more on the productivity use case. When I think about our roadmap, it's very much: how do we take work off people's hands, or how do we accelerate folks and make the work the best it can be? How do we automate your work in the browser? Much more on that productivity side of things. I don't think you'll see us play very much in that entertainment space.
Mike, that productivity perspective has come under some scrutiny recently. Think about the MIT report everyone's suddenly talking about: 95% of the pilots were basically failing out there in the wild. How are you making sure that enterprises adopt your products but actually see the productivity gains from them?

This is really important. If AI gets brought into the workplace without the right tools around it or the right enablement, what you end up with is disillusionment a couple of months later: folks aren't adopting it, or it helped me a little bit, but not enough. So we put a lot of emphasis on making sure the work is actually good. You might hear this word online, "slop," where AI is creating work that just is not very good. We're trying to produce the anti-slop work that maybe gets you 80% of the way there, but it's the 80% that then lets you complete the work in a way you're proud of, rather than: oops, it did something, but now I feel like I have to start over because it didn't really help. I think that's the really key piece for enterprise adoption.
And therefore, does it remove the need for so many people? Or, there's still this argument: it can augment, but will it start to replace, Mike?

We think a lot about what the comparative advantages of people are relative to AI. There's still a lot of relationship building and trust, critical analysis, and strategy that really comes from the human side of things. So we try to design tools that, as much as possible, play up those parts of the human-AI interaction, knowing that there will be labor shifts that are almost inevitable, but hoping that if we design our products well along the way, we can maximize both people's understanding of AI and their use of it in a complementary way.
Mike, OpenAI is holding Dev Day next Monday; it's probably on your calendar for peripheral awareness. But I'm very conscious that you're speaking to us 24 hours after the news of Claude Sonnet 4.5 came out. Have you any data on the reaction to it, early demand, and where that's coming from?

It's been really interesting how quick people are to adopt a new model. By about 1 p.m. yesterday, we already had more usage of Claude Sonnet 4.5 than all of our other models combined, which really speaks to the eagerness of the startups building on top of our models, as well as early adopters. On day one you saw GitHub and Cursor and Windsurf, and many of these products that build on top of our models, wanting to incorporate Claude Sonnet 4.5. So we had this really early crossover moment as well.
That is interesting data. Correct me if I'm wrong, but this model is running on Project Rainier, right? Is that, operationally and infrastructure-wise, where the training and now the inference is being done?

We do our training and inference across our partnerships with Google and Amazon, but we have a significant part of this model being served from Amazon now as well. And we're seeing a lot of growth on AWS Bedrock.
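For developers, "served on AWS Bedrock" means calling the model through Bedrock's runtime API. A minimal sketch using boto3's Converse API might look like the following; the model ID is a placeholder you would replace with the real Sonnet identifier from the Bedrock console, and the network call itself is shown only as a comment:

```python
def build_converse_request(model_id, prompt, max_tokens=512):
    """Build the keyword arguments for bedrock-runtime's converse() call."""
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens},
    }

# Hypothetical usage (requires AWS credentials and the boto3 package):
#   import boto3
#   client = boto3.client("bedrock-runtime", region_name="us-east-1")
#   resp = client.converse(**build_converse_request("<sonnet-model-id>", "Hello"))
#   print(resp["output"]["message"]["content"][0]["text"])
```

The same request shape works for any Bedrock-hosted model, which is part of why third-party products can swap a new Claude version in on day one.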
Just going to that infrastructure layer: you're obviously the product visionary here, but you need the energy and the compute to bring your products to life, and many worry that we're in some sort of bubble cycle around AI. How do you think about that as you drive this business forward, Mike?

There's a combined need to scale up both on the training side and on inference. As we've scaled, especially with our business arrangements and the companies building on top of Sonnet, I think we now have a forward-looking perspective on what our inference needs will be. That will let us go out and secure the kinds of compute deals we need to both fuel the training and have that revenue-generating inference side as well.
We get so focused on the compute needs of the United States, but we've been talking a lot about compute in Europe and how it's scaling in the UK. From your adoption, how are you seeing things differ globally, Mike, among those actually deploying Sonnet 4.5 and the latest models?

We see this a lot in terms of our global footprint, which is something we started expanding earlier this year. For our rollout of Trainium 2, the chip Amazon has built for AWS and that we use pretty extensively for our Claude models, a lot of that deployment is actually international. When I go to Europe, for example, I hear a lot of questions about data locality and making sure that inference is happening in local data centers. The only way we're going to be able to do that is to have that international footprint of these chips. And you've seen the same in APAC.
Mike, I'm going to ask you a question about the talent wars, but I'm just going to appeal to you to be honest with me about how big a factor it is or isn't for you right now on the product team at Anthropic. In particular, I'm looking at the pace at which OpenAI and Meta are putting things out. From your experience, what's the talent situation right now?

I'm seeing much more of that talent back-and-forth happen on the research side in general, a little less on product, though there have been some key hires where that's been the case. One thing that's been a positive surprise, or just an outcome of how mission-oriented a lot of the Anthropic team really is, is that it's affected us very minimally compared to the back-and-forth you're seeing among some of the other frontier labs, which is very encouraging. Of course, we have to continue to build a great culture and maintain that mission alignment. But so far, it's affected us minimally.
If we take Sonnet 4.5 as the case study, what were the types of roles you needed to bring in to get the release out?

People think of research and model science as fairly cut and dried. I actually think there's a lot of art and taste to it as well. You're making a lot of decisions from a research engineering perspective: what are the tasks the model needs to improve on? How will it improve on them? How will we know that it's improving? So a lot of that reinforcement-learning post-training piece is the key shape of what we really thought about for Sonnet 4.5.
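The "how will we know it's improving" question is, at its core, an evaluation harness: score each model version against a fixed task suite and track the pass rate over time. A toy sketch follows; the task and the candidate solutions are invented for illustration, and real coding evals execute generated programs against unit tests in a sandbox:

```python
def pass_rate(candidate_fn, cases):
    """Fraction of test cases a candidate solution gets right.

    candidate_fn: a model-generated solution under evaluation.
    cases: list of (input, expected_output) pairs.
    """
    passed = sum(1 for x, want in cases if candidate_fn(x) == want)
    return passed / len(cases)

# Two hypothetical model "versions" solving the same toy task
# (reverse a string); the newer one fixes an empty-string edge case.
old_model_solution = lambda s: s[::-1] if s else None  # wrong on ""
new_model_solution = lambda s: s[::-1]

CASES = [("abc", "cba"), ("", ""), ("ab", "ba")]
```

Tracking `pass_rate` per model version over a fixed suite turns "is it improving?" into a number rather than an impression, which is what makes the post-training decisions measurable.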