Transcript
00:00Peter, May, congratulations.
00:02What a lineup.
00:04Yann, I want to start with you.
00:05I may be over-simplifying,
00:08but this is my understanding.
00:09You are now squarely focused after your time at Meta
00:12on world models.
00:14Because, and here's possibly where I'm simplifying,
00:16you think that frontier LLMs
00:19are not going to produce the kind of AI
00:21that is going to be globally useful
00:24at the scale and level that we need it to be.
00:27If that is the case,
00:29help frame it for us.
00:30If, for example, we're thinking of a young infant,
00:33what is it that a young child can do
00:36and understands about the world
00:37that maybe LLMs, even those right at the frontier,
00:40cannot yet do?
00:41Okay, first of all, LLMs are useful.
00:45They're not a path towards human-level intelligence,
00:49but they are useful.
00:49There's a lot of products,
00:51including computer products,
00:53including AI systems,
00:54that are very useful,
00:56but that are not a path to human-level intelligence.
00:59And so that's the confusion,
01:00I think, that we need to pay attention to.
01:04Let me give you a very simple number.
01:08The biggest LLMs are trained, or pre-trained
01:12at least, on the totality
01:14of all the publicly available text on the Internet.
01:17That's about 20 trillion words,
01:1930 trillion tokens.
01:20A token is about three bytes.
01:22Do the arithmetic,
01:23it's about 10 to the 14 bytes.
01:25This is the amount of data
01:28a four-year-old has seen through vision
01:30during four years.
01:33The text, though,
01:34would take 400,000 years to read, right?
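The arithmetic quoted above can be checked with a short back-of-envelope script. The token, byte, and word counts come from the transcript. The reading speed, reading hours per day, waking hours, and visual data rate are assumptions added here for illustration, so the results are only order-of-magnitude estimates.

```python
# Figures stated in the transcript.
TOKENS = 30e12           # about 30 trillion tokens
BYTES_PER_TOKEN = 3      # about three bytes per token
WORDS = 20e12            # about 20 trillion words

# Assumptions NOT stated in the transcript (illustrative only).
READING_WORDS_PER_MIN = 250    # assumed reading speed
READING_HOURS_PER_DAY = 8      # assumed daily reading time
WAKING_HOURS_BY_AGE_4 = 16_000 # assumed waking hours in four years
VISUAL_BYTES_PER_SEC = 2e6     # assumed visual input rate (~2 MB/s)

# Size of the text corpus in bytes: roughly 10**14.
text_bytes = TOKENS * BYTES_PER_TOKEN

# Years needed for one person to read the corpus.
reading_hours = WORDS / READING_WORDS_PER_MIN / 60
reading_years = reading_hours / READING_HOURS_PER_DAY / 365

# Visual data seen by a four-year-old under the assumed rate.
visual_bytes = WAKING_HOURS_BY_AGE_4 * 3600 * VISUAL_BYTES_PER_SEC

print(f"text corpus:   {text_bytes:.1e} bytes")
print(f"reading time:  {reading_years:,.0f} years")
print(f"visual input:  {visual_bytes:.1e} bytes")
```

Under these assumptions, the text corpus and four years of visual input both land near 10 to the 14 bytes, and the reading time comes out at several hundred thousand years, consistent with the figures in the talk.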
01:37So there is enormously more data
01:39from sensory input like vision or touch
01:43or everything else
01:45than there could ever be through language.
01:47Language is a very approximate,
01:50reduced, quantized, simplified description of the world.
01:54And LLMs can only deal with discrete sequences of symbols,
01:58but the world is much more complicated than language.
02:02So, this is very well known in computer science.
02:05It's called Moravec's paradox.
02:06It's the fact that computers can do tasks
02:09that are seemingly complicated for humans,
02:11like solving equations and computing integrals
02:14and even passing the bar exams
02:17or answering any question.
02:19But when it comes to just, you know,
02:21grabbing an object like this without breaking it,
02:23that's kind of much more challenging.
02:25And so there's a lot of tasks that, you know,
02:27four-year-olds can do that we can't do with robots.
02:30That's why we have those systems that, again,
02:33you know, prove theorems and write code, thanks to you.
02:37But we don't have domestic robots.
02:42We don't even have level-five self-driving cars.
02:44I mean, we have them, but we cheat.
02:47So, you know, dealing with the real world
02:49is just much more complicated.
02:50That's what my company is really trying to do.
02:54That's what I've been trying to do for 15 years,
02:55actually more than that.
02:57But, you know, kind of partially succeeding
02:59over the last five,
03:00and so it made sense to create a company around that.
03:02So one of the things this makes me think about
03:04is the huge needs for data and compute.
03:07And the market is clearly betting
03:08that more data, more compute, scaling continues.
03:14If you succeed,
03:15is there going to be more demand for data and compute?
03:17Or is the market misallocating right now
03:20around the data and compute story?
03:21What is your take on whether or not
03:23the market is getting this right?
03:24There is a need for compute.
03:26Nature abhors a vacuum.
03:27But, that said, the model that we are training
03:31essentially tries to understand the world.
03:33You mentioned the phrase world model.
03:37What is a world model?
03:38Given an idea of the state of the world at time t,
03:40and given an action that you imagine taking,
03:44what is going to be the state of the world at time t plus one
03:47after you've taken this action?
03:48If you have such a model, predictive model,
03:51you can predict what the consequence
03:54of a sequence of actions is going to be.
03:56And so, if you want a robot, an agent, or whatever
03:59to accomplish a task,
04:00you give it a cost function that measures
04:02to what extent the task has been accomplished.
04:05And then you search, you plan,
04:07you search for a sequence of actions
04:09that will accomplish this task, internally.
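The planning loop described above (predict the next state from the current state and an action, score the outcome with a cost function, search for the best action sequence) can be sketched in Python. This is a toy illustration under stated assumptions, not the speaker's system: the one-dimensional integer state, the three actions, and the names `world_model`, `cost`, and `plan` are all hypothetical, and exhaustive search stands in for whatever search a real planner would use.

```python
from itertools import product

# Hypothetical action set: step left, stay put, step right.
ACTIONS = (-1, 0, 1)


def world_model(state: int, action: int) -> int:
    """Predict the state at time t+1 from the state at time t and an action."""
    return state + action


def cost(state: int, goal: int) -> int:
    """Measure how far the task is from being accomplished (0 means done)."""
    return abs(goal - state)


def plan(state: int, goal: int, horizon: int) -> tuple:
    """Try every action sequence of length `horizon` and return the cheapest."""

    def predicted_cost(actions) -> int:
        s = state
        for a in actions:
            s = world_model(s, a)  # imagined rollout, nothing is executed
        return cost(s, goal)

    return min(product(ACTIONS, repeat=horizon), key=predicted_cost)


# Plan four steps ahead from position 0 toward position 3.
best = plan(state=0, goal=3, horizon=4)
print(best)
```

The point of the sketch is that every candidate sequence is evaluated inside the model before any action is taken. Exhaustive search grows exponentially with the horizon, so it only works for toy problems like this one.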
04:11We do this all the time as humans.
04:13We use the power of our prefrontal cortex,
04:16which is a world model,
04:17to predict what's going to happen
04:20as a consequence of our actions.
04:21That's what allows us to plan and to live in the world.
04:24That's what allows us, when we learn to drive,
04:27we learn to drive in a few hours of practice
04:29when we are teenagers.
04:31We have millions of hours of training data
04:33of people driving cars.
04:34We still cannot train a system
04:37to just clone the human behavior
04:40of driving reliably,
04:42which is why you can't buy a level 5 self-driving car.
04:46So, how is it that, you know,
04:49a 17-year-old can learn to drive
04:50in a few hours of practice?
04:53If the 17-year-old drives near a cliff,
04:57the mental model that we have,
04:59the knowledge of the world,
05:00tells us, if I turn the wheel to the right,
05:02the car will run off the cliff.
05:03Nothing good is going to come out of this, right?
05:05So, we don't do it.
05:07So, the ability of predicting the effect of our actions
05:10is what allows us to learn quickly
05:12and also to accomplish new tasks
05:15that we have never been trained to accomplish.
05:17That, I think, is essential,
05:20not just for the future of AI,
05:21but even for building agentic systems.
05:24You should know something about it, right?
05:26Like, I cannot imagine, he's on it,
05:29but I cannot imagine how you can possibly
05:32build agentic systems that are reliable
05:35unless they can predict the consequences of their actions.
05:39And LLMs simply do not do it.
05:40They can apply recipes, but blindly.
05:45Well, May, this seems like a good point to bring you in,
05:47and then, of course, we'll bring in Peter.
05:48May, at Writer, you build your own models,
05:51and you deploy them.
05:52So, my question for you is,
05:53was it more difficult building that model
05:55or persuading the first Fortune 500 company to adopt it
05:59and actually get it to use that system
06:02on their enterprise data and do real work?
06:06It is interesting that at every era of the last five years,
06:09there has been a gap between what the models can do,
06:13what the capabilities are,
06:15and what the enterprise can trust
06:17to actually put into production.
06:19And especially now with agentic systems,
06:22enterprises want autonomy,
06:24but they want it to be governed, not unchecked.
06:26And what we've been doing for the past five years
06:29is, yes, building the models,
06:30but also the harnesses around the models
06:32that allow us to collect the recipes, right,
06:35that exist across the business,
06:36but more crucially, actually allow us to collect
06:39all of those corner cases that sit in people's heads.
06:42And a lot of what we've done for the Fortune 500
06:45is be able to really iterate and innovate
06:48on memory systems that allow agents
06:52to collect new information
06:54while they're operating in a highly governed,
06:57secure, regulated context
06:59that improves the performance of these agents
07:02when, you know, you don't have the recipes
07:06or the workflows or the SOPs really well defined.
07:09What has to happen?
07:10Maybe we're there already.
07:11Maybe you're starting to see that.
07:12In terms of moving from enterprise AI
07:16being at the pilot stage
07:18to being broadly used as infrastructure
07:21across the enterprise.
07:22Well, I would say there's three stages,
07:25even in a mature organization today.
07:27There's the successful pilots.
07:29There's the stuff you've gotten into production.
07:31You can, you know, brag to your board about,
07:33but then there's still a big gap
07:34between stuff that people have
07:36or say that they've got in production
07:38and what's really scaled.
07:39And even when we've got, you know,
07:41really successful agents in production
07:44at a pharma company or at a bank,
07:46scale for us is the whole process now looks like this.
07:51And you are also able to have
07:54the much more autonomous capabilities
07:57unleashed into the enterprise.
07:58And, you know, there is an org design gap for sure
08:03in that the vast majority of organizations
08:05simply don't have the talent on the business side
08:08to orchestrate and oversee these systems,
08:11build the evals, build the launch plans, etc.
08:14Nor do they have the willpower
08:17to change the roles and responsibilities
08:20of the people who were, you know,
08:22doing it the old way.
08:23And so there's certainly a gap on the technology side.
08:26You know, we are seeing a real distrust
08:29of what the foundation model companies
08:32are bringing to market
08:33because agentic explosion
08:36means a token cost explosion.
08:39And agents are using 1,000 times
08:41the token budgets of chatbots.
08:44And so really being able to have the harnesses
08:46that send tokens to the right model, right,
08:49based on the task,
08:50based on the ROI of that task
08:52are also things that we've been building
08:54as the capabilities of our models
08:56have, you know, covered a greater surface area.
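The idea of sending tokens to the right model based on the task and its ROI can be sketched as a small router. Everything here is hypothetical and for illustration only: the model names, capability scores, prices, and task values are invented, and this is not a description of Writer's harness. The two example token budgets differ by a factor of 1,000 to mirror the chatbot-versus-agent ratio mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    capability: int                 # hypothetical 1-10 capability score
    usd_per_million_tokens: float   # hypothetical price


@dataclass(frozen=True)
class Task:
    name: str
    required_capability: int
    expected_tokens: int
    value_usd: float                # estimated value of completing the task


# Hypothetical model catalogue.
MODELS = [
    Model("small", 3, 0.20),
    Model("medium", 6, 2.00),
    Model("large", 9, 20.00),
]


def route(task: Task, models=MODELS):
    """Pick the cheapest capable model, or None if the task is not worth its cost."""
    capable = [m for m in models if m.capability >= task.required_capability]
    if not capable:
        return None
    cheapest = min(capable, key=lambda m: m.usd_per_million_tokens)
    run_cost = task.expected_tokens / 1_000_000 * cheapest.usd_per_million_tokens
    return cheapest if run_cost < task.value_usd else None


chat = Task("faq_reply", 2, 2_000, 0.05)
agent = Task("claims_review", 8, 2_000_000, 100.0)
low_value = Task("bulk_tagging", 8, 2_000_000, 10.0)

for task in (chat, agent, low_value):
    chosen = route(task)
    print(task.name, "->", chosen.name if chosen else "skip")
```

Checking only the cheapest capable model is enough here, because any pricier model would fail the same ROI test. A production router would also need to weigh latency, quality measurements, and governance rules, which this sketch leaves out.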
08:59Agentic explosion.
09:00So, Peter, you caused quite the agentic explosion
09:02with OpenClaw.
09:03I still remember the images of people lining up
09:06in China outside stores,
09:07including grandparents,
09:09to get OpenClaw downloaded
09:11on their iPhones and iPads.
09:13Was there a moment...
09:14There must have been a moment
09:16when you thought,
09:16wow, this is going to be big.
09:20Was there that moment for you?
09:21You know, for me,
09:23that moment was in late November.
09:27Like, there was, like,
09:28where I sent a voice message
09:30and suddenly it just worked,
09:31even if I didn't build it that way.
09:34But then I tried to, like,
09:38explain to people and I failed.
09:40You know?
09:41Like, when I...
09:43When I invited people one-on-one
09:45in, like, a WhatsApp group chat,
09:47they always were, like, blown away
09:48or somewhat scared.
09:51But on Twitter, it was very muted.
09:54So it took me a few weeks
09:56to, like, figure out
09:57how can I show this to people?
09:59Yeah.
10:00Until one day,
10:02I created a Discord server
10:05and just everyone could come,
10:07could, like, watch me work.
10:08I was kind of, like,
10:09building OpenClaw
10:10with OpenClaw at the time already
10:12and people could interact
10:13with the models
10:14and people tried to do malicious things
10:16and it wouldn't work
10:16and suddenly people, like,
10:19bit by bit,
10:20started feeling the magic.
10:22And the funniest story is
10:24I did this the whole night.
10:26You know, it was, like, very exciting
10:27and then at 7 a.m.,
10:29I was ready to go to bed.
10:31I pressed Command C,
10:35you know, to close everything,
10:36went to bed,
10:38slept for, like, 10 hours.
10:40And then I woke up
10:41and there were, like,
10:42800 messages.
10:44Like, I forgot that I built a system
10:46in a way that was resilient
10:47and, like, while I was walking to the bed,
10:49it would restart
10:50and then happily answer
10:51to everyone in the world.
10:55Exciting times.
10:55Well, that's pretty wild
10:58and that takes me
10:58on to my next question
10:59quite nicely,
10:59which was,
11:00can you give us an example
11:01of a use case with OpenClaw
11:03where it made you laugh
11:05and where it made you
11:06maybe a little nervous?
11:14One of my friends,
11:15he has his personal claw
11:18and he has one at work
11:20and then at work,
11:21he pulled up all the meetings
11:23he has for the week
11:24and then talked to the agent
11:26to connect to his personal claw
11:28to get his blood sugar levels
11:30per hour
11:31and then the two agents
11:32connected it
11:33to tell him
11:34which person stressed him out
11:35the most.
11:38And the one that made me
11:39a little bit nervous
11:40is when I was talking
11:42to a friend
11:42and he was like,
11:44you know,
11:44I love OpenClaw.
11:45I was like,
11:46what do you use it for?
11:47You know, my mom,
11:48she always complains
11:50that it takes me so long
11:52to answer.
11:53So I built a claw
11:54to answer.
11:57And she only noticed
11:58because I answered too fast.
12:02That's when it's too good.
12:05Peter,
12:05Yann,
12:06thank you very much indeed.
12:07Really fantastic
12:07on the agentic piece,
12:08the enterprise AI story,
12:10of course,
12:10and the prospect
12:11of world models
12:12and what that could deliver
12:13and particularly
12:14in combination
12:14with LLMs.
12:16Yann,
12:16Peter and May,
12:17thank you very much indeed.
12:18Take your trophies
12:19and thank you.
12:20Congratulations.
