The guest expert analyzes how open-source and closed-source AI models will coexist, explaining their advantages, their risks, and the role of communities in improving tools and collaboration.
Transcript
00:02Today's guest believes open-source and closed-source models will coexist in the world of AI.
00:08Find out what he considers the opportunities and drawbacks to each,
00:12as well as how communities can make AI tools and themselves work better together on today's episode.
00:19Hello, I'm Thomas from Hugging Face, and you're listening to Me, Myself and AI.
00:25Welcome to Me, Myself and AI, a podcast from MIT Sloan Management Review, exploring the future of artificial intelligence.
00:34I'm Sam Ransbotham, professor of analytics at Boston College.
00:38I've been researching data, analytics, and AI at MIT SMR since 2014,
00:44with research articles, annual industry reports, case studies, and now 12 seasons of podcast episodes.
00:51On each episode, corporate leaders, cutting-edge researchers, and AI policymakers join us
00:58to break down what separates AI hype from AI success.
01:06Hey, everyone. Thanks for joining us again, and welcome back to a new season.
01:10Today, I'm lucky to be talking with Tom Wolf.
01:12He's the co-founder and chief scientific officer of Hugging Face.
01:16Tom, great to have you on the show today.
01:18Thanks, Sam. It's a big pleasure to be here.
01:21Let's start with Hugging Face itself.
01:23Some of our listeners may not be familiar with Hugging Face.
01:26Can you give us a brief overview of what the company does and what you do?
01:29Yeah, of course.
01:31Hugging Face is an open-source AI platform.
01:33We give access to all the AI models that are open-source,
01:37which means that, basically, these are the models you can download and run wherever you want.
01:43So, when you use an AI model nowadays, you can choose either to go to ChatGPT,
01:49Anthropic, or Google.
01:50They are the most widely diffused at the moment.
01:53Or sometimes you want to run the AI models on your own data center,
01:57or you want to run them on some specific hardware.
01:59It could be like local hardware, or it could be maybe faster chips,
02:03because you need instant response.
02:04In most cases, you will want to go for an open-source AI model,
02:08which is a model you can basically just download.
02:10There are quite a lot of them.
02:12On Hugging Face, there are close to 4 million of these models at the moment.
02:15There is one new model being published every five seconds.
02:19Some of the most famous ones are the Llama series from Meta,
02:23and one that I think got the most adoption and most visibility recently was DeepSeek,
02:28which was released in January, and kind of crashed the stock market at the same time that it was released.
02:34And so, over the past eight years, Hugging Face has been building this platform,
02:39growing it together with the community of people and teams who are both sharing and downloading models.
02:44This community is now roughly 10 million users and AI builders, as we call them.
02:50And we've expanded as well beyond just model hosting to also host data sets,
02:55which are used to train models, to fine-tune them, to evaluate them.
02:59And more recently, also what we call Spaces,
03:02which are a simple, low-code way to test all of these models.
03:08So, there are a lot of people offering solutions here.
03:11And if I think back on the way technology has developed throughout the history of mankind,
03:16chips came out of places like Bell Labs,
03:18and Intel came along and built fabs for processors.
03:22None of that was open-source.
03:23Why is open-source important here?
03:26I think open-source has been always important in a way.
03:30The thing is, open-source is more often the long game in computer science.
03:34So, if we go back to, for instance, the year 2000 or pre-2000,
03:40when basically Microsoft had the dominant operating system, right?
03:44And Linux was somehow more for fanatics or geeks.
03:47And now, if you fast-forward 20 years later, right,
03:51Linux is really the basis of all enterprise software and all enterprise cloud, basically.
03:57You almost always run them on some version of Linux.
04:00Even macOS, which is probably the most widely diffused on consumer laptops nowadays
04:07and one of the largest competitors to Windows,
04:10is itself based on a Unix core.
04:13So, there is this trend, which is open-source has some advantages
04:19that make it extremely appealing in the long term.
04:22Obviously, in the short term, you can go faster with closed-source,
04:25and that's also what we see with closed models.
04:27You can iterate faster.
04:30You can raise large amounts of capital to train your models.
04:33You can try to grab, you know, the most expensive AI researchers
04:37and pay them huge sums of money.
04:40We've kept pushing a lot for open science.
04:43And just this Tuesday, we published a new model called SmolLM3,
04:48which is an extremely smart model, the best one at 3 billion parameters.
04:53So, it's in the range of size that you can run on your laptop and even on a smartphone.
04:57And we've decided to share, at the same time, all the data, the whole recipe,
05:02all the knowledge on how to build this model.
05:04It's fine for us because we don't make money out of these models.
05:07And we think it's very good because anyone who wants to build a model based on this,
05:12or one that looks a bit like it, or wants to extend this type of model,
05:15now has all the knowledge they need to start.
05:18So, we think open-source can be defined in many ways in AI,
05:22but we think the most radical way is to say you share just everything.
05:25You share the data, you share the code, you share the recipe,
05:27and we even wrote a very long blog post,
05:29which we're probably going to turn into a full-blown paper,
05:32on all the nitty-gritty details of how to build this model.
05:37So, we've kept kind of publishing all of this.
05:40We're even writing a book right now,
05:42which is on how to efficiently train LLMs on GPU clusters.
05:46But we think basically all of this should be really accessible.
05:49But I think, yeah, I don't want to give the impression
05:52that I'm an open-source absolutist.
05:54I think both of them just have interesting advantages and drawbacks,
05:59and I think both of them will generally coexist in AI.
06:03If you compare to hardware, it's an interesting comparison you made, right?
06:08Back at the time when Bell Labs was doing its development,
06:10software was probably much more niche,
06:13and software and hardware were much more tied together.
06:16I think if you compare to hardware, there is some difference,
06:19but there is also an interesting advantage
06:20and point to be made for open-source hardware.
06:23So, that's actually something we have started to do
06:26very recently at Hugging Face in robotics.
06:29I don't know if it's something we want to cover nowadays,
06:31but we just acquired this year an open-source hardware company
06:35in robotics called Pollen Robotics.
06:37And I think there is some definite interest.
06:39The general idea is for me
06:42that a lot of the way you see hardware in the long term
06:46can be reinvented using software.
06:52MIT Federal Credit Union
06:53is a member-owned, not-for-profit financial institution
06:56serving the MIT community and beyond.
07:00Discover a range of financial products
07:02such as checking, savings, personal loans, auto loans,
07:05and home lending.
07:07MIT FCU is here to support you on your financial journey.
07:10Learn more at mitfcu.org slash podcast.
07:15MIT Federal Credit Union
07:17is federally insured by NCUA.
07:20NMLS number 699-225.
07:23Equal housing lender.
07:30So, maybe we'll come to that.
07:31I don't want to go too quickly in a crazy tangent.
07:36Especially, actually, with Hope Jr.
07:38and the robotics initiatives
07:39that you're involved with right now.
07:41I guess you can go ahead and talk about that,
07:43but as you're doing that,
07:44I want to push you a little bit and say,
07:45I don't even know how we define hardware and software anymore.
07:48You've got firmware,
07:50you've got so many layers in the software stack.
07:53And traditionally,
07:55even if we had open source,
07:57like Linux was the example you mentioned,
07:59it still may have built on top of
08:01firmware that was vendor-specific and proprietary.
08:03So, you know, even within those stacks,
08:05it becomes complicated.
08:07But so, yeah, talk about hardware.
08:08I'd like to talk about Hope Jr.
08:10I think this is very interesting.
08:12One thing, it's slightly futuristic,
08:14but my job at Hugging Face
08:15is mostly to think about what's coming.
08:17So I do spend a lot of time thinking about the next year
08:19and the coming few years in AI.
08:21I think, just like you're saying,
08:23the frontier between software and hardware
08:25is becoming maybe smaller again.
08:27We had a moment where there was really this huge...
08:29I mean, maybe at the beginning,
08:30it was all the same
08:31because everyone was so close to the hardware
08:33that basically we didn't really have anything
08:35that we would call software,
08:37any like large abstraction.
08:39And nowadays, the interesting thing
08:41is we see this tendency to kind of fill the gap
08:44and to go back to very low level.
08:47And there's a couple of, I think, trends
08:49that I haven't fully thought through.
08:51But what I see is, for instance,
08:52people using AI again
08:54to speed up the process of building hardware,
08:57both computer-assisted design,
09:00basically, for mechanical parts and PCBs,
09:01but also hardware like chips,
09:04all these types of things.
09:05And saying, using AI,
09:07we can maybe reinvent
09:09and basically lower
09:10or maybe digest all of this knowledge
09:13that you kind of need nowadays
09:14if you want to develop something in hardware
09:16in a form that's actually so helpful
09:19that, again,
09:20we could develop hardware
09:21a little bit like we develop software.
09:24So we could iterate much more quickly on it.
09:26We would have much less
09:28of a knowledge entry barrier
09:30to basically being able
09:31to design things in hardware.
09:32So I'm very excited about this.
09:34And I think this is,
09:35in part, unlocked by the tools
09:37that we now have available.
09:39If I go to Hugging Face's site,
09:41I'm overwhelmed.
09:42There's so much stuff.
09:43And I think,
09:44I mean, it's great that everything's open,
09:46but how do we solve, then,
09:48this curation problem:
09:49there's a whole bunch of information
09:51I'm never going to get to at all.
09:53What do I do?
09:54How do I get started?
09:56Yeah, it's a difficult problem,
09:57because there are like 4 million models.
10:00So how do you find the model
10:01you want to use, right?
10:02In the beginning,
10:03we tried to do some manual curation ourselves,
10:05but with one new model every five seconds,
10:07that's just not really sustainable.
10:09There is a couple of general guidelines,
10:11of course.
10:11I mean, if you're looking
10:12for a speech generation model,
10:13you can restrict yourself
10:14to this type of model.
10:16If you're looking for a text model,
10:17You're down to just a million.
10:18You can filter, exactly.
10:20And it's surprising,
10:20because everyone usually thinks
10:22that the category of model
10:24they are interested in
10:25is the most downloaded,
10:26but it's not;
10:26often the most downloaded one
10:28will be a speech model.
10:29And, I guess, LLM people
10:31or text people are very surprised.
10:32They're like, what?
10:33But the reality is
10:34AI is becoming a huge, huge field
10:37with many subfields.
10:38And each of these subfields
10:39is actually getting really large itself.
10:42So the way we try to do that
10:43is we cannot really curate ourselves anymore.
10:47Just like on the internet,
10:48you cannot really just
10:50try to curate
10:52the best websites yourself.
10:53You can have a couple of ones you like.
10:54You can rely on search,
10:56you know,
10:56but the best way usually
10:57is to rely on kind of a social discovery.
11:00So you can go on Reddit
11:02or you can find a place
11:03where people share what they like.
11:05If you're on Pinterest,
11:06maybe there are likes
11:07for different things.
11:08And so the way we're trying
11:09to do that more and more
11:11is to give social tools
11:13to comment on models,
11:14to make collections of models.
11:16A little bit like
11:17the internet itself,
11:19I would say,
11:19which is often guided
11:20by other people
11:22that tell you,
11:23you should go there.
11:24And that's how you find the place.
11:25I think the most reliable signals
11:27right now are two places
11:29on the Hugging Face website.
11:31One is the blog.
11:32We have a blog section
11:34that's very active, actually,
11:35and that has
11:36really high quality content.
11:38So it's a good thing to follow.
11:40And the other one
11:40is just the trending models
11:42that show you basically
11:43over the past week or two,
11:45which models
11:47have gotten the most likes
11:49or the most interest
11:50among all the new models
11:52in that period.
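The trending signal described here, ranking by likes gathered in a recent window rather than by all-time downloads, can be sketched with toy data; the model IDs and numbers below are invented for illustration.

```python
# Toy sketch of a "trending" ranking: sort models by likes gathered
# in a recent window instead of all-time downloads. All model IDs and
# numbers are invented for illustration.

models = [
    {"id": "org/speech-model", "downloads": 9_000_000, "recent_likes": 40},
    {"id": "org/new-llm",      "downloads": 120_000,   "recent_likes": 900},
    {"id": "org/old-classic",  "downloads": 5_000_000, "recent_likes": 10},
]

by_downloads = sorted(models, key=lambda m: m["downloads"], reverse=True)
trending = sorted(models, key=lambda m: m["recent_likes"], reverse=True)

print(by_downloads[0]["id"])  # the all-time favorite: org/speech-model
print(trending[0]["id"])      # what people like right now: org/new-llm
```

The two orderings surface different models, which is why a recently popular newcomer can top the trending list while never appearing near the top by downloads.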
11:53Yeah, actually,
11:54that makes a lot of sense.
11:55Although at the same time,
11:56I can't help but want to connect it
11:57to what we said earlier
11:59in saying that,
11:59oh, it's only the weird thing,
12:01the not-average thing.
12:02So maybe you should be looking
12:03at the bottom of the list
12:04instead of the top of the list,
12:07though the problem is
12:08there's going to be a lot there.
12:09But back in the original software days,
12:11if you think about
12:11the huge monolithic IBM 360
12:14operating systems,
12:15they were giant
12:16and expensive to get going.
12:18And we've seen in software
12:20a great reduction in that cost
12:23by building on components.
12:25But we still have just a handful
12:27of chip manufacturers
12:30again because
12:31of the billions of dollars
12:34that it takes to create chips
12:36and hardware at that scale.
12:37That may not be true
12:38of all hardware,
12:39you know, like robotics or whatever,
12:41which can have a much lower entry barrier.
12:45How do we enable that?
12:46How do you push that?
12:47In particular,
12:48if we talk about chips,
12:49which is a field
12:50I'm quite interested in recently,
12:52I think there is
12:53two converging things.
12:55The first one is
12:56AI, in a way,
12:57is quite a simple technology.
12:59So having AI chips
13:01is actually,
13:02most AI chips
13:03are much simpler
13:04than CPU or GPU.
13:06We don't have
13:07all of this, like,
13:08complexity.
13:08We need it to build
13:09on top of them,
13:10all the branch prediction,
13:11all the, like,
13:12the complex things.
13:13We need to make sure
13:14we make really the best use
13:15of these chips
13:17in very generic
13:20compute workflow
13:21and setup.
13:22And an AI model itself
13:25is extremely simple,
13:26in a way.
13:27It's just really
13:27this series of
13:28matrix multiplication,
13:29a couple of nonlinearities,
13:31and you have
13:31the attention blocks,
13:32and that's maybe
13:33slightly scary
13:34the first time
13:34you see the equation,
13:35but actually,
13:36that's really that simple,
13:37right?
13:37If you compare it
13:38to really the
13:39huge,
13:40cumbersome system
13:41we had to design
13:42to actually make
13:43a general computing
13:44system efficient
13:45in all of these
13:46age cases,
13:47just being able
13:48to support
13:49one forward pass
13:50is a baby task,
13:51in a way.
13:52So this means,
13:52in a way,
13:52we can really reinvent
13:54how we make
13:56this computing
13:56architecture itself,
13:58and we can make
13:59it much simpler.
14:00So that's one thing.
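The "scary" attention block mentioned above really is just matrix multiplications and a softmax; a toy sketch in pure Python, with tiny sizes and illustrative weights, makes the point.

```python
# A toy sketch of the point above: the attention block is just matrix
# multiplications and a softmax. Tiny sizes, illustrative weights only.
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Turn a row of scores into probabilities."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention for a single head."""
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]                       # transpose of k
    scores = matmul(q, k_t)                                    # q @ k.T
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)                                  # weighted mix of values

# Two token positions, dimension 2: one complete forward step.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[2.0, 0.0], [0.0, 2.0]]
out = attention(q, k, v)
print(out)  # each row is a softmax-weighted mix of the value rows
```

Each output row mixes the value rows according to softmax weights, so every row sums to the same total as a value row; that is the whole operation a dedicated AI chip has to support.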
14:02And if you project
14:03in the future,
14:04and if you think
14:04that basically AI
14:05might become
14:06the dominant form
14:07of compute,
14:08by which I mean
14:09the dominant form
14:09we use energy
14:10for compute,
14:12this might be
14:12the thing
14:13we might actually
14:14want to really
14:14over-index on
14:15and say,
14:16actually,
14:16it's not like
14:17the GPU
14:18is on the side
14:18helping the CPU,
14:20but that's actually
14:20the reverse case now.
14:22It's that this AI computer
14:23is the central piece
14:24of what we're building.
14:26And so,
14:26the first thing is,
14:27I think there is a way
14:28to redesign our
14:29computer architecture
14:30in a much more
14:30simple way,
14:31which is kind of
14:32a funny gift
14:33that the AI
14:34revolution brings us,
14:36which is,
14:36it's much more simple,
14:38and it's also
14:39much more powerful,
14:40because a well-trained
14:41LLM and an LLM
14:43in the future
14:43can simulate
14:45extremely complex things.
14:47And you can even
14:47ask it right now,
14:48which is a funny
14:50thought experiment
14:51that my friend
14:52Stephen from
14:53Lambda Labs
14:53was doing,
14:54which is you can ask it
14:55to simulate
14:56any type of software
14:57and you could even
14:58tell it now,
14:58behave as a
15:00spreadsheet thing
15:01or behave as
15:02a website.
15:03And so,
15:04it behaves as a
15:05very complex type
15:06of software
15:07pretty well.
15:08And if you look
15:09at the core
15:10operation,
15:10it's just all
15:11very simple
15:12forward passes,
15:13just you have
15:13a lot of them.
15:14So,
15:15yeah,
15:15so that's one aspect,
15:16which is we have
15:16this very simple
15:18compute architecture.
15:19And the other
15:19aspect is,
15:20we can use this
15:21AI software
15:23increasingly as
15:24helpers to
15:25simplify complex
15:26tasks we had
15:27to do.
15:28And basically,
15:28we can offload
15:30a lot of
15:31the cognitive
15:33load we needed
15:34to have to
15:35kind of remember,
15:36oh yeah,
15:37I need to check
15:38this part of
15:39the software.
15:40I know where
15:41it is in the docs,
15:41but I need to
15:42check how this
15:43works and there's
15:43this thing I also
15:44need to be careful
15:45about.
15:45With the development
15:46of this type of
15:47AI agents,
15:48there is some
15:49hope that we can
15:50really automate
15:50a lot of this
15:52design part.
15:53Just like
15:53nowadays,
15:55even when you
15:56use like some
15:57kind of CAD
15:58design software,
15:59you actually use
16:00something that's
16:00extremely powerful.
16:01So basically,
16:02with just a couple
16:03of parametrics
16:03lines,
16:04you can design
16:04something in 3D
16:05that used to be
16:06extremely complex
16:07to design before.
16:08And the same
16:09thing is,
16:10I could see us,
16:11just like we do
16:11a little bit of
16:13parametric shapes
16:14in Onshape or one
16:15of these design
16:16software tools,
16:17I can see us
16:19designing a very
16:19complex system
16:20by just giving
16:21a couple of
16:22points on the
16:23basic curve and
16:23hoping that the
16:24AI system will
16:25just actually
16:25connect all the
16:26things and make
16:27sure this all
16:27fits nicely.
16:28So this is the
16:29other thing,
16:29not that we are
16:30building this
16:30system,
16:31but we are
16:31using them to
16:32help us build
16:33an extremely
16:33complex system.
16:35I think it's
16:36great you mentioned
16:36Onshape.
16:37My son will now
16:37listen to the
16:38episode because,
16:39yeah, I mean,
16:40it's amazing.
16:41You can take a
16:422D picture and
16:44turn it into
16:443D; he's been making
16:45models of himself
16:46just by
16:47taking a frontal
16:48picture, and
16:49it's pretty
16:50amazing.
16:51I want to push
16:51back on a couple
16:52of different things
16:53you got going
16:53there.
16:54One, I think
16:55the chip growth
16:56is something I
16:56hadn't really
16:56thought about,
16:57but we
16:58originally had
16:59CPUs and
17:00then we
17:00developed GPUs,
17:02graphics
17:02processing units,
17:03for processing
17:04images on the
17:05screen and then
17:06we had this
17:06aha that, hey,
17:07those matrices
17:08are the same
17:09matrices inside of
17:10machine learning
17:11models.
17:11We can use those
17:12GPUs much more
17:13efficiently.
17:14But what you're
17:15pointing out is
17:16that we didn't
17:17make those chips
17:18for that purpose
17:19originally and we
17:20might have different
17:21design constraints
17:22if we did make
17:23those chips
17:24from the start.
17:26And if you
17:26combine that
17:27with your ability
17:28to make chips
17:29cheaper, make
17:30hardware cheaper
17:31in general,
17:32then that's a
17:33nice combination
17:34that might get
17:34things started.
17:35To be honest,
17:36the GPU is
17:37increasingly also
17:38an AI-optimized
17:39chip, for sure.
17:40I mean,
17:41Tensor Cores,
17:42and you'll see
17:43a variety of
17:43chips.
17:44You see
17:44Cerebras chips;
17:45we work
17:46with a variety
17:47of them.
17:47I think it's
17:48very interesting
17:48to follow this
17:49field and to
17:50see how
17:50basically
17:52competition
17:52push people
17:53to explore,
17:54you know,
17:54just like you
17:55were saying,
17:55maybe reinventing
17:56this.
17:57Cerebras is
17:58this example
17:59of let's be
18:00able to host
18:00a full model
18:01on just this
18:02very large
18:03wafer-scale chip.
18:04Groq,
18:05that I was
18:05just seeing
18:06on the news,
18:06is also an
18:07interesting case
18:08of, you know,
18:09let's try to
18:09push, maybe
18:11for the low
18:11batch, the
18:12small batch,
18:13let's try to
18:13push the token
18:14per second to
18:14the max.
18:15Maybe the
18:16driving force
18:16here, if we
18:17step back a
18:18little bit and
18:19take more like
18:19a business view
18:20on this,
18:21is for
18:22the first
18:23time, one
18:25of the main
18:25metrics we
18:26have is a
18:28very low-level
18:28metric,
18:29and that's
18:29basically the
18:30cost per token.
18:32And the cost
18:33per token is an
18:33interesting metric
18:34because it's both
18:35something that
18:36basically almost a
18:36CFO-level person
18:38could take a look
18:39at.
18:39When you use
18:40Gemini, that's the
18:41first metric,
18:42you know, people
18:42tell you, and
18:43when you compare
18:44these various
18:44providers, that's
18:45maybe the most
18:46cost-related
18:47metric you'll take
18:48a look at, but
18:50also a metric
18:51that's extremely
18:52low level because
18:53if you think
18:53about that, that's
18:54really just a
18:55series of
18:55operations and
18:56if you think
18:57about that, you
18:58can even link
18:58that to almost
18:59how many
19:00transistors will
19:01I activate
19:02because this
19:02model has this
19:03size and one
19:04token is just
19:05one forward
19:06pass.
19:06You could link
19:07that to exactly
19:09how much, you
19:10know, billion
19:10transistors you
19:11will need to
19:12activate for this
19:13price.
19:14In the past, we
19:15didn't have that
19:16for the price we
19:17were paying for
19:17all our compute.
19:18We never said,
19:19oh, you will pay
19:20actually this
19:20amount because you
19:21do 10,000
19:22operations.
19:23We never had
19:24this connection
19:24between this
19:25cost metric and
19:26the extremely low
19:27level of one
19:27single operation
19:28on the chip.
19:29And so since
19:30this metric is
19:31now the main
19:32one we focus
19:32on, there is
19:35this natural
19:35tendency to lower
19:36this cost and
19:37that comes
19:38directly down to the
19:38low-level
19:39hardware, which
19:39is I want to
19:40actually optimize
19:41the cost to do
19:42one forward
19:43pass and
19:43these things.
19:44It's very
19:44strongly driving
19:45the optimization
19:46of these chips
19:47in this direction.
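The connection drawn here between the price of a token and single low-level operations can be put into back-of-the-envelope arithmetic, using the common rule of thumb of roughly 2·N floating-point operations to generate one token with an N-parameter dense model; the price used is hypothetical.

```python
# Back-of-the-envelope sketch linking cost per token to low-level
# operations, using the common ~2 * N FLOPs-per-token rule of thumb
# for an N-parameter dense model. The price below is hypothetical.

n_params = 3e9                        # a 3B-parameter model, SmolLM3's size
flops_per_token = 2 * n_params        # ~6e9 floating-point operations per token

price_per_million_tokens = 0.10       # hypothetical: $0.10 per 1M tokens
price_per_token = price_per_million_tokens / 1e6

price_per_flop = price_per_token / flops_per_token
print(f"~{flops_per_token:.0e} FLOPs per token")
print(f"~${price_per_flop:.1e} per floating-point operation")
```

This is what makes cost per token unusual as a business metric: dividing a posted price by a count of arithmetic operations yields a dollar figure per single operation on the chip.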
19:48Yeah, I think
19:49metrics are a big
19:50deal and we
19:50respond to those.
19:51I hadn't really
19:52thought about it,
19:52but before we
19:53talked about
19:54computing hours,
19:55well, that was
19:55hard to understand
19:56what an hour
19:57did for you.
19:58At the other
19:59end of the scale,
19:59we had floating
20:01point operations
20:02per second,
20:02which have much
20:04less of a way
20:05to relate that
20:06to anything I
20:07want to do
20:08versus that the
20:09token actually makes
20:10a lot more sense
20:11then.
20:11And then people
20:12will optimize on
20:13It's the dollar
20:14per token,
20:15that's the thing.
20:16We never had the
20:17dollar per gigahertz
20:18that would maybe
20:19push you to make
20:20faster CPU or
20:22like, I don't know,
20:23the dollar per
20:23flops that would
20:25make you make
20:26faster GPU in
20:27computer graphics.
20:28The connection was
20:30really loose between
20:31these two universes.
20:32And so I think a
20:33lot of people are
20:33saying, and I think
20:34that's pretty true, that the
20:34limiting cost of
20:36intelligence would
20:37be the cost of
20:38electricity,
20:39for instance.
20:40And the last
20:41person I heard
20:42saying that was
20:42Patrick Collison,
20:43but I think a lot
20:44of people view
20:45this this way.
20:46I do agree, but
20:47there will be a
20:47multiplying factor
20:48between electricity
20:49and intelligence.
20:50And this multiplying
20:51factor will be
20:51exactly this, like
20:52how much
20:54intelligence can
20:55your chips give
20:56you per
20:56electron, per
20:57energy.
20:58And that's where
20:59you will want to
21:00squeeze this as
21:01much as possible.
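That multiplying factor between electricity and intelligence can be put into toy numbers; every value below is hypothetical.

```python
# Toy numbers for the "electricity floor" on the cost of intelligence:
# electricity price divided by how many tokens a chip can squeeze out
# of each joule. All values are hypothetical.

electricity_price_per_kwh = 0.10      # dollars per kWh, hypothetical
joules_per_kwh = 3.6e6                # physical constant

tokens_per_joule = 500.0              # hypothetical chip efficiency
tokens_per_kwh = tokens_per_joule * joules_per_kwh

cost_floor_per_million_tokens = electricity_price_per_kwh / tokens_per_kwh * 1e6
print(f"electricity floor: ~${cost_floor_per_million_tokens:.5f} per 1M tokens")

# Doubling tokens-per-joule halves this floor: that is the multiplying
# factor between electricity and intelligence.
```

With these made-up numbers the electricity floor is a fraction of a cent per million tokens; the only lever a chip designer controls is the tokens-per-joule efficiency term.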
21:03You mentioned
21:04cognitive, and of
21:05course a big deal
21:06is what we're
21:06going to do with
21:07these chips.
21:07Let's say we've
21:08got all these
21:08chips and we've
21:09got them cheap.
21:10What are we
21:10going to do
21:11with them?
21:11And there's a
21:12report that I
21:12think you reacted
21:13to about a
21:14country of
21:14Einsteins sitting
21:15in a data
21:16center.
21:18And I think
21:19that's pretty
21:19appealing, the
21:20idea that we'd
21:20have, oh gosh,
21:21we'd get all
21:22these models out
21:23there running in
21:23data centers and
21:24we'll just
21:24suddenly not have
21:25one Einstein, but
21:26we'll have
21:27zillions of
21:27Einsteins out
21:28there.
21:28Just think of
21:29the progress.
21:31But today, I
21:32mean, tools in
21:33general and AI
21:34specifically, of
21:35course, they can
21:35be a head start
21:36for people to
21:37get to average
21:37quickly so they
21:39don't have to
21:39spend time getting
21:40to average
21:42cognitive output.
21:43But at the same
21:44time, they can
21:44also be a way
21:45that people just
21:46learn to depend
21:47on these tools.
21:48You know, without
21:49practicing skills,
21:50we don't get
21:51better at skills.
21:51How can we go
21:52beyond average?
21:53So can we have
21:54a country of
21:54Einsteins instead
21:55of sitting at
21:56home?
21:56Can we have a
21:56country of
21:57Einsteins sitting
21:58not in data
21:59centers, but at
22:00homes that are
22:00using AI tools to
22:02provide a head
22:02start?
22:04Yeah, for sure.
22:05So in this case, it
22:06started with this
22:06essay from
22:07Dario Amodei, the
22:08CEO of
22:09Anthropic, basically
22:10saying, yeah, it's
22:11a beautifully
22:12written essay.
22:13It's very
22:13optimistic.
22:14It's called
22:14Machines of
22:15Loving Grace, and
22:16it's basically
22:16saying that AI
22:18will enable us
22:19to do extremely
22:21important scientific
22:22breakthrough.
22:23And what he was
22:24taking as
22:24example was
22:25really this
22:25Nobel Prize
22:27level breakthrough.
22:28And where I
22:29kind of agree
22:30with him, which
22:30is, I think
22:32if you
22:32summarize
22:33scientific
22:34progress, you
22:36have a lot of
22:36incremental
22:37progress.
22:38And I was
22:38guilty of doing
22:39a lot of that during
22:40my PhD and
22:40postdoc.
22:41And that's
22:41basically, that
22:42was also the
22:42maximal thing I
22:43could do, which
22:44is basically do
22:45your tiny piece
22:46on this little
22:47aspect, extending
22:48a little bit the
22:48frontier.
22:49And then you
22:49have this massive
22:51change of
22:52paradigm that
22:52usually are the
22:54one that will
22:55typically be awarded
22:56the Nobel Prize.
22:57And it can be
22:58general relativity,
22:59it can be
23:00CRISPR in
23:01biology, there is
23:02a couple of
23:03them in every
23:03field, and they
23:04usually also create
23:05new fields in
23:06themselves.
23:07And what I was
23:09saying and what I
23:09think is that AI
23:10will be extremely
23:11useful for all the
23:12incremental
23:13innovation.
23:14AI is very good
23:14at exploring many
23:16things around the
23:18status quo, but
23:19AI is extremely
23:20bad at challenging
23:21the status quo
23:22itself.
23:22It's very easy to
23:24get ChatGPT to
23:25agree with you
23:26on anything.
23:27It's very hard to
23:28get this model to
23:29disagree with you
23:30on something
23:31actually and
23:31challenge your
23:32view of the
23:33world, which is
23:33quite a problem
23:34in some case, and
23:36in particular, I
23:36think in scientific
23:37research.
23:38Two weeks ago, I
23:39had the pleasure to
23:40meet again one of
23:41my former
23:41professors called
23:42Alain Aspect, who
23:43got the Nobel
23:44Prize a few years
23:45ago, I think, if
23:46I'm not mistaken,
23:47for basically
23:48proving this
23:49disagreement that
23:50Einstein had with
23:51quantum mechanics
23:52where basically he
23:53was disagreeing
23:54with the core
23:55idea of quantum
23:56mechanics, that if
23:57you project the
23:59wave function, you
24:00just basically have a
24:01random output, and
24:02so you can map that
24:03in one experiment, and
24:05he did this optical
24:06experiment.
24:07And if you talk with
24:08this type of
24:09researcher, they
24:11don't want to
24:12please you.
24:12They have strong
24:13opinion, they have
24:14strong ideas, and I
24:15think it's what
24:15actually led them to
24:17make strong discovery
24:18because they were
24:18like, I don't think
24:19this is right, I want
24:20to prove this wrong, and
24:22they don't try to do
24:23this type of
24:23sycophancy that the
24:24LLM will do, where
24:25they actually want to
24:26please you.
24:27I think that's a
24:28strong missing point
24:29for AI models
24:31nowadays, is they're
24:32really trained to, I
24:34mean, first, they are
24:35trained to predict the
24:36most likely next word
24:38in a sentence, which
24:39means they will miss
24:40words that are
24:40unlikely.
24:42You know, they
24:42will tend to regress,
24:44just what you were
24:45saying, they tend to
24:45bring you to the
24:46average.
24:47They're very good at,
24:47you know, average
24:48thinking, or average
24:51design, or an average creative
24:53process, if you use
24:54them for image
24:54design.
24:55But they're quite
24:56bad at really
24:57challenging the
24:58average and going for
24:59this crazy idea that
25:01might challenge some
25:02of their training data
25:03in particular.
25:04And so my point is
25:06they will be very
25:06useful research
25:07assistants, but they
25:08won't be the ones
25:09that could really lead
25:10us to extremely novel
25:13breakthroughs.
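The regression-to-the-average behavior described above can be illustrated with a toy next-token distribution: a decoder that always takes the most likely word never surfaces the unlikely one. The probabilities are invented.

```python
# Toy illustration of regression to the average: a decoder that always
# takes the most probable next token never emits the unlikely,
# status-quo-challenging one. The distribution is invented.

next_token_probs = {
    "expected": 0.60,    # the consensus continuation
    "plausible": 0.35,
    "heretical": 0.05,   # the rare idea that might challenge the field
}

def greedy(probs):
    """Pick the most probable token, as greedy decoding does."""
    return max(probs, key=probs.get)

picked = greedy(next_token_probs)
print(picked)  # always "expected", never "heretical"
```

However many times this runs, the 5% token never appears, which is the sense in which likelihood-trained models are pulled toward consensus continuations.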
25:14So I think I ask
25:16myself a lot this
25:18question, in particular
25:19when I see my kids, my
25:20son using AI, which
25:22is how much should
25:24they use AI to
25:25automate their
25:26thinking?
25:26What is the
25:27remaining part that's
25:29very human, that we
25:30should keep, that we
25:31should build?
25:32Like always, I think
25:33it's probably a bad
25:35idea to just say
25:35don't use AI.
25:37We'll need to find a
25:38way to teach them how
25:39to use this tool and
25:40to remain very
25:42conscious of what's
25:43the missing part.
25:45You brought up something I think about a lot, both with my own kids and with the university students I'm in contact with all the time: What exactly should their relationship be with these technologies?
25:55Because I'm at the front of the room or around the dinner table, I feel like I ought to have some opinion about this, and it's really difficult to know.
26:06It sounds like you're pushing your kids to use these tools and to embrace them, to some degree at least.
26:12Yeah, I think you have to.
26:14I may have sounded quite critical of these tools, but they are also a huge way to unlock creativity.
26:20Let's talk about vibe coding, for instance: this idea that you can prompt a website into existence, and quite a complex website at that.
26:29I find it fascinating, because coding a website used to be quite complex.
26:34A lot of people self-censor and say, "Oh, I have this idea for something, but it's so complex; I don't know HTML."
26:42There are no-code tools, of course, but they all have their quirks and limits on what they can build; some of them don't have databases.
26:50This general idea that you can just ask for these things to exist is quite new.
26:56So one month ago, for instance, we organized something.
26:59My son is 12, and luckily for me, he's still interested in what I do; I don't know how long that will last.
27:06I brought him, a couple of his friends, and some friends' kids to a little hackathon we organized, where we selected one vibe coding tool, Lovable, which I found very easy to use, very nice.
27:19We explained the design process to them a little: that it's better, for instance, to formalize your idea first.
27:25So we asked them to draw the website they had in mind, to think through their idea, and then to prompt it.
27:31We tried to give their process a bit of structure.
27:34But what we saw is that they grabbed this tool very quickly and started creating many more different apps than we expected.
27:43We thought they would each have one idea, but basically they had 10.
27:46Very quickly, each kid was experimenting with four or five different websites at the same time: this thing to connect scouts with football players, this thing to connect cat owners with secondhand cats.
28:00It was crazy to see.
28:02And imagine this: they were between nine and 12, and they will grow up having basically decided that if they want to create a website, it's just a couple of prompts away, something they can do in a couple of hours.
28:14I think it was very beautiful to see.
28:16You even see them morphing into little entrepreneurs.
28:20My daughter was building this website to connect cat owners with people offering secondhand cats.
28:26She was thinking, "Oh, maybe I could also ask them to pay when they want to meet each other, because then they need to give out their address."
28:32So you see them starting to ask these questions, and I think it's because the technical part is so easy that they project themselves much more into how this would work in real life.
28:43That was one recent example that really struck me as an unlocking of creativity I had no idea could exist.
28:50In September and October, we want to redo this type of hackathon everywhere in the world, to see what happens at a bigger scale than basically just our neighborhood.
28:59So I'm quite excited; maybe by the time this podcast goes out, we'll have this worldwide kids' vibe coding hackathon going on.
29:06I may have a couple of kids we can add to that, but I think it ties together well with what you were saying before: you have this tool that always says yes, will always do what you want, and tries to do it quickly.
29:22The idea is to use to your advantage the fact that this tool will in fact do everything you ask and try very hard to accomplish it.
29:29That seems to tie well with your framing of these tools as good assistants.
29:36Yeah, I think in a way we are quite lucky.
29:38I'm rather unimpressed by all the stories of AI freeing itself from its chains and deciding to take over humanity.
29:47The way we are building these tools, and that's really both their advantage and their strongest limitation, is as assistants to what we want to do.
29:57This has been fascinating; I've really enjoyed talking with you.
29:59Maybe by the time this podcast comes out, we'll have some hackathons organized for kids all over the world.
30:04Thanks for taking the time.
30:08Thanks a lot, Sam. It was a pleasure.
30:11Thanks for joining us today.
30:12On our next episode, I'm joined by Angela Nakalembe, engineering program director at YouTube.
30:18Please join us for an insightful conversation about trust and safety amid an influx of AI-generated content.
30:28Thanks for listening to Me, Myself, and AI.
30:31Our show is able to continue in large part due to listener support; your streams and downloads make a big difference.
30:37If you have a moment, please consider leaving us an Apple Podcasts review or a rating on Spotify, and share our show with others you think might find it interesting and helpful.
30:46Thank you.