The guest expert analyzes how open-source and closed-source AI models will coexist, explaining their advantages, their risks, and the role of communities in improving tools and collaboration.
Transcript
00:02Today's guest believes open-source and closed-source models will coexist in the world of AI.
00:08Find out what he considers the opportunities and drawbacks to each,
00:12as well as how communities can make AI tools and themselves work better together on today's episode.
00:19Hello, I'm Thomas from Hugging Face, and you're listening to Me, Myself and AI.
00:25Welcome to Me, Myself and AI, a podcast from MIT Sloan Management Review, exploring the future of artificial intelligence.
00:34I'm Sam Ransbotham, professor of analytics at Boston College.
00:38I've been researching data, analytics, and AI at MIT SMR since 2014,
00:44with research articles, annual industry reports, case studies, and now 12 seasons of podcast episodes.
00:51On each episode, corporate leaders, cutting-edge researchers, and AI policymakers join us
00:58to break down what separates AI hype from AI success.
01:06Hey, everyone. Thanks for joining us again, and welcome back to a new season.
01:10Today, I'm lucky to be talking with Tom Wolf.
01:12He's the co-founder and chief scientific officer of Hugging Face.
01:16Tom, great to have you on the show today.
01:18Thanks, Sam. It's a big pleasure to be here.
01:21Let's start with Hugging Face itself.
01:23Some of our listeners may not be familiar with Hugging Face.
01:26Can you give us a brief overview of what the company does and what you do?
01:29Yeah, of course.
01:31Hugging Face is an open-source AI platform.
01:33We give access to all the AI models that are open-source,
01:37which means that, basically, these are the models you can download and run wherever you want.
01:43So, when you use an AI model nowadays, you can choose either to go to ChatGPT,
01:49Anthropic, or Google.
01:50They are the most widely diffused at the moment.
01:53Or sometimes you want to run the AI models on your own data center,
01:57or you want to run them on some specific hardware.
01:59It could be like local hardware, or it could be maybe faster chips,
02:03because you need instant response.
02:04In most cases, you will want to go for an open-source AI model,
02:08which is a model you can basically just download.
02:10There are quite a lot of them.
02:12On Hugging Face, there are close to 4 million of these models at the moment.
02:15There is one new model being published every five seconds.
02:19Some of the most famous ones are the Llama series from Meta,
02:23and one that I think got the most adoption and most visibility recently was DeepSeek,
02:28which was released in January, and kind of crashed the stock market at the same time that it was released.
02:34And so, over the past eight years, Hugging Face has been building this platform,
02:39growing it together with the community of people and teams who are both sharing and downloading models.
02:44This community is now roughly 10 million users and AI builders, as we call them.
02:50And we've expanded as well beyond just model hosting to also host data sets,
02:55which are used to train models, to fine-tune them, to evaluate them.
02:59And more recently, also what we call Spaces,
03:02which are a simple, low-code way to test all of these models.
03:08So, there are a lot of people offering solutions here.
03:11And if I think back on the way technology has developed throughout the history of mankind,
03:16chips came out of places like Bell Labs,
03:18and Intel came along and built fabs for processors.
03:22None of that was open-source.
03:23Why is open-source important here?
03:26I think open-source has been always important in a way.
03:30The thing is, open-source is more often the long game in computer science.
03:34So, if we go back to, for instance, the year 2000 or pre-2000,
03:40when basically Microsoft had the dominant operating system, right?
03:44And Linux was somehow more for fanatics or geeks.
03:47And now, if you fast-forward 20 years later, right,
03:51Linux is really the basis of all enterprise software and all enterprise cloud, basically.
03:57You almost always run them on some version of Linux.
04:00Even macOS, which is probably the most widely diffused on consumer laptops nowadays
04:07and one of the largest competitors to Windows,
04:10is itself based on a Unix core.
04:13So, there is this trend, which is open-source has some advantages
04:19that make it extremely appealing in the long term.
04:22Obviously, in the short term, you can go faster with closed-source,
04:25and that's also what we see with closed models.
04:27You can iterate faster.
04:30You can raise large amounts of capital to train your models.
04:33You can try to grab, you know, the most expensive AI researchers
04:37and pay them huge sums of money.
04:40We've kept pushing a lot for open science.
04:43And just this Tuesday, we published a new model called SmolLM3,
04:48which is an extremely smart model, the best one at 3 billion parameters.
04:53So, it's in the range of size that you can run on your laptop and even on a smartphone.
04:57And we've decided to share, at the same time, all the data, the whole recipe,
05:02all the knowledge on how to build this model.
05:04It's fine for us because we don't make money out of these models.
05:07And we think it's very good because anyone who wants to build a model based on this,
05:12or one that looks a bit like it, or wants to extend this type of model,
05:15now has all the knowledge they need to start.
05:18So, we think open-source can be defined in many ways in AI,
05:22but we think the most radical way is to say you share just everything.
05:25You share the data, you share the code, you share the recipe,
05:27and we even wrote a very long blog post,
05:29which we're probably going to turn into a full-blown paper,
05:32on all the nitty-gritty details of how to build this model.
05:37So, we've kept kind of publishing all of this.
05:40We're even writing a book right now,
05:42which is on how to efficiently train LLMs on GPU clusters.
05:46But we think basically all of this should be really accessible.
05:49But I think, yeah, I don't want to give the impression
05:52that I'm an open-source absolutist.
05:54I think both of them just have interesting advantages and drawbacks,
05:59and I think both of them will generally coexist in AI.
06:03If you compare to hardware, it's an interesting comparison you made, right?
06:08Back at the time when Bell Labs was doing its development,
06:10software was probably much more niche,
06:13and software and hardware were much more tied together.
06:16I think if you compare to hardware, there is some difference,
06:19but there is also an interesting advantage
06:20and point to be made for open-source hardware.
06:23So, that's actually something we have started to do
06:26very recently at Hugging Face in robotics.
06:29I don't know if it's something we want to cover nowadays,
06:31but we just acquired this year an open-source hardware company
06:35in robotics called Pollen Robotics.
06:37And I think there is some definite interest.
06:39The general idea is for me
06:42that a lot of the way you see hardware in the long term
06:46can be reinvented using software.
06:52MIT Federal Credit Union
06:53is a member-owned, not-for-profit financial institution
06:56serving the MIT community and beyond.
07:00Discover a range of financial products
07:02such as checking, savings, personal loans, auto loans,
07:05and home lending.
07:07MIT FCU is here to support you on your financial journey.
07:10Learn more at mitfcu.org slash podcast.
07:15MIT Federal Credit Union
07:17is federally insured by NCUA.
07:20NMLS number 699-225.
07:23Equal housing lender.
07:30So, maybe we'll come to that.
07:31I don't want to go too quickly in a crazy tangent.
07:36Especially, actually, with Hope Jr.
07:38and the robotics initiatives
07:39that you're involved with right now.
07:41I guess you can go ahead and talk about that,
07:43but as you're doing that,
07:44I want to push you a little bit and say,
07:45I don't even know how we define hardware and software anymore.
07:48You've got firmware,
07:50you've got so many layers in the software stack.
07:53And traditionally,
07:55even if we had open source,
07:57like Linux was the example you mentioned,
07:59it still may have built on top of
08:01firmware that was vendor-specific and proprietary.
08:03So, you know, even within those stacks,
08:05it becomes complicated.
08:07But so, yeah, talk about hardware.
08:08I'd like to talk about Hope Jr.
08:10I think this is very interesting.
08:12One thing, it's slightly futuristic,
08:14but my job at Hugging Face
08:15is mostly to think about what's coming.
08:17So I do spend a lot of time thinking about the next year
08:19and the coming few years in AI.
08:21I think, just like you're saying,
08:23the frontier between software and hardware
08:25is becoming maybe smaller again.
08:27We had a moment where there was really this huge...
08:29I mean, maybe at the beginning,
08:30it was all the same
08:31because everyone was so close to the hardware
08:33that basically we didn't really have anything
08:35that we would call software,
08:37any like large abstraction.
08:39And nowadays, the interesting thing
08:41is we see this tendency to kind of fill the gap
08:44and to go back to very low level.
08:47And there's a couple of, I think, trends
08:49that I haven't fully thought through.
08:51But what I see is, for instance,
08:52people using AI again
08:54to speed up the process of building hardware,
08:57both computer-assisted design,
09:00basically, for mechanical parts and PCBs,
09:01but also hardware like chips,
09:04all these types of things.
09:05And saying, using AI,
09:07we can maybe reinvent
09:09and basically lower
09:10or maybe digest all of this knowledge
09:13that you kind of need nowadays
09:14if you want to develop something in hardware
09:16in a form that's actually so helpful
09:19that, again,
09:20we could develop hardware
09:21a little bit like we develop software.
09:24So we could iterate much more quickly on it.
09:26We would have much less
09:28of a knowledge entry barrier
09:30to basically being able
09:31to design things in hardware.
09:32So I'm very excited about this.
09:34And I think this is,
09:35in part, unlocked by the tools
09:37that we now have available.
09:39If I go to Hugging Face's site,
09:41I'm overwhelmed.
09:42There's so much stuff.
09:43And I think,
09:44I mean, it's great that everything's open,
09:46but how do we solve, then,
09:48this curation problem:
09:49there's a whole bunch of information
09:51I'm never going to get to at all.
09:53What do I do?
09:54How do I get started?
09:56Yeah, it's a difficult problem,
09:57because there are like 4 million models.
10:00So how do you find the model
10:01you want to use, right?
10:02In the beginning,
10:03we tried to do some manual curation ourselves,
10:05but with one new model every five seconds,
10:07that's just not really sustainable.
10:09There is a couple of general guidelines,
10:11of course.
10:11I mean, if you're looking
10:12for a speech generation model,
10:13you can restrict yourself
10:14to this type of model.
10:16If you're looking for a text model,
10:17You're down to just a million.
10:18You can filter, exactly.
10:20And it's surprising,
10:20because everyone usually thinks
10:22that the category of model
10:24they are interested in
10:25is the most downloaded,
10:26but it's not;
10:26often the most downloaded one
10:28will be a speech model.
10:29And, I guess, LLM people
10:31or text people are very surprised.
10:32They're like, what?
10:33But the reality is
10:34AI is becoming a huge, huge field
10:37with many subfields.
10:38And each of these subfields
10:39is actually getting really large itself.
10:42So the way we try to do that
10:43is we cannot really curate ourselves anymore.
10:47Just like on the internet,
10:48you cannot really just
10:50try to curate
10:52the best websites yourself.
10:53You can have a couple of ones you like.
10:54You can rely on search,
10:56you know,
10:56but the best way usually
10:57is to rely on kind of a social discovery.
11:00So you can go on Reddit
11:02or you can find a place
11:03where people share what they like.
11:05If you're on Pinterest,
11:06maybe there are likes
11:07for different things.
11:08And so the way we're trying
11:09to do that more and more
11:11is to give social tools
11:13to comment on models,
11:14to make collections of models.
11:16A little bit like
11:17the internet itself,
11:19I would say,
11:19which is often guided
11:20by other people
11:22that tell you,
11:23you should go there.
11:24And that's how you find the place.
11:25I think the most reliable signals
11:27right now are two places
11:29on the Hugging Face website.
11:31One is the blog.
11:32We have a blog section
11:34that's very active, actually,
11:35and that has
11:36really high quality content.
11:38So it's a good thing to follow.
11:40And the other one
11:40is just the trending models
11:42that show you basically
11:43over the past week or two,
11:45which models
11:47have gotten the most likes
11:49or the most interest
11:50among all the new models
11:52in that period.
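The trending signal described here, ranking by likes gathered in a recent window rather than by all-time downloads, can be sketched with toy data; the model IDs and numbers below are invented for illustration.

```python
# Toy sketch of a "trending" ranking: sort models by likes gathered
# in a recent window instead of all-time downloads. All model IDs and
# numbers are invented for illustration.

models = [
    {"id": "org/speech-model", "downloads": 9_000_000, "recent_likes": 40},
    {"id": "org/new-llm",      "downloads": 120_000,   "recent_likes": 900},
    {"id": "org/old-classic",  "downloads": 5_000_000, "recent_likes": 10},
]

by_downloads = sorted(models, key=lambda m: m["downloads"], reverse=True)
trending = sorted(models, key=lambda m: m["recent_likes"], reverse=True)

print(by_downloads[0]["id"])  # the all-time favorite: org/speech-model
print(trending[0]["id"])      # what people like right now: org/new-llm
```

The two orderings surface different models, which is why a recently popular newcomer can top the trending list while never appearing near the top by downloads.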
11:53Yeah, actually,
11:54that makes a lot of sense.
11:55Although at the same time,
11:56I can't help but want to connect it
11:57to what we said earlier
11:59in saying that,
11:59oh, it's only the weird thing,
12:01the not-average thing.
12:02So maybe you should be looking
12:03at the bottom of the list
12:04instead of the top of the list,
12:07though the problem is
12:08there's going to be a lot there.
12:09But back in the original software days,
12:11if you think about
12:11the huge monolithic IBM 360
12:14operating systems,
12:15they were giant
12:16and expensive to get going.
12:18And we've seen in software
12:20a great reduction in that cost
12:23by building on components.
12:25But we still have just a handful
12:27of chip manufacturers
12:30again because
12:31of the billions of dollars
12:34that it takes to create chips
12:36and hardware at that scale.
12:37That may not be true
12:38of all hardware,
12:39you know, like robotics or whatever,
12:41which can have a much lower entry barrier.
12:45How do we enable that?
12:46How do you push that?
12:47In particular,
12:48if we talk about chips,
12:49which is a field
12:50I'm quite interested in recently,
12:52I think there is
12:53two converging things.
12:55The first one is
12:56AI, in a way,
12:57is quite a simple technology.
12:59So having AI chips
13:01is actually,
13:02most AI chips
13:03are much simpler
13:04than CPU or GPU.
13:06We don't have
13:07all of this, like,
13:08complexity.
13:08We need it to build
13:09on top of them,
13:10all the branch prediction,
13:11all the, like,
13:12the complex things.
13:13We need to make sure
13:14we make really the best use
13:15of these chips
13:17in very generic
13:20compute workflow
13:21and setup.
13:22And an AI model itself
13:25is extremely simple,
13:26in a way.
13:27It's just really
13:27this series of
13:28matrix multiplication,
13:29a couple of nonlinearities,
13:31and you have
13:31the attention blocks,
13:32and that's maybe
13:33slightly scary
13:34the first time
13:34you see the equation,
13:35but actually,
13:36that's really that simple,
13:37right?
13:37If you compare it
13:38to really the
13:39huge,
13:40cumbersome system
13:41we had to design
13:42to actually make
13:43a general computing
13:44system efficient
13:45in all of these
13:46age cases,
13:47just being able
13:48to support
13:49one forward pass
13:50is a baby task,
13:51in a way.
13:52So this means,
13:52in a way,
13:52we can really reinvent
13:54how we make
13:56this computing
13:56architecture itself,
13:58and we can make
13:59it much simpler.
14:00So that's one thing.
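The "scary" attention block mentioned above really is just matrix multiplications and a softmax; a toy sketch in pure Python, with tiny sizes and illustrative weights, makes the point.

```python
# A toy sketch of the point above: the attention block is just matrix
# multiplications and a softmax. Tiny sizes, illustrative weights only.
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Turn a row of scores into probabilities."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention for a single head."""
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]                       # transpose of k
    scores = matmul(q, k_t)                                    # q @ k.T
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)                                  # weighted mix of values

# Two token positions, dimension 2: one complete forward step.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[2.0, 0.0], [0.0, 2.0]]
out = attention(q, k, v)
print(out)  # each row is a softmax-weighted mix of the value rows
```

Each output row mixes the value rows according to softmax weights, so every row sums to the same total as a value row; that is the whole operation a dedicated AI chip has to support.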
14:02And if you project
14:03in the future,
14:04and if you think
14:04that basically AI
14:05might become
14:06the dominant form
14:07of compute,
14:08by which I mean
14:09the dominant form
14:09we use energy
14:10for compute,
14:12this might be
14:12the thing
14:13we might actually
14:14want to really
14:14over-index on
14:15and say,
14:16actually,
14:16it's not like
14:17the GPU
14:18is on the side
14:18helping the CPU,
14:20but that's actually
14:20the reverse case now.
14:22It's that this AI computer
14:23is the central piece
14:24of what we're building.
14:26And so,
14:26the first thing is,
14:27I think there is a way
14:28to redesign our
14:29computer architecture
14:30in a much more
14:30simple way,
14:31which is kind of
14:32a funny gift
14:33that the AI
14:34revolution brings us,
14:36which is,
14:36it's much more simple,
14:38and it's also
14:39much more powerful,
14:40because a well-trained
14:41LLM and an LLM
14:43in the future
14:43can simulate
14:45extremely complex things.
14:47And you can even
14:47ask it right now,
14:48which is a funny
14:50thought experiment
14:51that my friend
14:52Stephen from
14:53Lambda Labs
14:53was doing,
14:54which is you can ask it
14:55to simulate
14:56any type of software
14:57and you could even
14:58tell it now,
14:58behave as a
15:00spreadsheet thing
15:01or behave as
15:02a website.
15:03And so,
15:04it behaves as a
15:05very complex type
15:06of software
15:07pretty well.
15:08And if you look
15:09at the core
15:10operation,
15:10it's just all
15:11very simple
15:12forward passes,
15:13just you have
15:13a lot of them.
15:14So,
15:15yeah,
15:15so that's one aspect,
15:16which is we have
15:16this very simple
15:18compute architecture.
15:19And the other
15:19aspect is,
15:20we can use this
15:21AI software
15:23increasingly as
15:24helpers to
15:25simplify complex
15:26tasks we had
15:27to do.
15:28And basically,
15:28we can offload
15:30a lot of
15:31the cognitive
15:33load we needed
15:34to have to
15:35kind of remember,
15:36oh yeah,
15:37I need to check
15:38this part of
15:39the software.
15:40I know where
15:41it is in the docs,
15:41but I need to
15:42check how this
15:43works and there's
15:43this thing I also
15:44need to be careful
15:45about.
15:45With the development
15:46of this type of
15:47AI agents,
15:48there is some
15:49hope that we can
15:50really automate
15:50a lot of this
15:52design part.
15:53Just like
15:53nowadays,
15:55even when you
15:56use like some
15:57kind of CAD
15:58design software,
15:59you actually use
16:00something that's
16:00extremely powerful.
16:01So basically,
16:02with just a couple
16:03of parametrics
16:03lines,
16:04you can design
16:04something in 3D
16:05that used to be
16:06extremely complex
16:07to design before.
16:08And the same
16:09thing is,
16:10I could see us,
16:11just like we do
16:11a little bit of
16:13parametric shapes
16:14in Onshape or one
16:15of these design
16:16software tools,
16:17I can see us
16:19designing a very
16:19complex system
16:20by just giving
16:21a couple of
16:22points on the
16:23basic curve and
16:23hoping that the
16:24AI system will
16:25just actually
16:25connect all the
16:26things and make
16:27sure this all
16:27fits nicely.
16:28So this is the
16:29other thing,
16:29not that we are
16:30building this
16:30system,
16:31but we are
16:31using them to
16:32help us build
16:33an extremely
16:33complex system.
16:35I think it's
16:36great you mentioned
16:36Onshape.
16:37My son will now
16:37listen to the
16:38episode because,
16:39yeah, I mean,
16:40it's amazing.
16:41You can take a
16:422D picture and
16:44turn it into
16:443D; he's been making
16:45models of himself
16:46just by
16:47taking a frontal
16:48picture, and
16:49it's pretty
16:50amazing.
16:51I want to push
16:51back on a couple
16:52of different things
16:53you got going
16:53there.
16:54One, I think
16:55the chip growth
16:56is something I
16:56hadn't really
16:56thought about,
16:57but we
16:58originally had
16:59CPUs and
17:00then we
17:00developed GPUs,
17:02graphics
17:02processing units,
17:03for processing
17:04images on the
17:05screen and then
17:06we had this
17:06aha that, hey,
17:07those matrices
17:08are the same
17:09matrices inside of
17:10machine learning
17:11models.
17:11We can use those
17:12GPUs much more
17:13efficiently.
17:14But what you're
17:15pointing out is
17:16that we didn't
17:17make those chips
17:18for that purpose
17:19originally and we
17:20might have different
17:21design constraints
17:22if we did make
17:23those chips
17:24from the start.
17:26And if you
17:26combine that
17:27with your ability
17:28to make chips
17:29cheaper, make
17:30hardware cheaper
17:31in general,
17:32then that's a
17:33nice combination
17:34that might get
17:34things started.
17:35To be honest,
17:36the GPU is
17:37increasingly also
17:38an AI-optimized
17:39chip, for sure.
17:40I mean,
17:41Tensor Cores,
17:42and you'll see
17:43a variety of
17:43chips.
17:44You see
17:44Cerebras chips;
17:45we work
17:46with a variety
17:47of them.
17:47I think it's
17:48very interesting
17:48to follow this
17:49field and to
17:50see how
17:50basically
17:52competition
17:52push people
17:53to explore,
17:54you know,
17:54just like you
17:55were saying,
17:55maybe reinventing
17:56this.
17:57Cerebras is
17:58this example
17:59of let's be
18:00able to host
18:00a full model
18:01on just this
18:02very large
18:03wafer-scale chip.
18:04Groq,
18:05that I was
18:05just seeing
18:06on the news,
18:06is also an
18:07interesting case
18:08of, you know,
18:09let's try to
18:09push, maybe
18:11for the low
18:11batch, the
18:12small batch,
18:13let's try to
18:13push the token
18:14per second to
18:14the max.
18:15Maybe the
18:16driving force
18:16here, if we
18:17step back a
18:18little bit and
18:19take more like
18:19a business view
18:20on this,
18:21is for
18:22the first
18:23time, one
18:25of the main
18:25metrics we
18:26have is a
18:28very low-level
18:28metric,
18:29and that's
18:29basically the
18:30cost per token.
18:32And the cost
18:33per token is an
18:33interesting metric
18:34because it's both
18:35something that
18:36basically almost a
18:36CFO-level person
18:38could take a look
18:39at.
18:39When you use
18:40Gemini, that's the
18:41first metric,
18:42you know, people
18:42tell you, and
18:43when you compare
18:44these various
18:44providers, that's
18:45maybe the most
18:46cost-related
18:47metric you'll take
18:48a look at, but
18:50also a metric
18:51that's extremely
18:52low level because
18:53if you think
18:53about that, that's
18:54really just a
18:55series of
18:55operations and
18:56if you think
18:57about that, you
18:58can even link
18:58that to almost
18:59how many
19:00transistors will
19:01I activate
19:02because this
19:02model has this
19:03size and one
19:04token is just
19:05one forward
19:06pass.
19:06You could link
19:07that to exactly
19:09how much, you
19:10know, billion
19:10transistors you
19:11will need to
19:12activate for this
19:13price.
19:14In the past, we
19:15didn't have that
19:16for the price we
19:17were paying for
19:17all our compute.
19:18We never said,
19:19oh, you will pay
19:20actually this
19:20amount because you
19:21do 10,000
19:22operations.
19:23We never had
19:24this connection
19:24between this
19:25cost metric and
19:26the extremely low
19:27level of one
19:27single operation
19:28on the chip.
19:29And so since
19:30this metric is
19:31now the main
19:32one we focus
19:32on, there is
19:35this natural
19:35tendency to lower
19:36this cost and
19:37that comes
19:38directly down to the
19:38low-level
19:39hardware, which
19:39is I want to
19:40actually optimize
19:41the cost to do
19:42one forward
19:43pass and
19:43these things.
19:44It's very
19:44strongly driving
19:45the optimization
19:46of these chips
19:47in this direction.
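The connection drawn here between the price of a token and single low-level operations can be put into back-of-the-envelope arithmetic, using the common rule of thumb of roughly 2·N floating-point operations to generate one token with an N-parameter dense model; the price used is hypothetical.

```python
# Back-of-the-envelope sketch linking cost per token to low-level
# operations, using the common ~2 * N FLOPs-per-token rule of thumb
# for an N-parameter dense model. The price below is hypothetical.

n_params = 3e9                        # a 3B-parameter model, SmolLM3's size
flops_per_token = 2 * n_params        # ~6e9 floating-point operations per token

price_per_million_tokens = 0.10       # hypothetical: $0.10 per 1M tokens
price_per_token = price_per_million_tokens / 1e6

price_per_flop = price_per_token / flops_per_token
print(f"~{flops_per_token:.0e} FLOPs per token")
print(f"~${price_per_flop:.1e} per floating-point operation")
```

This is what makes cost per token unusual as a business metric: dividing a posted price by a count of arithmetic operations yields a dollar figure per single operation on the chip.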
19:48Yeah, I think
19:49metrics are a big
19:50deal and we
19:50respond to those.
19:51I hadn't really
19:52thought about it,
19:52but before we
19:53talked about
19:54computing hours,
19:55well, that was
19:55hard to understand
19:56what an hour
19:57did for you.
19:58At the other
19:59end of the scale,
19:59we had floating
20:01point operations
20:02per second,
20:02which have much
20:04less of a way
20:05to relate that
20:06to anything I
20:07want to do
20:08versus that the
20:09token actually makes
20:10a lot more sense
20:11then.
20:11And then people
20:12will optimize on
20:13It's the dollar
20:14per token,
20:15that's the thing.
20:16We never had the
20:17dollar per gigahertz
20:18that would maybe
20:19push you to make
20:20faster CPU or
20:22like, I don't know,
20:23the dollar per
20:23flops that would
20:25make you make
20:26faster GPU in
20:27computer graphics.
20:28The connection was
20:30really loose between
20:31these two universes.
20:32And so I think a
20:33lot of people are
20:33saying, and I think
20:34that's pretty true, that the
20:34limiting cost of
20:36intelligence would
20:37be the cost of
20:38electricity,
20:39for instance.
20:40And the last
20:41person I heard
20:42saying that was
20:42Patrick Collison,
20:43but I think a lot
20:44of people view
20:45this this way.
20:46I do agree, but
20:47there will be a
20:47multiplying factor
20:48between electricity
20:49and intelligence.
20:50And this multiplying
20:51factor will be
20:51exactly this, like
20:52how much
20:54intelligence can
20:55your chips give
20:56you per
20:56electron, per
20:57energy.
20:58And that's where
20:59you will want to
21:00squeeze this as
21:01much as possible.
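That multiplying factor between electricity and intelligence can be put into toy numbers; every value below is hypothetical.

```python
# Toy numbers for the "electricity floor" on the cost of intelligence:
# electricity price divided by how many tokens a chip can squeeze out
# of each joule. All values are hypothetical.

electricity_price_per_kwh = 0.10      # dollars per kWh, hypothetical
joules_per_kwh = 3.6e6                # physical constant

tokens_per_joule = 500.0              # hypothetical chip efficiency
tokens_per_kwh = tokens_per_joule * joules_per_kwh

cost_floor_per_million_tokens = electricity_price_per_kwh / tokens_per_kwh * 1e6
print(f"electricity floor: ~${cost_floor_per_million_tokens:.5f} per 1M tokens")

# Doubling tokens-per-joule halves this floor: that is the multiplying
# factor between electricity and intelligence.
```

With these made-up numbers the electricity floor is a fraction of a cent per million tokens; the only lever a chip designer controls is the tokens-per-joule efficiency term.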
21:03You mentioned
21:04cognitive, and of
21:05course a big deal
21:06is what we're
21:06going to do with
21:07these chips.
21:07Let's say we've
21:08got all these
21:08chips and we've
21:09got them cheap.
21:10What are we
21:10going to do
21:11with them?
21:11And there's a
21:12report that I
21:12think you reacted
21:13to about a
21:14country of
21:14Einsteins sitting
21:15in a data
21:16center.
21:18And I think
21:19that's pretty
21:19appealing, the
21:20idea that we'd
21:20have, oh gosh,
21:21we'd get all
21:22these models out
21:23there running in
21:23data centers and
21:24we'll just
21:24suddenly not have
21:25one Einstein, but
21:26we'll have
21:27zillions of
21:27Einsteins out
21:28there.
21:28Just think of
21:29the progress.
21:31But today, I
21:32mean, tools in
21:33general and AI
21:34specifically, of
21:35course, they can
21:35be a head start
21:36for people to
21:37get to average
21:37quickly so they
21:39don't have to
21:39spend time getting
21:40to average
21:42cognitive output.
21:43But at the same
21:44time, they can
21:44also be a way
21:45that people just
21:46learn to depend
21:47on these tools.
21:48You know, without
21:49practicing skills,
21:50we don't get
21:51better at skills.
21:51How can we go
21:52beyond average?
21:53So can we have
21:54a country of
21:54Einsteins instead
21:55of sitting at
21:56home?
21:56Can we have a
21:56country of
21:57Einsteins sitting
21:58not in data
21:59centers, but at
22:00homes that are
22:00using AI tools to
22:02provide a head
22:02start?
22:04Yeah, for sure.
22:05So in this case, it
22:06started with this
22:06essay from
22:07Dario Amodei, the
22:08CEO of
22:09Anthropic, basically
22:10saying, yeah, it's
22:11a beautifully
22:12written essay.
22:13It's very
22:13optimistic.
22:14It's called
22:14Machines of
22:15Loving Grace, and
22:16it's basically
22:16saying that AI
22:18will enable us
22:19to do extremely
22:21important scientific
22:22breakthrough.
22:23And what he was
22:24taking as
22:24example was
22:25really this
22:25Nobel Prize
22:27level breakthrough.
22:28And where I
22:29kind of agree
22:30with him, which
22:30is, I think
22:32if you
22:32summarize
22:33scientific
22:34progress, you
22:36have a lot of
22:36incremental
22:37progress.
22:38And I was
22:38guilty of doing
22:39a lot of that during
22:40my PhD and
22:40postdoc.
22:41And that's
22:41basically, that
22:42was also the
22:42maximal thing I
22:43could do, which
22:44is basically do
22:45your tiny piece
22:46on this little
22:47aspect, extending
22:48a little bit the
22:48frontier.
22:49And then you
22:49have this massive
22:51change of
22:52paradigm that
22:52usually are the
22:54one that will
22:55typically be awarded
22:56the Nobel Prize.
22:57And it can be
22:58general relativity,
22:59it can be
23:00CRISPR in
23:01biology, there is
23:02a couple of
23:03them in every
23:03field, and they
23:04usually also create
23:05new fields in
23:06themselves.
23:07And what I was
23:09saying and what I
23:09think is that AI
23:10will be extremely
23:11useful for all the
23:12incremental
23:13innovation.
23:14AI is very good
23:14at exploring many
23:16things around the
23:18status quo, but
23:19AI is extremely
23:20bad at challenging
23:21the status quo
23:22itself.
23:22It's very easy to
23:24get ChatGPT to
23:25agree with you
23:26on anything.
23:27It's very hard to
23:28get this model to
23:29disagree with you
23:30on something
23:31actually and
23:31challenge your
23:32view of the
23:33world, which is
23:33quite a problem
23:34in some case, and
23:36in particular, I
23:36think in scientific
23:37research.
23:38Two weeks ago, I
23:39had the pleasure to
23:40meet again one of
23:41my former
23:41professors called
23:42Alain Aspect, who
23:43got the Nobel
23:44Prize a few years
23:45ago, I think, if
23:46I'm not mistaken,
23:47for basically
23:48proving this
23:49disagreement that
23:50Einstein had with
23:51quantum mechanics
23:52where basically he
23:53was disagreeing
23:54with the core
23:55idea of quantum
23:56mechanics, that if
23:57you project the
23:59wave function, you
24:00just basically have a
24:01random output, and
24:02so you can map that
24:03in one experiment, and
24:05he did this optical
24:06experiment.
24:07And if you talk with
24:08this type of
24:09researcher, they
24:11don't want to
24:12please you.
24:12They have strong
24:13opinion, they have
24:14strong ideas, and I
24:15think it's what
24:15actually led them to
24:17make strong discovery
24:18because they were
24:18like, I don't think
24:19this is right, I want
24:20to prove this wrong, and
24:22they don't try to do
24:23this type of
24:23sycophancy that the
24:24LLM will do, where
24:25they actually want to
24:26please you.
24:27I think that's a
24:28strong missing point
24:29for AI models
24:31nowadays, is they're
24:32really trained to, I
24:34mean, first, they are
24:35trained to predict the
24:36most likely next word
24:38in a sentence, which
24:39means they will miss
24:40words that are
24:40unlikely.
24:42You know, they
24:42will tend to regress,
24:44just what you were
24:45saying, they tend to
24:45bring you to the
24:46average.
24:47They're very good at,
24:47you know, average
24:48thinking, or average
24:51design, or an average creative
24:53process, if you use
24:54them for image
24:54design.
24:55But they're quite
24:56bad at really
24:57challenging the
24:58average and going for
24:59this crazy idea that
25:01might challenge some
25:02of their training data
25:03in particular.
25:04And so my point is
25:06they will be very
25:06useful research
25:07assistants, but they
25:08won't be the ones
25:09that could really lead
25:10us to extremely novel
25:13breakthroughs.
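The regression-to-the-average behavior described above can be illustrated with a toy next-token distribution: a decoder that always takes the most likely word never surfaces the unlikely one. The probabilities are invented.

```python
# Toy illustration of regression to the average: a decoder that always
# takes the most probable next token never emits the unlikely,
# status-quo-challenging one. The distribution is invented.

next_token_probs = {
    "expected": 0.60,    # the consensus continuation
    "plausible": 0.35,
    "heretical": 0.05,   # the rare idea that might challenge the field
}

def greedy(probs):
    """Pick the most probable token, as greedy decoding does."""
    return max(probs, key=probs.get)

picked = greedy(next_token_probs)
print(picked)  # always "expected", never "heretical"
```

However many times this runs, the 5% token never appears, which is the sense in which likelihood-trained models are pulled toward consensus continuations.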
25:14So I think I ask
25:16myself a lot this
25:18question, in particular
25:19when I see my kids, my
25:20son using AI, which
25:22is how much should
25:24they use AI to
25:25automate their
25:26thinking?
25:26What is the
25:27remaining part that's
25:29very human, that we
25:30should keep, that we
25:31should build?
25:32Like always, I think
25:33it's probably a bad
25:35idea to just say
25:35don't use AI.
25:37We'll need to find a
25:38way to teach them how
25:39to use this tool and
25:40to remain very
25:42conscious of what's
25:43the missing part.
25:45You brought up something I think about a lot, both with my own kids and with the university students I'm in contact with all the time: What exactly should their relationship be with these technologies?
25:55Because I'm at the front of the room or around the dinner table, I feel like I ought to have some opinion about this, and it's really difficult to know.
26:06It sounds like you're pushing your kids to use these tools and to embrace them, to some degree at least.
26:12Yeah, I think you have to.
26:14I may have sounded quite critical of these tools, but they are also a huge way to unlock creativity.
26:20Let's talk about vibe coding, for instance: this idea that you can prompt a website into existence, and quite a complex website at that.
26:29I find it fascinating, because coding a website used to be quite complex.
26:34A lot of people self-censor and say, "Oh, I have this idea for something, but it's so complex; I don't know HTML."
26:42There are no-code tools, of course, but they all have their quirks and limits on what they can build; some of them don't have databases.
26:50This general idea that you can just ask for these things to exist is quite new.
26:56So one month ago, for instance, we organized something.
26:59My son is 12, and luckily for me, he's still interested in what I do; I don't know how long that will last.
27:06I brought him, a couple of his friends, and some friends' kids to a little hackathon we organized, where we selected one vibe coding tool, Lovable, which I found very easy to use, very nice.
27:19We explained the design process to them a little: that it's better, for instance, to formalize your idea first.
27:25So we asked them to draw the website they had in mind, to think through their idea, and then to prompt it.
27:31We tried to give their process a bit of structure.
27:34But what we saw is that they grabbed this tool very quickly and started creating many more different apps than we expected.
27:43We thought they would each have one idea, but basically they had 10.
27:46Very quickly, each kid was experimenting with four or five different websites at the same time: this thing to connect scouts with football players, this thing to connect cat owners with secondhand cats.
28:00It was crazy to see.
28:02And imagine this: they were between nine and 12, and they will grow up having basically decided that if they want to create a website, it's just a couple of prompts away, something they can do in a couple of hours.
28:14I think it was very beautiful to see.
28:16You even see them morphing into little entrepreneurs.
28:20My daughter was building this website to connect cat owners with people offering secondhand cats.
28:26She was thinking, "Oh, maybe I could also ask them to pay when they want to meet each other, because then they need to give out their address."
28:32So you see them starting to ask these questions, and I think it's because the technical part is so easy that they project themselves much more into how this would work in real life.
28:43That was one recent example that really struck me as an unlocking of creativity I had no idea could exist.
28:50In September and October, we want to redo this type of hackathon everywhere in the world, to see what happens at a bigger scale than basically just our neighborhood.
28:59So I'm quite excited; maybe by the time this podcast goes out, we'll have this worldwide kids' vibe coding hackathon going on.
29:06I may have a couple of kids we can add to that, but I think it ties together well with what you were saying before: you have this tool that always says yes, will always do what you want, and tries to do it quickly.
29:22The idea is to use to your advantage the fact that this tool will in fact do everything you ask and try very hard to accomplish it.
29:29That seems to tie well with your framing of these tools as good assistants.
29:36Yeah, I think in a way we are quite lucky.
29:38I'm rather unimpressed by all the stories of AI freeing itself from its chains and deciding to take over humanity.
29:47The way we are building these tools, and that's really both their advantage and their strongest limitation, is as assistants to what we want to do.
29:57This has been fascinating; I've really enjoyed talking with you.
29:59Maybe by the time this podcast comes out, we'll have some hackathons organized for kids all over the world.
30:04Thanks for taking the time.
30:08Thanks a lot, Sam. It was a pleasure.
30:11Thanks for joining us today.
30:12On our next episode, I'm joined by Angela Nakalembe, engineering program director at YouTube.
30:18Please join us for an insightful conversation about trust and safety amid an influx of AI-generated content.
30:28Thanks for listening to Me, Myself, and AI.
30:31Our show is able to continue in large part due to listener support; your streams and downloads make a big difference.
30:37If you have a moment, please consider leaving us an Apple Podcasts review or a rating on Spotify, and share our show with others you think might find it interesting and helpful.
30:46Thank you.