Building the Future with OpenAI Agents
Transcription
00:00Good afternoon, everyone.
00:01Such a pleasure to be in Paris today
00:03and to see all of you.
00:05You know, I'm thrilled each time I come back to France.
00:08I was here on the stage last year
00:10and I'm struck by the intensity and the level of talent
00:12that keeps on increasing each time I come back.
00:17OpenAI is a research company
00:19working on building Artificial General Intelligence, or AGI,
00:23that benefits everyone.
00:24And we're starting to see glimpses of that AGI
00:27and what it will be.
00:28It no longer feels like a theoretical concept
00:31like maybe a few years ago.
00:34And to achieve this mission,
00:35we truly believe in iterative deployment,
00:38making sure we can put the technology
00:39in the hands of people like all of you
00:41and learn along the way to improve our models
00:44and our capabilities and products.
00:46And the top focus for us at OpenAI
00:48is to empower the best developers, founders, entrepreneurs,
00:52like many of you in this room,
00:53to make sure you can build with the best frontier AI models.
00:58And it's truly incredible to see the progress we've made
01:01since I was here last year.
01:03At the time, most of the companies, developers I was speaking with,
01:08they were kind of like experimenting with AI.
01:10They were building text-based assistant experiences.
01:13And I was showing for the first time in Europe,
01:16GPT-4o and the speech-to-speech capabilities
01:19that we had at the time.
01:21We started to talk about agents,
01:23but candidly, this was quite hard to build.
01:26And the capabilities of the models were just not there yet.
01:30Well, now, AI is getting deployed everywhere,
01:33and that's truly exciting.
01:35From supply chain, consumer apps, finance, customer service,
01:39every sector and every app is now benefiting from frontier intelligence.
01:44But the real breakthrough that happened in order to get there
01:48is what happened with reasoning,
01:49something we introduced back in September
01:51with the O series of models starting with O1.
01:55And O1, for the first time, could reason through complex problems
01:59in math, finance, strategy, science, and coding.
02:03And the way it worked for O1,
02:06and now with O3 and O4 Mini, the successors,
02:09is the ability for the model to have a chain of thought.
02:12And what that means is what we refer to in the field
02:14as test-time compute,
02:16the ability for the model to kind of explore multiple paths
02:19before committing to an answer.
02:21And as a result, those models require far fewer prompting techniques now.
02:26They also need a lot less context,
02:28and you get much better answers than the earlier models.
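As a rough illustration of how you would call one of these reasoning models today, here is a minimal Python sketch against the Responses API; the model name, the reasoning-effort values, and the prompt are assumptions based on OpenAI's public docs rather than anything shown in the talk.

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Reasoning models spend extra test-time compute exploring a chain of thought
# before answering, so the prompt can stay short and direct.
response = client.responses.create(
    model="o4-mini",               # assumption: any o-series reasoning model works here
    reasoning={"effort": "high"},  # low / medium / high trades latency for depth
    input=(
        "A factory produces 1,200 units per day with a 3% defect rate. "
        "If the defect rate drops to 1%, how many extra good units per 30 days?"
    ),
)

print(response.output_text)
```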
02:33And it's really important to underscore
02:35that this paradigm shift
02:38that happened with reasoning
02:40is the backbone that now enables agents.
02:43We're now coming to a place where we have agentic AI,
02:48that is, AI that is able to take actions on behalf of users.
02:52So we went from this era of interaction with models
02:55like back and forth, asking questions, getting answers,
02:58to now a place where you can have models acting on behalf of users
03:02in the world, which is pretty exciting.
03:04And that's why at OpenAI, we keep referring to 2025
03:07as the year of agents, the year where they finally get to work.
03:10And we'll talk mostly about agents today in this talk.
03:15Earlier this year, we launched Operator.
03:17This was our first agent that's not just able to browse the web,
03:21it's also able to take actions for you.
03:23So here you can see, for instance,
03:26this tweet from Aaron Levie describing what's happening here
03:28with due diligence for a TikTok transaction.
03:32And you see the computer use model is able to go through tabs,
03:36fill out forms, click links, and navigate the internet.
03:39And what's really amazing here when you have such a model is that
03:43it can operate a computer without having APIs
03:47or very specific interaction.
03:49It would just use a computer or a browser like a human would.
03:52And what's really amazing with this is that for companies
03:55or enterprises even, that means driving efficiency.
03:59That means freeing teams to work on what's more creative
04:03as opposed to doing repetitive tasks.
04:06And so if legal needs due diligence, well, you can use Operator
04:10to spin up a folder, pull every filing you need, and scan the web.
04:14Maybe you need to book a trip, or you're building a travel app.
04:17Well, in which case, Operator could find hotels, book flights,
04:21and navigate the internet, even booking top-rated tours
04:24that may not have an API.
04:26Or if you want to stay on top of, say, volatile markets these days,
04:29well, you can give Operator a task that scans the web
04:32overnight, and you wake up every morning with a brief of what's happened.
04:38Then a few months ago, we launched a second agent that we call Deep Research.
04:43And that agent lives directly within ChatGPT.
04:46What's really cool about Deep Research is that with a very simple prompt,
04:50you can suddenly have insights that are coming from hundreds of online sources
04:55all at once, and you get like a very personalized and comprehensive report,
05:01like a PhD-level report, turning like complex research into something
05:06that's now much more accessible.
05:08Previously, you might have spent hours, if not days, to actually conduct
05:13this kind of research.
05:16Well, now, in maybe like 20 or 30 minutes, you'll have something very comprehensive
05:21that you can refer to.
05:22So imagine your business wants to expand to a new country,
05:26or maybe you want to find like key clauses from multiple supplier contracts.
05:31You can fire off those Deep Research tasks now.
05:33And personally, I love using it to learn more about like new topics,
05:37and I get personalized reports to learn more about them.
05:42So these are the first two agents that we launched at OpenAI earlier this year.
05:46But what's also very important to us is to enable all of you
05:50to build your own agents, to integrate agentic AI in your apps,
05:54in your products, to reach your own customers.
05:57And so, in fact, that's exactly why we built the OpenAI API back five years ago.
06:03That's right, we're actually celebrating the fifth anniversary of the API this week.
06:08And most people don't know this, but this was our first product
06:11before we even launched ChatGPT.
06:14And the vision for us with the API and the developer platform
06:17is to enable all of you to build these kind of experiences
06:20where you can have your users benefiting from powerful AI agents
06:25that can turn complex tasks into something much more easily accessible for them.
06:32So we want to power your agents and let them learn and act on your terms,
06:37with your guardrails, with your context,
06:39and that's why we're investing in the platform.
06:41So let me walk you through three pillars that we're investing in to make this possible.
06:47But first, let's talk about models.
06:49I mentioned the progress a bit earlier, but truly, I want to call out the pace here
06:55and how fast everything is accelerating in AI.
06:58I'm sure you keep reading the news and hearing that AI is moving fast,
07:02but seriously, this even blows our minds.
07:04Like, we used to measure progress on models in years and then in months,
07:08but these days, we can measure progress in weeks.
07:11That's the pace at which, like, AI progress is going.
07:14And let me show you a quick recap, in fact, of what we've launched on models
07:18at OpenAI in just the last month.
07:21Let's take a look at the GPT family first.
07:24GPT 4.1 is multimodal, meaning you can use it for text, for vision, for audio.
07:29We also offer mini and nano versions that are much more efficient,
07:34and so you can use the one you need for your latency and cost needs for your specific use cases.
07:41And you can really think of GPT 4.1 as the workhorse.
07:44It's like the model you want to use to, like, really follow the instructions
07:49and fire off any kind of task at it, you know.
07:52We train these models, by the way, based on developer feedback,
07:55and that's how we make them better.
07:58We improve the instruction following, we increase the context window, up from 128K
08:03with GPT-4o to now a million tokens,
08:05so you can bring a lot more context into every request.
08:08And they offer exceptional performance at lower cost.
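To make the "pick the size you need" point concrete, here is a small hedged sketch that sends the same request to the three GPT-4.1 sizes so you can compare latency, cost, and answer quality; the model identifiers follow OpenAI's published naming and are assumptions, not something demoed on stage.

```python
from openai import OpenAI

client = OpenAI()

prompt = (
    "Summarize this support ticket in one sentence: customer reports double "
    "billing on the June invoice and requests a refund."
)

# Same call, three sizes: trade a little quality for lower latency and cost.
for model in ("gpt-4.1", "gpt-4.1-mini", "gpt-4.1-nano"):
    response = client.responses.create(model=model, input=prompt)
    print(f"{model}: {response.output_text}")
```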
08:13Now let's talk about reasoning models.
08:15And once again, we've also refreshed that entire family just last month
08:20with O3 and O4 mini. These two are the smartest and most capable models we've ever built.
08:26They push the frontier across math, science, coding, visual perception, and much more.
08:33And I've mentioned coding quite a bit, we'll get back to it,
08:35but that's a real priority for us.
08:37We want to build the very best models in the world at writing software,
08:40and we really think that's how technical teams will get unimaginable leverage.
08:44We'll get to this with Codex in a bit.
08:48And so with all these models, what's also very important to us is that we want to have great models,
08:53but we want to help you deploy at scale.
08:56And we know that performance and costs go hand in hand.
08:59It's very important to have both in order for you to build agents.
09:02And so here, when you look at GPT 4.1 Nano, for instance, on the right side,
09:07everything has gotten so much cheaper to run.
09:10In 18 months, we've cut the cost of AI by 99%.
09:14That's pretty incredible when you think about it, because in any other industry,
09:18cutting costs by 99% would be terrifying.
09:22But for us, it's actually great, because every time we do so,
09:25there are more and more use cases that you come up with,
09:28and then more AI applications that you can deploy.
09:31And we know that controlling cost is what makes deployment even in the enterprise viable,
09:36because you need scale.
09:38And in fact, on Wednesday, just this week, two days ago,
09:40we've also reduced the cost of O3 by 80%, with no compromise on performance.
09:45We really want you to deploy these AI agents at scale.
09:48And that's pretty incredible.
09:51Now, let's talk about the second big focus for us, and that's building an agent platform.
09:56Let's face it, building an agent is actually quite hard when you get into the details of it.
10:01I've worked with many, many customers during my time at OpenAI,
10:04and there's one thing that comes up pretty frequently as a pattern.
10:09You know, companies, developers, enterprises, they have great ideas for an agentic application,
10:13and they just think that, like, picking the right model for the task is all it takes.
10:19But that's actually just half of the story.
10:21That's where the rubber meets the road, because quite candidly,
10:24you need to have the right model for the right task, but you also need to bring your context.
10:28You need to have access to the right tools.
10:30You also need to evaluate performance and how that deploys in production.
10:36And that's why we're building this agents platform.
10:38The goal for us is to give you all of the building blocks that you need to build great agents
10:43that are powerful and reliable.
10:46So, you know, APIs, tools, we'll get to that.
10:49Orchestration, evals, that's actually something very hard to do.
10:52How do you stay on top of orchestration?
10:56And how do you optimize your agent?
10:58So I won't have the time to dive into all of these today,
11:01but I want to call out two that I think are pretty important.
11:04So if you look at this demo here, this is like a demo application to book travel on the left
11:09side.
11:10And on the right side, you see what our orchestration SDKs can do.
11:14And here you can tell that, like, as I'm trying to book a trip to Scotland,
11:19you have multiple agents that are triggered.
11:22First, a ticketing agent.
11:23Here we see a hotel agent that's using computer use to book a hotel.
11:27You also have a report agent.
11:29And so you have full visibility into the waterfall.
11:32What are the agents?
11:33What are their guardrails?
11:34What are the contexts that they have access to?
11:37And we also just introduced last week the same agents SDK in TypeScript.
11:41Very excited about this because it's going to make it much easier for you all to build full-stack applications
11:47that marry front-end and back-end, but also have all of the benefits of having this, like, team of agents
11:52working hand-in-hand.
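The stage demo uses the TypeScript SDK; for a feel of the same orchestration pattern, here is a minimal sketch with the Python openai-agents package, where a triage agent hands off to hypothetical ticketing and hotel agents (the agent names and instructions are invented for illustration).

```python
# pip install openai-agents
from agents import Agent, Runner

ticketing_agent = Agent(
    name="Ticketing agent",
    instructions="Find and book train or plane tickets for the requested trip.",
)

hotel_agent = Agent(
    name="Hotel agent",
    instructions="Find and book a hotel near the traveler's destination.",
)

# The triage agent routes the request and hands off to a specialist;
# the SDK traces each handoff so you can inspect the full waterfall.
triage_agent = Agent(
    name="Travel triage",
    instructions="Decide which specialist should handle each part of the trip.",
    handoffs=[ticketing_agent, hotel_agent],
)

result = Runner.run_sync(triage_agent, "Book me a trip to Scotland next month.")
print(result.final_output)
```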
11:55We also launched a new API underlying all of this to make them easier to build.
11:59We call it the Responses API.
12:02And that Responses API supports many tools out of the box.
12:05When we launched it back in February, we launched it with web search, file search, and computer use,
12:10the same model underlying operator that you can also use now in your own apps.
12:15But just recently, we also launched MCP support.
12:18We also added the ability to do code interpreter.
12:21And most recently, image generation as a tool.
12:24So you can also use that.
12:25And we keep increasing the collection of tools that you are able to access
12:29when you're building these agents for your own product and for your companies.
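As a reference point, this is roughly what wiring several of those built-in tools into a single Responses API call looks like in Python; the tool type strings follow OpenAI's documentation at the time of writing and may evolve, and the MCP server URL is a placeholder.

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4.1",
    tools=[
        {"type": "web_search_preview"},   # hosted web search
        {"type": "image_generation"},     # image generation as a tool
        {
            "type": "mcp",                # remote MCP server (placeholder URL)
            "server_label": "internal_docs",
            "server_url": "https://example.com/mcp",
            "require_approval": "never",
        },
    ],
    input="Research current Paris hotel prices and draft a one-paragraph brief.",
)

print(response.output_text)
```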
12:34And speaking of image generation, that brings me to the third pillar that we're investing in.
12:38And that's what we call multi-modality.
12:41The ability for the model to really have access to all of the different modalities from vision to audio and
12:48so on.
12:49But I have to confess, this was the state of the art last year when I was here at VivaTech.
12:53And I was not quite proud of how the landscape looked at the time,
12:58because developers had to integrate with multiple APIs for all of the different modalities.
13:03And that was quite complex.
13:04You know, DALL·E and then Whisper and so on.
13:07But GPT-4o was the turning point.
13:10This was the model where we started to bring all of these modalities together.
13:13And now with 4.1, we're starting to see how all of these modalities converge into one model,
13:20one API that you can rely on much more easily.
13:23And developers are much happier now as a result.
13:26We're continuing to invest a lot in multi-modality, starting with image generation.
13:33Who in the room has tried image gen in ChatGPT when it launched?
13:39Yeah, that's a lot of you.
13:41Not surprising, right?
13:42Because in just the week after launch, we had more than 700 million images generated on ChatGPT.
13:49And now we're well into the billions since launch.
13:53And when you look at what customers have done with this, companies like Adobe, Figma,
13:59here on the screen, you see Photoroom.
14:01They're using image gen to turn any kind of picture of a product into like a staged environment in various
14:10settings.
14:10So you can really bring these pictures to life.
14:13They can even take like any kind of photo and make it like more beautiful than it was.
14:18So very impressive use cases for image gen from all of these companies.
14:23Figma, for instance, makes it easy for any employee to now become, like, a basic designer
14:28and have some better tools at their disposal.
14:33We've also upgraded the Realtime API and all the capabilities around voice and audio, transcription, text-to-speech.
14:40You know, just a week ago, we also improved the real-time API now with like better steerability,
14:46a new speed parameter that you can control.
14:48We made the text-to-speech models more steerable so you can actually like adjust the tone and the vibe
14:53of the voice.
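A small sketch of that steerability with the speech endpoint in Python; the gpt-4o-mini-tts model name and the instructions parameter reflect OpenAI's current docs, so treat the details as assumptions.

```python
from openai import OpenAI

client = OpenAI()

# The `instructions` field steers the tone and delivery of the voice,
# independently of the script being read.
with client.audio.speech.with_streaming_response.create(
    model="gpt-4o-mini-tts",
    voice="coral",
    input="Good afternoon, VivaTech. Welcome to everyone joining us online.",
    instructions="Warm, upbeat conference-host tone, at a moderate pace.",
) as speech:
    speech.stream_to_file("welcome.mp3")
```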
14:54There's a lot more to say about multi-modality, but these are just a few call-outs I wanted to
14:59make.
15:01All right, now I want to zoom in on one group that sits at the center of a lot of this
15:05change with AI.
15:07And these are your technical teams, your software engineers.
15:10It turns out we just introduced our latest agent, Codex, specifically to augment developers.
15:18And for your technical teams, this is not about just like velocity.
15:22It's also about, like, bringing your 2026 roadmap and pulling it into 2025.
15:27It's about shipping ideas that maybe you had for the longest time on the backlog, but they felt out of
15:32reach.
15:33And when you have like a 5x or a 10x leverage, the whole tempo of product development changes.
15:39You can go a lot faster, you can do a lot more for your own products.
15:44And for me as an engineer, it's pretty incredible to see how far like coding with AI has come.
15:49Like that was just the state of coding a few years ago, like a single line autocomplete,
15:55which was at the time pretty compelling.
15:58But then in just a few years, we've gone from that to like, you know, complete functions being written
16:03and then, you know, code blocks entire files.
16:06But now we're entering a stage where the models are so capable
16:09that you're actually able to give them much larger engineering tasks
16:13compared to what you could do back then.
16:15And so the interaction patterns also with your coding editors and your tools have to change.
16:21And our vision here for Codex is to let developers delegate a lot of tasks to coding agents,
16:29making them more productive, making them faster, but also freeing up time for them to spend it
16:35on what is most exciting and more creative.
16:38So here, for instance, we imagine a world where developers can delegate a lot of their tasks
16:43to like their teammates in the cloud, like AI teammates of sorts.
16:47Maybe they can write code, build features for them, but also everything that's like the intricacy of software engineering,
16:53like running tests, you know, or like fixing bugs, migrating legacy code and so on and so forth.
17:00And that's why now, with tools like Codex, developers can spend more time where they feel more creative.
17:08And we're starting to see this emerge in two different ways.
17:11First is Codex CLI, which we launched alongside O3.
17:15Codex CLI is a very lightweight coding agent that lives locally on your terminal.
17:21So what that means is that developers can pair with it and explain in plain language what they want to
17:28get built.
17:28And if we take a look at this example here, for instance, I'm simply saying,
17:33well, explain this entire code base to me.
17:35And then the model will go on and explain what happens in that code base with like complete understanding,
17:41even if you have like thousands and thousands of files.
17:45But I can also ask to generate docs, I can ask to build features and create, you know, production quality
17:50code.
17:50It also excels at multimodal understanding, but we'll see that in a live demo right after.
17:57But where things get really interesting is when developers don't have to just pair with one AI agent or teammate
18:04locally on their Mac.
18:05It's when you start thinking about: what if you had a cloud-based agent, now a team
18:11of AI agents, that can pick up tasks for you and parallelize a lot of this work?
18:16And that's why we introduced Codex in research preview.
18:20It's a cloud-based agent that engineers can use to delegate work in the cloud.
18:25So the way it works is that you can connect a GitHub repository with your codebase.
18:31And then what's going to happen is, like, for every task there's going to be a container spun up in
18:37the cloud, securely and privately, that the model will be able to work with.
18:43So it's going to have access to a terminal to read the files, edit code and so on and so
18:47forth.
18:47And our goal with this is to make sure that the code that's produced is actually great quality code that
18:55developers want to merge, not just code that looks good on benchmarks.
19:00And that's why we built codex-1.
19:02It's a custom fine-tuned model based on O3, using the reinforcement fine-tuning technique.
19:10And so we have this great model that now generates high quality code optimized for what you want to merge
19:16in production.
19:18And we started to launch Codex here with this very simple UI within ChatGPT.
19:23But ultimately we want the Codex agent to be ubiquitous.
19:26Like in any kind of tool you have, you should be able to access your Codex agent, whether it's your coding
19:31editor or maybe your terminal and your issue tracker.
19:34And that's a big bet for OpenAI.
19:35We want to make sure that developers have great AI agents they can rely on to produce great quality code.
19:44And so far developers have loved it.
19:47Here is an example of a developer that used codex right after the launch, like a few hours after the
19:54launch.
19:54And he created 50 pull requests with codex and he told us he merged all of them actually.
19:59They were high quality.
20:00Another engineer told me that same launch weekend that they managed to migrate an entire code base in one shot
20:09in 12 minutes.
20:10The developer was shocked because he told me he had planned to work on it for three full weeks.
20:15And all of a sudden, Codex could do it in 12 minutes, done.
20:18So that's pretty incredible.
20:21And we launched Codex with almost no rate limits.
20:23So as you can tell on the screen, Ryan and other developers enjoy having generous rate limits to take advantage of
20:30it.
20:31And now codex is available to ChatGPT Plus users as well.
20:35Lastly, we added codex to the mobile app too.
20:38So that's pretty cool because you don't even have to be now on your computer.
20:41You can start thinking about ideas.
20:43You can be on the go, firing off tasks to Codex.
20:47And then ultimately, you'll be able to review pull requests of fresh code when you're back at your desk.
20:53But with all of that, enough code.
20:56I think it's time to see some live demos.
20:57I'm going to show you some codex CLI demos.
21:02And we'll get a sneak peek into the codex agent as well.
21:08All right.
21:10So what you're looking at here on the screen is a very simple demo app that we built to introduce
21:16the new text-to-speech models.
21:17We called it OpenAI.fm.
21:19So it basically lets you pick a voice, a vibe, type a script.
21:23You hit play.
21:24Good afternoon, Viva Tech.
21:27To everyone here and everyone joining us online, welcome.
21:32Great.
21:32So imagine now I have this project I'm working on, but I want to start making some modifications to it.
21:37So I'm going to open up the terminal.
21:39And what you're looking at here is the codex CLI.
21:41And the first thing I did was like asking codex to explain this codebase to me for OpenAI.fm.
21:47So as you can tell, it goes on to explore the codebase.
21:50It looks at every possible file.
21:52And then we have this interesting thing here, right?
21:55Right away, the CLI is able to tell me, okay, I've analyzed the whole codebase.
22:00This is how it's structured.
22:01It even pointed out the interesting pieces here where, like, I integrated OpenAI text-to-speech, for instance, where the
22:07routes are and so on.
22:09And so pretty compelling that I have this, like, agent locally on my machine.
22:14But then, of course, explaining a codebase is interesting, but I can do much more.
22:18So then I opened this, like, second window to save us a bit of time.
22:22Once again, this one is running, like, codex mini.
22:24And then I said, you know what?
22:26I looked at this app, and when I turned dark mode, it was actually not working, not implemented.
22:33And so I thought, you know what?
22:34Change the top left logo to vivatech.fm, implement dark mode based on the macOS settings.
22:40I also turned the orange accent to a beautiful purple because that's the theme of this stage, right?
22:45And then as soon as I fired off this task, locally on my machine, the codex agent was able to,
22:51like, break it down, try to understand what to do.
22:54I will scroll fast here because it was browsing the entire codebase to figure out what to do.
22:58It was searching for the accent color and so on.
23:00It's starting to make the dark mode integration, as you can tell.
23:03It's updating the CSS logo and so on.
23:05Pretty compelling to see it do all of these things on my local codebase.
23:09And then after a minute or so, this is what it did.
23:13So it did change the vivatech title, made the dark mode all work perfectly.
23:18It made, like, the accent color in purple.
23:21And then if I switch back, everything still works fine.
23:24So very compelling.
23:26And then I can fire off one more task and keep on going.
23:29Maybe I could say, what could I say here?
23:30Maybe I could say, like, implement a new vibe called Paris Tour guide.
23:40And then, you know, it will figure out where this is displayed.
23:43It will also, like, look at the code.
23:45I should probably have told it to make it default.
23:48But, you know, it will now just go on and do this task and we don't have to watch it.
23:52But this is pretty compelling.
23:54Now, while it's working on this, I want to show you a second demo.
23:57What if I'm working on something that requires multimodal reasoning and I'm actually starting from scratch with no project here?
24:05So what I'll take here is, like, this piece of paper that I started to sketch on.
24:11I'm going to try to bring the camera so you can see what I'm looking at.
24:15So I'm starting to draw a little bit like a vivatech booth app, right?
24:19So I left one spot here which I need to fill out.
24:23So we're going to do that together.
24:24Here I'm going to say that I want a filter called Paris Glow, since we're in Paris.
24:31And then here I'm going to say add countdown.
24:41Flash 3, 2, 1.
24:44Great.
24:44So now, what you're looking at here is my beautiful vivatech photo booth.
24:50I'm going to try and go ahead and take a screenshot of this, if I can.
24:53So let's see.
24:55Very clunky way to take a screenshot, but, you know, bear with me.
24:59Okay, maybe right there.
25:01Cool.
25:02Okay, now that we have this screenshot, I'm just going to pass it right away to Codex.
25:08So we'll see what happens.
25:09So here, I just mentioned to Codex that I have this image to look at, the screenshot we just took.
25:16It's going to take a few minutes of thinking, but it's going to analyze that sketch and figure out what
25:20to do with it.
25:21Okay.
25:22So now, it's asking for clarification, obviously, because I have not explained what I've been trying to do.
25:29But, as you can tell, it definitely understood it was a photo booth app with some filters.
25:33It even found the Paris Glow I just wrote.
25:36So I'll just say implement this as a simple HTML page, elegant design using the web camera API.
25:47Enter.
25:49And now we'll see.
25:50It's going to probably take a minute or so to figure it out.
25:52But, you know, it would take me much, much longer to figure out how to use, you know, the
25:58getUserMedia API on the web and so on.
25:59So I'll just give it this task and we'll let it run in the background and we'll come back to
26:04it in a minute or so.
26:06So that's Codex CLI.
26:08Once again, like an open source agent running on your computer.
26:12But as I said earlier, when things get really interesting, it's like what if I can not have like this
26:17one task running at a time on my machine, but I can fire off a bunch of tasks.
26:21Let's look at Codex on the web.
26:24So here I have Codex open on this OpenAI FM repo we were looking at.
26:30And I can start firing off a bunch of tasks from here.
26:33So as you can tell, I have a few that I've already done.
26:36We'll look at them.
26:37But here I'm going to fire off a task.
26:39Let's say find a bug and fix it.
26:42So very ambiguous task.
26:44But we'll just let Codex figure it out, navigate the code base and so on.
26:49Now, what if I want to do something more interesting?
26:52Maybe I'll say, well, you know, I have this script here to type.
26:55I'll just turn on dictation for this one.
26:59Hey, so I'm thinking there is a script on the right side.
27:04It would be kind of cool if we had a little microphone icon on the bottom right.
27:08And maybe when I click it, you can actually transcribe what I'm saying into the text using the latest OpenAI
27:14models.
27:15Okay.
27:18Sounds great.
27:18We're going to fire off this task to Codex and Codex will have to figure it out.
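Under the hood, the microphone feature being requested here would presumably end up calling OpenAI's transcription endpoint; a minimal sketch of that call in Python, assuming the gpt-4o-transcribe model and a pre-recorded clip.wav file.

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical flow: the web app records a short clip from the microphone,
# sends it to the transcription endpoint, and drops the text into the script box.
with open("clip.wav", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="gpt-4o-transcribe",  # assumption: OpenAI's current speech-to-text model
        file=audio_file,
    )

print(transcript.text)
```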
27:23Maybe I can also say, you know, the sound wave effect on the play button does not look real to
27:32me.
27:32I'd like to replace this animation with the real sound wave.
27:35So just figure out how to implement it.
27:40Boom.
27:41I'm going to fire off this task as well.
27:43So what's happening now on the screen?
27:45I just fired off tasks and each and every one of them now is starting a new container.
27:51So if I look at this one, for instance, it's now like spinning up a container, cloning the GitHub repo.
27:56It's going to go through all of the setup scripts to install everything.
28:00It's a Next.js project.
28:01And then it's probably going to take like a couple minutes or three minutes to figure it out.
28:05But it's going to go through everything in that one container and work on it.
28:09So I can just leave it behind and we'll come back to it.
28:11So now I just fired off three tasks with you all on stage.
28:15But since they're going to take a little bit of time, let's switch to one that I just fired off
28:19already.
28:21This one.
28:21I know it's painful for all of us developers to localize apps in so many languages each time.
28:27Well, I just fired off a task which was extract all the strings in this app and make it work
28:32in French and Spanish.
28:34Well, right away Codex went to the task, extracted all of the possible strings,
28:39and then it made everything that we had to do to extract all of this.
28:45And if I scroll to the bottom, we have the fr.json file, we have the Spanish one,
28:51and I can even say, great, do it for all languages in Europe.
28:59And I can go ahead and ask it to code it.
29:01But what's interesting here now is this button on top, create pull request.
29:05Oops, I guess I just clicked archive because it disappeared.
29:08But let's go back to this other one.
29:10Let's see this one, for instance, implement dark mode.
29:13I can just go ahead and click create pull request.
29:16And that will create a pull request on GitHub that my coworkers can review,
29:19and we can just right away merge it into production.
29:24So that's a preview of Codex Agent.
29:27With those three tasks, we fired off.
29:29They're going to take a few more minutes.
29:30But I wanted to give you a feel for new tasks and tasks already completed.
29:34Now let's go back quickly to what happened in our terminal here.
29:37Let's take a look. It sounds like our photo booth app was complete.
29:40So let's open it and see what happens.
29:44All right, I have to allow the camera.
29:47And we have a VivaTech photo booth.
29:51Great, it just worked.
29:53So I don't think anyone has taken selfies with laptops before.
29:55But, you know, we can probably try.
29:58Let's see if it put the countdown.
30:00It did put the countdown.
30:02And there we go.
30:03We have a photo booth app zero shot using Codex CLI.
30:06So that gives you an idea of using Codex locally on your machine to build tasks
30:10or Codex Agent in the cloud to take on all of your most complex tasks for your backlog.
30:17Now, last but not least, I have one last thing to show you before we part.
30:22I'm going to bring here this little drone that I programmed initially with O1.
30:29So when O1 came out with reasoning,
30:32one of the first things I wanted to show with O1 was this idea:
30:37could we have O1 actually interface with this device without me knowing anything about it?
30:42I don't have to read documentation.
30:44I know it's a programmable drone because it's made for education.
30:47But I wanted O1 to figure it out on its own.
30:52So I did that.
30:53I'm going to connect to it over Wi-Fi.
30:55But what's new here is that, of course, since I did that just six months ago,
30:59the capabilities of AI have increased a ton, especially around multi-modality.
31:04And so I thought, well, it's kind of nice to have these buttons,
31:07but what if I could enter voice mode to talk to it instead?
31:10So we're going to try that in a sec.
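A voice mode like this typically sits on the Realtime API; the rough outline below uses the openai Python SDK's beta realtime client, with the drone control reduced to a hypothetical take_off() stub, and the event names follow the SDK docs at the time of writing.

```python
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()

def take_off() -> None:
    print("drone: taking off")  # hypothetical stub standing in for the real drone SDK

async def main() -> None:
    # Open a realtime session; in the real demo, microphone and speaker audio
    # would be streamed in and out instead of plain text.
    async with client.beta.realtime.connect(model="gpt-4o-realtime-preview") as conn:
        await conn.session.update(session={"modalities": ["text"]})
        await conn.conversation.item.create(item={
            "type": "message",
            "role": "user",
            "content": [{"type": "input_text", "text": "Could you do a quick demo flight?"}],
        })
        await conn.response.create()
        async for event in conn:
            if event.type == "response.text.delta":
                print(event.delta, end="")
            elif event.type == "response.done":
                take_off()  # here a tool call from the model would trigger the flight
                break

asyncio.run(main())
```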
31:12I'm going to switch over to the server of this, try to connect to it.
31:17Let's go back to it here.
31:20We are connected.
31:21Great.
31:22And I can see the battery level show up.
31:23Great.
31:24And I'm going to start the video.
31:28Great.
31:29Let's enter voice mode.
31:32Hi there.
31:35How can I help you?
31:37Yeah.
31:37Could you tell me about your capabilities?
31:40I can take off, fly around and help you with some video analysis of your surroundings.
31:47That's great.
31:48Could you actually take a look at what you see now?
31:56Sure.
31:56I'm going to take a look around.
31:59It looks like a conference room with a large audience facing a stage and screens.
32:06The lighting is purple and everyone looks attentive.
32:09That's right.
32:09That's exactly that.
32:11Could you transform the frame you're seeing into like a style of a Monet painting?
32:18Which style would you like exactly?
32:21You know, like a Monet painting you can pick.
32:29Got it.
32:30I'm on it.
32:32Great.
32:33And while you're working on this, can you tell me your battery level?
32:41The battery level is at 88%.
32:43That's right.
32:44Could you do a quick demo flight for us in the room?
32:50Absolutely.
32:51Let's add a little excitement to the room.
32:54Here we go.
32:55I'm flying now for a quick demo.
32:57Let me know what you think.
32:59I hope that little flight looked good from your view.
33:02Amazing.
33:03That sounds great.
33:05And we have the beautiful Monet painting of you all that I have to save.
33:11I'm glad you like it.
33:12Let me know if you need any more artistic transformations or any other help.
33:17Thank you, AI agent.
33:19We're good to go.
33:19I think they enjoyed it.
33:25All right.
33:25So these were a couple of demos of Codex CLI, the Codex agent, and how you can bring all of
33:31this together with multi-modality.
33:33Now, let's go back to the presentation to wrap things up.
33:35Because you might be wondering, what's next for OpenAI?
33:39Well, I wanted to call out two models that we've already mentioned publicly, but were not
33:44part of the presentation today just yet, so we're going to fix that.
33:47First, an open model.
33:49We're super excited to bring a new open-weight model to the community.
33:54We've had several listening sessions with many developers to listen to their feedback,
33:58their wish list, including here in Europe.
34:00We've learned a ton.
34:02We've gathered great insights, and we're hard at work building one.
34:05Sam tweeted just this Tuesday that it's going to take us a little more time because we think
34:09it's going to be pretty awesome, so we are hard at work on this, and you can still
34:14expect it this summer.
34:15Pretty excited to see what the community will do with this.
34:18And second, of course, GPT-5.
34:21Sam telegraphed our roadmap earlier this year.
34:23We know with this pace of change in AI, the model names have become quite complex to follow
34:29along, with O3, O4 mini, GPT-4o, 4.1.
34:32So we're truly excited to not just make a net new great frontier model.
34:37We're also going to unify our two series.
34:40So the breakthrough of reasoning in the O series and the breakthroughs in multi-modality in
34:45the GPT series will be unified, and that will be GPT-5.
34:50And I really hope I'll come back soon to tell you more about it.
34:55With that, thank you all so much.
34:57I can't wait to see all the agents you're going to build and how Codex will help you build faster.
35:02Thank you.