  • 1 day ago

Transcript
00:00 We want models to be really good at checking their own work,
00:03 especially as the things we ask our models to build become more complex.
00:07 My name is Eskew, and I work on training the models to be better at web development, app development,
00:12 anything that really requires some sort of user experience.
00:15 Today we're talking about the launch of our new model, GPT-5.4 Thinking,
00:19 and two of its app development-related capabilities.
00:22 One is its ability to use CUA, or computer use,
00:25 and two is its ability to make great websites using an image input.
00:30 When we ask the model to use CUA, compared to 5.3 Codex,
00:33 it doesn't have to spin up a new environment to do it.
00:36 It's more like how you or I would interact with a computer.
00:39 With persistent CUA, we're seeing in some cases, when we ask the model to test its own work,
00:45 that the token use has actually dropped by two-thirds, which is quite exciting.
00:49 So I brought some examples to show today.
00:51 I'm going to open Codex, I'm going to use GPT-5.4 Thinking,
00:54 and I'm going to use the high reasoning level.
00:56 Build and test.
01:00 A 3D chess game.
01:03 Electron app.
01:04 I'm going to add just a little bit more of a challenge for the model as well,
01:08 and ask it to make two effects, glass and marble.
01:12 It's cooking.
01:13 This is a challenging use case for CUA, because there are so many pieces.
01:17 You have to click the right pieces. Are reflections working?
01:20 The model needs to have a good sense of all the rules,
01:22 and then how manipulating those pieces will lead to a state where you can actually test out those rules.
01:27 Like castling, for instance, right?
01:29 Where do you drag the king or the rook to get it to the right place where it'll actually castle
01:33 correctly?
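
Under the standard rules of chess, the castling question has a fixed answer: the king moves two files toward the rook, and the rook lands on the square the king crossed. Below is a minimal sketch of those drag targets in algebraic notation. The helper name `castling_squares` is hypothetical and is not taken from the app in the demo, and the sketch assumes the usual starting position and skips legality checks (unmoved pieces, empty squares in between, no check).

```python
def castling_squares(color: str, side: str) -> dict[str, tuple[str, str]]:
    """Return (from, to) squares for the king and rook in a castling move.

    Hypothetical illustration only: it covers standard chess and does not
    verify that castling is currently legal.
    """
    if color not in ("white", "black"):
        raise ValueError("color must be 'white' or 'black'")
    if side not in ("kingside", "queenside"):
        raise ValueError("side must be 'kingside' or 'queenside'")

    # White castles on rank 1, black on rank 8.
    rank = "1" if color == "white" else "8"

    if side == "kingside":
        king_to, rook_from, rook_to = "g", "h", "f"
    else:
        king_to, rook_from, rook_to = "c", "a", "d"

    return {
        "king": ("e" + rank, king_to + rank),
        "rook": (rook_from + rank, rook_to + rank),
    }
```

For example, `castling_squares("white", "kingside")` says to drag the king from e1 to g1, with the rook going from h1 to f1. A UI test would still need to confirm which of those drags the app actually accepts.
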
01:34 That was castling that just happened.
01:35 That was en passant.
01:36 What is it doing now?
01:38 So CUA is clicking through the game and moving the pawn on the other side.
01:42 It's actually playing the game.
01:43 We're building software for humans to use, and humans use software with user interfaces.
01:50 And so we want the model to be able to check its own work like a human would.
01:54 The second thing I want to talk about is website replication, and in particular, image gen and image search.
02:00 My partner, Nancy, has always wanted to start her own coffee shop.
02:04 She's not a coder, and so she gave me a design that she wanted for a website,
02:08 and we're going to use Codex and 5.4 Thinking to make that into a reality.
02:12 For this example, I'm using Codex, but it works just as well in ChatGPT.
02:15 The model is better able to understand the context of the design,
02:18 like what kind of images would actually be most appropriate, given the style,
02:23 and will prompt image gen to make images that are more in line and aesthetically cohesive.
02:28 So right now, it's calling the image gen tool, and it makes smart use of image gen as well,
02:32 because images take a while to generate.
02:34 So it's actually generating all four of these images concurrently, which is pretty neat.
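
The benefit of issuing slow image requests concurrently can be sketched with Python's `asyncio`. This is only an illustration of the general pattern, not the model's actual tool-calling mechanism: `generate_image` is a hypothetical stand-in that sleeps instead of calling a real image API, and the prompts are made up.

```python
import asyncio


async def generate_image(prompt: str) -> str:
    """Hypothetical stand-in for a slow image generation call."""
    await asyncio.sleep(0.1)  # simulate the time an image takes to generate
    return f"image for: {prompt}"


async def generate_all(prompts: list[str]) -> list[str]:
    # asyncio.gather schedules every coroutine at once and returns the
    # results in the same order as the inputs, so the total wait is about
    # the longest single request rather than the sum of all requests.
    return await asyncio.gather(*(generate_image(p) for p in prompts))
```

Calling `asyncio.run(generate_all(["hero", "menu", "interior", "logo"]))` would finish in roughly the time of one simulated request instead of four.
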
02:38 Now the model is able to check its own work using CUA.
02:41 What CUA did here is open up the image, inspect it, open up the website, also look at it,
02:46 compare them side by side, and make sure that the website created is as close as possible to the image
02:51 that I put in.
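
The talk does not say how the side-by-side comparison is scored; the model appears to judge it visually. As a rough illustration of what "as close as possible" could mean numerically, here is a minimal sketch that computes the mean absolute pixel difference between two equal-sized grayscale images. The function name and the nested-list image representation are assumptions made for this example.

```python
def mean_abs_diff(a: list[list[int]], b: list[list[int]]) -> float:
    """Mean absolute per-pixel difference between two grayscale images.

    Images are assumed to be equal-sized nested lists of 0-255 values.
    0.0 means identical; larger values mean the images differ more.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")

    count = sum(len(row) for row in a)
    if count == 0:
        raise ValueError("images must not be empty")

    total = sum(
        abs(pa - pb)
        for ra, rb in zip(a, b)
        for pa, pb in zip(ra, rb)
    )
    return total / count
```

A score like this only captures raw pixel distance. Layout shifts or different but equally suitable photos would score poorly, which is one reason a visual check by the model can be more forgiving than a pixel metric.
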
02:52 This update makes the work a lot cheaper, a lot more efficient,
02:57 and also ultimately helps you do better work.