  • 1 day ago

Category

🤖
Tech
Transcript
00:00Google's new Gemini AI can actually use the web like you do.
00:05Google's newest AI makes your browser feel a little haunted, in a good way.
00:09Google unveiled Gemini 2.5 Computer Use, a model that uses visual understanding and reasoning
00:15to operate a web browser like a person.
00:18Instead of relying on application programming interfaces or hidden shortcuts, it clicks
00:22buttons, fills out forms, and drags stuff around, all by looking at what's on screen.
00:28Ask it to fill out a form and it will find the right boxes and type in your info like
00:32a human.
00:33Google says it's handy for UI testing or for websites that don't offer direct access through
00:39an application programming interface.
00:42This agentic AI race is heating up.
00:45Just yesterday, OpenAI showed new ChatGPT apps and its upcoming ChatGPT agent, and Anthropic
00:51rolled out Computer Use for Claude AI last year.
00:54Google says its model outperforms leading alternatives on web and mobile benchmarks.
01:00The demos are sped up three times, so take that with a grain of algorithmic salt.
01:04Unlike OpenAI's approach, Gemini 2.5 Computer Use doesn't yet control your whole computer.
01:10It's limited to a browser sandbox for now, supporting 13 actions like typing, scrolling,
01:14and dragging items around.
01:16That's still enough to play 2048 or dig through Hacker News for spicy debates.
01:21Developers can try it through Google AI Studio or Vertex AI.
01:26There's a public demo on Browserbase where you can watch the AI in action.
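
The look-then-act loop described in the transcript (observe the screen, pick an action such as clicking or typing, repeat) can be illustrated with a small toy sketch. Everything here is an assumption made for illustration: FakePage, toy_model, run_agent, and the "click" and "type" action names are hypothetical stand-ins, not Google's API or the model's actual action set, and the "screenshot" is a plain dictionary rather than pixels.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakePage:
    """Hypothetical stand-in for a browser page: form boxes at screen coordinates."""
    boxes: dict                      # field name -> (x, y) position on screen
    values: dict = field(default_factory=dict)
    focused: Optional[str] = None

    def screenshot(self):
        # A real agent would receive pixels; this toy returns the visible state.
        return {"boxes": dict(self.boxes), "values": dict(self.values)}

    def click(self, x, y):
        # Focus whichever box sits at the clicked coordinates, if any.
        self.focused = next(
            (name for name, pos in self.boxes.items() if pos == (x, y)), None
        )

    def type(self, text):
        # Typing only has an effect when a box is focused.
        if self.focused is not None:
            self.values[self.focused] = text


def toy_model(goal, observation):
    """Hypothetical policy: propose actions for the first box not yet filled."""
    for name, text in goal.items():
        if observation["values"].get(name) != text:
            x, y = observation["boxes"][name]
            return [
                {"action": "click", "x": x, "y": y},
                {"action": "type", "text": text},
            ]
    return []  # nothing left to do


def run_agent(page, goal, max_steps=10):
    """Observe, ask the policy for actions, execute them, and repeat."""
    log = []
    for _ in range(max_steps):
        actions = toy_model(goal, page.screenshot())
        if not actions:
            break
        for a in actions:
            if a["action"] == "click":
                page.click(a["x"], a["y"])
            elif a["action"] == "type":
                page.type(a["text"])
            else:
                raise ValueError(f"unsupported action: {a['action']}")
            log.append(a["action"])
    return log


page = FakePage(boxes={"name": (10, 20), "email": (10, 60)})
goal = {"name": "Ada", "email": "ada@example.com"}
log = run_agent(page, goal)
```

In this sketch the executor accepts only a fixed set of actions and rejects anything else, which loosely mirrors the idea of a sandbox with a limited action list. A real setup would replace FakePage with an actual browser and toy_model with a call to the model.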