  • 2 days ago
Google is testing Remy, a new 24/7 Gemini agent that can connect to Gmail, Docs, Calendar, Drive, and Search, then handle real tasks in the background. At the same time, Gemini 3.2 Flash has leaked with stronger coding, 3D, animation, and design skills, Gemma 4 is getting faster, OpenAI just upgraded ChatGPT with GPT-5.5 Instant, and Anthropic’s Claude Orbit may bring proactive AI briefings across your work apps.
Transcript
00:02Google is testing Remy, a new 24/7 Gemini agent that looks like a direct
00:07response to OpenClaw, the viral AI that can act on your behalf. Remy can connect
00:13to Gmail, Docs, Calendar, Drive, and Search, and handle real tasks in the
00:18background. At the same time, a new Gemini 3.2 Flash model just surfaced with
00:24stronger coding, 3D simulations, animation, and design skills, and Google
00:30also dropped a major speed upgrade for Gemma 4 that can make AI responses up to
00:36three times faster. On the other side, OpenAI just rolled out GPT-5.5 Instant
00:42as the new default ChatGPT model, with fewer mistakes, better reasoning, and
00:47deeper personalization. And Anthropic is preparing Orbit, a proactive Claude
00:52assistant that can brief you across Gmail, Slack, GitHub, Figma, Calendar, and Drive
00:58before you even ask. So let's start with the biggest piece, because this one is
01:03different. Internally at Google, employees are testing a new AI agent called Remy.
01:08And this isn't just another feature inside Gemini. It's being described as a
01:1224/7 personal agent that can actually take actions on your behalf. That wording
01:18matters, because it moves the system away from something that responds to prompts
01:22and into something that actively does things for you. Remy is running inside a
01:27staff-only version of the Gemini app right now, and it's deeply integrated across
01:31Google's entire ecosystem. So we're talking Gmail, Docs, Calendar, Drive, Search, all of it.
01:39And the idea is simple on the surface, though the execution is where it gets
01:43interesting. Instead of asking the model to help you with tasks, Remy monitors what
01:47matters to you, handles complex workflows proactively, and learns your preferences
01:52over time. So instead of opening your email, sorting messages, replying, scheduling
01:57something, then jumping into Docs to write something, and then maybe doing research
02:02and search, the agent handles that flow in the background. It acts more like a
02:06digital executive assistant than a chatbot. The internal description literally says
02:11it elevates the Gemini app into a true assistant that can take actions on your behalf, not
02:17just answer questions or generate content. That's a pretty clear shift in positioning.
02:22And Google employees are already testing it internally, which is what they call a dogfooding
02:28phase. That's standard in tech, where internal teams use the product before it ever reaches
02:33the public. Right now, there's no confirmed release timeline, which usually means they're
02:38still refining behavior and reliability, especially for something this autonomous.
02:43What's interesting is how far this goes compared to what's already out there.
02:48Google already rolled out things like Agent Mode inside Gemini, where the system can handle
02:53multi-step tasks, though access depends on your subscription tier and region.
02:58Remy goes further. It's designed to operate continuously, not just when you ask it to do something.
03:04And that puts it directly in competition with tools like OpenClaw, which went
03:08viral earlier this year. OpenClaw gained attention because it could actually perform tasks like
03:14responding to messages or conducting research autonomously, not just assist with them. And
03:20it made enough noise that OpenAI ended up hiring its creator back in February.
03:24Remy clearly follows that direction, though Google has one major advantage here, integration.
03:30Because they control the entire ecosystem, they can plug this agent into everything from your
03:35calendar events to your documents to your inbox. That gives them a real edge when it comes to
03:40everyday productivity. There are also smaller details that hint at how Google sees this system.
03:46The name Remy itself might come from the Latin Remigius, meaning oarsman or rower, which kind of fits the
03:52idea of something doing the work for you in the background. It could also be a reference to the
03:57rat chef from Ratatouille, which again fits the concept of a hidden assistant running things
04:02behind the scenes. And timing-wise, this is all lining up with Google I/O 2026, which is happening
04:08between May 19th and May 29th at the Shoreline Amphitheater in Mountain View. That event is expected
04:14to focus heavily on AI breakthroughs, especially around Gemini and Android. If Remy is anywhere close
04:20to ready, that's where it would show up. And speaking of AI doing real work, today's video is sponsored
04:26by Higgsfield. And this actually fits perfectly with the whole agentic AI direction we're talking
04:32about. Higgsfield just launched Marketing Studio, a new end-to-end AI ad workflow powered by Seedance
04:382.0. And the idea is pretty simple. You bring in a product image or paste a product link, choose
04:44your own avatar or generate one inside the platform, and Marketing Studio turns that into ready-to-use video ads.
04:50The crazy part is that it doesn't just make one random clip. It can generate ads across nine different
04:55formats, including UGC, tutorials, unboxing videos, product reviews, TV-style spots, hyper-motion ads,
05:04virtual try-ons, and more. So instead of hiring a creator, writing a script, booking production,
05:10editing footage, and testing only a few angles, you can build a full ad campaign from one workflow.
05:16Hello. Look at our new Science Revolution YouTube channel. That's why Higgsfield describes it as
05:21replacing an entire $50,000 production pipeline in a single session. One prompt, your entire campaign.
05:28For e-commerce, dropshipping, Shopify, Amazon FBA, TikTok Shop, SaaS launches, courses, or even AI
05:36marketing agencies, this is really useful because you can test way more creatives without spending weeks
05:42or thousands of dollars on production.
05:44What science YouTube channel would you actually recommend?
05:47Science Revolution. Physics, biology, chemistry, materials, space. No hype. Just real science.
05:54Explained so anyone can actually understand it. Paste a product link, generate multiple ad angles,
06:00see what works, and move faster. So if you want to try Higgsfield Marketing Studio,
06:04check the link in the description and pinned comment. Alright, back to the video.
06:08Now, at the same time as this agent work, something else leaked out, and it gives a pretty clear look
06:14at what's happening on the model side. Gemini 3.2 Flash showed up on the Eleuther AI Arena,
06:19which is basically an external testing platform where models get evaluated under real-world conditions.
06:25That's important because it means Google isn't just testing internally,
06:28they're putting the model in environments where it can be compared directly against competitors.
06:33And this version of Gemini looks like a significant upgrade over the current Gemini 3 Flash that's
06:39available in AI Studio. The improvements are pretty technical, though they translate directly
06:44into practical capabilities. The model shows stronger performance in SVG generation,
06:50which means it can create detailed vector graphics with high precision.
06:54It also has improved coding abilities, including the ability to generate complex code for interactive 3D environments,
07:00things like voxel-based simulations and dynamic systems.
07:04Then there's animation processing, which has been upgraded to handle smoother transitions and more dynamic outputs.
07:10That matters for anything involving video, interactive content, or even UI design.
07:15And the responsiveness of the model in interactive scenarios has improved as well,
07:20so it can handle tasks that require real-time feedback more effectively.
07:24The reason Google is using platforms like Eleuther AI Arena is to stress-test the model.
07:29These environments expose weaknesses faster, especially when the model is pushed across different types of tasks.
07:36It also allows Google to benchmark directly against other systems in a more transparent way.
07:41From what's been seen so far, Gemini 3.2 Flash isn't just a small iteration.
07:46It looks like a more capable system that's being prepared for broader deployment,
07:50possibly tied into upcoming announcements.
07:52Then there's another piece that doesn't get as much attention,
07:55though it's actually one of the most important upgrades happening under the hood.
07:59Google released something called Multi-Token Prediction, or MTP drafters, for the Gemma 4 model family.
08:06And this directly targets one of the biggest bottlenecks in large language models, which is inference speed.
08:12Right now, most models generate text one token at a time.
08:15That means for every word or fragment of a word, the system has to load massive amounts of data from
08:21memory into compute units.
08:22This process is memory bandwidth limited, not compute limited,
08:27which means the system spends more time moving data around than actually doing calculations.
08:32That's why even powerful models can feel slow in real-world usage.
08:37MTP changes that by using a speculative decoding approach.
08:41Instead of generating one token at a time, a smaller, faster model called the drafter predicts multiple tokens ahead.
08:48Then a larger, more accurate model verifies those tokens in a single pass.
08:54So in practice, the drafter might generate a sequence of tokens very quickly,
08:58and the main model checks them all at once.
09:01If they're correct, the system accepts the entire sequence,
09:04and even generates one additional token in the same step.
09:07That means you're effectively getting multiple tokens generated in the time it would normally take to produce one.
09:13And because the final verification still comes from the main model,
09:17there's no loss in quality or accuracy.
09:19It's a lossless speed improvement.
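The drafter-plus-verifier loop described above can be sketched in a few lines of plain Python. This is a toy illustration of speculative decoding, not Google's MTP code: the two "models" are stand-in functions over a tiny character vocabulary, and the drafter's 80% agreement rate is an invented parameter. The point is the acceptance rule, which guarantees the output matches what the big model alone would produce.

```python
import random

# Toy speculative-decoding sketch (illustrative only, not Google's MTP release).
# A cheap "drafter" proposes K tokens ahead; an expensive "verifier" checks
# them in one pass, accepts the longest correct prefix, then emits one token
# of its own -- so every step yields at least one token, losslessly.

VOCAB = list("abcdefgh")
K = 4  # tokens drafted per verifier pass

def verifier_next(context):
    # Stand-in for the large model: deterministic next-token rule.
    return VOCAB[(sum(map(ord, context)) + len(context)) % len(VOCAB)]

def drafter_next(context, rng):
    # Stand-in for the small model: agrees with the verifier ~80% of the time.
    if rng.random() < 0.8:
        return verifier_next(context)
    return rng.choice(VOCAB)

def speculative_decode(prompt, n_tokens, seed=0):
    rng = random.Random(seed)
    out = prompt
    while len(out) - len(prompt) < n_tokens:
        # 1. Drafter cheaply proposes K tokens in sequence.
        draft, ctx = [], out
        for _ in range(K):
            t = drafter_next(ctx, rng)
            draft.append(t)
            ctx += t
        # 2. Verifier checks all drafted positions; accept the longest
        #    prefix where the draft matches its own choice.
        accepted, ctx = [], out
        for t in draft:
            if verifier_next(ctx) == t:
                accepted.append(t)
                ctx += t
            else:
                break
        # 3. Verifier always contributes one more token (the correction,
        #    or a bonus token if the whole draft was accepted).
        accepted.append(verifier_next(ctx))
        out += "".join(accepted)
    return out[:len(prompt) + n_tokens]

def greedy_decode(prompt, n_tokens):
    # Baseline: the verifier generating one token at a time.
    out = prompt
    for _ in range(n_tokens):
        out += verifier_next(out)
    return out

# Lossless: identical output to verifier-only greedy decoding.
print(speculative_decode("seed", 12) == greedy_decode("seed", 12))  # True
```

Because a token is only accepted when the verifier itself would have chosen it, the final text is identical to verifier-only decoding; the speedup comes purely from checking several drafted positions per expensive pass instead of one.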
09:22Google claims this can deliver up to three times faster inference speeds,
09:26which is a massive gain, especially for production systems.
09:30There are also some deeper optimizations here.
09:32The drafter models share the same KV cache as the main model,
09:35which means they don't need to recompute attention states.
09:39That saves time and reduces redundant processing.
09:42For edge devices, like mobile hardware,
09:44Google added clustering techniques in the embedder layer
09:47to speed up the final step where the model converts internal representations
09:50into actual word probabilities.
09:53That's one of the slowest parts of the process on limited hardware,
09:56so optimizing it makes a big difference.
09:59Even hardware-specific improvements show up here.
10:01For example, on Apple Silicon, increasing batch sizes can unlock up to around 2.2 times speed improvements,
10:08and similar gains are seen on NVIDIA A100 GPUs.
10:12So this isn't just about faster text generation.
10:15It's about making these models usable at scale, across different types of devices.
10:20And while Google is pushing all of this forward,
10:23OpenAI is making a different kind of move,
10:26one that focuses more on the user experience.
10:29They just rolled out GPT-5.5 Instant as the new default model in ChatGPT,
10:35replacing GPT-5.3 Instant.
10:38And this matters because it affects the highest volume model,
10:41the one used by hundreds of millions of people for everyday tasks.
10:45The focus here is clarity, speed, and accuracy.
10:49GPT-5.5 Instant produces 52.5% fewer hallucinated claims compared to the previous version,
10:56and it reduces inaccurate claims by 37.3% on difficult conversations.
11:02That's a big improvement, especially in areas like medicine, law, and finance,
11:07where accuracy matters more than anything else.
11:10The model also improves performance in visual reasoning, math, science, coding, and image analysis.
11:17So it's not just faster, it's more reliable across a wide range of tasks.
11:22And then there's personalization, which is becoming a major focus.
11:26GPT-5.5 Instant can use context from past chats,
11:30uploaded files, and connected Gmail accounts to deliver more tailored responses.
11:34It also introduces memory transparency,
11:37where users can see which past interactions influenced a response and manage that data.
11:42Now there's one more piece, and it's coming from Anthropic.
11:45They're working on Orbit, and it is still unreleased,
11:53though it has started showing up inside newer Claude web and mobile builds.
11:53For now, it appears mostly as a settings toggle,
11:56which usually means the feature is being staged before launch.
11:59Orbit is a proactive briefing tool for Claude Cowork and Claude Code.
12:03Instead of waiting for you to ask what's going on,
12:06it prepares useful updates for you automatically.
12:09And the connectors are the important part here.
12:12Orbit is expected to pull from Gmail, Slack, GitHub, Calendar, Drive, and Figma.
12:18So it's not just an email summary tool.
12:21It's built around the daily workflow of people who write code,
12:24manage projects, design products, and work across teams.
12:28That changes the use case.
12:30With Orbit, Claude Code could brief you
12:32on what changed in a GitHub repo,
12:34what people discussed in Slack,
12:36which design updates happened in Figma,
12:38what meetings are coming up,
12:39and which emails actually matter.
12:41All of that can be turned into a short, personalized briefing
12:44based on your time zone and connected apps.
12:47That makes it different from a normal chatbot.
12:50You don't open it just to ask a question.
12:52It's more like a work radar running in the background.
12:55And the timing is interesting, too.
12:57Anthropic's Code with Claude conference
12:59starts in San Francisco on May 6th,
13:01with London on May 19th and Tokyo on June 10th.
13:04So Orbit could either get a quiet rollout
13:06or a formal reveal around that event.
13:09That's the direction everything is moving in right now.
13:12Also, if you want more content around science,
13:15space, and advanced tech,
13:17we've launched a separate channel for that.
13:19Link's in the description.
13:20Go check it out.
13:21Anyway, that's it for this one.
13:23Let me know what you think about Remy
13:24and where this is all heading.
13:25Thanks for watching,
13:26and I'll catch you in the next one.