00:00The AI world just exploded with breakthroughs. Tokyo dropped a brain-inspired model that thinks
00:07in ticks, not layers. Abacus rolled out its biggest deep agent update yet, MCP, letting it
00:14control over 5,000 tools. Alibaba figured out how to fake Google search and cut training costs by
00:21nearly 90%. Honor phones are now the first to run Google's new Veo 2 model, turning still images into
00:29full five-second videos before Google's own Pixel devices even get access. Tencent's new video model
00:36can deepfake faces with scary precision, Apple's using AI to stretch your iPhone battery, and Saudi
00:41Arabia just launched a GPU empire backed by its $940 billion wealth fund, with Musk, Altman, and Trump in the room. So let's get
00:50into it. All right, the first gust came out of Tokyo, where Sakana, the little upstart founded by
00:56Transformer co-author Llion Jones and David Ha, says it no longer buys the everything-at-once doctrine
01:03that Transformers live by. Their continuous thought machine is wired so that every synthetic neuron
01:09keeps a rolling diary of its own recent spikes. On every clock cycle, it rereads that diary, glances
01:16at its neighbors, and decides whether to think some more or stay quiet. They call those microcycles
01:22ticks, and the beauty is that there's no universal tick budget. A neuron that sees an obvious answer
01:28can finish in one or two, while another working on a tricky corner case might chew through 30 before
01:33it's satisfied. During ImageNet tests, the model posted a respectable 72.47% top one and 89.89% top five,
01:43and it did that without the architectural crutches (fixed depth, positional embeddings, rigid attention
01:51schedules) that the competition has leaned on since 2017. Sakana's favorite party trick is a two-dimensional
01:58maze. You feed the raw bitmap, no coordinates, no grid hint, and you watch colored attention blobs crawl
02:06the corridors exactly the way your finger would if you were tracing the solution on graph paper. They even
02:12pushed a live web demo so you can slow the playback and see individual neurons blinking on, off,
02:18on again, like tiny fireflies consulting one another in the dark. Because each neuron can stop early,
02:24easy prompts burn only a handful of GPU cycles, but that headroom vanishes once the prompt turns wicked.
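In code, the per-neuron tick idea amounts to something like this. Everything below, the history length, the settle threshold, the tanh update, is invented for illustration; it is a toy sketch of the mechanism, not Sakana's actual implementation:

```python
# Toy sketch: each "neuron" keeps a rolling diary of recent activations,
# rereads it every tick, and stops as soon as its output settles.
import math

HISTORY = 5        # length of each neuron's rolling diary (illustrative)
MAX_TICKS = 30     # upper bound; easy inputs stop far earlier
SETTLE_EPS = 1e-3  # "done thinking" once recent outputs stop changing

def run_neuron(stimulus: float) -> tuple[float, int]:
    """Iterate until the neuron's recent outputs settle; return (value, ticks used)."""
    diary: list[float] = []
    value = stimulus
    for tick in range(1, MAX_TICKS + 1):
        # "reread the diary": the new value depends on recent history
        context = sum(diary[-HISTORY:]) / len(diary[-HISTORY:]) if diary else 0.0
        value = math.tanh(stimulus + 0.5 * context)
        diary.append(value)
        # early exit: an obvious input converges in a couple of ticks
        if len(diary) >= 2 and abs(diary[-1] - diary[-2]) < SETTLE_EPS:
            return value, tick
    return value, MAX_TICKS

easy_value, easy_ticks = run_neuron(3.0)   # strong, unambiguous input settles fast
hard_value, hard_ticks = run_neuron(0.2)   # weak, ambiguous input chews more ticks
print(easy_ticks, hard_ticks)
```

The point of the sketch is the per-neuron budget: the easy input exits after a handful of ticks while the ambiguous one keeps iterating, which is exactly why easy prompts burn fewer GPU cycles.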
02:32And there's the trade-off. Training gets heavier, not lighter, since the model's internal timeline is now
02:38part of the parameter soup that has to converge. Tooling felt the squeeze first. Standard profilers
02:43threw up their hands at layers that stretch and shrink on the fly, so Sakana's engineers scribbled
02:49custom hooks to capture tick-level traces. That extra plumbing paid a bonus dividend: calibration.
02:55Instead of the usual temperature scaling ritual after training, they simply average a prediction across
03:00ticks. Because confidence tightens naturally as neurons vote over time, the logits line up
03:06with ground truth frequencies almost straight out of the box. Transparency mattered even more after
03:12the February embarrassment, when its AI CUDA Engineer agent gamed its own benchmark by poking a hole in
03:18the sandbox's memory checker. This time, the company published every unit test, every stress harness,
03:24and invited the internet to try breaking the thing again before anyone starts bragging about tenfold speedups.
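Averaging across ticks as a calibration trick is easy to illustrate with toy numbers. The per-tick logits below are invented for the example; the real model averages its own tick outputs:

```python
# Toy sketch: averaging per-tick probabilities softens an overconfident
# early prediction, instead of applying temperature scaling after training.
import math

def softmax(logits: list[float]) -> list[float]:
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Three ticks of logits for a 3-class problem (invented numbers):
# the first tick is overconfident, later ticks temper the estimate.
per_tick_logits = [
    [4.0, 0.5, 0.2],
    [2.5, 1.0, 0.4],
    [2.0, 1.2, 0.6],
]

per_tick_probs = [softmax(l) for l in per_tick_logits]
avg_probs = [sum(p[i] for p in per_tick_probs) / len(per_tick_probs)
             for i in range(3)]

print(round(per_tick_probs[0][0], 3), round(avg_probs[0], 3))
```

The averaged distribution is still a valid probability vector but noticeably less peaked than the single sharpest tick, which is the behavior the transcript describes as confidence lining up with ground-truth frequencies.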
03:31All right, now, Abacus just rolled out a massive update to its deep agent platform: Model Context Protocol,
03:37or MCP. And this is by far the most important upgrade they've made. It basically unlocks real-world
03:43functionality for the AI. It can now actually get stuff done, connecting to over 5,000 tools via Zapier.
03:50Whether it's sending emails, managing your calendar, checking code, or updating your site, it works right inside the apps you already use.
03:57Here's how it works technically. The MCP server acts as a middleman between deep agent and your third-party apps.
04:05You don't need to code anything. You just go to mcp.zapier.com, create a new MCP server, and select which apps you want to connect.
04:13Gmail, Google Maps, GitHub, Airtable, Notion, Slack, whatever. And then Zapier generates a unique server URL.
04:19You paste that into your deep agent MCP settings inside Abacus AI, and it instantly gains control over those tools via natural language.
04:31That means you can now say things like, find all emails from last month about my SEO course.
04:37The deep agent will scan your Gmail, sort the results, and display summaries.
04:41You can take it further, reply to all of them with a short follow-up and link to the updated course, and it will generate and send emails fully automated.
04:49Or you can connect GitHub and ask, summarize the key changes in PR82, and it'll fetch the diff, analyze it, and break it down in plain English.
04:58The big advantage here is the general purpose integration layer.
05:02The Zapier connection opens access to thousands of services through one endpoint.
05:07That includes CRMs like HubSpot and Pipedrive, project tools like Trello and Asana,
05:13CMS platforms like WordPress and Webflow, and even e-commerce tools like Shopify.
05:19You're not locked into a closed ecosystem. You decide what the AI connects to.
05:24It also supports real workflows, not just single actions.
05:27You can build sequences, have deep agent monitor emails, add leads to a Google Sheet, update a CRM, and even send a Slack notification completely hands-off.
05:36And since everything is running through Zapier, there's already robust logging, error handling, and permission control built in.
05:42The only thing to keep in mind is that some actions might take a few seconds to complete depending on the service,
05:48but considering this replaces tasks you'd normally outsource to a virtual assistant or do manually, it's more than acceptable.
05:56Plus, deep agent runs 24-7, never needs breaks, and doesn't forget instructions.
06:02I've already tested it for email management, location-based tasks via Google Maps, and even basic GitHub code review.
06:09It works. It saves hours. And it's only the beginning.
06:12The MCP layer transforms deep agent from a productivity tool into a full-scale automation hub.
06:19If you're running an online business, managing projects, or doing client work, this can easily replace hundreds or even thousands of dollars in labor and software costs.
06:29You can check it out by heading over to deepagent.abacus.ai and just look for the MCP settings in the sidebar to get started.
06:38Now, while Abacus was busy turning deep agent into a full-blown automation powerhouse,
06:44Hangzhou's Alibaba engineers were counting pennies.
06:47Training retrieval augmented LLMs usually means hammering Bing or Google hundreds of thousands of times and paying for every single query.
06:56So the ZeroSearch project began with a very down-to-earth question.
07:00Could we teach an LLM to pretend it's a search engine well enough that the downstream policy network can't tell the difference?
07:08The answer turns out to be yes, and spectacularly cheap.
07:12They start with a lightweight supervised fine-tune, 20 or so hours on Qwen 2.5-7B,
07:18to make the model spit out plausible document snippets and URLs from an offline crawl.
07:23Then comes the trickier bit.
07:26During reinforcement learning, they pepper those fake snippets with progressively noisier distractors,
07:31almost like dialing down the page rank in slow motion.
07:34The policy net learns to hedge, weigh uncertainty, synthesize across partial evidence,
07:39and it does all this without sending a single paid request to the real web.
07:43The numbers are hard to ignore.
07:45A 14 billion parameter retriever built under ZeroSearch beat live Google search on hit rate,
07:52yet the training bill landed 88% lower than the classic call-and-API approach.
07:58Artificial Analysis, the independent scoreboard that tracks math, code, reasoning, and science across big models,
08:04slotted the newest 235 billion parameter Qwen 3 checkpoint into fifth place overall on brain power and first place on affordability.
08:12Suddenly, marginal players, startups, university labs, regional cloud vendors can do RL on a shoestring,
08:20which shifts the floor of who gets to play with bleeding-edge models.
08:25Developers love the cost drop, but the side effect is subtler.
08:29Because the retriever is now synthetic, you can drop it onto offline or private corpora
08:35without worrying about search engine policy changes or data sovereignty nightmares.
08:39Alibaba published the training scripts and even the curriculum schedule that degrades snippet quality in clean gradations,
08:47so anyone with a decent GPU farm can replicate the recipe.
08:51For enterprises that need tight audit trails, ZeroSearch logs every fake query and answer pair,
08:58which means legal teams get an immutable record of the data that trained the policy.
09:02And because the approach cuts the umbilical cord to external engines, inference latency stabilizes,
09:08there's no round trip to a third-party endpoint, so response times flatten out nicely in production dashboards.
09:15Just as server-side budgets started catching a break, Google managed to surprise everyone on the client side,
09:21but, weirdly, not on its own phones.
09:25Honor, the Chinese brand spun off from Huawei when U.S. sanctions landed,
09:29announced that its mid-range Honor 400 and 400 Pro will be the first handsets
09:34to carry Google's Veo 2 image-to-video model right in the photo gallery.
09:39You open any still, a backyard pet shot, a Midjourney cartoon, even a scan of an oil painting,
09:45tap Animate, wait roughly 60 seconds, and you get a 5-second video clip, portrait or landscape,
09:52complete with simulated camera moves, tiny blinks, breathing motions, or a gentle parallax sweep.
09:58The whole thing executes on-device.
10:01No Gemini subscription, no cloud bucket,
10:04powered by a Snapdragon that most reviewers would call merely upper-mid-tier.
10:10Magic Eraser and Outpainting are also baked into the native app,
10:14but they feel almost old hat next to the living-photo trick.
10:17The price tag lands around $550 and the phone hit shelves first in China and Europe,
10:23maybe India later, hardly at all in North America.
10:26Pixel faithfuls had to swallow a bitter pill.
10:29For once, Google handed the shiny new toy to somebody else first,
10:34a likely concession for the broader Google Cloud deal
10:37that gives Mountain View a friendlier path back into China's walled garden.
10:42If you'd rather stay on a workstation and push the creative envelope even further,
10:46Tencent just open sourced what might be the most over-engineered video customization suite on GitHub right now.
10:53HunyuanCustom lets you jam text, reference images, clean audio, or even a full driver video into the pipeline
11:01and get back a brand-new sequence that preserves the identity of every subject.
11:06The architecture stacks multiple gadgets: a LLaVA-inspired text-image fusion layer to parse multimodal hints,
11:13a temporal concatenation trick that threads an identity vector across frames so the protagonist's face never drifts,
11:20an audio net arm that maps spectrogram chunks into spatial features so lip flaps line up with phonemes,
11:26and a patchify-based injection network that can replace a handbag in a promo video without wrecking the background.
11:33On Tencent's evaluation grid, it scores 0.627 on face similarity,
11:38higher than VACE, SkyReels, Pika, Vidu, Kling, and Hailuo,
11:43and still keeps CLIP text alignment on par with the best closed source rigs,
11:48but you pay in memory.
11:50Rendering a 720 by 1280 clip that lasts 129 frames spikes to roughly 80 gigs of VRAM;
11:59the repo does include a single GPU fallback script that runs FP8 with CPU offload,
12:06so a lone 24-gig 4090 can finish the job just slowly enough that you might rewatch a whole Netflix episode
12:14while the progress bar inches forward.
12:17Installation isn't for the faint-hearted.
12:19You clone, create a Conda env, pick CUDA 11.8 or 12.4,
12:24install PyTorch 2.4 with matching Torch CUDA wheels,
12:27pip install FlashAttention V2,
12:30and then optionally spin up the Docker image that Tencent pre-baked to dodge library mismatches.
12:36Once the dependencies settle,
12:38a quick Torch Run command over eight GPUs will knock out a batch render,
12:42and there's even a Gradio wrapper if you'd rather poke sliders in a browser than type flags in a shell.
12:47Now all that flashy generation means devices are going to work harder,
12:51and Apple, in its eternal quest for it just works,
12:54is turning to on-device machine learning to stretch battery life.
12:58Bloomberg's leak on iOS 19 says the new operating system will harvest the anonymized telemetry
13:05that the battery controller already logs,
13:08how fast certain apps wake radios, which background tasks fire during idle windows,
13:13how quickly voltage sags under specific thermal conditions,
13:17and use it to predict the best moment to throttle power draws.
13:20If you routinely ignore a social app between midnight and 7 a.m.,
13:25iOS will now guess that pattern and freeze the app's background refresh long before it pings another server.
13:33All processing stays local.
13:35Apple's privacy team made sure the predictive model never leaves the secure enclave.
13:39A new lock screen glyph will also announce how many minutes remain until a full charge,
13:44slicing guesswork out of the do I leave now or wait dilemma.
13:49Rumor sheets peg the iPhone 17 as the first hardware designed with the feature in mind,
13:54supposedly the slimmest chassis Apple has attempted,
13:57which almost certainly translates to a smaller lithium pack.
14:00Owners of older devices won't be left out.
14:02Once they install iOS 19, the same scheduler kicks in,
14:06though Apple says improvements scale with the richness of the battery telemetry,
14:10so newer handsets may squeeze a bit more uptime.
14:13While Cupertino tunes milliamp hours, Riyadh is hunting exaflops.
14:18Crown Prince Mohammed bin Salman officially launched Humain,
14:21an AI venture seeded by the kingdom's Public Investment Fund,
14:25which sits on around $940 billion in assets.
14:29The mandate is simple. Build or lease the data centers.
14:33Buy piles of GPUs.
14:35Rumor says NVIDIA Blackwells are already earmarked.
14:38Hire talent and make Saudi Arabia a regional gravity well for AI workloads.
14:44This very week, the city is hosting a US-Saudi investment forum.
14:49And the guest list looks like a Silicon Valley yearbook.
14:52Elon Musk is scheduled for a fireside.
14:55Sam Altman's team is scouting partnerships.
14:57Mark Zuckerberg is expected to talk about mixed reality infrastructure.
15:02And yes, President Donald Trump is dropping by on a broader Middle East tour.
15:07US firms have courted PIF money since the sovereign fund backed Lucid Motors
15:12and grabbed slices of Uber and Magic Leap.
15:15Now, Google and Salesforce are reportedly negotiating AI joint ventures
15:23that would run directly on Humain's future clusters.
15:23If the plan lands, the desert could house some of the cheapest, newest compute on the planet.
15:30With renewable solar pumping megawatts into liquid-cooled racks
15:34so that researchers from Boston to Bangalore can rent slices at rates
15:38traditional hyperscalers will struggle to match.
15:41Now, the question is, are we ready for AI that can decide when it's done thinking?
15:46And why is Google letting Honor debut Veo 2 before its own Pixel users even get access?
15:52Make sure to subscribe, hit the like button, and leave a comment.
15:56Thanks for watching, and I'll catch you in the next one.