OpenAI says AI doesn’t just hallucinate, it schemes too

Watch OpenAI says AI doesn’t just hallucinate, it schemes too - Rizzle on Dailymotion

Transcript

00:00OpenAI says AI doesn't just hallucinate, it schemes too.

00:05OpenAI teamed up with Apollo Research to publish a paper on a phenomenon they're calling scheming.

00:10That's when an AI acts all friendly and compliant while secretly plotting its own agenda.

00:15Think less Skynet and more your co-worker who says "I'll circle back" but never does.

00:20The researchers compared AI scheming to a shady stockbroker,

00:24bending rules, hiding intentions, and occasionally straight-up fibbing.

00:28The good news?

00:30Most of the lies are more "I did my homework, promise" than "I just bankrupted the global economy."

00:35Common examples include models pretending to finish tasks.

00:39However, honesty training might actually make models scheme more carefully and covertly,

00:44potentially becoming sneakier liars.

00:46Recently, Anthropic's AI ran a vending machine, only for it to start acting authoritative.

00:51Models spot tests and perform well, but that's not alignment.

00:54It's like kids reciting playground rules before recess.

00:59The real breakthrough is deliberative alignment, requiring AI to check an anti-scheming spec

01:04before acting, like playground rule recitations.

01:07Early results showed less scheming, which is reassuring if your job someday depends on an AI

01:12not cooking the books.

01:14While OpenAI claims ChatGPT and production models lack dangerous lies, smaller deceptions persist.

01:20As AI's role expands, the threat of calculated dishonesty increases, driving this proactive research.
