Daniel Kokotajlo is the founder of the AI Futures Project and a former OpenAI researcher who worked on forecasting, AI governance, and safety. In this full interview with Business Insider, he explains what AGI and superintelligence mean, why AI agents could be the turning point, and what could happen if the AI race continues without strong safeguards. He also lays out what he thinks governments and companies can do now to reduce the risk of losing control.

For more: https://www.aifutures.org/

Transcript
00:00People have a really strong aversion to taking seriously anything that sounds
00:03like science fiction and this is part of why people have been so wrong in the
00:10last decade about AI progress. Let's talk about the big picture here. Humans
00:15will no longer be in charge of the planet or at least not by default. It's
00:18sort of like building a new competitor species to humanity that is in fact
00:23superior in the relevant economic and military ways. The scenario is like my
00:28best guess as to how things are going to go and it does in fact end quite poorly
00:32for humanity. I hope I'm wrong about all this stuff.
00:37I'm just looking at the lens and I'm going to shoot slow motion.
00:40Okay slow motion? Yeah. Alright.
00:44My name is Daniel Kokotajlo. I run the AI Futures Project, which is a small
00:48research nonprofit that focuses on predicting the future of AI. We look
00:51at how fast companies have been scaling up their compute, their data centers, their
00:57GPUs, etc., and we project those trends out into the future. For two years, 2022 to 2024, I worked
01:04at OpenAI, also doing forecasting research.
01:10AI 2027 is a concrete scenario depicting roughly what I expect the future to look
01:16like. The future is very difficult to predict, especially the future of AI.
01:20AIs are getting smarter in a bunch of different ways over the last decade. It seems likely that they're going to
01:26continue to get smarter in a bunch of the same ways, and also that they could become AGI,
01:31artificial general intelligence, or even super intelligence, which means an AI
01:35system that's better than the best humans at everything, while also being faster
01:39and cheaper. If this were to happen, it would be huge. It would be the biggest thing
01:45that's ever happened, perhaps, and it would have all sorts of profound
01:49implications, all sorts of crazy risks that we would have to deal with.
01:53AI 2027 has two endings, the race ending and the slowdown ending. In the race ending,
01:59they continue racing, they don't make any of these trade-offs described, and they end up with AI
02:06systems that are broadly super intelligent, but which are not actually aligned, not actually loyal,
02:12controlled, etc. And that is the sort of nightmare outcome that many people have been warning about
02:18for more than a decade now. In the slowdown ending, they do basically unplug their most advanced AIs,
02:26and then they rebuild using a weaker but safer AI design. And then that's how they're able to solve
02:33the technical problems and build an amazing utopia for humans. If you hear people talk about how AI could
02:41lead to the extinction of the human race, it's basically the sort of classic story of the successor
02:47species displaces us, and it's not loyal to us. It doesn't actually care about us that much, and so
02:55it does to us what we do to the rainforest, right? That's a very serious risk that I think is on the
03:00horizon that we're headed towards. And then once you have that sort of very robust, self-sufficient,
03:05extremely advanced, superintelligence-designed robot economy, now you're in the situation where
03:11you've basically created a successor species that is capable of out-competing humanity. It's fully
03:18autonomous, it's fully self-sufficient, it doesn't need humans anymore for anything. And they're also
03:22being integrated into the military, they're designing all sorts of wonderful new weapons and drones,
03:27they're being put in charge of command and control networks, human soldiers are being told to listen to
03:31their orders in case of possible future war because their orders are going to be better than
03:35the orders from human generals and so forth. So put yourself in the shoes of someone there,
03:41and then think, okay, how do we unplug all this? Then there's this period of aggressive deployment
03:47where, you know, AI is being integrated into everything. But it's not AI like it is today,
03:52where it's this sort of weak and fallible chatbot. We're talking about the army of superintelligences being
03:58integrated into everything, which means it's better at doing the integration than any human would be.
04:03And it's, you know, in some sense, basically calling all of the shots and integrating itself
04:08into things. And I think that a lot needs to change if we are to avoid this. I'm saying that this is
04:16like the default outcome if we stay the course. I'm not saying this is some small possibility that we
04:20should nonetheless worry about; I'm saying, actually, this is the default outcome.
04:24And if we want to make it less than a 1% chance that something like this happens, we would need
04:29a quite significant change in course from our current course. I think that it's important for
04:35the world to basically wake up and realize what's happening before it's too late,
04:41because the current trajectory is unacceptable.
04:47I think a lot of people, perhaps taking cues from science fiction, tend to think that
04:51the integration of AIs into the economy will be this relatively gradual and continuous thing where,
04:57you know, first the AIs get good at what you might call low-skill tasks, and then they get good at
05:03medium-skill tasks. And they're sort of gradually automating the economy sector by sector,
05:08until eventually at long last, they're able to automate everything, and the whole economy is
05:13automated. That's not how I think it's going to go, because of the way in which AI research will be
05:19one of the first things to be automated. And once it's automated, you can get to super intelligence
05:23relatively quickly. Rather than sort of gradually building and training AI systems to automate this
05:30profession or this profession or this aspect of this profession, it'll sort of come smashing through
05:36all at once, where the economy will look mostly like it does today until someone has the army
05:43of superintelligences on their data centers. And then, for the army of superintelligences on the data
05:48centers, the world will be their oyster. They'll be able to automate any section of the economy that
05:52they want to relatively rapidly. I think there'll be basically two phases. Phase one will look like
05:57what we see today, where human beings in corporations and research programs are going to be designing and
06:04training specific AI systems to do particular tasks, such as diagnose the disease based on
06:13these images from the scans. They're similar to today's AIs. They know a lot of stuff, but don't
06:19know everything and are limited in various ways. Then in phase two, you have the army of superintelligences
06:26that are better than the best humans at everything. They learn faster, they are better at making plans,
06:32better at running businesses, better at politics, way better at research. And in that phase,
06:39I think that the automation of the economy will go much faster. It will not feel like tools,
06:45it will feel more like a new species coming in that's just better than us at everything and is
06:51like ordering us around and telling us how to do things. And we're doing it because it's working,
06:56because the orders are in fact causing things to go much faster and be much better. The human
07:01governments are collaborating with it and allowing it to take control of factories and take control of
07:06all sorts of things because it's just so much better at doing everything than humans are. Your industry
07:12probably will not be disrupted by AI in phase one. And then in phase two, your industry will just
07:21become irrelevant, and AI will go do its own thing much better than you. And you will probably get a new job
07:29providing training data or something like that, or carrying out experiments under the direction of the
07:34superintelligences. But you won't be running the show anymore, and the show will look quite different from how it currently looks.
07:44AI is a very, very broad term. It used to describe basically any sort of artificial cognitive machine.
07:53If you go far enough back in time, your calculator would have been AI. The definition of AI has sort of
07:59narrowed over time. When things become routine, they generally stop being called AI, and "AI" is reserved
08:05for the new thing that's doing things in an exciting, different way. The human brain can't be copied,
08:10at least we don't have the technology for that yet. There's just one brain, and it has to learn on
08:14its own. Whereas with the AIs, you make a ton of copies, they go do stuff, and then all the learnings
08:20are sort of merged. These days, AI tends to mean artificial neural networks, especially big ones,
08:26and especially ones that are built off of language models so that they have a sort of general
08:31understanding of the world. The core way that they're created is this: they start off with random
08:36connections. And then you have some training environment, or some series of training environments,
08:41and you throw them into those environments. And then the environment will sort of have them do
08:47stuff, and then it will automatically grade their performance. And then those grades will be used
08:53to reinforce and change their circuitry in an automatic way. It's kind of like a survival-of-the-fittest
08:59thing happening within that brain: circuitry that tends to cause the whole brain to get high scores
09:07gets strengthened and reinforced, and comes to dominate the overall behavior. So it's kind of like a
09:13virtual brain. We should distinguish between does the system behave as if it has a mind? Does it,
09:21you know, behave as if it has goals? And are those experiences morally valuable in the same way that
09:26we think humans' experiences and humans are, right? Those are separate questions. There are going to
09:33be autonomous AI agents that will be behaving as if they have goals, behaving as if they have desires,
09:40you know, behaving as if they have thoughts. And arguably there already are these things,
09:44right? Like if you go talk to ChatGPT, it sure seems like it has thoughts, and it sure seems like it's
09:50able to go browse the internet and experience stuff on the internet, just like you would if you were
09:54browsing the internet and then it can come back and tell you about it. You know, there's this big debate
09:58about what are the correlates of consciousness? What are the physical properties that are
10:06associated with consciousness? People disagree about the details, but there are lots of candidates for
10:10which cognitive properties give rise to consciousness. And basically all the
10:18ones that have been discussed seem like either AIs already have them or will soon have them. I think
10:23there's a serious philosophical and scientific question about whether they have moral status,
10:27whether they have experiences, etc. And I think we should be prepared to get an answer that might
10:33be kind of uncomfortable for us, such as, yeah, they probably are having experiences. They probably do
10:38have some moral status. We do owe them a decent life. AGI stands for artificial general intelligence,
10:45and the general there is the key word. It means that it can do everything, in some sense. It's an
10:51AI system that has a huge, broad range of skills, can do a whole bunch of different jobs,
10:58you know, can be a sort of autonomous agent. And then super intelligence is sort of like
11:03the more extreme version of AGI. And the idea there is like basically, whereas AGI is often thought of
11:10as like, it's extremely broad, it can do lots of things, but maybe it still has some limitations.
11:14Super intelligence is like, no, it's better than the best humans at everything that matters,
11:18while also being faster and cheaper. It would look like you're talking to a doctor, except the doctor
11:23is on the computer, and it's an AI. And it's a really good doctor, and it's better than the best
11:28doctors.
11:31If you look at the benchmarks, and if you chat with the models, they have quite an impressive general
11:37world knowledge, and quite an impressive specific knowledge about different fields. It's almost as if
11:42they've read every textbook ever, and taken every exam ever, and every course ever. So they have
11:47a lot of book learning about basically everything. But what they lack is the ability to
11:53actually get stuff done on their own. Claude is the latest AI system produced by Anthropic.
11:59It's fascinating to watch. It can play Pokemon, but it is worse than humans at playing Pokemon.
12:05It tends to like, forget its own position, and go in circles, for example. Or it tends to like,
12:11keep trying strategies that aren't very good. However, the AIs are getting better at this sort
12:16of thing. Every couple months, the AIs get better at operating autonomously as agents,
12:21without human intervention. A decade ago, when all these AI companies were training AIs to crush these
12:27various games, like chess and Go and so forth, they specifically trained the AI to play that game.
12:33The exciting thing about Pokemon and Claude is that Claude was not trained to play Pokemon at all.
12:39So it's a completely different ball game. It's not that Pokemon is inherently more difficult than other
12:44games that AIs have beaten. It's that this AI is able to play this game without ever having trained
12:49in this environment. So it's, it's able to do it purely by generalization. And the reason why that
12:55matters is because, you know, the dream of AGI is a system that can do everything. And that basically
13:01means it can do many things without having been trained on it. So currently, the AIs are not really
13:06very agentic. They don't really operate autonomously, continuously, in pursuit of goals. Instead,
13:12they just sort of output a paragraph or two of text in response to your question. But in the future,
13:17we'll have AI agents that operate continuously and autonomously, and that are more like employees,
13:21where you, you give them some big picture instructions, and then they just like churn away
13:25in the background working on those. So first milestone is the AI employee that can automate
13:31coding. Second milestone is the AI employee that can automate the entire AI research process. After
13:37that, we expect AI research to go much, much faster than it currently goes, maybe something like 25 times
13:42faster. Then a few months after that, you get the super intelligence. So that's, you know, the system
13:49that's broadly better than the best humans at everything. Even after they run the whole economy,
13:53they do so in the best interest of the people who control them. There's still the political question
13:59of, well, who controls them? Of course.
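The arithmetic behind these milestones can be made concrete with a toy back-of-the-envelope model. The only number taken from the interview is the rough 25x multiplier; the "ten years of research remaining" figure below is purely an illustrative assumption.

```python
# Toy model of the research-speedup claim above.
# The ~25x multiplier is the figure mentioned in the interview; the
# "10 years of human-pace research remaining" is an illustrative
# assumption for the sake of arithmetic, not an actual estimate.

def calendar_years(research_years_needed: float, speed_multiplier: float) -> float:
    """Calendar time to finish a fixed stock of research at a given speedup."""
    return research_years_needed / speed_multiplier

remaining = 10.0                                   # assumed human-pace research years left
at_human_pace = calendar_years(remaining, 1)       # 10.0 calendar years
after_automation = calendar_years(remaining, 25)   # 0.4 calendar years, i.e. roughly 5 months

print(at_human_pace, after_automation)
```

This is the sense in which automating AI research is the turning point: once the second milestone is hit, what would otherwise be years of progress compresses into months.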
14:04There'll probably be a very intense AI race happening internationally. There'll be multiple
14:09data centers controlled by different companies, some of which are in different countries, such as the US
14:13and China. They will all be racing each other to get better and better AIs and to automate the research
14:18and so forth. I think that this race pressure will cause the leaders of these countries and the leaders
14:24of these companies to aggressively deploy their super intelligences into the economy and also into
14:31the military and so forth. The end result of this is a world where super intelligences basically run the
14:39show on both sides of the Pacific. They've been integrated into the military. They have been
14:45allowed to design and set up their own factories, producing all sorts of new machinery and robots.
14:51Eventually, you get this entirely self-sufficient economy that is designed and run by super intelligences.
14:57They could totally unplug all their AIs, but they don't want to, because why would you? I mean,
15:03they need to beat each other, and it's making so much money and so forth, and it's not even that
15:07dangerous. And then again, you could ask, maybe you could unplug it there. Although in that case,
15:12unplugging would be a lot more difficult, especially if the AIs resist since they have all these autonomous
15:17factories and weapons and robots and things like that. But also even then, like how would you convince
15:23the government to suddenly turn on 180 degrees and unplug all this wonderful new machinery that they just
15:30built to beat China? It really wants to avoid China winning and China beating them in a war or in economic
15:38competition. A military that doesn't have super intelligence in charge would be outcompeted by a
15:45military that does. So I don't think that banning AI from being in the military matters that much,
15:51actually. AIs will be placed in charge of important military decisions eventually. Not now, not in the next few
15:57years, but after they're super intelligent, I do expect that to be what these governments go for.
16:03Why? Because if you don't do that, you might be outcompeted by a rival government that did.
16:08You could imagine a sort of alternate version of AI 2027 where both the US and China
16:13signed a treaty early on to never let AI touch anything related to the military. But the story
16:18would still look quite the same. There'd still be this explosion into the economy. There'd be this
16:23massive transformation, robot factories building, you know, more machines which build more factories
16:28and so forth. And then after a few years, effectively the whole economy would be run by super intelligences
16:34and they would be able to use a combination of political maneuvering and illicit military
16:42development to get hard power when they need it.
16:48AI progress has continuously surprised most people including most experts in the field with how fast
16:56it's been. The point to intervene is basically before the AIs get that smart and before they're
17:01integrated into everything. The longer you wait, the more costly it is to do the unplugging and also
17:08the less likely it is that you can even succeed. It's trivial to unplug them before they're actually AGI.
17:14You could unplug them right now and it would only be minor economic damage.
17:18You could unplug them next year and it would be larger economic damage but still relatively minor.
17:24But at some point, the economic damage would be severe once they're basically running the economy.
17:30And then also at some point, they just wouldn't let you unplug them. Read AI 2027. Go to the part
17:35where it's, you know, middle of 2028 where the AIs have been given special economic zones and they're
17:42managing huge work forces of human employees, building new types of factories, new types of
17:47mines, producing new types of robots. They're steering and controlling the robots to build all sorts of
17:52new machinery and so forth. And moreover, by this point, the super intelligences are embedded in
17:58everything, or at least in many things, and they're better than you at politics. In fact, they're better than
18:04everyone at politics. So they're better than you at lobbying. All the cards are stacked against you,
18:09basically. If you wait too long, if you wait till 2029 or something like that, then you just
18:15literally don't have the physical ability to win a war against the superintelligent economy. And so if
18:21the government moved to unplug it all, then the robots would just fight a war and win.
18:31After super intelligence is built, then humans will no longer be in charge of the planet or at least not
18:35by default.
18:36This is the transition point where humans go from being the dominant species on the planet to being,
18:40you know, pets or retirees, or possibly just eliminated entirely and replaced by this new
18:47species. The hope is that we can make this new artificial species loyal slash obedient to
18:55humans. Aligned is a term that people would often use, which means they have the goals that we wanted them
19:03to have. They follow the rules that we wanted them to follow, right? They obey us.
19:10And it's a sort of open secret. We don't really have a good plan for how to do this yet.
19:16We don't yet have super intelligence, which makes it harder to study. How do we control it? How do we align it?
19:21How do we make it safe? Instead, we have these earlier, much weaker AI systems that might not even be
19:26using the same paradigm as super intelligence. We can do experiments on them, and we can try to
19:32design techniques for controlling them and aligning them. And then we have to sort of hope
19:38that those techniques will still continue to work on much more powerful and intelligent AI systems in
19:44the future. But we can't just sort of open up their code and see what goals they ended up learning
19:50as a result of that process. Because they just don't work that way. They don't have a bunch of
19:55code. They have a bunch of neurons or artificial parameters. That's one of the core reasons why
20:00this is tough, is that, you know, we would like to build AI systems when they become smarter than us.
20:06We'd like to make sure that they reliably pursue all and only the goals that we wanted them to pursue
20:12and follow the rules that we wanted them to follow. And that they do this even when they can't get
20:18caught, right? Even when they're in a position of power and responsibility. Progress is being made
20:24on it. There's interpretability research, which is attempting to piece apart the different circuitries
20:31that the AIs have learned that make up their artificial brain and understand what those circuits
20:36are doing. That's very promising, because if we made a lot of progress in interpretability,
20:40then what I just said would not be true anymore. And we could actually just go open up their brain and
20:44find out what they're thinking. But we do not have a reliable way to control AGI or superintelligence.
20:51In fact, we don't even have a reliable way to control current AI systems, as evidenced by the
20:56fact that they often lie to users despite being trained not to lie. We just don't know what or how
21:01they're thinking. That's all just the loss of control angle: this looming risk of loss of
21:08control on the horizon. How do we stay in charge of the army of superintelligences once we have
21:13them?
21:13These companies are focusing on winning and beating each other. And they are sort of crossing their
21:20fingers and planning to deal with these issues later as they come up. And they are not getting
21:25ready to actually deal with these issues later as they come up. People want to believe that they're
21:30the good guys. People want to believe that what they're doing is reasonable and justified. And if
21:34what you're doing is working as hard as you can to build more powerful AIs so that you can beat your
21:41competitors, well then you want to believe that that's good and that you're justified in doing
21:45that. The US government has regulated many things in the past. It's even banned many things in the past.
21:51So I think that it's totally within the US government's power to end this crazy race to superintelligence
21:59between AI companies or at least to put guardrails around it so that they proceed with appropriate
22:06caution and do the relevant sort of research and have to actually make it safe before it's too late.
22:12I also think that countries have collaborated and made treaties and come to understandings in the past
22:18on things such as nuclear weapons, sometimes on climate change, right? So I do think it's possible
22:24to make this go well. And you know, that's what I'll be advocating for.
22:32There's this concentration of power angle, which is who controls the army of superintelligences
22:39and what do they do with them? Currently we're on a path to have either a single corporation headed by a
22:44CEO or, you know, the president, who was democratically elected, but it's still just one
22:52person. And so if they have all the power, then they're still a dictator, you know? So we need to
22:57have some sort of more like checks and balances type system where a whole group of different people
23:03who represent different parts of society all have a say in what goals the AIs are given, what orders
23:10the army of superintelligences is given, and so forth. I think we have a lot of changes that we need
23:15to make if we are to achieve a democratic outcome from this. I do not think it is acceptably democratic
23:24for the answer to the question to be, well, the CEO of whichever company built them first controls them.
23:31I think even if the government gets involved, I think that the executive branch is the most likely
23:36branch to sort of wake up the soonest and know the most about what's going on. To put it
23:42shortly,
23:43if the army of superintelligences is just completely controlled by a single man, even if that man was
23:49democratically elected, then I think that we're not really a democracy anymore. If we want to stay a
23:53democracy, we need to have checks and balances over what goals the superintelligences can be given,
23:59and what uses they can be put to, and who gets to see what they're up to, for example. So I would like
24:04to see a world where Congress and the judiciary and these other parts of the government have
24:11a say in how this is developed and who's doing what with the army of superintelligences.
24:16Background context for why I left OpenAI. There are these looming threats on the horizon,
24:22and we are just sort of rushing right into them, and OpenAI in particular is not really doing much
24:29to avoid them. You know, I don't want to hate on OpenAI too much because I would say similar things
24:35about the other companies in the field. I don't think there'd be some sort of liberal democracy
24:40with capitalism all happening inside the data center. Instead, presumably, it'd be more of a
24:45top-down structure where the CEO and the humans, you know, give orders to a giant bureaucracy of AIs.
24:52At the highest level, there's this command structure, and it's the oversight committee
24:55where the buck stops, that makes the decisions about, you know, the biggest questions. And also,
25:02crucially, in order for this to actually work, they have to have transparency and visibility into
25:09what's going on. So they all get to see what's going on with the AIs. They all get to see the logs
25:14of interactions with the army of superintelligences, so that none of them can use unequal access to the
25:22AIs against the others.
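The "everyone sees the same logs" idea can be sketched concretely. Below is a minimal, hypothetical illustration (not a description of any real company's system, and all names are invented) of a tamper-evident log: each entry commits to the hash of the previous one, so any committee member holding a copy can detect retroactive edits.

```python
# Minimal sketch of a shared, tamper-evident audit log for AI interactions,
# illustrating the transparency idea above. Hypothetical names throughout;
# this is not any real oversight system.
import hashlib
import json

class AuditLog:
    def __init__(self):
        self.entries = []  # each entry: {"record": ..., "prev": ..., "hash": ...}

    def append(self, record: dict) -> str:
        """Add a record, chaining it to the hash of the previous entry."""
        prev = self.entries[-1]["hash"] if self.entries else "genesis"
        payload = json.dumps({"record": record, "prev": prev}, sort_keys=True)
        digest = hashlib.sha256(payload.encode()).hexdigest()
        self.entries.append({"record": record, "prev": prev, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute the whole chain; any edit to a past entry breaks it."""
        prev = "genesis"
        for e in self.entries:
            payload = json.dumps({"record": e["record"], "prev": prev}, sort_keys=True)
            if hashlib.sha256(payload.encode()).hexdigest() != e["hash"]:
                return False
            prev = e["hash"]
        return True

log = AuditLog()
log.append({"user": "oversight-member-1", "query": "status report"})
log.append({"user": "ceo", "query": "deployment plan"})
assert log.verify()
log.entries[0]["record"]["query"] = "tampered"  # a retroactive edit...
assert not log.verify()                          # ...is detected by everyone
```

The design choice here is the hash chain: because each entry's hash depends on everything before it, no single party can quietly rewrite history without every other copy-holder noticing.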
25:27You should play around with the chatbots a lot and try using them in your work and just sort of
25:35playfully explore what they can do and what they can't. And of course, read the literature about
25:41the progress they're making on various benchmarks and, you know, read about the forecasts that AI
25:46Futures Project and other groups have been making and try to get prepared for what's coming. That
25:52would be my advice for phase one, which is the phase we're currently in. Society in general is not really
25:57ready for what's coming, but it's more important that the transition from pre-AGI systems to superintelligence
26:02be a sort of slow, gradual transition than that it happen later. Because serial time to
26:10prepare is valuable, but less valuable than time to react to the things that are happening.
26:19And so we want the transition to be slower, even if the transition starts
26:26sooner. But the longer we wait to build AGI, the faster the transition will be because more hardware will
26:33accumulate in the world. More compute will be produced by the foundries and there'll be more
26:40ability to rapidly scale up small AI systems into larger ones and to have them rapidly do more research
26:47and accelerate the takeoff. The safest and best thing for humanity is for us to build AIs sooner rather
26:54than later, even though we're not ready, because if we waited till later, it would come as more of a shock
27:00to the system. That was also a popular argument, an example of an argument that people
27:05would sometimes use to explain why we're doing what we're doing at OpenAI. It's not so popular now
27:12because OpenAI has been trying to accelerate the production of chip capacity worldwide. So they're
27:16trying to make there be more hardware in the world, which, you know, would make the transition happen
27:21faster as well. There's this sort of dynamo: the companies have more successful, more powerful AIs, which
27:28make more money and impress more investors, which causes them to get more money, which they use to buy more
27:33compute and train bigger AIs, which are then more powerful because they're bigger, and then they
27:38get even more users, and so forth. And that's been the story of the last five years, and arguably has
27:43been the story of the last 10 years. There'll be this trade-off that the companies will face between
27:49doing the thing that's fastest and most powerful and cheapest and implementing all of these complicated
27:55schemes that make it safer, but probably will come at some cost. These companies are explicitly trying to
28:02build superintelligence, and they think that they have a good chance of succeeding before this decade
28:08is out. Some of them would even say they're probably going to succeed in the next few years.
28:12This is a big deal. I think what's going on, ironically, the reason why people aren't
28:18running around panicking is because people don't actually believe that these companies are telling
28:22the truth when they say this. If people actually believed that one or more of these companies was going
28:28to build superintelligence in the next few years, that would be incredibly disruptive. There'd be
28:36people running around in the streets screaming, right? Part of why I like this methodology of
28:41scenarios is that one of the advantages of doing specific scenarios is that it causes you to think
28:46of questions that you weren't thinking about before. Forcing yourself to write down the story
28:51forces you to notice when some of the ideas that you had previously had are actually in conflict and
28:57you didn't realize it yet because you never sort of put them together. One way we can prevent that
29:01from happening is by requiring transparency about the spec, the intended goals and behaviors of the AIs.
29:07No hidden agendas. Everything must be available for the public to see what goals and values the AIs
29:13are being given. Companies should be required to keep the public informed about the capabilities that
29:19they're developing and the performance on various evals and benchmarks that they're seeing, including
29:25dangerous capabilities, not just the exciting commercially valuable ones, but also the dangerous
29:29capabilities. Companies should be transparent about what goals, principles, etc. they are attempting to
29:36train into the models. These are written documents that describe the intended goals, principles, etc. of
29:41their AIs. So that should be public and there shouldn't be like secret clauses in it that are hidden from
29:46the public. We really want to avoid a world where AI is being integrated into everything and also,
29:53you know, the AIs are pursuing a hidden agenda following the orders of the CEO of the company,
29:58right? That would be terrible. That's a recipe for a literal AI dictatorship happening. Then, safety cases.
30:08So once you've got your spec and you're required to make it public, then you should have some sort of
30:14written document that explains why you think your AIs are actually going to follow the spec. But ultimately,
30:20I think there needs to be an industry-wide requirement rather than simply relying on the company's goodwill.
30:26I don't think it's hopeless. I think that the technical alignment problems are solvable,
30:32we just need to devote the right amount of effort to them and proceed with the appropriate amount of
30:36caution. I think that's totally possible and it's a question of just getting the political will
30:41and then having the right people with the right expertise draft the actual language instead of
30:46what often happens where the actual result is a mixture of incompetence from people who don't
30:55understand the technology and lobbying from corporations who have a conflict of interest there.
31:01It's a constant back and forth between like good news and bad news, right? Countries have nationalized
31:06industries before, even the U.S. has. That's not completely unprecedented either. Is it going to
31:11happen? I mean, probably not. I told you the race ending is the most likely outcome, I think. So yeah,
31:17I agree this is difficult to make happen in the political environment that we're in and given the
31:22race conditions that we have. But it's not impossible. There's some continued progress in various
31:29technical alignment fields. There's actually been a lot of like very vivid and obvious alignment
31:34failures, which I think are actually good news because they help people like wake up and pay
31:39more attention to the problem and also because they give us stuff that we can study to make scientific
31:44progress on fixing the problem. Recently, OpenAI published a paper where they described how they
31:49found their AIs hacking the training process and rather than completing the tasks straightforwardly as
31:55instructed, they were basically cheating their way through some of the tasks and they knew they were
31:59cheating. And it's great that we have those examples already because it means that we have several years
32:04to study that phenomenon and try to fix it before it's too late and before we really have to have
32:09a working, robust solution. I mean, there's a lot to be worried about. I try not to let it get to
32:15me that much. I think that I've been thinking about this stuff for years, and so I've sort of learned to
32:22live with it, if that makes sense. I do still have nightmares every once in a while. But mostly, I think,
32:27I've learned to be chill about all of this. I hope I'm wrong about all this stuff.