Professor Aaron Ames of the California Institute of Technology joins WIRED to answer the internet's burning questions about robotics. What are robot dogs actually used for? Is there any attempt to put ChatGPT inside a robot? What's the chance we'll end up like Terminator in the future? Answers to these questions and many more await on Robotics Support.
Transcript
00:00I'm Aaron Ames. I'm a professor of mechanical engineering. I'm here today to answer your
00:04questions from the internet. This is Robotics Support.
00:11Crude Bones says, I don't understand the benefits of these food delivery robots.
00:15You know, this is a job that could be automated, so let's have robots automate it. They obviously
00:19work in some scenarios, but they're not super robust. I think the bigger thing they're trying
00:24to solve is a proof of concept demonstration to see if we can solve things like automated
00:28package delivery, last mile delivery, things like that. A lot of these are first steps towards
00:32trying to solve this more general delivery problem, which is actually a really hard problem.
00:36And by the way, putting these robots on the road has taught us a lot about both what works and also
00:40a lot of what doesn't. If you've seen these delivery robots, they get stuck a lot, they can hit things,
00:44they can fall off the road. It does tell you how hard robotics is. And I think that's sort of something
00:49to keep in mind when we're talking about humanoids and all these other things: look at the robots
00:52that are actually in the field today and how they sort of still kind of fail quite a bit.
00:57And it tells you we got a lot of work to do. And by doing that, at least on the robotic side,
01:01we will learn a lot. From Paradise Knights, WTF is with dancing robots? I thought they were supposed
01:08to be our slaves before they enslave us, at least for a bit. They're just partying. It's pretty good.
01:14Yeah, WTF on multiple levels. First and foremost, why are they having robots dance? First, there's
01:21amazing progress that's happened in humanoids, among other things, back flipping, running. I mean,
01:26really the behaviors we've achieved in the last year or two have been absolutely remarkable. We
01:32figured out a very nice pipeline in which to get robots to do this. And the way you do it is you
01:36start with a human doing those actions. The reason why the dancing looks so human-like is it's a human
01:41dancing. It's really just puppetry. Very beautiful and advanced engineering puppetry, but puppetry
01:47nonetheless. So a human puts on a mocap suit or you use cameras and the human dances. Not me,
01:52because I'm a very bad dancer, but you know, good dancing human. And then you take that data and you
01:56get the trajectories from that data. That is the motion of the human over time. And then you train
02:00a reinforcement learning algorithm on the humanoid robot that basically mimics or copies that human
02:05data as much as possible with the morphology of the robot. And the end result is it dances like the
02:09person that was dancing. And as long as everything is just the way you expected it to be, meaning the
02:14environment's like it was when the human did the thing, we can get robots to do that now. But you asked about
02:18this: aren't they supposed to be helping us sort of before they party? And the answer to that is that
02:23we still don't know how to solve that problem. That's sort of the not very secret secret is that
02:28all of the hard problems getting robots to be truly autonomous and in our homes are still hard problems
02:34that are unsolved. The next question is from ProjectGuy111. What percentage chance do you think
02:39we'll end up in a Terminator future? So I think there's two answers to this. One is, what's the chance that
02:45AI and learning will do bad things. And I think that probability is actually fairly high if we're
02:50not careful. And now's the time to be careful. If you trust AI to be the decision maker, if you're
02:55not very careful about having guardrails for that AI, it will make bad decisions. I mean,
02:59if you've ever used ChatGPT or LLMs, you see that it can produce really nice answers sometimes and
03:04it's impressive, but then sometimes it just is wrong. So we cannot trust AI. In my opinion, you can never
03:10trust AI, but you can use AI as a powerful tool. Just like if you search something online on Google,
03:16you get a lot of results back, gives you a lot of information, but you have to verify and double
03:20check. So I think if we put like AI in charge of our, you know, weapons or something silly like that,
03:25then, you know, bad things will happen. At the same time, the second part of Terminator was it became
03:29sentient, right? It actually learned to think on its own. And I don't think we're anywhere near that.
03:34So I have no concern of sentient AI. Right now, AI is not intelligent. They say AI,
03:40artificial intelligence. There's no actual intelligence. It has no notion of what it's
03:44saying or doing. It is simply pattern matching at a scale we've never pattern matched before.
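His "pattern matching at scale" point can be made concrete with a toy next-word predictor that does nothing but count which word followed which in a tiny corpus. This is only an illustration of the flavor of the idea; real LLMs are neural networks at vastly larger scale, but the principle he's describing is the same: completing patterns with no model of meaning.

```python
from collections import Counter, defaultdict

# Toy corpus; the "model" just counts word-to-word transitions.
corpus = (
    "the robot picks up the box . "
    "the robot puts down the box . "
    "the robot picks up the cup ."
).split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict(word):
    # Return the word that most often followed `word` in the corpus.
    # Pure pattern completion: no notion of what a robot or a box is.
    return bigrams[word].most_common(1)[0][0]

print(predict("robot"))  # "picks" (seen twice, vs. "puts" once)
```

The predictor can sound fluent on text resembling its corpus and is confidently wrong everywhere else, which is the failure mode he describes for trusting AI as a decision maker.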
03:49From I Got Too Silly, what benefits do legged robots have over wheeled or tracked vehicles?
03:54Legs are inherently beneficial if you want to operate in environments that are built
04:00for humans and, more importantly, where things are not flat. So wheels are massively efficient in how
04:05you can move around environments as long as there's not uneven terrain. If you've ever
04:09been in a wheelchair or wheeled someone around in a wheelchair, you realize very quickly how flat
04:15the world is not. Even what you perceive as flat, even in a city environment where there's sidewalks
04:20and everything, you realize there's curbs that don't dip enough. There's big breaks in them. And
04:23all those little things for wheels become big problems. They become sort of sticking points. And
04:27legs have the inherent ability to walk more robustly over multiple terrain types. Quadrupeds being one
04:32example. I mean, where your dog or your cat can go on four legs is pretty incredible. Bipeds, of course,
04:38are sort of the ultimate expression of mobility in human environments because the environments are
04:42built for us. So if you need to get into a small space and get up a small set of stairs or something
04:47like that, or climb up a ladder, only a biped can do that. Give a ladder to a quadruped, it's not going
04:52to know what to do. Give a ladder to a human, as long as they're reasonably healthy, they can climb up that
04:56ladder no problem. From Sammy 514, what are robot dogs actually being used for? So we've actually come a long
05:03ways in robot dogs, which are actually technically called quadrupeds because they have four legs.
05:09It's really amazing what's happened in the last sort of decade where we've gone from these robot dogs
05:14being really in lab environments and research environments to being things that you can buy at
05:19insanely low prices. The hardware has made immense progress. I think the practical use cases are still a
05:27little thin. They're thinking about using them for things like inspecting buildings, sending robots ahead
05:32in disastrous scenarios, right? And for that, legs are definitely better than wheels. The next question
05:36is from Levon21. Is there any attempt to put ChatGPT inside a robot? Yeah, there's lots of attempts.
05:43So right now there's many humanoid robot makers that have that as a layer in their humanoid robots.
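The "ChatGPT as a layer" idea can be sketched roughly like this. Every component below is a hypothetical stand-in, not any company's actual stack: `parse_command` stands in for an LLM call, `detect` for a vision model, and `reach` for the low-level controller that actually moves the arm.

```python
# Hypothetical layered robot stack: language understanding at the top,
# perception in the middle, traditional control at the bottom.

def parse_command(text):
    # Stand-in for an LLM turning natural language into a structured goal.
    words = text.lower().split()
    color = next((w for w in words if w in {"red", "green"}), None)
    obj = next((w for w in words if w in {"apple", "cup"}), None)
    return {"action": "locate", "color": color, "object": obj}

def detect(goal, scene):
    # Stand-in for computer vision: match the goal against detections.
    for item in scene:
        if item["name"] == goal["object"] and item["color"] == goal["color"]:
            return item
    return None

def reach(target):
    # Stand-in for the low-level control loop (classical control or RL).
    # This layer, not the LLM, is what actually moves the robot.
    x, y = target["pos"]
    return f"moving gripper to ({x}, {y})"

scene = [{"name": "apple", "color": "green", "pos": (0.2, 0.5)},
         {"name": "apple", "color": "red", "pos": (0.4, 0.1)}]
goal = parse_command("Where is the red apple on the table?")
print(reach(detect(goal, scene)))  # moving gripper to (0.4, 0.1)
```

The layering mirrors the brain/spinal-cord analogy he goes on to give: the top layer deliberates slowly over language, while the bottom layer runs fast, reactive control.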
05:48And so the way to think about this more generally is that as we're making robots do things, there's no one
05:54thing that's going to make them do all things, right? So imagine your body: you have a brain, you also have
05:59a spinal cord, you have proprioception. So you sort of intrinsically have multiple computers running in
06:04your body at any given time. Like your spinal cord is a computer in its own right. And those computers
06:09run different algorithms. So at the highest level, at our cognitive level, that's where LLMs will sort
06:14of play a role if you'd like. So you want to ask the robot a question and have it perceive something
06:18about the environment and make a decision. So ask the robot, you know, where is the red apple on the table?
06:23It can use an LLM to parse that language into code, use computer vision then to detect all the apples
06:29on the table and return based on that an estimate of where it is. When you actually want to reach
06:34for the apple, an LLM is not going to reach for the apple. That's where traditional robot control
06:38will come in or where reinforcement learning will come in, which is a whole nother type of learning
06:42that's different from LLMs. And those are what's actually running on the robot, right? Those are what
06:47are making the robot go, just like your spinal cord is really what moves your body most of the time
06:51without you having to think about it. From IWantToBeFree10, are autonomous vehicles mobile robots?
06:57Yeah. I mean, if you want to ask what the most advanced robot is today, you would take autonomous cars
07:03and mobile warehouse robots like Amazon has. Autonomous cars, I think, are one of the prime examples
07:12of the farthest that we've pushed autonomy. So yeah, it's absolutely a robot. It's an advanced robot.
07:17It's a beautiful work of engineering. From vknightPersona, how does the Roomba know what's
07:22what? I think what they're asking is how the Roomba has a semantic understanding of the
07:27environment, which is the way we would technically phrase this question: how is it able to identify
07:33the things around the room that it sees and correctly label them? So basically, we've been able to
07:38take a bunch of training data, a bunch of examples of pictures on the internet, videos on the internet,
07:42and teach robots how to correctly identify those. By teach, what we mean is we can train up basically
07:48this big neural network that takes images in and produces what's in the images out. And as a result,
07:55we can now correctly identify most things in an image. And so what Roomba does is simply,
08:00it has a camera that's perceiving those things in the environment. It's checking with the internet
08:04based on these large models and then identifying the things in the environment and using that
08:08information to tell you what's going on. The next question is from Fireplace Air. Why make robots
08:13humanoid shaped when they could have six arms? Humanoid robots are the most suited to do the most
08:18things, even if they're not the best suited to do any given thing. Again, the world is built for us.
08:25We've built the robot for something of our shape and our function. Doors, stairs, narrow corridors,
08:33all these things. And if you want a robot that can slot into any scenario where a human can go,
08:38which is pretty much most scenarios we care about, not all, but most: say you want somebody to start making
08:42you sandwiches in your kitchen, or you want a robot that can be dropped into an existing factory and
08:48automate some tasks, right? I want to say all that as a preface to the fact that humanoid robots are not
08:54the best form for, again, a given application. For example, right now, the largest owner of robots is
09:01Amazon. It has over a million. And as a result, one of the largest sectors of robotics is warehouse
09:06robots. Robots with two wheels that go under these pallets and then move them around warehouses. In
09:12particular, what Amazon does is have the robots go pick up a pallet with the thing you order from
09:16Amazon, and it moves that over to a human who's filling the order. So the human can just stand
09:20there and they never have to move. And basically, all of this stuff comes to them. Now, you could
09:24theoretically have a humanoid robot do this, like push these big pallets around the warehouse. But the
09:28question would be why? Like these robots are very good at doing what they're meant to do.
09:33And that environment is designed to work synergistically with the robot. So you could
09:36have six arms on a humanoid if you had an application that determined that six arms would make a big
09:41difference. Maybe you have a firefighting robot and you find that two arms is not enough and you want
09:45four more arms because you want to hold the hose and you want to hold a camera and you want to hold
09:49the fire extinguisher. That's the great thing about robots, by the way: we can make them any way we want.
09:53A Reddit user asks, when it comes to automation, how close is Amazon to actually automating most,
09:58if not all, warehouse work with robots? The answer is: very, very close on some parts, and
10:04other parts are further away. So in terms of automation of warehouses, Amazon is by far
10:09the leader in this domain. Amazon Robotics in particular has been working on warehouse robots,
10:16specifically robots that move around warehouses to move pallets around to move your order to a person
10:20filling the order, for 20-plus years now. And they've really refined that process so they
10:25can move packages to the sorter as efficiently as possible, and then the sorter packages them up.
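The pods-to-picker flow he describes can be sketched as a toy dispatch problem: drive units fetch the pod holding an ordered item and bring it to a fixed picking station, so the human never moves. The grid, costs, and greedy nearest-robot rule below are illustrative assumptions, not Amazon's actual dispatch algorithm.

```python
# Toy warehouse dispatch: choose which drive unit fetches a pod and
# bring it to the picking station. Greedy nearest-robot on a grid.

def manhattan(a, b):
    # Grid travel distance between two (row, col) cells.
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def dispatch(robots, pod, station):
    # Pick the robot closest to the pod, then charge it the full trip:
    # robot -> pod -> station.
    best = min(robots, key=lambda r: manhattan(robots[r], pod))
    cost = manhattan(robots[best], pod) + manhattan(pod, station)
    return best, cost

robots = {"R1": (0, 0), "R2": (5, 5)}
station = (0, 9)
robot, cost = dispatch(robots, pod=(4, 4), station=station)
print(robot, cost)  # R2 11
```

Real systems solve a much harder version of this, with thousands of robots, traffic, and shelf re-slotting, but the objective is the same one he states: minimize how far goods travel to reach the picker.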
10:30So the question becomes, what remains? And what remains is then the part that the human's currently
10:35doing, which is grabbing these things off the shelf and putting them in boxes. And so this is much more
10:40of an open-ended problem where they're testing out solutions right now. So for that, you can use things
10:45called mobile manipulators, meaning you have robot arms, maybe on mobile bases, and maybe use suction
10:51cups or soft graspers, and you would pick up objects and put them in boxes. And you can do this with
10:55varying success levels right now, but you're not at the success level to truly automate that where you
11:01don't need a person present for all objects. Because you imagine all the objects that go in Amazon boxes,
11:05all the different geometries and how they feel and look. I mean, it's a very complicated problem to
11:11automatically and autonomously load all these objects. So that problem is still sort of open-ended.
11:15Kaleidoscope inside asks if you could commission a robot to be built with no financial barriers,
11:21what would your robot do? So if I had no financial constraints, I would try to sort of solve the
11:29exoskeleton problem. Like a billion could get it done. A billion in my mind could develop something
11:34that would eradicate the need for wheelchairs, essentially. A Reddit user asks, why do most of the
11:40four-legged robots (see Boston Dynamics) have their knees looking backwards? It's the inverted
11:46leg morphology. There's potentially a mathematical advantage to having your legs like this. It's
11:50actually a two-fold thing. It's not just that they're inverted, but that they're very light.
11:54So it turns out the way you control robots and the way you get really stable locomotion behavior
11:59on robots is by having very light legs and having most of your mass centered in one spot. There's lots
12:05of mathematical reasons for that that leverage that as sort of an assumption. And what that means is as
12:09you're walking, if your legs are very light, you can place them very quickly to catch yourself as you
12:14fall. A lot of the robots were designed based on that morphology. And in addition, the inverted leg
12:19is actually what birds have because birds actually have this type of morphology where they have very
12:23heavy, big bodies and very light legs. And as a result, birds have some of the most robust locomotion
12:28out there. They can actually step down into huge holes and then step up out of the hole without ever
12:33changing how their main body is moving. And so if you can get that kind of behavior on robots,
12:38they'll be very robust and able to go all the places where you want to take robots with legs.
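The "light legs let you place your feet quickly to catch yourself as you fall" idea has a classic formula behind it, from the linear-inverted-pendulum model used in legged locomotion (a textbook result, not something stated in the interview): the capture point, the spot to step to in order to come to rest, is the center-of-mass position plus velocity scaled by the pendulum time constant sqrt(z0/g).

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def capture_point(x, v, z0):
    """Foot placement that brings a linear-inverted-pendulum walker to
    rest: CoM position x plus velocity v times the time constant
    sqrt(z0 / g), where z0 is the (constant) CoM height."""
    return x + v * math.sqrt(z0 / G)

# CoM over the stance foot, moving forward at 0.5 m/s, hips 0.9 m high:
step = capture_point(0.0, 0.5, 0.9)
print(round(step, 3))  # 0.151 -- step about 15 cm ahead to stop
```

A heavy, centered body with light legs is what makes this model accurate, which is part of why the bird-like morphology he mentions works so well.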
12:42A Reddit user asks, what is so special about the Mars Rover Curiosity? I mean,
12:47what's not special about a Mars Rover? Pretty much everything. It was on Mars. I mean,
12:52the reality is that these robots are incredible engineering feats. First, just getting it to Mars
12:58is special. And then the rover itself, all the rovers that have gone, Curiosity and all of them,
13:03they have these amazing engineering sort of gems in them that they built up. You know,
13:09their suspension system is specifically built so that it can handle rough terrain. Their wheels are
13:13specifically built so they'll be robust to going over rocks and things like that and you won't blow
13:17out a tire, right? So the entire structure is built to maximally be robust to this really harsh
13:23environment. I mean, there's the electronics which have to deal with this very sandy and windswept
13:29environment where there's dust storms, you know, there's power limitations, just the power alone.
13:33I mean, the sun doesn't come through at the same intensity. So you have to have good solar panels
13:37that actually charge the electronics. The electronics have to be battery efficient, right?
13:41And then after all of that, you have to do science with it. And then, oh, of course,
13:44you have to talk back with Earth at the same time and make sure you don't lose that signal. I mean,
13:48the list goes on and on. But to build a system like that, I mean, every one of the rovers we've sent
13:52to Mars is an immense and remarkable engineering feat that sort of just makes me happy as an engineer.
13:58The next question is from fkrzewski34111. Wait, how do autonomous cars even work?
14:05Like, are we trusting robots with our lives now? Not sure about this. It turns out that autonomous
14:12cars have gone through some things. About a decade ago, when it seemed to be close, there were all
14:17these startup companies that formed and they all tried to do autonomous cars many different ways.
14:22And then exactly what you're worried about happened. There were crashes. People got hurt. They'd
14:28already been working on making these systems safe, but after that, I think
14:32the entire autonomous car industry pushed more and more to make sure there were really rigorous
14:37safety methods. So both in the algorithms to keep them safe and then also protocols around that,
14:42software architectures, et cetera, and then data to support it. And so in that way, there's a lot of
14:46evidence to show that autonomous cars, as they're currently deployed, have very rigorous safety
14:51standards. And there's a lot of data to back that up with the limited number of crashes that they've had.
14:56I'm not saying they can't get better, but after a decade, autonomous cars have done the hard work.
15:02A decade ago, we could have autonomous cars drive around on the street, but the difference
15:06between driving around on the street for a demo and actually driving around and all of a sudden
15:10other cars pull out or a ball bounces into the street and a kid chases after it, those corner cases
15:15kill, and a lack of safety kills. Meaning if you're not safe, you die, both as a company, but also as a technology.
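One family of "algorithms to keep them safe" is the runtime safety filter: a nominal command passes through untouched unless it would violate a safety constraint, in which case it gets clamped. The sketch below is a bare-bones illustration of that idea, not any deployed system; the 6 m/s^2 braking limit and the single-obstacle model are assumptions.

```python
import math

A_MAX = 6.0  # assumed maximum braking deceleration, m/s^2

def safe_speed(gap_m):
    # Largest speed v whose stopping distance v^2 / (2 * A_MAX)
    # still fits inside the gap to the nearest obstacle.
    return math.sqrt(2.0 * A_MAX * max(gap_m, 0.0))

def filter_command(v_desired, gap_m):
    # Safety filter: pass the nominal speed command through unless it
    # exceeds what the car could brake away from in time.
    return min(v_desired, safe_speed(gap_m))

print(filter_command(20.0, 50.0))  # 20.0 -- open road, command unchanged
print(filter_command(20.0, 3.0))   # 6.0  -- obstacle 3 m ahead, clamped
```

The design point is that the filter is minimally invasive: it only intervenes at the corner cases, which is exactly where he says demos fail and deployed systems can't afford to.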
15:22The next question is from timeisgrand, a noob question, why does Elon Musk think LIDAR is not
15:27a good idea? I don't know Elon Musk's mindset, but in my mind, LIDAR is awesome. So you have really
15:34two main sensing modalities in robots, autonomous cars, et cetera. I mean, there's many more, but in
15:39terms of perceiving the environment, the two most popular things are using cameras, of course,
15:43which is what Tesla does. And you have LIDAR. Now LIDAR sends out a bunch of laser pulses
15:47and those bounce back and tell you what's around you. So it's sort of 3D radar, but now it's using
15:53lasers, LIDAR. Cameras, of course, have a lot of information present. They can tell you not just
15:58how far objects are, if you can properly estimate distance, but what objects there are and where
16:03they're at in the environment. So you can get semantic understanding of the environment and things
16:06like that. LIDAR does an incredible job of precisely locating everything in the environment
16:11with like a full 360 view. So you know exactly where all the objects are, but you have no idea
16:16really what the objects are. LIDAR is super effective on robots and on autonomous cars,
16:21because what you want to do is make sure you're not hitting anything. And that means anything. It
16:25doesn't matter if you're sort of hitting another car or hitting, you know, the guard rails on the
16:30side of the road. You don't want to hit any of that stuff. And for that, LIDAR can work really quickly
16:34and really robustly to do things like dynamic collision avoidance. Cameras, again, give you the
16:39semantic understanding. And so I imagine that what they're thinking at Tesla, generally speaking,
16:44is if they can solve the camera problem and make them as good as LIDAR, then this will extend to a
16:48lot of other application domains. But I think what I've found on robots is you want to use every sensor
16:53you can get your hands on. And so for that, for safety critical applications, which Tesla cars could
16:59still improve on, to be honest, you want LIDAR in the loop so you can quickly and rapidly respond to
17:04dynamic changes in the environment and not have to deal with the latency that's present in a lot of
17:08perception-based representations. We've got a Reddit user who asks, why is a surgical robot
17:15better than a surgeon? The reality is that robots are really good at certain things. They're very,
17:21very good at precise, small motions and doing those precise motions repeatedly again and again and again.
17:28There's also places where humans are infinitely better than robots. Basically any place where you
17:33have to interact with the environment in a soft and tactile way. And so here's where actually surgical
17:38robots get complicated because our skin is soft, our organs are soft, our bodies are soft. And so as
17:43this robot is moving through your body to perform surgery, you need to be aware of those soft and
17:49sort of compliant interactions with the human body. And that's why a surgeon is in the mix. The surgical
17:53robot can do those precise motions while the surgeon operates the robot and they get all this haptic and
17:59tactile feedback while they're doing the robotic surgery. So they can take advantage of the precision of
18:03the robot while still being able to do what humans do really well, which is operate in these complex
18:09environments. This is from the ballast round. Why is it so difficult to make a clothes folding robot? The
18:14problem with clothes is that clothes are really difficult to model. They move around, there's fabric. I
18:20mean, it's very difficult to handle. So a robot has to interact with something in the environment that we
18:25can't put in a computer easily. And so that's why this is a big challenge and why there's been a big push
18:31to use things like machine learning and AI to try to understand this clothes folding problem. That is,
18:36you train a robot with humans folding clothes, like you teleoperate the robot so that it folds the
18:40clothes a bunch of times. And then you try to teach the robot that task without it actually having a
18:45model of the environment itself, but just these reference trajectories that humans generated through
18:50teleoperation. The next question is from surprisenew763. Whatever happened to the Neo robot? It's interesting
18:56because there was this big announcement that you could buy like a humanoid robot as soon as this
19:00year, in fact, and have it in your home. Now, to give them credit, they were completely transparent
19:08about the fact that this robot couldn't actually do a lot without teleoperation. In the
19:13app, they even show you how you will book time on the robot with a teleoperator. So this is interesting
19:19from a couple of perspectives. One is it's at least a tacit acknowledgement that humanoid robots are not
19:24actually ready to be deployed in your homes. I mean, that's the clear and obvious ramification. So the
19:29question is why are they going to put a robot in the home that's not actually ready, meaning it can't
19:33do a lot of tasks on its own? Very few. The reason why is right now the idea in the robotics industry
19:37is that the only thing that's missing is data. So we have internet-scale data and that's what made
19:43LLMs possible. So what they're betting on is if we can get sort of internet-scale humanoid robot data,
19:48we will be able to solve all these problems: just like ChatGPT solved the problem for language,
19:52we'll be able to solve it in this other domain. So the idea with the Neo robot is not so much that they think
19:57they're really going to sell all these robots and make money off them by having them teleoperated,
20:01but rather this is an amazing way to do data collection. You put all these robots in homes,
20:05and then people agree to let you teleoperate them in their home. They collect all this data,
20:09and then they're going to use this data to train a large model to determine how to make the robot do
20:13these tasks automatically. If you can get early adopters to sign up for that because it's cool,
20:17you'll be able to generate enough data to actually train large models to learn how to do these tasks.
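The data bet he describes, in miniature, is behavior cloning: record (observation, action) pairs from teleoperation, then fit a policy that imitates them. In the toy below, a linear least-squares fit stands in for the large neural networks humanoid companies would actually train, and the "teleoperated" data is synthetic; everything here is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend teleop logs: each row is one timestep with 3 sensor readings
# (observation) and the action the human operator commanded. The "true"
# operator behavior is a fixed linear map plus a little noise.
W_true = np.array([[0.5, -1.0, 2.0]])
obs = rng.normal(size=(200, 3))
actions = obs @ W_true.T + 0.01 * rng.normal(size=(200, 1))

# Behavior cloning as supervised learning:
# solve min_W || obs @ W - actions ||^2.
W_fit, *_ = np.linalg.lstsq(obs, actions, rcond=None)

def policy(observation):
    # The learned policy: map a new observation to an imitated action.
    return observation @ W_fit

print(np.round(W_fit.ravel(), 2))  # close to [0.5, -1.0, 2.0]
```

This also previews his objection: a policy fit this way only covers the situations the operators happened to demonstrate, and the space of robot trajectories (positions, velocities, forces across many degrees of freedom) is vastly larger than any teleop dataset.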
20:23But I want to say one final thing: I don't think it's going to work. I don't think
20:28that there's enough data you can collect in this way to solve this problem with just human data.
20:32People tend to confuse LLMs and text with robots, which are fundamentally different things.
20:38In particular, the amount of data needed to understand how a robot moves, these trajectories,
20:44think of trajectories as the language of robots, and they include position, velocity,
20:50force, all this rich dynamic information. But they're so variable because there's so many
20:54degrees of freedom that it makes language seem simple. And in fact, the way to really understand
20:58this is look at evolution. We developed language in a very small fraction of our total evolution.
21:03Most of the time we spent evolving was to do these other basic things, right? Things that require
21:09motion of our body. That's why they say you only use 10% of your brain. What they mean by that is that
21:13a lot of your brain is being used for these very complicated motions that we do, right? And so the point
21:18of all this is: I don't think that data alone is going to work. I think that what we're missing is
21:22not more data. Not that it can't be helpful or help to polish things, but we need to understand physics
21:28as well. You need to merge physics with human data. And that's the only way you're going to solve the
21:33general intelligence problem on humanoid robots because robots are not language. That's it. That's all the
21:40questions. Hope you learned something along the way. Thanks.