Space, Quantum & Frontier Technologies - Live Demo with Genesis AI

Vivatech

Genesis AI is a full-stack physical AI company, building the universal foundation model for general-purpose robotics. Our breakthrough combines a robotics-native AI brain, a human-scale robotic hand, an invisible data-collection glove, and the world's most accurate simulator. Our latest model, GENE-26.5, achieves human-level manipulation, fully autonomously. We will share our latest releases covering the model, the simulator, and concrete experiences.

Transcript

00:00Ladies and gentlemen, we're back on the discovery stage.

00:05Our next guest is Genesis AI, a French-American startup with offices in Paris and San Francisco.

00:14They build foundation models for the physical world.

00:19Where ChatGPT learned to talk, Genesis is teaching machines to act,

00:26to pick up things, to move in space, the brain that will power the next generation of robots.

00:34Please welcome Théophile Gervais, co-founder of Genesis AI.

00:44Hello, my friend.

00:49Hello, everyone.

00:53So we started Genesis about a year ago with a simple question.

00:56How do we bring 10 billion robots to the world within the next decade?

01:00And so we already have a lot of robots today, but typically none of you see them

01:03because they are either Roombas or industrial arms doing the same thing all day.

01:08And what's missing to get robots everywhere else is a lot more AI.

01:12And so we started with this vision about a year ago.

01:15And a year into it, we've built every piece of our vision, and that's what I want to walk you

01:21through.

01:25So the first piece is the brain of a robot.

01:27This is the model that takes as input the video that the robot can see

01:31and outputs the actions that the robot needs to take.
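
The video-in, actions-out interface described here can be sketched as follows. This is a minimal illustration, not Genesis AI's model: `Frame`, `Action`, `ToyPolicy`, and `control_loop` are hypothetical names, and the "model" is a placeholder that a learned network would replace.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical frame type: a grayscale image as rows of pixel intensities.
Frame = List[List[float]]


@dataclass
class Action:
    """Hypothetical action: one target value per robot joint."""
    joint_targets: List[float]


class ToyPolicy:
    """Stand-in for a learned model that maps recent video frames to an action."""

    def __init__(self, num_joints: int) -> None:
        self.num_joints = num_joints

    def act(self, frames: Sequence[Frame]) -> Action:
        # A real model would run a neural network here. This placeholder
        # copies the latest frame's mean brightness into every joint target.
        latest = frames[-1]
        pixels = [p for row in latest for p in row]
        mean = sum(pixels) / len(pixels)
        return Action(joint_targets=[mean] * self.num_joints)


def control_loop(policy: ToyPolicy, video: Sequence[Frame], history: int = 2) -> List[Action]:
    """Call the policy once per incoming frame, passing a short frame history."""
    actions = []
    for t in range(len(video)):
        window = video[max(0, t - history + 1): t + 1]
        actions.append(policy.act(window))
    return actions
```

The loop runs the policy once per frame, which is the shape a real-time controller would take; the history length of 2 is an arbitrary choice for the sketch.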

01:35And so we train the general-purpose robot brain that can do many types of tasks

01:41using very complex dexterous hands, like, for example, cooking a 20-step meal over five minutes,

01:47making an omelette with eggs as well as tomatoes.

01:50All of this is real-time.

01:54For example, it can also play piano or do very complex lab automation.

01:59Lab pipetting is a very common workflow where you need millimeter-level precision

02:04to insert the tip, or also cable harnessing for industrial use cases.

02:13So all of these are a single model that can do all these types of use cases.

02:23And so the way we get to train a really general-purpose robot brain is

02:27that we need to scale data in the same way we've been able to scale data for language models.

02:32And so for language models, we start from all of the Internet.

02:35But for robotics, we don't have the same equivalent of what is the large-scale data set

02:38of human actions to train robots.

02:41And the issue is today's methods to collect data to train robots are very hacky.

02:46Typically, robots have grippers, while we humans have very complex dexterous hands.

02:50And that makes the data collection devices very complex.

02:53Like, for example, on the left is a device called teleoperation.

02:56So you have a human that moves an exoskeleton, a plastic thing, to move the robots.

03:02But the human cannot feel what the robot feels.

03:04And that makes data collection very slow and clunky.

03:07It would take you half an hour to teach the robot how to make a salad.

03:10And you need to do this hundreds of times.

03:13Like, you need tens of thousands of hours of data collection.

03:15So not scalable.
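
The arithmetic behind this scaling problem can be made explicit. Only the half hour per demonstration comes from the talk; the 300 repetitions and the 100 tasks below are illustrative assumptions chosen to match "hundreds of times" and a general-purpose model.

```python
def teleop_hours(minutes_per_demo: float, demos_per_task: int, num_tasks: int) -> float:
    """Total human hours needed to collect teleoperation demonstrations."""
    return minutes_per_demo * demos_per_task * num_tasks / 60.0


# Assumed numbers: 30 minutes per demo (from the talk), 300 demos per task.
one_task = teleop_hours(30, 300, 1)          # hours for a single task
hundred_tasks = teleop_hours(30, 300, 100)   # hours for 100 such tasks
```

Under these assumptions a single task costs 150 hours and 100 tasks cost 15,000 hours, which is how the total reaches the tens of thousands of hours mentioned above.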

03:17A slightly more scalable approach is to remove the robot completely

03:20and have the human wear grippers in their hands.

03:23So have the human adapt to the robot's embodiment, which is grippers.

03:27That's more scalable.

03:28But you can still imagine that if you're a professional technician or in a lab or a cook,

03:32you just can't wear grippers to do your job.

03:34So this is not the right scalable way to train robots.

03:38So we thought, instead of adapting the human to the robot,

03:42what about adapting the robot to the human?

03:44If you have a dexterous hand for the robot that has the same form and function as our human hands,

03:49with the full 20 degrees of freedom of our fingers,

03:52then you can directly transfer human data of us doing different tasks to the robots.

03:57And so one very natural data collection device is to just wear a glove

04:01that precisely tracks the position of your fingers as well as pressure within the palm

04:06and use the human data of experts doing their job to train the robots.

04:09And so that's how we were able to scale data to train our first version of the brain.
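
One way to picture the glove data and its transfer to a hand with matching degrees of freedom is the sketch below. The 20 finger degrees of freedom come from the talk; the field names, the pressure format, and the clamp-to-joint-limits retargeting are assumptions for illustration, not the actual device format.

```python
from dataclasses import dataclass
from typing import List, Tuple

NUM_FINGER_DOF = 20  # from the talk: 20 degrees of freedom in the fingers


@dataclass
class GloveSample:
    """Hypothetical record of one glove reading."""
    timestamp_s: float
    finger_angles_rad: List[float]  # one angle per finger degree of freedom
    palm_pressure: List[float]      # assumed: normalized sensor readings in [0, 1]


def retarget(sample: GloveSample, limits: List[Tuple[float, float]]) -> List[float]:
    """Map human joint angles one-to-one onto robot joints, clamped to joint limits.

    The one-to-one mapping only makes sense because the robot hand is assumed
    to have the same joints as the human hand.
    """
    if len(sample.finger_angles_rad) != NUM_FINGER_DOF or len(limits) != NUM_FINGER_DOF:
        raise ValueError("expected 20 joint angles and 20 joint limits")
    return [min(max(angle, lo), hi)
            for angle, (lo, hi) in zip(sample.finger_angles_rad, limits)]
```

With a gripper instead of a matching hand, this direct copy would not be possible and a lossy mapping from 20 joints to one or two would be needed.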

04:19And so even if you're able to truly scale data,

04:22and we were talking about we need to get to tens of millions of hours of data of human manipulation,

04:27you still have a lot of workflows in robotics that are not scalable.

04:31Like the first one, for example, is...

04:36How do you test your models?

04:38Typically, you train your model on your big GPU cluster for many weeks,

04:42and then you put it on your robot,

04:43and you have a human spending the whole day resetting the objects to where they should be,

04:48letting the robot try different tasks,

04:50and then repeat this 500 times in a row during the day

04:53to get a single metric of how well does the robot do at a single task in a single environment.

04:58But we're trying to train general-purpose robot brains,

05:01so they're supposed to do hundreds of tasks across hundreds of scenarios,

05:04and there is no way to do this in a way that makes sense operationally.

05:08Also, the robot wears out, so it's not the same from one day to the other.

05:11The lighting changes.

05:12The operator from one day to the other is different,

05:14so you don't get a repeatable metric.
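
The "single metric" from a day of real-world trials is a success rate, and it carries statistical uncertainty even before wear, lighting, and operator changes are considered. The sketch below uses a Wilson score interval, which is a standard choice made here for illustration and not something the talk specifies; the function name is hypothetical.

```python
import math
from typing import Tuple


def success_rate_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Return (rate, low, high) using the Wilson score interval.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials and trials > 0")
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return p, center - half, center + half


# Example: 400 successes in the 500 trials of one day of testing.
rate, low, high = success_rate_interval(400, 500)
```

For 400 successes out of 500, the interval spans roughly 0.76 to 0.83, so even a full day of resets pins down one task in one environment only to within a few percentage points.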

05:16So the way we solve this is through simulation.

05:25So instead of doing evaluation of robots in the real world,

05:28you can do it in virtual worlds.

05:29So we spent a lot of time building a virtual world

05:33that is both the physics of how do objects interact with each other,

05:36whether it's rigid bodies or deformable objects,

05:39as well as rendering,

05:40with people coming from video games to make very realistic pixels.

05:44And so this is our Paris office.

05:46You are in the real world, and now you're in the matrix.

05:48This is a virtual world,

05:50and you can see it's still very realistic compared to reality.

05:56So here you can see the underlying 3D structure of the world

06:00that has been scanned.

06:07And you can even simulate the effects of the camera,

06:11like this lens simulation.

06:13You can do the underlying 3D mesh of the robots.

06:23And this is, for example, a side-by-side comparison

06:25between simulation on the left and the real world on the right.

06:29It's really hard to tell the difference,

06:31and that's what lets you train,

06:33like evaluate your robots in virtual worlds at scale,

06:36hundreds of thousands of different environments,

06:38and then have the results in simulation

06:42match the results in the real world.
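
Evaluation at scale in simulation can be sketched as a loop over seeded environments. This is a toy stand-in under stated assumptions: `run_episode` replaces a real simulator rollout with a single random "difficulty" draw, and `policy_skill` is a hypothetical scalar standing in for a trained model.

```python
import random


def run_episode(policy_skill: float, env_seed: int) -> bool:
    """Toy rollout: the seed fixes the environment, so reruns are identical.

    In a real simulator the seed would fix object poses, lighting, and so on.
    """
    rng = random.Random(env_seed)
    difficulty = rng.random()  # value in [0.0, 1.0)
    return policy_skill > difficulty


def evaluate(policy_skill: float, num_envs: int) -> float:
    """Success rate over environments with seeds 0 .. num_envs - 1."""
    successes = sum(run_episode(policy_skill, seed) for seed in range(num_envs))
    return successes / num_envs
```

Because each environment is fully determined by its seed, calling `evaluate` twice with the same arguments returns the same number, which is the repeatability that real-world testing lacks.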

06:50So I'll skip ahead the yo-yo.

06:52So the last piece is,

06:54even if you have a really general-purpose robot brain

06:56that can do many types of tasks,

06:57because you train on a lot of human data

07:00using dexterous hands that let you

07:02transfer human data to the robot directly,

07:04and you have a simulator to evaluate your robot at scale,

07:07you still need the robot product,

07:08because nobody cares about just the brain.

07:11At the end of the day,

07:11customers just want a robot to solve use cases.

07:14And so that's what we released last week.

07:18So this is Inno.

07:19It's our first robot.

07:21We spent a lot of time thinking about

07:23what the first robot should be,

07:24and it's really important for it to have the upper body be human,

07:28because to have human hands,

07:30human arms, and human hands,

07:32because you learn from human data.

07:33The rest of the body does not need to be human.

07:35You don't need a face,

07:37because you don't have a brain.

07:38You don't necessarily need a humanoid body.

07:40It's also more stable on wheels,

07:42because you don't need to balance your body.

07:46So this is what we ended up with,

07:48as the simplest possible form of a wheeled robot

07:50we could think of.

07:51And the idea is to have it be applicable

07:53to many different types of environments,

07:55whether it's in data centers,

07:57in industrial environments,

07:58in labs,

07:58or eventually in the home.

08:04And so the next step for us,

08:07now that we've built all the building blocks of the tech,

08:09is how do we bring this to market?

08:12And so we're partnering with industrials

08:16and pharma companies or data centers,

08:19or even services,

08:20and if you're interested, please reach out.

08:33We have time,

08:34if you have questions for my friend.

08:37Do you want to answer to your question?

08:39Do you have questions for my friend, Théophile?

08:41No? Yes?

08:42We understand everything.

08:44It was quite clear.

08:49No?

08:50Okay.

08:51That's a wrap.

08:52Round of applause again for you.

08:54Ah!

09:01What technology do you use to simulate the environment?

09:05So we write,

09:06like you have to write the physics,

09:08we write low-level code for the physics,

09:10like how do objects interact with each other

09:13for rigid bodies, deformables,

09:15and we write rendering.

09:17Just like for video games,

09:18if you go from the 3D state of the world

09:19to how does the light interact with it,

09:21to bounce back to the camera.

09:23And so that's how we do it.
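
The two halves of the answer, physics and rendering, can be illustrated with a very small sketch. It is not the Genesis simulator: the physics part is a semi-implicit Euler step for a point mass bouncing on the ground, and the rendering part covers only the geometric pinhole projection, not light transport. All names and parameter values are assumptions.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]

GRAVITY = -9.81  # m/s^2 along the world z axis (z points up)


def physics_step(pos: Vec3, vel: Vec3, dt: float, restitution: float = 0.5) -> Tuple[Vec3, Vec3]:
    """Advance a point mass by dt, with a bounce on the ground plane z = 0."""
    vx, vy, vz = vel
    vz += GRAVITY * dt                      # update velocity first (semi-implicit Euler)
    x = pos[0] + vx * dt
    y = pos[1] + vy * dt
    z = pos[2] + vz * dt
    if z < 0.0:                             # contact: clamp to ground and reflect
        z = 0.0
        vz = -vz * restitution
    return (x, y, z), (vx, vy, vz)


def project(point_cam: Vec3, focal_px: float, cx: float, cy: float) -> Tuple[float, float]:
    """Pinhole projection of a point in camera coordinates (z forward) to pixels."""
    x, y, z = point_cam
    if z <= 0.0:
        raise ValueError("point must be in front of the camera")
    return (focal_px * x / z + cx, focal_px * y / z + cy)
```

A full simulator adds collision detection between many bodies, deformable-object models, and a renderer that traces how light reaches the camera, but each frame still follows this order: step the 3D state, then turn it into pixels.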

09:25So if you're working in video games

09:26and interested in building physics for robots,

09:29please reach out.

09:30We're hiring in robotics, AI, and simulation.
