How Adobe Builds And Trains Its Generative AI Models

Dr. Gavin Miller, the Head of Adobe Research, spoke at Imagination in Action's 'Forging the Future of Business with AI' Summit about how Adobe trained its generative AI models.

Transcript
00:00 So how do we see GenAI transforming business from an Adobe point of view?
00:05 And in particular, in Adobe research, what are we inventing that will help our customers
00:11 transform their businesses?
00:13 Broadly speaking, you can think of it in two broad areas.
00:17 On the left, there's the creation and editing of media.
00:21 This includes images, video, audio, and so on.
00:24 And on the right, there's the analysis and generation of campaigns and then the analysis
00:30 of response to those campaigns, which are often powered by the digital media that's
00:34 created with our mainstream media editing tools.
00:40 So one of the things that's come up today and always comes up in AI is where does the
00:46 data come from?
00:47 So of course, we exploit many different sources, including our stock photography business that
00:53 then gives us a highly moderated source of images.
00:56 But we're also interested in capturing unusual or unique data sets.
01:00 So on the left is a light stage that we commissioned.
01:04 We previously built our own, but the off-the-shelf one was actually better.
01:08 And this is used for capturing data sets under a variety of lighting conditions so we can
01:13 train models for relighting.
01:16 And then the topic of the next video is the image on the right, which is how do you get
01:22 ground truth matting data for natural subjects?
01:26 So this sounds easy.
01:27 We have blue screen, but blue screen is actually an under-constrained problem.
01:32 And we found that we were training things based on the output of previous algorithms
01:35 that were themselves trained.
01:36 And so we wanted to go back to ground truth and try to do that.
01:41 And the way we did that was to leverage an idea that came out at SIGGRAPH in about 2005
01:47 of using cross polarizers with two cameras.
01:51 So there's a new device from Sony which has little micro polarizing filters in each subpixel.
01:57 And if you do that, then hopefully the video will play.
02:01 One more click.
02:04 You can get really strong ground truth data.
02:07 So you actually use an LCD screen as the background, just displaying white.
02:12 And this camera gives you four channels of polarization-related information.
02:16 If you use it in the way that was previously published, you end up with very noisy images.
02:20 But we have a new breakthrough idea which is about to be presented at CVPR.
02:27 And these are some of the results from it.
02:28 And you can see that you get very, very clean ground truth data.
02:33 Even though this setup is not practical for doing a large movie shoot, we can capture
02:37 a data set that then lets us train better blue screen or green screen extraction algorithms.
02:43 So it's really an example of an AI-inspired data capture with a subsequent use case using
02:50 a derived model.
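
The matting idea described above can be illustrated with a minimal sketch. It assumes an idealized model that is not stated in the talk: the LCD backdrop emits fully polarized light, the subject contributes equally to both polarization channels, and the backdrop brightness is known. Under those assumptions each pixel follows the standard compositing equation I = alpha * F + (1 - alpha) * B, the cross-polarized channel blocks the backdrop, and the difference between the two channels isolates (1 - alpha) * B. The function and parameter names are hypothetical, and this is not the method being presented at CVPR.

```python
def alpha_from_polarization(i_parallel, i_cross, backdrop):
    """Estimate alpha for one pixel under an idealized polarization model.

    Assumed (hypothetical) model:
      i_parallel = alpha * F + (1 - alpha) * backdrop   # backdrop passes
      i_cross    = alpha * F                            # backdrop blocked
    so i_parallel - i_cross = (1 - alpha) * backdrop.
    """
    if backdrop <= 0:
        raise ValueError("backdrop brightness must be positive")
    alpha = 1.0 - (i_parallel - i_cross) / backdrop
    # Clamp to [0, 1] because sensor noise can push the estimate outside.
    return min(1.0, max(0.0, alpha))


def simulate_pixel(alpha, foreground, backdrop):
    """Synthesize the two channel readings for a known alpha (for checking)."""
    i_cross = alpha * foreground
    i_parallel = i_cross + (1.0 - alpha) * backdrop
    return i_parallel, i_cross


if __name__ == "__main__":
    par, cross = simulate_pixel(alpha=0.25, foreground=0.6, backdrop=1.0)
    print(alpha_from_polarization(par, cross, backdrop=1.0))
```

With a plain blue screen there is only one reading per pixel, so alpha and the foreground color cannot be separated without extra assumptions. The second channel is what removes the ambiguity in this simplified model.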
02:53 So one of the challenges if you have highly moderated data where you've removed all of
03:01 the trademark content, because that's part of your policy, is if you then sell it to
03:05 an enterprise that wants to create their trademark content, there isn't an easy way to do that.
03:10 So we have also developed fast algorithms for "train your own model."
03:15 So this takes our core Firefly generative model and then lets enterprises customize
03:21 it with their own content.
03:23 And so they won't accidentally create content from any other vendor in their backgrounds
03:28 or anything, but they'll definitely have full control and end up with a very high quality
03:33 example where they upload their own data to the server.
03:37 They get to have a versioned model which is just for them to use.
03:42 And so we have a multi-tenant model for this.
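
As a rough illustration of what per-customer, versioned models in a multi-tenant setup can look like, here is a minimal sketch. It only records uploads and version numbers and does no training. The class names, method names, and storage layout are all hypothetical and say nothing about how the actual service is built.

```python
from dataclasses import dataclass, field


@dataclass
class TenantModels:
    """Versioned custom-model records belonging to a single tenant."""
    versions: list = field(default_factory=list)

    def add_version(self, training_images):
        # Each upload of training images is recorded as a new version number.
        version = len(self.versions) + 1
        self.versions.append({"version": version, "images": list(training_images)})
        return version


class ModelRegistry:
    """Maps tenant IDs to their private, versioned model records."""

    def __init__(self):
        self._tenants = {}

    def register_upload(self, tenant_id, training_images):
        tenant = self._tenants.setdefault(tenant_id, TenantModels())
        return tenant.add_version(training_images)

    def get(self, tenant_id, version):
        # A lookup only ever searches the records stored under this tenant ID.
        tenant = self._tenants.get(tenant_id)
        if tenant is None or not 1 <= version <= len(tenant.versions):
            raise KeyError(f"no model v{version} for tenant {tenant_id!r}")
        return tenant.versions[version - 1]
```

Keying every lookup by tenant ID is one simple way to make sure one customer's uploads never surface in another customer's model.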
03:48 So here you just upload the images.
03:52 And then once you've done that, you can-- we have a pseudo brand that we invented just
03:58 to demonstrate the idea without using proprietary content called Drip, which is a drinks brand.
04:05 And then once you have it, you can use it to generate large variations and so on.
04:13 So one of the things about scaling content is that we want to generate multiple variations
04:20 for all different form factors of display and different styles.
04:25 So here is an example of this, where it's sort of the combinatorial explosion of customizing
04:31 something for a target segment and then all of the devices that that segment might have.
04:37 And it rapidly means that creative people are not doing creative tasks and get grumpy
04:42 about it.
04:43 And also, it's easy to either make a mistake or just burn through a lot of budget doing
04:49 that.
04:50 Whereas by doing this automatically with GenAI, it's one of those everyday tasks that
04:54 lets them focus on being creative.
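
The combinatorial explosion mentioned above is easy to see with a small sketch. The segment, form-factor, and style values below are made up for illustration, since the talk does not list specific ones.

```python
from itertools import product

# Hypothetical campaign dimensions; the talk does not list specific values.
segments = ["students", "commuters", "gym-goers"]
form_factors = ["phone", "tablet", "desktop", "billboard"]
styles = ["minimal", "bold"]


def enumerate_variants(*dimensions):
    """Return every combination of one choice per dimension."""
    return list(product(*dimensions))


variants = enumerate_variants(segments, form_factors, styles)
print(len(variants))  # 3 * 4 * 2 = 24 assets from one base design
```

Adding one more dimension, such as language, multiplies the count again, which is why producing every variant by hand becomes expensive so quickly.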
04:58 So one of the other variations that you have to think about, of course, being a global
05:02 company, is all of the different language versions that you need to do for anything
05:06 related to text, audio, and video.
05:09 And so the grandest challenge is probably redubbing videos in other languages.
05:15 And so we have developed models to do that for a wide variety of spoken languages and
05:22 also reanimate the lips to match.
05:25 So this isn't creating arbitrary video from text.
05:30 It's really taking a pre-existing recording, translating it, and so on.
05:35 So if the demo gods are friendly...
05:38 What's up, Zeyun?
05:39 What are you doing?
05:40 Hola, estamos trabajando en una manera fácil de traducir y doblar videos. [Spanish: "Hi, we are working on an easy way to translate and dub videos."]
05:41 Wait, what?
05:42 Oh, c'est une traduction générée par l'IA. [French: "Oh, it's an AI-generated translation."]
05:43 I'm working on this with my team, the Speech AI team at Adobe Research.
05:44 Hi.
05:45 Can we talk about the translation?
05:46 Yes, of course.
05:47 So, we're going to be using the speech AI team to translate the video.
05:51 And we're going to be using the speech AI team at Adobe Research.
05:53 Hi.
05:54 Can we take a sneak into what that means?
05:57 Sure.
05:58 With this technology, you can upload your video and have it dubbed and translated into
06:02 various languages.
06:03 It will match your voice and it will generate new lip motions to match those languages.
06:08 Wait, wait, wait.
06:09 This is easy.
06:10 We're very happy about it.
06:11 We're very excited about this.
06:12 It's easy.
06:17 We're so excited about this.
06:22 Keep on hearing about it.
06:23 So there, he's wonderful.
06:24 He has great energy.
06:25 He doesn't speak all those languages, but we do have some people who do, so they can sort
06:26 of sanity-check it.
06:27 I think the main thing there is not only is it very fast and efficient, particularly for,
06:28 say, lower budget movies where you want to be on social quickly, but it captures the
06:29 voice of the original performer rather than having another actor speaking in a foreign
06:30 language.
06:31 So in some ways, it's very, very powerful.
