- 5 weeks ago
Less is More: How Specialized AI Solutions Drive Business Value
Category: 🤖 Technology

Transcription
00:26Subtitles by MFP.
01:00I'm going to ask them to introduce themselves, and then we're going to get cracking with the discussion.
01:04So first of all, Kate, would you be able to introduce yourself?
01:06Hey, everyone.
01:07My name is Kate Soule, and I lead technical product management for IBM's large language models that we train from
01:13scratch.
01:14They're called Granite.
01:15Really excited to be here today.
01:16Thank you, Kate.
01:17And Jarek?
01:18So hi, I'm Jarek, CEO and founder of DeepL, and we help companies communicate in all of the different languages
01:25that are out there by translating them.
01:28Fantastic.
01:29And Suzanne?
01:29Thanks, Oliver.
01:31It's great to be here.
01:31I'm Suzanne.
01:32I lead the product marketing team at AWS for our AWS AI services.
01:37Brilliant.
01:38Now, I want to start by really defining what we're talking about, because I'm sure most of the audience are
01:44quite comfortable with general purpose AIs like ChatGPT and Perplexity, but maybe not specialized AIs.
01:50So, Kate, maybe you can kick us off by defining exactly what we mean when we say specialized AI.
01:56Sure thing.
01:57So, I don't think it's any secret that AI and these large language models are trained on huge amounts of
02:04general data that is basically found on the Internet.
02:07When we talk about specialized AI, at least at IBM, what we're really focused on is how do we start
02:13to use data that isn't readily available, data that is domain-specific and task-specific,
02:18data that lives behind, you know, the four walls within an enterprise that's proprietary and differentiated, and use that data
02:26to train your models instead.
02:28Fantastic.
02:29So, what, I guess, what goes into those models?
02:33When we talk about data, the infrastructure, the team, what's under the hood?
02:37What's the process like?
02:38Yeah.
02:38So, with specialization, there's a whole spectrum of options and things that you can do when it comes to bringing
02:45enterprise or specialized domain data into a large language model.
02:49I mean, there are patterns that you can do on one end of the spectrum, things like RAG, retrieval augmented
02:54generation, where you're not actually changing the weights of the model themselves.
02:58You're bringing your enterprise data and injecting it basically into the prompt at real time.
03:03And then, on the other end of the spectrum, you're actually starting to make changes to the model.
03:07You're continuing to train it.
03:09You're continuing to fine-tune it based off of this specialized data.
03:12And that requires more compute, but also can lead to higher performance in some cases.
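The RAG end of the spectrum Kate describes can be sketched in a few lines. This is a toy illustration, not any vendor's API: the word-overlap retriever and the documents are made up, and a real system would retrieve with embeddings and then send the assembled prompt to an actual model.

```python
import re

def tokens(text):
    """Lowercase word tokens with punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, top_k=2):
    """Toy retriever: rank documents by word overlap with the query."""
    q = tokens(query)
    return sorted(documents, key=lambda d: len(q & tokens(d)), reverse=True)[:top_k]

def build_rag_prompt(query, documents):
    """Inject retrieved enterprise data into the prompt at request time;
    the model's weights are never touched."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Made-up enterprise documents for the sketch.
docs = [
    "Refund requests must be filed within 30 days of purchase.",
    "Cafeteria opens at 8am on weekdays.",
    "Refund approvals above 500 EUR require manager sign-off.",
]
prompt = build_rag_prompt("What is the refund deadline?", docs)
```

The key property is visible in the sketch: only the prompt changes per request, while the model stays fixed, which is exactly what separates RAG from the fine-tuning end of the spectrum.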
03:17Interesting.
03:18So, there's a real range that companies can look at doing.
03:21And, Suzanne, do you broadly agree with the definition that we've put together so far?
03:26Do you have anything else that you'd bring in on your side?
03:28Yeah, think of specialization as domain-specific or use case-specific, so that you're actually getting
03:36better accuracy, better performance out of a model for your specific use case.
03:43And at AWS, we've espoused that from the outset because we've said, you know, not one model will rule them
03:49all.
03:50And this is the best example of that.
03:52And it's true for enterprises and all sizes of companies, right, and all sizes of organizations because every organization has
03:59deep knowledge and expertise about their customer, about their geography, about their language.
04:05And so, we've made a real effort to make that freedom possible from the get-go and not sort of
04:12focus on just one, you know, tool to solve all the problems.
04:16And that's kind of what Kate was alluding to there as well.
04:19So, think of it really as use case, domain, and your specific data that allows you to customize and focus
04:27your model for your use case.
04:29And because they have all that data, going to a general purpose solution will never be the best option
04:34to solve their challenges.
04:36What is the moment when a client realizes that they need something that's more tailored around what they're doing?
04:42I think it's what Kate was starting to get at.
04:44It's when you know something about a domain, whether it's a healthcare space or a very specific area where the
04:51data isn't readily available in the general world out there.
04:54And so, if you think about an example like EvolutionaryScale, they trained a specific model specifically focused on all
05:02the data that's encoded in evolution.
05:05Every protein, you know, potentially that you could imagine is encoded in their model.
05:11And that's the kind of specialization that you can only do if you have specific data, you train a specific
05:18model,
05:18and then you deploy it in a different set of use cases, in this case for developing new kinds of,
05:24you know, healthcare solutions and drugs that will help us all, you know, get healthier.
05:29Fantastic.
05:30I'm so glad you brought up some sort of concrete examples because I think it's so important in a conversation
05:34like this to talk about the actual business cases and businesses that are using specialized AI.
05:41Obviously, Jarek, we're very lucky to have you on stage with us, and DeepL has been doing exactly this.
05:47Can you give us an overview of what it took to create the AI that sits behind what you're doing?
05:53And tell us a bit about your services as well, because I think people want to know.
05:57Yeah, I mean, we started off with this idea of specialized AI in 2017.
06:04Like, this was how we built the company, and for us, it was always about this, you just said it,
06:10like the business case.
06:11Like, what is the real problem that we're solving?
06:13And for us, for me also, with technology, this is always about that.
06:18You don't start with the technology, you start with the problem that is out there.
06:23And then you're trying to optimize the technology for exactly this use case, for exactly this problem that's out there.
06:30And translation is incredibly broad.
06:33There's so many different applications that range from just quickly translating that WhatsApp message that you want to get out
06:40to your business partner,
06:44over to translating incredibly complex technical documentation or legal contracts, where every word matters and where accuracy is really, really
06:54key.
06:54And what we're trying to do at DeepL is kind of package this whole stack together of understanding what our
07:02customers need translation for
07:04and figuring out how AI can help better in that.
07:07And at the same time, do cutting-edge model development, making sure that the models that we build are, like,
07:15perfect at multiple languages,
07:17able to figure out all of the nuances of those languages just like a native, but at the
07:27same time maintain accuracy.
07:29So it's a lot about a problem domain and kind of the classical enterprise, business, SaaS questions,
07:37and at the same time combining that with research and how does AI help in those workflows actually in the
07:44next year and the next three years.
07:47And the work you started, you mentioned when the business began, it obviously predated a lot of the general purpose
07:54excitement that's happened in the last few years.
07:56But when you look at what's out there now, could DeepL have been done with the general purpose models you
08:05see now,
08:05or was a specialized approach always the most viable path?
08:11I think if we were starting right now, we would probably take existing large language models,
08:16take whatever is available in the open source scene and build on that.
08:21I think that's just logical in a way.
08:24We had to start from scratch a little bit earlier.
08:26We had to kind of build the basics when the transformer model was not out there yet, even.
08:32So obviously when you come in with that spirit, you're doing a little bit more on your own.
08:37And as a company, you also build out more of this DIY spirit.
08:44But nowadays, it would probably be possible to accomplish this also using some of the models available
08:53out there on the market.
08:54You'd still have to put a lot of work in making sure that they work in such a nuanced way
09:01with language.
09:03It's not what those models are usually trained for so strongly.
09:08Interesting.
09:09We were speaking just off stage a few minutes ago about obviously the excitement and what it's like being a
09:13founder of a business that predated this excitement.
09:16Tell us a bit about how it has felt over the last two or three years where all of a
09:20sudden the world is so interested in what companies like yours are doing.
09:26Yeah, that's been an accelerator to the craziness that happens around a hyper-growth company anyway.
09:33I mean, we were already growing very, very strongly, and doing so out of Europe as one of the
09:38few AI companies in Europe, already before the kind of AI boom.
09:44And I think the AI boom just kind of accelerated a lot of that and made the environment even
09:51crazier.
09:52And that's for the good, because it obviously creates a lot of awareness for those problems with our customers.
09:59Before the advent of general purpose AI,
10:02we would even sometimes ask ourselves, do we want to use the term AI or not?
10:09Because maybe it was not even that cool.
10:12And people associated it with risks actually rather than the benefits that it brings.
10:18So it was more focused on the use case.
10:21I think that has changed obviously.
10:24And therefore had quite a strong impact on how the company works.
10:30On the other hand, we've been always going against very strong competition from big tech.
10:36And therefore from that perspective, not a lot has changed for us.
10:40Like this competitive space has remained pretty much the same.
10:44Fantastic.
10:44Thank you.
10:45Kate, while we're talking about examples, are there any other client examples that you can talk about where a company
10:51has decided to go down the specialized AI route?
10:55Absolutely.
10:56So I'll say first that IBM is one of the biggest users of specialized AI.
11:01We're in France.
11:02So instead of saying we eat our own dog food, I'll say we drink our own champagne.
11:05And that we use Granite, our smaller large language model.
11:10We specialize it with IBM data to run, for example, internal chatbots, HR functions, and other processes.
11:16We've also worked with key customers like KPMG, who built specialized AI solutions using watsonx and Granite in order to
11:25improve the model's ability to audit medical records in an ambulance delivery service.
11:32So you talk about different areas where you need either specialized models, specialized domain expertise, rather, or you also need
11:40a smaller model that can run efficiently.
11:42And we see those both being really big reasons to drive our customers towards smaller, specialized models.
11:49Fantastic.
11:50And Suzanne, on your side, any other examples that you can share with us?
11:53I mean, the way we think about it is there's this triangle between cost, accuracy, and latency, which you were
11:59starting to talk about.
12:01And it's really balancing those, right?
12:02Because you want an amazing performance in your application so that responses are fast and exciting for the user.
12:09You want the cost to be appropriate for the product you're developing.
12:13And then, of course, you want the accuracy that's required for your use case, which varies vastly depending on exactly
12:19what you were describing.
12:20Is it a contract or is it, you know, a social media post and kind of somewhere in between?
12:25And so what we've seen a lot is customers combining models actually in an application.
12:30So it's not that one model will solve the entire use case.
12:34And so what we see, for example, is TUI, right, a big travel provider, very well known in Europe, obviously.
12:41And they combine Llama models with our Amazon Nova models.
12:46And they did that to get a much better cost profile for their overall application, much faster response times with
12:53the faster model serving up the responses to the end user.
12:57But then using the Meta Llama model to work on some of the content and content curation.
13:02And so it's that combination of models that really drives or is actually behind a lot of the production use
13:09cases.
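The pattern Suzanne describes for TUI — a fast, cheap model serving user-facing responses while a larger model handles heavier content work — is essentially request routing. A hedged sketch; the model names, costs, and latencies below are invented for illustration, not real product figures.

```python
# Illustrative model catalog; names and numbers are made up for the sketch.
MODELS = {
    "fast-small":    {"cost_per_1k_tokens": 0.2, "latency_ms": 120},
    "large-general": {"cost_per_1k_tokens": 3.0, "latency_ms": 900},
}

def pick_model(task_kind, latency_budget_ms):
    """Serve interactive traffic from the cheap, fast model; use the
    large model only for heavier work that can afford its latency."""
    if task_kind == "interactive":
        return "fast-small"
    if latency_budget_ms < MODELS["large-general"]["latency_ms"]:
        return "fast-small"
    return "large-general"
```

In this toy routing policy, a user-facing autocomplete call would go to `fast-small`, while an offline content-curation job with a generous latency budget would go to `large-general`.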
13:09And it's remarkable to see how quickly a lot of these use cases are moving into production in just two
13:15years, right?
13:16I really think we're out of the purgatory of POC phase.
13:21And the customers who are most focused on use cases that they can deploy that are tied to something meaningful
13:27for their business are moving the fastest.
13:30And that's where TUI is a great example.
13:32Fantastic.
13:33I just really want to spell out those advantages that you mentioned because I think it's important.
13:37It's not just the domain knowledge that these specialized models have.
13:42They're actually more efficient, smaller potentially.
13:45Yeah, absolutely.
13:46So more responsive for the client.
13:48More responsive in the UI, in the experience that you're trying to create, right?
13:52If you're, you know, and it depends a lot on your use case.
13:54And this becomes heightened in the agentic era where you really need to think about how do you want this?
13:59Is this a real-time kind of experience?
14:02Is this something that's running in the background?
14:04And that's really critical because you need to kind of optimize around that.
14:08Otherwise, your users won't, you know, accept it.
14:10We're in a time where people expect answers immediately.
14:14And so you need to really curate that.
14:16And using multiple models like, you know, we support through Bedrock where the granite models also are accessible is really
14:23the way that more folks can mix and match and get the best outcome.
14:26Well, maybe just to pull on a point you were saying, Suzanne.
14:30So it's not to say that you can't, when you're experimenting, for example, in prototyping, build something on a really
14:36large general purpose model.
14:37It'll probably work really well.
14:39But then you need to get into deployment, and all of a sudden you're looking at real-world obstacles around
14:44cost, latency, efficiency.
14:46And so we're seeing specialization as a way to start to give tools to that larger model where you can
14:51kick some of the workload to the cheaper, faster model in order to reduce your costs while maintaining the same
14:57level of performance.
14:58And so as more and more customers and users go from that experimentation phase to deployment, you know, I think
15:05smaller, specialized models are going to be critical for actually getting deployable AI that you can realize the value from.
15:11I mean, one concept maybe we should add to your earlier list around RAG and fine-tuning is distillation.
15:18This idea of taking, you know, the best answers that you've curated for your large model in your prototyping phase
15:27and actually kind of think of it as almost sort of pouring them into the smaller model so then you
15:32can use or expect the same kind of accuracy from a much lower latency, much lower cost model.
15:40And so that's also a huge opportunity for customers.
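Distillation, as Suzanne outlines it, trains a small student to imitate a large teacher, often on the teacher's softened output distribution rather than hard labels alone. A minimal sketch of the soft-target step; the logits here are made-up numbers, and a real pipeline would train the student with cross-entropy against these distributions.

```python
import math

def soft_targets(teacher_logits, temperature=2.0):
    """Temperature-scaled softmax of the teacher's logits. A higher
    temperature smooths the distribution, exposing the teacher's relative
    confidence across answers for the student to imitate."""
    scaled = [l / temperature for l in teacher_logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up teacher logits over three candidate answers.
teacher_logits = [4.0, 1.0, 0.5]
soft = soft_targets(teacher_logits, temperature=2.0)
```

The student sees not just which answer the teacher picked, but how the teacher's confidence is spread across the alternatives, which is what lets a much smaller model approach the teacher's accuracy at far lower latency and cost.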
15:45Yeah, I'm sure everyone in the audience, you know, understands when we ask ChatGPT a very detailed question and we
15:50have to wait for that answer and it's working through it.
15:53But if you're speaking to your travel agent or your banking app, you don't necessarily want to wait for that
15:58to happen.
15:59You want it to be immediate.
16:01Jarek, on your side, when it comes to responsiveness and speed and efficiency, I know accuracy is obviously key, but
16:08do those factors also play when you think about the technology you're developing?
16:12Super important.
16:13I mean, like everybody expects translation to just work immediately and that's already for the text translation case.
16:23If you're thinking about real-time voice translation, if you're thinking about kind of translating a Microsoft Teams or Zoom
16:31meeting that's going on, you have to be even faster.
16:35And that includes the whole stack from speech recognition, translation, potentially text-to-speech or all of that package into
16:43a model and with all of the latency coming in and presenting it on the screen to the user.
16:48So, you've got to fight for every millisecond that's out there in those use cases for best readability or for
16:56best comprehensibility.
16:57So, we're running a wide range of different model sizes depending on the use case and whether it's free, paid,
17:09and kind of figuring it out per use case.
17:15Yeah, yeah.
17:16A few of you have mentioned agentic AI, and I know it was teased in the description for this session
17:21that we were going to talk about this topic.
16:23Obviously, with the rise of agentic systems that can act more autonomously, I guess, does that make a
17:32stronger case, if you like, for more specialized AI?
17:36Can someone, Kate, could you explain how that works?
17:40Yeah, so I think the really important word you said there was systems.
17:43So, we're moving from, it used to be, you know, ChatGPT: you'd send one task, you'd have one prompt, it'd
17:49go to a really big model, we'd increase the model size until we got satisfactory performance, and then you'd get
17:54one response back, basically.
17:56And we're now moving with agents into systems of models working together, and a lot of what happens behind the
18:03scenes with these systems is task decomposition.
18:06So, now, all of a sudden, we take a thorny problem, we break it down into parts, and each part
18:11could relate to a specialty of a model that you might have that's specialized.
18:16So, I think with the rise of agents, we're only going to see more and more need to drive towards
18:20some of these smaller specialized models acting almost as tools in an agentic setup,
18:26just like an agent would have a tool to go and do a web search, it could have a tool
18:30to phone a friend, call in a specialized model so that it can answer the question faster, more cheaply with
18:36that specialized knowledge.
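Kate's "phone a friend" idea — decompose the task, then call a small specialized model as a tool — can be sketched like this. Everything here is a hypothetical stand-in: the planner is a keyword router, and the "specialists" are stub functions where real specialized models would sit.

```python
def translate_specialist(payload):
    """Stand-in for a small specialized translation model."""
    return f"[translated] {payload}"

def summarize_specialist(payload):
    """Stand-in for a small specialized summarization model."""
    return f"[summary] {payload[:24]}..."

TOOLS = {"translate": translate_specialist, "summarize": summarize_specialist}

def plan(task):
    """Trivial stand-in planner; a real agent would use a model here
    to decompose the task into (tool, payload) steps."""
    return [(name, task) for name in TOOLS if name in task]

def run_agent(task):
    """Decompose the task, then dispatch each sub-task to a specialist
    'tool' model rather than one large generalist."""
    return [TOOLS[name](payload) for name, payload in plan(task)]

results = run_agent("translate and summarize the quarterly report")
```

The dispatch table is the point: each sub-task lands on the cheapest model that knows that domain, just as an agent would call a web-search tool.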
18:38Fantastic, thank you.
18:39Suzanne, anything to add on that?
18:40Yeah, I mean, that's, I mean, we see a lot of companies experimenting with that, in fact.
18:45So, like, Dentsu Digital, large advertising corporation, they're thinking, for example, about how to have one specialized agent that's focused
18:54on audience analysis,
18:55and so really trying to dig deep in the audience.
18:59That agent works with an agent focused on gathering the data that's needed to make those analyses.
19:06And then there's another agent that's actually focused on generating potential creative options and then kind of circling back with
19:14the others to see,
19:14do you think this will work for this audience?
19:16And then the agent might say, I need more data, you know, on this particular segment, go find it.
19:21And then they refine.
19:22And so it's this kind of interesting interplay between these different models and experts.
19:27And then overall, of course, you know, you're curating how, what the outcomes are that you want.
19:31But this kind of multi-step process with multiple agents, multiple models is where we're headed.
19:38And it's a very exciting next phase where there's, you know, these models are expanding and the agents' capabilities are
19:45expanding.
19:46Jarek, is this a place where DeepL is looking?
19:50What are your thoughts on Agentic?
19:52Yeah, I think there's two aspects to that.
19:54And the first one being obviously, as Kate has been saying, like specialized models like our language-aware models,
20:02they can fit in very, very well into this overall agentic scheme where the agent leverages those models to get
20:09the best possible answer for the task at hand.
20:12So that's the one thing.
20:13And we're seeing more and more platforms coming up that are going to be kind of putting together all of
20:19those puzzle pieces with Bedrock
20:22and making that a kind of complete, kind of in a way like a marketplace even for those agents to
20:29pick the solutions that they need.
20:31On the other hand, we are also looking into how do the kind of language-based workflows work for our
20:38customers
20:39and how we can disrupt some of those ways that translation has been done in particular areas of a company
20:49until now
20:50and how it can be driven now by agents.
20:53And I think that's going to even more so democratize this kind of access to translation that we've been already
21:00pioneering over the last years.
21:02It's going to allow pretty much any department, any team to kind of run sophisticated language-related tasks,
21:11maintaining the versioning of maybe the documentation that they're translating and empowering all of that in such a simple manner.
21:21And I think the kind of specialization also kicks in there a little bit, just bringing it back to the
21:26theme of this panel once again.
21:29I mean, the task decomposition, the task planning aspects of agentic AI are incredibly compute-intensive right now,
21:38also incredibly expensive.
21:40And we are amazed by the possibilities of those bigger models right now.
21:45I think at some point in time we'll have to see, especially for those more repetitive tasks,
21:50how do we find efficiencies there?
21:52Because sometimes now running a task through an agent might be even exactly as costly as having a human do
22:00that task.
22:00And that's not really what this was meant for.
22:03So figuring out better hardware to do that, figuring out better systems,
22:08but also creating specialized models that can run those particular repetitive tasks
22:13might actually be a big avenue to innovate on.
22:17Absolutely. And I know hardware is a big sort of piece of this discussion.
22:22I know Amazon has done a lot of work in this space, so you obviously have skin in the game.
22:28Can you tell us a little bit about what you're doing in the hardware space when it comes to specialized
22:33AI?
22:34Yeah, I think it's really, really critical that customers have options when it comes to their compute platforms.
22:41And that's something that AWS has been investing in for a long time with our Annapurna team
22:45and looking at our new specialized chips.
22:48AWS Trainium in particular has really helped customers lower their inference costs by up to 35%.
22:56And so one of the things that's critical about that is having other compute platforms
23:01that you can then deploy on to try to drive down your inference costs so you can scale faster.
23:06And we leverage those, that hardware, and we want to provide that to our customers as well
23:11and make it readily accessible, whether that's for startups like Hippocratic AI, another great example
23:16in the healthcare space that wants to train bespoke models for that domain.
23:21They take advantage of HyperPod and other tools so that they can access more options when it comes to compute.
23:28And so making compute accessible, making more options accessible through platforms like Trainium
23:34is absolutely critical as we continue to scale and develop more meaningful use cases for AI and agentic AI.
23:42Jarek, obviously operating a business with those kinds of costs of running.
23:47When you hear about the hardware piece, what's your view on this?
23:52Yeah, I mean, it's an incredibly important part of our overall setup as a company, of course.
23:57And we've been investing quite a lot into both our capabilities to run data centers and build data centers.
24:06And we have some of our partners from NVIDIA sitting here in the audience, by the way, that have been
24:11super helpful with that.
24:14So that's one part.
24:16And on the other hand, like we are obviously always looking into how can we utilize this hardware most efficiently.
24:23And that's where kind of the software part kicks in.
24:26That's where our teams are working incredibly hard to make sure that this compute footprint that we have,
24:33which is both a cost factor for us, but it's also a question of sustainability,
24:38is as low as possible.
24:41And that requires quite often a very tight kind of integration between the hardware, software,
24:48how our models are built.
24:49That all needs to really fit in so that every bit and piece, every CUDA core of this chip
24:56is utilized to the max all of the time, basically.
25:01Kate, what do you think about this conversation?
25:03Do you agree that the specialized future is a marrying of hardware and software all coming together in a way?
25:13So I definitely agree.
25:15And I'm also really excited about the opportunities to co-optimize across the entire stack, right?
25:20Building models that are optimized for the hardware, building software that's optimized for the entire stack down.
25:26At the end of the day, where I think about hardware is with specialized models,
25:31we're seeing models are getting smaller while maintaining performance.
25:35One of the ways that we see the next kind of wave of iteration happening and innovation happening
25:40is by running more inferences with those smaller models.
25:43And so we're seeing a lot.
25:44This theme is called inference-time compute.
25:47You run a model more times in order to improve your response.
25:50And I think that's going to be really important with small models as a new way to continue to boost
25:55performance.
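Inference-time compute, as Kate describes it, often takes the form of best-of-n sampling: run the small model several times and keep the answer a verifier scores highest. A toy sketch; both `generate` and `score` are hypothetical stand-ins for a sampled model call and a reward model.

```python
import random

def generate(prompt, seed):
    """Stand-in for one sampled completion from a small model."""
    return f"{prompt} -> draft {random.Random(seed).randint(0, 9)}"

def score(candidate):
    """Stand-in for a verifier or reward model; here it just prefers
    the highest draft number."""
    return int(candidate[-1])

def best_of_n(prompt, n=8):
    """Spend more inference compute on a small model: sample n answers
    and keep the best-scoring one, instead of calling a bigger model."""
    candidates = [generate(prompt, seed) for seed in range(n)]
    return max(candidates, key=score)

answer = best_of_n("summarize the contract")
```

The trade Kate points at is explicit here: n cheap inference passes plus a scorer, instead of one pass through a model several times the size, which is why hardware optimized for small-model inference speed matters.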
25:55And so where hardware comes in, I think we're going to need whole new sets of hardware
25:59that's specifically optimized for inference speed of smaller models
26:03compared to just training super big models or running inference on really big models.
26:08And there are different decisions that you make when you're trying to solve for that problem.
26:12So I'm really excited by some of the new chips that are coming out
26:15and that will promise some of these advantages.
26:18Fantastic.
26:20I really want to leave our audience with some closing thoughts as we come towards the end of the panel.
26:25We've very much discussed the opportunities of specialized AI
26:30and identified many of them, some great examples.
26:33My question towards the end of this conversation is,
26:36are we heading for a reality where general purpose AIs become less prominent?
26:42They become the fallback if the specialization doesn't work.
26:47Is the real innovation over the next years going to be happening in this specialized space?
26:54Suzanne, what are your thoughts?
26:55Well, I think we're definitely moving from the sort of "AI that knows" phase to the "AI that does" phase, right?
27:04And as that becomes kind of the core of what enterprises and organizations want, right?
27:09They want AI that does, that executes, that completes work, completes tasks.
27:14As we move towards that phase, we will see a continued leveraging of a combination of models.
27:21These, you know, powerful models aren't going to go away.
27:25They're going to continue to be a critical piece of any enterprise's AI strategy.
27:30There's so many horizontal use cases that are critical in an organization.
27:35So that's absolutely going to continue.
27:37But they're going to be combined, almost like Kate was saying, an extension of the system with these specialized capabilities.
27:44Just think of all the different departments in an organization that will have kind of their knowledge instilled in different
27:50models.
27:51And those will collaborate with each other.
27:53So we'll see both.
27:54We'll see both evolve in parallel, which is why it will be so critical to continue to support that freedom
28:00of developers in organizations
28:02to access all these models safely and securely, to build new programs, new products with all of that combined strength.
28:11And so I think that's what we'll see over the next couple of years.
28:13Fantastic.
28:14Very interesting.
28:14Thank you.
28:15Jarek, what's your view on this?
28:17Yeah, I think I would agree in general.
28:19Like specialization is going to be an important part.
28:22But there is a place for general purpose models, too, especially if you look at this kind of broad range
28:28of different domains
28:29and different tasks that you want to have models for.
28:33We're not going to be able to build specialized models for all of those cases.
28:38There's obviously very prominent and very important use cases where translation being one of those, I think,
28:45where it makes sense to build a company around that or to really invest money into getting the best possible
28:51solution
28:51just because it's such an important piece there.
28:55But I don't think that there's going to be a company that's going to be building models that can write
29:01poems
29:01in the style of, I don't know, West Coast rap.
29:07That needs a model that can handle like this long tail of very, very different use cases.
29:13And in general purpose is going to be incredibly important for that.
29:17And then as we're moving into this AI is doing things phase, I think the focus on quality, on accuracy
29:27is going to be even stronger
29:29because every failure, every mistake that the AI is going to be making is going to potentially have even bigger
29:36implications than it has right now.
29:38If it's right now, it's assistance to the human, we can still catch the mistake and we can still figure
29:45it out.
29:45It costs us time.
29:46So every time it's not great if it happens.
29:49But if the AI is really truly autonomous, then we're going to have to be even more careful about that.
29:55And I think specialized models are able to not only provide higher quality, but provide this higher quality in a
30:04kind of more guaranteed way even.
30:06On a quality level that's more stable across the wide range of inputs that you might be getting.
30:11Just because the models have been trained for this particular task and you can do QA much better, you can
30:17just guarantee that in a better way.
30:21So it's going to be an interplay of both solutions, I think.
30:25I liked the point Suzanne made about the departments.
30:28If I think about the departments of the organization, you know, the specialized technology that the marketing department uses versus
30:35the legal department versus the translation department.
30:39Really interesting.
30:40Kate, what are your thoughts?
30:41I do agree.
30:42I think it's going to be a mix.
30:44But maybe to take a little bit more of a contrarian opinion just because it's the last part of the
30:48panel.
30:50You know, I do wonder if this is just a stepping stone towards a broader future of more efficient AI.
30:56So right now, we're all learning how to make smaller, more efficient models through specialization.
31:02I think we've failed if all that means is now we've got a proliferation of many different models, each specialized
31:08for different tasks.
31:09I mean, that's exactly what we were trying to avoid.
31:12Like, the promise of the original large language models was one model.
31:15I don't need to do additional training.
31:17It can automatically apply to all these different tasks.
31:21I can ask it to write a poem.
31:22I can ask it to write a letter.
31:23I can do all these different things without any additional training.
31:26And so there's something unsatisfying if the future is just reverting back to now I've got a model for every
31:32single task that I want to do.
31:34So where I think we're going to have the incentive to continue to evolve is, yes, specialization.
31:40I think this is, you know, a really exciting area for the now, particularly for high-volume tasks where there's
31:46a lot of benefit if you can bring the marginal costs down.
31:50There are higher fixed costs because you have to host an additional model.
31:53But where I really want to see the field go is how do we get to much more efficient AI
31:59where you can specialize, you can infuse domain, but you're not having those narrow guardrails of this is now trained
32:05for this task.
32:06Or this is now an expert on, you know, medical data and you need a different expert on finance data.
31:11And if you're, you know, a medical company dealing with the finance department, you want a model that's specialized on
32:16both.
32:16So I'm really hopeful that we'll be able to get to smaller, more efficient AI using specialization as a part
32:23of many different techniques that will help us get there.
32:26A hint of optimism maybe for the general purpose.
32:29I'd like to close by asking each of you, I've not really spoken about your roles in particular.
32:33I'd love to hear what you're working on over the next year.
32:37Suzanne, what are your priorities at the moment?
32:39Well, we're definitely focused on agentic AI and really especially helping builders go to the next level with this combination
32:47of models.
32:48So we recently focused on, for example, a new SDK we released called Strands SDK to help developers orchestrate their
32:57agentic workflows.
32:58So my team, we're 100% focused on communicating and talking about this new world and helping to break it
33:05down in a way that's understandable and accessible and then launching all these products.
33:11Fantastic. Thank you.
33:13And Jarek, DeepL, what are your priorities for the next year?
33:16Yeah, actually quite a few things.
33:18I mean, the next frontier in terms of translation in general, taking
33:24a horizontal look at it, is really spoken language.
33:27And we launched DeepL Voice last year, and that's really an amazing addition to the overall platform that we're
33:34offering.
33:35Then on the other hand, and there were quite a few conversations here around fine-tuning and how every company
33:42wants to have, or maybe needs, a specialized model.
33:46Most of our customers are just not equipped to build their own models, or train or fine-tune them.
33:52And still they want to have DeepL behave in a way that is more tailored to their needs,
33:58one that is not just an average-of-the-internet translation.
34:02And a lot of our work goes into that.
34:04How can we build scalable, easy to use ways for our customers to get a translation that's really tailored to
34:14them?
34:14And then the third point, and I'm going to close with the third one, no worries, is how we can
34:22power those workflows for our customers.
34:26How does translation tie in into a broader space of what our customers are doing?
34:32How can we capture those use cases on a kind of higher level than we're doing it right now?
34:39And we're talking a lot to our customers right now about how we can help them with that.
34:44Fantastic. Thank you. And Kate, yourself?
34:47So it probably doesn't come as a surprise then that what I'm trying to focus on and what we're really
34:51excited about is inference efficiency.
34:53How do we get more efficient AI, particularly around agents?
34:58So using techniques like specialization, but also, you know, improving model architecture.
35:03So we just released a couple weeks ago a preview of our Granite 4 models that are over 80%
35:09more memory efficient, can run on a, you know, $300 GPU.
35:14So really looking at how can we enable folks to be able to run agentic workloads in a much more
35:20cost effective manner and build them in a way that allows them to take advantage of specialization and other capabilities
35:27that are coming to small language models.
35:29Fantastic. Thank you very much. Ladies and gentlemen, we've covered what is specialized AI.
35:35We've spoken about the opportunities and we've heard from some fantastic case studies like DeepL.
35:41It's given me a lot of food for thought about the future. I hope it has you too.
35:44Could I get a warm round of applause for our panel?
35:47Thank you.
35:48Thank you.