- il y a 2 mois
Beyond the Glitch: Understanding and Mitigating AI Hallucinations
Catégorie
🤖
TechnologieTranscription
00:00Hello, everybody. My name is Natasha Bernal. I'm Senior Business Editor at Wired.
00:06I am joined today in a very exciting panel, which is talking about AI hallucinations.
00:12Now, I do want to involve the audience, so be prepared.
00:17But first, I'm going to introduce my panellists.
00:19So, we've got on the end Dr. Amar Awaldala, who's the CEO and founder of Victara.
00:25We've got in the middle, Sasha Rubel, the head of AI and generative AI policy, EMEA at AWS.
00:31And right next to me, we've got Dr. Giada Pistilli, the principal ethicist at Hugging Face.
00:37Thank you very much for joining me.
00:39Okay, so my one and only question to the audience is just one way you lift your hand.
00:43So, I want to find out if any of you have asked any generative AI, any LLM, a question and
00:49have received the wrong response.
00:51Lift your hand if you've got the wrong, I love this, you guys are great.
00:56So have I.
00:57If you want, I'll tell you the story.
01:00I asked generative AI to book me the perfect day out in London.
01:05I ended up going to a really ratty looking, very awful Indian vegetarian place in the middle of nowhere.
01:13Then travelled across London to go on a bike in the middle of the Olympic Park in the rain.
01:18It was awful.
01:19It knew where I was going, but it didn't care.
01:23Now, obviously the stakes in that example are very, very small, but the stakes we might be talking about here
01:29are very, very high.
01:31So, I guess I want to start with a premise that I've seen time and time again online, which I
01:36want you all to address, if that's all right.
01:39LLMs will always hallucinate, and we just need to accept it.
01:44Let's start with Sasha.
01:46Is that true, or is it false?
01:47So, true, and I will add that I tested a model once that said that I was a man and
01:52a chess coach.
01:53So I can confirm that there are some errors sometimes, but I would nuance my answer of true with a
01:58couple of things.
01:59And Amr's t-shirt, which I hope you all can see, which says rag against the machine, is what I
02:04really want to emphasize,
02:06which is that true, LLMs will hallucinate, but also true, there are lots of really exciting things that we can
02:11do using AI as a technology and other kinds of innovations to actually address those hallucinations.
02:17And it also points to the fact that, for example, if you go to a library, you look online or
02:22you use search, you're going to verify the information that you find on that search.
02:25So at the end of the day, we really need to be investing in digital literacy and digital skills to
02:30help people be able to use these tools in a constructive way, but also critically engage with them.
02:36So I don't believe everything you read.
02:38Tiada?
02:38Yeah, so I agree 100% on what has been said.
02:41And I think I'd also like to add that if we start also having better evaluation, better metrics, and better
02:49data, then it can also be nuanced.
02:52That, of course, it's something that we can help mitigate.
02:55And sure, it's still like a big problem to address.
02:59We don't have like a silver bullet, but if we do, if we start doing and maybe taking a little
03:05bit more of time to do things in a better way, then we can find also solutions, yes.
03:11All right, Amo, take us away.
03:12Yeah, so I agree and disagree with the way you put it.
03:16So you said RAG will always hallucinate, and we have to accept that.
03:21I agree with RAG, sorry, LLMs will always hallucinate, and we have to accept that.
03:26I agree with LLMs will always hallucinate.
03:28And the reason why they will always hallucinate is because LLMs at their heart are probabilistic.
03:33There is a probability, and whenever you have a probability, there is going to be a likelihood of error.
03:38It's just how it is.
03:39It has been improving a lot, but at some point it will level off.
03:43And it is leveling off right now where we're not able to get it to be better, any better.
03:47So clearly everybody now seeing hallucinations will always be there.
03:52Now, but we don't have to accept it, right?
03:54In the same way that if you have a human who, on average, is an amazing doctor, but every now
04:00and then makes a medical diagnosis mistake, you still use that doctor.
04:04But then you have a fact checker beside him to catch him when he does the mistake, right?
04:08And I think that's what we need to evolve to as an industry, to figure out when can we trust
04:13the answer.
04:14This is a good answer.
04:15I can trust it.
04:16Let's deploy it.
04:17Versus, oh, this answer might be off.
04:19Let's have a human expert now help the machine make the proper call.
04:24I think it's interesting that you bring up a doctor because I personally, if I knew about it, wouldn't use
04:28the doctor at all, to be honest.
04:30I would just be like, hell no.
04:31But I do think it's interesting because obviously we're talking about maybe quite low stakes.
04:36So if I ask, you know, for a day out versus I'm asking for decisions about a company, it's a
04:42very, very different kind of scenario.
04:43So I guess I wanted to talk about hybrid AI a little bit as well.
04:47With the growth of use of agentic AI in businesses, what are the risks that we're looking at here moving
04:55forward?
04:56Maybe, Jada, do you want to weigh in on this one?
04:59Yeah, sure.
05:00So also to follow up, I completely agree, especially knowing that the best scenario where also not only LLMs, but
05:07AI tools in general work best
05:10when they work together with experts who know their field and they can not only fact check, but also exploit
05:18them in a better way.
05:19And so that also calls for better evaluations, as I was saying.
05:23And so to answer your question, I agree that not all of the contexts are the same.
05:28So, for instance, if we take health care, as you were mentioning, and, for instance, you have hallucinations for diagnosis,
05:35then it can have, like, really big problems, and not only ethical problems, but also legal problems, and so challenges.
05:42Or, for instance, if you're a student or a researcher, and then your LLM starts fabricating fake references, for instance,
05:52then it can become quite problematic, or, I don't know, maybe another use case could be, like, in a legal
05:58context,
05:58where you're a lawyer, and you're building up your case, and then it starts, like, and that already happened in
06:04the past,
06:05and it starts, like, hallucinating completely invented and fake case, like, references,
06:12then it can be quite problematic, especially if you're trying to build in your argument on top of it.
06:17And so, yeah, it really depends, once again, on the context, and I think that it's still, like, kind of
06:25a hard problem to solve in that sense.
06:27It's funny you mentioned the legal sector, because I think two lawyers in the UK were reprimanded recently
06:32for using AI-generated content in their arguments, and I've got here written down a good example of AI hallucinations
06:40in law.
06:41There is a person called Damien Charlotene, who is a lawyer and also a data scientist,
06:47who says he's collated 120 examples of proceedings that have hallucinated because of AI-generated content.
06:54And we're talking about quite serious things.
06:58I mean, obviously, it's just a small example of something that, for many, isn't still a widespread use.
07:04And having covered the legal sector, I can tell you they're very careful people.
07:08So, for this to be an example in that sort of more traditional sector,
07:12it kind of shows a bit of a drop in the ocean of what might be happening elsewhere.
07:16I guess...
07:17Can I just add something?
07:18Yeah, absolutely. Yeah.
07:20So, first, very quickly, on the lawyer thing that happened in the U.S. many times as well,
07:24and one of the funniest ones was when the judge was confronting the lawyer and said,
07:28have you not checked your work?
07:30And he said, yes, I asked ChatGPT to check the work.
07:33It's like, dude, so you asked ChatGPT to make the proceeding, which then included the hallucinations,
07:39and then you expected it to correct itself, which obviously shows that there is a lot of education
07:43that our industry needs to go through and how these systems can be used.
07:47But your question also highlighted another key thing, where you were saying,
07:51what are some of the issues that we see when we deploy these technologies?
07:54And hallucination and quality was one key issue.
07:58But another very big key issue that all of us are working very hard on addressing is security.
08:04These systems are very, very susceptible to something called prompt attacks.
08:08Maybe show of hands if anybody heard about prompt attacks.
08:11About 20% of the room raised their hand.
08:14So for the remaining 80%, prompt attacks for large language models is just like social engineering attacks for us humans.
08:21When somebody calls you up at home, pretends to be from your bank to get your PIN number,
08:25and then steal all of your money.
08:27With large language models, you can sometimes pressure them in a way
08:30where they will give you information that you're not supposed to see.
08:34And that can be a very, very big problem from a privacy perspective.
08:37What do you think about this idea of machine unlearning to kind of remove some of the hallucinations from existing
08:46data sets?
08:47Sasha, perhaps you can weigh in on that one.
08:49So conceptually, very exciting.
08:51It's technically very challenging.
08:53And just to kind of rebound on what the two of you are sharing,
08:55I think there are two things that are really important in terms of when we're thinking about machine unlearning,
09:00how they can be complemented by other approaches.
09:03One is the importance of open source for advancing research around safety and responsibility.
09:09And just a huge shout out to what you're doing, Jada, your research and hugging face.
09:13And we're very excited to be partnering you on that because it's about democratizing access,
09:17but also about democratizing the opportunity to advance the research around some of the biggest risks that exist.
09:23And you highlighted them.
09:24Accuracy, security, questions around bias and fairness.
09:28So that's an approach that I think is complementary to the machine unlearning.
09:32And then the other aspect, one of the things that we're seeing is that if you talk to organizations and
09:37companies,
09:38but also if you talk to policymakers and you ask them,
09:40do you think transparency, accuracy, fairness, do you think these are important principles?
09:45Every single person will say yes.
09:48But if you talk to an engineer developing the solutions,
09:52they're going to say, those are really great words, but exactly what do you want me to do as a
09:55developer and a deployer?
09:56And actually, we undertook research that showed that around 85% of organizations see these questions around mitigating hallucinations,
10:05ensuring accuracy as top priorities.
10:07But actually, only 25% of them have governance in place internally in order to address these challenges.
10:13And so really focusing on providing guidelines, which is what we spend a lot of time doing with both of
10:18you,
10:19and very happily, is really important because otherwise these stay principles.
10:23And I think you did a really good job at highlighting the importance of human oversight.
10:27One of the most important and insightful things that I've heard in conversations around these issues
10:33is that we're asking for 100% accuracy from probabilistic machines.
10:39And that artificial intelligence is not something that's magic.
10:42What you need to be asking and the way in which you need to be framing the question
10:45is, is this solution more accurate than humans?
10:49And how can it work together to augment human intelligence and provide more safety in the long run,
10:55especially when it's deployed in high-risk scenarios like the health situations that you were talking about?
11:00Can I ask, just to follow up on what you're saying,
11:02you said that 25% of businesses have some sort of guidelines in place to try to mitigate hallucinations.
11:08What are the rest of the 75% doing?
11:11Good question.
11:12I think a lot of people are stuck at the ideation phase of saying,
11:15this is something that's really important, but we're struggling on how to translate that into practice.
11:20We actually just published research that we commissioned called Unlocking Europe's AI Potential
11:24that looks at the real blockers to AI adoption.
11:27And regulatory uncertainty was one of the biggest blockers to AI adoption.
11:31And that includes understanding how to translate principles for responsible AI governance,
11:36specifically into practice.
11:38But I think there's also an interesting debate, just zooming out for a second,
11:41around questions around responsibility and innovation.
11:44And it's a fake debate, which is, do we innovate responsibly,
11:48or do we innovate fast and with velocity and with excellence?
11:52And actually, one of the things that we saw as a blocker
11:55is the lack of trust in this technology because of things like hallucinations.
11:58So you need to start, and we need to help organizations.
12:01It's one of the things that we're doing with our Generative AI Innovation Center in France and beyond,
12:06really help organizations move from an approach to responsibility that's kind of bolted on to built in to by design.
12:13How do we build systems by design that are responsible and take into consideration
12:18some of the great innovations that you've been building, for example, from the start?
12:21So it's not an afterthought. It's the starting point.
12:23Amir, you want to share something?
12:25Yeah, I was going to say on the concept of the machine unlearning,
12:27which is a very, very important concept, actually.
12:29And we are at the beginning of creating a very big problem for all of us,
12:33is all of us right now, we are leveraging these technologies to generate content
12:38that we're putting in our websites and videos that we're putting on YouTube.
12:41And then the machine is crawling that content to learn the new version of ChatGPT
12:47and the new version of, I apologize, what's the AWS model's name?
12:51Nova.
12:52Nova, et cetera, et cetera.
12:53So now the machine is learning from its own garbage, quote-unquote,
12:57and that can amplify the problem significantly.
12:59So one important thing that we all need to be doing,
13:02and I give kudos to Google now for leading that effort,
13:05is making sure that we have a signature or ID inside of any AI-generated content
13:11so that the machines, when they're learning the next time,
13:14they don't pick that up.
13:15They only pick the human content, which is incrementally new, good stuff for them.
13:19And that needs to be embedded in all of the content that we are generating
13:22to avoid falling in that pitfall.
13:25Jed, I wanted to ask you something about this responsibility and these pitfalls.
13:31AI hallucination happens.
13:32Who is responsible?
13:33Is it the people who built the LLM?
13:36Is it the people who use it?
13:37Is it a victimless crime, but somehow not really?
13:42What's the deal there?
13:44You're asking the big question.
13:46I know.
13:49So I see responsibility in that case really as a spectrum
13:53because you can say, for instance, it's just the end user responsibility.
13:58They should be checking all the time the generated content.
14:02For instance, we know and we see all the time, for instance, on general purpose AI systems
14:07that it's always written, you should double check because these LLM, blah, blah, blah,
14:13can fabricate some content.
14:15But we already know that in design, when you see something all the time,
14:19at some point, the human eyes start to just not seeing them anymore.
14:22And so I will be happy to ask all of you, how many of you still see on ChatGPT or
14:29Claude or whatever,
14:30the small sentence saying, oh, you should always double check everything that's generated.
14:37Do you guys see that?
14:39Have you noticed it online saying you're responsible?
14:42Making them smaller and smaller.
14:44It's smaller and smaller, but really it's something that it's known in design
14:47that at some point you just don't see it anymore.
14:50And so I believe we've been talking about education and human oversight.
14:55That's still super important, but it's also important to put the responsibility
15:00or at least the shared responsibility also on the shoulders of developers.
15:04and the industry, we're all responsible because we can't just say, okay, it's all on the users
15:10because once again, and something that I think it's also important to mention is that
15:16hallucinations, as we say, the devil hides in the details.
15:23And so sometimes you can also double check maybe twice the same text and maybe something missed.
15:30Maybe there's this fabricated reference that really sounded like this author that you know by heart
15:37because they already work on the same subject and the same topic, but just that specific paper doesn't exist.
15:44And so it also becomes, for human eyes, very hard to spot them.
15:49And so that's why I think developers should also be really transparent on their technical documentation
15:56and say, okay, especially on this specific topic, for instance, there could be maybe more hallucinations
16:04or, and that's why I was really stressing the importance of evaluation in that case,
16:08because I agree, you, it's for me a false dichotomy to say either you can be innovative,
16:17but, or either you can be responsible.
16:19It's something like responsible AI, it's really something that can foster also your innovation.
16:24And so if you think ahead and from scratch about those stuff, then those same principles,
16:30they don't stay just vague ideas, but they can really be something grounded in your right project.
16:37Yeah, so, so I think the courts have ruled already.
16:39Like, if you are using it in business, it's the responsibility of the business, 100%, right?
16:45So there is a very popular story of, I think it was Air Canada, I'm sorry if I got the
16:49details a bit off,
16:50and Air Canada launched an LLM that was part of their sales process for customer support and so on,
16:56and a customer was able to get an airline ticket that is a business class ticket for $1.
17:01And then Air Canada said, we're sorry, that was a hallucination from our chatbot, we're not going to honor that.
17:06And that customer actually sued them all the way to the Supreme Court in Canada,
17:10and the Supreme Court said, no, it's your responsibility as a business that you made this mistake.
17:15If this was a human and they offered it, then you have to live up to it.
17:18It's not the responsibility of the end customer.
17:22So while in AI consumer chatbots like Claude and Nova, when I'm using it directly, or Gemini,
17:27they can have that footnote and put it back on me,
17:30if now I'm using it as part of my business to do things, then the bar is way, way higher.
17:35I think just circling back to what Jada was saying with regards to responsibility and whose responsibility is it,
17:40and your anecdote really shows where does responsibility lie?
17:43For us, the approach has been about a shared responsibility model,
17:46because I think developers and deployers and end users have really different responsibilities
17:51and can actually do different things as it concerns mitigating these risks.
17:56So for example, as a developer of solutions and also hosting third-party models on our Bedrock platform,
18:02one of the things that we've seen is that with Anthropics model Claude, for example,
18:05if you ask Claude a question and you say, if you are uncertain about it,
18:10indicate that you are uncertain, and interacting with it as if it's a human being,
18:15it actually encourages the model to say, actually, I'm not 100% sure about this data.
18:21And then I think one of the things that I find the most exciting personally with regards to
18:25what's going on in innovation in this space over the past 20 years is the ways in which
18:29we can put, for example, output filters,
18:31or you can do with RAG context grounding, or reference checking,
18:36or guardrails that can mitigate not only for hallucinations,
18:39but also things like harmful content.
18:41What C2PA is doing in terms of watermarking and developing standards for watermarking,
18:46I think is really important because if AWS does watermarking in one way,
18:50Google and Microsoft does in another, Anthropic in another,
18:53then it's going to be really hard for end users to understand what is the ground truth.
18:57And so these kinds of international standards,
19:00we see as playing a really key role in building trust in the technology.
19:03So there's a lot of things that developers can do.
19:05And then for deployers and end users, again, going back to the point on education,
19:09I think this is a really essential factor that's often overlooked.
19:13We look to technology for answers to the challenges of technology,
19:16but at the end of the day, it's people using this technology.
19:19And so we also need to look to the people to advance the science
19:21and also make sure that they're skilled enough to engage critically
19:24with this kind of content and tools.
19:26So you mentioned a few of the solutions that have been deployed
19:29to try to mitigate some of these hallucinations.
19:31I guess RAG is a good example because it's relatively cost efficient, isn't it?
19:35It's very good.
19:37It does help to mitigate things.
19:39But none of these things, as far as I understand it, are fallible, right, Sasha?
19:42I mean, it's not like they...
19:44Infallible, pardon me.
19:45It's not like they are completely able to wipe out any hallucination completely, right?
19:51I mean, that's the problem that we're facing.
19:53Are there any solutions out there that are likely,
19:55even in the future, to be able to achieve that, do you think?
19:58So I think we have to work towards that.
20:01Going back to the introduction in terms of the question of are LLMs fallible,
20:05yes, and do they hallucinate?
20:07Yes.
20:07It goes back to what AI is as a technology,
20:09which is probabilistic and statistical reasoning.
20:13And so working on technologies that can address the gaps and the risks related to that
20:17and complement with human oversight is really important.
20:20Is there any technology right now as a standalone solution that can mitigate hallucinations?
20:25No.
20:25Is it really important that we have a combination of these different approaches,
20:30like guardrails, like output filtering, like RAG and context grounding,
20:35like reference checkers?
20:36And the filtering, I think, is particularly exciting.
20:39These are things that, combined with the human oversight,
20:42mitigate the risks related to hallucinations.
20:45May we get there one day?
20:47I would say as an eternal optimist and someone committed to responsible AI
20:51and the collective research done through open research and open source,
20:55do I think we'll get there one day?
20:57Maybe.
20:57I'm hopeful.
20:58Are we there right now?
20:59No.
21:00Do we need to make sure that human oversight stays a really essential principle
21:03in the ways in which we're deploying these technologies?
21:06100%.
21:08So, the answer is mitigate.
21:11The answer is mitigate.
21:12It's how can we have this technology work hand in hand with the humans
21:17to increase our productivity, right?
21:19So, let's pick something simple like customer support, right?
21:22Which, by the way, this technology is flipping customer support in its head
21:26and call centers everywhere across the world right now.
21:29If I know with very high confidence that this response coming back from the agent
21:34is 99% accurate, the specific response, I don't mean on average,
21:38I mean this response, this answer that's coming back right now is 99% accurate,
21:42then I can have that go back to my end user directly.
21:45And then I want to catch the 1% where I think I'm not sure that is accurate
21:49and then give that to my call center operator, the human, now to deal with it.
21:53If we achieve that alone, then we increase our productivity by 100x, right?
21:59So, that's what we're aiming for as an industry right now,
22:02is how can we have the confidence to know when it can go
22:06and when, no, it needs to be reviewed.
22:08And if we can do that accurately, then we're in a very, very good position.
22:12That's my view on it.
22:14I will highlight a very key dashboard that I would like folks to take a look at.
22:18If you go back later and search for hallucination leaderboard on any search engine,
22:21you will get a leaderboard that we publish at Victara.
22:25It's now becoming the industry standard for hallucination rates of different models
22:28in collaboration with Hugging Face, by the way,
22:30that shows the rates of hallucination of the different models
22:33so you can know going in how likely is it that that model will make up something.
22:38Hamer, can I just follow up with a question on those guardrails and that safety?
22:42I guess from a business perspective,
22:44a lot of people are at the moment working with off-the-shelf AI
22:48that they're trying to deploy to use as best as they can for their businesses.
22:53Within those guardrails, how important...
22:57No, that's not the question.
22:58Realistically, how much do you think people can detect for themselves
23:02whether AI is hallucinating
23:03and how important do you think that will be moving forward as a function?
23:07Extremely. That's the whole point.
23:08I think if there's one thing we'd like the audience to leave from today
23:11is if you're deploying Gen.AI in a mission-critical business application,
23:16you cannot have that without...
23:18We call it guardian agent.
23:20Without a guardian agent that is monitoring the output of these systems
23:24and making sure that the answers are correct and secure.
23:28If you do it without that,
23:30like if you go back to the stats you were sharing earlier,
23:32you were saying 80% are doing it,
23:34but only 20% are correcting for catching it,
23:37like doing the real things to catch it.
23:38That remaining 60%, all of them will fail.
23:41Like literally, they will launch their products
23:42and within a week, they will shut it down
23:44because all it takes is just one bad hallucination
23:48or one bad security leak to shut down a project.
23:51And that's exactly what's happening.
23:53And that's why our voices, we're trying to speak
23:55and scream with the loudest we can,
23:57is if you want to succeed with your Gen.AI,
24:00start with this at the beginning, not at the end.
24:03Would you agree?
24:04A hundred percent.
24:04And I think that one of the points that you raised,
24:07which is absolutely essential,
24:08is that can we go really far?
24:10For example, the bedrock guardrails
24:11that we have for our solutions,
24:13it mitigates the hallucinations
24:14and reduces them by 75%.
24:16It's a huge percentage,
24:18but do we need to go further still
24:19and do we need to make sure
24:20that there's a human in the loop?
24:22A hundred percent.
24:23I think also just,
24:24and you inspired it in the way
24:25in which you framed the conversation,
24:27President Macron said something
24:28that I think is really important
24:29speaking to your productivity question,
24:31which is,
24:31do we need to focus on misuse of the technology?
24:34Yes.
24:35Do we need to advance the research
24:36around safety and responsibility
24:38and collectively commit to that beyond industry?
24:41Yes.
24:42But we also need to focus on
24:43what misuse of this technology represents.
24:45And you evoked a couple of examples
24:47where there are productivity and efficiency gains
24:49that are remarkable,
24:50but also at the end of the day,
24:52it's also about the competitiveness of countries,
24:54of startups that are in France and elsewhere.
24:57And adopting this technology in a responsible way
24:59is actually the key to mitigating those risks
25:01and also making sure that the gains that exist
25:03both economically but also socially are there.
25:06Can I ask about trust?
25:08Because it's been brought up time and time again
25:11on this panel.
25:12I was talking actually to the head of innovation
25:14at Capgemini earlier today
25:16who said to me,
25:17Natasha,
25:17would you get into a driverless car
25:18that says it's 94% good at driving?
25:22And I said,
25:23no.
25:25And he said,
25:25but would you get on a plane?
25:26And I said,
25:27yes.
25:27And he said,
25:27well,
25:27you know,
25:28we've had a horrible news
25:29about a plane crashing today.
25:31Why would you do that?
25:32And I sort of didn't really have an answer to that.
25:35And I thought it was a really good question.
25:37I guess the inherent psychology of trust
25:42is at play here
25:43with Gen AI and with hybrid AI,
25:46with people needing to trust
25:50the tools that they're using.
25:51But finding that trust
25:53and gaining that trust
25:54is intensely damaged by any hallucination.
25:57One hallucination for me
25:58is enough to stop me
25:59from using something.
26:00I don't want to use something
26:01that's going to be mistaken
26:02because it'll make me look stupid.
26:04So how do you think people
26:06are working on gaining that trust?
26:08And do you think that trust
26:09will ever be achieved,
26:12I suppose,
26:12Jada?
26:13Yeah.
26:14Once again,
26:15it's also a complex question,
26:17I think,
26:18because you mentioned trust,
26:20but sometimes
26:20the industry wants faith.
26:24and those are two completely different notions.
26:29Because faith basically comes with no proofs, right?
26:32Like, okay,
26:33I developed something that is the best,
26:35but just trust me, right?
26:37While trust comes with evidence,
26:40and especially when we overcommit
26:43and about our confidence
26:44on how great and how good
26:46like an AI tool can be,
26:48as you say,
26:49once it fails,
26:50then you miss
26:51and you lose
26:52all the trust that you had.
26:54So that's why it's important
26:55to be as transparent as possible
26:58on also the fails.
26:59But once again,
27:00then who wants to build
27:01their businesses
27:02on something
27:03that is not perfect?
27:04But what's really hard
27:06for us humans,
27:07I think,
27:07to grasp
27:08is that
27:09data is not objective,
27:11it's not all the truth
27:13of the world,
27:14and so since
27:15it's statistics,
27:16then it can also make mistakes.
27:18And we've been talking
27:19a lot about
27:20like what's a fact
27:21and what it's not,
27:23but it made me think
27:24about some of the questions
27:25that don't have
27:26only one answer.
27:27Like,
27:28I have a background
27:29in philosophy
27:30and I've never been able
27:32to use LLMs
27:33in philosophy
27:34all the time
27:35that I ask
27:35like complex questions
27:37about some authors
27:38or anything.
27:39It's always so superficial
27:40and once again,
27:42like the most important thing
27:44that I learned
27:45in philosophy
27:45is that
27:46you don't have
27:47one single truth
27:48but like a variety
27:48of truths
27:49and so
27:50then makes
27:51also the use
27:52of LLMs
27:52even more difficult
27:54and so
27:54going back
27:55to the question
27:56of trust,
27:56I think
27:57if we can be
27:59even more honest
28:00and say,
28:01okay,
28:01listen,
28:02for instance,
28:02if you have
28:03a really sensitive context,
28:05maybe it's not
28:06the best
28:07to deploy
28:08this AI tool
28:09like right away
28:10or if you have
28:11to have
28:12like a back it up
28:13with a team
28:14of full humans
28:15who have
28:15100%
28:17the expertise
28:17to assess it
28:19because otherwise,
28:21like as you said,
28:22like end users
28:22would just be like,
28:24no,
28:25I don't trust
28:25this tool
28:26or,
28:27but also the example
28:28that you made,
28:29like would you trust
28:30an autonomous car
28:31who was like
28:3194% safe?
28:33It's the same
28:34if you flip it,
28:35what if I told you,
28:36but you only have
28:376% chance
28:39to fail
28:40or that it will crash
28:42then also
28:43in your perception
28:44it changes
28:45completely
28:46how you perceive it
28:47and that's why
28:48it's important
28:48to not build
28:49faith in your tool
28:51but mostly trust
28:52but gain trust.
28:54Amir,
28:54do you have any?
28:55I agree 100%.
28:56Like Viktara,
28:57our business
28:57is all about that.
28:58Like we're all about
28:59how can we help
29:00businesses
29:01deploying Gen.I
29:02in their environments
29:03to have that trust
29:05where I know
29:06when this answer
29:07is going out
29:07it's the correct answer.
29:08I know when this agent
29:09is performing this action
29:11it's the correct action
29:12and I also understand
29:14that there's no
29:15data leakage
29:16or security issues
29:17happening.
29:17That's very,
29:18very important to trust
29:18and I can track back
29:20which documents,
29:21which data
29:22were behind that.
29:23Hallucinations
29:24is a very big issue
29:25but believe it or not
29:25most companies
29:26the problem is
29:27their data is the issue.
29:28They have bad data
29:29and if you have
29:30bad data
29:30we say garbage in,
29:32garbage out.
29:32You give that bad data
29:33to the large language model
29:34of course
29:35it's going to give you
29:35a wrong answer.
29:36So anyway,
29:37the short of it
29:37absolutely trust
29:38is the anchor
29:39that will make this succeed
29:41but just like
29:41with any new technology
29:42like credit cards
29:43at the beginning
29:44we don't trust them
29:45and then with proper technology
29:46and with proper usage
29:47we don't even think
29:49about it today.
29:49We use credit cards
29:50for everything.
29:51Wear us down.
29:52Yeah,
29:52I love your rubbish in
29:53rubbish out thing
29:53because data sets
29:54is a whole other thing.
29:56I know we have
29:56run out of time
29:57but I want to hear
29:57from Sasha
29:58just any final thoughts
30:00on this point.
30:01Sure,
30:01a couple of really quick points.
30:03first is Jada,
30:04I'm so happy
30:04that there are people
30:05like you
30:05who are philosophers
30:06who are leading
30:07cutting-edge industry
30:09evolutions around this
30:10and I think it's
30:11absolutely essential
30:12and there's another
30:12wonderful French thinker
30:14named Asma Mala
30:15who is a philosopher
30:16and politologue
30:17who I think frames
30:18the question really well
30:20for France and Europe
30:21and beyond
30:21which is we need
30:22to be asking ourselves
30:23what kind of AI
30:24do we want
30:25and actually fundamentally
30:26what do we want
30:26from AI?
30:27that that's the question
30:28that we need to be asking
30:29but as human beings
30:30what we don't understand
30:31we trust.
30:32We don't trust
30:33and so we really need
30:35to focus on understanding
30:36this technology
30:37and what it can and can't do
30:38but also agree
30:39on taxonomies of risk.
30:41Where is there risk
30:42for example
30:42in deploying
30:43in the health sector
30:44where we need to focus
30:45and maybe over-index
30:46on human oversight
30:47and where is there low risk
30:49where we can move quickly
30:50and intentionally
30:51and make sure
30:52that we maintain
30:54I would say
30:54that trust
30:55and part of that
30:55is education
30:56and then lastly
30:57I would just say
30:58that we're really convinced
30:59that responsibility
31:00is actually the key
31:01speaking to your point
31:02because responsibility
31:03will unlock the trust
31:04which will unlock adoption
31:06which will unlock innovation
31:07and so that kind of virtuous cycle
31:10I think is absolutely essential
31:11and we see trust
31:12not only as the starting point
31:13but the middle point
31:14and the end point
31:14it's something that's lost
31:16really quickly
31:16and that needs to be maintained
31:18and it takes everybody
31:19working together
31:19philosophers, scientists,
31:21industry and governments
31:22to think through
31:23what these kinds of frameworks
31:24can look like
31:25at a global scale.
31:26Brilliant
31:27well you've heard it here
31:27please use AI responsibly
31:29thank you so much
31:30to my panel for joining me
31:31thank you everybody
31:32thank you so much for joining me
31:32thank you so much for joining me
31:32thank you so much for joining me
31:33thank you so much for joining me
31:33thank you so much for joining me
31:33thank you so much for joining me
31:34thank you so much for joining me
31:34thank you so much for joining me
31:34thank you so much for joining me
Commentaires