  • 18 hours ago
Every AI platform you use is quietly building a file on you. They call it "memory" and say it improves the experience. Maybe, but your preferences, finances, relationships, and intentions are accumulating inside systems you don't control and can't see into. We've watched this pattern before, with the web itself. Sir Tim Berners-Lee saw the web's centralization coming and built an alternative. Now he's pointing it at AI. With Inrupt CEO John Bruce, he'll walk through Charlie — an agent that sits between you and the tools you already use, deciding what each one actually needs to know and keeping your data portable rather than locked to one platform. The premise is simple and far from settled: AI can be useful without owning you.

Category

🤖
Tech
Transcript
00:03That was some dramatic entrance music. Sir Tim, do you always get that music?
00:08That's my... Yes, in the right.
00:11Wonderful. All right. So, we are going to talk about this fascinating new product that you guys
00:18have built. You're going to explain how it works, why it's important. But first,
00:23what is the problem you're trying to solve? What has gone wrong with the world that you
00:28are trying to fix? Either one of you can answer that. Although, what's gone wrong is that it's
00:37no fun to be on the web anymore. When the web started, anybody could make their own website,
00:46and that was very empowering. Now, everybody's on Facebook. People are using
00:58LLMs to ask questions where they could use a search engine just as well. So, we've got
01:15disempowered people using AI and where they don't have control of their own data. And we have
01:23people and people using... And the web is under threat. And it's from the point of view of personal
01:31empowerment, personal... We call it... Jan was talking about... They were talking about data sovereignty.
01:39About AI sovereignty of a country. We talk about sovereignty of an individual. You, an individual,
01:47ought to have control of your data.
01:50It's a powerful statement coming from a man who built the web.
01:54John, do you want to explain? Given this problem, what is it that you gentlemen have cooked up?
02:00Well, as Tim said, you know, I mean, Jan was talking about macro sovereignty, geo sovereignty. We are
02:07focused on individual sovereignty. And the essence of that is having access to your own data. Because
02:14that's the last thing we've got at the moment. And with the advent of the LLMs, it's becoming increasingly
02:20clear that they're going to be your memory. I mean, the amount of data individuals are
02:29providing to the LLMs. I mean, you get a good return for the investment. You know, you get good
02:35service. But the trade-off is that they become your memory. And we think that that's fraught with
02:43problems. And what do you mean by your data? So obviously, it's your specific information,
02:50your date of birth, your health information, your bank account information. But most of the stuff on the
02:54web is shared. You know, you and I send an email. It's your data. It's my data. What exactly
02:59do you define as your data? Is it every search query you put into an LLM? What is it?
03:04Yeah, all of those things. I mean, and largely, you get to decide what's important to you. But in our
03:09world, you should. So, and what I think we're all finding is the implications when you don't
03:17keep control of your own data. I mean, you know, we talk, I mean, you and I were talking the
03:22other day
03:26about experiences that we're beginning to have where you realize, hang on a minute. How on earth
03:32did it know that about me? How did it know the name of my kids? Or how did it know
03:38that I've got
03:39a doctor's appointment that I'm due to be at in a couple of hours? I mean, the degree of intimacy
03:43that
03:44the LLMs are beginning to build about you is way beyond where it ever was with search terms and
03:51the like. So, the answer to your question largely is, I think every piece of data about you should
03:57be available to you, actually. And you should be the custodian, largely, of who gets to use it.
04:03Okay. So, there's a huge trade-off there, right? So, I will often ask my LLM of choice,
04:10hey, you know, given what you know about my health situation, my training needs,
04:14what kind of workout should I do today? And obviously, there's a trade-off between having
04:19all of my data and giving a good answer, right? So, your view would be, I should make that choice,
04:24right? I can say, hey, you can have my resting heart rate and my iron levels. Or I can say,
04:30no, you shouldn't have that, and you should just train me as a generic person.
04:35But that leads to the problem where I don't want to have to go in every question and say,
04:39yes, no, yes, no, yes, no. So, explain these two tricky trade-offs. Either of you, explain
04:43how you think about utility versus privacy and also the user having to spend a lot of time
04:51figuring out what to give and what not to give.
04:53Yeah. And candidly, that's the best of all worlds, right? Where the user should be able to get the
04:59utility out of the LLMs, of course. I mean, I want it, you want it, we all want it. But
05:04by the
05:04same token, I don't want it to know me intimately. And I think that what that requires is my agent.
05:12I need to have something that works for me, not for the LLMs, and takes care of the housekeeping
05:19that you described. You know, what consents should it grant? And it should be granting them
05:24at machine speed. And I don't want to be clicking all day with yes, no, yes, no. I mean, that's
05:28impractical. It would be like going to Europe. You know, you go to Europe and you have to click
05:31no on every single browser tab. And I would go crazy. Yeah.
05:35Tim, will you explain your philosophy on what data should be private? What is the data that's
05:41important that should never go into an LLM? Well, basically, when I have a conversation
05:54with the LLM, it's often going to be about what workout should I do? It should be about
05:58what vacation should I go on? All that sort of stuff. That's completely private. I might
06:09want to share it later on. I might want to assemble, usually, to help me assemble a definition
06:19of my perfect vacation. I might want to put the definition of my perfect vacation out there
06:25in some sort of marketplace where travel agents can come to, and airlines can come and sort
06:34of match up and provide me with, so basically, we might want to flip around the current attention
06:43economy to be an intention economy. And so, in that case, I'd be making public things about
06:49what I want. But everything starts off private. Yeah. So, all right. So, you have this system.
06:56It's called Charlie. What is it? How does it work? You've just announced it. It's not your first
07:00interview, but this is your second public interview about it. Well, pretty early. So, explain what it
07:06is. Has anybody here used it? Hands up if you've used it. Early days. All right. Good. At the
07:14beginning. So, now, explain to all these people, because we're going to have you all come back here
07:18tomorrow. I'm going to ask you that again, and I expect every hand to go up. So, explain what it
07:22is
07:22and how it works. Yeah. So, first, let me tell you about Charlie. Let me tell you that when Tim
07:27and I
07:27started the company, we have a company together called Inrupt. And for some years now, we've been
07:32advocating the use of this technology, which gives everybody a data vault, a data wallet,
07:39if you like. And we keep all our stuff in it. And then applications can come to us and ask
07:44for access to it. And it's all based on open source protocols. Actually, there's quite a
07:50flourishing open source community around it. So, the Solid protocols were invented some years
07:55ago. And when we started the company, we started it to mobilize around Solid, to bring resources
08:01to help drive the adoption of Solid and give everybody their own data vaults.
08:08And back then, Tim said, you know, and you have to remember what it was like then. There
08:13were two things. One was called Siri, and one was called Alexa, and that was it. And Tim
08:18said, you know, at some point in time, we're all going to have our own. One that
08:21doesn't work for Amazon or Google or Apple, but one that works for us. And he wrote about
08:29it, and he called it Charlie. And we said back then, in 20 years, we're going to build
08:33it. It'll take us that long to build it. 20 years, so the technology didn't exist. But
08:38then about 18 months ago, it was evident that the way the LLMs were evolving, maybe it could
08:44exist. So we built a demonstration of it. And we built a demonstration that showed you
08:49can actually do it. You can build a thing that works for us, not for them. Not to oversimplify,
08:55but that's what it did. And we began to show it to folks quietly, and they said, love the
09:01idea of a data vault. Love the idea of data wallets. But how do you get the data in it
09:06and to your point? Does that mean all day I'm clicking yes, no, yes, no, yes, no to grant
09:11consented access to it? So that thing called Charlie, that would be really interesting,
09:18because if you can build that to take care of all this stuff. So we said, okay, let's
09:24see if we can build it. And we did.
09:27But Charlie is an intermediary when I go to OpenAI, or Charlie is its own system? Do I
09:34go to Charlie, or I use Charlie in the middle between my journey?
09:37Yeah, that's right. Think of it that way.
09:39It's always a layer. It's always a layer between you and the LLM.
09:44Yeah.
09:44So I log into Charlie, and then I go to the LLM, and then I say, hey, help me understand
09:49this next workout. And Charlie, the LLM says, hey, Charlie, can you give me Nick's iron levels?
09:55And Charlie's like, nah, no, I'm not going to give you that. Is that how it works?
09:58Well, no, no, no, no, no, no, no. It's better than that. Because we don't want to stop you
10:03using the LLMs, of course not. But at the same time, we don't want you necessarily being
10:09at risk in terms of all your personal data being available to the LLMs.
10:13So what Charlie does, it does a number of things, and I don't have the time to explain it all,
10:19but in simple terms, what it does is, in the first instance, before you submit the prompt,
10:25to any LLM, and this applies to any LLM, Anthropic, OpenAI, Mistral, before you submit the prompt,
10:31Charlie says, ha, these pieces of data would be pertinent to this prompt. I'll package it up.
10:40But before I send it, I'm going to strip out all your PII. Now, you're in Europe, so you
10:46fundamentally understand PII like they don't in the States, but Charlie strips out all your PII.
10:52And then before it submits it, it obfuscates you. So it doesn't send your
11:01absolute data. It sends just a jittered version, just an approximation. So you can engage with the
11:09LLM. You get all the kind of guidance you're looking for, but it doesn't get a fix on you.
11:15It doesn't get to know you intimately.
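The three steps described here (pick the pertinent data, strip PII, jitter what remains) can be sketched roughly as follows. This is a minimal illustration, not Inrupt's implementation: the field names, the keyword map, and the functions `select_relevant`, `strip_pii`, `jitter`, and `prepare_context` are all hypothetical, and the keyword matching stands in for whatever relevance logic Charlie actually uses.

```python
import random

# Hypothetical field names treated as direct identifiers (PII).
PII_FIELDS = {"name", "email", "phone", "address"}

# Hypothetical keyword map deciding which vault fields are pertinent to a prompt.
RELEVANT_FIELDS = {
    "workout": ["resting_heart_rate", "iron_level"],
    "mortgage": ["bank_balance", "credit_score"],
}


def select_relevant(prompt, vault):
    """Step 1: pick vault fields whose topic keyword appears in the prompt."""
    selected = {}
    for keyword, fields in RELEVANT_FIELDS.items():
        if keyword in prompt.lower():
            for field_name in fields:
                if field_name in vault:
                    selected[field_name] = vault[field_name]
    return selected


def strip_pii(record):
    """Step 2: drop any field listed as a direct identifier."""
    return {k: v for k, v in record.items() if k not in PII_FIELDS}


def jitter(record, fraction, rng):
    """Step 3: perturb each numeric value by up to +/- `fraction` of itself."""
    out = {}
    for key, value in record.items():
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            out[key] = round(value * (1 + rng.uniform(-fraction, fraction)), 1)
        else:
            out[key] = value
    return out


def prepare_context(prompt, vault, fraction=0.05, rng=None):
    """Run the three steps in order and return what would accompany the prompt."""
    rng = rng or random.Random()
    return jitter(strip_pii(select_relevant(prompt, vault)), fraction, rng)
```

Calling `prepare_context("What workout should I do today?", vault)` on a vault that also holds a name and a bank balance would return only approximate heart-rate and iron values, which is the "approximation" of the user the speaker describes.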
11:16Does it send false data? Like, does it say, well, his birthday is June 32nd, right? Or does it just
11:22say, let's say my birthday was June 15th. It says his birthday is June 16th, June 14th?
11:26It depends on the use case. Now, the way Charlie works, and it sounds like it's a heavy lift, but
11:31it's not.
11:31It's quite simple to use. You can throttle it. You can say, in certain circumstances, you're going to
11:37need to know my date of birth. So you can be very deterministic. You can say, in these use cases,
11:46be explicit. Tell the real me. But in those use cases, not. I want you to put jitter into the
11:53financial numbers. I want you to obfuscate my, I mean, you know, the reality of me.
11:58So still, yeah, you get the best of all worlds.
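The per-use-case throttling described here, exact values in some cases and jitter in others, could be expressed as a small policy object. The sketch below is an assumption about how such a rule might look; `DisclosurePolicy`, its fields, and the use-case names are invented for illustration and are not taken from Charlie.

```python
import random
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class DisclosurePolicy:
    # Hypothetical mapping: use case -> set of field names sent exactly.
    exact: dict = field(default_factory=dict)
    numeric_jitter: float = 0.10   # numbers move by up to +/- 10%
    max_day_shift: int = 3         # dates move by up to +/- 3 days

    def disclose(self, use_case, name, value, rng):
        """Return the value to send for this use case, or None to withhold it."""
        if name in self.exact.get(use_case, set()):
            return value  # "be explicit": the real value is needed here
        if isinstance(value, date):
            shift = rng.randint(-self.max_day_shift, self.max_day_shift)
            return value + timedelta(days=shift)
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return value * (1 + rng.uniform(-self.numeric_jitter, self.numeric_jitter))
        return None  # anything else is withheld unless marked exact
```

With `DisclosurePolicy(exact={"passport_renewal": {"date_of_birth"}})`, a passport query would receive the true date of birth, while a workout query would see a date a few days off and a mortgage query would see balances within ten percent of the real figures.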
12:01I mean, this is my favorite thing about the product, the whole obfuscating data thing.
12:04I've seen other people build data vaults, privacy, data on the blockchain, own your own privacy. I've
12:09been hearing that for a while. I've never heard feed false data to an LLM to protect yourself, which is
12:14just fabulous. So give me, you know, you can invent the web. You can invent a new way of lying.
12:20Congratulations, Sir Tim. Explain a query where it's very useful to obfuscate data. Let's make
12:28this like, give me a query that you've put into one of the big models and how much data you
12:34sent
12:34along with it and how that data was obfuscated. Let me tell you the high-level version.
12:40But so if you, and good examples at the moment are financial services. You know, particularly now that
12:46we've seen OpenAI introduce finance manager, you know, hook me up to all your bank accounts
12:52and I'll look after you. I mean, frightening, isn't it? But anyway, so...
12:54I did that. I made a billion dollars yesterday. It was amazing.
12:57Wow.
12:57Just kidding.
12:58Yeah, I know you are.
12:59Go on.
12:59So in the context of financial services, it's fascinating. And we have a number of projects
13:06underway in this where if you want straightforward advice, can I afford a mortgage? There's two
13:10ways you can go about doing it. You can provide your actual bank balances. You can upload financial
13:17statements in order to get an answer back from ChatGPT. Alternatively, you can use Charlie.
13:24And with Charlie, it wouldn't send your actual balances. It wouldn't send your actual credit
13:30scores. It sends a little approximation, just a little approximated, and that's that notion
13:36of jitter. You can introduce how much jitter you have. Charlie takes care of that. You don't
13:39have to worry about it. So it approximates you enough where you still get the guidance
13:43you're looking for without submitting your real data.
13:47Right. And you presumably get slightly worse guidance, right? Because if you have the exact
13:51credit score and your exact income and your exact bank statement, they can make a more
13:55precise calculation. And your argument is that the tradeoff is worth it.
13:59100%.
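The trade-off being agreed to here can be made concrete. As a rough sketch, suppose the advice depends on a debt-to-income ratio and both inputs are jittered by up to a fraction f. The ratio can then be off by at most 2f / (1 - f) in relative terms. The function names and the choice of debt-to-income as the example metric are assumptions for illustration, not something stated in the interview.

```python
import random


def jitter(value, fraction, rng):
    """Perturb a number by up to +/- `fraction` of itself."""
    return value * (1 + rng.uniform(-fraction, fraction))


def debt_to_income(monthly_debt, monthly_income):
    """A simple ratio that mortgage guidance often depends on."""
    return monthly_debt / monthly_income


def worst_case_ratio_error(fraction):
    """Largest relative error in a ratio when numerator and denominator
    are each jittered independently by up to +/- `fraction` (fraction < 1).
    The worst case is numerator up, denominator down:
    (1 + f) / (1 - f) - 1 = 2f / (1 - f)."""
    return (1 + fraction) / (1 - fraction) - 1
```

With 5% jitter the ratio is off by at most about 10.5%, so guidance that sits far from a lender's threshold is unchanged, while a borderline case could flip. That is the "slightly worse guidance" the question refers to.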
14:00Right. Because if you upload your credit score while trying to get your mortgage application,
14:05the AI company's going to hold onto that forever.
14:08Forever. It's your memory.
14:09And it could be used against you.
14:10Yeah. Yeah. Yeah. Yeah. Exactly.
14:11What is the worst example of an AI company holding onto data and actually using it in
14:16a way that was harmful to a user that you've heard about?
14:19We probably don't know.
14:21Yeah. But that we know about.
14:25Small examples, but dynamic pricing is one such, right? And I don't know that it's the
14:29LLMs necessarily. I was meeting with somebody earlier today. And she was telling me how,
14:35you know, around here you click, no, don't, do you consent to, and you click don't. So they did
14:42an analysis. And 35% of the time, when you click don't, they still do it. Because nothing stops
14:50them.
14:50You click on the no button and it says yes.
14:53Yeah. It still does yes. The code still does yes.
14:55It still does yes.
14:56So that's two questions. So there's stuff going on at the moment we don't appreciate. But the ones that are
15:03making it into the public domain,
15:06two levels, public domain stuff, written about in the New York Times and so on, dynamic pricing, all of that stuff's going
15:12on.
15:12But on a micro level, I know you've experienced this. I suspect we all have. We all get to a
15:19WTF moment.
15:22We all get to a point where we think, hang on, it should not know that. I've had it. I
15:28know you, I'm sure we're all appreciating it.
15:30If you haven't, I guarantee you will. And it shows the intimacy with which these models are getting to know
15:37us.
15:37And a consequence of that is, unfortunately, the business model has skewed quite negatively the web we've got.
15:48It will emerge on steroids because of the intimacy with which they have this data available.
15:54So help me understand the trust question. So part of the reason why we don't want to upload our data
16:01is because we don't totally trust the companies
16:03and there's a long history of corporate malfeasance. But in order to use Charlie, I have to trust you. Absolutely.
16:12I trust you guys. You're lovely on stage. But how is the user supposed to trust your company and your
16:17vault when it's still a company?
16:19It's still a vault managed by a board of directors with financial incentives?
16:24Sure. You don't. You don't have to trust us. You have to trust in the people we're working with who
16:31intend to distribute Charlie.
16:35And you trust corporations. Not all of them, but some of them you do. You trust your bank. You have
16:41to. They've got your money.
16:43You're sending me an overdrawn notice, though, so I trust you.
16:45Yeah, well, that's different. And some banks are different. But, I mean, you know, generally there are trusted entities, banks
16:51being one such, where they say,
16:53look, if you trust us to look after your money, trust us to look after your data. Here's Charlie.
16:59And Charlie's going to help you operate in a world of the LLMs in a safe way.
17:04And the good news is it helps them, too, because we're in an interesting point in time, actually, where, for
17:13all the disadvantage, I think, that I fear lies ahead of us as individuals, if we're not careful,
17:19the same kind of disadvantages exist for corporations.
17:25And financial companies, retailers, telecommunications carriers, insurance companies can all get disintermediated by the LLMs and the agents and by
17:36agents of the LLMs.
17:37So, you know, if I'm a bank and I'm sitting there strategically, all the things that I normally grant to
17:43my customer, I offer my customer counsel and guidance on financial matters,
17:49all that gets swept away by ChatGPT or similar, what am I left with?
17:55I end up being disintermediated.
17:57So that's interesting.
17:58So you think, let's go back to our mortgage example.
18:00You both think the bank will want the customer to have Charlie because if the customer is just uploading their
18:06bank statements up into OpenAI to ask the loan information,
18:09that's actually worse for the bank because now they no longer have the power of the control of your data.
18:14100%.
18:15Okay.
18:15So these guys are on, these guys want you to succeed.
18:18The big LLMs don't want you to succeed because they want more data.
18:21So how does the market play out?
18:24Does Chase Manhattan Bank, do they encourage me and nudge me to install Charlie?
18:30Yeah.
18:31Yes.
18:32Yeah.
18:32So explain what's going to happen.
18:34Well, I can't speak explicitly for Jamie Dimon and all his crew, but I mean, you know.
18:38Theoretically.
18:39So you could start going out.
18:40What you might do, maybe there are people here in the room who will talk to you.
18:44You may go out and find partners and say, look, you guys are risking everything.
18:47If all of this data is being uploaded into these LLMs, work with us, and we've got a cool way
18:53of making sure it doesn't happen.
18:55Yeah.
18:55Is that the next business, the biz dev step for you?
18:58Yeah.
19:01All right.
19:03So, and yeah, financial companies in particular.
19:08So, because for the people who trust.
19:12Yeah, trusted agents.
19:13I mean, and that's the way it should be.
19:16I mean, these are institutions that have spent a lot of time earning your trust.
19:21They're heavily regulated as well.
19:24They sit in a position of trust and oversight where they should be the kind of people that you would
19:30say, yeah, you know, if I'm going to look after my memory and I want to store it someplace, I
19:35want to make sure that there's a trusted entity looking after it with me, why wouldn't you use me?
19:40So, is your business model, individuals are going to, because you're going to need a business model.
19:45You have 20 employees, right?
19:47You know, it's not going to be a gigantic company.
19:48You don't need trillions of dollars.
19:49But your business model will be somebody subscribes and pays a fee or your business model will be you'll have
19:55like a commission from these trusted entities.
19:57How is it going to work?
19:58Great question.
19:59And I truly mean that because what we're finding is a fascinating point in time actually for corporations.
20:06And we're spending a good deal of time teasing this out with them.
20:10It used to be the case that when you were a company, you know, you knew where you wanted to
20:15get to and the job of the leadership team and everybody else was to get you there.
20:20How do I get to that place?
20:22And they're all sitting there now in a totally different mode.
20:26They're sitting there not knowing where that place is going to be, but they can't sit doing nothing.
20:32So, somebody said to me the other day, you know, we used to be pathfinders and now we're wayfinders.
20:39We have to figure out generally how to head in the right direction and we'll figure out then where next
20:45to go because a lot of us are, and I think the LLM vendors are no different, we're not clear
20:52where it all ends up.
20:53So, the answer to your question generally is that the job of work is for folks to appreciate that they
21:03can do things now and they should do things now strategically to make sure that they're safeguarded and they're safeguarding
21:11their customers for the future.
21:13Right.
21:13Is Charlie open source?
21:15Yeah, well, excuse me, the code underneath, all open source.
21:19The way we obfuscate, the way we strip PII, the protocols, all of it open source.
21:25What we've configured with Charlie, we've made closed source only because, in our experience, you can move a damn sight faster in
21:34certain regards.
21:35If you generate stuff, closed source, get early adoption and then open it up.
21:39Wait, which parts are open, which parts are closed?
21:41Sorry, which parts of Charlie are open and which parts are closed?
21:45It's all available open source.
21:47We just implemented it using our resources.
21:52We have a closed source Solid server called Enterprise Solid Server.
21:56Okay.
21:57That's closed source.
21:59Okay.
21:59So if we put together a Charlie implementation with all the protocols, and so the ESS, the Enterprise Solid Server,
22:10implements the Solid protocol, so the protocol is open.
22:14Let me ask one last question as we kind of run out of time about the obfuscation of data because
22:18I think it's so interesting the way you do it.
22:19So one of the things about data is that if you get 10 pieces of blurry data about me, you
22:26can fold it back and figure out who I am, right?
22:28You get one piece of clear data, you can figure out who I am, and 10 pieces of blurry data,
22:32right?
22:32So how do you make sure that over time the LLMs, which are very smart and getting much smarter, are
22:39not able to piece together the data that you've blurred to figure out all the information you were trying to
22:44keep from them?
22:44Oh, there's no absolutes.
22:46I mean, you know, I don't think we can guarantee that they couldn't figure it out.
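The concern in the question can be shown numerically. If every query received a fresh, independent jitter, averaging many blurred values would converge on the true one. One possible mitigation, which is an assumption of this sketch and not something the speakers describe, is a "sticky" jitter derived from a secret key, so that repeated queries always reveal the same blurred value. The function names and the HMAC-based construction are illustrative only.

```python
import hashlib
import hmac
import random
import statistics


def fresh_jitter(value, fraction, rng):
    """Independent noise on every call: averages out over many observations."""
    return value * (1 + rng.uniform(-fraction, fraction))


def sticky_jitter(value, fraction, secret, field_name):
    """Deterministic noise per field, derived from a secret key.
    Repeated calls return the same blurred value, so averaging gains nothing."""
    digest = hmac.new(secret, field_name.encode("utf-8"), hashlib.sha256).digest()
    unit = int.from_bytes(digest[:8], "big") / 2**64  # pseudo-random in [0, 1]
    return value * (1 + (2 * unit - 1) * fraction)


def averaged_estimate(observations):
    """What an observer recovers by averaging everything it has seen."""
    return statistics.mean(observations)
```

Averaging ten thousand `fresh_jitter` observations of a credit score of 700 lands very close to 700, whereas ten thousand `sticky_jitter` observations are all identical and reveal nothing beyond the first. Sticky jitter does not stop an observer from combining different blurred fields, so it narrows the problem without closing it, consistent with the answer that there are no absolutes.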
22:51Yeah.
22:52But what we can assure anybody, with a high degree of confidence, is that using Charlie gives you a chance, if you
23:01will, to see what next.
23:04And in terms of the what next, you know, these are businesses, so they go for ROI.
23:09And if there's easy things to do and difficult things to do, they'll do the easy ones first.
23:13Right.
23:14And I believe that if they can get an approximation of you, that may be just enough to satisfy their
23:23needs for general learning.
23:26And then for the kind of things they want to do, which is largely get access to that multi-trillion
23:31dollar, you know, advertising and transaction market out there,
23:35then the kind of things they can do, an approximation is sort of kind of okay.
23:39Do you, last question, do you think the big large language models, even if you say, hey, don't train on
23:44my data, even if you have a corporate account,
23:47do you think they're storing all that and that they'll use it in the future, or do you think they're
23:49actually deleting it or not?
23:51We noticed that Anthropic just changed their privacy policies with Fable.
23:54They're going to store all your corporate data for 30 days.
23:56Do you think they're storing everything on a long-term basis and that'll come back to haunt us?
23:59Uh, I couldn't say they're not.
24:04Tim?
24:07So, when, uh, are the companies lying about their privacy policies?
24:13Yeah.
24:15Well, time will tell, but some of them have already been exposed as doing that.
24:20Yeah.
24:21All right.
24:21Well, great.
24:22Everybody, delete your data.
24:24Use Charlie.
24:25Thank you so much for joining on stage.
24:27And thank you for building the web.
24:30You're welcome.