00:00Hello, welcome. Today, all the red carnations are 50% off.
00:08Apple recently revealed its latest lineup of products, and one of the most talked about is the live translation feature on the AirPods Pro 3.
00:17But have you ever thought about the technology that goes into something like this, and how reliable it will or will not be?
00:24This is a space where there's just been continuous progress over the last, you know, decades.
00:31I mean, I've been working on it for 25 years by now.
00:34Philipp Koehn is a professor at Johns Hopkins Whiting School of Engineering.
00:38His expertise is in machine translation, converting text or speech from one language to another via computer,
00:45which is exactly what's needed for the new AirPods live translation.
00:49The technology behind it works in three steps. AirPods listen and turn spoken words into text.
00:55Apple AI or Apple Intelligence then translates the text.
00:59And finally, the translation is spoken back into your AirPods using speech synthesis.
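[Editor's note: the three steps described above can be sketched in code. This is only an illustrative outline under stated assumptions, not Apple's implementation: the class name `TranslationPipeline`, the three stage functions, and the one-entry phrase table (including its Spanish rendering) are all hypothetical stand-ins, and the "audio" is just UTF-8 bytes so the example runs without any speech libraries.]

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures; none of these are Apple APIs.
Recognizer = Callable[[bytes], str]    # step 1: speech -> text
Translator = Callable[[str], str]      # step 2: text -> translated text
Synthesizer = Callable[[str], bytes]   # step 3: translated text -> speech


@dataclass
class TranslationPipeline:
    """Chains the three stages in the order the transcript describes."""
    recognize: Recognizer
    translate: Translator
    synthesize: Synthesizer

    def run(self, audio: bytes) -> bytes:
        text = self.recognize(audio)        # listen and turn speech into text
        translated = self.translate(text)   # translate the text
        return self.synthesize(translated)  # speak the translation back


# Toy stand-ins so the sketch is runnable (assumptions, not real models).
def fake_recognize(audio: bytes) -> str:
    # Pretend the audio bytes are already the spoken words.
    return audio.decode("utf-8")


# Illustrative phrase table; a real system would use a trained model.
PHRASES = {"I agree, yes.": "Estoy de acuerdo, sí."}


def fake_translate(text: str) -> str:
    # Unknown input passes through unchanged.
    return PHRASES.get(text, text)


def fake_synthesize(text: str) -> bytes:
    # Pretend encoding the text is the same as producing audio.
    return text.encode("utf-8")


pipeline = TranslationPipeline(fake_recognize, fake_translate, fake_synthesize)
```

[Calling `pipeline.run("I agree, yes.".encode("utf-8"))` passes the input through all three stages in sequence. Because each stage waits for the previous one to finish, any delay in one step adds to the total.]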
01:04I agree, yes.
01:08Let's include the key findings in Friday's presentation.
01:12Seems simple enough, right? But as Professor Koehn points out, speech is messy.
01:17People talk over each other, speak in noisy environments, or use slang these models aren't necessarily trained for.
01:25And that's before you factor in idioms, tone or cultural nuance.
01:30Like if I kind of slightly emphasize a word or hesitate somewhere a little bit, all that means something.
01:36It means maybe something about how certain I am about it or how convinced I am.
01:41Or maybe I'm fishing for confirmation about what I'm saying.
01:45So all these subtle clues, yeah, currently still get lost.
01:52The technology handles clean, standard speech well, but strong dialects can make things difficult.
01:58If you go into some village in Scotland, the technology might not work anymore because they speak very, very strong dialects that the systems might not be trained on.
02:09One surprising area where it does better than expected: idioms like "it's raining cats and dogs" or "break a leg."
02:16There's actually a lot of translated, especially translated text out there.
02:21So it's in the order of billions of words.
02:24So you will see all these common phrases many, many times in the training data.
02:28But even if the translation is accurate, there's always a delay, two or three seconds.
02:34That makes conversation less natural.
02:36Koehn says the technology is best seen as a supplement, not a substitute.
02:41Right now, Apple's live translation only works with a handful of major European languages.
02:46When you go beyond the top 100 languages, there's far less training data.
02:51So while AirPods may help you order food or ask for directions abroad, they cannot replace the human connection of greeting someone in their own language.
03:00In fact, as Professor Koehn put it, this technology can sometimes be an obstacle if you're really trying to connect with someone who speaks another language.