The BBC is launching a new audio tool that uses artificial intelligence to read articles from its website aloud in a friendly, easy-to-understand northern British accent.
DAVID GREENE, HOST:
The BBC is one of the world’s most recognized broadcasters. Maybe you’ve heard this before.
(SOUNDBITE OF BBC BROADCAST)
UNIDENTIFIED BROADCASTER #1: Hello. And welcome to Newshour from the BBC World Service, coming to you live from our studios in central London.
GREENE: But when the BBC wanted to step up its audio game online, it wanted a more informal, friendlier voice than that. So it turned to artificial intelligence. NPR tech correspondent Shannon Bond introduces us to their newest sound.
COMPUTER-GENERATED VOICE: I’m the BBC’s synthetic voice, and I can read out articles from bbc.com.
SHANNON BOND, BYLINE: This is the first time I’m meeting this voice, which I’m sorry to say doesn’t have a name. The BBC told me I could ask it anything, so I asked it to explain how it works.
COMPUTER-GENERATED VOICE: It’s easy. I take the text that’s on the screen and read it out loud. Well, OK, maybe it’s not quite that simple. There’s a lot of tech going on in the background.
BOND: The BBC developed the voice with engineers at Microsoft using machine learning.
COMPUTER-GENERATED VOICE: They based it on many hours of human recordings and finely tuned it to create the voice you hear now. Pretty clever, eh?
BOND: A lot of publishers are trying these experiments to turn text into speech to make their websites and apps accessible for people who have a hard time seeing and to keep people engaged even if they’re too busy to sit down and read. So the BBC did a lot of research and decided to shed one of its best-known traits, the very proper accent known as the Queen’s English.
(SOUNDBITE OF BBC BROADCAST)
UNIDENTIFIED BROADCASTER #2: Here is the Air Ministry’s weather forecast for tomorrow.
BOND: Online, the BBC wanted a voice that’s more easygoing, one you could imagine having a pint with.
COMPUTER-GENERATED VOICE: I’m British, so I say tomahto (ph) while you say tomayto (ph). I’m also Northern, aka not from London, so I say Bath, while the Queen of England might say Bahth (ph).
BOND: Bath, Bahth. OK, that might not sound like a big deal to American ears, but on the other side of the pond, it really matters.
COMPUTER-GENERATED VOICE: In the U.K., Northerners are known for sounding friendly. I hope I do, too.
BOND: That tone also sets the BBC apart from the audio technology other news organizations are using.
CLAIRE: This is your Washington Post Election 2020 results update. I’m Claire, elections AI presenter for the Post.
BOND: Claire is all business, no drama. At the other end of the spectrum, there’s The New York Times.
UNIDENTIFIED VOICE ACTOR: Black theater is having a moment. Thank Tyler Perry – seriously.
BOND: The Times bought a company this year called Audm, which produces audio stories with professional voice actors. The BBC is aiming for somewhere in the middle. Here’s what its voice sounds like reading a recent story.
COMPUTER-GENERATED VOICE: But when the office temporarily closed eight months ago due to the pandemic, Domino wasn’t wistful about losing his daily dose of corporate culture, the view or free kombucha.
BOND: So now when you might think about flipping on the radio or a podcast, the BBC hopes you’ll try out a news story or feature.
COMPUTER-GENERATED VOICE: Life can get busy, so I can help by reading articles out loud, letting you get on with other things at the same time. You might need to go for a run, pick the kids up from school or make supper, which I believe Americans call dinner.
BOND: But do people really want to listen to a robotic voice, even a friendly one? I asked Nick Quah, who writes the audio industry newsletter Hot Pod. He says that depends on whether the AI can fool you into thinking it’s a real person.
NICK QUAH: Could you – (laughter) can it be an automated delivery of information that doesn’t feel mechanical, that feels vaguely believable as (laughter) a source of like, you know, that intimacy that people look for in the audio format?
BOND: And when it comes to that question, we humans still have an edge over the robots. Just ask Siri.
SIRI: Shannon Bond, NPR News.
NPR transcripts are created on a rush deadline by Verb8tm, Inc., an NPR contractor, and produced using a proprietary transcription process developed with NPR. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.