The Inside Story - "Giving Voice to A.I." Episode 85 TRANSCRIPT




Transcript:
The Inside Story: Giving Voice to A.I.
Episode 85 – March 30, 2023

Show Open:

Unidentified Narrator:

This week on the Inside Story: Giving Voice to A.I.

From writing school papers to diagnosing mental health issues, artificial intelligence has come a long way.

But the growing list of technology companies eager to profit has some experts fearing what comes next.

Plus, invasive species in Florida wreak havoc on local ecosystems. Does America’s “sunshine state” stand a chance?

Now… on The Inside Story: Giving Voice to A.I.

The Inside Story:


TINA TRINH, VOA Technology Reporter:

Hi, I’m Tina Trinh in New York.

Artificial intelligence and machine learning are a part of our lives whether we’re comfortable with it or not.

It’s not just Chat G-P-T writing your kids’ school papers, it’s the voice recognition in Siri, the giant robots in manufacturing plants, and the annoying chatbots that pop up on our screens and try to talk to us.

These technologies are supposed to make our lives simpler and increasingly, they're having more of a say in our day-to-day reality.

As we’ll see in the next half hour, A-I can give voice to the voiceless and diagnose illness.

But it also has the potential to be misused by individuals, corporations, even nations.

Every person’s voice is as unique as a fingerprint. But what if you no longer sounded like you? Or couldn’t talk at all?

Using text-to-speech technology, artificial intelligence can help, but so far, the results have been mixed - with those who’ve lost their voices sounding more like robots than humans.

But emerging technologies are increasingly showing promise -- letting those who’ve lost their voices sound more like themselves.



Bradley Heaven, Assistive Technology Enthusiast:

It’s amazing, I love it.

TINA TRINH:

When Bradley Heaven talks, it sounds like this,


Bradley Heaven, Assistive Technology Enthusiast:

Kind of, ha ha.


TINA TRINH:

Heaven has cerebral palsy and is nonverbal. He uses an eye-tracking device to communicate his thoughts. But until recently, the digital voice that spoke for him sounded pretty … digital.

But a company called Acapela Group has come up with a way to turn his voice from this…

Bradley Heaven, Assistive Technology Enthusiast:

It’s really cool and unique.


TINA TRINH:

…into this…

Bradley Heaven, Assistive Technology Enthusiast:

It’s really cool.

TINA TRINH:

The Belgium-based company creates personalized digital voices for those experiencing speech impairment or voice loss.

For those affected, it’s a more human way to express themselves.


Remy Cadic, Acapela Group CEO:

The key point here is the identity of the person. We want to help the person to be himself or herself in daily life.


TINA TRINH:

Acapela Group’s “My Own Voice” is a voice-banking service that allows a customer to “bank” or save their voice for future use.

Users record themselves reading 50 different sentences or, if they can’t speak, enlist a friend or family member to be their voice. The technology uses algorithms and artificial intelligence …

Remy Cadic, Acapela Group CEO:

So that just based on 50 sentences, then we can create a new synthetic voice.

TINA TRINH:

The resulting synthetic voice is more personalized and can be used with various assistive devices.

Daniel O’Connor, Heaven’s aide, friend and business partner, recorded his voice for Heaven to use. He says the process was easy.


Daniel O’Connor, Assistive Technology Enthusiast:

The first thing Brad said to me is, ‘Hey, let’s order some coffee' and 'Don’t worry, it’s my treat.’ And he used my voice to say that.



TINA TRINH:

Experts say using this technology could be meaningful for users.

John Costello, Speech Language Pathologist:

There are individuals whose speech is beginning to change, and by their own admission, it may not be the way they used to sound, but to them, it is who they are.


TINA TRINH:

Banking a voice to strengthen one's sense of identity.

Some groundbreaking research is being developed in the medical field connecting your voice to your health.

A U.S. government-funded project is collecting thousands of voice samples and using artificial intelligence to diagnose mental and physical illness.

VOA’s Julie Taboh has this report from Washington, D.C.


JULIE TABOH, VOA Correspondent:

Dr. Yael Bensoussan examines the vocal cords of a patient.

At the University of South Florida Health Voice Center, she treats patients with a range of upper airway, voice and swallowing disorders.

And lately, she’s been helping to lead a new project to build a database of 30,000 human voice recordings and train computers to detect diseases through changes in the human voice.


Dr. Yael Bensoussan, Voice Specialist:

Not only to build that data, but also to develop the guidelines on how to share that data, how to collect that data, and also how to use that data for future AI [artificial intelligence] research.


JULIE TABOH:

She works with a team of 45 investigators across 12 different universities in North America as well as a startup in Europe.

They study voice samples to help them detect illnesses like Parkinson’s disease…

Audio of Glottic cancer voice demo:

Glottic cancer.



JULIE TABOH:

cancer…


Audio from Vocal fold paralysis demo:

Vocal Fold Paralysis.


JULIE TABOH:

And voice disorders such as vocal fold paralysis.

The team also studies mood disorders such as depression and anxiety.



Dr. Yael Bensoussan, Voice Specialist:

So when somebody is depressed, sad, has anxiety, of course their speech changes.


JULIE TABOH:

The study is one of four data-generation projects funded by the National Institutes of Health's Bridge to Artificial Intelligence program, designed to use AI to tackle complex biomedical challenges.


Dr. Yael Bensoussan, Voice Specialist:

They realized that there was such a big gap between the technology that we had available, and the clinical knowledge, and what we use in clinical care in our hospitals.


JULIE TABOH:

And doing it while maintaining participants’ privacy.


Grace Peng, National Institutes of Health:

We want to think about the ethics associated with collecting people's voices. And how do we keep it private?


JULIE TABOH:

The study will start enrolling participants in the coming year.

Julie Taboh, VOA News, Washington.

TINA TRINH:

There are a growing number of startups trying to capitalize on the newest artificial intelligence technology that can replicate human voices.

WellSaid Labs, based in Seattle, Washington, is one group that's creating synthetic voices for companies to use for advertising, marketing, and training.

VOA’s Phil Dierking has the story on the pros and cons of "Generative AI".

Audio of synthetic voice:

With AI voices delivering increasingly more human performances, content creators …

PHILIP DIERKING, Reporting for VOA:

This voice is synthetic, the product of a human actor’s voice and technology.


Matt Hocking, WellSaid Labs CEO:

We can take a smaller dataset and train on the pitch, pausing, intonation, emphasis and more of the stylistic qualities of the voice to really capture that authentic voice.


PHILIP DIERKING:

Matt Hocking is the CEO of WellSaid Labs, a Seattle technology firm that spun out of the Allen Institute for Artificial Intelligence, a research nonprofit. He is working to create computerized voices to sound more human.

Customers use these voices in advertising, marketing and training courses without having to hire voice actors.

A voice actor records at WellSaid Labs. From that recording, the firm uses artificial intelligence to teach software subtle things: how long a voice actor pauses, when they take a breath, what words are emphasized. A customer picks among voices and enters text they want the voice to say.

Audio of synthetic voice:

I’m Wade, and I’ve been told that my good natured and honest voice is great for all types of e-learning and training content.


PHILIP DIERKING:

With A.I. moving into many fields, creative professionals are asking how they will be affected.


Gabby Fernandes, WellSaid Labs Voice Actor:

As long as there's still like a human at the center of it and we're building off of that, I feel more comfortable doing something that might be a little bit more computer generated.

PHILIP DIERKING:

Another concern: Will jobs be eliminated? Maybe, say observers, but generative AI like WellSaid Labs' might also increase creative output and improve performance.


Vu Ha, AI2 Incubator:

It’s going to be a kind of collaboration between humans and AI to really streamline and speed up the process of creation.


PHILIP DIERKING:
It may be too early to know the consequences of generative AI like WellSaid Labs', but it’s likely here to stay.

Phil Dierking, for VOA News, Seattle, Washington.

TINA TRINH:

Now, we’ve talked about some of the new worlds made possible by artificial intelligence and voice synthesis.

But for even casual fans of science-fiction, A-I can conjure images of a bleak reality where machines outsmart humans, and no technology is to be trusted.

I spoke with Nithya Thadani, CEO of Rain, an agency that works with companies to create voice and conversational AI experiences.

Here’s what she had to say about the responsible way forward with this powerful technology.

Is there a way to moderate this type of content that’s gonna be generated? Especially when you think about how easy it is-- how difficult rather, it is to tell the difference between what's real and something that's actually authentic?


Nithya Thadani, CEO of Rain:
Absolutely. That regulation, this type of thinking has to come really quickly because of the pace with which this AI is accelerating. So if you think about this in the context of OpenAI’s ChatGPT, which is a very similar concept, that AI is generating content based on what it is reading and training on from the internet. And if any of that content is biased or fake, it's going to pull that and potentially kind of ingrain that in the next generation of content. Now, imagine that happening over again. You can see how it would become very hard to detect where the original content came from, and kind of what is real versus computer generated. Voice has an even lower barrier, and so it's really important that we think about how we can get ahead of that. That has to come in the form of government regulation.


TINA TRINH:

Maybe that’s what needs to happen, some type of system for flagging this type of content. Your company mentioned this idea of verified voices. Can you explain that?

Nithya Thadani, CEO of Rain:
You know, when we think about labeling and regulation, there are three concepts that are very important here. One is consent. So, do we have the consent of the celebrity or politician or even patient to use their voice in the way that we're talking about? Even when we do have that consent, how do we make sure that the voice is used within the barriers of consent? So within the context that they… and that is potentially going to require some type of legal involvement, right? Like, that's going to go into the space of what does legal consent to use your voice look like? Another bucket here is disclosure, and that is the big one, right? So this is how do you make somebody aware that the content you're putting forth is computer generated. This is going to come in the form of tagging and flagging that content, and really thinking about how you can tag metadata.
You know, it's gonna require a new vocabulary entirely about how you think about disclaimers and labels on technology. The last bucket here is really detection. So once VAI is really actually out there, how to test what is computer generated and what is not. The tech companies are gonna have to play a big role in this as well. But once you do detect false or misinformation, how do you then put a system in place to be able to remove that content? And who owns the content that you’re removing? So a lot of these are really frameworks for how you think about this challenge.


TINA TRINH:

And then on the flip side of that, we have creators who are making artistic choices, willful artistic choices to use synthetic voices. You know, a couple of years ago, this documentary called “Roadrunner,” about Anthony Bourdain’s life, made headlines because it was determined that the director used a synthesized voice to pass for the late chef. It read a few lines that Bourdain never actually spoke when he was alive. Let’s take a look.

Audio from Anthony Bourdain documentary clip:
He was definitely searching for something.
You were successful. And I am successful.
And I’m wondering: are you happy?
(End of clip)

TINA TRINH:

So Nithya, a lot of critique and discussion coming out of that, once people found out that that wasn't actually Bourdain talking and the directors never disclosed this to the viewers.

Nithya Thadani, CEO of Rain:
What’s so interesting here is just: is this ethical? Is it ethical to use his voice in the way the director did? He took his own creative liberties to bring Anthony Bourdain’s voice to life. Would Anthony Bourdain, who is such an authentic and creative person, would he have consented to this? And would he have agreed with what was written, what was said out loud? You really hit the nail on the head though with the disclosure piece. That is actually what caused so much controversy, because viewers weren't made aware that Anthony Bourdain did not say these words. So it felt very deceptive. There is something very authentic and intimate about our voices that are ours. And people really have a visceral reaction when it seems like you are using someone's voice in a way that is not genuine and authentic.

TINA TRINH:

What advice do you have for your clients who are considering exploring this new territory?

Nithya Thadani, CEO of Rain:
When we're advising any company or brands on how to think about voice content, it is very similar to any way you would think about branding. Companies spend a lot of time just thinking about the colors of their brand, about the jingle or the sounds of their brand. The voices that represent the brand are no different. So whether that is a celebrity that is aligned to brand values or a voice that you're generating from scratch, that really needs to be thoughtful and take into account the audience that you're trying to serve. And actually this is a really interesting place where the topic of bias and inclusivity comes into play. And that's another place where we're always advising our clients, which is, you know, voice in any AI is only as smart as the data, so if we are going to create voices that are supposed to be representative of an audience, we need to make sure that that training data is representative of all voices, of minorities, of women. If you have training data, and voice training data that is primarily Caucasian male, that's the type of payment you're gonna get. And so it's really important that companies are just eyes wide open to the fact that there needs to be there and one of the ways to do that to make sure that we have more diversity and equity in the people who are creating that data, right, that they can make sure that data sets are equitable to all.

TINA TRINH:

As we've seen, filmmakers are finding ways to use artificial intelligence to resuscitate long gone voices, but the technology can also make motion pictures more accessible to wider audiences.

MATT DIBBLE, VOA Correspondent:

How to make a movie work in another language? For decades there have been only two options: subtitles or dubbing. Many viewers resist watching films with subtitles, which can distract from the action.

And dubbing, which replaces a film’s dialogue,


Audio from the film: “Hercules and the Tyrants of Babylon”:

This is the first time I’ve ever heard the word ‘coward’ applied to the valiant warrior Bomia.


MATT DIBBLE:

…usually results in mouths out of sync with the words they are supposedly speaking.

When director Scott Mann first saw a foreign language dub of a film he made...


Scott Mann, Flawless:

I was kind of appalled and devastated because I saw how different it was.


MATT DIBBLE:

This got him thinking that there might be a technological solution.

He co-founded Flawless, a company that uses a kind of visual artificial intelligence to digitally modify the faces in a film to match the new words.


Scott Mann, Flawless:

The system is taking a very detailed look and an understanding, of how a certain character talks. The system is able to kind of re-time those mouth movements like subtly alter them so they fit the new dialogue.

Audio from film: “A Few Good Men”:

C’est amusant monsieur?

MATT DIBBLE:

Flawless uses voice actors who are fluent speakers of the new language.
Another company, Deepdub, uses AI to simulate the original actor’s voice in the translation.

Audio from film: “Every Time I Die”:

It’s not something that happens to me.

Oz Krakowski, Deepdub:

We need only a sample of 2 to 5 minutes of the actors’ voices in order to create what we call a voice model.



MATT DIBBLE:

The new technologies promise to accelerate a growing appetite for content across language borders.

But in the process, the line between humans and machines continues to blur.

Matt Dibble for VOA News, San Francisco.

TINA TRINH:


It’s not just humans who can use new tech to find their own voice.

“FluentPet” is a tool that offers an innovative way for pet owners to communicate and connect with their furry loved ones.

By using a recordable push button, pets can now express their wants, needs, and desires - effectively "speaking" to their humans.

This time VOA’s Julie Taboh travels to Glen Allen, in the state of Virginia.

Morgan Krug, Dog Trainer:

Go! Let’s go play! Let’s go play! Go girl!



JULIE TABOH, VOA Correspondent:

Morgan Krug knows how to communicate with her pets...

Morgan Krug, Dog Trainer:

Drop, good job, beep! Catch!



JULIE TABOH:

And like most animal owners, usually knows when they want something…

Morgan Krug, Dog Trainer:

More pets!



JULIE TABOH:

But now her daily interactions are even stronger, she says, thanks to a set of buttons with pre-recorded words that her pets can push to express themselves.


Synthetic Audio:

Wet food!


Morgan Krug, Dog Trainer:

I started off as a skeptic. I knew that they could request ‘outside,’ or specific toys.

Synthetic Audio:

Frisbee play!



Morgan Krug, Dog Trainer:

But this allows them to be specific about what they want.


JULIE TABOH:

A dog trainer by trade, Krug is using a system called FluentPet, where she can use her own voice to record words...

Synthetic Audio:

Ball!

JULIE TABOH:

This has been especially helpful for her cat Jasper, who is blind.


Synthetic Audio:

Kibble.



Morgan Krug, Dog Trainer:

Having the buttons and a voice when she doesn't have her eyesight to navigate by has really enriched her life in ways that I could have never anticipated.


Leo Trottier, FluentPet Founder:

So thanks to buttons, we're seeing people report behaviors and interactions with their dog or their cat that to me as someone who has a background in cognitive science, I find totally astonishing.


JULIE TABOH:

The company has also developed a FluentPet app that owners can use to collect data on their pets and keep track of what their dog or their cat has learned over time.


Synthetic Audio:

Pet. Kibble. Training.



JULIE TABOH:

One thing is clear: Systems like this hold a lot of promise for future advancements in human-animal interactions.



Synthetic Audio:

What does Adora want?

Outside.

Okay let’s go!



JULIE TABOH:

Julie Taboh, VOA News, Glen Allen, Virginia.

TINA TRINH:

Millions flock to Florida for its warm weather and sunshine.

But it’s not just people who have found a new home in this warm Southern U.S. state. VOA’s Dora Mekouar has this story from Orlando.

Kylie Reynolds, Amazing Animals Deputy Director:

This is an African Spurred also known as a Sulcata.


DORA MEKOUAR, VOA Correspondent:

This African spurred tortoise looks harmless, but it’s not. It competes for resources with the native gopher tortoise, which is critical to Florida’s ecosystem.


Kylie Reynolds, Amazing Animals Deputy Director:

They dig these big burrows underground, and a lot of other animals are going to use those for shelter as well. If gopher tortoises are declining, and they're providing homes for hundreds of other animals, then what's going to happen to all these other animals?

DORA MEKOUAR:

There are hundreds of non-native animal species in Florida. Some take over their new habitat, threatening the environment.


Kurt Foote, National Park Service:

Nature relies on variety, and when you don't have variety, it's just more susceptible to collapse.

DORA MEKOUAR:

Florida spends more than 500 million dollars a year trying to contain invasive species. But is it a losing battle?

Mike Hileman, Gatorland Park Director:

Once a species starts reproducing in the wild, and they have a system that works for them, it's almost next to impossible to eradicate them.

DORA MEKOUAR:

Like the Burmese python. Thirty years ago, many escaped a local breeding facility destroyed by a hurricane. Today, there are tens of thousands of pythons in Florida.

Mike Hileman, Gatorland Park Director:

It can take out one of our apex predators, which are alligators and crocodiles, and then it'll take down some of the other native animals that are small mammals, some of the rats, the mice, the marsh bunnies, things that are supposed to be food for other things. So they compete with our native animals and, because they're a more dominant species, they win that battle.


Kylie Reynolds, Amazing Animals Deputy Director:

What does the rooster say?


Parrot:

Cock-a-doodle-do.


Kylie Reynolds, Amazing Animals Deputy Director:

Ohhh that was a good one!

DORA MEKOUAR:

Many invasives were once someone’s pet.

Kylie Reynolds, Amazing Animals Deputy Director:

Animals can be a lot of work. You know, the parrots are loud, they live almost 80 years. Tortoises can get really big and live 100 years. So it's a big commitment to have a lot of these animals. People sometimes just go, ‘You know, it's nice in Florida. We'll just let it loose.’


DORA MEKOUAR:

Feral hogs — introduced by Spanish explorers centuries ago — are everywhere in Florida. They dig up the soil, sometimes destroying restored native plants.


Ben Gugliotti, Lake Apopka North Shore Land Manager:

And then also they actually create a secondary opening for invasive plant species then to move into those disturbed soil areas. So, not only are they destroying what we're doing, but also creating an opening for additional invasive species.

DORA MEKOUAR:

The Tiger Creek Preserve is home to one of the most diverse ecosystems in the world, says preserve manager Cheryl Millett.

Cheryl Millett, The Nature Conservancy:

It's a biodiversity hotspot. And yeah, we have a lot of things that… don't exist anywhere else. And if we lost them here, we wouldn't have them anymore on Earth.


DORA MEKOUAR:

It’s easy to spot the damage done by wild hogs, but Millett is most concerned about encroaching non-native lizards.

Cheryl Millett, The Nature Conservancy:

They've been found using gopher tortoise burrows in South Florida.
They found baby gopher tortoises in the guts of Tegu lizards.
I'm really worried about their potential impact here.



DORA MEKOUAR:
Florida officials use hunts, monitoring, exotic pet amnesty programs and other methods to combat invasive species. But it’s a battle with no end in sight.

Dora Mekouar, VOA News, Orlando, Florida.

TINA TRINH:

That’s all for now. Stay up to date with all the news at VOANews.com.

Follow us on Instagram and Facebook at VOA News.

You can find me on Twitter at Tina Trinh NYC.

And catch up on past episodes at our free streaming service, VOA Plus.

For all of those behind the scenes who brought you today’s show, I’m Tina Trinh. We’ll see you next week for The Inside Story.

###
