AI voice detection and recognition are becoming more crucial

This Twitter thread shows how far artificial voices have come. For those familiar with Steve Jobs’ voice, the voice in these recordings is almost indistinguishable from the original. When you listen to them, you can be forgiven for thinking that it’s actually Steve Jobs saying these words, never mind that he’s been gone for more than a decade.

The only catch is that because the training set must have been drawn from the many recordings of his Apple keynote speeches and product announcements, the clips all sound like he’s reading from a script or making an announcement. None of the sentences sound the way someone would speak in a regular conversation or while answering questions, but that’s not too difficult to overcome. The tools to adjust AI-generated voices to sound more natural already exist.

Here’s another example. The YouTube channel Star Wars Comics has started experimenting with generated voices to narrate storylines from the Star Wars comic books, keeping their audience up to date with what’s happening in the comics. In one video, they used James Earl Jones’ Darth Vader voice to read the lines from the pages of the comic book. Their latest video voiced a conversation between Emperor Palpatine and Darth Vader from another issue in the recent Darth Vader comic book series, both using voices generated from their real actors.

As many in the comments noted, while the voices sound indistinguishable from the originals, the speech patterns make it obvious that these were generated. That’s because the voices weren’t adjusted to the way a person would actually speak in that kind of conversation. Again, these are relatively trivial changes one could make with AI voice generators.

While these may be little more than fun projects for curious minds, the day when someone can create entirely fabricated recordings to manipulate the public is already here. You can already create fake videos of a person saying things they never actually said; now the voices sound even closer to the original.

When deepfake videos started popping up in 2020, people knew this was going to be a significant problem. People are already easily fooled by fabricated articles and stories, and this is only going to make it far more challenging to fact-check and verify the validity of recordings.

All I can say for that is, brace for impact.