Use Cases
Voice Similarity
Evaluate how similar two voices are
Intro
Suppose you are building an AI model that can speak like Elon Musk or Taylor Swift, and you want to know how closely the generated output matches the real person's voice. Here, similarity covers tone, prosody, and word articulation.
Example
In this example, we assume you have your own AI model that generates speech resembling a given human voice. The code below pairs each real recording with its generated counterpart and submits the pairs for evaluation.
import podonos
from podonos import *

# Initialize the Podonos client.
client = podonos.init()

# Create a similarity (SMOS) evaluator.
etor = client.create_evaluator(
    name='Taylor Swift voice similarity',
    desc='How similar voice can my AI model generate to Taylor Swift?',
    type='SMOS', num_eval=10)

# Real recordings and the matching generated clips, listed in the same order.
original_speech_path = ['ts0.wav', 'ts1.wav', 'ts2.wav']
generated_speech_path = ['ts0_gen.wav', 'ts1_gen.wav', 'ts2_gen.wav']

# Add each real/generated pair so the two files are compared with each other.
for org, syn in zip(original_speech_path, generated_speech_path):
    org_file = File(path=org, model_tag='real', tags=['female'])
    syn_file = File(path=syn, model_tag='model1', tags=['female', 'Taylor Swift'])
    etor.add_files(file0=org_file, file1=syn_file)

# Close the evaluator once all pairs have been added.
etor.close()
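The evaluator type used here, SMOS, is a similarity mean opinion score: listeners rate how alike two clips sound, and the ratings are averaged. The sketch below shows one way such ratings could be summarized offline. It is an illustration only, not the Podonos aggregation method. The 1-to-5 scale, the `smos` helper, and the rating values are all assumptions made for this example.

```python
import math
import statistics


def smos(ratings):
    """Summarize similarity ratings as (mean, 95% CI half-width).

    Assumes a 1-to-5 scale and uses a normal approximation for the
    interval, which is rough when there are only a few ratings.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("each rating must be between 1 and 5")
    mean = statistics.mean(ratings)
    if len(ratings) < 2:
        # A single rating has no spread to estimate an interval from.
        return mean, 0.0
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, half_width


# Hypothetical ratings, ten per file pair (made-up values, not real results).
ratings_per_pair = {
    "ts0": [4, 5, 4, 4, 3, 5, 4, 4, 5, 4],
    "ts1": [3, 3, 4, 2, 3, 4, 3, 3, 2, 3],
    "ts2": [5, 4, 5, 5, 4, 5, 4, 5, 5, 4],
}

for pair, ratings in ratings_per_pair.items():
    mean, half_width = smos(ratings)
    print(f"{pair}: SMOS = {mean:.2f} +/- {half_width:.2f}")
```

A wide interval suggests that listeners disagreed or that there were too few ratings, so a difference between two models' scores should be read with that spread in mind.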