
Intro

When comparing three or more speech synthesis models, a ranking evaluation is an effective method for determining the relative quality of each model. Rather than comparing pairs individually, evaluators listen to a set of audio samples generated from the same script and rank them from best to worst. The Ranking evaluation is flexible in its evaluation criteria. You can rank models based on naturalness, overall preference, clarity, expressiveness, or any other quality dimension that matters to your use case.
  • Objective: Determine the relative ordering of multiple models by having evaluators rank them.
  • Use Case: Ideal for comparing TTS providers, model versions, or synthesis configurations side by side.
  • Type: RANKING in the SDK.
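To make the idea concrete, here is a minimal, library-independent sketch of how a batch of per-evaluator rankings can be reduced to an overall ordering by mean rank. The rankings below are made-up illustration data, and mean rank is just one common aggregation, not necessarily the exact method podonos uses:

```python
from collections import defaultdict

# Illustration data: each evaluator returns a best-to-worst ordering of model tags.
rankings = [
    ['Provider B', 'Provider A', 'Provider C'],
    ['Provider B', 'Provider C', 'Provider A'],
    ['Provider A', 'Provider B', 'Provider C'],
]

# Sum each model's rank position (1 = best) across evaluators.
totals = defaultdict(int)
for ranking in rankings:
    for position, model in enumerate(ranking, start=1):
        totals[model] += position

# Lower mean rank is better.
mean_rank = {model: total / len(rankings) for model, total in totals.items()}
ordering = sorted(mean_rank, key=mean_rank.get)
print(ordering)  # best to worst by mean rank
```

With this data, Provider B wins (mean rank 1.33), followed by Provider A (2.0) and Provider C (2.67).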

Example

In this example, we compare three different TTS providers by generating speech from the same scripts and submitting the audio for a ranking evaluation. Here is a complete code example you can run right away:
python
import podonos
from podonos import File

# Placeholder modules standing in for your TTS providers' SDKs
import provider_a, provider_b, provider_c

client = podonos.init()
etor = client.create_evaluator(
    name='TTS Provider Ranking',
    desc='Ranking evaluation across TTS providers',
    type='RANKING',
    lan='en-us',
    num_eval=10,
)

scripts = [
    'But in less than five minutes',
    'The two doctors therefore entered the room alone',
]

for i, script in enumerate(scripts):
    # Use a distinct output file per script so earlier files are not overwritten.
    path_a = provider_a.synthesize(text=script, output=f'provider_a_{i}.wav')
    path_b = provider_b.synthesize(text=script, output=f'provider_b_{i}.wav')
    path_c = provider_c.synthesize(text=script, output=f'provider_c_{i}.wav')

    etor.add_ranking_set([
        File(path=path_a, model_tag='Provider A', tags=['tts']),
        File(path=path_b, model_tag='Provider B', tags=['tts']),
        File(path=path_c, model_tag='Provider C', tags=['tts']),
    ])

etor.close()
Now, let’s walk through the example step by step.
1. Create a Client

Let’s first create a new instance of Client.
python
client = podonos.init()
2. Create an Evaluator

Then, you create a new instance of Evaluator with type='RANKING':
python
etor = client.create_evaluator(
    name='TTS Provider Ranking',
    desc='Ranking evaluation across TTS providers',
    type='RANKING',
    lan='en-us',
    num_eval=10,
)
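As a quick sanity check on scale, assuming num_eval is the number of evaluators who rank each set (our reading of this parameter, worth confirming against the SDK reference), the total number of ranking judgments grows as scripts × evaluators:

```python
num_scripts = 2      # scripts in this example
num_eval = 10        # evaluators per ranking set (assumed semantics)
models_per_set = 3   # one audio file per provider

total_rankings = num_scripts * num_eval          # rankings collected
total_listens = total_rankings * models_per_set  # audio samples auditioned
print(total_rankings, total_listens)
```

For this example, that is 20 rankings covering 60 audio auditions, a useful figure when budgeting evaluation time and cost.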
3. Generate speech and add ranking sets

For each script, generate speech from all providers and add a ranking set. Each ranking set contains one audio file per provider.
python
etor.add_ranking_set([
    File(path=path_a, model_tag='Provider A', tags=['tts']),
    File(path=path_b, model_tag='Provider B', tags=['tts']),
    File(path=path_c, model_tag='Provider C', tags=['tts']),
])
4. Close

Finally, close the Evaluator object.
python
etor.close()
With this, you can collect rankings of multiple TTS providers from real human evaluators via podonos. Once these steps finish, you can check the results in your Workspace.
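Beyond an overall ordering, the same rankings also imply head-to-head comparisons: every model in a ranking beats every model placed below it. Here is a hedged sketch of deriving pairwise win rates, using made-up illustration data rather than the Workspace's actual report format:

```python
from itertools import combinations
from collections import Counter

# Illustration data: best-to-worst orderings from three evaluators.
rankings = [
    ['Provider B', 'Provider A', 'Provider C'],
    ['Provider B', 'Provider C', 'Provider A'],
    ['Provider A', 'Provider B', 'Provider C'],
]

wins = Counter()
for ranking in rankings:
    # Each model beats every model ranked below it in this ordering.
    for i, j in combinations(range(len(ranking)), 2):
        wins[(ranking[i], ranking[j])] += 1

# Fraction of rankings in which the first model placed above the second.
n = len(rankings)
win_rate = {pair: count / n for pair, count in wins.items()}
print(win_rate[('Provider B', 'Provider A')])  # B above A in 2 of 3 rankings
```

Pairwise win rates like these can reveal, for instance, that a model which places second overall still beats the winner on a subset of scripts.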

Use Case

Consider a scenario where you are evaluating multiple TTS providers to decide which one to integrate into your product. Each provider may have different strengths: one might excel at naturalness while another handles proper nouns better. Using the Ranking evaluation, you can have human evaluators directly compare all providers on the same scripts and produce a clear ordering, giving you confidence in your selection.