Reference

Podonos

This is the base module. You can import it in either of the following ways:

python
import podonos
from podonos import *

init()

Initialize the module.

api_key

string

API key you obtained from the workspace. For details, see Get API key. If this is not set, the package tries to read PODONOS_API_KEY from the environment. Throws an error if neither is available.

Returns an instance of Client.

python
client = podonos.init(api_key='<API_KEY>')
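The key lookup described above can be pictured with a minimal sketch. `resolve_api_key` is a hypothetical helper written for illustration and is not part of the podonos package. It only mirrors the documented order: the explicit argument first, then the PODONOS_API_KEY environment variable, otherwise an error. The exception type raised by the real package is not stated here, so `ValueError` is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(api_key: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical helper mirroring the documented lookup order of init()."""
    env = os.environ if env is None else env
    # 1. An explicitly passed key wins.
    if api_key:
        return api_key
    # 2. Otherwise fall back to the PODONOS_API_KEY environment variable.
    env_key = env.get("PODONOS_API_KEY")
    if env_key:
        return env_key
    # 3. Neither is available. ValueError is an assumed exception type.
    raise ValueError("Set api_key or the PODONOS_API_KEY environment variable.")


print(resolve_api_key("explicit-key", env={}))
print(resolve_api_key(env={"PODONOS_API_KEY": "key-from-env"}))
```

Passing `env` explicitly keeps the sketch easy to test without touching the real process environment.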

Client

Client manages one or more Evaluator instances and the evaluation history.

create_evaluator()

Create a new instance of Evaluator. One evaluator supports a single type of evaluation throughout its life cycle. If you want multiple types of evaluation, create multiple evaluators by calling create_evaluator() multiple times.

name

string

Name of this evaluation session. If empty, a random name is automatically generated and used.

desc

string

Description of this evaluation session. This field is purely for your record, so later you can see how you generated the output files or how you trained your model.

type

string

default:"NMOS"

Evaluation type. One of the following:

Type	Description
`NMOS`	Naturalness Mean Opinion Score
`SMOS`	Similarity Mean Opinion Score
`QMOS`	Quality Mean Opinion Score
`P808`	Speech quality by ITU-T P.808
`PREF`	Preference test between two audio/speech samples

Currently we support 5-point evaluation.

lan

string

default:"en-us"

Specific language and locale of the speech. Currently we support:

Code	Description
`en-us`	English (United States)
`en-gb`	English (United Kingdom)
`en-au`	English (Australia)
`en-ca`	English (Canada)
`ko-kr`	Korean (Korea)
`zh-cn`	Mandarin (China)
`es-es`	Spanish (Spain)
`es-mx`	Spanish (Mexico)
`fr-fr`	French (France)
`de-de`	German (Germany)
`ja-jp`	Japanese (Japan)
`it-it`	Italian (Italy)
`pl-pl`	Polish (Poland)
`audio`	General audio file

We will add more soon. Please check back later.

num_eval

int

default:"10"

Number of evaluations per sample. For example, if this is 10 for an NMOS evaluation, each audio file is assigned to 10 human evaluators, and the statistics of their ratings are computed and presented in the final report.

due_hours

string

default:"12"

Expected time, in hours, until the final report is due. Depending on the hours, the pricing may change.

use_annotation

bool

default:"False"

Set to True to request additional details about each rating. Only available for single-stimulus evaluations. A script must be provided for each file.

use_power_normalization

bool

default:"False"

Enable power normalization to ensure consistent audio volume levels during evaluation.

auto_start

bool

default:"False"

If True, the evaluation automatically starts when you finish uploading the files. If False, you go to Workspace, confirm the evaluation session, and manually start the evaluation.

max_upload_workers

int

default:"20"

Maximum number of upload worker threads. If you experience a slow upload, please increase the number of workers.

Returns an instance of Evaluator.

etor = client.create_evaluator()
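Since type and lan accept only the values listed in the tables above, it can help to check them locally before creating an evaluator. The following is a minimal sketch under stated assumptions: `evaluator_kwargs` is a hypothetical helper, not part of the SDK, and the rule that num_eval must be at least 1 is an assumption rather than documented behavior.

```python
from typing import Any, Dict

# Values copied from the type and language tables in this section.
EVALUATION_TYPES = {"NMOS", "SMOS", "QMOS", "P808", "PREF"}
LANGUAGES = {
    "en-us", "en-gb", "en-au", "en-ca", "ko-kr", "zh-cn", "es-es",
    "es-mx", "fr-fr", "de-de", "ja-jp", "it-it", "pl-pl", "audio",
}


def evaluator_kwargs(name: str = "", desc: str = "", eval_type: str = "NMOS",
                     lan: str = "en-us", num_eval: int = 10) -> Dict[str, Any]:
    """Hypothetical helper: validate arguments before calling create_evaluator()."""
    if eval_type not in EVALUATION_TYPES:
        raise ValueError(f"Unsupported evaluation type: {eval_type!r}")
    if lan not in LANGUAGES:
        raise ValueError(f"Unsupported language code: {lan!r}")
    # Assumption: fewer than one evaluation per sample is never intended.
    if num_eval < 1:
        raise ValueError("num_eval must be at least 1")
    return {"name": name, "desc": desc, "type": eval_type,
            "lan": lan, "num_eval": num_eval}


kwargs = evaluator_kwargs(name="tts_v2_similarity", eval_type="SMOS", lan="ko-kr")
print(kwargs)
# With a real client: etor = client.create_evaluator(**kwargs)
```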

create_evaluator_from_template()

When you create an evaluation using a template, all the questions and options defined in the template are automatically assigned to the new evaluation. This ensures consistency and saves time by reusing pre-defined content.

name

string

Name of this evaluation session. If empty, a random name is automatically generated and used.

desc

string

Description of this evaluation session. This field is purely for your record, so later you can see how you generated the output files or how you trained your model.

num_eval

int

default:"10"

Number of evaluations per sample.

use_power_normalization

bool

default:"False"

Enable power normalization to ensure consistent audio volume levels during evaluation.

template_id

string

The unique identifier of the template to base the new evaluation on. Required to specify the predefined structure and settings for the evaluation.

etor = client.create_evaluator_from_template(
  name="How natural the voice is", 
  desc="new_model_vs_competitor_model", 
  num_eval=10, 
  template_id="abcdef"
)

create_evaluator_from_template_json()

Create a new evaluation using a JSON template. This allows you to define custom evaluation structures programmatically.

json

Dict

Template JSON as a dictionary. Optional if json_file is provided.

json_file

string

Path to the JSON template file. Optional if json is provided.

name

string

Name of this evaluation session. Required.

custom_type

string

Type of evaluation. Must be either “SINGLE” or “DOUBLE”.

desc

string

Description of this evaluation session. Optional.

lan

string

default:"en-us"

Language for evaluation. See supported languages in create_evaluator().

num_eval

int

default:"10"

Number of evaluations per sample.

use_annotation

bool

default:"False"

Enable detailed annotation on the script to capture the reasoning behind each rating. Only available for single-stimulus evaluations. A script must be provided for each file.

use_power_normalization

bool

default:"False"

Enable power normalization to ensure consistent audio volume levels during evaluation.

max_upload_workers

int

default:"20"

The maximum number of upload workers. Must be a positive integer.

# Using JSON dictionary
template = {
    "query": [
        {
            "type": "SCORED",
            "question": "How natural the voice is",
            "description": "Rate the quality of the voice",
            "options": [
              {"label_text": "Excellent"},
              {"label_text": "Good"},
              {"label_text": "Fair"},
              {"label_text": "Poor"},
              {"label_text": "Bad"},
            ]
        }
    ]
}

evaluator = client.create_evaluator_from_template_json(
    json=template,
    name="Quality Test",
    custom_type="SINGLE"
)

Returns an instance of Evaluator.

Here’s the JSON template for reference:

Question: Represents the main question posed to evaluators about the audio being assessed. It guides the evaluators on what specific aspect of the audio they should focus on during the evaluation.

Parameter	Description	Required	Notes
`type`	Type of question. Options: `SCORED`, `NON_SCORED`, `COMPARISON`	Yes	Determines the structure and requirements of the question
`question`	The main question text	Yes	Must be provided for all question types
`description`	Additional details or context for the question	No	Optional for all question types
`options`	List of possible options. Only for `SCORED` and `NON_SCORED` types	Conditional	Must have between 1 and 9 options for `SCORED` and `NON_SCORED` types
`scale`	Scale for comparison. Only for `COMPARISON` type	Conditional	Must be an integer between 2 and 9 for `COMPARISON` type
`allow_multiple`	Allows multiple selections. Only for `NON_SCORED` type	No	Enables multiple choice selection
`has_other`	Includes an “Other” option. Only for `NON_SCORED` type	No	Adds an option for evaluators to specify an unlisted choice
`related_model`	Related model for the question. Only for the `DOUBLE` evaluation type.	Conditional	Selects which model the question is related to.
`anchor_label`	Labels for the ends of the comparison scale. Only for `COMPARISON` type.	Conditional	Provides context for what each end of the scale represents, enhancing evaluator understanding.

Important Notes:

SCORED and NON_SCORED questions can have a maximum of 9 options.
COMPARISON type questions must have a scale between 2 and 9.
related_model accepts ALL, MODEL_A, or MODEL_B. The default is ALL. The related_model applies only to questions (not to instructions).
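The rules in the table and notes above can be checked locally before a template is submitted. This is a minimal sketch, not the SDK's own validation: `question_problems` is a hypothetical helper, and the service may enforce additional rules that are not listed here.

```python
from typing import Any, Dict, List

QUESTION_TYPES = {"SCORED", "NON_SCORED", "COMPARISON"}
RELATED_MODELS = {"ALL", "MODEL_A", "MODEL_B"}


def question_problems(q: Dict[str, Any]) -> List[str]:
    """Return the documented rules that q violates; an empty list means none."""
    problems: List[str] = []
    q_type = q.get("type")
    if q_type not in QUESTION_TYPES:
        return [f"unknown type: {q_type!r}"]
    if not q.get("question"):
        problems.append("question text is required")
    if q_type in ("SCORED", "NON_SCORED"):
        options = q.get("options") or []
        if not 1 <= len(options) <= 9:
            problems.append("options must contain between 1 and 9 entries")
    else:  # COMPARISON
        scale = q.get("scale")
        if not isinstance(scale, int) or not 2 <= scale <= 9:
            problems.append("scale must be an integer between 2 and 9")
    if q.get("related_model", "ALL") not in RELATED_MODELS:
        problems.append("related_model must be ALL, MODEL_A, or MODEL_B")
    return problems


print(question_problems({"type": "COMPARISON", "question": "Which is clearer?", "scale": 10}))
```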

Option: Represents a possible answer or choice in a question. It provides evaluators with a range of options to choose from.

Parameter	Description	Required	Notes
`label_text`	The text displayed for the option	Yes	This is the text shown to the evaluator
`reference_file`	A reference audio file for the option. It helps the evaluator understand the quality of the audio	No	This file serves as a benchmark and is saved in the evaluation results

For options in SCORED and NON_SCORED questions:

score is automatically generated only for SCORED questions. If there are 5 options, the first option in the list receives a score of 5, the second option receives a score of 4, and so on, down to a score of 1 for the last option.
order is the index of the option in the list, starting from 0.
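The score and order assignment described above can be reproduced in a few lines. `annotate_options` is a hypothetical helper used only to illustrate the rule. The service generates these values itself, so you never need to set them in a template.

```python
from typing import Dict, List


def annotate_options(options: List[Dict[str, str]], scored: bool) -> List[Dict[str, object]]:
    """Attach order (and score for SCORED questions) the way the rule above describes."""
    total = len(options)
    annotated: List[Dict[str, object]] = []
    for index, option in enumerate(options):
        entry: Dict[str, object] = dict(option)
        entry["order"] = index  # zero-based position in the list
        if scored:
            entry["score"] = total - index  # first option gets the highest score
        annotated.append(entry)
    return annotated


labels = ["Excellent", "Good", "Fair", "Poor", "Bad"]
for entry in annotate_options([{"label_text": text} for text in labels], scored=True):
    print(entry["order"], entry["score"], entry["label_text"])
```

With the five labels above, "Excellent" gets order 0 and score 5, and "Bad" gets order 4 and score 1.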

Anchor Label: Only for COMPARISON type. Provides context for what each end of the scale represents, enhancing evaluator understanding.

Parameter	Description	Required	Notes
`title`	The title of the anchor label	No	Provides a clear and concise title for the anchor label
`label_text`	The text displayed for the anchor label. `left` and `right` should be provided for `COMPARISON` type.	Yes	This is the text shown to the evaluator

For COMPARISON type:

left and right should be provided for label_text.
title is optional.

{
  ...

  "anchor_label": {
    "title": "Clarity",
    "label_text": {
      "left": "is clearer",
      "right": "is clearer"
    }
  }
}
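Putting the pieces together, a complete COMPARISON question could be assembled as sketched below. `comparison_question` is a hypothetical helper, not an SDK function. It uses only the fields documented in the tables above, and the example question text is made up.

```python
from typing import Any, Dict, Optional


def comparison_question(question: str, scale: int, left: str, right: str,
                        title: Optional[str] = None) -> Dict[str, Any]:
    """Build a COMPARISON question dict from the documented fields."""
    if not 2 <= scale <= 9:
        raise ValueError("scale must be between 2 and 9")
    # left and right are both required for label_text; title is optional.
    anchor_label: Dict[str, Any] = {"label_text": {"left": left, "right": right}}
    if title is not None:
        anchor_label["title"] = title
    return {
        "type": "COMPARISON",
        "question": question,
        "scale": scale,
        "anchor_label": anchor_label,
    }


print(comparison_question("Which audio is clearer?", scale=5,
                          left="is clearer", right="is clearer", title="Clarity"))
```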

Instruction: Provides key guidance to evaluators on how to approach the evaluation.

Parameter	Description	Required	Notes
`type`	Type of instruction. Options: `DO`, `WARNING`, `DONT`	Yes	Guides evaluators on how to approach the evaluation
`instruction`	The main instruction text	Yes	Provides key guidance to evaluators
`description`	Additional details or context for the instruction	No	Optional for all instruction types
`reference_file`	A reference audio file to set a benchmark for evaluators	No	Helps evaluators understand the standard or quality expected in the evaluation

Additional Notes:

Instruction is optional.
description and reference_file are optional but can provide valuable context.

get_evaluation_list()

Returns a JSON containing all your evaluations.

evaluations = client.get_evaluation_list()
print(evaluations)

The output JSON looks like:

[
  {
    'id': '<UUID>', 
    'title': 'How natural my synthetic voices are', 
    'internal_name': null, 
    'description': 'Used latest internal model. Epoch 10, alpha 0.1', 
    'status': 'ACTIVE', // DRAFT, ACTIVE, COMPLETED
    'created_time': '2024-06-25T01:40:43.429Z', 
    'updated_time': '2024-06-26T13:21:34.801Z'
  },
  
  ...
]
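A common follow-up is to pick out evaluations with a given status, newest first. The sketch below runs on made-up sample data shaped like the output above. It assumes the timestamps arrive as ISO 8601 strings ending in Z, which is an assumption about the returned values, and `latest_with_status` is a hypothetical helper.

```python
from datetime import datetime
from typing import Any, Dict, List


def parse_time(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing Z (assumed format)."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_with_status(evaluations: List[Dict[str, Any]], status: str) -> List[Dict[str, Any]]:
    """Return evaluations with the given status, most recently created first."""
    matching = [e for e in evaluations if e["status"] == status]
    return sorted(matching, key=lambda e: parse_time(e["created_time"]), reverse=True)


# Made-up sample data; real ids are UUIDs.
sample = [
    {"id": "id-1", "status": "COMPLETED", "created_time": "2024-06-25T01:40:43.429Z"},
    {"id": "id-2", "status": "ACTIVE", "created_time": "2024-06-26T09:00:00.000Z"},
    {"id": "id-3", "status": "COMPLETED", "created_time": "2024-06-27T12:30:00.000Z"},
]
for evaluation in latest_with_status(sample, "COMPLETED"):
    print(evaluation["id"], evaluation["created_time"])
```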

get_stats_json_by_id()

Returns a list of JSONs containing the statistics of each stimulus for the evaluation referenced by the id.

evaluation_id

string

Evaluation id. See get_evaluation_list().

group_by

string

default:"question"

Group by criteria. Options are “question”, “script”, or “model”. Default is “question”. Note that “script” and “model” are only available for single-question evaluations.

evaluations = client.get_evaluation_list()
# Avoid naming the loop variable `eval`, which would shadow the Python builtin.
for evaluation in evaluations:
    stats = client.get_stats_json_by_id(evaluation['id'], group_by='question')
    print(stats)

Field	Description	SCORED	NON_SCORED
`mean`	Average score	✓	-
`median`	Median score	✓	-
`std`	Standard deviation	✓	-
`sem`	Standard error of the mean	✓	-
`ci_95`	95% confidence interval	✓	-
`options`	Each option name as key with integer value	✓	✓
`OTHER`	The number of evaluators who selected “Other”	✓	✓

For NON_SCORED questions:

The integer value is the number of evaluators who selected the option.
All options are included in the response regardless of their value.
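To make the SCORED fields in the table above concrete, the sketch below computes them from a list of raw scores. The exact formulas used by the service are not stated in this reference, so two choices here are assumptions: `std` is the sample standard deviation (n - 1 in the denominator), and `ci_95` is the half-width 1.96 × sem from a normal approximation. `describe` is a hypothetical helper.

```python
import math
import statistics
from typing import Dict, List


def describe(scores: List[float]) -> Dict[str, float]:
    """Compute summary statistics for one stimulus (assumed definitions)."""
    n = len(scores)
    # Assumption: sample standard deviation; undefined for n == 1, so use 0.0.
    std = statistics.stdev(scores) if n > 1 else 0.0
    sem = std / math.sqrt(n)
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "std": std,
        "sem": sem,
        "ci_95": 1.96 * sem,  # assumed: half-width of the 95% interval
    }


print(describe([5, 4, 4, 3, 4]))
```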

You can get the statistics of each question by calling get_stats_json_by_id() with group_by set to question, script, or model.

{
  "question": string,
  "description": string,
  "order": int,
  "responses": [
    {
      "name": string,
      "model_tag": string,
      "tags": string[],
      "type": "A" | "B" | "REF",
      "script": string | null,
      "mean": float | null, // null if the question is not SCORED
      "median": float | null, // null if the question is not SCORED
      "std": float | null, // null if the question is not SCORED
      "sem": float | null, // null if the question is not SCORED
      "ci_95": float | null, // null if the question is not SCORED
    }
  ]
}

For the response structure below, where each response lists its targets, group_by is question (the default). The other options (script, model) cannot be used.

{
  "question": string,
  "description": string,
  "order": int,
  "responses": [
    {
      "targets": [
        {
          "name": string,
          "model_tag": string,
          "tags": string[],
          "type": "A" | "B" | "REF",
          "script": string | null,
        }
      ],
      "mean": float | null, // null if the question is not SCORED
      "median": float | null, // null if the question is not SCORED
      "std": float | null, // null if the question is not SCORED
      "sem": float | null, // null if the question is not SCORED
      "ci_95": float | null, // null if the question is not SCORED
    }
  ]
}
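Both response shapes shown above can be read with the same loop. The sketch below is a hypothetical helper run on made-up sample data. It treats a response without a targets list as its own single target, which is an assumption based on the two structures in this section.

```python
from typing import Any, Dict, List, Optional, Tuple


def mean_by_name(question_stats: Dict[str, Any]) -> List[Tuple[str, Optional[float]]]:
    """Return (stimulus names, mean) pairs for one question's statistics."""
    rows = []
    for response in question_stats["responses"]:
        # Responses either carry name directly or list their stimuli in targets.
        targets = response.get("targets", [response])
        names = " vs ".join(target["name"] for target in targets)
        rows.append((names, response["mean"]))  # mean is None for non-SCORED questions
    return rows


# Made-up sample data following the two structures above.
flat = {"question": "Q1", "responses": [{"name": "a.wav", "mean": 4.2}]}
paired = {"question": "Q1", "responses": [
    {"targets": [{"name": "a.wav"}, {"name": "b.wav"}], "mean": None},
]}
print(mean_by_name(flat))
print(mean_by_name(paired))
```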

download_evaluation_files_by_evaluation_id()

Downloads all files associated with the evaluation identified by evaluation_id from the Podonos evaluation service. The files are saved to the specified directory on the local file system, together with a generated metadata file describing them. Returns a string indicating the status of the download: a success message, or an error message if the download fails.

evaluation_id

string

Evaluation id. See get_evaluation_list().

output_dir

string

The directory path where the downloaded files will be saved. This should be a valid path on the local file system where the user has write permissions.

client.download_evaluation_files_by_evaluation_id(
  evaluation_id='12345', 
  output_dir='./output'
)

Field	Description
`file_path`	The path to the downloaded file relative to the `output_dir`.
`original_name`	The original name of the file before downloading.
`model_tag`	The model tag associated with the file, used for categorization.
`tags`	A list of tags associated with the file, providing additional context or categorization.

File Naming Convention:

Each downloaded file is saved as {output_dir}/{model_tag}/{file_name}. Files are organized into subdirectories named after their model_tag, and each file name is a hashed form of the original file name.
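Once the metadata entries are loaded, the downloaded files can be grouped by model. The name and format of the metadata file are not specified in this reference, so the sketch below starts from entries that are already parsed into dictionaries with the fields in the table above. `paths_by_model` and the sample values are hypothetical.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List, Mapping


def paths_by_model(entries: Iterable[Mapping[str, object]],
                   output_dir: str) -> Dict[str, List[Path]]:
    """Group downloaded file paths by model_tag."""
    grouped: Dict[str, List[Path]] = defaultdict(list)
    for entry in entries:
        # file_path is documented as relative to output_dir.
        full_path = Path(output_dir) / str(entry["file_path"])
        grouped[str(entry["model_tag"])].append(full_path)
    return dict(grouped)


# Made-up entries using the documented field names.
entries = [
    {"file_path": "model_a/3f2a.wav", "original_name": "speech_0.wav",
     "model_tag": "model_a", "tags": ["male"]},
    {"file_path": "model_b/9c1d.wav", "original_name": "speech_0.wav",
     "model_tag": "model_b", "tags": ["male"]},
]
for model_tag, paths in paths_by_model(entries, "./output").items():
    print(model_tag, [str(p) for p in paths])
```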

File

A class representing one file, used for adding files in Evaluator.

path

string

required

Path to the file to evaluate. For audio files, we support wav and mp3 formats.

model_tag

string

required

Name of your model (e.g., “WhisperTTS”) or any unique name (e.g., “human”)

Evaluator

Evaluator manages a single type of evaluation.

add_file()

Add one file to be evaluated in a single evaluation question. Use this for single-file evaluations such as NMOS.

file

string

required

Input File. This field is required if type is NMOS or P808.

etor.add_file(file=File(path='/path/to/speech_0_0.wav',
                        model_tag='WhisperTTS',
                        tags=['synthesized', 'male', 'ver1234']))

add_files()

Add multiple files for evaluations that compare files against each other.

file0

string

required

First Input File. This field is required if type is SMOS.

file1

string

required

Second Input File. This field is required if type is SMOS.

file0 = File(path='/path/to/speech0.wav', model_tag='human',
             tags=['original', 'male', 'human'])
file1 = File(path='/path/to/speech1.wav', model_tag='WhisperTTS',
             tags=['synthesized', 'male', 'ver1234'])
etor.add_files(file0=file0, file1=file1)

close()

Close the evaluation session. Once this function is called, all the evaluation files are sent to the Podonos evaluation service, where they go through a series of processing steps and are then delivered to evaluators.

Returns a JSON object containing the uploading status.

python
status = etor.close()
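The usual sequence is to add every file and then close the session exactly once. The sketch below wraps that sequence in a hypothetical `submit` helper and exercises it with a stand-in evaluator, so it runs without network access. The shape of the returned status is made up for the stand-in, and plain strings stand in for File objects. With the real SDK you would pass an Evaluator from create_evaluator() and File instances.

```python
from typing import Any, Iterable


def submit(etor: Any, files: Iterable[Any]) -> Any:
    """Add every file to the evaluator, then close the session and return its status."""
    count = 0
    for file in files:
        etor.add_file(file=file)
        count += 1
    if count == 0:
        # Assumption: closing a session with no files is never intended.
        raise ValueError("no files to evaluate")
    return etor.close()


class FakeEvaluator:
    """Stand-in used only to exercise submit(); not part of the SDK."""

    def __init__(self) -> None:
        self.added = []

    def add_file(self, file: Any) -> None:
        self.added.append(file)

    def close(self) -> dict:
        # Made-up status shape for illustration.
        return {"uploaded": len(self.added)}


print(submit(FakeEvaluator(), ["speech_0.wav", "speech_1.wav"]))
```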
