Comparative Similarity

Intro

The Comparative Similarity (CSMOS) evaluation is designed to assess which of two audio samples is more similar to a reference audio. This evaluation is particularly useful in scenarios where the goal is to match or mimic a reference audio, such as in voice cloning or audio restoration tasks.

Objective: Determine which of two audio samples is more similar to a reference audio.
Use Case: Ideal for applications requiring audio matching or quality assessment against a standard.
Type: CSMOS in the SDK.

Example

Initialize the Client

Begin by initializing the Podonos client with your API key.

import podonos

client = podonos.init("<API_KEY>")
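To avoid hardcoding the key in a script, it can be read from the environment first. The following is a minimal sketch, not part of the SDK: the helper load_api_key and the variable name PODONOS_API_KEY are assumptions chosen for illustration.

```python
import os


def load_api_key(env_var: str = "PODONOS_API_KEY") -> str:
    """Return the API key stored in env_var (a hypothetical variable name).

    Raises RuntimeError when the variable is unset or blank, so the script
    fails before any evaluation setup starts.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running")
    return key


# Intended usage, assuming the podonos package is installed:
#   client = podonos.init(load_api_key())
```

Failing early on a missing key keeps the error close to its cause instead of surfacing later during evaluator creation.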

Create the Evaluator

Set up the evaluator for a CSMOS evaluation.

evaluator = client.create_evaluator(
    name="Comparative Similarity Test",
    desc="Evaluate similarity of audio samples to a reference",
    type="CSMOS"
)

Note: CSMOS is not supported by the create_evaluator_from_template_json method.

Add Files for Evaluation

Add two audio samples and one reference audio. The reference file must be specified with is_ref=True.

from podonos import File

evaluator.add_files(
    file0=File(path="audio_sample1.wav", model_tag="Sample 1", tags=["test"], is_ref=False),
    file1=File(path="audio_sample2.wav", model_tag="Sample 2", tags=["test"], is_ref=False),
    file2=File(path="reference_audio.wav", model_tag="Reference", tags=["reference"], is_ref=True)
)

File Order: The reference file must be the third (last) file passed to add_files.
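The ordering rule can be checked locally before anything is uploaded. The sketch below is illustrative only: FileSpec is a hypothetical stand-in for the fields passed to podonos.File, and check_csmos_triplet is not an SDK function.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FileSpec:
    """Hypothetical stand-in for the arguments given to podonos.File."""
    path: str
    model_tag: str
    is_ref: bool = False


def check_csmos_triplet(files: Sequence[FileSpec]) -> None:
    """Raise ValueError unless files is ordered (sample, sample, reference)."""
    if len(files) != 3:
        raise ValueError(f"CSMOS needs exactly 3 files, got {len(files)}")
    # Collect the positions of every file flagged as the reference.
    ref_positions = [i for i, f in enumerate(files) if f.is_ref]
    if ref_positions != [2]:
        raise ValueError(
            "exactly one file must have is_ref=True and it must be last; "
            f"found reference flags at positions {ref_positions}"
        )


triplet = [
    FileSpec("audio_sample1.wav", "Sample 1"),
    FileSpec("audio_sample2.wav", "Sample 2"),
    FileSpec("reference_audio.wav", "Reference", is_ref=True),
]
check_csmos_triplet(triplet)  # returns None for a well-formed triplet
```

Running a check like this before add_files catches a misplaced or missing reference flag without a round trip to the service.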

Finalize the Evaluation

Close the evaluator to complete the setup.

evaluator.close()

Key Considerations

File Configuration: The reference file must be marked with is_ref=True and must be the last (third) file passed to add_files.
Evaluation Logic: CSMOS compares the two audio samples against the reference to determine which one is more similar to it.
Applications: Useful for tasks like voice cloning, audio restoration, and quality assurance where matching a reference is critical.

Use Case

Consider a scenario where you are developing a new speech synthesis model and want to evaluate how closely the generated audio matches a reference recording. Using CSMOS, you can objectively assess which version of your model produces audio that is more similar to the desired reference.
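When two model versions are compared over many utterances, the files first need to be grouped into (sample, sample, reference) triplets. The sketch below shows one way to do that, under two assumptions: each model's outputs and the references are keyed by a shared utterance ID, and build_triplets is a hypothetical helper, not an SDK function.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def build_triplets(
    model_a: Dict[str, Path],
    model_b: Dict[str, Path],
    references: Dict[str, Path],
) -> List[Tuple[Path, Path, Path]]:
    """Return (sample_a, sample_b, reference) for each utterance ID
    that appears in all three mappings, sorted by ID."""
    shared_ids = sorted(set(model_a) & set(model_b) & set(references))
    return [(model_a[u], model_b[u], references[u]) for u in shared_ids]


# Hypothetical paths for illustration; only "utt1" exists in all three sets.
model_a = {"utt1": Path("a/utt1.wav"), "utt2": Path("a/utt2.wav")}
model_b = {"utt1": Path("b/utt1.wav"), "utt3": Path("b/utt3.wav")}
references = {"utt1": Path("ref/utt1.wav"), "utt2": Path("ref/utt2.wav")}

triplets = build_triplets(model_a, model_b, references)
```

Each resulting triplet already has the reference in the last position, matching the order that add_files expects in the Example section. Utterances missing from any one set are skipped rather than padded, so no comparison is submitted with an incomplete triplet.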
