Introduction to Multi-Speaker Support

With 600 voices to choose from, narrowing the choice down to just one can be difficult.

In fact, more than one voice can read a script. In this tutorial we cover how to set a voice per section or sub-section.

Let's first consider this script:


<as:section name="section_1" soundSegment="intro">
  Hello and welcome to api audio.
</as:section>

<as:section name="section_2" soundSegment="main">
  <as:sub name="sub_1"> This content is sub section 1 </as:sub>
  <as:sub name="sub_2"> This content is sub section 2 </as:sub>
</as:section>

Here we have two sections, and within section 2 we have two sub-sections.
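The section and sub-section names in the script are what the voice settings later refer to, so it can help to list them first. The following is a minimal sketch using only Python's standard re module; list_section_names is a hypothetical helper, not part of the AudioStack SDK, and it assumes the tags are written exactly as in the script above.

```python
import re

script_text = """
<as:section name="section_1" soundSegment="intro">
  Hello and welcome to api audio.
</as:section>

<as:section name="section_2" soundSegment="main">
  <as:sub name="sub_1"> This content is sub section 1 </as:sub>
  <as:sub name="sub_2"> This content is sub section 2 </as:sub>
</as:section>
"""

# Capture each section's name and everything up to its closing tag.
SECTION_RE = re.compile(
    r'<as:section\s+name="([^"]+)"[^>]*>(.*?)</as:section>', re.DOTALL
)
# Capture the name of each sub-section inside a section body.
SUB_RE = re.compile(r'<as:sub\s+name="([^"]+)"')


def list_section_names(text):
    """Hypothetical helper: map each section name to its sub-section names."""
    return {name: SUB_RE.findall(body) for name, body in SECTION_RE.findall(text)}


structure = list_section_names(script_text)
print(structure)
```

For the script above this gives section_1 with no sub-sections and section_2 with sub_1 and sub_2.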

By default, synthesising this script with our /speech/tts endpoint, or with the audiostack.Speech.TTS.create() function in the SDK, renders every section with the selected voice, in this case sara.

import audiostack

# Assumes your API key has already been configured for the SDK.
text = """
    <as:section name="section_1" soundSegment="intro">
        Hello and welcome to api audio.
    </as:section>

    <as:section name="section_2" soundSegment="main">
        <as:sub name="sub_1"> This content is sub section 1 </as:sub>
        <as:sub name="sub_2"> This content is sub section 2 </as:sub>
    </as:section>
"""

script = audiostack.Content.Script.create(scriptText=text)
speech = audiostack.Speech.TTS.create(scriptItem=script, voice="sara")

Voices per section

To set the voice for each section or sub-section, pass a sections mapping keyed by section name:

speech = audiostack.Speech.TTS.create(scriptItem=script, voice="sara", sections={
    "section_1": {"voice": "Lucinda"},
    "sub_1": {"voice": "Wren"},
    "sub_2": {"voice": "Renata"},
})

Now each of the 3 sections/sub-sections will be read by a different speaker.
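When many sections are involved, writing the nested dictionaries by hand gets repetitive. The sketch below builds the same sections mapping from a flat name-to-voice dictionary. build_sections is a hypothetical convenience helper, not an SDK function; only the resulting dictionary shape is taken from the example above.

```python
def build_sections(voice_by_name):
    """Hypothetical helper: turn {"name": "Voice"} into {"name": {"voice": "Voice"}}."""
    return {name: {"voice": voice} for name, voice in voice_by_name.items()}


sections = build_sections({
    "section_1": "Lucinda",
    "sub_1": "Wren",
    "sub_2": "Renata",
})
print(sections)
```

The result can then be passed as the sections argument of audiostack.Speech.TTS.create(), exactly as in the example above.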

🚧
Any sections not specified will be read by the default voice.
In the above example, if we removed line 4 of the code (the "sub_2" entry), sub_2 would be read using the sara voice.

In this example, both sub-sections in section_2 are read with the Wren voice:

speech = audiostack.Speech.TTS.create(scriptItem=script, voice="sara", sections={
    "section_1": {"voice": "Lucinda"},
    "section_2": {"voice": "Wren"},
})
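To make the two rules described above concrete (unspecified sections fall back to the default voice, and a voice set on a section applies to its sub-sections), here is a small pure-Python model of the outcome. It does not call the API; resolve_voices is a hypothetical function written only for illustration. It also assumes that a voice set directly on a sub-section takes precedence over one set on its parent section, which this tutorial does not state.

```python
def resolve_voices(structure, default_voice, sections):
    """Hypothetical model: return the voice that reads each piece of content.

    structure maps each section name to a list of its sub-section names.
    """
    def override(name):
        # Voice configured for this name, or None if it was not specified.
        return sections.get(name, {}).get("voice")

    resolved = {}
    for section, subs in structure.items():
        section_voice = override(section) or default_voice
        if not subs:
            resolved[section] = section_voice
        for sub in subs:
            # Assumption: a sub-section setting wins over its parent's setting.
            resolved[sub] = override(sub) or section_voice
    return resolved


structure = {"section_1": [], "section_2": ["sub_1", "sub_2"]}

# sub_2 is not specified, so it falls back to the default voice.
per_sub = resolve_voices(
    structure, "sara",
    {"section_1": {"voice": "Lucinda"}, "sub_1": {"voice": "Wren"}},
)

# section_2 is specified, so both of its sub-sections use that voice.
per_section = resolve_voices(
    structure, "sara",
    {"section_1": {"voice": "Lucinda"}, "section_2": {"voice": "Wren"}},
)
print(per_sub)
print(per_section)
```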
