How to Add Multiple Speakers to a Single Script

This example will walk you through a Python script that demonstrates how to use Audiostack to convert text to speech, assign different voices to sections, and produce a final audio mix.

I'm going to use a stylised example of interviewing two footballers about their favourite footballer. ⚽

Setting Up the Environment

First, ensure you have the Audiostack library installed and set up your API key. The API key is necessary to authenticate your requests with Audiostack services. Here’s how you can do it:

import audiostack
import os

# Set up your API key
audiostack.api_key = os.environ['AUDIOSTACK_API_KEY']

This snippet imports the Audiostack library and retrieves the API key from your environment variables.
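
If the variable is not set, `os.environ[...]` fails with a bare `KeyError`. As a minimal sketch of a friendlier lookup using only the standard library (the helper name `load_api_key` is hypothetical and not part of the Audiostack library):

```python
import os


def load_api_key(var_name="AUDIOSTACK_API_KEY"):
    """Return the API key stored in the given environment variable.

    Hypothetical helper for illustration; not part of the Audiostack library.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {var_name} is not set. "
            "Export it before running the script."
        )
    return key
```

You could then write `audiostack.api_key = load_api_key()` in place of the direct `os.environ` lookup.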

🚧

API Key

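Don't replace 'AUDIOSTACK_API_KEY' in the code: it is the name of an environment variable, not the key itself. Set that environment variable to your actual API key before running the script. You can find out more about handling API keys securely at https://blog.streamlit.io/8-tips-for-securely-using-api-keys/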

Defining the Text Content

Next, define the text that you want to convert to speech. Audiostack allows you to structure your text using sections and subsections, making it easy to assign different voices to different parts of the text. Here’s an example:

text = """
    <as:section name="section_1" soundSegment="intro">
       Who was your footballing hero?
    </as:section>

    <as:section name="section_2" soundSegment="main">
        <as:sub name="sub_1"> Puyol from Barcelona and Maldini from Milan </as:sub>
        <as:sub name="sub_2"> The Barca the dream team that won the first European cup was my team and the hero
was I'd say Ronald Koeman, who was the centre back for Barcelona at the time but I had so many heroes
at that team which was Michael Laudrup, you've got Pep Guardiola, you've got so many
but Ronald Koeman was always the one for me. </as:sub>
    </as:section>
"""

In this script, the text is divided into sections and subsections, each with a unique name and content. This structure allows for precise control over the audio output.

You can see here that the first section holds the question and the second section holds the answers. I've also split the second section into two subsections, one per speaker, using <as:sub name="sub_1"> and <as:sub name="sub_2">.
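
If you have many speaker turns, you may prefer to generate this markup rather than type it by hand. The following is a rough sketch, assuming only the tag and attribute names shown above; `build_section` is a hypothetical helper, not an Audiostack function, and the second answer is shortened for brevity:

```python
from xml.sax.saxutils import quoteattr


def build_section(name, sound_segment, parts):
    """Build one <as:section> block (hypothetical helper).

    `parts` is either a plain string (a single speaker) or a list of
    (sub_name, text) pairs, each wrapped in its own <as:sub> tag.
    """
    if isinstance(parts, str):
        body = f"    {parts.strip()}"
    else:
        body = "\n".join(
            f"    <as:sub name={quoteattr(sub_name)}> {text.strip()} </as:sub>"
            for sub_name, text in parts
        )
    return (
        f"<as:section name={quoteattr(name)} "
        f"soundSegment={quoteattr(sound_segment)}>\n"
        f"{body}\n"
        "</as:section>"
    )


text = "\n\n".join([
    build_section("section_1", "intro", "Who was your footballing hero?"),
    build_section("section_2", "main", [
        ("sub_1", "Puyol from Barcelona and Maldini from Milan"),
        ("sub_2", "Ronald Koeman was always the one for me."),
    ]),
])
print(text)
```

Because every opening tag is written together with its closing tag, the generated text cannot end up with an unclosed section or subsection.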

🚧

Close your tags

One problem you may run into is forgetting to close your tags. If you open a tag with <as:sub name="sub_1">, remember to close it with </as:sub>
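
You can catch an unclosed tag locally before sending anything to the API. Here is a minimal sketch of such a check, assuming the script only uses the as:section and as:sub tags shown in this example; `find_tag_errors` is a hypothetical helper and does not reproduce Audiostack's own validation:

```python
import re

# Matches opening and closing as:section / as:sub tags.
TAG_PATTERN = re.compile(r"<(/?)as:(section|sub)\b[^>]*>")


def find_tag_errors(script_text):
    """Return a list of nesting problems found in the script text."""
    errors = []
    stack = []  # tags that are currently open, innermost last
    for match in TAG_PATTERN.finditer(script_text):
        is_closing = match.group(1) == "/"
        tag = match.group(2)
        if not is_closing:
            stack.append(tag)
        elif tag in stack:
            # Anything opened after the matching tag was never closed.
            while stack[-1] != tag:
                errors.append(f"<as:{stack.pop()}> is never closed")
            stack.pop()
        else:
            errors.append(f"unexpected </as:{tag}>")
    errors.extend(f"<as:{tag}> is never closed" for tag in stack)
    return errors


broken = '<as:section name="s1"><as:sub name="a"> Hello </as:section>'
print(find_tag_errors(broken))  # ['<as:sub> is never closed']
```

An empty list means every section and subsection tag in the text is properly paired.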

Creating the Script Object

Once the text is defined, create a script object using Audiostack’s Content.Script.create method:

script = audiostack.Content.Script.create(scriptText=text)

Generating Speech with Different Voices

With the script object ready, you can now convert the text to speech. Audiostack allows you to specify different voices for different sections and subsections of the text. Here’s how:

speech = audiostack.Speech.TTS.create(scriptItem=script, voice="wren", sections={
    "section_1" : {"voice" : "Sara"},
    "sub_1" : {"voice" : "Lambros"},
    "sub_2" : {"voice" : "Elver"},
})

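In this example, the main voice is set to "wren", while the sections argument overrides it for specific parts of the script: section_1 uses "Sara", sub_1 uses "Lambros", and sub_2 uses "Elver". Each key must match the name attribute of a section or subsection in your text.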
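
A typo in one of these keys is easy to miss. As a sketch of a local sanity check (the helper `unknown_voice_targets` is hypothetical, and how the API itself treats a key that matches no section is not covered here), you can compare the keys against the names found in the script text:

```python
import re


def unknown_voice_targets(script_text, sections):
    """Return the keys in `sections` that match no name in the script text."""
    names = set(
        re.findall(r'<as:(?:section|sub)\b[^>]*?\bname="([^"]+)"', script_text)
    )
    return sorted(set(sections) - names)


script_text = """
<as:section name="section_1" soundSegment="intro"> Question </as:section>
<as:section name="section_2" soundSegment="main">
    <as:sub name="sub_1"> First answer </as:sub>
    <as:sub name="sub_2"> Second answer </as:sub>
</as:section>
"""

voices = {
    "section_1": {"voice": "Sara"},
    "sub_1": {"voice": "Lambros"},
    "sub_3": {"voice": "Elver"},  # typo: the script defines sub_2, not sub_3
}
print(unknown_voice_targets(script_text, voices))  # ['sub_3']
```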

📘

Try out other voices

We have hundreds of voices - try out more at http://library.audiostack.ai

Creating and Mastering the Audio Mix

Next, create an audio mix from the generated speech. Audiostack’s Production.Mix.create method helps you compile and master the audio content:

print("Creating your mix...")
mix = audiostack.Production.Mix.create(
    speechItem=speech,
    masteringPreset="radio"
)
print(mix)

This step combines the speech segments into a single audio file and applies a mastering preset to ensure the audio quality is suitable for radio.

You can find out more about mastering presets in Smart Mixing and Mastering.

Encoding and Downloading the Audio File

Finally, encode the audio mix into your desired format and make it publicly accessible. Here’s how to encode the mix into an MP3 file:

encoder = audiostack.Delivery.Encoder.encode_mix(productionItem=mix, preset="mp3", public=True)
print("MP3 file URL:", encoder.url)

The encode_mix method converts the mix into an MP3 file and provides a URL to download it.
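
If you want the file on disk rather than just the link, one option is to fetch the URL with Python's standard library. This is a sketch that assumes the URL is publicly readable (as requested with public=True) and needs no extra headers; `download_file` is a hypothetical helper, not an Audiostack method:

```python
import shutil
import urllib.request


def download_file(url, destination):
    """Stream the content at `url` into the local file `destination`."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
    return destination
```

For example, `download_file(encoder.url, "interview.mp3")` would save the encoded mix next to your script.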

Conclusion

By following these steps, you can use Audiostack to create dynamic and professional-quality audio content. Whether you’re producing a podcast, creating an audiobook, or generating voice-overs, Audiostack’s API offers the tools you need to bring your text to life with engaging audio. Give it a try and see how easy it is to produce high-quality audio content with Audiostack!


What’s Next

To learn more, have a look at Concepts or our FAQs.