How to Mix Media Files and Sound Templates: Radio Example

In this example we make a fictional radio program by mixing audio files with sound templates.

When producing an audio asset, you might need to combine several different types of audio. For example, your asset could include:
1️⃣ Recorded content, for example an interview.

2️⃣ Sounds that you want to turn into a sound template.

3️⃣ Parts of the radio program whose timing depends on the length of the media files, for example transitions that start after a media file has finished playing.

Create a Radio Program

In this tutorial we're going to create a radio program. The radio program will have two media recordings with some space between them.

📘

Freesound

You can find example media files and recordings at https://freesound.org/. Other repositories are also available.

Step 1: Upload the Media Files

First, create and upload the two media files. Here are two examples:

  1. Uploading jukebox.wav:

    import audiostack
    
    f = audiostack.Content.File.create(localPath="jukebox.wav", uploadPath="audiofiles/jukebox.wav", fileType="audio")
    print(f)
    
  2. Uploading valor.mp3:

    import audiostack
    
    f = audiostack.Content.File.create(localPath="valor.mp3", uploadPath="audiofiles/valor.mp3", fileType="audio")
    print(f)
    

Step 2: Create a Sound Template

Next, create a sound template in the API.

  1. Set up the sound template:

    import audiostack
    import os
    
    # Get the API key from environment variables
    audiostack.api_key = os.environ['AUDIOSTACK_API_KEY']
    
    soundtemplatename = "jgt_hot"  # Rename this
    soundtemplatepath = "sound_templates/jgt_hot_intro.mp3"  # Rename this
    
    # Upload the segment to media storage
    response = audiostack.Content.File.create(localPath=soundtemplatepath, uploadPath="sound_templates/jgt_hot_bumper.mp3", fileType="audio")
    
    # Assign the uploaded segment to the template
    response = audiostack.Production.Sound.Segment.create(
        templateName=soundtemplatename, soundSegmentName="intro", mediaId=response.fileId
    )
    
    print("Content uploaded")
    
  2. Create the sound template (only needs to be done once):

    # Run this once. If it raises an error, run it again: the error means a
    # template with that name already existed and has now been deleted.
    try:
        template = audiostack.Production.Sound.Template.create(templateName=soundtemplatename)
    except Exception as e:
        # Assumes the failure was a name clash, so clear the old template.
        audiostack.Production.Sound.Template.delete(templateName=soundtemplatename)
        raise ValueError("Template already existed, so we cleared it for you. Just re-run the demo.") from e
    

Step 3: Generate the Radio Program

Here is an explanation of the provided code, broken down into its components:

Importing Required Libraries

import audiostack
import datetime
import os
from dataclasses import asdict
  • audiostack: The main library used for interacting with the AudioStack API.
  • datetime: Used for handling date and time operations.
  • os: Used to interact with the operating system, particularly for accessing environment variables.
  • asdict from dataclasses: Utility to convert dataclass instances to dictionaries.

Setting Up Constants and API Configuration

VOICE_NAME = "vicki"
MASTERING_PRESET = "voiceenhanced"
BASE_URL = "http://v2.api.audio"
KEY = os.environ['AUDIOSTACK_API_KEY']
audiostack.api_base = BASE_URL
audiostack.api_key = KEY
  • VOICE_NAME and MASTERING_PRESET: Constants specifying the voice for text-to-speech and the audio mastering preset.
  • BASE_URL and KEY: Configuration for the AudioStack API. The API key is retrieved from environment variables and set up for authentication.
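Reading the key with os.environ['AUDIOSTACK_API_KEY'] raises a bare KeyError when the variable is not set. The sketch below shows one way to fail with a clearer message instead. Note that read_api_key is a hypothetical helper written for this tutorial, not part of the AudioStack SDK.

```python
import os


def read_api_key(env=None, name="AUDIOSTACK_API_KEY"):
    """Return the API key from the environment, or fail with a clear message.

    Hypothetical helper for illustration; not part of the AudioStack SDK.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {name} environment variable before running this script."
        )
    return key
```

With this helper in place, the assignment could become audiostack.api_key = read_api_key().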

Extracting Audio Metadata Function

def extract_audio_metadata(data):
    items = data.get('items', [])
    if items:
        item = items[0]
        file_metadata = item.get('fileMetadata', {})
        length = file_metadata.get('data', {}).get('length', None)
        return file_metadata, length
    else:
        return {}, None
  • extract_audio_metadata(data): This function extracts metadata, particularly the length of the audio file, from the provided data dictionary.
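To see what the function returns, it can be run against a small dictionary. The sample below is an assumption: its shape is inferred only from the keys the function reads ('items', 'fileMetadata', 'data', 'length') and is not a documented API response. The function is repeated here, with an early return, so the snippet runs on its own.

```python
def extract_audio_metadata(data):
    """Same lookup as the function above, written with an early return."""
    items = data.get("items", [])
    if not items:
        return {}, None
    file_metadata = items[0].get("fileMetadata", {})
    return file_metadata, file_metadata.get("data", {}).get("length")


# Hypothetical response shape, inferred only from the keys the function reads.
sample = {"items": [{"fileMetadata": {"data": {"length": 12.5}}}]}

metadata, length = extract_audio_metadata(sample)
empty_metadata, empty_length = extract_audio_metadata({"items": []})
```

When no file matches the search, the length comes back as None, so it is worth checking for that before using the value in any timing arithmetic.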

Main Content Creation Function

def run_content():
    text = """
    <as:section name="section1" soundsegment="intro">
        <as:media name="audiofiles/jukebox.wav"> </as:media>
    </as:section>
    <as:section name="section2" soundsegment="main">
        <as:media name="audiofiles/valor.mp3"> </as:media>
    </as:section>
    <as:section name="section3" soundsegment="outro">
    <break time="0s"/>
    </as:section>
    """
  • run_content(): This function orchestrates the creation and handling of the radio content.
  • text: Defines the script structure with sections and media files.

🚧

Media files

Change the media file names in the script to match the files you downloaded and uploaded. Your sound segments may also be named differently.
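If you expect to swap file names or sound segments often, the script text can be assembled from a list instead of being edited by hand. This is a minimal sketch: build_script is a hypothetical helper, and the tag layout is copied from the script above rather than from any wider specification of the script syntax.

```python
def build_script(sections):
    """Build script text from (section_name, sound_segment, media_path) tuples.

    Hypothetical helper; a media_path of None produces a zero-length break,
    matching section3 in the script above.
    """
    parts = []
    for section_name, sound_segment, media_path in sections:
        if media_path is None:
            body = '    <break time="0s"/>'
        else:
            body = f'    <as:media name="{media_path}"> </as:media>'
        parts.append(
            f'<as:section name="{section_name}" soundsegment="{sound_segment}">\n'
            f"{body}\n"
            "</as:section>"
        )
    return "\n".join(parts)


text = build_script([
    ("section1", "intro", "audiofiles/jukebox.wav"),
    ("section2", "main", "audiofiles/valor.mp3"),
    ("section3", "outro", None),
])
```

Only the tuples need to change when your uploaded file paths or segment names differ from the ones used in this tutorial.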

Creating Script and Generating Speech

    print("Creating your script...")
    script = audiostack.Content.Script.create(scriptText=text)
    print(script)

    print("Generating speech...")
    speech = audiostack.Speech.TTS.create(scriptItem=script, voice=VOICE_NAME)
    print(speech)
  • Creates a script from the provided text and then generates speech from the script using the specified voice.

Searching for Media Files and Extracting Metadata

    media_file_1 = audiostack.Content.File.search(name="audiofiles/jukebox.wav")
    media_file_2 = audiostack.Content.File.search(name="audiofiles/valor.mp3")
   
    file_metadata_1, audio_length_1 = extract_audio_metadata(media_file_1.data)
    file_metadata_2, audio_length_2 = extract_audio_metadata(media_file_2.data)
  • Searches for media files and extracts metadata, specifically the length of each audio file.

Defining Section Properties

    START_PADDING = 5.0
    START_PADDING_2 = 6.0
    END_AT_1 = START_PADDING + audio_length_1 + 1 
    END_AT_2 = END_AT_1 + audio_length_2 + START_PADDING_2 + 0.00047916667
    SECTIONPROPERTIES = {
        "section1": {"startPadding": START_PADDING, "endAt": END_AT_1, "alignment": "left"},
        "section2": {"startPadding": START_PADDING_2, "endAt": END_AT_2, "alignment": "left"},
        "section3": {"endAt": 123.42566893424036, "alignment": "left"}   
    }
  • Defines the properties of each section, including start padding and end time, ensuring proper alignment and timing.
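The end times are cumulative: section 2 ends at the end of section 1 plus its own padding and its own media length. The sketch below reproduces that arithmetic with made-up lengths so the numbers are easy to follow. section_end_times is a hypothetical helper, and the 30-second and 45-second lengths are illustrative values, not the lengths of the tutorial files.

```python
START_PADDING = 5.0
START_PADDING_2 = 6.0


def section_end_times(audio_length_1, audio_length_2):
    """Reproduce the endAt arithmetic above for two media lengths in seconds."""
    if audio_length_1 is None or audio_length_2 is None:
        raise ValueError("Audio length missing; check that both media files were found.")
    end_at_1 = START_PADDING + audio_length_1 + 1
    end_at_2 = end_at_1 + audio_length_2 + START_PADDING_2 + 0.00047916667
    return end_at_1, end_at_2


# Hypothetical lengths: a 30-second and a 45-second recording.
end_at_1, end_at_2 = section_end_times(30.0, 45.0)
```

With these values, section 1 ends at 5 + 30 + 1 = 36 seconds, and section 2 ends just after 36 + 45 + 6 = 87 seconds.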

Creating the Mix and Downloading the Customized Ad

    print("Creating your mix...")
    mix = audiostack.Production.Mix.create(
        speechItem=speech,
        soundTemplate="jgt_hot",
        timelineProperties={"fadeIn": 0, "fadeOut": 0},
        masteringPreset=MASTERING_PRESET,
        sectionProperties=SECTIONPROPERTIES
    )
    print(mix)

    print("Downloading your customised ad...")
    time_string = datetime.datetime.now().strftime("%Y-%m-%d_%I-%M-%S_%p")
    mix.download(fileName=f"{time_string}_{MASTERING_PRESET}_{VOICE_NAME}")
    print(mix)
  • Creates the mix by combining the speech, sound template, and defined section properties.
  • Downloads the final customized audio file with a timestamped filename.
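The file name pattern can be checked without calling the API. This sketch uses a fixed time so the result is reproducible. output_name is a hypothetical helper that wraps the same strftime format string used above.

```python
import datetime


def output_name(now, preset, voice):
    """Build the download file name in the same pattern as the code above."""
    time_string = now.strftime("%Y-%m-%d_%I-%M-%S_%p")
    return f"{time_string}_{preset}_{voice}"


# Fixed example time: 5 March 2024, 14:07:09.
name = output_name(datetime.datetime(2024, 3, 5, 14, 7, 9), "voiceenhanced", "vicki")
```

Because %I is a 12-hour clock, 14:07 is written as 02-07, and the %p suffix (AM or PM) is what keeps morning and afternoon runs from producing the same name.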

Running the Script

if __name__ == "__main__":
    run_content()
  • Ensures that the run_content() function is executed when the script is run directly.

This code integrates audio files, text-to-speech, and custom templates to create a fully automated radio program, leveraging the capabilities of the AudioStack API.