Create a Video from an Audio File


A common use case when working with AudioStack is creating audio adverts or content for social media. Often, the end goal is to share that audio on a video-led platform such as TikTok or YouTube. You can now combine your audio with video content directly in the AudioStack API. 🚀

AudioStack Tutorial: Creating a Video from a Production ID and an Image

In this tutorial, we walk through creating a video with the AudioStack API: setting up the environment, creating the necessary components (script, speech, mix), and finally generating a video. The parameters used in each step are explained along the way.


  • Ensure you have an AudioStack account and API key.
  • Make sure the AudioStack Python SDK is installed:
    pip3 install audiostack
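To confirm the install worked before going further, a quick check along these lines can help. This is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name, not part of the AudioStack SDK.

```python
import importlib.util


def is_installed(package):
    """Return True if `package` can be imported in this environment."""
    return importlib.util.find_spec(package) is not None


# Hypothetical pre-flight check: print a hint instead of failing on import later.
if not is_installed("audiostack"):
    print("audiostack not found; run: pip3 install audiostack")
```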

Setting Up the Environment

First, configure the AudioStack API by specifying the base URL and your API key.

import audiostack
import os

audiostack.api_base = ""
audiostack.api_key = os.environ['AUDIOSTACK_API_KEY']

Make sure your API key is stored in the AUDIOSTACK_API_KEY environment variable; os.environ raises a KeyError if it is missing.
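If you would rather fail with a readable message than a bare KeyError, the lookup can be wrapped in a small function. The sketch below is one possible approach; `load_api_key` is a hypothetical helper, and only the variable name AUDIOSTACK_API_KEY comes from the setup code above.

```python
import os


def load_api_key(env=os.environ, name="AUDIOSTACK_API_KEY"):
    """Return the API key stored under `name`, or raise a readable error."""
    # Treat a missing variable and a blank one the same way.
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it before running, "
            f"for example: export {name}=<your key>"
        )
    return key
```

With the SDK imported, the result could then be assigned as `audiostack.api_key = load_api_key()`.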


Having trouble getting started?

Check out our more detailed Get Started guides for help. We have guides for both developers and complete beginners.

Step 1: Creating the Script

Define the script that will be converted to speech and included in your video. This script contains the text that the voice will read.

template = "positive_pop"
preset = "radio"
voice = "wren"

script_text = """
<as:section name="main" soundsegment="main">
The video feature is out, you can call it using audiostack.Delivery.Video.create options.
The audio from the video can be passed either as a productionId, or from any audio file in your file manager.
The video can be passed as a videoFileId from the file manager or by default an audiostack image background will be chosen.
</as:section>
"""

script = audiostack.Content.Script.create(
    scriptText=script_text, scriptName="demo", projectName="demo"
)


  • scriptText: The text content of your script.
  • scriptName: A name for your script.
  • projectName: The project to which this script belongs.
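When the script text is assembled from several sentences, it can be convenient to generate the section markup rather than type it by hand. The following is a minimal sketch; `build_script_text` is a hypothetical helper, and it assumes the section is closed with a matching `</as:section>` tag that mirrors the opening tag used in the script above.

```python
def build_script_text(sentences, section="main", soundsegment="main"):
    """Wrap sentences in a single <as:section> block, one sentence per line."""
    # Skip empty entries so the section body has no blank lines.
    body = "\n".join(s.strip() for s in sentences if s.strip())
    return (
        f'<as:section name="{section}" soundsegment="{soundsegment}">\n'
        f"{body}\n"
        "</as:section>"
    )


script_text = build_script_text([
    "The video feature is out.",
    "Audio can come from a productionId or from a file in your file manager.",
])
print(script_text)
```

The resulting string can be passed as scriptText in the same way as the hand-written script.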

Step 2: Generating Speech from the Script

Convert the script into speech using Text-to-Speech (TTS).

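# Arguments follow the parameter list below; speed=1 is an assumed value.
speech = audiostack.Speech.TTS.create(scriptId=script.scriptId, voice=voice, speed=1)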


  • scriptId: The ID of the script created in the previous step.
  • voice: The voice used for TTS (here "wren", as set in Step 1; other voices such as "george" can be used instead).
  • speed: The speed of the speech.

Step 3: Creating a Mix

Combine the speech with sound effects to create a mix.

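mix = audiostack.Production.Mix.create(  # arguments follow the parameter list below
    speechId=speech.speechId, soundTemplate=template, masteringPreset=preset)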


  • speechId: The ID of the generated speech.
  • soundTemplate: The template for sound effects.
  • masteringPreset: The preset used for mastering (e.g., "radio").

Step 4: Creating the Video

Generate a video using the production ID and an optional image.

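video = audiostack.Delivery.Video.create_from_production_and_image(
    productionId=mix.productionId, public=True
)
print("vid1 response: ", video)
video.download(fileName="output1")  # assumed download helper; saves output1.mp4
print("file downloaded to output1.mp4")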


  • productionId: The ID of the production mix.
  • public: Boolean that determines whether the generated video is publicly accessible.


You've successfully created a video using the AudioStack API! The steps were: setting up the environment, creating a script, generating speech, creating a mix, and finally generating and downloading the video.
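The four steps can also be collected into one function, which makes the flow of IDs from one call to the next easier to see. This is a sketch under stated assumptions rather than a reference implementation: `create_video` is a hypothetical name, the keyword arguments are taken from the parameter lists in this tutorial, the `scriptId`, `speechId` and `productionId` attributes on the returned objects are assumed, and `speed=1` and `public=False` are illustrative values. The `client` argument is the imported `audiostack` module, passed in so the function can be exercised with a stand-in object.

```python
def create_video(client, script_text, voice, template, preset, public=False):
    """Run script -> speech -> mix -> video and return the video object.

    `client` is the imported `audiostack` module (or a stand-in for testing).
    Attribute names on the returned objects are assumptions, see above.
    """
    # Step 1: register the script text.
    script = client.Content.Script.create(
        scriptText=script_text, scriptName="demo", projectName="demo"
    )
    # Step 2: synthesise speech from the script (speed=1 is illustrative).
    speech = client.Speech.TTS.create(
        scriptId=script.scriptId, voice=voice, speed=1
    )
    # Step 3: mix the speech with a sound template and mastering preset.
    mix = client.Production.Mix.create(
        speechId=speech.speechId,
        soundTemplate=template,
        masteringPreset=preset,
    )
    # Step 4: render the video from the production.
    return client.Delivery.Video.create_from_production_and_image(
        productionId=mix.productionId, public=public
    )
```

A call would look like `create_video(audiostack, script_text, "wren", "positive_pop", "radio")`, reusing the values from Step 1.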

For further customization, you can explore more options and parameters in the AudioStack documentation.
