AudioStack allows you to organize content in a way that is optimized for programmatically creating professional-sounding audio.

Overview

AudioStack is inspired by “traditional” audio production processes. The API is organized around four main steps: Content, Speech, Production, and Delivery.

First, create a script, or upload other source materials and assets. Then generate speech from text (text-to-speech, or TTS) or from an uploaded file (speech-to-speech, or STS). Next, select background music for your content from a growing list of professional sound designs, or upload your own. Once selected, our sound design engine automatically mixes and masters the speech and music, applying EQ, filters, fades, and effects to enhance the sound of the overall track.
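The ordering of the four steps can be sketched in a few lines of Python. This is only an illustration of the workflow described above, not the AudioStack SDK: the names `Stage`, `AudioJob`, and `advance` are hypothetical, and no API calls are made.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Stage(Enum):
    """The four steps, in the order the overview describes them."""
    CONTENT = "create or upload a script"
    SPEECH = "generate speech (TTS or STS)"
    PRODUCTION = "mix and master with a sound design"
    DELIVERY = "encode the final file"


@dataclass
class AudioJob:
    """Hypothetical record of one asset moving through the steps."""
    script_text: str
    completed: List[Stage] = field(default_factory=list)

    def advance(self, stage: Stage) -> None:
        # Each step builds on the previous one, so enforce the order.
        if len(self.completed) == len(Stage):
            raise ValueError("all four steps are already complete")
        expected = list(Stage)[len(self.completed)]
        if stage is not expected:
            raise ValueError(f"expected {expected.name}, got {stage.name}")
        self.completed.append(stage)


job = AudioJob("Hello from AudioStack.")
for stage in Stage:
    job.advance(stage)
print([s.name for s in job.completed])
```

Running the sketch prints the step names in order. Calling `advance` with a step out of sequence (for example, Production before Speech) raises a `ValueError`, which mirrors the dependency between the steps.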

Content

The most common kind of content used is a script. The script is the basis for your audio file, as it determines the speech that is created. To start creating an audio asset with generated speech, create a script. Each script has a unique scriptID so that you can refer to it throughout the production process (for example, when deciding which script a given voice should read). It's possible to create projects to organise your scripts.

You can also upload existing media files and manage them, either via the API or by visiting platform.audiostack.ai and logging in. See Files & Folders for further information.

Speech

With AudioStack, you can create natural-sounding synthetic speech and specify parameters such as the speed of the voice. If you just want to use the best text-to-speech voices on the market through an easy-to-use API, you've come to the right place! Our voice library stores 1750+ expertly curated voices from the best voice providers on the market.

Production

AudioStack's advanced production capabilities allow you to improve the quality of your audio by performing processing tasks such as denoising. The automatic mixing service ensures that your audio always sounds clear and broadcast-ready.

Optionally, you can add background music and sound effects to your audio from our sound library. There are currently over 1,000 music tracks to choose from.

Delivery

Finally, a file (e.g. an .mp3 or a .wav file) is created. To sound like a professional audio production, the file usually needs various effects applied, a step commonly referred to as post-processing or mastering. The AudioStack engine handles this automatically, though you can specify how you want your audio to be encoded using presets.

Core resources

AudioStack relies on a hierarchical data model that makes it easy for you to administer content and keep workloads low while making sure your user data and/or your users’ data is safe. The latter is important as content can be sensitive: most musical assets are subject to licensing, and voice data (as well as any voice model imitating that voice) is usually considered personally identifiable information (for more information, see our security and ethics section).

Organisation

When you sign up for AudioStack, an organisation is created. Users on a paid plan receive an API key which grants access to the organisation and all data and information contained within it. Unless you are on the Enterprise plan, you can create only one organisation and cannot share any information outside that organisation. Hence, the name of your organisation is normally your company or user name.

Should you need more than one organisation (typically when your application allows your users to create audio), you will need to sign up for our Enterprise solution. This will enable you to issue API keys, create organisations for your users, share certain resources between the organisations of your users, and take care of invoicing and reporting requirements.

📘
Your organisation determines what data you can access and, along with your user permissions, what AudioStack functionality you have access to.
Find out more about role based access in our User Management guide.

Project

Within the organisation, you can create any number of projects that can help you to keep your content organised. Find out more about projects here.

Module

Within a project, you can create any number of modules. A module can be used, for example, to group together different versions of the same content by theme. Find out more about modules here.

Script

As mentioned above, a script is an annotated piece of written content that will ultimately be rendered into an audio file. Within a module, you can create any number of scripts.

ScriptSection

A script can be divided into script sections. A script section not only helps to organise content but also makes it possible to change parameter settings within a script (e.g. you need to create a new script section to switch speakers).
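The hierarchy described above (organisation, project, module, script, script section) can be pictured as nested containers. The following is a minimal sketch using plain Python dataclasses. All class names, field names, and voice names here are assumptions made for illustration and do not reflect AudioStack's actual schema or SDK.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScriptSection:
    text: str
    speaker: str  # hypothetical per-section parameter


@dataclass
class Script:
    script_id: str  # stands in for the unique scriptID
    sections: List[ScriptSection] = field(default_factory=list)


@dataclass
class Module:
    name: str
    scripts: List[Script] = field(default_factory=list)


@dataclass
class Project:
    name: str
    modules: List[Module] = field(default_factory=list)


@dataclass
class Organisation:
    name: str
    projects: List[Project] = field(default_factory=list)


def speakers_in(script: Script) -> List[str]:
    """Return the speakers in order, one entry per change of speaker."""
    ordered: List[str] = []
    for section in script.sections:
        if not ordered or ordered[-1] != section.speaker:
            ordered.append(section.speaker)
    return ordered


# Switching speaker means starting a new section.
script = Script(
    script_id="example-script",
    sections=[
        ScriptSection("Welcome to the show.", speaker="voice_a"),
        ScriptSection("Thanks, glad to be here.", speaker="voice_b"),
    ],
)
org = Organisation(
    name="Example Co",
    projects=[Project("podcast", modules=[Module("episode-1", scripts=[script])])],
)
print(speakers_in(script))
```

Because each level only holds a list of the level below it, a script is always reachable through exactly one module, project, and organisation. This is the property that lets access be scoped at the organisation level.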
