AudioStack: Concepts

The power of AudioStack starts where text-to-speech solutions stop. 🚀


This "AudioStack Concepts" section focuses on discussion and explanation.

It helps you understand the underlying concepts. For tutorials that help you get started, see the Tutorials section.

Introduction to some AudioStack concepts

This section covers the concepts you'll need to understand to use AudioStack. We'll link to some other material, but after reading this you should have a solid understanding of the concepts behind AudioStack.

What can AudioStack do?

The AudioStack platform brings together powerful functionality across the end-to-end audio production process to empower you to create beautiful, professional-sounding audio in seconds.

Audio Production process

First, it helps to ground ourselves in the existing audio production process. If you're already an expert on it, you can skip this part.

Audio production is the art and science of sound recording, editing, and mixing. It spans many kinds of projects, such as films, music, video games, TV advertisements, corporate videos, and podcasts. Today, most of these productions are made and distributed as digital recordings.

Normally this process looks something like


The AudioStack platform enriches each of these steps using AI and related ML technologies. It also brings all of them into software-based workflows, which allows for scalability, reliability, and auditability.

Content

The AudioStack platform enables you to create and manage content using AI-enhanced workflows. The core unit is the script. The platform also allows you to ingest your existing content.

For more details look at

Speech

The AudioStack platform offers a range of speech management and speech generation functionality. This includes


Production

Our proprietary production system consists of a mixing and mastering engine in the cloud, similar to how digital audio workstations (DAWs) work.

This consists of (amongst others)

Delivery

Finally, our delivery endpoints allow you to deliver the following:

  • Audio assets in a number of file formats
  • Video assets for social media or internal use
  • Audio Adverts with VAST tags

What is LUFS anyway?

LUFS stands for Loudness Units relative to Full Scale. It's a standardized measurement of audio loudness that takes into account human perception of volume. LUFS is widely used in broadcast, streaming, and music production to ensure consistent loudness across different audio content.
Key points about LUFS:

  • Perceptual measurement: Unlike peak meters that measure instantaneous signal levels, LUFS measures how loud audio content sounds to the human ear over time.
  • Integrated vs. Short-term: LUFS can be measured as an average over an entire track (Integrated LUFS) or over a shorter sliding window (Short-term LUFS, which uses a 3-second window in the EBU R 128 metering specification).
  • Industry standards: Many platforms and broadcasters have specific LUFS targets. For example, YouTube normalizes audio to around -14 LUFS, while broadcast standards such as EBU R 128 target -23 LUFS.
  • Dynamic range: LUFS helps maintain dynamic range while ensuring overall loudness is consistent.
  • Metering: Special meters are used to measure LUFS, often displaying both momentary and integrated readings.
  • Loudness normalization: Streaming platforms use LUFS to adjust the playback volume of different tracks to create a consistent listening experience.
  • Mastering: Audio engineers use LUFS targets when mastering to ensure their content will sound appropriate on various platforms without being compressed or limited by automatic normalization processes.
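
To make the first key point above more concrete, here is a minimal sketch comparing a peak reading with an averaged level for two test signals. Note the assumptions: this uses a plain RMS level in dBFS as a stand-in for loudness. It is not a true LUFS measurement, which additionally applies the frequency weighting and gating defined in ITU-R BS.1770. The function names and test signals are made up for illustration.

```python
import math

def peak_dbfs(samples):
    """Peak level in dBFS: based on the single largest absolute sample."""
    return 20 * math.log10(max(abs(s) for s in samples))

def rms_dbfs(samples):
    """Averaged (RMS) level in dBFS over the whole signal.

    Simplified stand-in for loudness: real LUFS metering also applies
    K-weighting and gating (ITU-R BS.1770), which are omitted here.
    """
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)

SAMPLE_RATE = 48_000  # samples per second (illustrative value)

# One second of a full-scale 1 kHz sine wave: loud for its whole duration.
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]

# One second of near-silence containing a single full-scale click.
click = [0.001] * SAMPLE_RATE
click[0] = 1.0

for name, signal in (("tone", tone), ("click", click)):
    print(f"{name}: peak {peak_dbfs(signal):.1f} dBFS, "
          f"averaged {rms_dbfs(signal):.1f} dBFS")
```

Both signals reach the same peak, so a peak meter cannot tell them apart. The averaged level differs by more than 40 dB, which is much closer to how differently the two would sound.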

Understanding and targeting appropriate LUFS levels is crucial for creating audio that translates well across different playback systems and platforms.
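
The arithmetic behind loudness normalization can be sketched in a few lines. Because a change of 1 LU corresponds to a change of 1 dB, the gain needed to reach a target is simply the target minus the measured loudness. The function names and the measured value below are hypothetical and chosen for illustration only. This is not AudioStack's implementation, and it assumes the integrated loudness has already been measured by a proper LUFS meter.

```python
def gain_to_target_db(measured_lufs, target_lufs):
    """Gain in dB that moves a track from its measured loudness to the target."""
    return target_lufs - measured_lufs

def db_to_linear(gain_db):
    """Convert a gain in dB to the factor each sample is multiplied by."""
    return 10 ** (gain_db / 20)

# Hypothetical integrated loudness reported by a LUFS meter for one track.
measured = -18.5

# Targets taken from the examples above: streaming (-14) and broadcast (-23).
for use_case, target in (("streaming", -14.0), ("broadcast", -23.0)):
    gain_db = gain_to_target_db(measured, target)
    factor = db_to_linear(gain_db)
    print(f"{use_case}: apply {gain_db:+.1f} dB (multiply samples by {factor:.3f})")
```

A positive gain can push peaks past full scale, which is why a limiter or a lower target is usually needed when turning quiet material up.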

AudioStack Tutorials

AudioStack offers a range of different Stacks, which leverage the power of the API to solve use cases across advertising (AdStack), podcasting (PodStack), Dynamic Creative Optimisation (DCOStack) and video voiceover creation (VidStack). Find out more about the different Stacks and Workflows that AudioStack supports here.

You can use AudioStack for a huge range of use cases. Some of the most popular ones include:


What's Next

Check out our guides to get started generating your first audio: