Introduction to Presets

Presets are pre-made processing chains with pre-configured settings that can be applied to vocals, instruments, or any audio tracks. Essentially, they’re templates that are crafted by our in-house audio experts to make sure your audio sounds just right, whether it's a podcast you're creating or a radio ad.

Encoding Presets

The AudioStack API gives you access to a range of different encoding presets. These determine factors like the sample rate and format of your delivered file. For most casual uses, such as sharing a demo with a colleague by email, MP3 is considered standard. When you need higher quality audio, such as for professional editing, an uncompressed format (such as WAV) or lossless format (such as FLAC) is usually more appropriate.

Preset | Description
mp3 | mp3 format at 320k (320 CBR). Lossy, so lower audio quality than WAV or FLAC, but a small file size. Really useful for low-quality streaming (e.g. over 3G, or when quality doesn’t matter). Supported universally across platforms and operating systems, so good for sending files on mobile devices, putting on websites, etc.
wav | wav format at 48 kHz sample rate and 16 bits per sample. High quality, large file size, uncompressed. Great for high-quality storage and widely supported.
ogg | ogg format at 320k. Higher quality audio than MP3 (320 kbps) but similar file size, so can be useful for developers trying to distribute sounds. However, it is not universally compatible, so it is better suited to e.g. audio streaming applications.
flac | flac format at 48 kHz sample rate and 16 bits per sample. High quality, large file size. Lossless (but compressed, so a smaller file size than WAV). Great for storing professional-quality recordings, but less useful for consumer streaming. Open source, so generally well supported.
mp3_very_low | mp3 lowest quality (~64 kbps VBR)
mp3_low | mp3 low quality (~115 kbps VBR)
mp3_medium | mp3 medium quality (~165 kbps VBR)
mp3_high | mp3 high quality (~190 kbps VBR)
mp3_very_high | mp3 very high quality (~245 kbps VBR)
mp3_alexa | mp3 format, mono, at 48 kHz sample rate
mp3_alexa_48br | mp3 format, mono, at 48 kbps bit rate and 24 kHz sample rate
m4a | m4a format at 320k

Loudness Presets

It's also possible to specify the loudness required. This is particularly useful when you're planning to deliver the same asset across different media - for example, to deliver an ad for both Spotify and Radio broadcast.

Preset | Description
spotify | -16 LUFS Loudness Integrated and -2 dB True Peak
radio | -8 LUFS Loudness Integrated and -2 dB True Peak
podcast | -16 LUFS Loudness Integrated and -3 dB True Peak
applePodcast | -16 LUFS Loudness Integrated and -1 dB True Peak
youtube | -14 LUFS Loudness Integrated and -1 dB True Peak
lowVol | -20 LUFS Loudness Integrated and -5 dB True Peak
podcastDynamic | -18 LUFS Loudness Integrated and -1 dB True Peak
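
To make the targets above concrete, here is a minimal sketch that stores the table values as plain data and works out how much gain a track would need to reach a preset's integrated loudness. The names `LOUDNESS_PRESETS`, `gain_to_target_db` and `db_to_linear` are hypothetical helpers written for this illustration; they are not part of the AudioStack API, and a real mastering chain would also limit peaks rather than only apply gain.

```python
# Values copied from the loudness preset table:
# preset name -> (integrated loudness in LUFS, true peak ceiling in dB)
LOUDNESS_PRESETS = {
    "spotify": (-16.0, -2.0),
    "radio": (-8.0, -2.0),
    "podcast": (-16.0, -3.0),
    "applePodcast": (-16.0, -1.0),
    "youtube": (-14.0, -1.0),
    "lowVol": (-20.0, -5.0),
    "podcastDynamic": (-18.0, -1.0),
}


def gain_to_target_db(measured_lufs, preset):
    """Gain in dB that moves a measured integrated loudness to the preset target."""
    target_lufs, _true_peak_ceiling = LOUDNESS_PRESETS[preset]
    return target_lufs - measured_lufs


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)


# Hypothetical input: a mix measured at -23 LUFS integrated loudness.
for preset in ("spotify", "radio"):
    gain = gain_to_target_db(-23.0, preset)
    print(f"{preset}: {gain:+.1f} dB (x{db_to_linear(gain):.2f} amplitude)")
```

Under these assumptions the same mix needs 8 dB more gain for the radio preset than for the spotify preset, which is why a loud target like radio is only recommended when the destination really requires it.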

🚧

Some delivery settings lead to higher quality sounding audio than others

For radio broadcasting, audio assets need to be particularly loud. For other use cases, we'd recommend avoiding the radio preset and choosing the preset whose description best matches your use case.

Advanced Concepts

📘

This section is optional, and gives more detailed information about the concepts behind AudioStack's Delivery Presets.

You don't need a detailed understanding of this terminology to work with the API.

If you're not an audio expert 🤓 you may need to understand some terminology.

CBR (Constant Bit Rate)
CBR stands for Constant Bit Rate. It refers to a method of encoding audio (or video) where the bit rate remains consistent throughout the entire file. This means that every second of the audio will have the same amount of data, ensuring a predictable file size and a consistent quality level. CBR is useful for streaming and scenarios where maintaining a steady bit rate is important.

VBR (Variable Bit Rate)
VBR stands for Variable Bit Rate. In this encoding method, the bit rate varies depending on the complexity of the audio at any given moment. During complex sections, the bit rate increases to maintain quality, while during simpler sections, it decreases to save space. VBR generally results in better overall audio quality and smaller file sizes compared to CBR, but with less predictability in file size.

Audio Codec
An audio codec is a software or hardware tool that encodes and decodes audio data. Codecs typically compress audio to reduce its size and decompress it for playback. They are essential for converting raw audio into formats suitable for storage, transmission, and playback. Common audio codecs include MP3, AAC, Vorbis (usually stored in an Ogg container), and FLAC. WAV is strictly a container format that usually holds uncompressed PCM audio, but it is often listed alongside codecs.

🚧

Codec and Preset

In the AudioStack API, we refer to presets (for example mp3, aac and ogg) rather than codecs. Just remember that what this section calls a codec is called a preset in the API.

Bit Rate
Bit rate refers to the amount of data processed per unit of time in an audio file, usually measured in kilobits per second (kbps). It affects both audio quality and file size: for a given codec, higher bit rates mean better sound quality and larger files, while lower bit rates reduce both quality and size. For example, a 320 kbps MP3 file has higher quality and larger size than a 128 kbps MP3 file.
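
As a rough illustration of how bit rate drives file size, the sketch below multiplies bit rate by duration. The helper name `estimated_size_bytes` and the 30-second duration are assumptions made for this example. It ignores container overhead and metadata, and for VBR files the bit rate is only an average, so treat the result as an estimate.

```python
def estimated_size_bytes(bit_rate_kbps, duration_seconds):
    """Approximate audio payload size; 1 kbps is taken as 1000 bits per second."""
    total_bits = bit_rate_kbps * 1000 * duration_seconds
    return total_bits / 8  # 8 bits per byte


# Hypothetical 30-second ad at the two MP3 bit rates mentioned above.
for rate in (128, 320):
    size_mb = estimated_size_bytes(rate, 30) / 1_000_000
    print(f"{rate} kbps for 30 s: about {size_mb:.2f} MB")
```

At 128 kbps the 30-second file comes to about 0.48 MB, and at 320 kbps about 1.2 MB.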

Sample Rate
Sample rate is the number of samples of audio carried per second, measured in hertz (Hz). It determines the frequency range that can be accurately represented in a digital audio file. Common sample rates include 44.1 kHz (standard for CDs), 48 kHz (used in professional audio and video production), and higher rates like 96 kHz or 192 kHz for high-resolution audio. Higher sample rates can capture more detail but result in larger files.
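
The two relationships behind this definition can be sketched in a few lines: the highest frequency a sample rate can represent is half that rate (the Nyquist limit), and the data rate of uncompressed PCM audio is sample rate times bits per sample times channel count. The function names are hypothetical, and the channel counts are assumptions, since the preset table does not state how many channels the wav preset uses.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency that can be represented at a given sample rate."""
    return sample_rate_hz / 2


def pcm_bit_rate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Data rate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# 48 kHz, 16 bits per sample; mono and stereo shown as assumed channel counts.
print(nyquist_hz(48_000))                # 24000.0 Hz
print(pcm_bit_rate_kbps(48_000, 16, 1))  # 768.0 kbps
print(pcm_bit_rate_kbps(48_000, 16, 2))  # 1536.0 kbps
```

Even in mono, uncompressed 48 kHz / 16-bit audio carries more than twice the data of a 320 kbps MP3, which is why WAV files are so much larger.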

In summary, CBR and VBR are methods of encoding bit rates in audio files, with CBR maintaining a constant bit rate and VBR adjusting it as needed. An audio codec is a tool for encoding and decoding audio data, while bit rate and sample rate are metrics that affect the quality and size of audio files.

LUFS

LUFS (Loudness Units relative to Full Scale) is a standard measurement unit for perceived loudness in audio. It is designed to reflect how humans perceive sound, taking into account the sensitivity of human hearing across different frequencies. LUFS is widely used in broadcast and streaming industries to ensure consistent audio levels across different content. Here’s a breakdown of its key aspects:

Key Aspects of LUFS

  1. Perceived Loudness:

    • LUFS measures loudness in a way that approximates human hearing, which is more sensitive to certain frequencies. This makes LUFS a more accurate representation of how loud audio actually sounds to a listener than unweighted level measurements such as peak or RMS level in decibels (dB).
  2. Integrated Loudness:

    • Integrated loudness is the average loudness over the entire duration of an audio track or broadcast. It provides a comprehensive measure of the overall loudness of a piece of content.
  3. Short-Term and Momentary Loudness:

    • Short-term loudness measures the average loudness over a short period, typically three seconds. Momentary loudness measures the loudness over an even shorter period, typically 400 milliseconds. These metrics help in monitoring and controlling the loudness of dynamic content.
  4. True Peak Level:

    • Loudness targets are usually paired with a true peak limit. True peak is a separate measurement (expressed in dBTP, not LUFS) of the highest level an audio signal reaches. Limiting it ensures that the audio does not distort or clip when played back on different systems.

Applications of LUFS

  • Broadcasting:

    • Ensures consistent loudness levels across different programs and channels, preventing sudden jumps in volume between commercials and regular programming.
  • Streaming Services:

    • Platforms like Spotify, YouTube, and Apple Music use LUFS to normalize loudness across different tracks, ensuring a consistent listening experience for users.
  • Podcasting:

    • Helps maintain consistent audio levels across different episodes and segments, improving listener experience.
  • Post-Production:

    • Used by audio engineers to ensure that final mixes meet industry loudness standards, making them suitable for various playback environments. Our Audio Mastering and Mixing engine in the AudioStack platform does this all for you!

Industry Standards

Different industries have adopted specific LUFS targets to standardize loudness levels:

  • Broadcast: Typically around -23 LUFS.
  • Streaming (Spotify, Apple Music): Around -14 to -16 LUFS.
  • Podcasts: Often around -16 LUFS.

In summary, LUFS is a crucial measurement in the audio industry, providing a standardized way to measure and control perceived loudness, ensuring a consistent and enjoyable listening experience across various platforms and devices.

You can also see that we've tuned our presets to the industry standards and tested them in the wild with customers, so you don't need to worry about this. As new platforms emerge or industry standards change, we will adjust these presets for you and add new ones.

True Peak

True Peak in audio refers to the highest level of an audio signal, taking into account the inter-sample peaks that can occur between the discrete samples of a digital audio file. It provides a more accurate representation of the maximum amplitude that the signal reaches, which is crucial for preventing distortion during playback, especially in digital-to-analog conversion.

Key Aspects of True Peak

Inter-Sample Peaks:

In digital audio, the signal is represented by discrete samples. However, the analog waveform that gets reconstructed from these samples can exceed the digital sample values, creating peaks that are higher than the highest sample value. These are called inter-sample peaks.
Measurement:

True Peak is measured using specialized meters that can interpolate the signal between the samples to detect these inter-sample peaks. This provides a more accurate measure of the signal's peak levels compared to traditional peak meters that only measure sample values.
Digital-to-Analog Conversion:

During playback, digital audio is converted to analog form. If the inter-sample peaks are not accounted for, this conversion can result in clipping and distortion. True Peak measurement helps in avoiding this by ensuring that the signal does not exceed the maximum allowable level during conversion.
Standards and Practices:

Various audio standards recommend True Peak measurements to ensure audio quality. For instance, broadcasting and streaming platforms often set maximum True Peak levels (e.g., -1 dBTP or -2 dBTP) to prevent distortion and ensure consistent playback quality across different devices.
True Peak Level (dBTP):

True Peak levels are expressed in decibels True Peak (dBTP), measured relative to digital full scale. For example, a True Peak level of -1 dBTP indicates that the peak level of the audio signal is 1 decibel below the maximum possible level that can be represented in a digital format.
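
The inter-sample peak idea described above can be demonstrated with a small, self-contained sketch. It samples a sine wave at exactly four samples per cycle with a 45° phase offset, so that no sample lands on the crest of the wave, then estimates the true peak by interpolating between samples at 4x oversampling. The Hann-windowed sinc interpolator and the function names are assumptions chosen for illustration; real true peak meters use their own oversampling filters, and this is not how the AudioStack engine measures peaks.

```python
import math


def sinc(u):
    """Normalized sinc: sin(pi*u) / (pi*u), with sinc(0) = 1."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)


def sample_peak(samples):
    """Largest absolute sample value (what a plain peak meter reports)."""
    return max(abs(s) for s in samples)


def estimated_true_peak(samples, oversample=4, half_width=16):
    """Estimate the peak of the reconstructed waveform by interpolating
    between samples with a Hann-windowed sinc kernel."""
    peak = 0.0
    n = len(samples)
    # Skip the edges, where the kernel would run past the available samples.
    for i in range(half_width * oversample, (n - half_width) * oversample):
        t = i / oversample  # position in units of samples
        centre = math.floor(t)
        acc = 0.0
        for k in range(centre - half_width + 1, centre + half_width + 1):
            u = t - k
            window = 0.5 * (1.0 + math.cos(math.pi * u / half_width))
            acc += samples[k] * sinc(u) * window
        peak = max(peak, abs(acc))
    return peak


def to_db(level):
    """Convert a linear level (1.0 = full scale) to decibels."""
    return 20 * math.log10(level)


# Full-scale sine at a quarter of the sample rate, offset by 45 degrees.
samples = [math.sin(math.pi * n / 2 + math.pi / 4) for n in range(64)]
sp = sample_peak(samples)
tp = estimated_true_peak(samples)
print(f"sample peak:         {to_db(sp):.2f} dBFS")
print(f"estimated true peak: {to_db(tp):.2f} dBTP")
```

Every sample in this signal sits at about 0.707 of full scale (roughly -3 dBFS), yet the waveform they describe reaches full scale between samples. That gap of about 3 dB is the kind of headroom a True Peak limit is designed to protect.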

Applications of True Peak

Broadcasting:

Ensuring that audio content adheres to True Peak standards prevents distortion when broadcast over various channels. We handle this for you in the AudioStack API, and even let you customize your own if you have a specific need.

📘

Reach out to support if we don't support your needs

If you find a specific set of true peak parameters that we don't support, reach out to support[at]audiostack[dot]ai and we'll be sure to implement it for you. Please share details about the platform needs as well!

Music Production:

True Peak measurement helps in mastering audio tracks to ensure they sound good across different playback systems without clipping or distortion.
Streaming Services:

Services like Spotify and Apple Music apply True Peak limits alongside loudness normalization to prevent distortion during playback on different devices.

Post-Production:

Audio engineers use True Peak metering to ensure that final mixes are within acceptable peak levels, maintaining audio integrity.

Importance of True Peak

Prevents Clipping:

By accurately measuring and limiting the True Peak level, audio engineers can prevent clipping, which causes unpleasant distortion.

Ensures Consistent Playback Quality:

Maintaining appropriate True Peak levels ensures that audio sounds consistent and undistorted across various playback systems and environments.

📘

True Peak

True Peak is a critical measurement in audio production that accounts for inter-sample peaks to ensure high-quality, distortion-free playback of digital audio.