Voice permissions updated

We've updated our voice permissions, so more of our premium and high-quality voices are now available on all plans. This was a much-requested feature. 💯

Improved script sections functionality

Create a single section of a text-to-speech resource.

https://docs.audiostack.ai/reference/postspeechsection

Predicting the length of speech (based on text)

One problem that many of our customers have noted is that it's hard to tell how long a particular voice will take to speak a given text, because speech rates vary. So we built a proprietary ML model, based on our customer data, to predict this.

Furthermore, the predictions will keep getting better as usage of our voices grows.

import requests
url = "https://v2.api.audio/speech/predict"
# headers, voice and f (the script text) are assumed to be defined earlier
r = requests.post(url=url, headers=headers, json={"voice": voice, "text": f})
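
As a minimal sketch, the request above can also be assembled with only the standard library and inspected before anything is sent. The helper name `build_predict_request`, the `x-api-key` header name and the placeholder voice `sara` are assumptions for illustration. The response format is not described here, so the sketch stops short of parsing it.

```python
import json
import urllib.request

PREDICT_URL = "https://v2.api.audio/speech/predict"


def build_predict_request(api_key, voice, text):
    """Build (but do not send) a POST request for the predict endpoint."""
    body = json.dumps({"voice": voice, "text": text}).encode("utf-8")
    headers = {
        "x-api-key": api_key,  # assumed header name, check your API docs
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        PREDICT_URL, data=body, headers=headers, method="POST"
    )


request = build_predict_request("APIKEY", "sara", "Hello from Audiostack.")
print(request.get_method(), request.full_url)
print(json.loads(request.data))
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to log or check the payload before any call is made.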

Uploading custom sound templates

https://docs.audiostack.ai/docs/custom-sound-design-templates

Many of our customers asked how to upload custom sound beds or custom sound templates. The guide linked above explains how.

Voice pipeline improvements

We've made some improvements to our voice pipeline, so our voice cloning engine now works 3x better. We're constantly working on improvements like these to our infrastructure.

Mastering engine performance improvements

We improved the reliability of our mastering engine so you can produce more beautiful audio at scale.

Through our partnership with Eleven Labs, we bring you cutting-edge voices built on Multilingual Synthetic Models. With support for English, Spanish, German, Italian, French, Portuguese, Polish, and Hindi, our multilingual voices ensure an immersive and localized experience. You can now generate scripts in different languages, or combine several languages in one script. The following code example shows how to use Eleven Labs' voices within Audiostack.ai:

import audiostack

audiostack.api_key = "APIKEY"  # Add your API key here

# One section that mixes French and German text
script = """
<as:section name="main" soundsegment="main">
Un homme armé d’un couteau a semé la terreur jeudi 8 juin au matin dans un parc sur les bords du lac d’Annecy, blessant grièvement plusieurs enfants, avant d’être interpellé. Le Monde fait le point sur ce que l’on sait.
Das Brandenburger Tor ist eine bekannte Sehenswürdigkeit in Berlin und in der Geschichte Deutschlands von großer Bedeutung. Das Tor wurde im 18. Jahrhundert erbaut und war damals das Eingangstor zur Stadt. Es wurde von dem preußischen König Friedrich Wilhelm II gebaut.
</as:section>
"""

names = ["aspen"]
presets = ["musicenhanced", "balanced", "voiceenhanced"]
templates = ["solution_zen_30"]

script = audiostack.Content.Script.create(scriptText=script, scriptName="multilingual_test", projectName="multilingual_test")

for name in names:
    # Create text to speech for each voice
    speech = audiostack.Speech.TTS.create(
        scriptItem=script,
        voice=name,
        speed=1
    )
    for template in templates:
        for preset in presets:
            # Mix the speech with a sound template and a mastering preset
            mix = audiostack.Production.Mix.create(
                speechItem=speech,
                soundTemplate=template,
                masteringPreset=preset,
                public=True
            )
            print(mix)
            mix.download(fileName=f"french_{name}_{template}_{preset}")

            # Encode the mix as mp3 and download it
            encoded = audiostack.Delivery.Encoder.encode_mix(productionId=mix.productionId, preset="mp3")
            print(encoded)
            encoded.download()


Immerse your audience in a truly global audio experience, breaking language barriers with Audiostack's Multilingual Models powered by Eleven Labs. They are available on all our paid plans. Try them on Audiostack.

SonicSell V3

We made a number of changes to SonicSell. It's still in beta, so you'll need to reach out to us to get access.

This v3 has the following updates:

  • German frontend (change flag in the top right corner) 🇩🇪
  • German voices and German prompting
  • Generative script creation
  • Ad database
  • 700 voices selectable now
  • Length & character estimation

Coming next:

  • Advanced mode
  • Multi-version creation (by default)
  • Mood & tone input
  • More languages
  • Dynamic parameters / versioning
  • Timeout and other fixes (if your ad is not created within 30 seconds, please refresh or open a new tab)

Breaking Change

We refactored and redesigned our Voice Intelligence Layer so that it is controlled by a single voiceIntelligence boolean, and you no longer need to specify the inner workings of the dictionary or normaliser. We did this to improve the developer experience.

voiceIntelligence: bool = False

We deprecated:

"useDictionary": useDictionary,
"useTextNormalizer": useTextNormalizer

So use:

speech = audiostack.Speech.TTS.create(
    scriptItem=script,
    voice=name,
    speed=100,
    voiceIntelligence=True
)

Not:

speech = audiostack.Speech.TTS.create(
    scriptItem=script,
    voice=name,
    speed=100,
    useTextNormalizer=True,
    useDictionary=True
)
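
If you build your TTS arguments as a dictionary, the migration can be sketched as a small helper. `migrate_tts_kwargs` is a hypothetical function, not part of the SDK, and it assumes that enabling either of the old flags should map to `voiceIntelligence=True`.

```python
def migrate_tts_kwargs(kwargs):
    """Return a copy of kwargs with the deprecated flags replaced."""
    migrated = dict(kwargs)
    use_dictionary = migrated.pop("useDictionary", False)
    use_normalizer = migrated.pop("useTextNormalizer", False)
    # Assumption: either old flag being True means voiceIntelligence=True.
    if use_dictionary or use_normalizer:
        migrated["voiceIntelligence"] = True
    # Keep an explicit value if one was passed, otherwise default to False.
    migrated.setdefault("voiceIntelligence", False)
    return migrated


old_kwargs = {
    "voice": "aspen",
    "speed": 100,
    "useTextNormalizer": True,
    "useDictionary": True,
}
new_kwargs = migrate_tts_kwargs(old_kwargs)
print(new_kwargs)
```

The result can then be passed on with `audiostack.Speech.TTS.create(scriptItem=script, **new_kwargs)`.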

SonicSell for all AudioStack paid users

Exciting news! Our innovative AI audio ad tool, SonicSell, is now available for all paid AudioStack users. SonicSell generates radio-ready ads in just 30 seconds, leveraging AI, synthetic voices, and generative music. Experience the future of efficient, high-quality audio ad creation today. We can’t wait to hear your feedback.

Julep Connector

https://docs.audiostack.ai/reference/postjulep

We implemented a much requested customer feature 💯

  • You can now send your audio content to the Julep Podcast network. All you need is your API key.

Noise gate

We've been working hard on our audio intelligence. We just added a noise gate, which will help with noisy voices and should make your audio experience even better.
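
To illustrate the general idea, a noise gate silences the parts of a signal whose level falls below a threshold. The sketch below is a generic frame-based gate in plain Python, not AudioStack's implementation. The function name, threshold and frame size are hypothetical values chosen for the example.

```python
import math


def noise_gate(samples, threshold=0.05, frame_size=4):
    """Silence every frame whose RMS level is below the threshold."""
    gated = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        # Keep loud frames untouched, replace quiet frames with silence.
        gated.extend(frame if rms >= threshold else [0.0] * len(frame))
    return gated


# First four samples are low-level noise, last four are speech-level signal.
samples = [0.01, -0.02, 0.01, 0.0, 0.5, -0.4, 0.6, -0.5]
gated = noise_gate(samples)
print(gated)
```

Real gates usually add attack and release smoothing so that the signal does not cut in and out abruptly.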

Performance enhancements

We've been hard at work on the performance of our system. More on that soon.

Voice Update:

Effective from June 5th, our partner WellSaid Labs will be retiring the voice Narration_roxy from their platform, and consequently it will also be removed from the Audiostack Voice Library. We recommend you try Narration_fiona as the closest alternative.

We'll be adding some new voices in the near future.

Doc updates

We made some excellent additions to docs. 💯

📔

Dynamic Creative Optimisation 📣

One of the many reasons for creating a script is so that it can be reused for different voices, and also for different personalisation parameters.

See how to do dynamic creative optimisation: https://docs.audiostack.ai/docs/dynamic-creative-optimisation-dco
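
The idea of reusing one script across voices and personalisation parameters can be sketched locally. This example uses Python's `string.Template` purely for illustration. AudioStack has its own placeholder syntax, described in the guide linked above, and the script text, voices and audience values here are made up.

```python
from itertools import product
from string import Template

# Hypothetical script with two personalisation parameters.
script_template = Template("Hi $name, enjoy our best offers in $city today!")

voices = ["aspen", "sara"]
audiences = [
    {"name": "Anna", "city": "Berlin"},
    {"name": "Ben", "city": "London"},
]

# One variant per (voice, audience) combination.
variants = [
    {"voice": voice, "text": script_template.substitute(audience)}
    for voice, audience in product(voices, audiences)
]

for variant in variants:
    print(variant)
```

Two voices and two audiences give four variants from a single script, which is the main benefit of keeping the script separate from its parameters.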

Multispeaker support 🎉

Make your AI voices speak to each other!

https://docs.audiostack.ai/docs/multi-speaker-support

Enhanced mastering and timing parameters 🎓

An advanced feature

https://docs.audiostack.ai/docs/advance-timing-parameters

Other improvements

  • We've also been heavily involved in making our systems more stable and enhancing our security and control systems.

Validation

We have added a validation route in mastering:
https://docs.audiostack.ai/reference/validatemix
This allows you to validate your mastering request before sending it (and before consuming credits), which is great for use cases involving user-defined start and end values.

Updated voice library page

We've been working hard on making our voices easier to discover. We've invested in the following key features, which you can see on https://library.audiostack.ai/

These include:

  • Tagging and metadata: all our 600+ voices are tagged and up to date, so you can pick the perfect voice for your audio.
  • Search and discoverability: this was our most requested feature!
  • Filtering: you can filter by language, provider and various other tags.
  • The website is also 👌 and beautiful, with a responsive design.

Image: Showing the library with the search functionality

Audio Ad engine webpage

We updated our Audio Ad engine webpage. Have a look: https://aflorithmic.ai/audio-ad-engine

Enhanced audio quality

We've been hard at work over the last few months working on enhanced audio quality. This involved a complete rewrite of our audio engine.

You can look at some of the code under Mixing.

  • We've added enhanced plugins. You'll see some presets below; they are aimed at making your audio sound superb.
  • We're working hard on adding more sound templates and will be adding more in the future.

Here's a demo example in code. Run this and listen to how awesome the audio sounds 🎧

import audiostack

audiostack.api_key = "APIKEY"  # Add your API key here

script = """
<as:section name="main" soundsegment="main">
 Are you ready to explore the vibrant city of Barcelona? Do you want to experience the culture, the nightlife, and the beauty of this incredible city?
 Then we've got just the thing for you!
 Join our travel agency for an unforgettable trip to Barcelona. Experience the bustling city streets, the stunning canals, and the charming architecture that Amsterdam is known for. Get lost in the vibrant nightlife,
 explore the world-renowned museums, or simply soak in the local culture.
</as:section>
"""

names = ["Wren", "jollie", "aspen", "monica"]
presets = ["musicenhanced", "balanced", "voiceenhanced"]
templates = ["your_take_30", "listen_up_30", "future_focus_30"]

script = audiostack.Content.Script.create(scriptText=script, scriptName="test", projectName="ams_tests_2")

for name in names:
    # Create text to speech for each voice
    speech = audiostack.Speech.TTS.create(
        scriptItem=script,
        voice=name,
        speed=100
    )
    for template in templates:
        for preset in presets:
            # Mix the speech with each sound template and mastering preset
            mix = audiostack.Production.Mix.create(
                speechItem=speech,
                soundTemplate=template,
                masteringPreset=preset,
            )
            print(mix)
            mix.download(fileName=f"V4_{name}_{template}_{preset}")


Pricing updates

We are constantly investing in better products and services for our customers, so we'll occasionally change some things in our pricing systems.

  • We recently updated a few things in our pricing systems to improve the customer experience. You shouldn't notice much change, except for a few cheaper endpoints.
  • We changed the £50 extra credits limit to £300 - so you'll be charged less frequently. This is due to customer feedback 💯
  • We also lowered some of our pricing on some voices - you can see the updates on the pricing page https://audiostack.readme.io/docs/pricing

Voices Library

We recently completed a full rewrite of our voices library. This allows us to handle security better, which we take seriously, and also lets users find voices faster.

  • We redesigned our library management system using Contentful, adding validation.
  • We added an integration between our voice systems and our search engine - which will enhance voice discoverability in our frontends.
  • We also added enhanced voice permissions, authentication and integrated with our user and organisation permission system.

Lots of this is under the hood, but it's an example of the sort of customer experience and operational excellence processes we're investing in. Well done to all in the team! Migrations are hard! 🗻

Voice Intelligence layer - Abbreviations and Ordinal numbers

We've been hard at work on our voice intelligence layer 👌

Ordinal Numbers

📘

Definition of Ordinal numbers

A number defining the position of something in a series, such as ‘first’, ‘second’, or ‘third’. Ordinal numbers are used as adjectives, nouns, and pronouns.

The Voice Intelligence Layer is now able to normalise ordinal numbers in German 🇩🇪. It covers all ordinal numbers in the format X. that are used as adjectives. Here’s an example:

import audiostack
import os

audiostack.api_key = os.environ["AUDIO_STACK_DEV_KEY"]
text = "Ich war wie im 7. Himmel"  

script = audiostack.Content.Script.create(scriptText=text)
tts = audiostack.Speech.TTS.create(scriptItem=script, voice="lena", useDictionary=True, useTextNormalizer=True)
print(tts.data['sections'][0]['preview'])

Out:
"Ich war wie im siebten Himmel."

Abbreviations

The Voice Intelligence Layer is now able to expand German abbreviations to their full form. It covers the 130 most frequent abbreviations in German. Here’s an example:


import audiostack
import os

audiostack.api_key = os.environ["AUDIO_STACK_DEV_KEY"]
text = "Ich kenne eine Abk. zum Bhf. von Ulm."  

script = audiostack.Content.Script.create(scriptText=text)
tts = audiostack.Speech.TTS.create(scriptItem=script, voice="lena", useDictionary=True, useTextNormalizer=True)
print(tts.data['sections'][0]['preview'])

Out:
"Ich kenne eine Abkürzung zum Bahnhof von Ulm."

JavaScript SDK

After launching the AudioStack API and Python SDK, we've continued to work on features that free up your engineering resources. Now you can easily integrate our API using our new JS SDK, a lightweight library that makes it easy to create professional audio assets within seconds.
We've designed it to simplify the coding process: create a script, choose one of our 600+ voices, add a sound design and voilà, you can encode your audio file as a high-quality mp3 and download it. This allows you to focus on creating great audio experiences without having to worry about the underlying code. 🚀 💯

You can find it on npm here: https://www.npmjs.com/package/@aflr/audiostack

You can see an example here - https://stackblitz.com/edit/node-rwvl9n?file=example.js

How to install it!

npm install @aflr/audiostack

# or

yarn add @aflr/audiostack

How to use it

/**
 * Audiostack Hello World Example
 *
 * Add your API key on line 12 and run
 * node example.js
 */

// Import the library
const { Audiostack } = require('@aflr/audiostack');

// Provide your api key from the Audiostack Console
const apiKey = 'your_api_key';

/**
 * This example demonstrates how to produce and download a file using the Audiostack API.
 */
const example = async () => {
  // Create a new instance of Audiostack
  const AS = new Audiostack(apiKey);

  // Create a script asset with our hello text
  const script = await AS.Content.Script.create({
    scriptText:
      'Hello from Audiostack. This is a test script. I hope you enjoy it.',
  });

  // Create a speech asset using our script and the voice "sara"
  const tts = await AS.Speech.Tts.create({
    scriptId: script.scriptId,
    voice: 'sara',
  });

  // Create a mix with our speech asset and add the sound template "3am"
  const mix = await AS.Production.Mix.create({
    speechId: tts.speechId,
    soundTemplate: '3am',
  });

  // Encode the file to high quality mp3
  const encode = await AS.Delivery.Encoder.encodeMix({
    productionId: mix.productionId,
    preset: 'mp3_high',
  });

  // Download the file to the project directory
  const path = await encode.download();

  // Print the path to the file
  console.log(`File downloaded to: ${path}`);
};

// Run the example
example();

We look forward to seeing feedback and seeing what YOU build with our SDK.

Voice Cloner Re-vamp:

  • Introduction of Voice Cloning Protocols (Speed, Standard and Premium)

    📘 Speed, Standard, Premium

    Speed is the basic protocol.

    The Standard protocol contains longer and more thorough scripts, which improve the quality of the voice model.

    The Premium protocol produces our best-quality voices.

End to End German Voice Pipeline

  • For German-speaking 🇩🇪 customers, we now have a fully end-to-end German voice pipeline, so you can record your voice and get a synthetic representation in less than 48 hours. We're continuously working on improving this.

Voice Cloning Billing

We now have enhanced billing for our voice cloning experience.

Dynamic Scripts

  • We now automatically insert First Name, Last Name and Org Name into the script, so you can personalise it for your own use case!

New Console Page

  • Better UI/UX
  • Real-time progress monitoring

Here are some screenshots!

Showing the plan and the number of organisations

Showing your monthly progress and your credit usage

New Account Management Page includes:

  • Improved UI/UX
  • Improved Analytics (Last Activity, Number of credits spent by User, Organisation and Account)

Here's a screenshot

Improved Billing Experience

We recently rolled out improvements to our billing experience. As a result, your credit costs may change.

❗️

Changes in billing (costs will change)

You may notice some changes in your billing with these new endpoints.

Provider                Cost in credits per 10 seconds
Messner                 1.5
Resemble                9
Deepzen                 18
Polly                   1
IBM                     1
Cerevoice (cereproc)    1.2
Azure                   1.2
Google                  1
Wellsaid                9
ElevenLabs              12
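
Using the rates listed above, a rough cost estimate can be sketched as follows. The helper `estimate_credits` is hypothetical, and it assumes costs scale linearly with duration. Actual billing may round to 10-second blocks or differ in other ways.

```python
# Credits per 10 seconds of audio, copied from the list above.
CREDITS_PER_10_SECONDS = {
    "Messner": 1.5,
    "Resemble": 9,
    "Deepzen": 18,
    "Polly": 1,
    "IBM": 1,
    "Cerevoice (cereproc)": 1.2,
    "Azure": 1.2,
    "Google": 1,
    "Wellsaid": 9,
    "ElevenLabs": 12,
}


def estimate_credits(provider, duration_seconds):
    """Estimate credits for a clip, assuming linear proration (an assumption)."""
    rate = CREDITS_PER_10_SECONDS[provider]
    return rate * duration_seconds / 10


print(estimate_credits("ElevenLabs", 30))  # 36.0
print(estimate_credits("Polly", 30))       # 3.0
```

For a 30-second ad, for example, this puts an ElevenLabs voice at 36 credits versus 3 for Polly under the linear assumption.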

Bug fixes

  • We fixed a bug in applying sound effects to Messner voices. This now works as expected.