over 2 years ago

16th June - Voice Permissions updated, Custom sound templates, Predicting the length of speech

by Peadar Coyle

Voice permissions updated

We've updated our voice permissions, so more of our premium and high-quality voices are now available on all plans. This is a much-requested feature. 💯

Improved script sections functionality

Create a single section of a text-to-speech resource.

https://docs.audiostack.ai/reference/postspeechsection

Predicting the length of speech (based on text)

One problem that a lot of our customers have noted is that it's hard to tell how long a particular voice will take to speak the text provided, as there's variance in speech rates. So we built a proprietary ML model, based on our customer data, to ship this feature.

Furthermore this will keep getting better as our usage of voices grows.

import requests
# headers, voice and text must be defined before this call
url = "https://v2.api.audio/speech/predict"
r = requests.post(url=url, headers=headers, json={"voice": voice, "text": text})
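
For comparison, here is a naive baseline that estimates duration from word count alone. It is only a sketch and not the model behind the endpoint: the function name and the 150 words-per-minute default are assumptions made for illustration. Because it ignores per-voice speech rates, it shows the kind of guesswork the prediction route is meant to replace.

```python
def estimate_speech_seconds(text: str, words_per_minute: float = 150.0) -> float:
    """Rough duration estimate from word count (not the AudioStack model)."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    # Whitespace-separated tokens are treated as words
    word_count = len(text.split())
    return word_count / words_per_minute * 60.0


if __name__ == "__main__":
    sample = "Hello from Audiostack. This is a test script. I hope you enjoy it."
    # The same text gives different durations at different assumed rates
    print(f"{estimate_speech_seconds(sample):.1f} seconds at 150 wpm")
    print(f"{estimate_speech_seconds(sample, words_per_minute=120):.1f} seconds at 120 wpm")
```

The same 13-word text comes out longer at 120 wpm than at 150 wpm, which is why a single fixed rate cannot stand in for a per-voice prediction.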

Uploading custom sound templates

https://docs.audiostack.ai/v1.0/docs/custom-sound-design-templates

Many of our customers asked how to upload custom sound beds or custom sound templates. The guide linked above walks through it.

Voice pipeline improvements

We've made some improvements to our voice pipeline so our voice cloning engine works 3x better. We're constantly working on these improvements to our infrastructure.

Mastering engine performance improvements

We improved the reliability of our mastering engine so you can produce more beautiful audio at scale.

over 2 years ago

9th June - Introducing Multilingual Models in Audiostack; SonicSell v3

by Maria Chatzi

Through our partnership with Eleven Labs, we bring you cutting-edge voices equipped with the groundbreaking feature of Multilingual Synthetic Models. With an extensive selection of languages such as English, Spanish, German, Italian, French, Portuguese, Polish, and Hindi, our MultiLingual voices ensure an immersive and localized experience. Now, effortlessly generate scripts in different languages or combine them using the following code example to leverage the power of Eleven Labs' voices within Audiostack.ai:

import audiostack
import os


audiostack.api_key = "APIKEY"  # Add your API key here



script = """
<as:section name="main" soundsegment="main">
Un homme armé d’un couteau a semé la terreur jeudi 8 juin au matin dans un parc sur les bords du lac d’Annecy, blessant grièvement plusieurs enfants, avant d’être interpellé. Le Monde fait le point sur ce que l’on sait.
Das Brandenburger Tor ist eine bekannte Sehenswürdigkeit in Berlin und in der Geschichte Deutschlands von großer Bedeutung. Das Tor wurde im 18. Jahrhundert erbaut und war damals das Eingangstor zur Stadt. Es wurde von dem preußischen König Friedrich Wilhelm II gebaut.
</as:section>
"""



names = ["aspen"]
presets = ["musicenhanced", "balanced", "voiceenhanced"]
templates = ["solution_zen_30"]



script = audiostack.Content.Script.create(scriptText=script, scriptName="multilingual_test", projectName="multilingual_test")        

for name in names:
    # Creates text to speech
    speech = audiostack.Speech.TTS.create(
            scriptItem=script,
            voice=name,
            speed=1
    )
    for template in templates:

        for preset in presets:

            mix = audiostack.Production.Mix.create(
                speechItem=speech,
                soundTemplate=template,
                masteringPreset=preset, public=True
            )
            print(mix)
  

            mix.download(fileName=f"french_{name}_{template}_{preset}")
            encoded = audiostack.Delivery.Encoder.encode_mix(productionId=mix.productionId, preset="mp3")
            print(encoded)
            encoded.download()
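
The nested loops above request one mix per combination of voice, sound template and mastering preset. As a quick way to check how many mixes a run will ask for, and what the files will be called, the combinations can be listed locally first. This is a sketch only: `planned_file_names` is a hypothetical helper, not part of the SDK, and it makes no API calls.

```python
from itertools import product

names = ["aspen"]
presets = ["musicenhanced", "balanced", "voiceenhanced"]
templates = ["solution_zen_30"]


def planned_file_names(names, templates, presets, prefix="french"):
    """List the file names the nested loops would produce, in loop order."""
    return [
        f"{prefix}_{name}_{template}_{preset}"
        for name, template, preset in product(names, templates, presets)
    ]


if __name__ == "__main__":
    planned = planned_file_names(names, templates, presets)
    print(f"{len(planned)} mixes will be requested")
    for file_name in planned:
        print(file_name)
```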

Immerse your audience in a truly global audio experience, breaking language barriers with Audiostack's Multilingual Models powered by Eleven Labs. They are available on all our paid plans. Try them out in Audiostack.

SonicSell V3

We made a number of changes to SonicSell (it's still in beta, so you'll need to reach out to us to get access).

This v3 has the following updates:

German frontend (change flag in the top right corner) 🇩🇪

German voices
German prompting
Generative script creation
Ad database

700 Voices selectable now
Length & character estimation

Coming next:

Advanced mode
Multi-version creation (on default)
Mood & tone input
More languages
Dynamic parameters / versioning
Time Out and other fixes (*if ad is not created in 30 seconds please refresh or start new tab)

over 2 years ago

8th of June - Voice Intelligence Layer

by Peadar Coyle

Breaking Change

We refactored and redesigned our Voice Intelligence Layer to use a single voiceIntelligence boolean, rather than exposing the inner workings of the dictionary or normaliser. We did this to enhance the developer experience.

voiceIntelligence: bool = False

We deprecated:

"useDictionary": useDictionary,  
 "useTextNormalizer": useTextNormalizer

Use:

speech = audiostack.Speech.TTS.create(
    scriptItem=script,
    voice=name,
    speed=100,
    voiceIntelligence=True,  # a boolean, not the string "true"
)

Instead of:

speech = audiostack.Speech.TTS.create(
    scriptItem=script,
    voice=name,
    speed=100,
    useTextNormalizer=True,  # deprecated
    useDictionary=True,  # deprecated
)
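
If you build the TTS arguments in one place, a small shim can translate the deprecated flags before the call. This is a sketch under an assumption: `migrate_tts_kwargs` is a hypothetical helper, and it treats either legacy flag being switched on as equivalent to voiceIntelligence=True, which the changelog does not spell out.

```python
def migrate_tts_kwargs(kwargs: dict) -> dict:
    """Return a copy of kwargs with the deprecated flags replaced by voiceIntelligence."""
    migrated = dict(kwargs)
    use_dictionary = migrated.pop("useDictionary", False)
    use_normalizer = migrated.pop("useTextNormalizer", False)
    # Assumption: either legacy flag being on maps to voiceIntelligence=True.
    # An explicit voiceIntelligence value from the caller is left untouched.
    if "voiceIntelligence" not in migrated:
        migrated["voiceIntelligence"] = bool(use_dictionary or use_normalizer)
    return migrated


if __name__ == "__main__":
    old_style = {"voice": "lena", "useDictionary": True, "useTextNormalizer": True}
    print(migrate_tts_kwargs(old_style))
```

The result can then be unpacked into the call, for example `audiostack.Speech.TTS.create(scriptItem=script, **migrate_tts_kwargs(old_style))`.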

almost 3 years ago

2nd June - Julep Connector, SonicSell

by Peadar Coyle

SonicSell for all AudioStack paid users

Exciting news! Our innovative AI audio ad tool, SonicSell, is now available for all paid AudioStack users. SonicSell generates radio-ready ads in just 30 seconds, leveraging AI, synthetic voices, and generative music. Experience the future of efficient, high-quality audio ad creation today. We can’t wait to hear your feedback.

Julep Connector

https://docs.audiostack.ai/reference/postjulep

We implemented a much requested customer feature 💯

You can now send your audio content to the Julep Podcast network. All you need is your API key.

Noise gate

We've been working hard on our audio intelligence and have just added a noise gate, which will help with noisy voices. This should make your audio experience even better.

Performance enhancements

We've been hard at work on the performance of our system. More on that soon but we'r

almost 3 years ago

19th May - Voices update

by Peadar Coyle

Voice Update:

Effective from June 5th, our partner WellSaid Labs will be retiring the voice Narration_roxy from their platform, and consequently from the Audiostack Voice Library. We recommend you try Narration_fiona as the closest alternative.

We'll be adding some new voices in the near future.

almost 3 years ago

16th May - DOCS DOCS DOCS

by Peadar Coyle

Doc updates

We made some excellent additions to docs. 💯

Dynamic Creative Optimisation 📣

One of the many reasons for creating a script is so that it can be reused for different voices, and also for different personalisation parameters.

How to do dynamic creative optimisation: https://docs.audiostack.ai/docs/dynamic-creative-optimisation-dco
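
To illustrate the idea of one script reused with different personalisation parameters, here is a small local sketch. It does not call the AudioStack API: the `{{name}}` placeholder syntax, the `render_script` helper and the sample parameters are all assumptions made for the example, so check the guide linked above for the syntax the platform actually uses.

```python
import re

# Assumed placeholder syntax for this sketch: {{parameter_name}}
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_script(template: str, params: dict) -> str:
    """Fill every placeholder in the template from params, failing on missing keys."""
    def substitute(match):
        key = match.group(1)
        if key not in params:
            raise KeyError(f"missing personalisation parameter: {key}")
        return str(params[key])

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    template = "Hi {{first_name}}, your order from our {{city}} store is ready."
    audiences = [
        {"first_name": "Anna", "city": "Berlin"},
        {"first_name": "Luis", "city": "Madrid"},
    ]
    # One template, one rendered script per audience
    for params in audiences:
        print(render_script(template, params))
```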

Multispeaker support 🎉

Make your AI voices speak to each other!

https://docs.audiostack.ai/docs/multi-speaker-support

Enhanced mastering and timing parameters 🎓

An advanced feature

https://docs.audiostack.ai/docs/advance-timing-parameters

Other improvements

We've also been heavily involved in making our systems more stable and enhancing our security and control systems.

Validation

We have added a validation route in mastering:
https://docs.audiostack.ai/reference/validatemix
This lets you validate your mastering request before sending it (and before consuming credits), which is great for use cases involving user-defined start and end values.

almost 3 years ago

12th of May - Voice Discoverability, updated website, and more beautiful audio

by Peadar Coyle

Updated voice library page

We've been working hard on making our voices easier to discover. We've invested in the following key features which you can see on https://platform.audiostack.ai/workflows/voice-library

These include

Tagging and metadata - all our 600+ voices are tagged and up to date, so you can pick the perfect voice for your audio.
Search and discoverability, it was our most requested feature!
You can specify by language and provider and various other tags.
The website is also 👌and beautiful with responsive design.

Image: Showing the library with the search functionality

Audio Ad engine webpage

We updated our Audio Ad engine webpage: https://aflorithmic.ai/audio-ad-engine. Have a look!

Enhanced audio quality

We've been hard at work over the last few months working on enhanced audio quality. This involved a complete rewrite of our audio engine.

You can look at some of the code here: Mixing.

Here's a demo example in some code. Run this and listen to how awesome the audio sounds 🎧

We've added enhanced plugins. You'll see some presets below; they are aimed at making your audio sound superb.
We're working hard on adding more sound templates and will be adding more to this in the future.

import audiostack
import os
from uuid import uuid4

audiostack.api_key = "APIKEY" # Add your API key here

script = """

<as:section name="main" soundsegment="main">
 Are you ready to explore the vibrant city of Barcelona? Do you want to experience the culture, the nightlife, and the beauty of this incredible city? 
 Then we've got just the thing for you! 
 Join our travel agency for an unforgettable trip to Barcelona. Experience the bustling city streets, the stunning canals, and the charming architecture that Amsterdam is known for. Get lost in the vibrant nightlife, 
 explore the world-renowned museums, or simply soak in the local culture.
</as:section>


"""



names = ["Wren",  "jollie", "aspen", "monica"]
presets = ["musicenhanced", "balanced", "voiceenhanced"]
templates = ["your_take_30","listen_up_30", "future_focus_30"]



script = audiostack.Content.Script.create(scriptText=script, scriptName="test", projectName="ams_tests_2")        

for name in names:
    # Creates text to speech
    speech = audiostack.Speech.TTS.create(
            scriptItem=script,
            voice=name,
            speed=100
    )
    for template in templates:

        for preset in presets:

            mix = audiostack.Production.Mix.create(
                speechItem=speech,
                soundTemplate=template,
                masteringPreset=preset,
            )
            print(mix)

            # Save each mix under a name that records the voice, template and preset
            mix.download(fileName=f"V4_{name}_{template}_{preset}")

almost 3 years ago

5th of May - Voice Intelligence Layer, Voice library, Pricing Updates

by Peadar Coyle

Pricing updates

We are constantly investing in better products and services for our customers, so we'll occasionally change some things in our pricing systems.

We recently updated a few things in our pricing systems to improve the customer experience. You shouldn't notice much change except a few cheaper endpoints.
We changed the £50 extra credits limit to £300 - so you'll be charged less frequently. This is due to customer feedback 💯
We also lowered some of our pricing on some voices - you can see the updates on the pricing page https://docs.audiostack.ai/v1.0/docs/pricing

Voices Library

We recently completed a full rewrite of our voices library. This is to allow us to handle security better - we take this seriously, and also to allow users to find voices faster.

We redesigned our library management system using Contentful, adding validation.
We added an integration between our voice systems and our search engine - which will enhance voice discoverability in our frontends.
We also added enhanced voice permissions, authentication and integrated with our user and organisation permission system.

Lots of this is under the hood, but it's an example of the sort of customer experience and operational excellence processes we're investing in. Well done to all in the team! Migrations are hard! 🗻

Voice Intelligence layer - Abbreviations and Ordinal numbers

We've been hard at work on our voice intelligence layer 👌

Ordinal Numbers

📘
Definition of Ordinal numbers
A number defining the position of something in a series, such as ‘first’, ‘second’, or ‘third’. Ordinal numbers are used as adjectives, nouns, and pronouns.

The Voice Intelligence Layer is now able to normalise ordinal numbers in German 🇩🇪. It covers all ordinal numbers in the format X. that are used as adjectives. Here’s an example:

import audiostack
import os

audiostack.api_key = os.environ["AUDIO_STACK_DEV_KEY"]
text = "Ich war wie im 7. Himmel"  

script = audiostack.Content.Script.create(scriptText=text)
tts = audiostack.Speech.TTS.create(scriptItem=script, voice="lena", useDictionary= True, useTextNormalizer= True)
print(tts.data['sections'][0]['preview'])

Out:
"Ich war wie im siebten Himmel."

Abbreviations

The Voice Intelligence Layer is now able to expand German abbreviations to their full form. It covers the 130 most frequent abbreviations in German. Here’s an example:

import audiostack
import os

audiostack.api_key = os.environ["AUDIO_STACK_DEV_KEY"]
text = "Ich kenne eine Abk. zum Bhf. von Ulm."  

script = audiostack.Content.Script.create(scriptText=text)
tts = audiostack.Speech.TTS.create(scriptItem=script, voice="lena", useDictionary= True, useTextNormalizer= True)
print(tts.data['sections'][0]['preview'])

Out:
"Ich kenne eine Abkürzung zum Bahnhof von Ulm."

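Conceptually, abbreviation expansion can be pictured as a lookup table applied to the text. The sketch below is only an illustration of that idea and not the Voice Intelligence Layer's implementation: the two-entry table is taken from the example above, and `expand_abbreviations` is a hypothetical helper.

```python
import re

# Two entries taken from the example above; the real layer covers many more
ABBREVIATIONS = {
    "Abk.": "Abkürzung",
    "Bhf.": "Bahnhof",
}


def expand_abbreviations(text: str, table: dict = ABBREVIATIONS) -> str:
    """Replace each known abbreviation with its full form."""
    # Longest keys first so a longer abbreviation wins over a shorter prefix
    keys = sorted(table, key=len, reverse=True)
    # (?<!\w) stops a match from starting in the middle of a word
    pattern = re.compile(r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")")
    return pattern.sub(lambda match: table[match.group(1)], text)


if __name__ == "__main__":
    print(expand_abbreviations("Ich kenne eine Abk. zum Bhf. von Ulm."))
```

A plain lookup like this swallows the full stop when an abbreviation ends a sentence, which is one reason a real normaliser needs more context than a table.
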
almost 3 years ago

21st April - The JS SDK!

by Peadar Coyle

JavaScript SDK

After launching AudioStack API & Python SDK, we've continuously worked on more features to free up your engineering resources. Now you can easily integrate our API using our new JS SDK, a lightweight library that makes it easy to create professional audio assets within seconds.
We've designed it to simplify the coding process - create a script, choose one of our 600+ voices, add a sound design and voila - you can encode your audio file as a high quality mp3 and download it. This allows you to focus on creating great audio experiences without having to worry about the underlying code. 🚀 💯

You can find it on NPM here: https://www.npmjs.com/package/@aflr/audiostack

You can see an example here - https://stackblitz.com/edit/node-rwvl9n?file=example.js

How to install it!

npm install @aflr/audiostack

# or

yarn add @aflr/audiostack

How to use it

/**
 * Audiostack Hello World Example
 *
 * Add your API key on line 12 and run
 * node example.js
 */

// Import the library
const { Audiostack } = require('@aflr/audiostack');

// Provide your api key from the Audiostack Console
const apiKey = 'your_api_key';

/**
 * This example demonstrates how to produce and download a file using the Audiostack API.
 */
const example = async () => {
  // Create a new instance of Audiostack
  const AS = new Audiostack(apiKey);

  // Create a script asset with our hello text
  const script = await AS.Content.Script.create({
    scriptText:
      'Hello from Audiostack. This is a test script. I hope you enjoy it.',
  });

  // Create a speech asset using our script and the voice "sara"
  const tts = await AS.Speech.Tts.create({
    scriptId: script.scriptId,
    voice: 'sara',
  });

  // Create a mix with our speech asset and add the sound template "3am"
  const mix = await AS.Production.Mix.create({
    speechId: tts.speechId,
    soundTemplate: '3am',
  });

  // Encode the file to high quality mp3
  const encode = await AS.Delivery.Encoder.encodeMix({
    productionId: mix.productionId,
    preset: 'mp3_high',
  });

  // Download the file to the project directory
  const path = await encode.download();

  // Print the path to the file
  console.log(`File downloaded to: ${path}`);
};

// Run the example
example();

We look forward to seeing feedback and seeing what YOU build with our SDK.

almost 3 years ago

12th April - Voice Cloning revamp

by Peadar Coyle

Voice Cloner Re-vamp:

Introduction of Voice Cloning Protocols
(Speed, Standard and Premium)
📘
Speed, Standard, Premium
Speed is the basic protocol.
The standard protocol contains longer and more thorough scripts, which improve the quality of the voice model.
The premium protocol produces our best-quality voices.

End to End German Voice Pipeline

For German-speaking 🇩🇪 customers, we now have a fully end-to-end German voice pipeline, so you can record your voice and get a synthetic representation in less than 48 hours. We're continuously working on improving this.

Voice Cloning Billing

We now have enhanced billing for our voice cloning experience.

Dynamic Scripts

We have automatic insertion of First Name, Last Name and Org Name into the script so you can personalise your script to your own use case!

New Console Page

Better UI/UX
Real-time progress monitoring

Here's some screenshots!

Showing the plan and the number of organisations

Showing your monthly progress and your credit usage

New Account Management Page includes:

Improved UI/UX
Improved Analytics (Last Activity, Number of credits spent by User, Organisation and Account)

Here's a screenshot

Improved Billing Experience

We recently rolled out improvements in our billing experience. This might cause you some differences in your credit costs.

❗️
Changes in billing (costs will change)
You may notice some changes in your billing with these new endpoints.

Provider	Cost in credits per 10 seconds
Messner	1.5
Resemble	9
Deepzen	18
Polly	1
IBM	1
Cerevoice (cereproc)	1.2
Azure	1.2
Google	1
Wellsaid	9
ElevenLabs	12
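
As a worked example of reading the table, the sketch below estimates the credit cost of a piece of audio from its length and provider. The rates are copied from the table above. The linear proration (no rounding up to whole 10-second blocks) is an assumption, and `estimate_credits` is a hypothetical helper, so treat the numbers as a rough guide rather than a billing statement.

```python
# Credits per 10 seconds of audio, copied from the table above
CREDITS_PER_10_SECONDS = {
    "Messner": 1.5,
    "Resemble": 9,
    "Deepzen": 18,
    "Polly": 1,
    "IBM": 1,
    "Cerevoice (cereproc)": 1.2,
    "Azure": 1.2,
    "Google": 1,
    "Wellsaid": 9,
    "ElevenLabs": 12,
}


def estimate_credits(provider: str, seconds: float) -> float:
    """Estimate credits for `seconds` of audio, assuming linear proration."""
    if seconds < 0:
        raise ValueError("seconds must not be negative")
    if provider not in CREDITS_PER_10_SECONDS:
        raise KeyError(f"unknown provider: {provider}")
    return CREDITS_PER_10_SECONDS[provider] * seconds / 10


if __name__ == "__main__":
    # Compare a 30-second ad across a few providers
    for provider in ("Polly", "Azure", "ElevenLabs"):
        print(f"{provider}: {estimate_credits(provider, 30):.1f} credits")
```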

Bug fixes

We fixed a bug in applying sound effects to Messner voices. This now works.