🐰 🐇

Easter launch

Happy Easter from all of us at Aflorithmic!

Here are some things we've launched.

ElevenLabs voices are now part of our Library!

ElevenLabs is a voice technology research company that focuses on developing high-quality AI voices for publishers and creators. Their text-to-speech models combine high compression with a level of context understanding that we have not encountered in other artificial voices so far.
We currently offer 9 American English voices, both male and female, that you can use to render ultra-realistic speech for all your projects: https://library.api.audio/voices?providerFullName=elevenlabs

import os

import audiostack


SCRIPT_TEXT = """
<as:section name="intro" soundsegment="intro">
    This is the first section and will be combined with the intro music.
</as:section>

<as:section name="main" soundsegment="main">
    This is the second section and will be combined with the main music. The section name and soundsegment don't have to have the same name.
</as:section>
"""

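audiostack.api_base = "https://v2.api.audio"
# The API key is read from the AUDIOSTACK_API_KEY environment variable.
audiostack.api_key = os.environ["AUDIOSTACK_API_KEY"]


VOICE = "wren"  # Others include Renata and Bryer
# Create the script first, then render it to speech with the chosen voice.
script = audiostack.Content.Script.create(scriptText=SCRIPT_TEXT)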
print(script.message, script.scriptId)

tts = audiostack.Speech.TTS.create(scriptItem=script, voice=VOICE, public=True)
print(tts)

SyncTTS

We recently fixed some bugs in SyncTTS and, more importantly, enabled it for all of our voices.

https://aflorithmiclabs.postman.co/workspace/Aflorithmic~46256327-58bb-43d3-9a19-52b33e85e0c3/request/16492976-5f1c6454-28af-426c-ada6-964eacfb0b94

Profile picture

  • Profile pictures in the Console sometimes weren't shown. We've fixed this, so you can now see your wonderful user profile picture.

Bug fixes

  • We fixed a bug where child organisations weren't inheriting account-level voices. If your account is on a paid plan, its voices (except those marked private) will now be inherited too. This means a better experience for your users.

Bug Fixing 🐛

We've been hard at work improving reliability and fixing bugs based on customer feedback.

Break tag embed - When you used a voice with a style in the format f"""<mstts:express-as style="{style}">{text}</mstts:express-as>""" and the text contained an embedded <break time="5000ms"/>, the break didn't work. We've fixed this break tag embedding issue.
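
As a rough sketch of the case described above, the snippet below builds a styled script text with a break tag embedded inside the style wrapper. The helper name styled_text and its pause_ms parameter are our own illustration, not part of the SDK; only the two tags come from the note above.

```python
def styled_text(text: str, style: str, pause_ms: int = 0) -> str:
    """Wrap text in an express-as style tag, optionally ending with a pause."""
    body = text
    if pause_ms > 0:
        # The break tag sits inside the style wrapper, as in the fixed case.
        body += f' <break time="{pause_ms}ms"/>'
    return f'<mstts:express-as style="{style}">{body}</mstts:express-as>'


script_text = styled_text("Thanks for listening.", "cheerful", pause_ms=5000)
print(script_text)
```

The resulting string can be passed as scriptText when creating a script, in the same way as the other examples in these notes.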

Voice cloning - We've enhanced the voice cloning experience

  • We've improved the reliability and experience of the voice cloner, especially in languages such as German 🇩🇪
  • We also fixed some bugs in the script specific to verticals (sales, for example), which greatly improves the experience.

Billing

  • We've made some improvements to our billing experience, which should reduce errors. We've also improved our operational processes around billing.

Reliability enhancements

We've been hard at work on internal stability and performance improvements over the last few weeks. We've seen a significant drop in error rates for most of our customers.

We made significant improvements to our voice intelligence layer and our voice cloning experience.
We'll continue to invest in these improvements.

Voices

We made big improvements to our voice listings and to the styles in our voices.

You can use a style with something like:

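from httpx import AsyncClient  # assuming the httpx client; any async HTTP client works similarly

# _URL and _HEADERS come from your own configuration (not shown here).
async def create_styled_script(text, style, script_name, timeout):
    # Wrap the text in an express-as tag so the voice speaks it in the given style.
    script_text = f"""<mstts:express-as style="{style}">{text}</mstts:express-as>"""
    async with AsyncClient() as client:
        reply = await client.post(
            _URL + "content/script",
            headers=_HEADERS,
            json={"scriptText": script_text, "scriptName": script_name},
            timeout=timeout,
        )
    return reply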

📘

Example of styles

We suggest starting with the neutral and cheerful styles on the aria voice. There are plenty of others.

New website

Our website got a cool new lick of paint 💖

Head over to our new website to have a look!

Excellent work by our solutions team on delivering this and helping our customers better understand our value proposition.

WellSaid's voices now feature in our library!

The latest addition to our library of 700+ synthetic voices comes from WellSaid Labs, a Seattle-based synthetic speech technology startup.

They have created a collection of voices with the highest naturalness score. In addition, some of their voices support 3 styles: Narration, Promo, and Conversational.

Use the Narration voices for an explanatory, stable and calm delivery of your script, the Promo voices for a more enthusiastic, advert-like style, and the Conversational voices, which are optimised for customer interactions. 🎂

Try them all here: https://library.api.audio/voices?providerFullName=wellsaid+labs and let us know what you think! 🎧

Docs improvements

We've been working hard on our docs. Here are some highlights 🎉

Improvements to the sign up experience

  • We've made a number of improvements to the security and usability of the sign up flow. Beta testing showed that these changes improved user onboarding.

Better user permission management

At Aflorithmic we take our customers' security very seriously. We recently rolled out new, improved user permission management (managed by our customer success team), and we have used user feedback to make it even better.

We've also added enhancements to resource-based control. We'll be rolling out more features based on this in the coming weeks. In the meantime, here are a few improvements.

  • The new system handles this 10x faster and has reduced the load on our engineering team by 10x, so they can focus on building features for our customers!
  • We've enabled better privacy and control. For example, we now assign each private voice an owner role for its user, which grants access to specific operations such as inviting others and deleting the voice.

import audiostack


audiostack.api_key = "APIKEY"  # replace with your own API key


PUBLIC_VOICE = "stefan"
PRIVATE_VOICE = "private_aflr"
PREMIUM_VOICE = "dahlia"
SCRIPT_TEXT = """
Hello Viviane! Look what I have built!
"""

script = audiostack.Content.Script.create(scriptText=SCRIPT_TEXT,
                                          scriptName="privacytest")

# Render the script with the private voice, then download the result.
speech = audiostack.Speech.TTS.create(
    scriptItem=script, voice=PRIVATE_VOICE, speed=1.1, public=True
)
speech.download()

A request with the premium voice should fail, because you won't have access to it.

  • This allows you to restrict specific voices to specific users, so your PII is better protected.

🔐

  • Our system is now more auditable and more secure, which helps protect our customers' data, something we take very seriously.
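
To make the idea of role-based operations on a private voice more concrete, here is a purely illustrative sketch. It is not how our permission system is implemented: the PrivateVoice class, the "member" role and the "use" operation are hypothetical names invented for this example. Only the owner role and its invite and delete operations come from the notes above.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-operation mapping, for illustration only.
ROLE_OPERATIONS = {
    "owner": {"use", "invite", "delete"},
    "member": {"use"},
}


@dataclass
class PrivateVoice:
    alias: str
    roles: dict = field(default_factory=dict)  # user id -> role name

    def can(self, user_id: str, operation: str) -> bool:
        """Return True if the user's role on this voice allows the operation."""
        role = self.roles.get(user_id)
        return role is not None and operation in ROLE_OPERATIONS.get(role, set())


voice = PrivateVoice(alias="private_aflr", roles={"alice": "owner", "bob": "member"})
print(voice.can("alice", "delete"))  # the owner may delete the voice
print(voice.can("bob", "delete"))    # a member may not
print(voice.can("carol", "use"))     # a user with no role has no access
```

The point of the design is that access is decided per resource (the voice) rather than per account, which is what lets specific voices be limited to specific users.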

Bug fixes and enhancements

We fixed a bunch of bugs in the recent release. We can't highlight them all.

However, let's celebrate some 💯

  • We've improved our voice systems: we reduced technical debt and completed a whole new design. This will enhance the customer experience and allow us to ship features faster 🚢
  • We've made improvements to billing: we fixed some bugs (for example, 2FA didn't work with some credit cards) and improved transparency (your reporting will be clearer). We'll be rolling out further changes in the future. 💰
  • Our script pipeline was failing silently in some edge cases. We've fixed this, and it now fails safely. 👷

New, better Script Syntax

Over the past 18 months, and based on more than 50 user interviews, we realised that our old script syntax had some usability issues. We're now launching a much more usable script syntax 💯
You can have a look here. Our beta testers found it much easier to write, and we've observed fewer bugs in our monitoring (up to 20% fewer).

Here's a teaser:

<as:section name="hello">
    Hello world, this section is named hello. 
</as:section>
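
If you generate scripts programmatically, a small helper along these lines can assemble several named sections in the new syntax. This is only a sketch: build_script is a hypothetical helper of our own, and it assumes nothing beyond the name attribute shown in the teaser above.

```python
def build_script(sections: dict) -> str:
    """Render {section_name: text} as a script made of <as:section> blocks."""
    blocks = []
    for name, text in sections.items():
        blocks.append(f'<as:section name="{name}">\n    {text}\n</as:section>')
    return "\n\n".join(blocks)


script_text = build_script({
    "hello": "Hello world, this section is named hello.",
    "goodbye": "Thanks for listening, this section is named goodbye.",
})
print(script_text)
```

The resulting text can then be passed as scriptText when creating a script.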

Faster Voice Intelligence

We're super excited about Voice Intelligence, which enables our AI voices to speak like a human. We've been hard at work on this over the last few months.

Up to 5x faster voice intelligence 🚀 🚤

  • You'll notice these improvements most with the normaliser, as it produces very loooong text strings. 💯

Benchmarks from our tests

  • We can now process a 10,000 character script in ~10 seconds
  • Before this was taking ~50 seconds
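
For reference, the "up to 5x" figure follows directly from the two benchmark numbers above. The short calculation below uses those approximate timings, so treat the results as rough.

```python
SCRIPT_CHARS = 10_000   # benchmark script length in characters
BEFORE_SECONDS = 50     # approximate processing time before the change
AFTER_SECONDS = 10      # approximate processing time after the change

speedup = BEFORE_SECONDS / AFTER_SECONDS
throughput_before = SCRIPT_CHARS / BEFORE_SECONDS
throughput_after = SCRIPT_CHARS / AFTER_SECONDS

print(f"speedup: {speedup:.0f}x")
print(f"throughput: {throughput_before:.0f} -> {throughput_after:.0f} characters per second")
```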

Documentation

  • We updated our documentation to add billing info and the media files.
  • We also made quality improvements.

Performance enhancement

  • In the past, adding many users caused some slowdowns in authorisation. Now, thanks to our new design and the help of Oso, our authoriser runs in ~100ms. We are constantly investing in these improvements, and more are coming 🚀

Bug Fixes

These aren't all bugs, but we implemented a few changes for our customers:

  • We increased the validity period of user invitations from 1 day to 7 days. Until now, some users were caught out by the short window.
  • We fixed a bug in our authentication service that caused occasional failures in child org creation.
  • We fixed some bugs in our Voice Cloning experience. It should now be even more awesome.

Voices update

โŒ We have now deprecated all our VocaliD Voices. If you encounter an error, please try one of our other voices from 7 providers, featuring DeepZen, CereProc, and Resemble. ai.

๐Ÿ‘

Spoiler Alert: We have some surprise new voices coming in the following weeks, so keep an eye out!

IBM Voices 💡

In line with IBM's deprecation of the voices listed here, we have also updated our offering to present only V3 and Expressive voices.

Expressive neural voices offer natural-sounding speech that is exceptionally clear and crisp. Their pronunciation and inflections are natural and conversational, and the resulting speech offers extremely smooth transitions between words. The voices determine sentiment from context and automatically use the proper intonation to suit the text. You can try duchess and reynold to test out the expressive voices.

Stay tuned for our updates, and let's build the future of audio together! 🎧

I wanna hear an example

Below is a demo with reynold:

import audiostack


# Hello! Welcome to the audiostack python SDK.

audiostack.api_key = "APIKEY"  # replace with your own API key

print("In Content you can create scripts and manage your production assets.")
script = audiostack.Content.Script.create(
    scriptText="Welcome to AudioStack, the world's most powerful AI audio creation infrastructure. The unlimited possibilities of generative AI at your fingertips. In one single API.",
    projectName="testingthings",
)
print(script)

print("In Speech you can access almost a thousand AI voice models or your own, cloned voice.")
voices = audiostack.Speech.TTS.list()
tts = audiostack.Speech.TTS.create(scriptItem=script, voice="reynold")


print("In Production you can dynamically mix it with a sound design of your choice and master it so it sounds great.")
mix = audiostack.Production.Mix.create(speechItem=tts, soundTemplate="cityechoes", masteringPreset="balanced")
print(mix)

print("In Delivery, we produce a great sounding file and deliver it where you need it.")
encoder = audiostack.Delivery.Encoder.encode_mix(productionItem=mix, preset="mp3")
encoder.download(fileName="MyFirstAudioStackTrack")

We've got a lot more coming.

We are delighted to announce new features.

  • You can now handle Roman numerals in our Voice Intelligence Layer. What does this mean? Most text-to-speech providers struggle with text like "Charles IV", which is a particularly hard problem. Here's an example!

import audiostack

text = """Johanna VI. war eine große Königin. Benedikt XVI. starb letztes Jahr.
Die Kinder Marias II. standen weiter hinten in der Thronfolge.
Die Herrschaft Karls V. dauerte mehrere Jahrzehnte. Ich sah Charles III.
zum ersten Mal. Er zeichnete Anna I. auf ihrem Sterbebett. Vor Edward XX gab
es keine Feinde. Ich kämpfte nie gegen Pedro VI."""

script = audiostack.Content.Script.create(scriptText=text)

# Enable the dictionary and the text normaliser for this request.
tts = audiostack.Speech.TTS.create(scriptItem=script, voice="vicki", useDictionary=True, useTextNormalizer=True)
print(tts)

# Fetch the speech item and download it, using its speechId as the file name.
item = audiostack.Speech.TTS.get(tts.speechId)
item.download(fileName=item.speechId)

  • We've now integrated our voice cloning capabilities into AudioStack! You can invite yourself to try it out from the AudioStack Console.
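
To give a feel for why Roman numerals are tricky, here is a minimal sketch that finds numerals after a capitalised name and converts them to integers. It is not how our Voice Intelligence Layer works; the regular expression and function names are our own simplifications. A real normaliser also has to decide how the number is spoken in each language (for example, as an ordinal) and avoid false matches such as the pronoun "I".

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral to an integer using the subtractive rule."""
    total = 0
    for i, char in enumerate(numeral):
        value = ROMAN_VALUES[char]
        # A smaller value before a larger one is subtracted (IV = 4, IX = 9).
        if i + 1 < len(numeral) and value < ROMAN_VALUES[numeral[i + 1]]:
            total -= value
        else:
            total += value
    return total


# Simplified pattern: a capitalised name followed by a Roman numeral.
REGNAL = re.compile(r"\b([A-Z][a-z]+) ([IVXLCDM]+)\b")


def find_regnal_numbers(text: str) -> list:
    """Return (name, number) pairs for each name followed by a Roman numeral."""
    return [(name, roman_to_int(numeral)) for name, numeral in REGNAL.findall(text)]


print(find_regnal_numbers("Charles IV met Pedro VI before Edward XX."))
# → [('Charles', 4), ('Pedro', 6), ('Edward', 20)]
```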

Welcome to AudioStack

by Peadar Coyle

Welcome to the developer hub and documentation for AudioStack!