added

17th Sep 2024 - AutoFix, more STS voices and Azure Multilingual Voices

📣 New Feature Launch: AutoFix 📣

🛠️ We are thrilled to introduce AutoFix, a powerful new feature designed to enhance the quality of Text-to-Speech (TTS) output by automatically detecting and fixing hallucinations and artifacts generated by TTS models across all 15 providers in AudioStack.

Why use AutoFix?

As we continue to onboard more niche TTS providers, some models may generate unwanted audio hallucinations or artifacts due to rapid development and technical limitations.

Traditionally, users have had to manually review and regenerate these problematic assets, which is both time-consuming and cumbersome. AutoFix streamlines this process by automatically detecting and regenerating faulty assets, saving you time and ensuring high-quality audio outputs.

Examples of Issues AutoFix Resolves:

AutoFix eliminates various types of hallucinations and artifacts, such as 👻 ghostly sounds, 🧟 distorted voices, and 🚌 background noise. Listen to some examples here.

How Does AutoFix Work?

  • Quality Scoring: AutoFix evaluates TTS assets for speech quality and background noise using an internal scoring system.
  • Automatic Regeneration: If an asset is flagged for hallucinations or artifacts, AutoFix automatically regenerates it, ensuring cleaner, higher-quality output.
  • Consistent Results: This automated process reduces manual quality assurance and improves the reliability of your TTS assets.
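The steps above amount to a score-and-retry loop. The sketch below only illustrates that idea and is not AudioStack's implementation: `generate`, `score`, the 0.8 threshold, and the attempt limit are all hypothetical stand-ins, since the internal scoring system is not public.

```python
from typing import Callable, Tuple


def autofix_loop(
    generate: Callable[[], bytes],      # hypothetical: renders one TTS asset
    score: Callable[[bytes], float],    # hypothetical: quality score in [0, 1]
    threshold: float = 0.8,             # assumed cut-off, not a documented value
    max_attempts: int = 3,              # assumed retry limit
) -> Tuple[bytes, float, int]:
    """Regenerate until an asset scores at or above the threshold.

    Returns (best_asset, best_score, attempts_used).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    best_asset, best_score = b"", float("-inf")
    for attempt in range(1, max_attempts + 1):
        asset = generate()
        quality = score(asset)
        # Remember the best attempt so far in case none clears the threshold.
        if quality > best_score:
            best_asset, best_score = asset, quality
        if quality >= threshold:
            break
    return best_asset, best_score, attempt
```

If no attempt clears the threshold, this sketch returns the best-scoring attempt rather than failing, so a flagged asset can still be reviewed by hand.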

Pricing

AutoFix is available at a cost of 💰5 production credits per minute of audio. You can activate AutoFix by setting "useAutofix": true in the JSON body of your API call.
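To get a rough feel for the cost, the sketch below applies the 5 credits per minute rate to a clip length. It assumes billing is pro-rata by the second, which the announcement does not confirm; the function name is made up for illustration.

```python
AUTOFIX_CREDITS_PER_MINUTE = 5  # rate stated in this announcement


def estimate_autofix_credits(duration_seconds: float) -> float:
    """Estimate AutoFix credits for a clip, ASSUMING pro-rata billing."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return AUTOFIX_CREDITS_PER_MINUTE * duration_seconds / 60


print(estimate_autofix_credits(90))   # 90-second clip
print(estimate_autofix_credits(600))  # 10-minute clip
```

Under that assumption, a 90-second clip comes to 7.5 credits and a 10-minute clip to 50 credits.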

Example API call:

curl --location 'https://v2.api.audio/speech/tts' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--header 'x-api-key: <your key here>' \
--data '{
    "scriptId": "<your script id here>",
    "voice": "jeremy",
    "useAutofix": true
}'
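For those calling the API from Python, the same request might look like the sketch below. It uses only the standard library and mirrors the curl example's endpoint, headers, and body fields; the helper name `build_tts_request` is hypothetical, and the request is built but not sent.

```python
import json
import urllib.request

TTS_URL = "https://v2.api.audio/speech/tts"  # endpoint from the curl example


def build_tts_request(api_key: str, script_id: str, voice: str,
                      use_autofix: bool = True) -> urllib.request.Request:
    """Build (but do not send) the TTS request with AutoFix toggled."""
    payload = {"scriptId": script_id, "voice": voice, "useAutofix": use_autofix}
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "x-api-key": api_key,
        },
        method="POST",
    )


request = build_tts_request("<your key here>", "<your script id here>", "jeremy")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it: urllib.request.urlopen(request)  (needs a valid key and script id)
```

Note that `json.dumps` writes Python's `True` as JSON `true`, which is the form the API body expects.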

🚧

Disclaimer

While AutoFix significantly reduces artifacts in generated assets, we strongly encourage users to manually review the quality of all assets before publishing.


🗣️STS (Speech to Speech) Voices Update

We have just added more STS voices to our library, bringing the total number of voices that support STS to an astonishing 287 🤯.

This makes 🦜 AudioStack's STS library the biggest and most diverse. STS is delivering incredible value to creative customers by producing lifelike speech with unparalleled naturalness!

📘

One thing to note:

The accent of the source speaker is transferred to the resulting STS asset, so we recommend choosing a voice whose accent is close to your own for the speech transfer (e.g. if your accent is American, choose a voice whose accent is tagged as American).


🎙️ More Azure Multilingual Voices!

We’re excited to announce that our Text-to-Speech (TTS) library just got an upgrade! 🌍🎙️

We’ve added 22 new multilingual voices that speak 50+ languages 🥳 to the library, making it more diverse and flexible for all your projects! 🗣️✨ Whether you need voices for different languages, accents, or styles, we’ve got you covered.

Try them out and let us know what you think! 😄 Here are their aliases:

‘Wei-wei’ 🇨🇳, ‘Denisa’ 🇪🇸, ‘Sunny’ 🇰🇷, ‘Ezra’ 🇺🇸, ‘Branson’ 🇺🇸, ‘Marieta’ 🇺🇸, ‘Meral’ 🇺🇸, ‘Hale’ 🇺🇸,
‘Haoyu’ 🇨🇳, ‘Christophe’ 🇨🇳, ‘German’ 🇧🇷, ‘Tan’ 🇫🇷, ‘Antonette’ 🇺🇸, ‘Thanos’ 🇺🇸, ‘Sharita’ 🇺🇸,
‘Deno’ 🇺🇸, ‘Michae’ 🇺🇸, ‘Paden’ 🇺🇸, ‘Furkan’ 🇺🇸, ‘Chrishawn’ 🇺🇸, ‘Sura’ 🇺🇸, ‘Deven’ 🇺🇸

Happy creating with AudioStack! 🎶