What does AudioStack do?
AudioStack facilitates the creation of professional-sounding audio in a matter of seconds. You can use it to add voice-overs to videos, add engaging audio to your application, create audio ads, enhance your smart speaker skill, let users of your creative project dynamically create audio, version podcasts with our mastering engine, or experiment with various state-of-the-art text-to-speech APIs. You do not need any audio expertise to create high-quality audio.
How long does it take to learn AudioStack?
It generally takes a developer around ten minutes to run the most basic use case (check our example in Get Started). It probably takes another hour to understand the concepts of AudioStack and for it to feel a bit more natural. If you have questions, don't be shy, we're eager to help. Just send an email to [email protected] or [email protected].
Who uses AudioStack? Engineers or non-technical people?
Anyone who knows some Python or JavaScript and wants to create professional-sounding audio in a matter of seconds can use AudioStack pretty easily. At this point, a good chunk of AudioStack users are not very technical, and we are always working on making it easier and more accessible to everyone while giving more advanced developers all the controls they need.
What kind of software do people build with AudioStack?
Starting with audio advertisements, AudioStack is currently used to personalise children's audio stories and audio toys, to add audio personalisation features to fitness, wellbeing, and meditation apps, to enrich podcast audio, to add voice-overs to sales videos, to make conversational AI avatars speak, and to make smart speaker skills more engaging. Stay tuned, it's just the beginning!
What SDKs do you provide?
AudioStack currently offers a Python 3 SDK and a JavaScript (Node.js) SDK. More coming soon!
What can you not do with AudioStack?
The goal of AudioStack is to enable everyone with little or no audio expertise to programmatically create high quality audio that can compete with a professional production. This also means that it is not designed for the full control you might want for your high-end production that is supposed to win a Grammy. However, if you are struggling to realize a project, we'd love to help or hear about your feature requests. Just send us an email at [email protected] or [email protected].
What if I want a component that AudioStack doesn't yet have?
If you have a use case that is not handled by AudioStack's built-in components, you can build your own custom component to solve that use case. We welcome contributions and you can also submit a feature request. 💪
Is AudioStack secure? Where's my data stored?
We take security seriously at Api.Audio. If you need further information, please let us know.
How does it compare to, e.g., Google Text-To-Speech (TTS)?
AudioStack goes beyond text-to-speech. Our product is created by audio enthusiasts that know all about making audio sound good. The cloud technology giants provide amazing Text-To-Speech voices that we are actually offering through AudioStack as well.
However, the focus of Api.Audio is to make it easy for developers to use such voices (among others) to build innovative audio solutions. This includes:
- a wide voice library (including voices for niche use-cases)
- guardrails to handle text content so it sounds good when converted to speech
- audio processing that makes the difference between a pure speech track (think GPS or smart speaker) and an engaging, fully produced audio piece (think radio ad or podcast)
- connectors that make it easy to integrate audio creation into an existing product or project and make audio creation scalable
We have also already thought through and solved some very complicated problems that developers tend to encounter when building audio, and we offer tools for them that developers love us for. We don't want to brag, but AudioStack often makes it possible to build in days what takes a development team multiple months when plugging directly into a Text-To-Speech provider.
How many languages do you support?
We currently support 60+ human languages. More are coming soon.
How easy is it to implement the API?
You can get up and running with our API in less than 30 minutes. We are continuously focusing on making that experience easier. You can check how to get started here.
Is the audio output in real-time? How fast and how scalable is it?
Depending on what you want to do, it can take from seconds to several minutes to produce a piece of audio. Combined with the strategy of pre-producing certain parts of the creation process, this is normally quick enough for the vast majority of the use cases we encounter.
We are prepared even for cases where this is not fast enough, such as conversational use cases (e.g., AI chatbots) or cases where users dynamically create audio and need really quick feedback. For these we offer a synchronous Text-To-Speech API with response times of 1 second or less, as well as some other tools. In terms of scaling, even your most ambitious use case should not be an issue.
How does "clone your own voice" work? Should the customer record their voice and send it to you?
We have developed a process that allows your users to record their voices in a way that we can use to build a voice model. The resulting model is then available through AudioStack, so you can use it in your audio creation process.
Who is behind AudioStack?
AudioStack is an audio creation environment developed by Aflorithmic Labs. You can learn all about the company here: https://www.aflorithmic.ai/company.
I have another question 

Great! We'd love to answer it. Send us an email at [email protected], or chat with us in the bottom right corner.