Generate Speech Files using Speech-to-Speech (STS)

Bring your Synthetic Speech to Life with STS

With STS, you can apply the characteristics of a voice recording to one of the synthetic voices available in our Voice Library, giving you more precise control over the emotional expression, tone, timing, and pronunciation of your generated speech.

STS is currently available for only a limited number of voices, but you can generate STS speech directly in the AudioStack platform, with no coding needed. To start, upload a voice recording of the speech you'd like the synthetic voice to replicate to the Resources / Content area of the platform.


The maximum file size for STS is currently 50 MB

Uploaded files must be smaller than 50 MB for STS to be generated.
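
If you prepare recordings in bulk, it can help to check file sizes before uploading. The snippet below is a minimal sketch using only the Python standard library, not part of the AudioStack platform or API. The names `STS_MAX_BYTES` and `is_within_sts_limit` are hypothetical, and the limit is assumed to mean 50 × 1,000,000 bytes (the stricter, decimal reading of "50 MB").

```python
import os

# Assumption: the 50 MB limit is read as 50 * 1,000,000 bytes.
# This is the stricter of the two common interpretations.
STS_MAX_BYTES = 50 * 1000 * 1000


def is_within_sts_limit(path: str, max_bytes: int = STS_MAX_BYTES) -> bool:
    """Return True if the file at `path` is strictly smaller than the limit."""
    return os.path.getsize(path) < max_bytes
```

For example, `is_within_sts_limit("my_recording.wav")` returns `True` only when the file is strictly smaller than the assumed limit, matching the "must be smaller than" wording above.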

Next, open the uploaded speech file and, in its metadata, set the file type to Voice. You'll then be able to access the Voice tab.

In the Voice tab, select the synthetic voice you'd like to perform the speech, then click Generate. It's as easy as that. Your speech will be generated as an audio file, which you can download to use in other software, use within another workflow, or use with the AudioStack API.

How can I use my STS output with the API?

When working with the AudioStack API, you can use Speech-to-Speech in extremely flexible ways.

Check out our tutorial on how to overlay media content with generated audio.

Why can't I use the whole voice library with STS?

At the moment, only a limited number of voices support STS. More voices will be enabled for STS over time.

We're also planning to expand our voice library to allow you to preview STS voices more easily in the near future - watch this space. 👀