Data Protection FAQ

Data collection

1. How are audio recordings collected? Mobile app and/or website? Will it be integrated into third-party apps and/or platforms?

It will be a website, which will be integrated with an agency-built platform.

2. Is the mobile app available only to selected and trained personnel or will it be publicly available to anyone?

We don't have a mobile app.

3. How are the audio recordings secured, and what security safeguards are in place in these collection tools (e.g. what encryption)?

Our encryption standards are all listed here: https://docs.audiostack.ai/docs/security

We use AES-256 encryption at rest, and all of our endpoints are served over HTTPS. We also use VPCs that are restricted from the internet.

4. What metadata is collected when participants submit audio samples? Where and how is that metadata stored?

We collect only the device information exposed through the browser, such as the phone model or audio interface. This metadata is stored in an S3 bucket as object metadata on the S3 object that is the submitted audio file.
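
For illustration, a minimal sketch of what such an upload could look like with boto3; the bucket name and key scheme are hypothetical, and the AES-256 server-side encryption corresponds to the at-rest encryption described above.

    import boto3

    s3 = boto3.client("s3")

    def store_submission(audio_path: str, device_model: str, audio_interface: str) -> None:
        """Upload a submitted audio file with its browser-derived device info
        attached as S3 object metadata, encrypted at rest with AES-256."""
        with open(audio_path, "rb") as audio:
            s3.put_object(
                Bucket="voice-submissions",       # hypothetical bucket name
                Key=f"submissions/{audio_path}",  # hypothetical key scheme
                Body=audio,
                ServerSideEncryption="AES256",    # SSE-S3 at-rest encryption
                Metadata={                        # stored alongside the object
                    "device-model": device_model,
                    "audio-interface": audio_interface,
                },
            )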

5. If the initiative is not to be offered to minors, are there safeguards in place to enforce the age restriction?

Our content moderation team will regularly review samples as they come through and remove any age-restricted submissions.

Pre-processing of the data (incl. quality)

What process is performed on the raw voice samples to prepare the training and test data?

a. How are audio samples verified against the provided text? In other words, how is it ensured that the recording is not of some random (e.g. hate speech) text? We run internal models to perform this verification (see the sketch after item c below).

b. How is it ensured that the provided text in the audio sample is separated from any personal message, and that the personal message is not part of the training data? We'll route the recordings to two separate, encrypted S3 buckets on AWS servers in Ireland (also illustrated in the sketch after item c). This will be tested by our internal quality control team and automated systems. All systems will be backed up.

c. How many different files/formats of the voice samples are stored, and is the metadata stored alongside the audio samples? We'll generally store data in MP3 and WAV formats; as noted above, metadata is stored as S3 object metadata on the audio files.
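
A minimal sketch of the verification and routing described in items a and b, assuming boto3 for storage; the transcribe() helper stands in for the internal verification models and is hypothetical, as are the bucket names and the similarity threshold.

    import difflib

    import boto3

    s3 = boto3.client("s3")

    PHRASE_BUCKET = "phrase-audio-eu-west-1"    # hypothetical: training data
    MESSAGE_BUCKET = "personal-msgs-eu-west-1"  # hypothetical: kept out of training

    def transcribe(audio_path: str) -> str:
        """Stand-in for the internal ASR models (hypothetical)."""
        raise NotImplementedError

    def matches_prompt(audio_path: str, prompt: str, threshold: float = 0.85) -> bool:
        """Item a: check the recording against the provided text."""
        transcript = transcribe(audio_path)
        similarity = difflib.SequenceMatcher(
            None, transcript.lower(), prompt.lower()
        ).ratio()
        return similarity >= threshold

    def route_submission(audio_path: str, prompt: str, is_personal_message: bool) -> None:
        """Item b: keep phrase audio and personal messages in separate
        encrypted buckets, so messages never enter the training data."""
        if not is_personal_message and not matches_prompt(audio_path, prompt):
            return  # failed verification: hold for manual review instead
        bucket = MESSAGE_BUCKET if is_personal_message else PHRASE_BUCKET
        with open(audio_path, "rb") as audio:
            s3.put_object(Bucket=bucket, Key=audio_path,
                          Body=audio, ServerSideEncryption="AES256")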

Training and model

Is the AI model created in-house or obtained from a third party?

Has the model been studied/verified/audited by a third party, and are those studies publicly available (e.g. via the ICO sandbox)? We use third-party providers; none of them have been studied or put into the ICO sandbox. We integrate with multiple providers (Eleven Labs, internal models, PlayHT, Resemble). All of these providers have a DPA signed with us, and we have contractual terms with them to delete data.

Explainability - would X or the service provider be able to explain how the model has created a particular voice?

We would be able to explain the underlying principles of the models, and we could also carry out further analysis to provide some support with explainability. Please contact [email protected]

Management

Who will have access to the X training data, audio recordings, and AI Voice, both internally within AudioStack and externally, e.g. sub-processors? Access is limited to certain trained members of staff, and we have agreements with sub-processors that no data will be stored on their servers. A list of sub-processors is here: https://audiostack.ai/en/terms-of-use

What is the retention policy for training data once the AI model is trained? Is the information retained for maintenance, re-training, improvements, etc.? We will retain the data for the duration of the project; 90 days after the end of the project, we will delete the data as far as is technically possible.
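
A minimal sketch of how such a deletion could be carried out with boto3, assuming a hypothetical bucket name; it would be scheduled to run 90 days after the project ends.

    import boto3

    s3 = boto3.client("s3")

    def purge_training_data(bucket: str = "project-training-data") -> None:
        """Delete all retained training-data objects from the bucket.
        Run 90 days after the end of the project. Note: on versioned
        buckets, old object versions and backups need separate deletion."""
        paginator = s3.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket=bucket):
            for obj in page.get("Contents", []):
                s3.delete_object(Bucket=bucket, Key=obj["Key"])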

How would data subject rights be exercised:

a. Access to training data (phrase audio, audio message, provided personal details, etc.): by emailing our customer support team or contacting us via Intercom; we will provide that data if it has not been deleted.

b. Correction and deletion (consent withdrawal) of training data: simply email our customer support team at [email protected] and we will delete it.

c. Access, correction, and deletion (consent withdrawal) of output data; what is the service providers' practice and approach towards such requests made against output data? Service providers have agreements with us to delete output data if necessary.

EU AI Act

What about the EU AI Act?

We are an AI deployer, because we deploy AI systems, whether provided by third parties or built from internal IP, under our own brand and trademark. We run a range of evaluation metrics and auditing software. Where we provide voice-cloning services, we inform users that the results are generated by AI through a transparency guide and our documentation. We also refer to AI throughout our documentation and maintain technical documentation of how we build our services.
We have an AI ethics policy, and policies for complying with copyright law. We don't train general-purpose AI models such as foundation models.
We have conducted risk analyses for our models; the biggest risk concerns voice cloning, which could lead to impersonation. We mitigate this with a secure and robust enterprise-level system, and we do not allow sharing of content without consent. We also collect consent documents from all affected voice actors, and they are remunerated for their voices. We are constantly evaluating this setup and investing further in our AI governance and AI safety.

Content Moderation

What content moderation and brand safety policies do you have?

We have the following:

  • Right to be forgotten, with deletion of the cloned voice and raw data
  • We have an explicit consent form before cloning a voice, in addition to our extensive Terms of Service and privacy notices
  • You can view our terms of service here: https://audiostack.ai/en/legal/terms-of-service
  • You can view our Privacy policy here: https://audiostack.ai/legal/privacy-policy
  • Enterprise-level secure setup, where the voice is cloned within a registered org; this is independently tested by a third-party auditor every year
  • Our customer support team regularly monitors for bad actors and profanity (to an extent; nuanced content might still escape). You can see our Acceptable Use Policy here - https://audiostack.ai/legal/acceptable-use-policy
  • AutoFix: catches most hallucinations and regenerates the section to potentially fix the issue (not a 100% solution; the user still needs to check the output). The user needs to activate this with an ON/OFF flag; a sketch of how enabling it might look follows this list.
  • We will soon have compliance checks that will suggest and flag ads that are potentially non-compliant with the advertising guidelines of the region (second half of Q1)
  • We have an AI ethics policy and regularly work with governments and third parties on ensuring suitable AI ethics - https://audiostack.ai/en/ethics
  • We also regularly run extensive testing of our platform, looking for areas where unsafe content could be produced, and work closely with brands on maintaining their brand safety. We run a range of scenarios, including extensive quality assurance testing and red-teaming via our engineering team.
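
As referenced in the AutoFix item above, a hypothetical sketch of enabling the ON/OFF flag on a speech-generation request; the endpoint, field names, and the flag itself are illustrative placeholders, not the actual AudioStack API.

    import requests

    # Hypothetical request: the URL, fields, and flag name are placeholders.
    response = requests.post(
        "https://api.example-audiostack.test/speech",
        headers={"x-api-key": "YOUR_API_KEY"},
        json={
            "script_id": "example-script",  # hypothetical field
            "voice": "example-voice",       # hypothetical field
            "autofix": True,                # the ON/OFF flag, off by default
        },
        timeout=30,
    )
    response.raise_for_status()
    # Even with AutoFix on, the output still needs a manual check.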

IT Compliance

  • Are you SOC 2 compliant?
  • Yes; our SOC 2 Type 2 report is available to Enterprise customers on request at https://compliance.audiostack.ai/
  • Are you GDPR compliant?
  • Yes; we follow the GDPR and are registered with the ICO in the UK.

Licensing

  • Is it safe to use the music licenses that you have?
  • We have deals with various music providers and are committed to working closely with them. You can see more about which music is royalty-free and which is usable only for non-commercial purposes in our Terms of Service: https://audiostack.ai/en/legal/terms-of-service