Transcription of an audio file
This is a guide to using our Transcription endpoint, /production/suite/transcribe. The endpoint transforms audio content into readable, searchable text, which is valuable for documentation, accessibility, content analysis, and more.
Languages supported
The transcription service supports the following languages (language codes in parentheses):
English (en), Spanish (es), French (fr), German (de), Italian (it), Portuguese (pt), Dutch (nl), Afrikaans (af), Albanian (sq), Amharic (am), Arabic (ar), Armenian (hy), Assamese (as), Azerbaijani (az), Bashkir (ba), Basque (eu), Belarusian (be), Bengali (bn), Bosnian (bs), Breton (br), Bulgarian (bg), Burmese (my), Catalan (ca), Chinese (zh), Croatian (hr), Czech (cs), Danish (da), Estonian (et), Faroese (fo), Finnish (fi), Galician (gl), Georgian (ka), Greek (el), Gujarati (gu), Haitian Creole (ht), Hausa (ha), Hawaiian (haw), Hebrew (he), Hindi (hi), Hungarian (hu), Icelandic (is), Indonesian (id), Japanese (ja), Javanese (jw), Kannada (kn), Kazakh (kk), Khmer (km), Korean (ko), Lao (lo), Latin (la), Latvian (lv), Lingala (ln), Lithuanian (lt), Luxembourgish (lb), Macedonian (mk), Malagasy (mg), Malay (ms), Malayalam (ml), Maltese (mt), Maori (mi), Marathi (mr), Mongolian (mn), Nepali (ne), Norwegian (no), Norwegian Nynorsk (nn), Occitan (oc), Punjabi (pa), Pashto (ps), Persian (fa), Polish (pl), Romanian (ro), Russian (ru), Sanskrit (sa), Serbian (sr), Shona (sn), Sindhi (sd), Sinhala (si), Slovak (sk), Slovenian (sl), Somali (so), Sundanese (su), Swahili (sw), Swedish (sv), Tagalog (tl), Tajik (tg), Tamil (ta), Tatar (tt), Telugu (te), Thai (th), Tibetan (bo), Turkish (tr), Turkmen (tk), Ukrainian (uk), Urdu (ur), Uzbek (uz), Vietnamese (vi), Welsh (cy), Yiddish (yi), Yoruba (yo).
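Because the service accepts only the codes listed above, it can be convenient to check the code on the client before spending a request. The following is a minimal sketch, not part of the API: the helper name build_transcribe_payload is hypothetical, the code set is copied from the list above, and lower-casing the input is an assumption made for convenience rather than documented API behavior.

```python
# Language codes copied from the supported-languages list above.
SUPPORTED_LANGUAGES = frozenset(
    """
    en es fr de it pt nl af sq am ar hy as az ba eu be bn bs br bg my ca zh
    hr cs da et fo fi gl ka el gu ht ha haw he hi hu is id ja jw kn kk km ko
    lo la lv ln lt lb mk mg ms ml mt mi mr mn ne no nn oc pa ps fa pl ro ru
    sa sr sn sd si sk sl so su sw sv tl tg ta tt te th bo tr tk uk ur uz vi
    cy yi yo
    """.split()
)


def build_transcribe_payload(file_id, language):
    """Return the JSON body for a transcribe request (hypothetical helper).

    Raises ValueError if the language code is not in the list above.
    """
    code = language.strip().lower()  # assumption: codes are sent in lower case
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    return {"fileId": file_id, "language": code}
```

The returned dictionary has the same shape as the json argument used in the example request, so it could be passed to requests.post directly.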
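The Transcription endpoint is available via a simple POST request. No complex setup is required; just make the request and get the data you need. The request is processed asynchronously: the initial response contains a pipelineId, which you then use to fetch the finished transcript, as the example below shows.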
- Method: POST
- URL: https://v2.api.audio/production/suite/transcribe
Example Request and Response
import requests

API_KEY = "00000000-0000-0000-0000-000000000000"
ORG_ID = "my_example_org"
my_file_id = "11111111-1111-1111-1111-111111111111"

# Submit the transcription job for a previously uploaded file.
r = requests.post(
    url="https://v2.api.audio/production/suite/transcribe",
    json={"fileId": my_file_id, "language": "en"},
    headers={"x-api-key": API_KEY, "x-assume-org": ORG_ID},
)
print(r.status_code)
# 202
print(r.json())
# {
#     "meta": {
#         "version": "123",
#         "requestId": "1fe70beb-f055-4d61-8428-71f10700f8f0",
#         "creditsUsed": 0,
#         "creditsRemaining": 96620.74,
#     },
#     "message": "Task being processed",
#     "warnings": [
#         "Indicated billing is lower then the actual value, which will be calculated once your request is finished."
#     ],
#     "data": {
#         "status": 202,
#         "dateCreated": 1730737260,
#         "pipelineId": "22222222-2222-2222-2222-222222222222",
#         "results": {
#             "replacedFileIds": [],
#             "newFileIds": [],
#             "inputFileIds": [],
#             "data": {},
#         },
#         "message": "",
#         "errors": [],
#     },
# }

# Fetch the pipeline result using the pipelineId returned by the first response.
r2 = requests.get(
    url="https://v2.api.audio/production/suite/pipeline/22222222-2222-2222-2222-222222222222",
    headers={"x-api-key": API_KEY, "x-assume-org": ORG_ID},
)
print(r2.status_code)
# 200
print(r2.json())
# {
#     "meta": {
#         "version": "123",
#         "requestId": "c78f3fe4-eb03-4b50-ad99-1fb73e88a10c",
#         "creditsUsed": 0,
#         "creditsRemaining": 96612.54,
#     },
#     "message": "Pipeline finished",
#     "warnings": [],
#     "data": {
#         "status": 200,
#         "dateCreated": 1730737260,
#         "pipelineId": "22222222-2222-2222-2222-222222222222",
#         "results": {
#             "newFileIds": [],
#             "replacedFileIds": [],
#             "data": {
#                 "language": "en_us",
#                 "text": "The mind is not a vessel to be filled, but a fire to be kindled.",
#                 "sections": [
#                     {
#                         "name": "section_1",
#                         "speaker": "A",
#                         "text": "The mind is not a vessel to be filled, but a fire to be kindled.",
#                     }
#                 ],
#                 "script": '<as:section name="section_1">\\n The mind is not a vessel to be filled, but a fire to be kindled.\\n</as:section>',
#             },
#             "inputFileIds": [
#                 {
#                     "fileId": "11111111-1111-1111-1111-111111111111",
#                     "filePath": "audiostack_examples/example_assets/emily_plutarch_quote.mp3",
#                     "label": "input",
#                 }
#             ],
#         },
#         "message": "complete",
#         "errors": [],
#     },
# }
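
The example above fetches the pipeline once, after the job has already finished. In practice the job may still be running when you first ask, so a small polling loop can help. The sketch below is one possible approach, not an official client: the names fetch_pipeline and wait_for_transcript are hypothetical, it uses the standard library's urllib instead of requests to stay dependency-free, and it assumes that a data.status of 202 means "still processing" and 200 means "finished", as seen in the two example responses. Treating any other status as a failure is also an assumption.

```python
import json
import time
import urllib.request

BASE_URL = "https://v2.api.audio/production/suite"


def fetch_pipeline(pipeline_id, api_key, org_id):
    """GET the pipeline document and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/pipeline/{pipeline_id}",
        headers={"x-api-key": api_key, "x-assume-org": org_id},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_transcript(fetch, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Call fetch() until the pipeline finishes, then return results.data.

    Assumption: data.status == 202 means pending and 200 means done, as in
    the example responses; anything else raises RuntimeError.
    """
    deadline = time.monotonic() + timeout
    while True:
        data = fetch()["data"]
        if data["status"] == 200:
            return data["results"]["data"]
        if data["status"] != 202:
            raise RuntimeError(f"pipeline failed: {data.get('errors')}")
        if time.monotonic() >= deadline:
            raise TimeoutError("pipeline did not finish in time")
        sleep(interval)


# Usage (performs real network calls, so it is left commented out):
# pipeline_id = r.json()["data"]["pipelineId"]
# transcript = wait_for_transcript(
#     lambda: fetch_pipeline(pipeline_id, API_KEY, ORG_ID)
# )
# print(transcript["text"])
# for section in transcript["sections"]:
#     print(section["speaker"], section["text"])
```

Passing the fetch step in as a callable keeps the loop independent of the HTTP library, so the same function works with requests and is easy to exercise with canned responses.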
The cost of this service is 0.41 credits per second of transcribed audio.
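
To budget for a batch of files, the per-second rate can be turned into a quick estimate. The helper below is a hypothetical sketch: the function name is invented for illustration, and rounding to two decimals is an assumption, since the exact billing granularity is not stated here.

```python
CREDITS_PER_SECOND = 0.41  # rate quoted above


def estimate_transcription_credits(duration_seconds):
    """Estimate the credits charged for audio of the given length in seconds.

    Assumption: cost scales linearly with duration and is rounded to two
    decimals; the real billing rules may differ.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return round(duration_seconds * CREDITS_PER_SECOND, 2)


print(estimate_transcription_credits(20))  # 8.2
```

In the example responses, creditsRemaining falls from 96620.74 to 96612.54, a difference of 8.20 credits, which would correspond to 20 seconds of audio at this rate, assuming nothing else was billed in between.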