Transcription of an audio file

This guide covers the Transcription endpoint, /production/suite/transcribe. The endpoint converts audio content into readable, searchable text, which is useful for documentation, accessibility, content analysis, and more.


Languages supported

The transcription service supports the languages listed below. The code in parentheses is the value to pass in the request's "language" field.

English (en), Spanish (es), French (fr), German (de), Italian (it), Portuguese (pt), Dutch (nl), Afrikaans (af), Albanian (sq), Amharic (am), Arabic (ar), Armenian (hy), Assamese (as), Azerbaijani (az), Bashkir (ba), Basque (eu), Belarusian (be), Bengali (bn), Bosnian (bs), Breton (br), Bulgarian (bg), Burmese (my), Catalan (ca), Chinese (zh), Croatian (hr), Czech (cs), Danish (da), Estonian (et), Faroese (fo), Finnish (fi), Galician (gl), Georgian (ka), Greek (el), Gujarati (gu), Haitian Creole (ht), Hausa (ha), Hawaiian (haw), Hebrew (he), Hindi (hi), Hungarian (hu), Icelandic (is), Indonesian (id), Japanese (ja), Javanese (jw), Kannada (kn), Kazakh (kk), Khmer (km), Korean (ko), Lao (lo), Latin (la), Latvian (lv), Lingala (ln), Lithuanian (lt), Luxembourgish (lb), Macedonian (mk), Malagasy (mg), Malay (ms), Malayalam (ml), Maltese (mt), Maori (mi), Marathi (mr), Mongolian (mn), Nepali (ne), Norwegian (no), Norwegian Nynorsk (nn), Occitan (oc), Punjabi (pa), Pashto (ps), Persian (fa), Polish (pl), Romanian (ro), Russian (ru), Sanskrit (sa), Serbian (sr), Shona (sn), Sindhi (sd), Sinhala (si), Slovak (sk), Slovenian (sl), Somali (so), Sundanese (su), Swahili (sw), Swedish (sv), Tagalog (tl), Tajik (tg), Tamil (ta), Tatar (tt), Telugu (te), Thai (th), Tibetan (bo), Turkish (tr), Turkmen (tk), Ukrainian (uk), Urdu (ur), Uzbek (uz), Vietnamese (vi), Welsh (cy), Yiddish (yi), Yoruba (yo).
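
Checking the language code on the client before sending a request can catch typos early. The snippet below is a minimal sketch: the set is copied from the list above, while `normalize_language` is a hypothetical helper name and the lower-casing step is an assumption about convenience, not documented API behaviour.

```python
# Language codes copied from the supported-languages list above.
SUPPORTED_LANGUAGES = frozenset(
    """
    en es fr de it pt nl af sq am ar hy as az ba eu be bn bs br
    bg my ca zh hr cs da et fo fi gl ka el gu ht ha haw he hi hu
    is id ja jw kn kk km ko lo la lv ln lt lb mk mg ms ml mt mi
    mr mn ne no nn oc pa ps fa pl ro ru sa sr sn sd si sk sl so
    su sw sv tl tg ta tt te th bo tr tk uk ur uz vi cy yi yo
    """.split()
)


def normalize_language(code: str) -> str:
    """Return the trimmed, lower-cased code, or raise ValueError if it is not listed."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported transcription language: {code!r}")
    return normalized
```

The returned value can then be passed as the "language" field of the request body.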

The Transcription endpoint is available via a simple POST request. No complex setup is required; just make the request and get the data you need.

  • Method: POST
  • URL: https://v2.api.audio/production/suite/transcribe

Example Request and Response

import requests

API_KEY = "00000000-0000-0000-0000-000000000000"
ORG_ID = "my_example_org"
my_file_id = "11111111-1111-1111-1111-111111111111"

r = requests.post(
    url="https://v2.api.audio/production/suite/transcribe",
    json={"fileId": my_file_id, "language": "en"},
    headers={"x-api-key": API_KEY, "x-assume-org": ORG_ID},
)
print(r.status_code)
# 202

print(r.json())

# {
#     "meta": {
#         "version": "123",
#         "requestId": "1fe70beb-f055-4d61-8428-71f10700f8f0",
#         "creditsUsed": 0,
#         "creditsRemaining": 96620.74,
#     },
#     "message": "Task being processed",
#     "warnings": [
#         "Indicated billing is lower then the actual value, which will be calculated once your request is finished."
#     ],
#     "data": {
#         "status": 202,
#         "dateCreated": 1730737260,
#         "pipelineId": "22222222-2222-2222-2222-222222222222",
#         "results": {
#             "replacedFileIds": [],
#             "newFileIds": [],
#             "inputFileIds": [],
#             "data": {},
#         },
#         "message": "",
#         "errors": [],
#     },
# }

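# Fetch the pipeline created by the transcribe request, using the ID from the first response.
pipeline_id = r.json()["data"]["pipelineId"]
r2 = requests.get(
    url=f"https://v2.api.audio/production/suite/pipeline/{pipeline_id}",
    headers={"x-api-key": API_KEY, "x-assume-org": ORG_ID},
)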
print(r2.status_code)
# 200

print(r2.json())
# {
#     "meta": {
#         "version": "123",
#         "requestId": "c78f3fe4-eb03-4b50-ad99-1fb73e88a10c",
#         "creditsUsed": 0,
#         "creditsRemaining": 96612.54,
#     },
#     "message": "Pipeline finished",
#     "warnings": [],
#     "data": {
#         "status": 200,
#         "dateCreated": 1730737260,
#         "pipelineId": "22222222-2222-2222-2222-222222222222",
#         "results": {
#             "newFileIds": [],
#             "replacedFileIds": [],
#             "data": {
#                 "language": "en_us",
#                 "text": "The mind is not a vessel to be filled, but a fire to be kindled.",
#                 "sections": [
#                     {
#                         "name": "section_1",
#                         "speaker": "A",
#                         "text": "The mind is not a vessel to be filled, but a fire to be kindled.",
#                     }
#                 ],
#                 "script": '<as:section name="section_1">\\n    The mind is not a vessel to be filled, but a fire to be kindled.\\n</as:section>',
#             },
#             "inputFileIds": [
#                 {
#                     "fileId": "11111111-1111-1111-1111-111111111111",
#                     "filePath": "audiostack_examples/example_assets/emily_plutarch_quote.mp3",
#                     "label": "input",
#                 }
#             ],
#         },
#         "message": "complete",
#         "errors": [],
#     },
# }
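
In the example above, the POST returns 202 with a pipelineId, and a later GET on the pipeline returns the finished transcript. In practice the pipeline may not be finished on the first GET, so some form of polling is likely needed. The following is a minimal sketch, assuming that data.status stays 202 while the task is being processed (as the first response suggests) and changes once it ends. `wait_for_pipeline`, its parameters, and the default interval and timeout are hypothetical choices, not part of the API.

```python
import time
from typing import Callable


def wait_for_pipeline(
    fetch: Callable[[], dict],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until data.status is no longer 202, then return the "data" object.

    fetch must return the parsed JSON body of a GET on the pipeline URL.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        data = fetch()["data"]
        if data["status"] != 202:  # assumption: 202 means "still processing"
            return data
        if time.monotonic() >= deadline:
            raise TimeoutError("pipeline did not finish before the timeout")
        sleep(interval_s)


# Usage with the request from the example above (not executed here):
#   url = f"https://v2.api.audio/production/suite/pipeline/{pipeline_id}"
#   headers = {"x-api-key": API_KEY, "x-assume-org": ORG_ID}
#   data = wait_for_pipeline(lambda: requests.get(url, headers=headers).json())
#   print(data["results"]["data"]["text"])
```

Passing the fetch step in as a callable keeps the loop independent of the HTTP client and makes it easy to exercise without network access. Callers should still inspect data["errors"] on the returned object, since a status other than 202 does not necessarily mean success.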

The cost of this service is 0.41 credits per second of transcribed audio.
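
As a rough worked example of the rate above, the sketch below estimates the cost of a file from its duration. `estimate_transcription_credits` is a hypothetical helper, and how the service rounds partial seconds is not stated here, so treat the result as an estimate only.

```python
from decimal import Decimal

# Rate quoted above: 0.41 credits per second of transcribed audio.
CREDITS_PER_SECOND = Decimal("0.41")


def estimate_transcription_credits(duration_seconds: float) -> Decimal:
    """Estimate credits for a clip of the given length (no rounding rules applied)."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return CREDITS_PER_SECOND * Decimal(str(duration_seconds))


print(estimate_transcription_credits(20))  # 8.20
```

For comparison, creditsRemaining in the example responses drops from 96620.74 to 96612.54, a difference of 8.20 credits, which would match a clip of about 20 seconds at this rate.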