Transcribing is the process of writing down the words heard in an audio recording. Our solution transcribes the audio track of your video and generates subtitles automatically, using modern AI models. The result:
Specify the audio language explicitly in the “audio_language” parameter; otherwise the AI may force the system to recognize gibberish.
What can be transcribed?
The service uses additional methods to detect the presence of speech in the audio track, which improves the detection of human conversation.
What about translation?
The subtitles can also be translated automatically from the original language into another language you need.
To create a translation, specify the desired language explicitly in the “subtitles_language” parameter. Otherwise, the subtitles will be in the original language. To translate into several languages, create a separate task for each language.
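Because each target language needs its own task, a client would typically prepare one request body per language. The following is a minimal sketch of that idea; the helper name `translation_payloads` and the example URL are hypothetical, and only the `url`, `audio_language`, and `subtitles_language` fields come from this page.

```python
def translation_payloads(url, target_languages, audio_language=None):
    """Build one request body per target subtitle language.

    Hypothetical helper: each returned dict is meant to be sent as a
    separate task, since one task produces subtitles in one language.
    """
    payloads = []
    for lang in target_languages:
        body = {"url": url, "subtitles_language": lang}
        # Only include audio_language when the caller knows it.
        if audio_language is not None:
            body["audio_language"] = audio_language
        payloads.append(body)
    return payloads


# Example: two translations of the same German-language video.
payloads = translation_payloads(
    "https://example.com/video.mp4", ["eng", "uzb"], audio_language="ger"
)
```

Each element of `payloads` would then be submitted as its own request.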
Use MP4 videos for processing. This method is not limited to videos stored in our video hosting (see how to get a link to an MP4 rendition), so you can use a link to any external file that is accessible over HTTP/HTTPS.
For now, only the first audio track can be processed; this will later be extended so that any track can be used.
Also, not all language pairs are currently supported. If a language pair is not supported for automatic translation, the task status will be FAILURE with a description of the reason.
Example: eng => uzb.
You can request to add the language pair you need for automatic translation. Contact our support.
Examples of modes to transcribe and/or translate:
{ "url":"..." }
{ "url":"...", "audio_language":"ger" }
{ "url":"...", "subtitles_language":"eng" }
{ "url":"...", "audio_language":"ger", "subtitles_language":"eng" }
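The four request bodies above differ only in which optional parameters are present. A small sketch of how a client might assemble them is shown below; the function name `build_payload` is an assumption made for illustration, while the field names are the ones listed on this page.

```python
import json


def build_payload(url, audio_language=None, subtitles_language=None):
    """Assemble a request body, omitting optional fields that are not set.

    Hypothetical helper, not part of any official client library.
    """
    body = {"url": url}
    if audio_language is not None:
        body["audio_language"] = audio_language
    if subtitles_language is not None:
        body["subtitles_language"] = subtitles_language
    return body


# The four modes from the examples above, in the same order.
modes = [
    build_payload("..."),
    build_payload("...", audio_language="ger"),
    build_payload("...", subtitles_language="eng"),
    build_payload("...", audio_language="ger", subtitles_language="eng"),
]

for mode in modes:
    print(json.dumps(mode))
```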
Example of creating a task to process an MP4 file (the one from the animated GIF above):
curl -L 'https://api.gcore.com/streaming/ai/transcribe' \
-H 'Content-Type: application/json' \
-H 'Authorization: APIKey 1234$abcd...' \
-d '{
"url": "https://demo-files.gvideo.io/apidocs/spritefright-blender-cut30sec.mp4"
}'
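For readers who prefer Python over curl, the same request can be sketched with the standard library. This is an illustration rather than an official client: the function name `build_transcribe_request` is hypothetical, the POST method is inferred from the curl `-d` option, and the token is the same placeholder used above. The request is only constructed here, not sent.

```python
import json
import urllib.request

API_URL = "https://api.gcore.com/streaming/ai/transcribe"


def build_transcribe_request(video_url, token):
    """Build (but do not send) the request shown in the curl example."""
    body = json.dumps({"url": video_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Scheme word, a single space, then the token.
            "Authorization": f"APIKey {token}",
        },
    )


request = build_transcribe_request(
    "https://demo-files.gvideo.io/apidocs/spritefright-blender-cut30sec.mp4",
    "1234$abcd...",  # placeholder token, replace with your own
)
# To actually create the task, pass the request to urllib.request.urlopen(request).
```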
As described above, transcription is done automatically using AI. Therefore, the quality may differ from manual transcription by a professional. If this happens, you can download the subtitles and edit them in an external editor.
Transcription and translation are 2 different AI tasks:
At the heart of transcription is the Whisper AI model from OpenAI, with additional optimisations and services. The AI models run on our own infrastructure, so files and data are not transferred to any external services. After processing, the original files are also deleted from the local storage of the AI. For more details about our solution, its architecture, and its benefits, see the knowledge base and blog.
API key for authentication. Make sure to include the word apikey, followed by a single space and then your token.
Example: apikey 1234$abcdef
The response returns the ID of the created AI task. Using this AI task ID, you can check the status and get the video processing result; see the GET /ai/results method.
The response is of type object.
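Since the task runs asynchronously, a client usually polls for the result using the returned task ID. The sketch below shows one possible polling loop under stated assumptions: the result is taken to be a dict with a "status" key, "SUCCESS" is assumed as the successful terminal status (only FAILURE is named on this page), and the actual call to GET /ai/results is left to a caller-supplied `fetch_status` function because its exact URL and response schema are not described here.

```python
import time


def wait_for_task(task_id, fetch_status, interval=5.0, max_attempts=60,
                  sleep=time.sleep):
    """Poll until the task reaches a terminal status, then return the result.

    fetch_status is a caller-supplied callable that takes the task ID and
    returns a dict with a "status" key (an assumed response shape).
    """
    terminal = ("SUCCESS", "FAILURE")  # "SUCCESS" is an assumption
    for _ in range(max_attempts):
        result = fetch_status(task_id)
        if result.get("status") in terminal:
            return result
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} checks")


# Usage with a stand-in for the real API call (no network access needed).
fake_responses = iter([
    {"status": "PENDING"},
    {"status": "FAILURE", "error": "language pair not supported"},
])
outcome = wait_for_task(
    "example-task-id",
    lambda _task_id: next(fake_responses),
    sleep=lambda _seconds: None,  # skip real waiting in this demo
)
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any particular HTTP client and makes it easy to try out without calling the service.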