Thomas' Tidbits

Creating Subtitles for Videos in Linux

By: Thomas Hawkins Published Date: 2019-10-31 08:04:00

subtitles linux

One thing we do at our workplace is produce video. We have a team of producers, techs, and editors who plan, shoot and edit video. What we don't have is staff who create subtitles, closed captions or transcripts. We often outsource this work to a third party and include the transcripts with our educational video. But if we have a finished video and the script it was created from, it is more cost-effective (and faster) to create those internally.

I recently completed a survey of the available software, and although I don't have a complete list to share, I will cover the tool I thought was best during my short testing.

Subtitle Composer: https://github.com/maxrd2/subtitlecomposer

There are versions available for Linux (Arch, Ubuntu, Debian, openSUSE and an AppImage) as well as Windows, and you can of course build from source. Here is the feature list from the project page:

Open/Save Text Subtitle Formats
- SubRip/SRT, MicroDVD, SSA/ASS, MPlayer, TMPlayer and YouTube captions
Open/OCR Graphics Subtitle Formats
- VobSub (.idx/.sub/.rar), BluRay/PGS (*.sup), formats supported by ffmpeg (DVD/Vob, DVB, XSUB, HDMV-PGS)
Demux Graphics/Text Subtitle Stream from video file
- SRT, SSA/ASS, MOV text, MicroDVD, Graphic formats supported by ffmpeg (DVD/Vob, DVB, XSUB, HDMV-PGS)
Speech Recognition from audio/video file using PocketSphinx
Smart language/text encoding detection
Live preview of subtitles in integrated video player (MPV, GStreamer, MPlayer, Xine, Phonon) w/ audio stream selection
Preview/editing of subtitles on audio waveform w/ audio stream selection
Quick and easy subtitle sync:
- Dragging several anchors/graftpoints and stretching timeline
- Time shifting and scaling, lines duration re-calculation, framerate conversion, etc.
- Joining and splitting of subtitle files
Side by side subtitle translations
Text styles (italic, bold, underline, stroke, color)
Spell checking
Detection of timing errors in subtitles
Scripting (JavaScript, Python, Ruby and other languages supported by Kross).
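
To make the "time shifting and scaling" item above a little more concrete: for an SRT file, it boils down to applying new = old * scale + shift to every timestamp. Here is a minimal Python sketch of that idea. It is not how Subtitle Composer implements it; the function names (retime, to_ms, to_timestamp) and the parameters shift_ms and scale are my own, made up for illustration.

```python
import re

# Matches SRT timestamps such as 00:01:02,500
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def to_ms(hours, minutes, seconds, millis):
    """Convert the four timestamp fields (as strings) to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def to_timestamp(ms):
    """Format milliseconds as an SRT timestamp, clamping negatives to zero."""
    ms = max(0, int(round(ms)))
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def retime(srt_text, shift_ms=0, scale=1.0):
    """Apply new = old * scale + shift_ms to every timestamp in srt_text.

    Hypothetical helper: it rewrites anything that looks like a timestamp,
    including one that happens to appear inside the subtitle text itself.
    """
    def replace(match):
        return to_timestamp(to_ms(*match.groups()) * scale + shift_ms)

    return TIMESTAMP.sub(replace, srt_text)
```

For example, retime(text, shift_ms=2000) pushes every subtitle two seconds later, and a scale other than 1.0 stretches or compresses the whole timeline, which is the kind of correction a framerate mismatch needs.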

I like it because it shows the video alongside an audio waveform, which makes it easy to line up the start and stop points of each subtitle with the audio. It can also correct and adjust your output automatically in a number of ways. The interface is configurable, and there are plenty of extras to help you subtitle/caption better, including variable playback rates, translations, and the ability to add scripts that streamline your workflow.

One feature I'm interested in is speech recognition from the audio/video file, as that would save me considerable time. To enable speech recognition, you need to install PocketSphinx and its dependencies. You can find more info here: https://github.com/maxrd2/SubtitleComposer/issues/88#issuecomment-341405873 and https://github.com/maxrd2/SubtitleComposer/issues/88#issuecomment-364688364. After installing the dependencies, relaunch the program and reload your video. The Video -> Recognize Speech option doesn't appear until you actually play your video; once it does, you can start the recognition and pause the video. Once recognition completes, you will have the recognized text (the accuracy wasn't great, maybe 60%) as well as natural break points for your audio. You will likely want to adjust those break points as well as correct the text.

I found it helpful to use the Edit -> Join Lines option to get all of the text into a single block, fix the transcription mistakes, and then use Edit -> Split Lines to break it back up. It looked like it worked perfectly and assigned the text back to the original break points, but then the program crashed. :( Luckily, I had copied all of the text to a text editor for safekeeping.

I reopened the program, reopened my video, created a single subtitle and pasted in my text (this would also work great if you had an existing script or transcript to work from). Edit -> Split Lines then automatically creates breaks based on the amount of text. Next, click Times -> Shift and enter the timestamps where speech starts and where it stops. The program automatically creates all of the break points, sized roughly in proportion to the number of characters, and offsets them all by the correct amount. Manual adjustments are still required to merge a few lines, split others, and adjust break points, but it gets you an excellent head start.
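
If you want a feel for what that split-and-shift step is doing, the same idea can be sketched in a few lines of Python: break the script into chunks, then give each chunk a share of the spoken time in proportion to its character count. This is only my approximation of the behavior I observed, not Subtitle Composer's actual algorithm, and the names script_to_srt, start_ms, end_ms and max_chars are hypothetical.

```python
import textwrap


def to_timestamp(ms):
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = max(0, int(round(ms)))
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def script_to_srt(script, start_ms, end_ms, max_chars=80):
    """Split a script into chunks and time them by character count.

    Assumption: speech is spread evenly over the text, so a chunk with
    twice the characters gets twice the screen time.
    """
    # Collapse all whitespace, then wrap on word boundaries.
    chunks = textwrap.wrap(" ".join(script.split()), width=max_chars)
    if not chunks:
        return ""

    total_chars = sum(len(chunk) for chunk in chunks)
    span = end_ms - start_ms
    entries = []
    elapsed = 0  # characters already placed on the timeline
    for index, chunk in enumerate(chunks, start=1):
        begin = start_ms + span * elapsed / total_chars
        elapsed += len(chunk)
        finish = start_ms + span * elapsed / total_chars
        entries.append(
            f"{index}\n{to_timestamp(begin)} --> {to_timestamp(finish)}\n{chunk}\n"
        )
    # SRT entries are separated by a blank line.
    return "\n".join(entries)
```

Calling script_to_srt(open("script.txt").read(), start_ms=1000, end_ms=95000) on a hypothetical script.txt would give a rough first-pass SRT file that still needs the same manual merging, splitting and nudging described above.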


