This article contains specifications for each common caption file type. If you are experiencing errors with your captions, check to make sure your file meets the requirements below.
WebVTT
Specs:
- Files must begin with a valid signature (WEBVTT)
- Timecode is represented as 00:00:00.000
- Cues often include a numeric identifier (e.g. 1, 2, 3, ...), but this is optional
- Cues must have a content line (even if empty) followed by a blank line
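The WebVTT requirements above can be checked with a short script before uploading. The following is a minimal sketch rather than a full validator: the function name `check_webvtt` is hypothetical, and it only tests the signature and the 00:00:00.000 timecode shape described in this article.

```python
import re

# Timecode shape from the specs above: HH:MM:SS.mmm (period before milliseconds).
VTT_TIMECODE = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d{3}$")


def check_webvtt(text: str) -> list[str]:
    """Return a list of problems found; an empty list means none were detected.

    Hypothetical helper: it checks only the signature and timecode format.
    """
    problems = []
    lines = text.splitlines()
    if not lines or not lines[0].startswith("WEBVTT"):
        problems.append("file does not begin with the WEBVTT signature")
    for number, line in enumerate(lines, start=1):
        if "-->" not in line:
            continue  # only cue timing lines contain the arrow
        start, _, rest = line.partition("-->")
        parts = rest.split()
        end = parts[0] if parts else ""  # ignore any cue settings after the end time
        for stamp in (start.strip(), end):
            if not VTT_TIMECODE.match(stamp):
                problems.append(f"line {number}: bad timecode {stamp!r}")
    return problems
```

Calling `check_webvtt` on the contents of a file returns one message per problem, so a comma in a timecode (the SRT style) is reported for each affected line.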
SRT
Specs:
- Files must begin with the first cue's numerical identifier (e.g. 1, 2, 3, ...)
- Timecode is represented as 00:00:00,000
- Hours, minutes, and seconds are always written as two digits (07 not 7); milliseconds are always written as three digits
- Cues must have a content line (even if empty) followed by a blank line
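As an illustration of the SRT padding rules above, here is a small sketch that builds a timecode and a cue from a millisecond count. The helper names `srt_timecode` and `srt_cue` are hypothetical and not part of any library.

```python
def srt_timecode(total_ms: int) -> str:
    """Format milliseconds as HH:MM:SS,mmm with zero padding (hypothetical helper)."""
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, ms = divmod(rem, 1000)
    # Two digits for hours, minutes, seconds; three digits for milliseconds.
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{ms:03d}"


def srt_cue(index: int, start_ms: int, end_ms: int, text: str) -> str:
    """Build one cue: identifier, timing line, content line, then a blank line."""
    return f"{index}\n{srt_timecode(start_ms)} --> {srt_timecode(end_ms)}\n{text}\n\n"
```

For example, `srt_timecode(7005)` yields `00:00:07,005`, which shows both the two-digit seconds and the three-digit milliseconds.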
DFXP/TTML
Specs:
- Files must begin with a <tt> tag, include a <body> tag then a <div> tag, wherein each cue is written using the <p> tag
- All tags must open < > and close </>, and be properly nested
- Timecode is represented as 00:00:00.0
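Because DFXP/TTML is XML, the nesting and structure rules above can be checked with a standard XML parser. The sketch below uses Python's built-in `xml.etree.ElementTree`. The function name `check_ttml` is hypothetical, and it assumes `<body>` sits directly under `<tt>`, with `<div>` and `<p>` nested directly beneath it.

```python
import xml.etree.ElementTree as ET


def local(tag: str) -> str:
    """Strip an XML namespace prefix such as '{http://www.w3.org/ns/ttml}'."""
    return tag.rsplit("}", 1)[-1]


def check_ttml(text: str) -> list[str]:
    """Return a list of structural problems (hypothetical helper, not a full validator)."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        # Unclosed or badly nested tags are reported by the parser itself.
        return [f"not well-formed XML: {err}"]
    problems = []
    if local(root.tag) != "tt":
        problems.append("root element is not <tt>")
    body = next((el for el in root if local(el.tag) == "body"), None)
    if body is None:
        problems.append("missing <body>")
        return problems
    divs = [el for el in body if local(el.tag) == "div"]
    if not divs:
        problems.append("<body> has no <div>")
    cues = [p for div in divs for p in div if local(p.tag) == "p"]
    if divs and not cues:
        problems.append("no <p> cues inside <div>")
    return problems
```

A file with a `<p>` that is never closed fails at the parsing step, which is often the quickest way to locate a nesting mistake.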
SCC
Specs:
- Timecode is represented as 00:00:00:00
- Text content represented as '9420 9454 97a1 52ef 73e5 7320 61f2 e520 f2e5 6480 942c 8080 8080 942f' or something similar.
⚠️ Note: Because SCC is a broadcast standard format, the timecode begins at one hour (01:00:00:00) by default. This practice is not standard on the web, so we do not accommodate it. To ensure your captions begin at the start of the video, make sure the first timecode is 00:00:00:00.
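If an SCC file starts at 01:00:00:00, one way to rebase it is to subtract one from the hours field of every timecode. The following is a rough sketch: the function name `shift_scc_one_hour` is hypothetical, and it assumes each caption line begins with its timecode and that a whole-hour offset is the only adjustment needed.

```python
import re

# A timecode at the start of a line: HH:MM:SS:FF, or HH:MM:SS;FF for drop-frame.
SCC_TIMECODE = re.compile(r"^(\d{2})(:\d{2}:\d{2}[:;]\d{2})", re.MULTILINE)


def shift_scc_one_hour(text: str) -> str:
    """Subtract one hour from every timecode when the file starts at hour 1 or later.

    Hypothetical helper: the caption data after each timecode is left untouched.
    """
    first = SCC_TIMECODE.search(text)
    if first is None or int(first.group(1)) == 0:
        return text  # nothing to shift, or the file already starts in hour zero
    return SCC_TIMECODE.sub(
        lambda m: f"{int(m.group(1)) - 1:02d}{m.group(2)}", text
    )
```

Running the function on a file that already begins at 00:00:00:00 returns it unchanged, so it is safe to apply more than once.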
SAMI
Specs:
- Files must begin with a <sami> tag, include a <body> tag then a <sync> tag, wherein each cue is written using the <p> tag
- All tags must open < > and close </>, and be properly nested
- Timecode is represented in milliseconds (one second = 1000)
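Since SAMI expresses time as a single millisecond count, converting from a clock time is simple arithmetic. The sketch below shows the conversion and a cue built from it. The helper names `to_sami_ms` and `sami_cue` are hypothetical, and the lowercase `start` attribute on `<sync>` is written to match the tag style used in this article.

```python
def to_sami_ms(hours: int, minutes: int, seconds: int, millis: int = 0) -> int:
    """Convert a clock time to the millisecond count SAMI uses (one second = 1000)."""
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def sami_cue(start_ms: int, text: str) -> str:
    """Build one cue: a <sync> tag wrapping a <p> tag, both explicitly closed."""
    return f'<sync start="{start_ms}"><p>{text}</p></sync>'
```

For example, a cue that should appear at 00:01:01.500 has a start value of `to_sami_ms(0, 1, 1, 500)`, which is 61500.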