Different film gauges and standards involve different numbers of frames in each second, and audio has no frames at all. Without a standardised reference point, there would be no clear or efficient way to communicate, for instance, that playback should start at a certain point in two separate pieces of media. At best, communicating such a simple concept would become needlessly verbose, and at worst there would be several different ways of expressing it across the industry.
The Society of Motion Picture and Television Engineers (SMPTE) developed a standard called Timecode, or SMPTE Timecode, which assigns timestamps to media. For strict and accurate timecode, the media must be digital. Analogue media cannot truly retain timecode, although it can have timecode “burned” into it, or it can be controlled by a digital player that may be able to assign timecode from known positions (at the cost of accuracy, which is the whole point of timecode).
SMPTE timecode is written as four fields:
Hour : Minute : Second : Frame
This means that a timecode of 00:00:30:00 indicates 30 seconds from the start of a timecode region (usually, but not always, the start of the media).
A timecode of 00:21:12:00 translates to 21 minutes and 12 seconds.
A timecode of 00:03:14:15 translates to 3 minutes, 14 seconds, and 15 frames.
And so on.
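As a rough illustration of how the four fields break down, here is a minimal Python sketch. The function names parse_timecode and timecode_to_frames are made up for this example, and it assumes a whole-number frame rate with no drop-frame counting.

```python
def parse_timecode(tc):
    """Split 'HH:MM:SS:FF' into integer hours, minutes, seconds, frames."""
    hours, minutes, seconds, frames = (int(part) for part in tc.split(":"))
    return hours, minutes, seconds, frames


def timecode_to_frames(tc, fps):
    """Count whole frames since 00:00:00:00 (assumes an integer fps)."""
    hours, minutes, seconds, frames = parse_timecode(tc)
    total_seconds = hours * 3600 + minutes * 60 + seconds
    return total_seconds * fps + frames


print(parse_timecode("00:03:14:15"))          # (0, 3, 14, 15)
print(timecode_to_frames("00:03:14:15", 24))  # 4671
```

At 24 frames per second, 3 minutes and 14 seconds is 194 seconds, or 4656 frames, plus the 15 frames in the last field.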
It is also (time being what it is) possible to specify units smaller than a whole frame. The timecode 00:00:14:12.5 indicates 14 seconds, 12 frames, and half a frame. If you know how many frames are in one second, you can convert the frame value to milliseconds. For instance, at 24 frames per second, 12.5 frames is 0.5208 seconds, or 520.8 milliseconds (because 12.5 divided by 24 is approximately 0.5208, with 24 frames being equal to 1 second).
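The same arithmetic can be sketched in a few lines of Python. The helper names below are hypothetical, and the frame rate has to be supplied by the caller because the timecode itself does not carry it.

```python
def frames_to_milliseconds(frames, fps):
    """Convert a (possibly fractional) frame count to milliseconds."""
    return frames / fps * 1000.0


def timecode_to_seconds(tc, fps):
    """Convert 'HH:MM:SS:FF' to seconds; the frame field may be fractional."""
    hours, minutes, seconds, frames = tc.split(":")
    whole = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
    return whole + float(frames) / fps


print(round(frames_to_milliseconds(12.5, 24), 1))          # 520.8
print(round(timecode_to_seconds("00:00:14:12.5", 24), 4))  # 14.5208
```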
The value of 24 frames per second is common because USA 35mm film stock played back at 24 fps happens to synchronize flawlessly with the normal playback speed of audio (that is, when someone's mouth moves on screen at 24 fps, the recording of their voice matches perfectly, and continues to match for as long as a consistent speed is maintained).
That doesn't mean, however, that it's the only timebase you will see; British filmmakers use 25 fps instead of 24 fps, and video traditionally used 30 fps and later 29.97 fps, but eventually favoured 48 fps or 60 fps.
There is no inherent method in SMPTE timecode to identify the timebase, except to look at when the frames turn over into seconds.
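One way to picture "looking at when the frames turn over" is the Python sketch below. The function guess_timebase is a hypothetical helper, and it only gives a sensible answer if the sample happens to include the last frame before a second rolls over.

```python
def guess_timebase(timecodes):
    """Guess the nominal frame rate as the highest frame number seen, plus one."""
    highest = max(int(tc.split(":")[3]) for tc in timecodes)
    return highest + 1


# A short run of consecutive timecodes that crosses a second boundary.
observed = ["00:00:01:22", "00:00:01:23", "00:00:02:00", "00:00:02:01"]
print(guess_timebase(observed))  # 24
```

Here the frame field reaches 23 before returning to 00, so the count runs at 24 frames per second.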
Historically, there has also been something called drop frame timecode, which was used to account for the inherent mismatch between the traditional video rate of 30 fps and the 29.97 fps of “modern” video formats. The fix was simply to skip the first two frame numbers of every minute, except every tenth minute. No actual video frame is dropped; only the numbers in the timecode are skipped. It's a little like a leap year: in a leap year, you don't actually gain an extra day of life (and, likewise, you don't actually skip a day of life in a non-leap year). We just assign a new number to the 24-hour period that has accumulated in the leftover bits of our calendar.
Drop frame timecode is mostly unnecessary at this point, since most video formats can be recorded and edited in a variety of frame rates.
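To make the drop frame counting rule concrete, here is a minimal Python sketch that turns a running frame count into a drop frame label. The function name is made up for this example, it assumes 29.97 fps video counted at a nominal 30, and it writes a semicolon before the frame field, which is a common convention for marking drop frame timecode.

```python
def frames_to_drop_frame_timecode(frame_count):
    """Label a 0-based frame index with 29.97 fps drop frame timecode."""
    fps = 30    # nominal counting rate
    drop = 2    # frame numbers skipped at the start of most minutes
    frames_per_minute = fps * 60 - drop             # 1798 labels in a "short" minute
    frames_per_ten_minutes = fps * 600 - drop * 9   # 17982; the tenth minute is not short

    tens, remainder = divmod(frame_count, frames_per_ten_minutes)
    skipped = drop * 9 * tens
    if remainder >= fps * 60:
        # Past the first (full) minute of this ten-minute block.
        skipped += drop * (1 + (remainder - fps * 60) // frames_per_minute)

    number = frame_count + skipped  # the label, counted as if nothing were skipped
    frames = number % fps
    seconds = (number // fps) % 60
    minutes = (number // (fps * 60)) % 60
    hours = number // (fps * 3600)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d};{frames:02d}"


print(frames_to_drop_frame_timecode(1799))   # 00:00:59;29
print(frames_to_drop_frame_timecode(1800))   # 00:01:00;02
print(frames_to_drop_frame_timecode(17982))  # 00:10:00;00
```

The frame after 00:00:59;29 is labelled 00:01:00;02, so the numbers ;00 and ;01 are skipped, while the tenth minute starts at ;00 as usual.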
When to Use Timecode
Timecode is available across both video and audio. Strictly speaking, you don't need to use SMPTE timecode if you are not working to video, but pragmatically it is probably available to you if you find it useful. Many DAWs display timecode just as happily as they display traditional hour-and-minutes counters or measures-and-beats.
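To burn a visible timecode into a video, you can use the drawtext filter in ffmpeg. Because a filter changes the picture, the video stream has to be re-encoded, so only the audio is copied:
$ ffmpeg -i video.mp4 -acodec copy \
    -vf "drawtext=fontfile=DroidSansMono.ttf: timecode='01\:00\:00\:00': r=24: \
    x=(w-tw)/2: y=h-(2*lh): fontcolor=white: box=1: boxcolor=0x00000099" \
    -threads 6 -y output.mp4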
Or you can use a video player that dynamically displays timecode, like xjadeo.