Your goal is to develop a command-line utility that will be compiled on both Linux and Windows, so it must work on both platforms.
The utility will read a transport stream containing TV recordings (TS with MPEG-2 video) and analyze the video in order to detect the channel logo.
Since in our country commercials are broadcast without the channel logo, and TV programs are always broadcast with the channel logo, we will use this to detect which parts of the video are commercials and which parts are the actual program.
The utility will make this distinction based on the presence or absence of the channel logo. It will print the intervals with commercials and with video, in seconds, to the command line. For example:
0 65.9 C
65.9 644.5 V
644.5 1205.4 C
This means that the first 65.9 seconds of the video were detected as commercials (flag C), then came the actual video (flag V), and then there was another commercial from 644.5 to 1205.4 seconds, where the video ends. Fields are separated by a TAB character.
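To make the output format concrete, here is a minimal Python sketch that prints the sample intervals above as TAB-separated lines. The function names and the one-decimal rounding are assumptions for illustration only; the real utility may be written in any language that compiles on both platforms.

```python
def fmt_seconds(value):
    """Format seconds with one decimal and drop a trailing ".0" (0.0 -> "0")."""
    text = f"{value:.1f}"
    return text.rstrip("0").rstrip(".")

def format_intervals(intervals):
    """Render (start, end, flag) tuples as TAB-separated lines."""
    lines = []
    for start, end, flag in intervals:
        lines.append("\t".join([fmt_seconds(start), fmt_seconds(end), flag]))
    return "\n".join(lines)

# The three intervals from the example above.
sample = [(0.0, 65.9, "C"), (65.9, 644.5, "V"), (644.5, 1205.4, "C")]
print(format_intervals(sample))
```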
Keep in mind:
- channel logo can be partially transparent
- channel logo's position is different for each channel, some channels have it at top right, some at top left, etc.
- the entire video file can start with a commercial, so the first frames do not need to contain the channel logo
- the entire video file can end with a commercial too
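The text does not prescribe how the logo is to be found. One possible approach, sketched below in Python on tiny grayscale frames, is to treat pixels whose brightness stays nearly constant across most pivot frames as logo candidates. All names, the tolerance values, and the assumption that frames with a logo are the majority are illustrative only.

```python
from statistics import median

def estimate_logo(frames, tolerance=12, min_share=0.5):
    """Return (mask, template) for frames given as 2D lists of 0..255 values.

    mask[y][x] is True where at least `min_share` of the frames are within
    `tolerance` of the per-pixel median; template holds those medians.
    """
    height, width = len(frames[0]), len(frames[0][0])
    mask = [[False] * width for _ in range(height)]
    template = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            values = [frame[y][x] for frame in frames]
            mid = median(values)
            close = sum(1 for v in values if abs(v - mid) <= tolerance)
            template[y][x] = mid
            mask[y][x] = close >= min_share * len(values)
    return mask, template

def has_logo(frame, mask, template, tolerance=12, min_match=0.8):
    """True if most masked pixels of `frame` are close to the template."""
    coords = [(y, x) for y, row in enumerate(mask)
              for x, on in enumerate(row) if on]
    if not coords:
        return False  # no static region was found at all
    hits = sum(1 for y, x in coords
               if abs(frame[y][x] - template[y][x]) <= tolerance)
    return hits >= min_match * len(coords)

# Made-up 2x3 frames: the top-left pixel is the "logo" in the first three.
frames = [
    [[200, 10, 90], [30, 160, 250]],
    [[205, 120, 20], [220, 40, 100]],
    [[198, 240, 170], [110, 90, 5]],
    [[50, 60, 230], [170, 250, 60]],   # commercial: no logo
]
mask, template = estimate_logo(frames)
print([has_logo(f, mask, template) for f in frames])
```

Because a partially transparent logo changes brightness with the background, a real detector would probably need a looser tolerance or an edge-based comparison, and the mask search could be restricted to the frame corners.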
The utility will not be open source, so you can only use libraries or other open-source components whose licenses permit that.
The utility must be very quick. It must analyze only very few frames. Here is how to do it:
We will assume that a commercial, if present, always lasts longer than 30 seconds (configurable). So in the first iteration, the software will collect one key frame every 30 seconds. For example, for 2 hours of video it will need to analyze only 240 key frames. We will call them "pivot frames" here. They will be used to determine whether there is a logo and what it looks like, and later we will analyze our pivot frames to get a first guess for the intervals.
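The pivot-frame arithmetic can be checked with a few lines of Python. This is only a sketch; the function name and the choice to sample at 0, 30, 60, ... seconds are assumptions.

```python
import math

def pivot_times(duration_s, step_s=30.0):
    """Return the target timestamps (in seconds) of the pivot frames."""
    count = math.ceil(duration_s / step_s)
    return [i * step_s for i in range(count)]

two_hours = pivot_times(2 * 60 * 60)
print(len(two_hours), two_hours[:3], two_hours[-1])
```

For a 2-hour recording this yields 240 timestamps, matching the figure above.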
It is forbidden to read all the video data sequentially. It could be a 5 GB file on disk and we do not want to read it as a whole! You must use seek() to skip over the 30-second intervals and then find the nearest following keyframe. The seeking doesn't need to be precise. Processing speed is very important.
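A rough Python sketch of such an imprecise seek follows. It relies on two facts about transport streams: packets are 188 bytes long and start with the sync byte 0x47. Everything else is an assumption for illustration: the function names, the roughly constant bitrate used to turn a timestamp into a byte offset, and a duration that is already known. Locating the next keyframe after the packet boundary still requires parsing the video stream, which is not shown.

```python
import io

TS_PACKET_SIZE = 188   # fixed packet length in an MPEG transport stream
SYNC_BYTE = 0x47       # first byte of every TS packet

def approx_offset(time_s, duration_s, file_size):
    """Guess a byte offset for a timestamp, assuming a roughly constant bitrate."""
    return int(file_size * time_s / duration_s)

def seek_to_packet(stream, offset, probe_packets=3):
    """Seek near `offset` and return the offset of the next packet start.

    A candidate counts as a packet start only if sync bytes also appear at
    the following 188-byte strides, which filters out stray 0x47 bytes.
    Returns None if no packet boundary is found in the probed window.
    """
    stream.seek(offset)
    window = stream.read(TS_PACKET_SIZE * (probe_packets + 1))
    for i in range(min(TS_PACKET_SIZE, len(window))):
        positions = range(i, i + TS_PACKET_SIZE * probe_packets, TS_PACKET_SIZE)
        if all(p < len(window) and window[p] == SYNC_BYTE for p in positions):
            stream.seek(offset + i)
            return offset + i
    return None

# Fake stream: 10 packets consisting of a sync byte and 187 zero bytes.
fake = io.BytesIO((bytes([SYNC_BYTE]) + bytes(TS_PACKET_SIZE - 1)) * 10)
print(seek_to_packet(fake, 500))
```

Only a few hundred bytes are read per seek, so the cost stays independent of the file size.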
Once the software identifies the logo, it will go through all the previously read key frames and determine which ones are commercials and which are not. It is wise to remember the keyframe positions from the previous run, so that the utility can quickly seek() to them and read them again, or even hold the frames in RAM if that is feasible.
At this point, the utility has identified which of the pivot frames are commercials. It will look similar to this:
(V means the pivot frame has a video, C means commercial)
So now it is possible to calculate intervals in the given precision (our precision is now 30 seconds).
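As a sketch of that calculation, the following Python merges consecutive pivot flags into coarse intervals. The flag string "CCVVVC" is a made-up example, and the function name and the optional clamping to the video duration are assumptions.

```python
def coarse_intervals(flags, step_s=30.0, duration_s=None):
    """Merge runs of equal pivot flags into (start, end, flag) intervals."""
    intervals = []
    for i, flag in enumerate(flags):
        start = i * step_s
        if intervals and intervals[-1][2] == flag:
            # Same flag as the previous pivot: extend the current interval.
            intervals[-1] = (intervals[-1][0], start + step_s, flag)
        else:
            intervals.append((start, start + step_s, flag))
    if intervals and duration_s is not None:
        # The last interval ends where the video ends.
        last_start, _, last_flag = intervals[-1]
        intervals[-1] = (last_start, duration_s, last_flag)
    return intervals

rough = coarse_intervals("CCVVVC", step_s=30.0, duration_s=170.0)
print(rough)
```

Each boundary in the result is accurate only to within one step (30 seconds by default).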
When done, the utility will need to read some video data (starting 30 seconds before the first known C frame, and not more than 30 seconds long) to identify precisely where the first commercial block starts. It will not read the entire 30 seconds of video; instead it will use a divide-and-conquer algorithm (seek to the middle of the data and check whether we are before or after the start of the commercial, then seek again, and so on, until the precise keyframe is found). The same technique will be used to find where the first commercial block ends, etc.
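The divide-and-conquer step is essentially a binary search over time. Below is a minimal Python sketch in which the hypothetical callback is_commercial(t) stands for "seek to time t, decode the nearest keyframe, and check for the logo". The function name and the precision parameter are assumptions, and the search assumes the flag flips exactly once between the two starting points.

```python
def refine_boundary(lo, hi, is_commercial, precision_s=0.5):
    """Narrow down the time at which the flag flips between lo and hi.

    Requires is_commercial(lo) != is_commercial(hi). Returns the earliest
    probed time that already carries the flag of `hi`.
    """
    lo_flag = is_commercial(lo)
    while hi - lo > precision_s:
        mid = (lo + hi) / 2
        if is_commercial(mid) == lo_flag:
            lo = mid   # still on the same side as lo: boundary is later
        else:
            hi = mid   # already flipped: boundary is at mid or earlier
    return hi

# Simulated recording in which a commercial starts at 644.5 seconds.
probes = []
def fake_is_commercial(t):
    probes.append(t)
    return t >= 644.5

start = refine_boundary(630.0, 660.0, fake_is_commercial)
print(round(start, 2), len(probes))
```

Each probe costs one seek and one decoded keyframe, so a 30-second window is resolved in a handful of reads. In a real file the achievable precision is limited by the spacing between keyframes.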
The utility must accept command-line parameters to list only intervals with commercials (C flag), or to list only intervals with video (V flag).
It must also accept a command-line parameter to specify the interval between keyframes for the initial check (default 30 seconds, see above).
The name of the .ts video file to process will also be specified as a command-line parameter.
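One possible shape for this command-line interface is sketched below in Python with argparse. The program name logodetect and the option names --only and --step are hypothetical; the text only requires that such parameters exist.

```python
import argparse

def build_parser():
    """Build the argument parser; all option names here are hypothetical."""
    parser = argparse.ArgumentParser(
        prog="logodetect",
        description="Detect commercial and program intervals in a .ts file.")
    parser.add_argument("input", help="path to the .ts video file")
    parser.add_argument("--only", choices=["C", "V"], default=None,
                        help="list only commercial (C) or video (V) intervals")
    parser.add_argument("--step", type=float, default=30.0,
                        help="seconds between pivot frames (default: 30)")
    return parser

def filter_intervals(intervals, only):
    """Keep only intervals with the requested flag, or all if `only` is None."""
    return [iv for iv in intervals if only is None or iv[2] == only]

args = build_parser().parse_args(["recording.ts", "--only", "C"])
sample = [(0.0, 65.9, "C"), (65.9, 644.5, "V"), (644.5, 1205.4, "C")]
print(args.input, args.step, filter_intervals(sample, args.only))
```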
You will provide guidance on how to compile the utility on Linux and on Windows.
added image attachment - example channel logo