How does MythTV's Commercial Detection work? Surprisingly well. Ever wonder how it does such a good job of identifying commercials?
There are three key indicators that MythTV uses from recorded content to identify commercials.
A blank frame is often sandwiched between the television show and the commercials. The simplest form of commercial detection is to search for blank frames in the video feed. The problem is that this can be very misleading. A blank frame can appear anywhere, and its presence alone doesn't mean there is a commercial break. You could easily end up with commercials marked as part of the show and parts of the show marked as commercials.
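To make the blank-frame idea concrete, here is a minimal sketch in Python. It is not MythTV's actual code. It assumes each frame is a grid of brightness values from 0 to 255, and the function names and thresholds are made up for illustration.

```python
from statistics import mean, pvariance


def is_blank_frame(frame, max_brightness=20, max_variance=10.0):
    """Treat a frame as blank when it is both dark and nearly uniform.

    `frame` is a list of rows of brightness values (0-255).
    Both thresholds are illustrative assumptions, not MythTV settings.
    """
    pixels = [p for row in frame for p in row]
    return mean(pixels) <= max_brightness and pvariance(pixels) <= max_variance


def find_blank_frames(frames, **thresholds):
    """Return the indices of all frames that look blank."""
    return [i for i, f in enumerate(frames) if is_blank_frame(f, **thresholds)]


frames = [
    [[0, 1], [2, 1]],        # dark and uniform: blank
    [[10, 200], [90, 30]],   # ordinary picture content
    [[5, 5], [5, 5]],        # dark and uniform: blank
]
print(find_blank_frames(frames))
```

Note that a dark night-time scene could pass this test just as easily as a real break, which is exactly the weakness described above.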
Scene transitions are another indicator. A scene transition is a cut from one piece of video to another. A simple example would be a newscast where someone is being interviewed. While the anchor is asking the question, you may see both the anchor and the person being interviewed. When the person being interviewed starts to answer, the scene "cuts" to a close-up of that person's face. With commercials, there is a scene transition "cut" between each commercial. Each commercial is usually unrelated to the next, so the last frame of one commercial is totally different from the first frame of the next. Looking for patterns in scene transitions is one way to identify commercials. Five 30-second scenes in a row may be a good indication of a block of commercials. This method works better than the blank frame method, but it isn't foolproof either. Nothing stops scene changes in a show from mimicking commercials, and vice versa.
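The two halves of this idea, spotting cuts and then looking for commercial-like patterns in them, can be sketched roughly as follows. This is my own simplified illustration in Python, not how MythTV implements it. Frames are again grids of brightness values, and the difference threshold, the 15 and 60 second spot lengths, and the minimum run length are assumptions I picked for the example (only the 30 second figure comes from the text above).

```python
def frame_difference(a, b):
    """Mean absolute per-pixel difference between two equally sized frames."""
    pa = [p for row in a for p in row]
    pb = [p for row in b for p in row]
    return sum(abs(x - y) for x, y in zip(pa, pb)) / len(pa)


def find_cuts(frames, threshold=60.0):
    """Indices i where frame i differs sharply from frame i - 1."""
    return [
        i for i in range(1, len(frames))
        if frame_difference(frames[i - 1], frames[i]) > threshold
    ]


def find_commercial_like_runs(cut_times, spot_lengths=(15, 30, 60),
                              tolerance=1.0, min_run=3):
    """Find time spans made of consecutive scenes with typical ad lengths.

    `cut_times` is a sorted list of cut positions in seconds. A span is
    reported only if it holds at least `min_run` such scenes in a row.
    """
    runs, run_start, count = [], None, 0
    for start, end in zip(cut_times, cut_times[1:]):
        length = end - start
        if any(abs(length - s) <= tolerance for s in spot_lengths):
            if count == 0:
                run_start = start
            count += 1
        else:
            if count >= min_run:
                runs.append((run_start, start))
            count = 0
    if count >= min_run:
        runs.append((run_start, cut_times[-1]))
    return runs


# Cuts at these second marks: a long scene, four ad-length scenes, a long scene.
cut_times = [0, 412, 442, 472, 487, 517, 900]
print(find_commercial_like_runs(cut_times))
```

A fast-paced show with a few scenes that happen to run about 30 seconds each would fool this in the same way, so on its own it is only a hint.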
The third indicator of commercials that MythTV uses I find rather ironic: bugs, also referred to as DOGs (Digital On-screen Graphics) or watermarks. A bug is that little TV station logo that usually sits in the bottom right corner of your screen during a TV show. I find this ironic because one of the reasons for it being there is to build channel awareness in the world of digital video recorders like MythTV. Since DVR users usually find shows by name rather than by channel, they are less concerned than other viewers with which station a show is on. MythTV watches for these logos. Because they are generally not shown during commercials, identifying one and then watching for it gives a good indication of when a commercial break starts or stops. While this is much more complicated to implement than watching for blank frames or scene transitions, in theory it's probably the most effective method in some circumstances. In practice, logos are hard to identify on some stations, so the actual implementation can be error prone.
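Here is a rough Python sketch of the "identify the logo, then watch for it" idea. It is a deliberate simplification of my own and should not be read as MythTV's method: it just assumes the logo is made of bright pixels that barely change across a sample of frames taken from the show. The function names and every threshold are hypothetical.

```python
def learn_logo_template(sample_frames, max_range=8, min_brightness=100):
    """Map (row, col) -> average value for pixels that look like a logo.

    A pixel qualifies when it stays bright and nearly constant across
    all sample frames. Thresholds are illustrative assumptions.
    """
    template = {}
    rows, cols = len(sample_frames[0]), len(sample_frames[0][0])
    for r in range(rows):
        for c in range(cols):
            values = [f[r][c] for f in sample_frames]
            steady = max(values) - min(values) <= max_range
            if steady and min(values) >= min_brightness:
                template[(r, c)] = sum(values) / len(values)
    return template


def logo_present(frame, template, tolerance=16, min_match=0.8):
    """True when enough template pixels match the frame."""
    if not template:
        return False  # no logo was learned, so nothing to look for
    hits = sum(
        1 for (r, c), value in template.items()
        if abs(frame[r][c] - value) <= tolerance
    )
    return hits / len(template) >= min_match


# Three tiny 2x2 "show" frames; only the bottom right pixel stays put.
show_frames = [
    [[10, 50], [90, 230]],
    [[200, 20], [30, 232]],
    [[60, 120], [150, 228]],
]
template = learn_logo_template(show_frames)
print(logo_present([[0, 0], [0, 231]], template))     # logo pixel still there
print(logo_present([[40, 40], [40, 40]], template))   # logo pixel gone
```

This also shows why the approach struggles on some stations: a translucent, animated, or dark logo would never pass the "bright and constant" test used here.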
MythTV looks for all three of these indicators to locate commercials. It breaks each show up into scenes, and then applies a series of scores to each scene by looking at all three factors in relation to one another, taking timing and patterns especially into account. Based on its final score, a scene is (essentially) dropped into either the show bucket or the commercial bucket. It's not a black-and-white decision. Because of the scoring, there is a whole range of grays in the middle. You end up with scenes that look "more" like commercials or "more" like show content, and they are flagged accordingly.
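The scoring idea might look something like the Python sketch below. The `Scene` fields, the weights, and the threshold are all invented for illustration. MythTV's real scoring rules are more involved, and I am not reproducing them here.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    length_seconds: float
    starts_after_blank: bool   # a blank frame came right before the scene
    logo_fraction: float       # share of frames where the logo was seen


def commercial_score(scene):
    """Higher scores look more like a commercial. Weights are assumptions."""
    score = 0
    if scene.starts_after_blank:
        score += 2
    if any(abs(scene.length_seconds - s) <= 1 for s in (15, 30, 60)):
        score += 3   # typical ad length
    if scene.logo_fraction < 0.1:
        score += 4   # logo almost never visible
    elif scene.logo_fraction > 0.9:
        score -= 4   # logo almost always visible
    if scene.length_seconds > 120:
        score -= 3   # long scenes are rarely single ads
    return score


def classify(scenes, threshold=3):
    """Drop each scene into the show bucket or the commercial bucket."""
    return [
        "commercial" if commercial_score(s) >= threshold else "show"
        for s in scenes
    ]


scenes = [
    Scene(600, False, 0.97),   # long, logo on screen
    Scene(30, True, 0.0),      # ad length, after a blank, no logo
    Scene(30, True, 0.95),     # ad length, but the logo never left
    Scene(45, False, 0.05),    # odd length, no logo
]
print([commercial_score(s) for s in scenes])
print(classify(scenes))
```

The third scene is the interesting one: two indicators say "commercial" but the logo says "show", and the combined score lands in the gray area on the show side of the threshold.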
I've been quite impressed at the quality of the commercial flagger that MythTV has implemented. In my experience, the system does an excellent job.
Commercial flagging is set globally in:
Utilities/Setup -> Setup -> TV Settings -> General
Do you have ideas or talent that can help increase the quality of this great tool? Check out and contribute to the MythTV commercial flagging developers' wiki.