Lifang Gu


Sep 2002

Video analysis in MPEG compressed domain


The amount of digital video has been increasing dramatically due to technological advances in video capture, storage, and compression. As the explosive growth of the Web has shown, the usefulness of vast repositories of digital information is limited by the effectiveness of the methods used to access them.

The key issues for access methods are content description and information space navigation. While textual documents in digital form are somewhat self-describing (i.e., they provide explicit indices, such as words and sentences, that can be used directly to categorise and access them), digital video does not provide such an explicit content description.

Digital video is a very rich medium, and the characteristics in which users may be interested are quite diverse, ranging from the structure of the video to the identity of the people who appear in it, their movements and dialogues and the accompanying music and audio effects.

Indexing digital video based on its content can be carried out at several levels of abstraction, ranging from high-level indices such as the video program name and the name of the subject to much lower-level aspects of the video, such as the location of edits and its motion properties. Manual video indexing requires the sequential examination of the entire video clip. This is a time-consuming, subjective, and expensive process.

As a result, there is an urgent need for tools to automate the indexing process. In response to this need, various video analysis techniques from the research fields of image processing and computer vision have been proposed to parse, index and annotate the massive amount of digital video data. However, most of these video analysis techniques have been developed for uncompressed video. Since most video data are stored in compressed formats for efficiency of storage and transmission, compressed video must be decompressed before such analysis techniques can be applied. Having to decompress before processing has two consequences: it incurs computation time for the decompression, and it requires extra auxiliary storage.

To save on the computational cost of decompression and reduce the overall size of the data that must be processed, this study attempts to make use of features available in compressed video data and proposes several video processing techniques that operate directly on compressed video data. Specifically, techniques for processing MPEG-1 and MPEG-2 compressed data have been developed to help automate the video indexing process. These cover the tasks of video segmentation (shot boundary detection), camera motion characterisation, and highlights extraction (detection of skin-colour regions, text regions, moving objects and replays) in MPEG compressed video sequences.

Performing analysis on the compressed data has the advantage of dealing with a much reduced data size, and is therefore suitable for computationally intensive low-level operations. Experimental results show that most analysis tasks for video indexing can be carried out efficiently in the compressed domain. Once intermediate results, which are dramatically reduced in size, have been obtained from the compressed domain analysis, partial decompression can be applied to enable high-resolution processing that extracts high-level semantic information.
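To make the idea of working on reduced-size data concrete, the following is a minimal sketch of one of the tasks named above, shot boundary detection. It is an illustration under stated assumptions, not the technique developed in this study. It assumes that a "DC image" has already been extracted for each frame, i.e. one value per 8x8 block that is proportional to the block's average intensity, so that each DC image holds 1/64 of the samples of the full frame. Here a DC image is simply a non-empty 2-D list of integers in the range 0 to 255. The function names, the number of histogram bins and the threshold are all hypothetical choices made for the example.

```python
def histogram(dc_image, bins=16, max_value=256):
    """Normalised intensity histogram of a DC image (non-empty 2-D list of ints)."""
    counts = [0] * bins
    total = 0
    for row in dc_image:
        for value in row:
            # Map the value to a bin, clamping to the last bin for safety.
            index = min(value * bins // max_value, bins - 1)
            counts[index] += 1
            total += 1
    return [c / total for c in counts]


def histogram_difference(h1, h2):
    """Half the L1 distance between two normalised histograms, in [0, 1]."""
    return sum(abs(a - b) for a, b in zip(h1, h2)) / 2


def detect_shot_boundaries(dc_images, threshold=0.5):
    """Return indices of frames whose histogram differs strongly from the previous frame.

    The threshold is an assumed, illustrative value; it is not taken from the study.
    """
    boundaries = []
    previous = None
    for i, image in enumerate(dc_images):
        current = histogram(image)
        if previous is not None and histogram_difference(previous, current) > threshold:
            boundaries.append(i)
        previous = current
    return boundaries


# Usage: two dark frames followed by two bright frames (synthetic data).
dark = [[20] * 4 for _ in range(3)]
bright = [[200] * 4 for _ in range(3)]
print(detect_shot_boundaries([dark, dark, bright, bright]))  # prints [2]
```

A simple histogram comparison of this kind only responds to abrupt changes; gradual transitions and the other tasks listed above would need more than this sketch provides.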

Why my research is important

To be able to effectively access vast repositories of compressed digital video.

In order to access video material effectively, without viewing it in its entirety, it is necessary to analyse and annotate video sequences and to provide an explicit content description targeted to users' needs.


