The key to the high performance of video compression lies in an efficient reduction of the temporal redundancy. For this purpose, the block-based motion estimation (BBME) technique has been successfully applied in the video compression standards from H.261 to H.264 [1]. The Exhaustive Search (ES) algorithm is the most computationally expensive block matching algorithm of all. It is also known as the Full Search Algorithm (FSA). This algorithm calculates the cost function at each possible location in the search window. Consequently, many fast motion estimation algorithms with a reduced number of search locations have been proposed.
In this project, you are required to write a summary (within 2 pages) to:
1) give a summarized review of existing fast motion estimation algorithms
Answer:
With the increasing popularity of technologies such as Internet streaming video and video conferencing, video compression has become an essential component of broadcast and entertainment media. Motion Estimation (ME) and compensation techniques, which effectively eliminate temporal redundancy between adjacent frames, have been widely applied in popular video coding standards such as MPEG-2 and MPEG-4. Traditional fast block matching algorithms are easily trapped in local minima, which degrades the decoded video quality to some extent.
Video compression can be achieved by exploiting the redundancy and irrelevancy that exist in a typical video signal. The redundancy in a video signal takes two forms. The first is the spatial redundancy that exists within each frame. The second is temporal redundancy: most of the time, a video frame is very similar to its immediate neighbors. Temporal redundancy can be eliminated by motion estimation and compensation. Another goal of video compression is to reduce the irrelevancy in the video signal, that is, to code only the video features that are perceptually important and not to waste valuable bits on information that is perceptually unimportant. Identifying and reducing the redundancy in a video signal is relatively straightforward. Identifying what is perceptually relevant is much harder, so irrelevancy is difficult to exploit; doing so requires appropriate models of the human visual system.
Successive video frames may contain the same objects (still or moving). Motion estimation examines the movement of objects in an image sequence to obtain vectors representing the estimated motion. Motion compensation uses this knowledge of object motion to achieve data compression. In inter-frame coding, motion estimation and compensation have become powerful techniques for eliminating the temporal redundancy caused by the high correlation between consecutive frames. In real video scenes, motion can be a complex combination of translation and rotation. Such motion is difficult to estimate and may require large amounts of processing. Translational motion, however, is easily estimated and has been used successfully for motion-compensated coding.
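As a rough illustration of how motion compensation turns known motion into a smaller signal to code, the Python sketch below builds a predicted frame from per-block motion vectors and compares its residual with a plain frame difference. The function names, the 8×8 toy frames, the 4×4 block size, and the hand-picked vectors are all assumptions made for this example; a real encoder would obtain the vectors from motion estimation.

```python
def motion_compensate(ref, vectors, n):
    """Build a predicted frame: each n x n block is copied from `ref`
    at the position displaced by that block's motion vector (dx, dy)."""
    height, width = len(ref), len(ref[0])
    pred = [[0] * width for _ in range(height)]
    for (bx, by), (dx, dy) in vectors.items():
        for y in range(n):
            for x in range(n):
                pred[by + y][bx + x] = ref[by + dy + y][bx + dx + x]
    return pred


def residual(cur, pred):
    """Per-pixel prediction error between the current frame and a prediction."""
    return [[c - p for c, p in zip(cr, pr)] for cr, pr in zip(cur, pred)]


def energy(diff):
    """Sum of absolute values: a crude measure of how much is left to code."""
    return sum(abs(v) for row in diff for v in row)


# Toy 8 x 8 frames (assumed data): a bright 2 x 2 object moves one pixel left.
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
for y in (1, 2):
    for x in (2, 3):
        ref[y][x] = 200
    for x in (1, 2):
        cur[y][x] = 200

# One (dx, dy) vector per 4 x 4 block, pointing from the block's position in
# the current frame to its match in the reference frame. Only the top-left
# block contains motion.
vectors = {(0, 0): (1, 0), (4, 0): (0, 0), (0, 4): (0, 0), (4, 4): (0, 0)}

pred = motion_compensate(ref, vectors, 4)
print(energy(residual(cur, ref)))   # plain frame difference
print(energy(residual(cur, pred)))  # motion-compensated residual
```

For these toy frames the plain frame difference has an energy of 800, while the motion-compensated residual is zero, because a purely translational motion is captured entirely by the vector.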
In video coding, the full search (FS) algorithm based on block matching finds the optimal motion vectors, i.e. those that minimize the matching difference between the reference block and the candidate blocks in the search area. The FS algorithm has been widely used in video coding applications because it is simple and easy to implement in hardware. However, its high computational cost when the search area is very large is a serious obstacle to fast real-time video coding, as mentioned in the problem statement.
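The exhaustive search described above can be sketched in a few lines of Python. The sum of absolute differences (SAD) is assumed here as the matching cost, since the text does not name one; the function names, the block size n, the search range p, and the synthetic frames are likewise illustrative.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, p):
    """Evaluate every displacement in [-p, p] x [-p, p]; return (dx, dy, cost)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Synthetic 16 x 16 reference frame (assumed data) and a current frame whose
# content is the reference shifted by 2 pixels in x and 1 pixel in y.
size = 16
ref = [[x * x + 3 * y * y for x in range(size)] for y in range(size)]
cur = [[ref[min(y + 1, size - 1)][min(x + 2, size - 1)] for x in range(size)]
       for y in range(size)]

mv = full_search(cur, ref, bx=4, by=4, n=4, p=3)
print(mv)  # (2, 1, 0): the true displacement, with zero matching cost
```

With a search range of ±p pixels this evaluates up to (2p + 1)² candidates per block, for example 225 for p = 7, which is the cost that fast algorithms try to avoid.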
Several fast motion estimation algorithms have been studied in recent years in order to reduce this computational cost. These algorithms can be classified into two main groups. The first group is based on lossy motion estimation techniques, which degrade the prediction quality compared with the conventional FS algorithm. The second group is based on lossless estimation techniques, which do not degrade the prediction quality.
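The text does not name specific algorithms, so the sketch below picks one well-known representative of each group as an assumption: full search with partial distortion elimination (lossless, because every candidate is still visited and only hopeless ones are abandoned early) and the three-step search (lossy, because most candidates are never tested). The SAD cost, the function names, and the synthetic frames are illustrative choices, not taken from the source.

```python
def block_sad(cur, ref, bx, by, dx, dy, n, limit=None):
    """SAD of the n x n block; stops after any row once the partial sum has
    reached `limit`, since such a candidate cannot beat the current best."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        if limit is not None and total >= limit:
            return total
    return total


def inside(ref, bx, by, dx, dy, n):
    """True if the displaced block lies fully inside the reference frame."""
    return 0 <= bx + dx <= len(ref[0]) - n and 0 <= by + dy <= len(ref) - n


def full_search_pde(cur, ref, bx, by, n, p):
    """Lossless: same result as full search, with early-exit cost evaluation."""
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            if not inside(ref, bx, by, dx, dy, n):
                continue
            limit = None if best is None else best[2]
            cost = block_sad(cur, ref, bx, by, dx, dy, n, limit)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def three_step_search(cur, ref, bx, by, n, step=4):
    """Lossy: test up to 8 neighbours at distance `step`, move to the best
    point, halve the step, and repeat until the step drops below 1."""
    cx, cy = 0, 0
    best_cost = block_sad(cur, ref, bx, by, 0, 0, n)
    while step >= 1:
        best_here = (cx, cy, best_cost)
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                u, v = cx + dx, cy + dy
                if (dx, dy) == (0, 0) or not inside(ref, bx, by, u, v, n):
                    continue
                cost = block_sad(cur, ref, bx, by, u, v, n)
                if cost < best_here[2]:
                    best_here = (u, v, cost)
        cx, cy, best_cost = best_here
        step //= 2
    return cx, cy, best_cost


# Synthetic 32 x 32 frames (assumed data): content shifted by (2, 1).
size = 32
ref = [[x * x + 3 * y * y for x in range(size)] for y in range(size)]
cur = [[ref[min(y + 1, size - 1)][min(x + 2, size - 1)] for x in range(size)]
       for y in range(size)]

fs = full_search_pde(cur, ref, bx=12, by=12, n=8, p=7)
tss = three_step_search(cur, ref, bx=12, by=12, n=8)
print("full search with early exit:", fs)
print("three-step search:", tss)
```

On this synthetic frame the early-exit full search returns the true displacement (2, 1) with zero cost, while the three-step search stops at a different vector with a nonzero cost. This is an instance of the local-minimum problem mentioned earlier: the three-step search tests at most 25 positions instead of 225, but it can miss the global minimum.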
Different search algorithms are used to estimate motion between frames. When motion estimation is performed by an MPEG-2 encoder, it groups pixels into 16×16 macroblocks. MPEG-4 AVC encoders can divide these macroblocks into partitions as small as 4×4, and the partitions can even vary in size within the same macroblock. Partitions allow for more accurate motion estimation because areas with high motion can be isolated from those with less movement.
ref:
https://www.researchgate.net/publication/220882072_A_Fast_Motion_Estimation_Algorithm_for_H264
https://core.ac.uk/download/pdf/53187201.pdf
https://ieeexplore.ieee.org/document/4712175
https://link.springer.com/chapter/10.1007/11867586_53