PhD Defense
Tong Qiu
High Order Context Modeling Entropy Coding of Multimedia Data
Time: 1:30 p.m.
Place: Middlesex College, Room 320
Supervisor: Dr. Xiaolin Wu
Thesis Examiners: Dr. Robert Webber, Dr. Mahmoud El-Sakka
Extra-Departmental Examiner: Dr. Jin Jiang (Engineering)
External Examiner: Dr. Nasir Memon (Polytechnic University, New York)
Abstract:
Recent years have seen explosive use of multimedia data in Internet applications, including multimedia streaming, online digital libraries, teleconferencing, and wireless communications. Because the volume of digital media is huge and growing daily, placing a punishing burden on communication bandwidth and data storage, there is a pressing need for multimedia data compression. At the heart of any multimedia data compression system is a module called statistical context modeling entropy coding.
Statistical context modeling uses future and past information to estimate the conditional probability of the event being coded. The goal of context modeling is to obtain a skewed probability distribution that assigns high probability to the symbols that actually occur frequently, so that fewer bits are needed to encode them. In theory, conditioning on a context cannot increase entropy, so context modeling can bring the coding rate below the first-order entropy, shortening the average code length and improving the compression ratio.
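The effect of conditioning on a context can be checked empirically. The following is a minimal sketch, not the estimator used in the thesis: it compares the entropy of a symbol sequence measured without any context against the entropy measured given the preceding symbols. The function names and the choice of "previous symbols" as the context are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def entropy(counts):
    """Empirical entropy in bits of a table of symbol counts."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total)
                for c in counts.values() if c > 0)

def first_order_entropy(seq):
    """Entropy of the symbols with no context at all."""
    return entropy(Counter(seq))

def conditional_entropy(seq, order=1):
    """Entropy of each symbol given the `order` symbols before it
    (hypothetical context choice, for illustration only)."""
    ctx_counts = defaultdict(Counter)
    for i in range(order, len(seq)):
        ctx = tuple(seq[i - order:i])
        ctx_counts[ctx][seq[i]] += 1
    total = sum(sum(c.values()) for c in ctx_counts.values())
    if total == 0:
        return 0.0
    # Weight each context's entropy by how often that context occurs.
    return sum(sum(c.values()) / total * entropy(c)
               for c in ctx_counts.values())

seq = "ab" * 50
print(first_order_entropy(seq), conditional_entropy(seq, order=1))
```

For the alternating sequence above, the symbols look like fair coin flips when taken alone (about 1 bit per symbol), yet each symbol is fully determined by its predecessor, so the conditional entropy is zero.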
This thesis presents the basic concepts of high-order context modeling and its applications in multimedia data compression, including still image coding, video coding, and audio coding. It studies a universal high-order context modeling method that can be used with any entropy codec to improve coding performance. The method uses prior knowledge to estimate the conditional probability of the symbol currently being coded. With the construction method proposed in the thesis, this prior knowledge is available to both the encoder and the decoder. The context model estimates the probabilities on the fly, so no side information needs to be transmitted. The underlying statistical model is self-learning: the probabilities are updated during the actual encoding and decoding process.
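The idea of a self-learning model that needs no side information can be sketched as follows. This is an illustrative count-based model, not the construction proposed in the thesis: the class name, the Laplace-style initial counts, and the use of the preceding symbols as context are all assumptions. Because the model is updated only with symbols that have already been coded, a decoder running the same code stays in step with the encoder.

```python
import math
from collections import defaultdict

class AdaptiveContextModel:
    """Hypothetical adaptive model: one count table per context."""

    def __init__(self, alphabet, order):
        self.alphabet = list(alphabet)
        self.order = order
        # Every context starts with a count of 1 per symbol (assumption),
        # so no probability is ever zero.
        self.counts = defaultdict(lambda: {s: 1 for s in self.alphabet})

    def context(self, history):
        """Context = the last `order` already-coded symbols."""
        return tuple(history[-self.order:]) if self.order > 0 else ()

    def probability(self, ctx, symbol):
        table = self.counts[ctx]
        return table[symbol] / sum(table.values())

    def update(self, ctx, symbol):
        self.counts[ctx][symbol] += 1

def ideal_code_length(seq, alphabet, order):
    """Bits an ideal entropy coder would spend when driven by the model."""
    model = AdaptiveContextModel(alphabet, order)
    history, bits = [], 0.0
    for symbol in seq:
        ctx = model.context(history)
        bits += -math.log2(model.probability(ctx, symbol))
        model.update(ctx, symbol)   # learn only after coding the symbol
        history.append(symbol)
    return bits

seq = "ab" * 50
print(ideal_code_length(seq, "ab", order=0))
print(ideal_code_length(seq, "ab", order=1))
```

On this toy input the order-1 model spends far fewer bits than the order-0 model, because after a few symbols each context has learned which symbol follows it.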
The front end of the codecs discussed in this thesis is a wavelet transform, which maps data from the time/spatial domain into the frequency domain. Because the wavelet transform is localized in both the time domain and the frequency domain, it provides more useful information for high-order context modeling, such as different feature orientations in different subbands, coarse-to-fine data decomposition, and energy packing in the lower-frequency subbands.
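Energy packing can be illustrated with a single decomposition level. The abstract does not say which wavelet filters the thesis uses, so the sketch below assumes the orthonormal Haar wavelet purely because it is the simplest case; `haar_step` and `energy` are hypothetical helper names.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform on an even-length list.
    Returns (approximation, detail) subbands, each half the input length."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def energy(values):
    return sum(v * v for v in values)

approx, detail = haar_step([4, 4, 5, 5, 6, 6, 7, 7])
print(energy(approx), energy(detail))
```

For this slowly varying input all of the energy lands in the low-frequency (approximation) subband and the detail coefficients are zero, which is the kind of structure a context model can exploit.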
At the same time, the wavelet transform allows the codecs to encode the coefficients in an embedded fashion, which gives the codecs an important scalability feature. Instead of coding one coefficient at a time, embedded coding encodes the coefficients bitplane by bitplane, from the most significant bit to the least significant bit of their binary representations. Decoding can stop at any point, and an approximation of the data can still be reconstructed from the partial bit values of the coefficients. This scalability makes the codec more attractive and useful for Internet applications such as progressive image/video transmission.
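The bitplane ordering described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the thesis codec: coefficients are integers, signs are kept in a separate list, and the bits are not entropy coded. The function names are hypothetical.

```python
def to_bitplanes(coeffs, num_planes):
    """Split coefficient magnitudes into bitplanes, most significant first."""
    mags = [abs(c) for c in coeffs]
    return [[(m >> p) & 1 for m in mags]
            for p in range(num_planes - 1, -1, -1)]

def reconstruct(planes, num_planes, signs):
    """Rebuild coefficients from however many leading bitplanes arrived."""
    mags = [0] * len(signs)
    for k, plane in enumerate(planes):
        shift = num_planes - 1 - k          # weight of this bitplane
        for i, bit in enumerate(plane):
            mags[i] |= bit << shift
    return [s * m for s, m in zip(signs, mags)]

coeffs = [13, -6, 3, 0]
signs = [-1 if c < 0 else 1 for c in coeffs]
planes = to_bitplanes(coeffs, num_planes=4)

print(reconstruct(planes[:2], 4, signs))   # [12, -4, 0, 0]
print(reconstruct(planes, 4, signs))       # [13, -6, 3, 0]
```

Stopping after two of the four bitplanes already yields a coarse approximation of the large coefficients, and each further bitplane refines it. A practical embedded coder would also entropy code the bits and typically sends each sign when its coefficient first becomes significant.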
Because the wavelet transform decomposes the data into a natural hierarchy of subbands, a tree structure is normally used to represent the relationships between the subbands. This thesis studies a quadtree structure that differs from zerotree and SPIHT. It provides information to the encoder and decoder so that small coefficients can be skipped at the higher bitplanes, which saves time and shortens the code length.
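How a quadtree lets a coder skip regions of small coefficients can be shown with a generic significance map. The abstract does not describe the thesis's quadtree in detail, so this sketch is a textbook-style assumption rather than the proposed structure: a block is flagged 0 if every coefficient in it is below the current bitplane threshold, and is subdivided into four quadrants otherwise.

```python
def quadtree_significance(block, threshold, out=None):
    """Emit one flag per visited quadtree node for a square block whose
    side is a power of two. 0 = whole block insignificant (skipped),
    1 = block contains a coefficient with magnitude >= threshold."""
    if out is None:
        out = []
    n = len(block)
    significant = any(abs(v) >= threshold for row in block for v in row)
    out.append(1 if significant else 0)
    if significant and n > 1:
        h = n // 2
        # Visit the quadrants in a fixed order known to the decoder.
        for r0, c0 in ((0, 0), (0, h), (h, 0), (h, h)):
            sub = [row[c0:c0 + h] for row in block[r0:r0 + h]]
            quadtree_significance(sub, threshold, out)
    return out

block = [
    [9, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 2],
]
print(quadtree_significance(block, threshold=8))
```

With a threshold of 8, the map for this block needs 9 flags instead of 16 per-coefficient bits, because three of the four quadrants are dismissed with a single 0 each.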
This thesis also presents applications of high-order context modeling in actual codecs, including still image coding, video coding, and audio coding. For still image coding, the high-order context modeling is designed to reflect the different feature orientations of the different subbands. For video coding, it is designed to take advantage of the neighboring coefficients in the previous and next frames. Two audio coding methods are presented: predictive audio coding and wavelet packet audio coding. Predictive audio coding uses linear prediction to predict the sample values, with error feedback used to adjust the prediction errors. Wavelet packet audio coding uses the wavelet packet transform to decompose the audio signal from the time domain into the frequency domain. In both methods, the prediction errors or wavelet coefficients are entropy coded using high-order context modeling.
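The predictive audio coding idea can be sketched with a fixed linear predictor. The abstract does not give the predictor order or coefficients, and the error-feedback step is omitted here, so everything below is an assumption for illustration: each sample is predicted as a straight-line extrapolation of the two previous samples, and only the prediction error is passed on to the entropy coder.

```python
def _predict(prev):
    """Hypothetical fixed predictor: extrapolate from up to two past samples."""
    if len(prev) >= 2:
        return 2 * prev[-1] - prev[-2]
    if len(prev) == 1:
        return prev[-1]
    return 0

def prediction_residuals(samples):
    """Encoder side: replace each sample by its prediction error."""
    return [x - _predict(samples[:n]) for n, x in enumerate(samples)]

def reconstruct_samples(residuals):
    """Decoder side: rebuild the samples from the prediction errors."""
    out = []
    for e in residuals:
        out.append(e + _predict(out))
    return out

samples = [0, 2, 4, 6, 8, 10]
res = prediction_residuals(samples)
print(res)                        # [0, 2, 0, 0, 0, 0]
print(reconstruct_samples(res))   # [0, 2, 4, 6, 8, 10]
```

The residuals cluster around zero for smooth signals, which gives the skewed distribution that context modeling and entropy coding benefit from, and the decoder recovers the samples exactly because it forms the same predictions from already decoded data.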
Based on the studies in this thesis, we conclude that high-order context modeling provides better probability estimates, which an entropy coder can use to improve coding efficiency.

