Lossless Compression
Published in Jerry D. Gibson, The Communications Handbook, 2018
The Huffman code relies on knowledge of source statistics for its efficiency. In many applications, the source statistics are not known a priori and the Huffman code is implemented as a two-pass procedure. The statistics of the source are collected in the first pass. These statistics are used to generate the Huffman code, which is then used to encode the source in the second pass. In order to convert this algorithm into a one-pass procedure, the probabilities need to be estimated adaptively and the code altered to reflect these estimates. Theoretically, if we wanted to encode the (k + 1)st symbol using the statistics of the first k symbols, we could recompute the code using the Huffman coding procedure each time a symbol is transmitted. However, this would not be a very practical approach due to the large amount of computation involved. The adaptive Huffman coding procedure is a computationally efficient procedure for estimating the probabilities and updating the Huffman code.
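To make the contrast concrete, the following is a minimal Python sketch (not from the handbook) of the two procedures described above: the two-pass scheme that collects statistics first, and the naive one-pass scheme that rebuilds the code after every symbol. The function names, the convention of initializing every count to one so unseen symbols remain encodable, and the assumption that the alphabet is known in advance are choices made for this example; a practical adaptive Huffman coder (e.g. the FGK or Vitter algorithm) updates the code tree incrementally instead of rebuilding it.

import heapq
from collections import Counter
from itertools import count

def huffman_code(freqs):
    """Build a Huffman codebook {symbol: bitstring} from a frequency map."""
    tiebreak = count()                       # unique sequence numbers so the heap
    heap = [(f, next(tiebreak), s) for s, f in freqs.items()]  # never compares nodes
    heapq.heapify(heap)
    if len(heap) == 1:                       # degenerate single-symbol alphabet
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    codebook = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: a source symbol
            codebook[node] = prefix
    walk(heap[0][2], "")
    return codebook

def two_pass_encode(data):
    """First pass collects the statistics; second pass encodes with the resulting code."""
    code = huffman_code(Counter(data))
    return code, "".join(code[s] for s in data)

def naive_one_pass_encode(data, alphabet):
    """Encode the (k+1)-st symbol with a code rebuilt from the first k symbols.
    Conceptually simple but impractical: a full Huffman rebuild per symbol.
    Adaptive Huffman coding achieves the same goal with incremental tree updates."""
    counts = Counter({s: 1 for s in alphabet})   # start at 1 so unseen symbols are encodable
    out = []
    for sym in data:
        code = huffman_code(counts)
        out.append(code[sym])
        counts[sym] += 1                         # the decoder must mirror this update
    return "".join(out)

For example, naive_one_pass_encode("abracadabra", "abcdr") produces a decodable bitstream only if the decoder repeats the same count update after each decoded symbol, which is exactly the synchronization requirement that makes the one-pass scheme adaptive.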
A lossless compression method for logging data while drilling
Published in Systems Science & Control Engineering, 2021
Shan Song, Taiwei Lian, Wei Liu, Zhengbing Zhang, Mingzhang Luo, Aiping Wu
The existing lossless data compression techniques for logging data transmission include Huffman coding, arithmetic coding (Witten et al., 1987) and the LZW algorithm. Among them, the LZW algorithm is simple to implement, but it requires the encoder and decoder to build the dictionary simultaneously, so its error resilience is poor and it cannot meet the resilience requirements. Arithmetic coding and Huffman coding are both probability-based statistical coding methods that compress data using a probabilistic model. In terms of compression effect, both achieve efficiencies close to the entropy of the source, with no large difference between them (Moffat, 2019). In terms of computational complexity, arithmetic coding requires many multiplications, additions and subtractions, and it places high demands on the accuracy of floating-point calculation (Bodden et al., 2007). Arithmetic coding therefore requires a capable processor, which does not meet the low-power, small-size requirements of the compression module. Huffman coding is divided into non-adaptive and adaptive variants. Adaptive Huffman coding dynamically establishes and updates the probability model of the data during encoding, continually updating the Huffman tree and the Huffman codewords at the same time. Moreover, to decode correctly, the encoder and decoder must update the probability model and Huffman tree synchronously and keep them exactly consistent. If error bits appear in the compressed data during transmission, the decoder cannot decode any of the subsequent data correctly. Therefore, adaptive Huffman coding not only has higher computational complexity than non-adaptive Huffman coding but also has poor error resilience. Non-adaptive Huffman coding, by contrast, uses a fixed probability model, so the Huffman codeword of each symbol is independent of the data already transmitted. Thus, compared with the LZW algorithm, arithmetic coding, adaptive Huffman coding and other adaptive coding algorithms, the error resilience of non-adaptive Huffman coding is better.
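As a minimal illustration (not the authors' implementation), the Python sketch below encodes and decodes with a fixed, pre-agreed Huffman codebook; the symbols and codewords are hypothetical, standing in for a table that would be built offline from a fixed probability model of the logging data. Because the codebook never changes, a transmission error can corrupt some decoded symbols, but it cannot make the decoder's probability model diverge from the encoder's the way an adaptive model can.

# A fixed, pre-agreed codebook (hypothetical values): a Huffman code built
# offline from a fixed probability model and shared by encoder and decoder.
FIXED_CODEBOOK = {"a": "0", "b": "10", "c": "110", "d": "111"}
DECODE_TABLE = {bits: sym for sym, bits in FIXED_CODEBOOK.items()}

def encode(symbols):
    """Concatenate fixed codewords; no model is updated during encoding."""
    return "".join(FIXED_CODEBOOK[s] for s in symbols)

def decode(bitstream):
    """Accumulate bits until they match a codeword; valid because the code is prefix-free."""
    out, buf = [], ""
    for bit in bitstream:
        buf += bit
        if buf in DECODE_TABLE:
            out.append(DECODE_TABLE[buf])
            buf = ""
    return "".join(out)

# Example: encode("abad") -> "0100111", and decode("0100111") -> "abad".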