We have discussed Huffman Encoding in a previous post. In this post decoding is discussed.
Input Data : AAAAAABCCCCCCDDEEEEE Frequencies : A: 6, B: 1, C: 6, D: 2, E: 5 Encoded Data : 0000000000001100101010101011111111010101010 Huffman Tree: '#' is the special character used for internal nodes as character field is not needed for internal nodes. #(20) / #(12) #(8) / / A(6) C(6) E(5) #(3) / B(1) D(2) Code of 'A' is '00', code of 'C' is '01', .. Decoded Data : AAAAAABCCCCCCDDEEEEE Input Data : GeeksforGeeks Character With there Frequencies e 10, f 1100, g 011, k 00, o 010, r 1101, s 111 Encoded Huffman data : 01110100011111000101101011101000111 Decoded Huffman Data geeksforgeeks
To decode the encoded data we require the Huffman tree. We iterate through the binary encoded data. To find character corresponding to current bits, we use following simple steps.
- We start from root and do following until a leaf is found.
- If current bit is 0, we move to left node of the tree.
- If the bit is 1, we move to right node of the tree.
- If during traversal, we encounter a leaf node, we print character of that particular leaf node and then again continue the iteration of the encoded data starting from step 1.
The below code takes a string as input, it encodes it and save in a variable encodedString. Then it decodes it and print the original string.
The below code performs full Huffman Encoding and Decoding of a given input data.