/\ As builder said five months ago (earlier in this topic), the only way to know for sure whether there will be any size improvement is to test it out.
As for ralphdspam's question: I'm not entirely sure what it is you're asking (and I'm sure somebody else will have questions about other things at some point), so I'll just go through a quick explanation of Huffman.
Simple Huffman compression is very easy to read and write. First, go through the data you wish to compress and count the number of times each byte value appears (this can be done with values of two or more bytes as well, I suppose, but you aren't likely to gain any space from it). Once you have these counts, store the byte values, ordered from most frequent to least frequent, in a lookup table which will be accessed by your compressor/decompressor later on.
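The counting step above can be sketched in a few lines of Python. This is only an illustration: the name `build_table` is made up for this example, and ordering the table from most to least frequent (ties broken by byte value) is my assumption about what "in order" means here.

```python
from collections import Counter

def build_table(data: bytes) -> list[int]:
    """Return the distinct byte values in data, most frequent first."""
    counts = Counter(data)  # byte value -> number of occurrences
    # Sort by descending count; break ties by byte value so that the
    # compressor and decompressor always agree on the same table.
    return sorted(counts, key=lambda b: (-counts[b], b))

table = build_table(b"aaaabbbc")
print(table)
```

The decompressor needs exactly the same table, so it has to be stored alongside the compressed data (or agreed on in advance).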
Reading and writing are then very simple, if you choose to use the quick and easy method:
each of your byte values will be stored in your compressed data, not as fixed-length chunks, but as variable-length, zero-terminated strings of bits (a run of 1s ended by a single 0). This means that, to your decompressor,
10
will read as byte value number one in your look-up table,
110
will read as the second,
1110
as the third, and so on.
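Here is a rough sketch of that quick and easy method in Python. Bits are kept as a string of '0' and '1' characters to keep it readable; packing them into real bytes is left out. The names `encode` and `decode` are hypothetical, and `table` is assumed to be the look-up table described above. Note that this is the simple run-of-1s code from this post, not a full Huffman tree.

```python
def encode(data: bytes, table: list[int]) -> str:
    """Write each byte as (position in table) 1s followed by a 0."""
    rank = {value: i + 1 for i, value in enumerate(table)}  # 1-based position
    return "".join("1" * rank[b] + "0" for b in data)

def decode(bits: str, table: list[int]) -> bytes:
    """Count 1s until a 0 is seen, then look the count up in the table."""
    out = bytearray()
    ones = 0
    for bit in bits:
        if bit == "1":
            ones += 1
        else:
            if ones == 0:
                raise ValueError("code with no 1s before the terminating 0")
            out.append(table[ones - 1])
            ones = 0
    return bytes(out)

table = [ord("a"), ord("b"), ord("c")]
packed = encode(b"abc", table)
print(packed)                 # "10" + "110" + "1110"
print(decode(packed, table))
```

Since entry number n costs n+1 bits, this only saves space when a handful of values make up most of the data.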
If you're really looking to save some space on redundancies in your data then it's possible to use these methods alongside Run Length Encoding (like builder hinted at [also five months ago]), which appends after each value a second value telling how many times that value occurs in a row. Therefore, using full bytes: 00000001 11111111, or 01 FF, would be read by a decompressor as a string of 255 bytes of value 01.
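A minimal byte-level sketch of Run Length Encoding, assuming each pair is (value, count) and that the count is the total length of the run, capped at 255 so it fits in one byte. The function names are made up for this example.

```python
def rle_encode(data: bytes) -> bytes:
    """Turn each run of equal bytes into a (value, count) pair."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        # Extend the run while the next byte matches, up to 255.
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([data[i], run])
        i += run
    return bytes(out)

def rle_decode(packed: bytes) -> bytes:
    """Expand (value, count) pairs back into the original bytes."""
    out = bytearray()
    for value, count in zip(packed[0::2], packed[1::2]):
        out += bytes([value]) * count
    return bytes(out)

print(rle_encode(b"aaabcc").hex(" "))
print(len(rle_decode(bytes([0x01, 0xFF]))))
```

With this layout the pair 01 FF expands to the byte 01 repeated 255 times. Data with few repeats gets bigger, since every lone byte now costs two.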
To use this alongside Huffman compression, one can apply the same idea but, instead of byte-sized chunks, use Huffman-style variable-length strings:
110 1110
for example, when using the quick and easy pattern from above, would be translated as 3 consecutive 'value # 2's from your look-up table.
11110 110
would be 2 'value # 4's,
111110 1111110
would be 6 'value #5's, and so on...
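The three examples above can be checked with a short sketch. Again the bits are kept as a text string, the names `encode_runs` and `decode_runs` are made up for this example, and the five-entry table is just a stand-in for your real look-up table.

```python
def encode_runs(data: bytes, table: list[int]) -> str:
    """Write each run as a value code followed by a count code."""
    rank = {value: i + 1 for i, value in enumerate(table)}  # 1-based position
    bits = []
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i]:
            run += 1
        bits.append("1" * rank[data[i]] + "0" + "1" * run + "0")
        i += run
    return "".join(bits)

def decode_runs(bits: str, table: list[int]) -> bytes:
    """Read codes in pairs: table position first, then repeat count."""
    # Every code ends in a 0, so splitting on "0" leaves the runs of 1s
    # (plus one empty piece after the final 0, which is dropped).
    lengths = [len(code) for code in bits.split("0")[:-1]]
    out = bytearray()
    for position, count in zip(lengths[0::2], lengths[1::2]):
        if position == 0 or count == 0:
            raise ValueError("empty code in bit stream")
        out += bytes([table[position - 1]]) * count
    return bytes(out)

table = [ord(c) for c in "abcde"]  # stand-in table: value #1 is 'a', #2 is 'b', ...
print(decode_runs("1101110", table))        # 110 1110
print(decode_runs("11110110", table))       # 11110 110
print(decode_runs("1111101111110", table))  # 111110 1111110
```

One thing to watch: a value that appears only once still needs a count code (10), so this pays off only when runs are common.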
Have fun playing around with data!
http://www.omnimaga.org/index.php?topic=4545.msg120038#msg120038