[Haskell-cafe] Toy compression algorithms [was: A very edgy language]

Bulat Ziganshin bulat.ziganshin at gmail.com
Fri Jul 13 02:45:21 EDT 2007


Hello ajb,

Friday, July 13, 2007, 4:50:15 AM, you wrote:
> To encode a "word" in PPM, you encode the conditional probability
> that it takes to get to the end of the word from the start of the word.
> It looks like you're encoding single characters (and, programmatically,
> you are), but because of the way that arithmetic coding works, it's
> mathematically equivalent to encoding the probability of finding the
> word as a whole.  You're just decomposing the probability of finding
> the word into the probabilities of finding the individual input symbols
> along the way.

> Does that make sense?

It seems that you either misunderstood PPM or are making too broad an
assumption. In classical fixed-order PPM, each character's probability is
estimated in the context of the previous N characters. For example, when
encoding "abcdef.." with order 3, 'd' is encoded in the context of 'abc',
'e' in the context of 'bcd', and so on.
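
To illustrate that counting step, here is a minimal Haskell sketch (not
anyone's actual coder): it collects (order-3 context, next char) frequencies,
which is the statistic a fixed-order PPM model conditions on. The names
contextOf and buildModel are made up for the example.

import qualified Data.Map as M

-- Frequency table: (context, next char) -> count.
type Model = M.Map (String, Char) Int

-- The previous n characters before position i form the context.
contextOf :: Int -> String -> Int -> String
contextOf n s i = drop (max 0 (i - n)) (take i s)

-- Count every (order-n context, following char) pair in the input.
buildModel :: Int -> String -> Model
buildModel n s =
  M.fromListWith (+)
    [ ((contextOf n s i, s !! i), 1) | i <- [0 .. length s - 1] ]

-- With order 3 and input "abcdef": 'd' is counted in context "abc",
-- 'e' in context "bcd", 'f' in context "cde".
main :: IO ()
main = mapM_ print (M.toList (buildModel 3 "abcdef"))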

When you view LZW as a sort of PPM, the context order grows from 0 up to the
size of the current phrase, as shown in the LZRW5 paper (http://www.cbloom.com/src/lzrw.zip).
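
To make the growing-context view concrete, here is a minimal Haskell sketch of
the classic LZW match loop (not the code from the LZRW5 paper): the current
phrase starts as a single character and is extended one character at a time
while it still matches a dictionary entry, so each emitted code effectively
uses a context that grew from order 0 up to the phrase length.

import qualified Data.Map as M

type Dict = M.Map String Int

-- Initial dictionary: every single character already has a code.
initDict :: Dict
initDict = M.fromList [([c], fromEnum c) | c <- ['\0' .. '\255']]

-- Classic LZW: the phrase w grows one character at a time while it is
-- still in the dictionary.
encode :: String -> [Int]
encode input = go initDict "" input
  where
    go d w []       = [d M.! w | not (null w)]
    go d w (c : cs)
      | M.member wc d = go d wc cs                           -- extend phrase
      | otherwise     = d M.! w                              -- emit phrase code
                      : go (M.insert wc (M.size d) d) [c] cs -- learn new phrase
      where wc = w ++ [c]

main :: IO ()
main = print (encode "abababab")   -- [97,98,256,258,98]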

So you may consider LZW either as a "flat static" coder, i.e. an order -1
encoding of *words*, or as a floating-order encoding of characters. You have
just mixed up the two ideas. Don't forget that an order -1 coder isn't PPM -
it is just trivial copying. As I already said, using tries for model
representation means similarity in implementation techniques, but not in
mathematical properties.

Regarding LZW+Huffman - I think the key to efficient Huffman encoding of
English words was sorting the dictionary by word frequency, which is
inappropriate for LZW. Also, unlike an English dictionary, the LZW dictionary
grows dynamically, which means that static Huffman techniques will fail to
remove part of the redundancy.
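
As a rough sketch of that first idea, assuming a hypothetical word-frequency
list (the counts below are made up): a standard Huffman construction over the
counts gives the most frequent words the shortest codes. This is only a
generic Huffman builder, not the coder discussed in the thread.

import Data.List (insertBy, sortBy)
import Data.Ord (comparing)

data Tree a = Leaf Int a | Node Int (Tree a) (Tree a)

weight :: Tree a -> Int
weight (Leaf w _)   = w
weight (Node w _ _) = w

-- Build a Huffman tree by repeatedly merging the two lightest subtrees.
huffman :: [(a, Int)] -> Tree a
huffman xs = go (sortBy (comparing weight) [Leaf w x | (x, w) <- xs])
  where
    go [t]          = t
    go (a : b : ts) = go (insertBy (comparing weight)
                                   (Node (weight a + weight b) a b) ts)
    go []           = error "huffman: empty frequency list"

-- Read off each leaf's code: frequent words end up near the root.
codes :: Tree a -> [(a, String)]
codes (Leaf _ x)   = [(x, "")]
codes (Node _ l r) = [(x, '0' : c) | (x, c) <- codes l]
                  ++ [(x, '1' : c) | (x, c) <- codes r]

-- Made-up word frequencies, just to show the effect of sorting by count.
main :: IO ()
main = mapM_ print (codes (huffman
         [("the", 500), ("of", 300), ("and", 250), ("compression", 3)]))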


-- 
Best regards,
 Bulat                            mailto:Bulat.Ziganshin at gmail.com


