[Haskell-cafe] Toy compression algorithms [was: A very edgy language]

Mon Jul 9 16:42:16 EDT 2007

Stefan O'Rear wrote:
> On Mon, Jul 09, 2007 at 09:15:07PM +0100, Andrew Coppin wrote:
>   
>> Oh, sure, the *idea* is simple enough. Trying to actually *implement* it 
>> correctly is something else... ;-)
>>     
>
> Took me about an hour and 50 lines of code (about a year ago - this was
> one of my first Haskell programs) to implement a PPM (de)compressor that
> didn't crash, always generated the same output as input, and achieved
> 50% ratios on its own source code (not quite as good as gzip, but what
> do you expect from a completely untuned compressor?).
>   

...aren't you one of those insanely intelligent people working on GHC? :-P

Actually, I wrote a working compressor in JavaScript. (You know, as an 
interactive web page. No, I don't still have it...) At least, I presume 
it works... I never had occasion to try to decompress the output! (I 
remember it had no underflow handling though...)

Now, writing one that works in binary rather than decimal, and isn't 
absurdly inefficient? (I.e., it *doesn't* work by converting every thing 
to decimal strings, jiggling the characters around, and then parsing 
back to machine integers.) That is something I haven't managed yet...

> Peak throughput: 2 bits / sec. :)
>   

How fast can you click buttons? That was mine's peak output. ;-)

>>> the jhc is very different story
>>>       
>> Yes - last I heard, it's an experimental research project rather than a 
>> production-ready compiler...
>>     
>
> Correct.  It requires 5 minutes and 600MB of RAM to compile Hello,
> World, and fails with internal pattern match errors on anything
> significantly larger.
>   

Ouch. That's pretty advanced... :-D