[Haskell-cafe] Re: ANN: bitspeak 0.0.1

Maurí­cio CA mauricio.antunes at gmail.com
Wed Jun 23 22:59:56 EDT 2010


>> Sure, Huffman was actually my first tought. But I couldn't think
>> of a pratical display for the result of Huffman encoding that
>> could be easily followed by a human looking at the screen. Since
>> it's an optimal code, letters would not be grouped in alphabetical
>> order.
>
> There is a compromise.
> There is such a thing as an ORDERED Huffman code.
> Consider a set of strings.
> If they call begin with the same first letter,
> assume that letter and consider the suffixes instead.
> Otherwise, choose a letter L such that
> as close as possible to half of the strings begin
> with a letter preceding L in the alphabet
> as close as possible to half of the strings begin
> with the letter L or a later letter.

I believe that's what I've done. I use this file to give weight
to letters:

   http://bitbucket.org/mauricio/bitspeak/src/tip/src/Corpora.hs

After each letter is selected I keep only sufixes after that
letter, and then I use 'splitData' to find the point where
I'll separate both halves:

   http://bitbucket.org/mauricio/bitspeak/src/tip/src/Main.hs

Best,

Maurício



More information about the Haskell-Cafe mailing list