[Haskell-beginners] How would you improve this program?

Tue Oct 11 13:04:23 CEST 2011

Your main function calls printTokens, which in turn calls printToken.
Both of the print* functions have type IO (). I consider that bad
style.

It would be cleaner to build the string to be printed in pure
functions and then print that string directly in main. That way the
IO actions are confined to main.

I'm thinking something like this:

tokenToString :: Int -> Int -> Token -> String
tokenToString maxLength maxCount (Token w c) =
    ...

tokenListToString :: [Token] -> String
tokenListToString tokens =
    join "\n" result -- join is from Data.List.Utils (MissingH); Data.List.intercalate does the same
    where
        result = map (tokenToString maxLength maxCount) sortedTokens
        ...

main :: IO ()
main = do
    -- named input rather than words, to avoid shadowing Prelude's words
    input <- getContents
    let output = tokenListToString $ countTokens input
    putStr output

This way your function types actually mean something: a function
returning String can only compute a string, whereas a function of type
IO () could do whatever it likes.
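
To make the outline above concrete, here is a minimal runnable sketch. It
is only one way to fill in the elided parts, not your original code: the
fields of Token (a word and its count), the body of countTokens (the
sort-and-group approach), the output layout, and the helpers padRight and
padLeft are all assumptions made for illustration. It uses intercalate
from Data.List so that only the base library is needed.

```haskell
import Data.List (group, intercalate, sort, sortBy)
import Data.Ord (comparing, Down (..))

-- Assumed shape of Token: a word and the number of times it occurs.
data Token = Token String Int
    deriving (Eq, Show)

-- Hypothetical stand-in for countTokens: sort the words, then group
-- equal neighbours and count each group.
countTokens :: String -> [Token]
countTokens input =
    [ Token w (length ws) | ws@(w:_) <- group (sort (words input)) ]

-- Illustrative helpers: pad a string with spaces up to width n.
padRight, padLeft :: Int -> String -> String
padRight n s = s ++ replicate (n - length s) ' '
padLeft  n s = replicate (n - length s) ' ' ++ s

-- Pure: one output line, with the word column sized by maxLength and
-- the count column sized by the number of digits in maxCount.
tokenToString :: Int -> Int -> Token -> String
tokenToString maxLength maxCount (Token w c) =
    padRight maxLength w ++ " " ++ padLeft (length (show maxCount)) (show c)

-- Pure: all lines, most frequent word first. sortBy is stable, so
-- words with equal counts stay in alphabetical order.
tokenListToString :: [Token] -> String
tokenListToString tokens =
    intercalate "\n" result
    where
        result       = map (tokenToString maxLength maxCount) sortedTokens
        sortedTokens = sortBy (comparing (\(Token _ c) -> Down c)) tokens
        maxLength    = maximum (0 : [ length w | Token w _ <- tokens ])
        maxCount     = maximum (0 : [ c | Token _ c <- tokens ])

-- The only IO in the program: read stdin, print the result.
main :: IO ()
main = do
    input <- getContents
    putStrLn (tokenListToString (countTokens input))
```

Because everything except main is pure, you can try the functions in
GHCi without touching stdin, for example by evaluating
tokenListToString (countTokens "b a b ccc").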

ovidiu

On Sun, Oct 9, 2011 at 11:11 PM, Lorenzo Bolla <lbolla at gmail.com> wrote:
> Hi all,
> I'm new to Haskell and I'd like you to take a look at one of my programs and
> tell me how you would improve it (in terms of efficiency, style, and so
> on!).
> The source code is
> here: https://github.com/lbolla/stanford-cs240h/blob/master/lab1/lab1.hs
> The program is an implementation of this
> problem: http://www.scs.stanford.edu/11au-cs240h/labs/lab1.html (basically,
> counting how many times a word appears in a text.)
> (I'm not a Stanford student, so by helping me out you won't help me to cheat
> on my exam, don't worry!)
> I've implemented 3 versions of the algorithm:
>
> - a Haskell version using the standard "sort": read all the words from
>   stdin, sort them and group them.
> - a Haskell version using map: read all the words from stdin, stick each
>   word in a Data.Map incrementing a counter if the word is already present
>   in the map.
> - a Python version using defaultdict.
>
> I timed the different versions and the results are
> here: https://github.com/lbolla/stanford-cs240h/blob/master/lab1/times.png.
> The Python version is the quickest (I stripped out the fancy formatting
> before benchmarking, so IO is not responsible for the time difference).
> Any comments on the graph, too?
> Thanks a lot!
> L.