[Haskell-beginners] Code help requested

Yusaku Hashimoto nonowarn at gmail.com
Tue Jan 12 01:27:12 EST 2010


Hello,

I forked your gist and made some changes: http://gist.github.com/274956

The main change is use foldl' instead of explicit recursion (HLint may
point this out.) This gives us clearness and strictness.

And It runs faster than ruby's one on my 2GHz MacBook =)

I used ruby-1.9.1 for ruby interpreter, and GHC-6.12.1 for haskell compiler.

time ruby ./anagram.rb < /usr/share/dict/words
[["caret", "carte", "cater", "crate", "creat", "creta", "react",
"recta", "trace"], ["ester", "estre", "reest", "reset", "steer",
"stere", "stree", "terse", "tsere"], ["angor", "argon", "goran",
"grano", "groan", "nagor", "orang", "organ", "rogan"]]

real	0m4.620s
user	0m4.473s
sys	0m0.150s

ghc --make anagram
time ./anagram < /usr/share/dict/words
[["caret","carte","cater","crate","creat","creta","react","recta","trace"],["angor","argon","goran","grano","groan","nagor","orang","organ","rogan"],["ester","estre","reest","reset","steer","stere","stree","terse","tsere"]]

real	0m3.797s
user	0m3.613s
sys	0m0.173s

Each output is slightly different because of spec of Hash in ruby 1.9.

see also:
http://www.igvita.com/2009/02/04/ruby-19-internals-ordered-hash/
http://hackage.haskell.org/packages/archive/containers/0.2.0.1/doc/html/Data-Map.html#v:elems

Cheers
-- nwn

On Tue, Jan 12, 2010 at 10:22 AM, Joe Van Dyk <joe at fixieconsulting.com> wrote:
> I've written two versions of the same program, one in ruby and one in
> haskell.  Given words on stdin, find all the anagrams in those words.
> For nicer display, we're only going to display the top 3 results.
>
> I'm obviously new to haskell.  The ruby version runs about 5x as fast
> on a large file.   How can I improve the haskell version?
>
> http://gist.github.com/274774
>
> # Ruby version
> input = STDIN.read.split("\n")
> result = Hash.new([])
> input.each do |word|
>  sorted_word = word.split('').sort.join
>  result[sorted_word] += [word]
> end
> values = result.values.sort { |a, b| b.size <=> a.size }
> p values[0..3]
>
> # Haskell version
> import List
> import qualified Data.Map as Map
>
> -- Given as stdin
> -- presents
> -- serpents
> -- no
> -- on
> -- whatever
> -- Expected Output:
> -- [["serpents","presents"],["on","no"]]
>
> -- This version only displays words that have more than one
> -- match in the list, and sorts by the words that got the most matches.
>
> -- Can we do the map bit better?
>
> main = do
>  input <- getContents
>  print $ anagrams $ lines input
>
> anagrams words =
>  sorted_anagrams
>  where
>    sorted_anagrams = sortBy sorter filtered_anagrams
>    sorter a b = compare (length b)  (length a)
>    filtered_anagrams = Map.elems $ Map.filter filter_function all_anagrams
>    filter_function words = length words > 1
>    all_anagrams = do_anagrams words Map.empty
>    do_anagrams [] result = result
>    do_anagrams words result = do_anagrams
>                                 (tail words)
>                                 (Map.unionWith
>                                   (++)
>                                   (Map.fromList
> [(sorted_current_word, [current_word])])
>                                   result)
>      where
>        current_word = head words
>        sorted_current_word = sort current_word
>
>
>
> --
> Joe Van Dyk
> http://fixieconsulting.com
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://www.haskell.org/mailman/listinfo/beginners
>


More information about the Beginners mailing list