[Haskell-cafe] Nice way to calculate character frequency in a
iwin_1 at hotmail.com
Tue Oct 25 12:49:17 EDT 2005
It seems like the fastest way to build a list of frequency from a string is
import Data.Array (assocs, accumArray, Array)
frequency :: String -> [(Char,Int)]
frequency = filter (\f -> snd f>0) . assocs . accumArray (+) 0 ('\0',
'\255') . map (\x->(x,1))
-- (~1.3 sec on a string of 700ko)
But if we want to make this polymorphic we have to do this :
import Data.Ix (Ix)
frequency :: (Ix a, Bounded a) => [a] -> [(a,Int)]
frequency = filter (\f -> snd f>0) . assocs . accumArray (+) 0 (minBound,
maxBound) . map (\x->(x,1))
-- (>5 sec on a string of 700ko)
but it's really much slower, it seems like min/maxBound are check each time
or I dont know, and it's restrictive on the type of the elements of the
The best polymorphic version seems to be :
import Data.Map as Map (lookup, assocs, empty, insertWith)
frequency :: (Ord a) => [a] -> [(a,Int)]
frequency xs = Map.assocs $ foldl f Map.empty xs
where f m x = let
m' = Map.insertWith (+) x 1 m
Just v = Map.lookup x m'
in seq v m'
-- (~2.1 sec on a string of 700ko)
Which is not as fast as the specialized version on string, but really faster
and less restrictive than the second.
Thanks everyone for all your answers !
More information about the Haskell-Cafe