[Haskell-cafe] Re: Suspected stupid Haskell Question

Thomas Hartman thomas.hartman at db.com
Wed Oct 17 21:48:28 EDT 2007

Since I'm interested in the stack overflow issue, and getting acquainted 
with quickcheck, I thought I would take this opportunity to compare your 
ordTable with some code Yitzchak Gale posted earlier, against Ham's 
original problem.

As far as I can tell, they're the same. They work on lists up to 100000 
element lists of strings, but on 10^6 size lists I lose patience waiting 
for them to finish. 

Is there a more scientific way of figuring out if one version is better 
than the other by using, say profiling tools?

Or by reasoning about the code?



import Data.List
import qualified Data.Map as M
import Control.Arrow
import Test.QuickCheck
import Test.GenTestData
import System.Random

Is there a library function to take a list of Strings and return a list of
ints showing how many times each String occurs in the list.

So for example:

["egg", "egg", "cheese"] would return [2,1] 

testYitzGale n = do
  l <- rgenBndStrRow (10,10) (10^n,10^n)  -- 100000 strings, strings are 
10 chars long, works. craps out on 10^6.
  m <- return $ freqFold l 
  putStrLn $ "map items: " ++ ( show $ M.size m )

testCScherer n = do
  l <- rgenBndStrRow (10,10) (10^n,10^n)  -- same limitations as yitz gale 
  m <- return $ ordTable l 
  putStrLn $ "items: " ++ ( show $ length m )

-- slow for big lists
--freqArr = Prelude.map ( last &&& length ) . group . sort

-- yitz gale code. same as chad scherer code? it's simpler to understand, 
but is it as fast?
freqFold :: [[Char]] -> M.Map [Char] Int
freqFold = foldl' g M.empty
  where g accum x = M.insertWith' (+) x 1 accum
-- c scherer code. insists on ord. far as I can tell, same speed as yitz.
ordTable :: (Ord a) => [a] -> [(a,Int)]
ordTable xs = M.assocs $! foldl' f M.empty xs
    where f m x = let  m' = M.insertWith (+) x 1 m
                       Just v = M.lookup x m'
                  in v `seq` m'

l = ["egg","egg","cheese"]

-- other quickcheck stuff
--prop_unchanged_by_reverse = \l -> ( freqArr (l :: [[Char]]) ) == ( 
freqArr $ reverse l )
--prop_freqArr_eq_freqFold = \l -> ( freqArr (l :: [[Char]]) == (freqFold 
--test1 = quickCheck prop_unchanged_by_reverse
--test2 = quickCheck prop_freqArr_eq_freqFold

--------------- generate test data: 
genBndStrRow (minCols,maxCols) (minStrLen, maxStrLen) = rgen ( genBndLoL 
(minStrLen, maxStrLen) (minCols,maxCols) )

gen gen = do
  sg <- newStdGen
  return $ generate 10000 sg gen

-- generator for a list with length between min and max
genBndList :: Arbitrary a => (Int, Int) -> Gen [a]
genBndList (min,max) = do
  len <- choose (min,max)
  vector len

-- lists of lists
--genBndLoL :: (Int, Int) -> (Int, Int) -> Gen [[a]]
genBndLoL (min1,max1) (min2,max2) = do
  len1 <- choose (min1,max1)
  len2 <- choose (min2,max2)
  vec2 len1 len2

--vec2 :: Arbitrary a => Int -> Int -> Gen [[a]]
vec2 n m = sequence [ vector m | i <- [1..n] ]

Chad Scherrer <chad.scherrer at gmail.com> 
Sent by: haskell-cafe-bounces at haskell.org
10/17/2007 01:35 PM

haskell-cafe at haskell.org

[Haskell-cafe] Re: Suspected stupid Haskell Question

Big_Ham <joymachine2001 <at> hotmail.com> writes:

> Is there a library function to take a list of Strings and return a list 
> ints showing how many times each String occurs in the list.
> So for example:
> ["egg", "egg", "cheese"] would return [2,1]
> I couldn't find anything on a search, or anything in the librarys.
> Thanks BH.

Hi BH,

This might be overkill, but it works well for me. And it avoid stack 
overflows I
was originally getting for very large lists. Dean Herrington and I came up 

ordTable :: (Ord a) => [a] -> [(a,Int)]
ordTable xs = Map.assocs $! foldl' f Map.empty xs
    where f m x = let  m' = Map.insertWith (+) x 1 m
                       Just v = Map.lookup x m'
                  in v `seq` m'

intTable :: [Int] -> [(Int,Int)]
intTable xs = IntMap.assocs $! foldl' f IntMap.empty xs
    where f m x = let  m' = IntMap.insertWith (+) x 1 m
                       Just v = IntMap.lookup x m'
                  in v `seq` m'

enumTable :: (Enum a) => [a] -> [(a,Int)]
enumTable = map fstToEnum . intTable . map fromEnum
    where fstToEnum (x,y) = (toEnum x, y)

If you like, it's easily wrapped in a Table class.


Haskell-Cafe mailing list
Haskell-Cafe at haskell.org


This e-mail may contain confidential and/or privileged information. If you 
are not the intended recipient (or have received this e-mail in error) 
please notify the sender immediately and destroy this e-mail. Any 
unauthorized copying, disclosure or distribution of the material in this 
e-mail is strictly forbidden.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20071017/d9a93c99/attachment-0001.htm

More information about the Haskell-Cafe mailing list