[Haskell-cafe] Re: can there be (hash-table using) O(n) version
of this (I think currently) n log n algo?
Matt Morrow
moonpatio at gmail.com
Wed Dec 9 18:06:31 EST 2009
Never underestimate teh power of the Int{Set,Map}:
{-# LANGUAGE BangPatterns #-}
import Data.Set(Set)
import Data.IntSet(IntSet)
import qualified Data.Set as S
import qualified Data.IntSet as IS
import Control.Parallel.Strategies(rnf)
import Data.Monoid(Monoid(..))
import Data.List
findsumsIS :: [Int] -> Int -> IntSet
findsumsIS xs wanted = snd . foldl' f mempty $ xs
where f (!candidates,!successes) next =
let x = wanted - next
in case x `IS.member` candidates of
True -> (candidates, IS.insert next successes)
False -> (IS.insert next candidates,successes)
-- (i had to add bangs in f since it was blowing the stack)
findsums :: [Int] -> Int -> Set (Int,Int)
findsums xs wanted = snd . foldl' f (S.empty,S.empty) $ xs
where f (!candidates,!successes) next =
if S.member (wanted-next) candidates
then (candidates, S.insert (next,wanted-next) successes)
else (S.insert next candidates,successes)
{-
Note that the list has 10 million elements,
(time is roughly 0.4s with 1 million with IntSet).
-}
{-
main = do
let s = findsums (take 10000000 (cycle [1..999])) 500
print (rnf s `seq` ())
[m at monire ~]$ time ./FindSums
()
real 0m8.793s
user 0m8.762s
sys 0m0.022s
-}
{-
main = do
let s = findsumsIS (take 10000000 (cycle [1..999])) 500
print (rnf s `seq` ())
[m at monire ~]$ time ./FindSumsIS
()
real 0m4.488s
user 0m4.459s
sys 0m0.023s
-}
Matt
> On Sunday 19 July 2009 09:26:14 Heinrich Apfelmus wrote:
>> Thomas Hartman wrote:
>> > The code below is, I think, n log n, a few seconds on a million +
>> > element
>> > list.
>> >
>> > I wonder if it's possible to get this down to O(N) by using a
>> > hashtable implemementation, or other better data structure.
>> >
>> > -- findsums locates pairs of integers in a list
>> > that add up to a wanted sum.
>> >
>> > findsums :: [Int] -> Int -> S.Set (Int,Int)
>> > findsums xs wanted = snd . foldl' f (S.empty,S.empty) $ xs
>> > where f (candidates,successes) next =
>> > if S.member (wanted-next) candidates
>> > then (candidates, S.insert (next,wanted-next) successes)
>> > else (S.insert next candidates,successes)
>>
>> Remember that hash tables are actually O(key length) instead of O(1), so
>> I don't think you can break the log n for really large lists this
>> way since the key length increases as well (unless most elements are
>> equal anyway).
More information about the Haskell-Cafe
mailing list