[Haskell-beginners] Re: [Haskell-begin] Maximum of a List?
Rafael Gustavo da Cunha Pereira Pinto
rafaelgcpp at gmail.com
Tue Jul 29 07:12:27 EDT 2008
Thanks Daniel!
I only asked, because I was comparing the versions:
1) Steve's first version, without g x took forever to run and ate all the
stack.
2) Mine (with Map) ran somewhat fast, but it takes almost 2 minutes running
and eats 48MB of stack
3) Steve's last version (with g x) took less than 30s on my machine and
needed no more than 8MB of stack.
If I understood it right, using (g x) on foldl' forced the strictness of the
thunk (acc+1), without the need of using the bang pattern.
I'll try to play with the strictness on my code, trying to take advantage of
both the Map and strictness.
Rafael
On Mon, Jul 28, 2008 at 23:15, Daniel Fischer <daniel.is.fischer at web.de>wrote:
> Am Dienstag, 29. Juli 2008 01:53 schrieb Rafael Gustavo da Cunha Pereira
> Pinto:
> > Since this is the "beginners" list, could someone explain me why using g
> > made everything run like the wind, with almost no memory?
>
> I'd rate "run like the wind" as an exaggeration since the completely
> unoptimised C version runs in 1.6 seconds vs. 54 seconds for the code below
> (49 seconds if using Int for the chain length) on my box. And the "g" isn't
> important for speed or memory, just for getting the correct result.
>
> Two factors make it run in constant memory without stack overflow:
> 1. strictness
> 2. laziness.
>
> The strictness of foldl' is the reason that every g k is evaluated without
> delay and can immediately be thrown away (unless it's the maximum so far,
> in
> which case the previous maximum can be discarded) and the laziness of map
> and
> enumFromTo (remember that [a .. b] is syntactic sugar for enumFromTo a b)
> allows the list to be consumed as it is generated.
>
> Let us trace a few evaluation steps of answer:
> answer
> ---> foldl' max (0,0) $ map g (enumFromTo 1 999999)
> -- First, we must see whether the list is empty or not,
> -- 1 <= 999999, so the list is not empty
> ---> foldl' max (0,0) $ map g (1 : enumFromTo (1+1) 999999)
> ---> foldl' max (0,0) $ g 1 : map g (enumFromTo (1+1) 999999)
> ---> foldl' max (max (0,0) (g 1)) $ map g (enumFromTo (1+1) 999999)
> -- Now, due to the strictness of foldl', max (0,0) (g 1) has to be
> evaluated
> -- far enough to know if it is _|_ or not. For that, we obviously have to
> -- compare (0,0) and g 1, hence we must evaluate g 1 far enough to see if
> -- (0,0) <= g 1. Now g 1 = (f 1 1, 1) and we must check whether 0 <= f 1 1.
> -- However, f 1 1 = 1, so yes, and max (0,0) (g 1) turns out to be (1,1)
> ---> foldl' max (1,1) $ map g (enumFromTo (1+1) 999999)
> -- Now to decide if the list is empty, (1+1) must be compared to 999999,
> -- for that it must be evaluated, so
> ---> foldl' max (1,1) $ map g (2 : enumFromTo (2+1) 999999)
> ---> foldl' max (1,1) $ g 2 : map g (enumFromTo (2+1) 999999)
> ---> foldl' max (max (1,1) (g 2)) $ map g (enumFromTo (2+1) 999999)
> -- Is max (1,1) (g 2) _|_ or not? To find out evaluate g 2 = (2,2)
> ---> foldl' max (2,2) $ map g (enumFromTo (2+1) 999999)
> ---> foldl' max (2,2) $ map g (3 : enumFromTo (3+1) 999999)
> ---> foldl' max (2,2) $ g 3 : map g (enumFromTo (3+1) 999999)
> ---> foldl' max (max (2,2) (g 3)) $ map g (enumFromTo (3+1) 999999)
> ---> foldl' max (8,3) $ map g (enumFromTo (3+1) 999999)
> ...
> ---> foldl' max (max (8,3) (g 4)) $ map g (enumFromTo (4+1) 999999)
> ---> foldl' max (8,3) $ map g (enumFromTo (4+1) 999999)
>
> and so on. Hardly any memory needed.
> But you have more than 100,000,000 calls to f.
> Now if you store previous results in a Map, you save many of these calls
> (to
> rank in your code), but you pay for it with memory usage (and a couple of
> million expensive insertions into the Map). For every entry in the Map,
> apart
> from the space for key and value, you also have an Int holding the size of
> the subtree and pointers to the two subtrees of the node, on a 32-bit box
> that's a theoretical minimal overhead of 12 bytes per entry, probably
> rather
> 20 in reality, so a Map Integer Integer with one million entries would use
> some 30 MB memory.
>
> Hope that helps,
> Daniel
>
> >
> > I am puzzled! :-(
> >
> > On Mon, Jul 28, 2008 at 17:32, Steve Klabnik <steve.klabnik at gmail.com
> >wrote:
> > > Finally, what I ended up doing:
> > >
> > > f :: Integer -> Integer -> Integer
> > > f acc x
> > >
> > > | x == 1 = acc
> > > |
> > > | even x = f (acc + 1) (x `div` 2)
> > > | otherwise = f (acc + 1) (3 * x + 1)
> > >
> > > g :: Integer -> (Integer, Integer)
> > > g x = (f 1 x, x)
> > >
> > > answer = (foldl' max (0,0)) $ map g [1 .. 999999]
> > >
> > >
> > > main = putStrLn( show answer)
>
>
--
Rafael Gustavo da Cunha Pereira Pinto
Electronic Engineer, MSc.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/beginners/attachments/20080729/49097dd9/attachment.htm
More information about the Beginners
mailing list