Potential space leak in transpose
Carter Schonwald
carter.schonwald at gmail.com
Fri May 15 16:41:05 UTC 2020
David : call me unimaginative, but can you share an example that can
trigger the space
Leak ? At least conceptually I’m being a tad thick about it though I think
I see what you’re pointing at.
On Fri, May 15, 2020 at 4:24 AM Ryan Reich <ryan.reich at gmail.com> wrote:
> Never mind, that one is broken. I should probably not write these late in
> the evening.
>
> On Thu, May 14, 2020 at 10:30 PM Ryan Reich <ryan.reich at gmail.com> wrote:
>
>> So it doesn't, but this one does (*):
>>
>> ---
>> transpose :: [[a]] -> [[a]]
>> transpose [] = []
>> transpose [r] = map (:[]) r
>> transpose (r1 : rrest) = zip' r1 (transpose rrest)
>> where
>> zip' [] bs = bs
>> zip' (a : as) ~(b : bs) = (a : b) : zip' as bs
>>
>> main = do
>> mapM_ (print . transpose)
>> [
>> [[1,2,3]],
>> [[1,2,3],[4,5,6]],
>> [[1,2,3],[4,5,6],[7,8,9]],
>> [[10,11],[20],[],[30,31,32]]
>> ]
>> print $ take 10 $ transpose [[0 ..]]
>> --print $ map (take 10) $ transpose (repeat [0])
>> print $ take 3 $ map (take 3) $ transpose (repeat [0..])
>> ---
>>
>> It's true that this implementation will make the commented line diverge.
>> *However*, (*) the stock implementation of transpose also diverges! Try
>> it: head <$> transpose (repeat [0]) in ghci will print [0 and then loop.
>> So I think this one is okay by that standard. Does it do better at
>> allocation than the existing one or the one with unzip?
>>
>> <rant, possibly beside the point>
>> It's actually not even well-defined what the transpose would be if we
>> allow infinitely many rows and also use the "condensing" property where the
>> columns skip over nonexistent terms in the rows. How would you even prove
>> that transpose (repeat [0]) has length 1? You'd have to rule out the
>> existence of any second term in any of the infinitely many rows. Of
>> course, in this example there is a fast way to do that, but the rows could
>> be anything, so the problem is uncomputable. This logic is also why the
>> final test above actually does terminate: because we find the requisite
>> number of elements early on (in fact, instantly, since the rows are all
>> infinite).
>>
>> This actually raises the question of whether transpose even "should" care
>> about infinitely many rows or columns (I know, the behavior is
>> standardized; pretend we're discussing a new alternative transpose' that
>> explicitly only does finite lists).
>> </rant>
>>
>> Ryan
>>
>> On Thu, May 14, 2020 at 5:15 PM David Feuer <david.feuer at gmail.com>
>> wrote:
>>
>>> That doesn't look like it works on infinite lists.
>>>
>>> On Thu, May 14, 2020, 8:09 PM Ryan Reich <ryan.reich at gmail.com> wrote:
>>>
>>>> Hopefully I'm not fooling myself...how about this?
>>>>
>>>> ---
>>>> transpose :: [[a]] -> [[a]]
>>>> transpose [] = []
>>>> transpose (r1 : rrest) = zipWith' (:) (:[]) id r1 (transpose rrest)
>>>>
>>>> zipWith' :: (a -> b -> c) -> (a -> c) -> (b -> c) -> ([a] -> [b] -> [c])
>>>> zipWith' _ _ fb [] bs = fb <$> bs
>>>> zipWith' _ fa _ as [] = fa <$> as
>>>> zipWith' f fa fb (a : as) (b : bs) = f a b : zipWith' f fa fb as bs
>>>>
>>>> main = do
>>>> mapM_ (print . transpose)
>>>> [
>>>> [[1,2,3]],
>>>> [[1,2,3],[4,5,6]],
>>>> [[1,2,3],[4,5,6],[7,8,9]],
>>>> [[10,11],[20],[],[30,31,32]]
>>>> ]
>>>> ---
>>>>
>>>> I see the output:
>>>> ---
>>>> [[1],[2],[3]]
>>>> [[1,4],[2,5],[3,6]]
>>>> [[1,4,7],[2,5,8],[3,6,9]]
>>>> [[10,20,30],[11,31],[32]]
>>>> ---
>>>>
>>>> which is correct (the last example is the one from the haddocks).
>>>>
>>>> I was concerned that the definition of zipWith' (which is akin to the
>>>> Map function unionWith) is not compatible with the build/fold fusion rule,
>>>> but the implementation of zipWith itself is basically the same.
>>>>
>>>> Ryan
>>>>
>>>> On Thu, May 14, 2020 at 1:17 PM David Feuer <david.feuer at gmail.com>
>>>> wrote:
>>>>
>>>>> Right. I still think it might be the right thing to do, though. I'm
>>>>> not a big fan of general-purpose library functions that have any
>>>>> unnecessary memory leak hazard. The biggest counterargument is that real
>>>>> code is unlikely to run into that problem.
>>>>>
>>>>> On Thu, May 14, 2020, 3:35 PM Ryan Reich <ryan.reich at gmail.com> wrote:
>>>>>
>>>>>> My suggestion was much less sophisticated even than that, and is
>>>>>> basically what David answered with fusion. Also according to his answer,
>>>>>> the original code of transpose lacks the laziness that unzip's actual
>>>>>> implementation would provide.
>>>>>>
>>>>>> I think what that means is that his concern over allocating extra
>>>>>> pairs is about the ones created internally by unzip when it builds the lazy
>>>>>> heads-and-tails accessors.
>>>>>>
>>>>>> On Thu, May 14, 2020, 03:27 Andreas Abel <andreas.abel at ifi.lmu.de>
>>>>>> wrote:
>>>>>>
>>>>>>> On 2020-05-13 22:27, Ryan Reich wrote:
>>>>>>> > Why not just inline the definition of unzip and hand-optimize away
>>>>>>> the
>>>>>>> > pairs?
>>>>>>>
>>>>>>> Isn't this what the original code of transpose is doing?
>>>>>>>
>>>>>>> > On Tue, May 12, 2020, 10:24 David Feuer <david.feuer at gmail.com
>>>>>>> > <mailto:david.feuer at gmail.com>> wrote:
>>>>>>> >
>>>>>>> > Also, the more eager list allocation can increase residency,
>>>>>>> but I
>>>>>>> > don't think it can cause a leak.
>>>>>>> >
>>>>>>> > On Tue, May 12, 2020, 9:48 AM David Feuer <
>>>>>>> david.feuer at gmail.com
>>>>>>> > <mailto:david.feuer at gmail.com>> wrote:
>>>>>>> >
>>>>>>> > The cost of allocating the extra pairs.
>>>>>>> >
>>>>>>> > On Tue, May 12, 2020, 5:11 AM Andreas Abel
>>>>>>> > <andreas.abel at ifi.lmu.de <mailto:andreas.abel at ifi.lmu.de>>
>>>>>>> wrote:
>>>>>>> >
>>>>>>> > > I don't know how much that'll cost in practice.
>>>>>>> >
>>>>>>> > What costs are you worried about?
>>>>>>> >
>>>>>>> > On 2020-05-12 00:08, David Feuer wrote:
>>>>>>> > > In Data.List, we define
>>>>>>> > >
>>>>>>> > > transpose :: [[a]] -> [[a]]
>>>>>>> > > transpose [] = []
>>>>>>> > > transpose ([] : xss) = transpose xss
>>>>>>> > > transpose ((x : xs) : xss) = (x : [h | (h : _) <-
>>>>>>> xss]) :
>>>>>>> > transpose (xs
>>>>>>> > > : [t | (_ : t) <- xss])
>>>>>>> > >
>>>>>>> > > The potential difficulty is that we essentially
>>>>>>> mapMaybe
>>>>>>> > over the xss
>>>>>>> > > list twice in the third case. So we hang on to the
>>>>>>> heads
>>>>>>> > where we need
>>>>>>> > > the tails and the tails where we need the heads. We
>>>>>>> could
>>>>>>> > fix that with
>>>>>>> > > something like this:
>>>>>>> > >
>>>>>>> > > transpose ((x : xs) : xss) = (x : fronts) :
>>>>>>> transpose (xs
>>>>>>> > : rears)
>>>>>>> > > where
>>>>>>> > > (fronts, rears) = unzip [(h,t) | (h : t) <-
>>>>>>> xss]
>>>>>>> > >
>>>>>>> > > I don't know how much that'll cost in practice.
>>>>>>> > >
>>>>>>> > > _______________________________________________
>>>>>>> > > Libraries mailing list
>>>>>>> > > Libraries at haskell.org <mailto:Libraries at haskell.org
>>>>>>> >
>>>>>>> > >
>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>>>> > >
>>>>>>> >
>>>>>>> > _______________________________________________
>>>>>>> > Libraries mailing list
>>>>>>> > Libraries at haskell.org <mailto:Libraries at haskell.org>
>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>>>> >
>>>>>>> >
>>>>>>> > _______________________________________________
>>>>>>> > Libraries mailing list
>>>>>>> > Libraries at haskell.org
>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>>>> >
>>>>>>>
>>>>>> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20200515/9b18a5a5/attachment.html>
More information about the Libraries
mailing list