Potential space leak in transpose

David Feuer david.feuer at gmail.com
Fri May 15 16:58:59 UTC 2020


Carter, imagine the list elements are thunks and that each of them, when
forced, produces a large structure (e.g., a vector of length 10^7). Does
that help your imagination?

On Fri, May 15, 2020, 12:41 PM Carter Schonwald <carter.schonwald at gmail.com>
wrote:

> David : call me unimaginative, but can you share an example that can
> trigger the space
> Leak ? At least conceptually I’m being a tad thick about it though I think
> I see what you’re pointing at.
>
> On Fri, May 15, 2020 at 4:24 AM Ryan Reich <ryan.reich at gmail.com> wrote:
>
>> Never mind, that one is broken.  I should probably not write these late
>> in the evening.
>>
>> On Thu, May 14, 2020 at 10:30 PM Ryan Reich <ryan.reich at gmail.com> wrote:
>>
>>> So it doesn't, but this one does (*):
>>>
>>> ---
>>> transpose :: [[a]] -> [[a]]
>>> transpose [] = []
>>> transpose [r] = map (:[]) r
>>> transpose (r1 : rrest) = zip' r1 (transpose rrest)
>>>   where
>>>     zip' [] bs = bs
>>>     zip' (a : as) ~(b : bs) = (a : b) : zip' as bs
>>>
>>> main = do
>>>   mapM_ (print . transpose)
>>>     [
>>>       [[1,2,3]],
>>>       [[1,2,3],[4,5,6]],
>>>       [[1,2,3],[4,5,6],[7,8,9]],
>>>       [[10,11],[20],[],[30,31,32]]
>>>     ]
>>>   print $ take 10 $ transpose [[0 ..]]
>>>   --print $ map (take 10) $ transpose (repeat [0])
>>>   print $ take 3 $ map (take 3) $ transpose (repeat [0..])
>>> ---
>>>
>>> It's true that this implementation will make the commented line
>>> diverge.  *However*, (*) the stock implementation of transpose also
>>> diverges!  Try it: head <$> transpose (repeat [0]) in ghci will print [0
>>> and then loop.  So I think this one is okay by that standard.  Does it do
>>> better at allocation than the existing one or the one with unzip?
>>>
>>> <rant, possibly beside the point>
>>> It's actually not even well-defined what the transpose would be if we
>>> allow infinitely many rows and also use the "condensing" property where the
>>> columns skip over nonexistent terms in the rows.  How would you even prove
>>> that transpose (repeat [0]) has length 1?  You'd have to rule out the
>>> existence of any second term in any of the infinitely many rows.  Of
>>> course, in this example there is a fast way to do that, but the rows could
>>> be anything, so the problem is uncomputable.  This logic is also why the
>>> final test above actually does terminate: because we find the requisite
>>> number of elements early on (in fact, instantly, since the rows are all
>>> infinite).
>>>
>>> This actually raises the question of whether transpose even "should"
>>> care about infinitely many rows or columns (I know, the behavior is
>>> standardized; pretend we're discussing a new alternative transpose' that
>>> explicitly only does finite lists).
>>> </rant>
>>>
>>> Ryan
>>>
>>> On Thu, May 14, 2020 at 5:15 PM David Feuer <david.feuer at gmail.com>
>>> wrote:
>>>
>>>> That doesn't look like it works on infinite lists.
>>>>
>>>> On Thu, May 14, 2020, 8:09 PM Ryan Reich <ryan.reich at gmail.com> wrote:
>>>>
>>>>> Hopefully I'm not fooling myself...how about this?
>>>>>
>>>>> ---
>>>>> transpose :: [[a]] -> [[a]]
>>>>> transpose [] = []
>>>>> transpose (r1 : rrest) = zipWith' (:) (:[]) id r1 (transpose rrest)
>>>>>
>>>>> zipWith' :: (a -> b -> c) -> (a -> c) -> (b -> c) -> ([a] -> [b] ->
>>>>> [c])
>>>>> zipWith' _ _ fb [] bs = fb <$> bs
>>>>> zipWith' _ fa _ as [] = fa <$> as
>>>>> zipWith' f fa fb (a : as) (b : bs) = f a b : zipWith' f fa fb as bs
>>>>>
>>>>> main = do
>>>>>   mapM_ (print . transpose)
>>>>>     [
>>>>>       [[1,2,3]],
>>>>>       [[1,2,3],[4,5,6]],
>>>>>       [[1,2,3],[4,5,6],[7,8,9]],
>>>>>       [[10,11],[20],[],[30,31,32]]
>>>>>     ]
>>>>> ---
>>>>>
>>>>> I see the output:
>>>>> ---
>>>>> [[1],[2],[3]]
>>>>> [[1,4],[2,5],[3,6]]
>>>>> [[1,4,7],[2,5,8],[3,6,9]]
>>>>> [[10,20,30],[11,31],[32]]
>>>>> ---
>>>>>
>>>>> which is correct (the last example is the one from the haddocks).
>>>>>
>>>>> I was concerned that the definition of zipWith' (which is akin to the
>>>>> Map function unionWith) is not compatible with the build/fold fusion rule,
>>>>> but the implementation of zipWith itself is basically the same.
>>>>>
>>>>> Ryan
>>>>>
>>>>> On Thu, May 14, 2020 at 1:17 PM David Feuer <david.feuer at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Right. I still think it might be the right thing to do, though. I'm
>>>>>> not a big fan of general-purpose library functions that have any
>>>>>> unnecessary memory leak hazard. The biggest counterargument is that real
>>>>>> code is unlikely to run into that problem.
>>>>>>
>>>>>> On Thu, May 14, 2020, 3:35 PM Ryan Reich <ryan.reich at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> My suggestion was much less sophisticated even than that, and is
>>>>>>> basically what David answered with fusion. Also according to his answer,
>>>>>>> the original code of transpose lacks the laziness that unzip's actual
>>>>>>> implementation would provide.
>>>>>>>
>>>>>>> I think what that means is that his concern over allocating extra
>>>>>>> pairs is about the ones created internally by unzip when it builds the lazy
>>>>>>> heads-and-tails accessors.
>>>>>>>
>>>>>>> On Thu, May 14, 2020, 03:27 Andreas Abel <andreas.abel at ifi.lmu.de>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On 2020-05-13 22:27, Ryan Reich wrote:
>>>>>>>> > Why not just inline the definition of unzip and hand-optimize
>>>>>>>> away the
>>>>>>>> > pairs?
>>>>>>>>
>>>>>>>> Isn't this what the original code of transpose is doing?
>>>>>>>>
>>>>>>>> > On Tue, May 12, 2020, 10:24 David Feuer <david.feuer at gmail.com
>>>>>>>> > <mailto:david.feuer at gmail.com>> wrote:
>>>>>>>> >
>>>>>>>> >     Also, the more eager list allocation can increase residency,
>>>>>>>> but I
>>>>>>>> >     don't think it can cause a leak.
>>>>>>>> >
>>>>>>>> >     On Tue, May 12, 2020, 9:48 AM David Feuer <
>>>>>>>> david.feuer at gmail.com
>>>>>>>> >     <mailto:david.feuer at gmail.com>> wrote:
>>>>>>>> >
>>>>>>>> >         The cost of allocating the extra pairs.
>>>>>>>> >
>>>>>>>> >         On Tue, May 12, 2020, 5:11 AM Andreas Abel
>>>>>>>> >         <andreas.abel at ifi.lmu.de <mailto:andreas.abel at ifi.lmu.de>>
>>>>>>>> wrote:
>>>>>>>> >
>>>>>>>> >               > I don't know how much that'll cost in practice.
>>>>>>>> >
>>>>>>>> >             What costs are you worried about?
>>>>>>>> >
>>>>>>>> >             On 2020-05-12 00:08, David Feuer wrote:
>>>>>>>> >              > In Data.List, we define
>>>>>>>> >              >
>>>>>>>> >              > transpose :: [[a]] -> [[a]]
>>>>>>>> >              > transpose [] = []
>>>>>>>> >              > transpose ([] : xss) = transpose xss
>>>>>>>> >              > transpose ((x : xs) : xss) = (x : [h | (h : _) <-
>>>>>>>> xss]) :
>>>>>>>> >             transpose (xs
>>>>>>>> >              > : [t | (_ : t) <- xss])
>>>>>>>> >              >
>>>>>>>> >              > The potential difficulty is that we essentially
>>>>>>>> mapMaybe
>>>>>>>> >             over the xss
>>>>>>>> >              > list twice in the third case. So we hang on to the
>>>>>>>> heads
>>>>>>>> >             where we need
>>>>>>>> >              > the tails and the tails where we need the heads.
>>>>>>>> We could
>>>>>>>> >             fix that with
>>>>>>>> >              > something like this:
>>>>>>>> >              >
>>>>>>>> >              > transpose ((x : xs) : xss) = (x : fronts) :
>>>>>>>> transpose (xs
>>>>>>>> >             : rears)
>>>>>>>> >              >    where
>>>>>>>> >              >      (fronts, rears) = unzip [(h,t) | (h : t) <-
>>>>>>>> xss]
>>>>>>>> >              >
>>>>>>>> >              > I don't know how much that'll cost in practice.
>>>>>>>> >              >
>>>>>>>> >              > _______________________________________________
>>>>>>>> >              > Libraries mailing list
>>>>>>>> >              > Libraries at haskell.org <mailto:
>>>>>>>> Libraries at haskell.org>
>>>>>>>> >              >
>>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>>>>> >              >
>>>>>>>> >
>>>>>>>> >     _______________________________________________
>>>>>>>> >     Libraries mailing list
>>>>>>>> >     Libraries at haskell.org <mailto:Libraries at haskell.org>
>>>>>>>> >     http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > _______________________________________________
>>>>>>>> > Libraries mailing list
>>>>>>>> > Libraries at haskell.org
>>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>>>>> >
>>>>>>>>
>>>>>>> _______________________________________________
>> Libraries mailing list
>> Libraries at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20200515/faf2142c/attachment.html>


More information about the Libraries mailing list