Potential space leak in transpose

Fri May 15 08:22:26 UTC 2020

Never mind, that one is broken.  I should probably not write these late in
the evening.

On Thu, May 14, 2020 at 10:30 PM Ryan Reich <ryan.reich at gmail.com> wrote:

> So it doesn't, but this one does (*):
>
> ---
> transpose :: [[a]] -> [[a]]
> transpose [] = []
> transpose [r] = map (:[]) r
> transpose (r1 : rrest) = zip' r1 (transpose rrest)
>   where
>     zip' [] bs = bs
>     zip' (a : as) ~(b : bs) = (a : b) : zip' as bs
>
> main = do
>   mapM_ (print . transpose)
>     [
>       [[1,2,3]],
>       [[1,2,3],[4,5,6]],
>       [[1,2,3],[4,5,6],[7,8,9]],
>       [[10,11],[20],[],[30,31,32]]
>     ]
>   print $ take 10 $ transpose [[0 ..]]
>   --print $ map (take 10) $ transpose (repeat [0])
>   print $ take 3 $ map (take 3) $ transpose (repeat [0..])
> ---
>
> It's true that this implementation will make the commented line diverge.
> *However*, (*) the stock implementation of transpose also diverges!  Try
> it: head <$> transpose (repeat [0]) in ghci will print [0 and then loop.
> So I think this one is okay by that standard.  Does it do better at
> allocation than the existing one or the one with unzip?
>
> <rant, possibly beside the point>
> It's actually not even well-defined what the transpose would be if we
> allow infinitely many rows and also use the "condensing" property where the
> columns skip over nonexistent terms in the rows.  How would you even prove
> that transpose (repeat [0]) has length 1?  You'd have to rule out the
> existence of any second term in any of the infinitely many rows.  Of
> course, in this example there is a fast way to do that, but the rows could
> be anything, so the problem is uncomputable.  This logic is also why the
> final test above actually does terminate: because we find the requisite
> number of elements early on (in fact, instantly, since the rows are all
> infinite).
>
> This actually raises the question of whether transpose even "should" care
> about infinitely many rows or columns (I know, the behavior is
> standardized; pretend we're discussing a new alternative transpose' that
> explicitly only does finite lists).
> </rant>
>
> Ryan
>
> On Thu, May 14, 2020 at 5:15 PM David Feuer <david.feuer at gmail.com> wrote:
>
>> That doesn't look like it works on infinite lists.
>>
>> On Thu, May 14, 2020, 8:09 PM Ryan Reich <ryan.reich at gmail.com> wrote:
>>
>>> Hopefully I'm not fooling myself...how about this?
>>>
>>> ---
>>> transpose :: [[a]] -> [[a]]
>>> transpose [] = []
>>> transpose (r1 : rrest) = zipWith' (:) (:[]) id r1 (transpose rrest)
>>>
>>> zipWith' :: (a -> b -> c) -> (a -> c) -> (b -> c) -> ([a] -> [b] -> [c])
>>> zipWith' _ _ fb [] bs = fb <$> bs
>>> zipWith' _ fa _ as [] = fa <$> as
>>> zipWith' f fa fb (a : as) (b : bs) = f a b : zipWith' f fa fb as bs
>>>
>>> main = do
>>>   mapM_ (print . transpose)
>>>     [
>>>       [[1,2,3]],
>>>       [[1,2,3],[4,5,6]],
>>>       [[1,2,3],[4,5,6],[7,8,9]],
>>>       [[10,11],[20],[],[30,31,32]]
>>>     ]
>>> ---
>>>
>>> I see the output:
>>> ---
>>> [[1],[2],[3]]
>>> [[1,4],[2,5],[3,6]]
>>> [[1,4,7],[2,5,8],[3,6,9]]
>>> [[10,20,30],[11,31],[32]]
>>> ---
>>>
>>> which is correct (the last example is the one from the haddocks).
>>>
>>> I was concerned that the definition of zipWith' (which is akin to the
>>> Map function unionWith) is not compatible with the build/fold fusion rule,
>>> but the implementation of zipWith itself is basically the same.
>>>
>>> Ryan
>>>
>>> On Thu, May 14, 2020 at 1:17 PM David Feuer <david.feuer at gmail.com>
>>> wrote:
>>>
>>>> Right. I still think it might be the right thing to do, though. I'm not
>>>> a big fan of general-purpose library functions that have any unnecessary
>>>> memory leak hazard. The biggest counterargument is that real code is
>>>> unlikely to run into that problem.
>>>>
>>>> On Thu, May 14, 2020, 3:35 PM Ryan Reich <ryan.reich at gmail.com> wrote:
>>>>
>>>>> My suggestion was much less sophisticated even than that, and is
>>>>> basically what David answered with fusion. Also according to his answer,
>>>>> the original code of transpose lacks the laziness that unzip's actual
>>>>> implementation would provide.
>>>>>
>>>>> I think what that means is that his concern over allocating extra
>>>>> pairs is about the ones created internally by unzip when it builds the lazy
>>>>> heads-and-tails accessors.
>>>>>
>>>>> On Thu, May 14, 2020, 03:27 Andreas Abel <andreas.abel at ifi.lmu.de>
>>>>> wrote:
>>>>>
>>>>>> On 2020-05-13 22:27, Ryan Reich wrote:
>>>>>> > Why not just inline the definition of unzip and hand-optimize away
>>>>>> the
>>>>>> > pairs?
>>>>>>
>>>>>> Isn't this what the original code of transpose is doing?
>>>>>>
>>>>>> > On Tue, May 12, 2020, 10:24 David Feuer <david.feuer at gmail.com
>>>>>> > <mailto:david.feuer at gmail.com>> wrote:
>>>>>> >
>>>>>> >     Also, the more eager list allocation can increase residency,
>>>>>> but I
>>>>>> >     don't think it can cause a leak.
>>>>>> >
>>>>>> >     On Tue, May 12, 2020, 9:48 AM David Feuer <
>>>>>> david.feuer at gmail.com
>>>>>> >     <mailto:david.feuer at gmail.com>> wrote:
>>>>>> >
>>>>>> >         The cost of allocating the extra pairs.
>>>>>> >
>>>>>> >         On Tue, May 12, 2020, 5:11 AM Andreas Abel
>>>>>> >         <andreas.abel at ifi.lmu.de <mailto:andreas.abel at ifi.lmu.de>>
>>>>>> wrote:
>>>>>> >
>>>>>> >               > I don't know how much that'll cost in practice.
>>>>>> >
>>>>>> >             What costs are you worried about?
>>>>>> >
>>>>>> >             On 2020-05-12 00:08, David Feuer wrote:
>>>>>> >              > In Data.List, we define
>>>>>> >              >
>>>>>> >              > transpose :: [[a]] -> [[a]]
>>>>>> >              > transpose [] = []
>>>>>> >              > transpose ([] : xss) = transpose xss
>>>>>> >              > transpose ((x : xs) : xss) = (x : [h | (h : _) <-
>>>>>> xss]) :
>>>>>> >             transpose (xs
>>>>>> >              > : [t | (_ : t) <- xss])
>>>>>> >              >
>>>>>> >              > The potential difficulty is that we essentially
>>>>>> mapMaybe
>>>>>> >             over the xss
>>>>>> >              > list twice in the third case. So we hang on to the
>>>>>> heads
>>>>>> >             where we need
>>>>>> >              > the tails and the tails where we need the heads. We
>>>>>> could
>>>>>> >             fix that with
>>>>>> >              > something like this:
>>>>>> >              >
>>>>>> >              > transpose ((x : xs) : xss) = (x : fronts) :
>>>>>> transpose (xs
>>>>>> >             : rears)
>>>>>> >              >    where
>>>>>> >              >      (fronts, rears) = unzip [(h,t) | (h : t) <- xss]
>>>>>> >              >
>>>>>> >              > I don't know how much that'll cost in practice.
>>>>>> >              >
>>>>>> >              > _______________________________________________
>>>>>> >              > Libraries mailing list
>>>>>> >              > Libraries at haskell.org <mailto:Libraries at haskell.org>
>>>>>> >              >
>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>>> >              >
>>>>>> >
>>>>>> >     _______________________________________________
>>>>>> >     Libraries mailing list
>>>>>> >     Libraries at haskell.org <mailto:Libraries at haskell.org>
>>>>>> >     http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>>> >
>>>>>> >
>>>>>> > _______________________________________________
>>>>>> > Libraries mailing list
>>>>>> > Libraries at haskell.org
>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>>> >
>>>>>>
>>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20200515/f9edb2e3/attachment.html>