Potential space leak in transpose

Fri May 15 05:30:32 UTC 2020

So it doesn't, but this one does (*):

---
transpose :: [[a]] -> [[a]]
transpose [] = []
transpose [r] = map (:[]) r
transpose (r1 : rrest) = zip' r1 (transpose rrest)
  where
    zip' [] bs = bs
    zip' (a : as) ~(b : bs) = (a : b) : zip' as bs

main = do
  mapM_ (print . transpose)
    [
      [[1,2,3]],
      [[1,2,3],[4,5,6]],
      [[1,2,3],[4,5,6],[7,8,9]],
      [[10,11],[20],[],[30,31,32]]
    ]
  print $ take 10 $ transpose [[0 ..]]
  --print $ map (take 10) $ transpose (repeat [0])
  print $ take 3 $ map (take 3) $ transpose (repeat [0..])
---

It's true that this implementation will make the commented line diverge.
*However*, (*) the stock implementation of transpose also diverges!  Try
it: head <$> transpose (repeat [0]) in ghci will print [0 and then loop.
So I think this one is okay by that standard.  Does it do better at
allocation than the existing one or the one with unzip?

<rant, possibly beside the point>
It's actually not even well-defined what the transpose would be if we allow
infinitely many rows and also use the "condensing" property where the
columns skip over nonexistent terms in the rows.  How would you even prove
that transpose (repeat [0]) has length 1?  You'd have to rule out the
existence of any second term in any of the infinitely many rows.  Of
course, in this example there is a fast way to do that, but the rows could
be anything, so the problem is uncomputable.  This logic is also why the
final test above actually does terminate: because we find the requisite
number of elements early on (in fact, instantly, since the rows are all
infinite).

This actually raises the question of whether transpose even "should" care
about infinitely many rows or columns (I know, the behavior is
standardized; pretend we're discussing a new alternative transpose' that
explicitly only does finite lists).
</rant>

Ryan

On Thu, May 14, 2020 at 5:15 PM David Feuer <david.feuer at gmail.com> wrote:

> That doesn't look like it works on infinite lists.
>
> On Thu, May 14, 2020, 8:09 PM Ryan Reich <ryan.reich at gmail.com> wrote:
>
>> Hopefully I'm not fooling myself...how about this?
>>
>> ---
>> transpose :: [[a]] -> [[a]]
>> transpose [] = []
>> transpose (r1 : rrest) = zipWith' (:) (:[]) id r1 (transpose rrest)
>>
>> zipWith' :: (a -> b -> c) -> (a -> c) -> (b -> c) -> ([a] -> [b] -> [c])
>> zipWith' _ _ fb [] bs = fb <$> bs
>> zipWith' _ fa _ as [] = fa <$> as
>> zipWith' f fa fb (a : as) (b : bs) = f a b : zipWith' f fa fb as bs
>>
>> main = do
>>   mapM_ (print . transpose)
>>     [
>>       [[1,2,3]],
>>       [[1,2,3],[4,5,6]],
>>       [[1,2,3],[4,5,6],[7,8,9]],
>>       [[10,11],[20],[],[30,31,32]]
>>     ]
>> ---
>>
>> I see the output:
>> ---
>> [[1],[2],[3]]
>> [[1,4],[2,5],[3,6]]
>> [[1,4,7],[2,5,8],[3,6,9]]
>> [[10,20,30],[11,31],[32]]
>> ---
>>
>> which is correct (the last example is the one from the haddocks).
>>
>> I was concerned that the definition of zipWith' (which is akin to the Map
>> function unionWith) is not compatible with the build/fold fusion rule, but
>> the implementation of zipWith itself is basically the same.
>>
>> Ryan
>>
>> On Thu, May 14, 2020 at 1:17 PM David Feuer <david.feuer at gmail.com>
>> wrote:
>>
>>> Right. I still think it might be the right thing to do, though. I'm not
>>> a big fan of general-purpose library functions that have any unnecessary
>>> memory leak hazard. The biggest counterargument is that real code is
>>> unlikely to run into that problem.
>>>
>>> On Thu, May 14, 2020, 3:35 PM Ryan Reich <ryan.reich at gmail.com> wrote:
>>>
>>>> My suggestion was much less sophisticated even than that, and is
>>>> basically what David answered with fusion. Also according to his answer,
>>>> the original code of transpose lacks the laziness that unzip's actual
>>>> implementation would provide.
>>>>
>>>> I think what that means is that his concern over allocating extra pairs
>>>> is about the ones created internally by unzip when it builds the lazy
>>>> heads-and-tails accessors.
>>>>
>>>> On Thu, May 14, 2020, 03:27 Andreas Abel <andreas.abel at ifi.lmu.de>
>>>> wrote:
>>>>
>>>>> On 2020-05-13 22:27, Ryan Reich wrote:
>>>>> > Why not just inline the definition of unzip and hand-optimize away
>>>>> the
>>>>> > pairs?
>>>>>
>>>>> Isn't this what the original code of transpose is doing?
>>>>>
>>>>> > On Tue, May 12, 2020, 10:24 David Feuer <david.feuer at gmail.com
>>>>> > <mailto:david.feuer at gmail.com>> wrote:
>>>>> >
>>>>> >     Also, the more eager list allocation can increase residency, but
>>>>> I
>>>>> >     don't think it can cause a leak.
>>>>> >
>>>>> >     On Tue, May 12, 2020, 9:48 AM David Feuer <david.feuer at gmail.com
>>>>> >     <mailto:david.feuer at gmail.com>> wrote:
>>>>> >
>>>>> >         The cost of allocating the extra pairs.
>>>>> >
>>>>> >         On Tue, May 12, 2020, 5:11 AM Andreas Abel
>>>>> >         <andreas.abel at ifi.lmu.de <mailto:andreas.abel at ifi.lmu.de>>
>>>>> wrote:
>>>>> >
>>>>> >               > I don't know how much that'll cost in practice.
>>>>> >
>>>>> >             What costs are you worried about?
>>>>> >
>>>>> >             On 2020-05-12 00:08, David Feuer wrote:
>>>>> >              > In Data.List, we define
>>>>> >              >
>>>>> >              > transpose :: [[a]] -> [[a]]
>>>>> >              > transpose [] = []
>>>>> >              > transpose ([] : xss) = transpose xss
>>>>> >              > transpose ((x : xs) : xss) = (x : [h | (h : _) <-
>>>>> xss]) :
>>>>> >             transpose (xs
>>>>> >              > : [t | (_ : t) <- xss])
>>>>> >              >
>>>>> >              > The potential difficulty is that we essentially
>>>>> mapMaybe
>>>>> >             over the xss
>>>>> >              > list twice in the third case. So we hang on to the
>>>>> heads
>>>>> >             where we need
>>>>> >              > the tails and the tails where we need the heads. We
>>>>> could
>>>>> >             fix that with
>>>>> >              > something like this:
>>>>> >              >
>>>>> >              > transpose ((x : xs) : xss) = (x : fronts) : transpose
>>>>> (xs
>>>>> >             : rears)
>>>>> >              >    where
>>>>> >              >      (fronts, rears) = unzip [(h,t) | (h : t) <- xss]
>>>>> >              >
>>>>> >              > I don't know how much that'll cost in practice.
>>>>> >              >
>>>>> >              > _______________________________________________
>>>>> >              > Libraries mailing list
>>>>> >              > Libraries at haskell.org <mailto:Libraries at haskell.org>
>>>>> >              >
>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>> >              >
>>>>> >
>>>>> >     _______________________________________________
>>>>> >     Libraries mailing list
>>>>> >     Libraries at haskell.org <mailto:Libraries at haskell.org>
>>>>> >     http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>> >
>>>>> >
>>>>> > _______________________________________________
>>>>> > Libraries mailing list
>>>>> > Libraries at haskell.org
>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>> >
>>>>>
>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20200514/4bb1b6da/attachment.html>