Proposal to add mapAccumL function to vector package

Carter Schonwald carter.schonwald at gmail.com
Fri Nov 9 00:54:03 UTC 2018


great

the one using streams for your hand rolled stuff would be more
instructive... your proposed definiton cant fuse.

i'll switch to the ticket

On Thu, Nov 8, 2018 at 5:53 PM John Ky <newhoggy at gmail.com> wrote:

> After removing the load time of the file from benchmarking, the difference
> between mapM and hand rolled version is even more stark 😱
>
> benchmarking medium.csv/mapAccumL               for DVS.Vector Word64
> time                 2.371 ms   (2.347 ms .. 2.396 ms)
>                      0.999 R²   (0.998 R² .. 1.000 R²)
> mean                 2.442 ms   (2.420 ms .. 2.477 ms)
> std dev              91.95 μs   (57.33 μs .. 137.7 μs)
> variance introduced by outliers: 23% (moderately inflated)
>
> benchmarking medium.csv/mapAccumLViaStrictState for DVS.Vector Word64
> time                 604.4 ms   (479.0 ms .. 700.5 ms)
>                      0.995 R²   (0.983 R² .. 1.000 R²)
> mean                 643.8 ms   (607.3 ms .. 695.2 ms)
> std dev              51.95 ms   (13.07 ms .. 69.16 ms)
> variance introduced by outliers: 21% (moderately inflated)
>
> benchmarking medium.csv/mapAccumLViaLazyState   for DVS.Vector Word64
> time                 62.08 ms   (61.14 ms .. 63.23 ms)
>                      0.999 R²   (0.999 R² .. 1.000 R²)
> mean                 63.16 ms   (62.43 ms .. 65.35 ms)
> std dev              2.013 ms   (637.6 μs .. 3.434 ms)
>
>
> On Fri, 9 Nov 2018 at 09:31, John Ky <newhoggy at gmail.com> wrote:
>
>> Hi Carter,
>>
>> I've created a ticket <https://github.com/haskell/vector/issues/228> as
>> requested and have reproduced the text here:
>>
>> I've implemented a hand rolled version, and another two versions based on
>> a combination of mapM and the lazy and strict versions of State monad.
>>
>> haskell-works/hw-prim#38
>> <https://github.com/haskell-works/hw-prim/pull/38>
>>
>> The benchmarks show that the hand rolled versions run two times faster
>> than the lazy state monad version and 16 times faster than the strict state
>> monad.
>>
>> I found the slow performance of the strict monad version most surprising.
>>
>> I'm aware that the version that using mapM might enable fusion, however
>> it is a fair bit slower than a hand rolled version that defeats fusion.
>>
>> I would love to have a fusion-enabled version that runs as fast as the
>> hand rolled version. Would that be possible?
>>
>>
>> Cheers,
>>
>>
>> -John
>>
>> On Fri, 9 Nov 2018 at 01:18, Carter Schonwald <carter.schonwald at gmail.com>
>> wrote:
>>
>>> I thought about it a teeny bit more
>>>
>>> This should be trivially definable using mapM or equivalent using state
>>> t
>>>
>>> Have you tried doing that simple high level definition?  I think that
>>> works with vector fusion quite nicely.
>>>
>>> ... ohhh. I see.  There’s two vectors of inputs.  I’ll have to think
>>> about this more.
>>>
>>> On Thu, Nov 8, 2018 at 8:14 AM Carter Schonwald <
>>> carter.schonwald at gmail.com> wrote:
>>>
>>>> Hey John
>>>> :
>>>> I’m happy to help you contribute to to vector.
>>>>
>>>> 1). This might not actually be needed with stream fusion on  ... though
>>>> perhaps this sort of shared computation needs to be its own combinators
>>>> because of the sharing meaning that the stream fusion on map and foldl
>>>> might not work. (Have you tried doing let x = map ... in let y = foldl ...
>>>> in something with x and y? Even eg just writing map accum l in terms of
>>>> just that? It could very well fuse ... though I don’t think it can with the
>>>> current vector fusion framework. Though I think one of the more exotic
>>>> fusion frameworks Amos Robinson did a few years ago could handle that
>>>> fusion.   Sadly that one requires an ILP solver at compile time.  But
>>>> there’s some tricks I think we could do )
>>>>
>>>> 2) writing it in stream fusion form / as map accum l for streams will
>>>> actually be simpler I think
>>>>
>>>> 3) put a ticket on our GitHub. I’ve a huge backlog (life and stuff got
>>>> me a bit slow), but this sounds like a super reasonable feature request
>>>>
>>>>
>>>>
>>>> On Thu, Nov 8, 2018 at 6:31 AM John Ky <newhoggy at gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I'd like to add the mapAccumL function to the vector package.
>>>>>
>>>>> Specifically the Data.Vector.Storable module, but it would also be
>>>>> useful other vector modules.
>>>>>
>>>>> This is my attempt at an implementation:
>>>>>
>>>>> {-# LANGUAGE ScopedTypeVariables #-}
>>>>>
>>>>> mapAccumL :: forall a b c. (Storable b, Storable c)
>>>>> => (a -> b -> (a, c))
>>>>> -> a
>>>>> -> DVS.Vector b
>>>>> -> (a, DVS.Vector c)
>>>>> mapAccumL f a vb = DVS.createT $ do
>>>>> vc <- DVSM.unsafeNew (DVS.length vb)
>>>>> a' <- go 0 a vc
>>>>> return (a', vc)
>>>>> where go :: Int -> a -> DVS.MVector s c -> ST s a
>>>>> go i a0 vc = if i < DVS.length vb
>>>>> then do
>>>>> let (a1, c1) = f a0 (DVS.unsafeIndex vb i)
>>>>> DVSM.unsafeWrite vc i c1
>>>>> go (i + 1) a1 vc
>>>>> else return a0
>>>>> {-# INLINE mapAccumL #-}
>>>>>
>>>>> The implementation should obey the following law:
>>>>>
>>>>> import qualified Data.List as L
>>>>> import qualified Data.Vector.Storable as DVS
>>>>>
>>>>> (DVS.toList <$> DVS.mapAccumL f a (DVS.fromList bs)) === L.mapAccumL
>>>>> f a bs
>>>>>
>>>>> Cheers,
>>>>>
>>>>> -John
>>>>>
>>>>> _______________________________________________
>>>>> Libraries mailing list
>>>>> Libraries at haskell.org
>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>>
>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20181108/8f637653/attachment-0001.html>


More information about the Libraries mailing list