[Haskell-cafe] Re: Arrow instance of Transducer (Was: [Haskell] ANN: Streaming Component Combinators 0.4)

Sun Jan 17 11:59:50 EST 2010

>>> Stupid question: Is it related to Arrows?
>>
>>     Not really. You might say it's more general than arrows, because 
>> the streaming components are not restricted to a single input and 
>> single output type. On the other hand, they are all specific to stream 
>> processing, much like Fudgets and Yampa which are arrow-based.
>
> Arrows use tuple values for multiple inputs and outputs. Actually I'm 
> using arrows this way for signal processing.
> 
>>     I suppose the Transducer data type [1] could be made an Arrow 
>> instance, which would let me rename the 'connect' combinator [2] to 
>> (>>>). I'll have a look at Yampa to see if I can harvest more 
>> combinator names, thank you!

    After some investigation, I've concluded that Transducer cannot be made an
instance of Arrow in any way that's both practical and general.

    The reason is that a Transducer converts a finite stream to another finite
stream of arbitrary length. If we
ignore effects, it's like a function of type [a] -> [b].

When trying to define an instance of Arrow for

> newtype Transducer a b = Transducer ([a] -> [b])

the arr and (>>>) methods are trivial:

> instance Arrow Transducer where
>    arr f = Transducer (map f)
>    Transducer t1 >>> Transducer t2 = Transducer (t2 . t1)

but there is no satisfactory definition for the method first. A sensible
definition, IMO, would have to give rise to a pairwise (***), so if we have two
transducers t1 and t2 which respectively convert [a,b] to [c,d] and [1,2] to
[3,4], t1 *** t2 would have to map [(a,1), (b,2)] to [(c,3), (d,4)]. There is no
general way to define such a method first for the Transducer data type above. One
can try appending the right-hand sides of the input to a queue and adding them to
the outputs of the argument transducer:

>    first (Transducer t) = Transducer (uncurry (zip . t) . unzip)

but this works only for injective transducers that map each new input item to
exactly one output item. If the input
stream has finite length, the input and output stream must be of equal length.

    In his paper "Generalising Monads to Arrows" from 1998, John Hughes defines a
similar Arrow instance for stream processors operating on infinite streams which
theoretically works for non-injective transforms, but leaks infinite space and
can't be adopted to finite stream processors.

    HXT defines a different kind of an Arrow instance similar to the list monad
instance. This approach can combine non-injective transforms, but in it the t1
*** t2 example above would map the input [(a,1), (b,2)] to [(c,3), (c,4), (d,3),
(d,4)]. While this is a simple and lawful Arrow instance, I don't find it useful
for stream processing.

    Yampa and other Functional Reactive Programming libraries associate a unique
timestamp to each stream item. Method first can use this timestamp to associate
input and output items even if the transducer leaves out some output items, and
if the time stamps are ordered in each stream I guess it could also handle extra
output items unrelated to any input. I'm not certain to what extent the FRP
transducers (i.e., signal processors) are allowed to do this.

    To summarize: while I'd love the benefits of satisifying an Arrow instance,
the loss of generality seems too high for me. SCC transducers work on finite
streams of any values, and their output stream need not have any resemblance to
the input stream. I may define an Arrow-conformant subclass of Transducer in a
future version of SCC, perhaps operating on streams with timestamped values.

>> 
>> [1] 
>>
http://hackage.haskell.org/packages/archive/scc/0.4/doc/html/Control-Concurrent-SCC-Types.html#t%3ATransducer

>> 
>> [2] 
>>
http://hackage.haskell.org/packages/archive/scc/0.4/doc/html/Control-Concurrent-SCC-Combinators.html#v%3Aconnect

> 
> In whatever way it is related to arrows, maybe you can mention it on the 
> project page.

I'll copy this message there.