[Haskell-cafe] Re: Network.HTTP+ByteStrings Interface--Or: How to shepherd handles and go with the flow at the same time?

Sat May 26 03:51:50 EDT 2007

Pete Kazmier wrote:
> Jules Bean <jules at jellybean.co.uk> writes:
>> E,F. Progressive GET
>>     pSynGET :: URL -> ((Bool,ByteString) -> IO ()) -> IO ()
>>     pAsynGET :: URL -> ((Bool,ByteString) -> IO ()) -> IO (MVar ())
>>
>>     Incidentally there are more complex options than (Bool,ByteString)
>>     -> IO ().  A simple and obvious change is to add a return
>>     value. Another is a 'state monad by hand', as in (Bool,ByteString)
>>     -> s -> s, and change the final return value of the type to IO s,
>>     which allows the callback to accumulate summary information and
>>     still be written as pure code.
> 
> I want to be sure that I understand the implications of the callback
> function returning an IO action as originally proposed versus it being
> a pure function.  It would seem to me that if it were a pure callback
> the usefulness would be limited as I would not be able to take the
> data read from the network and immediately write it out to a file.  Is
> this correct?

Absolutely. A sensibly flexible API needs both possibilities; the 
IO-doing callback and the non-IO-doing callback.
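
To make the two shapes concrete, here is a minimal sketch. It is not the
proposed API: a plain list of chunks stands in for the network, the names
feedIO, feedPure and tagLast are made up for illustration, and I am
assuming the Bool in the callback argument means "this is the last chunk".

```haskell
import Control.Monad (when)
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as B
import Data.List (foldl')

-- Pair each chunk with a flag that is True only on the final one.
-- (Assumed meaning of the Bool; the original post does not spell it out.)
tagLast :: [a] -> [(Bool, a)]
tagLast []       = []
tagLast [x]      = [(True, x)]
tagLast (x : xs) = (False, x) : tagLast xs

-- IO-doing callback: it may write to a file, draw to the screen, etc.
-- The list argument is a hypothetical stand-in for a URL fetch.
feedIO :: [ByteString] -> ((Bool, ByteString) -> IO ()) -> IO ()
feedIO chunks callback = mapM_ callback (tagLast chunks)

-- Non-IO callback: the 'state monad by hand' shape, with the final
-- state returned in IO because the driver itself performs the I/O.
feedPure :: [ByteString] -> ((Bool, ByteString) -> s -> s) -> s -> IO s
feedPure chunks step s0 =
  return $! foldl' (\s c -> step c s) s0 (tagLast chunks)

main :: IO ()
main = do
  let chunks = map B.pack ["GET ", "me ", "in pieces"]
  -- Write each chunk out as soon as it arrives.
  feedIO chunks $ \(isLast, chunk) -> do
    B.putStr chunk
    when isLast (putStrLn "")
  -- Accumulate a byte count without doing any IO in the callback.
  total <- feedPure chunks (\(_, chunk) n -> n + B.length chunk) 0
  print (total :: Int)
```

The second callback is an ordinary pure function, so it can be exercised
without any network or file system at all.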

> 
> And if the above is correct, is there a way to define the callback
> such that one does not have to hardcode the IO monad in the return
> type so you can have the best of both worlds?

Not that I could think of. The closest I could get to a general type was 
'any monad which can be interleaved with IO', for which the best type I 
could think of was 'any monad which can be written as a monad 
transformer over IO', which was my final example below.

You certainly can't have the very general type 'any monad m', since 
there is no way to interleave "a general monad" with IO. To interleave a 
monad with IO, you essentially need a pair of functions 
runM :: m a -> IO (FOO a) and mkM :: FOO a -> m a, where 'FOO' is some 
kind of structure which "freezes" all the side-effects of m, like the 
'freeze' routine in some of the array classes. Both 'MonadIO' and 
'MonadTrans' are possible ways to get this kind of structure.
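
As a rough illustration of the MonadIO route, the sketch below lets the
caller pick any monad that can embed IO. This is one possible reading of
the idea rather than a worked-out design: feedM, readChunks and tagLast
are hypothetical names, readChunks fakes the network with a fixed list,
and the Bool is again assumed to mark the last chunk.

```haskell
import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Trans.State.Strict (StateT, execStateT, modify')
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as B

-- Hypothetical chunk source; a real one would read from a socket.
readChunks :: IO [ByteString]
readChunks = return (map B.pack ["abc", "de", "f"])

-- Flag the final chunk (assumed meaning of the Bool).
tagLast :: [a] -> [(Bool, a)]
tagLast []       = []
tagLast [x]      = [(True, x)]
tagLast (x : xs) = (False, x) : tagLast xs

-- The driver's own IO is lifted into m; the callback runs in m directly.
feedM :: MonadIO m => ((Bool, ByteString) -> m ()) -> m ()
feedM callback = do
  chunks <- liftIO readChunks
  mapM_ callback (tagLast chunks)

-- A callback that both does IO and updates state.
countAndEcho :: (Bool, ByteString) -> StateT Int IO ()
countAndEcho (_, chunk) = do
  liftIO (B.putStrLn chunk)
  modify' (+ B.length chunk)

main :: IO ()
main = do
  feedM (\(_, chunk) -> B.putStrLn chunk)  -- m = IO
  n <- execStateT (feedM countAndEcho) 0   -- m = StateT Int IO
  print n
```

Note that a MonadIO constraint lets the callback do IO whenever it likes,
so this version makes no promise about purity.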

> 
>>     Other options allow the 'callback' to request early termination,
>>     by layering in an 'Either' type in there. 
> 
> I believe the ability to request early termination is important, and
> was one of the nice features of Oleg's left-fold enumerators.  It
> would be a shame if the API did not offer this capability.

Yes, I agree. I was simplifying to make the presentation shorter, not 
because I felt that feature was optional.
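
For what it's worth, here is one way the Either layering could look for
the pure-callback case. It is only a sketch under my own assumptions:
feedUntil, atLeast and tagLast are illustrative names, the chunk list
stands in for the connection, and I have chosen Left to mean "stop with
this result" and Right to mean "carry on with this state".

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as B

-- Flag the final chunk (assumed meaning of the Bool).
tagLast :: [a] -> [(Bool, a)]
tagLast []       = []
tagLast [x]      = [(True, x)]
tagLast (x : xs) = (False, x) : tagLast xs

-- Stop feeding chunks as soon as the callback answers Left.
feedUntil :: [ByteString] -> ((Bool, ByteString) -> s -> Either s s) -> s -> IO s
feedUntil chunks step = go (tagLast chunks)
  where
    go [] s = return s
    go (c : cs) s = case step c s of
      Left done -> return done   -- early termination requested
      Right s'  -> go cs s'      -- keep going with the new state

-- Example callback: count bytes, stop once the limit is reached.
atLeast :: Int -> (Bool, ByteString) -> Int -> Either Int Int
atLeast limit (_, chunk) seen
  | seen' >= limit = Left seen'
  | otherwise      = Right seen'
  where
    seen' = seen + B.length chunk

main :: IO ()
main = do
  -- The source is infinite, so this only returns because of the Left.
  total <- feedUntil (cycle [B.pack "ab"]) (atLeast 5) 0
  print total
```

A real implementation would presumably also close the connection at the
point where this sketch simply stops consuming the list.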

> 
>>     Another more sophisticated option, I think, is the higher rank
>>
>>     MonadTrans t => URL ->
>>                (forall m. Monad m => (Bool,ByteString) -> t m ())
>>                -> t IO ()
>>
>>     ...which, unless I've made a mistake, allows you to write in 'any
>>     monad which can be expressed as a transformer', by transforming it
>>     over IO, but still contains the implicit promise that the
>>     'callback' does no IO. For example t = StateT reduces to the
>>     earlier s -> s example, in effect, with a slightly different data
>>     layout.
> 
> I don't fully understand this, but would this prevent one from calling
> IO actions as it was receiving the chunks in the callback (such as
> writing it to a file immediately)?

Yes, and that's the point :)

Of course, you want both variants available: doing IO, and definitely 
not doing IO. Whilst the IO case is definitely a common one 
(progressively rendering graphics, etc.), the non-IO case is also quite 
need to do any IO, it's nice to have that reflected in the type. It's 
easier to write tests for non-IO code, for example.
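
Here is a small sketch of the higher-rank version with t = StateT Int, to
show both the "no IO in the callback" promise and the testing benefit.
The names feedT, countBytes and tagLast are made up, a chunk list replaces
the URL, the Bool is assumed to mark the last chunk, and I have added a
Monad (t IO) constraint so that it also compiles against older versions
of the transformers package.

```haskell
{-# LANGUAGE RankNTypes #-}

import Control.Monad.Trans.Class (MonadTrans, lift)
import Control.Monad.Trans.State.Strict (StateT, execStateT, modify')
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as B
import Data.Functor.Identity (runIdentity)

-- Flag the final chunk (assumed meaning of the Bool).
tagLast :: [a] -> [(Bool, a)]
tagLast []       = []
tagLast [x]      = [(True, x)]
tagLast (x : xs) = (False, x) : tagLast xs

-- The callback must work for every monad m, so it cannot assume m = IO.
-- The driver, on the other hand, is free to lift its own IO into t IO.
feedT :: (MonadTrans t, Monad (t IO))
      => [ByteString]
      -> (forall m. Monad m => (Bool, ByteString) -> t m ())
      -> t IO ()
feedT chunks callback = do
  lift (putStrLn "fetching...")   -- IO done by the driver only
  mapM_ callback (tagLast chunks)

-- A callback that only touches the state layer.
countBytes :: Monad m => (Bool, ByteString) -> StateT Int m ()
countBytes (_, chunk) = modify' (+ B.length chunk)

main :: IO ()
main = do
  let chunks = map B.pack ["abc", "de", "f"]
  n <- execStateT (feedT chunks countBytes) 0
  print n
  -- The same callback run over Identity: a test with no IO involved.
  print (runIdentity (execStateT (mapM_ countBytes (tagLast chunks)) 0))
  -- A callback such as \(_, c) -> lift (B.putStr c) is rejected by the
  -- type checker, because lift would need m to be IO.
```

Because countBytes is polymorphic in m, the same definition serves both
the real driver and a test that runs entirely in Identity.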

Jules