[Haskell-cafe] mdo with multiple values
Heinrich Apfelmus
apfelmus at quantentunnel.de
Thu Jan 16 10:34:52 UTC 2014
Joachim Breitner wrote:
> the background is a binary format assembler, so you can think of the
> monad as the Put monad, only sufficiently lazy to admit a useful
> MonadFix instance. Then one can do nice things like
>
> mdo
> putWord32 offset
> putBS someMoreHeaderData
> ...
> offset <- getCurrentOffset
> putBS byteString1
>
> where I conceptually use the offset before it is known.
>
> [..]
> Now I try to generalize that to a list of Bytestrings, and just from the
> looks of it, this is what you want to do:
>
> mdo
> mapM_ putWord32 offsets
> putBS someMoreHeaderData
> ...
> offsets <- forM byteStrings $ \b ->
> offset <- getCurrentOffset
> putBS b
> return offset
>
> but that will <<loop>>.
So, the overall idea is that we can reserve 32 bits for the offset and
defer its calculation to later. In other words, putWord32 can "forward
the file pointer" long before it actually writes the word.
Now, what's the problem with the `offsets` list? I think it may actually
work in some monads that have a good MonadFix instance, but often, the
MonadFix instances tend to be poor. One noteable example of the latter
is the IO monad, and I have found the following rule to be very useful:
To use value recursion in the IO monad, make sure
that the sequence of *computations* is always
defined in advance, only the values that these
computations operate on may be lazy.
This rule is best explained with the problem at hand, by asking the
following question: how many `putWord32` instructions does the following
expression generate?
mapM_ putWord32 offsets
Clearly, it should be `length offsets` many, but the problem is that
this number is not known until the spine of `offsets` has been
calculated. Now, some monads may be able to do that, but the rule for IO
is that the number of `putWord32` must be known before the later
definition of `offsets` can yield a value different from _|_ at all.
Now, if a recursive expression works in the IO monad, then it will work
in any other monad as well.
Fortunately, we do known the spine of `offsets` in advance: it has the
same spine as `byteStrings`. The solution is to make that explicit in
the code, by using a lazy `zip`:
...
mapM_ putWord32 (offsets `spine` byteStrings)
...
where
spine :: [a] -> [void] -> [a]
spine ~[] [] = []
spine ~(x:xs) (y:ys) = x : tag xs ys
This code takes the spine of `byteStrings` and fills it with values from
`offsets`.
It may also be possible to get rid of the <<loop>> by defining the monad
appropriately: one pass calculates all offsets, a second pass calculates
the actual output. The spine of `offsets` does not depend on the offset
calculation, so there is a good chance that this might work recursively.
Best regards,
Heinrich Apfelmus
--
http://apfelmus.nfshost.com
More information about the Haskell-Cafe
mailing list