A fancier Get monad or two (a la binary and binary-strict)

Chris Kuklewicz haskell at list.mightyreason.com
Wed Jul 30 06:14:55 EDT 2008


Summary: I have made the simplified version Duncan speculates about...

Duncan Coutts wrote:
> Yep, definitely interested. Sounds like we could make something that
> would satisfy the needs of existing users of the binary and
> binary-strict packages.

I see this as incremental development of incremental get.  The comments in 
binary-strict's code point to one additional useful feature:  When the parser 
suspends and asks for more input it could potentially also return a list or 
Sequence of the output-so-far.

I believe this is possible to add to my existing code (even the simplified 
version below), so there will be an even fancier version eventually.  This would 
make the parser a controllable part of a pipeline.

> We'll have to look closely at the performance costs of the new features
> but my intuition is that a non-transformer but continuation based
> version that has error handling (and plus/alternative) and can request
> more input should have minimal cost.
> 
> Duncan
> 

To modify down "MyGet.hs" to produce that type is a matter of using the delete 
key (which I have done, see below).  I only have my Apple powerpc/G4 laptop to 
run Haskell, this and the lack of either need or available time means I will not 
be making performance measurements.  I have written for the language shootout 
before and I had binary's Get.hs to look at, and so I claim my code has no 
show-stopping performance killers in it.  A bit of !strictness and even more 
INLINE pragmas should be all it needs.

And so I just took such a knife to MyGet.hs to make MyGetSimplified.hs.
It is at
http://darcs.haskell.org/packages/protocol-buffers/Text/ProtocolBuffers/
with the other files.

The simplified form of the data definitions mostly fits in this email:

newtype Get a = Get {
   unGet :: forall b.    -- the forall hides the CPS style
            Success b a  -- main continuation
         -> S            -- parser state
         -> FrameStack b -- error handler stack
         -> Result b     -- operation
     }

type Success b a = (a -> S -> FrameStack b -> Result b)

data S = S { top :: {-# UNPACK #-} !S.ByteString
            , current :: {-# UNPACK #-} !L.ByteString
            , consumed :: {-# UNPACK #-} !Int64
            } deriving Show

data FrameStack b = [snip details]

data Result a = Failed {-# UNPACK #-} !Int64 String
   -- the Int64 is amount consumed successfully
               | Finished {-# UNPACK #-} !L.ByteString {-# UNPACK #-} !Int64 a
   -- the bytestring is the unconsumed part, the Int64 is amount consumed
               | Partial (Maybe L.ByteString -> Result a)
   -- passing Nothing indicates that there will never be more input

This could be streamlined further by making (FrameStack b) a field of S.
The MonadError/Plus/Alternative should still work.  It will still suspend and 
resume.
A few more short functions and this will have the same exported signatures as 
binary's Get.hs.

Cheers,
   Chris


More information about the Libraries mailing list