[Haskell-cafe] Haskell poker server

Wed Aug 31 07:48:12 EDT 2005

On Tue, Aug 30, 2005 at 01:31:22PM +0200, Joel Reymont wrote:
> Can I beg for examples?

This is from some old code, slightly polished for presentation - the
code for parsing DNS domain name label in DNS packets:

    parseLabel :: CharParser st Label
    parseLabel = (<?> "label") $ do
        len <- byte
        guard (len <= 63)
        s <- count (fromIntegral len) anyChar
        return $! stringToLabel s

Today I would rather process Word8 lists:

    type ByteParser a = GenParser Word8 st a

    parseLabel :: ByteParser st Label

Here is a parser for the whole DNS message:

    parseMessage :: CharParser st Domain -> CharParser st Message
    parseMessage pDomain =
        do
            msgid       <- parseMsgID
            header      <- parseMsgHeader

            qdcount     <- fmap fromIntegral beWord16
            ancount     <- fmap fromIntegral beWord16
            nscount     <- fmap fromIntegral beWord16
            arcount     <- fmap fromIntegral beWord16

            questions   <- count qdcount (parseQuestion pDomain)
            answers     <- count ancount (parseRR pDomain)
            auth        <- count nscount (parseRR pDomain)
            additional  <- count arcount (parseRR pDomain)

            return (Message {
                        msgID           = msgid,
                        msgHeader       = header,
                        msgQuestions    = questions,
                        msgAnswers      = answers,
                        msgAuth         = auth,
                        msgAdditional   = additional
                    })

The pDomain parameter is for dealing with DNS domain suffix compression
- parsing a domain name may require jumping to an earlier part of the
message. Today I would either use a MonadReader to hide this parameter,
or a different parser monad with random access.

In another application for reading some binary files I defined a
BinaryParser monad, with one implementation using Parsec and another
using unboxed arrays. IIRC, the implementation using UArrays was about
30-60 times faster than the one using parsec, probably because Parsec
uses lists. Surprisingly, the biggest speed boost was caused (again IIRC)
by writing a specialised "times" implementation for the UArray version.

    class (Functor m, Monad m) => BinaryParser m where
        byte    :: m Word8

        bytes   :: Int -> m (UArray Int Word8)
        bytes n = do
            l <- count n byte
            return $! (listArray (0, n-1) l)

        int8    :: m Int8
        int16   :: m Int16
        int32   :: m Int32
        int64   :: m Int64

        word16  :: m Word16
        word32  :: m Word32
        word64  :: m Word64

        word16 = fmap fromIntegral int16
        word32 = fmap fromIntegral int32
        word64 = fmap fromIntegral int64

        asciiz :: m (UArray Int Word8)
        asciiz = do
            s <- decodeStr []
            return $! (listArray (0, length s - 1) s)
          where
            decodeStr acc = do
                b <- byte
                if b == 0
                    then return (reverse acc)
                    else decodeStr (b : acc)

        eof     :: m ()
        atEof   :: m Bool

        times   :: Int -> m a -> m [a]
        times = count

Again, I would do it a bit differently today. For example, this
interface says nothing about endianness.

Recently I've used a different interface for a different protocol. This
is a state monad where the state is a slice of the buffer:

    newtype BufferReader a

    instance Monad BufferReader
    instance MonadZero BufferReader

    byteAt :: Int -> BufferReader Word8

    -- changes the state for subsequent computation
    skip :: Int -> BufferReader ()

    -- runs the given computation in a slice
    slicing :: Int -> Int -> BufferReader a -> BufferReader a
    slicing start len br = ...

    many :: BufferReader a -> BufferReader [a]

    runBufferReader :: WithBuffer b => b -> BufferReader a -> IO (Either String a)

darcs' FastPackedString module, which was recently put into a separate
library by Don Stewart (http://www.cse.unsw.edu.au/~dons/code/fps),
could be nice for parsing binary messages, because:
- it is (supposed to be) fast and memory efficient
- supports fast (O(1)) random access and slices (tailPS, initPS, dropPS, takePS)
  with purely functional interface
- is based on bytes
But I am slightly worried about the possibility of space leaks, when a
small slice holds the entire message in memory.

Random thoughts:
- I am often using Template Haskell to automate the generation of
  parsers and unparsers (it helps tremendously when you have many
  data types with many fields to parse/unparse, and even more if the
  protocol changes often)
- there are some libraries for dealing with serialisation in Haskell,
  for example : http://www.cs.helsinki.fi/u/ekarttun/SerTH/
- there is an attempt to write an operating system in Haskell:
    http://www.cse.ogi.edu/~hallgren/House/
  you can check how it handles IP4/UDP/TCP headers

Best regards
Tomasz