[Haskell-cafe] Reading/writing packed bytes from file

Jefferson Heard jeff at renci.org
Tue Jun 19 16:20:28 EDT 2007

I've read the documentation for some of the marshalling packages out
there for Haskell, and I'm left confused as to which one I should be
using and how to actually do what I want to do.  I have a file, a
little over 2 GB, of packed data in the format

(recordcount) records of:

4-byte int (count),
(count) 2-byte unsigned shorts,
(count) 4-byte floats

all in little-endian order.  What I want to do is read each record
(lazily), and unpack it into Data.IntMap.IntMap Float where the unsigned
shorts become the keys and the 4-byte floats become the values.
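
A minimal sketch of one way to decode this layout, assuming the binary
package's Data.Binary.Get (with getFloatle, available in binary 0.8.4 and
later). The names getRecord and getRecords are hypothetical. It also assumes
the count is never negative and that the file is simply records back to back,
with no separate header in front of them.

```haskell
import Control.Monad (replicateM)
import Data.Binary.Get (Get, getFloatle, getWord16le, getWord32le, runGetOrFail)
import qualified Data.ByteString.Lazy as BL
import qualified Data.IntMap as M

-- One record: a 4-byte count, then `count` unsigned shorts (the keys),
-- then `count` 4-byte floats (the values), all little-endian.
-- Assumption: the count is non-negative, so it is read as a Word32.
getRecord :: Get (M.IntMap Float)
getRecord = do
  n    <- fromIntegral <$> getWord32le
  keys <- replicateM n getWord16le
  vals <- replicateM n getFloatle
  return (M.fromList (zip (map fromIntegral keys) vals))

-- Decode records one at a time from a lazy ByteString. The result list is
-- produced lazily, so only the part of the input that is demanded is read.
-- Assumption: a truncated trailing record silently ends the list.
getRecords :: BL.ByteString -> [M.IntMap Float]
getRecords bs
  | BL.null bs = []
  | otherwise  =
      case runGetOrFail getRecord bs of
        Left _             -> []
        Right (rest, _, r) -> r : getRecords rest
```

Used as `getRecords <$> BL.readFile path`, this keeps the file read lazy, so
records are decoded as the consumer asks for them.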

Then I want to do a lot of interesting processing which we'll skip here,
and write back out packed data to a file in the format of

4-byte float,
4-byte float,
4-byte float

for each record. I need these output records to be four-byte C floats.
I've gotten as far as datatypes and a couple of signatures, but I can't
figure out the functions themselves that go with the signatures, and
then again, maybe I have the signatures wrong.  

import qualified Data.IntMap as M
import qualified Data.ByteString.Lazy.Char8 as B

type InputRecord = M.IntMap Float
type OutputRecord = (Float, Float, Float)

-- open a file as a lazy ByteString and break up the individual records
-- by reading the count variable, then reading count * (sizeof short +
-- sizeof float) bytes into a lazy ByteString.
readRawRecordsFromFile :: String -> IO [B.ByteString] 

-- take a bytestring as returned by readRawRecordsFromFile and turn it
-- into a map.
decodeRawRecord :: B.ByteString -> M.IntMap Float
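
A possible shape for the first signature, offered as a sketch rather than a
definitive answer. It assumes Data.ByteString.Lazy (the non-Char8 module, since
the data is binary) and the binary package's Data.Binary.Get. The helper
splitRawRecords is a hypothetical name. Each chunk keeps its 4-byte count
prefix, so it can be passed to any decoder of type
B.ByteString -> M.IntMap Float.

```haskell
import Data.Binary.Get (getWord32le, runGet)
import qualified Data.ByteString.Lazy as B
import Data.Int (Int64)

-- Split the file contents into one lazy chunk per record.
-- A record occupies 4 bytes for the count plus count * (2 + 4) bytes for
-- the unsigned shorts and the floats.
-- Assumption: the count is non-negative; fewer than 4 remaining bytes
-- ends the list.
splitRawRecords :: B.ByteString -> [B.ByteString]
splitRawRecords bs
  | B.length (B.take 4 bs) < 4 = []
  | otherwise                  = record : splitRawRecords rest
  where
    count :: Int64
    count          = fromIntegral (runGet getWord32le bs)
    recordSize     = 4 + count * (2 + 4)
    (record, rest) = B.splitAt recordSize bs

-- B.readFile reads lazily, so chunks are only pulled from disk as the
-- resulting list is consumed.
readRawRecordsFromFile :: FilePath -> IO [B.ByteString]
readRawRecordsFromFile path = splitRawRecords <$> B.readFile path
```

The guard only measures the first four bytes, which avoids forcing the whole
file just to check whether another record follows.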

Can anyone help with how to construct these functions?  I'm going to
have to make a few passes over this file, so I'd like the IO to be as
fast as Haskelly possible.
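
For the output side described above (three 4-byte floats per record), here is a
sketch assuming the bytestring package's Data.ByteString.Builder. The names
encodeRecord, encodeRecords and writeRecords are hypothetical, and
little-endian output is an assumption carried over from the input format. GHC's
Float is an IEEE single-precision value, which matches a four-byte C float on
common platforms.

```haskell
import Data.ByteString.Builder (Builder, floatLE, hPutBuilder, toLazyByteString)
import qualified Data.ByteString.Lazy as BL
import System.IO (IOMode (WriteMode), withBinaryFile)

type OutputRecord = (Float, Float, Float)

-- Encode one record as three little-endian IEEE single-precision floats.
encodeRecord :: OutputRecord -> Builder
encodeRecord (a, b, c) = floatLE a <> floatLE b <> floatLE c

-- Pure variant, handy for inspecting the bytes that would be written.
encodeRecords :: [OutputRecord] -> BL.ByteString
encodeRecords = toLazyByteString . foldMap encodeRecord

-- Write all records to a file in binary mode; the Builder fills the
-- handle's buffer directly instead of building one ByteString per record.
writeRecords :: FilePath -> [OutputRecord] -> IO ()
writeRecords path recs =
  withBinaryFile path WriteMode $ \h ->
    hPutBuilder h (foldMap encodeRecord recs)
```

Each record takes exactly 12 bytes, so a reader on the C side can treat the
file as a flat array of float triples.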

-- Jeff
