[Haskell-cafe] Reading/writing packed bytes from file

Jefferson Heard jeff at renci.org
Wed Jun 20 09:54:33 EDT 2007


What about the Data.Binary module from the Hackage database?  I can call
C, no problem, but I hate to do something that's already been done.
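
For reference, here is a minimal sketch of how the record format quoted below might be handled with Data.Binary.Get and Data.Binary.Put. It rests on several assumptions that the thread does not confirm: the file has no leading record-count header (records simply repeat until end of input), the count field is non-negative, and the installed binary package is recent enough (0.8.4 or later) to provide getFloatle and putFloatle. The names getRecord, decodeRecords, putOutputRecord and encodeOutput are hypothetical.

```haskell
import           Control.Monad        (replicateM)
import           Data.Binary.Get      (Get, getFloatle, getWord16le,
                                       getWord32le, runGetOrFail)
import           Data.Binary.Put      (Put, putFloatle, runPut)
import qualified Data.ByteString.Lazy as BL
import qualified Data.IntMap          as M

type InputRecord  = M.IntMap Float
type OutputRecord = (Float, Float, Float)

-- One record: a 4-byte count, then `count` little-endian shorts (keys),
-- then `count` little-endian floats (values).
getRecord :: Get InputRecord
getRecord = do
  n      <- fromIntegral <$> getWord32le
  keys   <- replicateM n getWord16le
  values <- replicateM n getFloatle
  return (M.fromList (zip (map fromIntegral keys) values))

-- Decode records one at a time from a lazy ByteString. The result list is
-- produced lazily; decoding stops at end of input or at the first record
-- that cannot be parsed (for example a truncated one).
decodeRecords :: BL.ByteString -> [InputRecord]
decodeRecords bytes
  | BL.null bytes = []
  | otherwise =
      case runGetOrFail getRecord bytes of
        Left _             -> []
        Right (rest, _, r) -> r : decodeRecords rest

-- Three little-endian 4-byte floats per output record.
putOutputRecord :: OutputRecord -> Put
putOutputRecord (a, b, c) = putFloatle a >> putFloatle b >> putFloatle c

encodeOutput :: [OutputRecord] -> BL.ByteString
encodeOutput = runPut . mapM_ putOutputRecord
```

With these definitions, the input could be obtained from BL.readFile and the result of encodeOutput passed to BL.writeFile, with the actual processing step in between.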

On Wed, 2007-06-20 at 12:02 +1000, Donald Bruce Stewart wrote:
> jeff:
> > I've read the documentation for some of the marshalling packages out
> > there for Haskell, and I'm left confused as to which one I should be
> > using and how to actually do what I want to do.   I have a file, a
> > little over 2 GB, of packed data in the format
> > 
> > (recordcount) records of:
> > 
> > 4-byte int (count),
> > (count) 2-byte unsigned shorts,
> > (count) 4-byte floats
> > 
> > all in little-endian order.  What I want to do is read each record
> > (lazily), and unpack it into Data.IntMap.IntMap Float where the unsigned
> > shorts become the keys and the 4-byte floats become the values.
> > 
> > Then I want to do a lot of interesting processing which we'll skip here,
> > and write back out packed data to a file in the format of
> > 
> > 4-byte float,
> > 4-byte float,
> > 4-byte float
> > 
> > for each record. I need these output records to be four-byte C floats.
> > I've gotten as far as datatypes and a couple of signatures, but I can't
> > figure out the functions themselves that go with the signatures, and
> > then again, maybe I have the signatures wrong.  
> > 
> > -- 
> > import qualified Data.IntMap as M
> > import qualified Data.ByteString.Lazy.Char8 as B
> > 
> > type InputRecord = M.IntMap Float
> > type OutputRecord = (Float, Float, Float)
> > 
> > -- open a file as a lazy ByteString and break up the individual records
> > -- by reading the count variable, then reading
> > -- count * (sizeof short + sizeof float) bytes into a lazy ByteString.
> > readRawRecordsFromFile :: String -> IO [B.ByteString] 
> > 
> > 
> > -- take a bytestring as returned by readRawRecordsFromFile and turn it
> > -- into a map.
> > decodeRawRecord :: B.ByteString -> M.IntMap Float
> > --
> > 
> > Can anyone help with how to construct these functions?  I'm going to
> > have to make a few passes over this file, so I'd like the IO to be as
> > fast as Haskelly possible.
> > 
> > -- Jeff
> 
> Data.ByteString.Lazy.Char8.readFile should suffice for the IO.
> Then use drop/take to split the file into pieces, if you know the length
> of each field.
> 
> For converting ByteString chunks to Floats, I'd probably call C for that.
> 
> -- Don
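
The readFile plus take/drop approach described above could be sketched roughly as follows. This is an illustration under assumptions, not Don's code: it assumes well-formed input with no leading record-count header, uses the Word8 interface of Data.ByteString.Lazy (the same ByteString type as the Char8 module), and converts bytes to Float in Haskell with castWord32ToFloat from GHC.Float (base 4.10 or later) rather than calling C. The helpers leWord, chunksOf and splitRecords are hypothetical names.

```haskell
import           Data.Bits            (shiftL, (.|.))
import qualified Data.ByteString.Lazy as BL
import           Data.Int             (Int64)
import qualified Data.IntMap          as M
import           Data.Word            (Word32)
import           GHC.Float            (castWord32ToFloat)

-- Combine up to four little-endian bytes into a Word32.
leWord :: BL.ByteString -> Word32
leWord = BL.foldr (\b acc -> (acc `shiftL` 8) .|. fromIntegral b) 0 . BL.take 4

-- Split a lazy ByteString into pieces of the given width.
chunksOf :: Int64 -> BL.ByteString -> [BL.ByteString]
chunksOf w bs
  | BL.null bs = []
  | otherwise  = let (h, t) = BL.splitAt w bs in h : chunksOf w t

-- Cut the input into raw record bodies, dropping each 4-byte count field.
-- Each body holds count * (2 + 4) bytes: all keys first, then all values.
splitRecords :: BL.ByteString -> [BL.ByteString]
splitRecords bs
  | BL.length header < 4 = []
  | otherwise            = body : splitRecords rest
  where
    (header, afterHeader) = BL.splitAt 4 bs
    n                     = fromIntegral (leWord header) :: Int64
    (body, rest)          = BL.splitAt (n * 6) afterHeader

readRawRecordsFromFile :: FilePath -> IO [BL.ByteString]
readRawRecordsFromFile path = splitRecords <$> BL.readFile path

-- Turn one raw record body into a map from unsigned short to float.
decodeRawRecord :: BL.ByteString -> M.IntMap Float
decodeRawRecord body = M.fromList (zip keys values)
  where
    n                    = BL.length body `div` 6
    (keyBytes, valBytes) = BL.splitAt (n * 2) body
    keys                 = map (fromIntegral . leWord) (chunksOf 2 keyBytes)
    values               = map (castWord32ToFloat . leWord) (chunksOf 4 valBytes)
```

Because BL.readFile reads lazily and splitRecords only forces as much of the input as each record needs, records can be consumed one at a time without loading the whole file, as long as nothing holds on to the head of the list.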


