[Haskell-cafe] Re: [Haskell] reading binary files
Dmitry V'yal
akamaus at gmail.com
Fri Apr 7 05:44:23 EDT 2006
Hello, Bulat
I'm currently working on some kind of program for analysing FAT partitions.
Don't ask why did I chose to implement it in Haskell :) Just for fun.
My program needs to read scattered chunks of binary data from a huge file and to
do a good amount of deserialisation.
I implemented basic functionality using Handles and Ptr's and now I'm starting
to regret for it. I have some pieces of code like:
type PtrAdvancer a b = StateT (Ptr a) IO b
peek_one :: Storable b => PtrAdvancer a b
peek_one = do
p <- get
res <- lift $ peek $ castPtr p
put $ plusPtr p $ sizeOf res
return res
peek_many :: Storable b => Int -> PtrAdvancer a [b]
peek_many 0 = return []
peek_many n = do
first <- peek_one
rest <- peek_many $ n-1
return $ first:rest
data DirEntry = DirEntry
{ name :: String,
attr :: Word8,
crt_time_tenth :: Word8,
crt_time :: Word16,
crt_data :: Word16,
lst_acc_data :: Word16,
wrt_time :: Word16,
wrt_date :: Word16,
fst_cluster :: Word32,
file_size :: Word32
} deriving Show
instance Storable DirEntry where
sizeOf _ = 32
alignment _ = 32
peek = evalStateT peek_dir_entry
peek_dir_entry = do
n <- peek_many 11 :: PtrAdvancer a [Word8]
at <- peek_one
peek_one :: PtrAdvancer a Word8
ctt <- peek_one
ct <- peek_one
cd <- peek_one
lad <- peek_one
fch <- peek_one :: PtrAdvancer a Word16
wt <- peek_one
wd <- peek_one
fcl <- peek_one :: PtrAdvancer a Word16
fs <- peek_one
return $ DirEntry (words_to_str n) at ctt ct cd lad wt wd
((fromIntegral fch `shiftL` 16) + fromIntegral fcl) fs
or:
read_cluster_chain32 :: Handle -> FatAddress -> Cluster -> IO [Cluster]
read_cluster_chain32 h start cluster = do
allocaBytes 4 $ \p -> chain32' p cluster True
where
chain32' p c need_seek = do
when need_seek $ hSeek h AbsoluteSeek (fromIntegral $ start + c * 4)
hGetBuf h p 4
val <- peek p :: IO Word32
let val28 = val .&. 0x0fffffff
case val28 of
0x0 -> return []
0x0fffffff -> return [c]
otherwise -> do rest <- if c+1 == val28 then chain32' p val28 False
else chain32' p val28 True
return $ c:rest
It works with a mediocre speed (about 10Mb/s when extracting files), but design
is ugly IMO. For example I need to write twice as much number of lines of
marshalling code compared to C. For data type declaration and then for Storable
instance. Is there a way to avoid it?
>
> with my lib, you can either read data directly from file (through
> implicit 512-byte buffer) or read whole file into the automatically
> allocated buffer with `readFromFile`. what is better - depends on what
> you plan to do with rest of file
>
Now I'm going to rewrite my code to make use of io library. So my question is
whether your library is well suited for such application (frequent positioning
and reading small pieces of data).
More information about the Haskell-Cafe
mailing list