[Haskell-cafe] representation on persistent store question
Antoine Latter
aslatter at gmail.com
Fri Jan 2 01:24:54 EST 2009
On Jan 1, 2009 11:50pm, "Galchin, Vasili" <vigalchin at gmail.com> wrote:
> it is a bioinformatics standard .. . I am writing on this newsgroup in order to try to be objective to get a "correct" and elegant answer .. in any case I am helping on the bioinformatics code (you can see on Hackage). I am trying to finish the 2Bit file format code ... it seems to me that bioinformatics as an area is not clearly defined .... e.g. it is unclear to me whether "offset" is a marshalled/serialized concept or or unmarshalled/unserialized concept ..... this distinction is very important .... I will have to think about more myself!
>
>
> Regards, Vasili
>
Here's some code using Data.Binary to store data as offsets into a
byte array. I haven't tested it too much, so it may have bugs. Maybe
there's some inspiration in there.
-Antoine
>>>>
import Data.Binary
import Data.Binary.Get
import Data.Binary.Put
import Data.ByteString.Lazy (ByteString)
import qualified Data.ByteString.Lazy as B
data TestStruct = TestStruct
{ property1 :: ByteString
, property2 :: ByteString
, property3 :: ByteString
}
deriving Show
{-
The serialized format looks like (all big-endian):
* first offset into data block (Word32)
* second offset into data block (Word32)
* third offset into data block (Word32)
* length of bnary data block (Word32)
* binary data block (Arbitrary binary data)
-}
instance Binary TestStruct where
put struct =
let data1 = property1 struct
data2 = property2 struct
data3 = property3 struct
dataBlock = data1 `B.append` data2 `B.append` data3
offset1 = 0
offset2 = offset1 + B.length data1
offset3 = offset2 + B.length data2
in do
putWord32be $ fromIntegral offset1
putWord32be $ fromIntegral offset2
putWord32be $ fromIntegral offset3
putWord32be $ fromIntegral $ B.length dataBlock
putLazyByteString dataBlock
get = do
offset1 <- getWord32be
offset2 <- getWord32be
offset3 <- getWord32be
dataBlockLength <- getWord32be
dataBlock <- B.drop (fromIntegral offset1) `fmap`
getLazyByteString (fromIntegral dataBlockLength)
let (data1, rest1) =
B.splitAt (fromIntegral $ offset2 - offset1) dataBlock
(data2, rest2) =
B.splitAt (fromIntegral $ offset3 - offset2 - offset1) rest1
data3 = rest2
return $ TestStruct data1 data2 data3
<<<<<
More information about the Haskell-Cafe
mailing list