[GHC] #13298: Compact API design improvements
GHC
ghc-devs at haskell.org
Sun Feb 19 00:54:14 UTC 2017
#13298: Compact API design improvements
-------------------------------------+-------------------------------------
Reporter: ezyang | Owner: ezyang
Type: feature | Status: new
request |
Priority: normal | Milestone: 8.2.1
Component: | Version: 8.0.1
libraries/compact |
Keywords: | Operating System: Unknown/Multiple
Architecture: | Type of failure: None/Unknown
Unknown/Multiple |
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
I took a look at the compact API today and I realized that there are a lot
of improvements we should make to it:
1. `fmap getCompact . compact` is a pretty common thing to do, if you
don't actually care about the compact pointer. We should make a helper
function for this.
2. The `SerializedCompact` data structure needs a bunch of instances.
Especially, a user who wants to serialize the compact region needs to save
this metadata somewhere, but we don't offer any help for doing this.
3. `importCompact` will always print a message to stderr saying pointer
fixup is happening. Need to be able to suppress this message.
4. The serialization API is really unsafe; we should make it more safe by
default by including some sort of fingerprint, at least.
5. There should be a convenience function for serializing to and from a
file, and serializing to and from a handle. RDMA is always going to be
delicate business (so keep the old API around) but for these cases we
should pave the cowpaths.
Here is some sample code that might help:
{{{
{-# LANGUAGE ScopedTypeVariables #-}
import System.Environment (getArgs)
import qualified Data.Set as Set
import System.IO
import Data.Compact
import Data.Compact.Serialized
import Foreign.Ptr
import Foreign.Storable
import Foreign.Marshal.Alloc
import Control.Monad
main = do
[dict_file, out_file] <- getArgs
dict <- readFileLatin1 dict_file
c <- compact (Set.fromList (words dict))
withBinaryFile out_file WriteMode $ \h ->
withSerializedCompact c $ \sc -> do
-- Write out the metadata header
hPutStorable h (serializedCompactRoot sc)
forM_ (serializedCompactBlockList sc) $ \(ptr, l) -> do
hPutStorable h ptr
hPutStorable h l
hPutStorable h nullPtr
-- Write out the payload
forM_ (serializedCompactBlockList sc) $ \(ptr, l) ->
hPutBuf h ptr (fromIntegral l)
mb_r <- withBinaryFile out_file ReadMode $ \h -> do
-- Read out the metadata header
root <- hGetStorable h
let go h xs = do
ptr <- hGetStorable h
if ptr == nullPtr
then return (reverse xs)
else do l <- hGetStorable h
go h ((ptr, l):xs)
blocks <- go h []
let sc = SerializedCompact {
serializedCompactBlockList = blocks,
serializedCompactRoot = root
}
-- Read the payload into memory
importCompact sc $ \ptr l -> void $ hGetBuf h ptr (fromIntegral l)
print (fmap getCompact mb_r == Just (getCompact c))
hPutStorable :: forall a. Storable a => Handle -> a -> IO ()
hPutStorable h a =
alloca $ \ptr -> do
poke ptr a
hPutBuf h ptr (sizeOf (undefined :: a))
hGetStorable :: forall a. Storable a => Handle -> IO a
hGetStorable h =
alloca $ \ptr -> do
hGetBuf h ptr (sizeOf (undefined :: a))
peek ptr
readFileLatin1 f = do
h <- openFile f ReadMode
hSetEncoding h latin1
hGetContents h
}}}
I'm happy to do these but I want to make sure I'm not stepping on FB's
toes, also I don't know if bgamari wants to take patches along these lines
so late in the release cycle.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13298>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list