Unsafe aspects of ByteString
Donald Bruce Stewart
dons at cse.unsw.edu.au
Mon Jan 29 22:08:50 EST 2007
iavor.diatchki:
> Hello,
> The "packCString" function (and other similar functions) in the
> ByteString library breaks referential transparency, which is one of the
> big selling points of Haskell (and its libraries).
The Data.ByteString functions relating to CString have now been modified
as follows, in the darcs repository. These changes will be propagated
into base in due course.
Public CString functions:
Data.ByteString:
packCString :: CString -> IO ByteString
packCStringLen :: CStringLen -> IO ByteString
useAsCString :: ByteString -> (CString -> IO a) -> IO a
useAsCStringLen :: ByteString -> (CStringLen -> IO a) -> IO a
These are safe, copying functions. Modifying the CString can never affect the
Haskell ByteString, or any substrings of it.
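As a small illustration of the copying behaviour described above, the sketch
below writes into the CString handed out by useAsCString and then inspects the
original ByteString. The helper name scribbleOnCopy is hypothetical and exists
only for this example; it assumes a bytestring release that exports
useAsCString from Data.ByteString with the signature listed above.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C
import Foreign.C.Types (CChar)
import Foreign.Storable (poke)

-- Hypothetical helper for this sketch: overwrite the first byte of the
-- CString handed out by useAsCString, then return the original ByteString.
scribbleOnCopy :: B.ByteString -> IO B.ByteString
scribbleOnCopy s = do
  -- The callback receives a private copy, so this poke touches only the copy.
  B.useAsCString s $ \p -> poke p (toEnum 97 :: CChar)
  return s

main :: IO ()
main = do
  s' <- scribbleOnCopy (C.pack "Hello")
  print s'   -- the ByteString still reads "Hello"
```

The copy costs O(n) per call, which is the price paid for keeping the
ByteString immutable from the C side.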
Private, unsafe functions, available only by importing Data.ByteString.Base.
This is a dangerous but efficient API, suitable for constant CStrings only (the
CString functions may also require null termination):
unsafeUseAsCString :: ByteString -> (CString -> IO a) -> IO a
unsafeUseAsCStringLen :: ByteString -> (CStringLen -> IO a) -> IO a
unsafePackCString :: CString -> IO ByteString
unsafePackCStringLen :: CStringLen -> IO ByteString
unsafePackMallocCString :: CString -> IO ByteString
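To make the "constant CStrings only" restriction concrete, here is a hedged
sketch contrasting the zero-copy unsafePackCStringLen with the copying
packCStringLen. It assumes a current bytestring release, where the unsafe
functions are exported from Data.ByteString.Unsafe rather than the
Data.ByteString.Base module named in this message. The helper name
sharedEqualsCopied is hypothetical.

```haskell
import qualified Data.ByteString as B
import Data.ByteString.Unsafe (unsafePackCStringLen)
import Foreign.C.String (withCStringLen)

-- Hypothetical helper for this sketch: build one ByteString that shares the
-- C buffer and one that copies it, and compare them while the buffer is
-- still alive and unmodified.
sharedEqualsCopied :: String -> IO Bool
sharedEqualsCopied str =
  withCStringLen str $ \cl -> do
    shared <- unsafePackCStringLen cl  -- no copy: aliases the C buffer
    copied <- B.packCStringLen cl      -- copy: independent of the C buffer
    -- 'shared' must not escape this callback, because withCStringLen frees
    -- the buffer on exit; only the Bool result leaves.
    return (shared == copied)

main :: IO ()
main = sharedEqualsCopied "Hello" >>= print
```

Only the copied value would be safe to return from the callback; the shared
one is valid just as long as the buffer stays allocated and unmodified.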
The documentation has also been extensively revised. In particular, all unsafe
functions contain text explaining in what way they are unsafe. For example:
unsafeUseAsCString :: ByteString -> (CString -> IO a) -> IO a
O(1) construction. Use a ByteString with a function requiring a CString.
This function does zero copying, and merely unwraps a ByteString to appear as a CString. It is
unsafe in two ways:
* After calling this function the CString shares the underlying byte
buffer with the original ByteString. Thus modifying the CString, either in
C, or using poke, will cause the contents of the ByteString to change,
breaking referential transparency. Other ByteStrings created by sharing
(such as those produced via take or drop) will also reflect these changes.
To avoid this, use useAsCString, which makes a copy of the original
ByteString.
* CStrings are often passed to functions that require them to be
null-terminated. If the original ByteString wasn't null terminated, neither
will the CString be. It is the programmer's responsibility to guarantee that
the ByteString is indeed null terminated. If in doubt, use useAsCString.
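One way to act on the null-termination caveat above is to check for a trailing
NUL before taking the zero-copy path. The sketch below is only an illustration:
withCStringChecked and cLength are hypothetical helpers, not part of the
library, and the import assumes a current bytestring release where
unsafeUseAsCString lives in Data.ByteString.Unsafe. The consumer only reads
the CString, so the sharing hazard does not arise here.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C
import Data.ByteString.Unsafe (unsafeUseAsCString)
import Foreign.C.String (CString)
import Foreign.Marshal.Array (lengthArray0)

-- Hypothetical helper (not part of bytestring): take the zero-copy path only
-- when the ByteString already ends in a NUL byte; otherwise fall back to the
-- copying useAsCString, which appends the terminator itself.
withCStringChecked :: B.ByteString -> (CString -> IO a) -> IO a
withCStringChecked bs act
  | not (B.null bs) && B.last bs == 0 = unsafeUseAsCString bs act
  | otherwise                         = B.useAsCString bs act

-- Read-only consumer: count bytes up to the first NUL, as C's strlen would.
cLength :: CString -> IO Int
cLength = lengthArray0 0

main :: IO ()
main = do
  n1 <- withCStringChecked (C.pack "Hello\0") cLength  -- zero-copy path
  n2 <- withCStringChecked (C.pack "Hello") cLength    -- copying path
  print (n1, n2)
```

Both calls report five bytes before the terminator; the difference is only
whether a copy was made to get there.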
The plain old Data.ByteString CString API should now be safe from FFI
manipulation. Note that Iavor's original demo looks like:
import qualified Data.ByteString as B
import Data.ByteString (packCString)
import Foreign.C.String
import Foreign

main = do x <- newCString "Hello"
          s <- packCString x
          let h1 = B.head s
          print s
          poke x (toEnum 97)
          print s
          let h2 = B.head s
          print h1
          print h2
And now produces:
$ runhaskell iavor.hs
"Hello"
"Hello"
72
72
No more ghostly telekinesis from the CString side!
Thanks to everyone for feedback and criticism.
-- Don