Support conversion from (UArray i Word8) to ShortByteString?
Viktor Dukhovni
ietf-dane at dukhovni.org
Sat Nov 11 21:20:52 UTC 2023
The 'ShortByteString' type in the "bytestring" package has seen some
singnificant improvement recently, and yet its API is still noticeably
limited in comparison to its pinned, I/O friendly 'ByteString' elder
sibling.
One of the limitations is that that there are fewer ways to construct a
'ShortByteString' object. One often has to restort to constructing a
pinned ByteString, and then copy. (There no "ST" Builders that write
to resizable MutableByteArrays instead of raw memory pointers).
Meanwhile, under-the covers, both the "UArray i Word8" type and
'ShortByteString' hold an immutable 'ByteArray', and the STUArray API
provides a flexible "UArray" construction interface.
Would it be reasonable to "bridge" the two APIs:
Data.Array.Unboxed: (re-export from Data.Array.Base)
import Data.Array.Byte
arrayToByteArray :: UArray i Word8 -> ByteArray
arrayToByteArray (UArray _ _ _ ba#) = ByteArray ba#
{-# INLINE arrayToByArrray #-}
Data.ByteString.Short: (re-export from Data.ByteString.Short.Internal)
byteArrayToShort :: ByteArray -> ShortByteString
byteArrayToShort = coerce
{-# INLINE byteArrayToShort #-}
It would then be possible to write:
short = byteArrayToShort $ arrayToByteArray $ runSTUArray m
where
m = do
a <- newArray (0, last) 0 -- zero fill
sequence_ [ writeArray a ix e | (ix, e) <- generator ]
and generate the bytes of a 'ShortByteString' from an arbitrary
computation, possibly merging multiple inputs into some bytes by using
the recently introduced "modifyArray" (or explicit read/modify/write).
Any thoughts about the wisdom or lack thereof of this proposal?
An alternative is to add a tailored version of the UArray and STUArray
APIs to 'MutableByteArray' by extending the rather limited API of
'Data.Array.Byte':
runMutableByteArray :: (forall s. ST s (MutableByteArray s))
-> ByteArray
runMutableByteArray m = runST $ m >>= freezeMutableByteArray
freezeMutableByteArray (MutableByteArray mba#) =
ST $ \s -> case unsafeFreezeByteArray# mba# s of
(# s', ba# #) -> (# s', ByteArray ba# #)
Since "Data.Array.Byte" is an "array" (rather than string) interface, it
could have a richer set of indexed read/write/modify primitives along
the lines of those found in "Data.Array.STUArray", but specialised to
'Word8' elements and implicit zero-based integer indexing.
The flexible construction I seek would then be via "Data.Array.Bytes",
rather than the somewhat too general index and value types from UArray.
short = byteArrayToShort $ runMutableByteArray m
where
m = do
a <- newByteArray size 0 -- 0 fill
sequence_ [ writeByteArray a ix e | (ix, e) <- generator ]
In this scenario, the indexed-mutation of ShortByteStrings under
construction, or indexed-mutation of copies for various transformations,
could live in Data.Array.Byte, with ShortByteString and various
applications leveraging the random-access mutation (and resizing, ...)
to implement higher level operations.
--
Viktor.
More information about the ghc-devs
mailing list