Support conversion from (UArray i Word8) to ShortByteString?

Matthew Craven clyring at gmail.com
Tue Nov 14 02:35:06 UTC 2023


Viktor,

Your proposed `arrayToByteArray` seems plausible.

Your proposed `byteArrayToShort` is just the newtype-constructor
`ShortByteString` which is exposed from Data.ByteString.Short since
bytestring-0.12.0.0.

> An alternative is to add a tailored version of the UArray and STUArray
> APIs to 'MutableByteArray' by extending the rather limited API of
> 'Data.Array.Byte':

How does this compare to the interface provided by
primitive:Data.Primitive.ByteArray? Their `runByteArray` is your
`runMutableByteArray`.


> The 'ShortByteString' type in the "bytestring" package has seen some
> singnificant improvement recently, and yet its API is still noticeably
> limited in comparison to its pinned, I/O friendly 'ByteString' elder
> sibling.
>
> One of the limitations is that that there are fewer ways to construct a
> 'ShortByteString' object.  One often has to restort to constructing a
> pinned ByteString, and then copy.  (There no "ST" Builders that write
> to resizable MutableByteArrays instead of raw memory pointers).
>
> Meanwhile, under-the covers, both the "UArray i Word8" type and
> 'ShortByteString' hold an immutable 'ByteArray', and the STUArray API
> provides a flexible "UArray" construction interface.
>
> Would it be reasonable to "bridge" the two APIs:
>
>     Data.Array.Unboxed:  (re-export from Data.Array.Base)
>         import Data.Array.Byte
>
>         arrayToByteArray :: UArray i Word8 -> ByteArray
>         arrayToByteArray (UArray _ _ _ ba#) = ByteArray ba#
>         {-# INLINE arrayToByArrray #-}
>
>     Data.ByteString.Short: (re-export from Data.ByteString.Short.Internal)
>         byteArrayToShort :: ByteArray -> ShortByteString
>         byteArrayToShort = coerce
>         {-# INLINE byteArrayToShort #-}
>
> It would then be possible to write:
>
>     short = byteArrayToShort $ arrayToByteArray $ runSTUArray m
>       where
>         m = do
>             a <- newArray (0, last) 0 -- zero fill
>             sequence_ [ writeArray a ix e | (ix, e) <- generator ]
>
> and generate the bytes of a 'ShortByteString' from an arbitrary
> computation, possibly merging multiple inputs into some bytes by using
> the recently introduced "modifyArray" (or explicit read/modify/write).
>
> Any thoughts about the wisdom or lack thereof of this proposal?
>
> An alternative is to add a tailored version of the UArray and STUArray
> APIs to 'MutableByteArray' by extending the rather limited API of
> 'Data.Array.Byte':
>
>     runMutableByteArray :: (forall s. ST s (MutableByteArray s))
>                         -> ByteArray
>     runMutableByteArray m = runST $ m >>= freezeMutableByteArray
>
>     freezeMutableByteArray (MutableByteArray mba#) =
>         ST $ \s -> case unsafeFreezeByteArray# mba# s of
>             (# s', ba# #) -> (# s', ByteArray ba# #)
>
> Since "Data.Array.Byte" is an "array" (rather than string) interface, it
> could have a richer set of indexed read/write/modify primitives along
> the lines of those found in "Data.Array.STUArray", but specialised to
> 'Word8' elements and implicit zero-based integer indexing.
>
> The flexible construction I seek would then be via "Data.Array.Bytes",
> rather than the somewhat too general index and value types from UArray.
>
>     short = byteArrayToShort $ runMutableByteArray m
>       where
>         m = do
>             a <- newByteArray size 0 -- 0 fill
>             sequence_ [ writeByteArray a ix e | (ix, e) <- generator ]
>
> In this scenario, the indexed-mutation of ShortByteStrings under
> construction, or indexed-mutation of copies for various transformations,
> could live in Data.Array.Byte, with ShortByteString and various
> applications leveraging the random-access mutation (and resizing, ...)
> to implement higher level operations.
>
> --
>     Viktor.


More information about the ghc-devs mailing list