[Haskell-cafe] Re: Copying Arrays

Thu May 29 15:03:39 EDT 2008

On Thu 2008-05-29 18:45, Chad Scherrer wrote:
> Jed Brown <jed <at> 59A2.org> writes:
> > Uh, ByteString is Unicode-agnostic.  ByteString.Char8 is not.  So why not do IO
> > with lazy ByteString and parse into your own representation (which might look a
> > lot like StorableVector)?
> 
> One problem you might run into doing it this way is if a wide character is split
> between two different arrays. In that case you have to do some post-processing
> to put the pieces back together. More efficient, I think, if you could force a
> given alignment when reading in the lazy bytestring. But there's not a way to do
> that, is there?

Unless you are reading UTF-32, you won't know what alignment you want until you
get there.  If I remember correctly, the default block size is nicely aligned so
that in practice you shouldn't have to worry about a chunk ending with weird
alignment.  However, such alignment issues shouldn't affect you unless you are
using the internal interface.  If you want fast indexing, you have to parse one
character at a time into a fixed-width representation anyway, so you won't gain
anything by unsafe casting (or memcpy) into your data structure.
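
A minimal sketch of the point about the public interface, assuming the input is
UTF-8.  The names decodeUtf8, multi and splitExample are made up for
illustration and are not part of any library.  The decoder does only light
validation (it does not reject overlong forms or surrogates), so treat it as a
sketch rather than a real codec.  It walks the lazy ByteString with uncons and
splitAt, so it never sees where one chunk ends and the next begins:

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- Decode UTF-8 one character at a time using only the public lazy
-- ByteString interface.  Malformed input becomes U+FFFD.
decodeUtf8 :: L.ByteString -> String
decodeUtf8 bs = case L.uncons bs of
  Nothing -> []
  Just (b, rest)
    | b < 0x80  -> chr (fromIntegral b) : decodeUtf8 rest   -- ASCII
    | b < 0xC0  -> '\xFFFD' : decodeUtf8 rest               -- stray continuation byte
    | b < 0xE0  -> multi 1 (b .&. 0x1F) rest                -- 2-byte sequence
    | b < 0xF0  -> multi 2 (b .&. 0x0F) rest                -- 3-byte sequence
    | b < 0xF8  -> multi 3 (b .&. 0x07) rest                -- 4-byte sequence
    | otherwise -> '\xFFFD' : decodeUtf8 rest               -- invalid lead byte

-- Consume n continuation bytes after a lead byte whose payload bits are
-- already masked out.  splitAt crosses chunk boundaries transparently.
multi :: Int -> Word8 -> L.ByteString -> String
multi n lead rest
  | ok        = chr code : decodeUtf8 rest'
  | otherwise = '\xFFFD' : decodeUtf8 rest   -- resume right after the lead byte
  where
    (conts, rest') = L.splitAt (fromIntegral n) rest
    isCont c = c .&. 0xC0 == 0x80
    code :: Int
    code = L.foldl' step (fromIntegral lead) conts
    step acc c = (acc `shiftL` 6) .|. fromIntegral (c .&. 0x3F)
    ok = L.length conts == fromIntegral n
         && L.all isCont conts
         && code <= 0x10FFFF

-- "hi λ€" encoded as UTF-8, deliberately chunked so that both wide
-- characters are split between two different strict chunks.
splitExample :: L.ByteString
splitExample = L.fromChunks
  [ S.pack [0x68, 0x69, 0x20, 0xCE]
  , S.pack [0xBB, 0xE2, 0x82]
  , S.pack [0xAC]
  ]
```

Here decodeUtf8 splitExample gives the same String as decoding the same bytes
held in a single chunk, so no post-processing at chunk boundaries is needed.
The cost is that every character is inspected once, which you would have to do
anyway to build something indexable.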

Jed