Is it safe to index a little bit out of bounds

Andrew Martin andrew.thaddeus at gmail.com
Thu Mar 8 19:24:45 UTC 2018


If you are looking for ASCII (or non-ASCII) characters in a byte array, you
build a word-sized mask like 0b1000000010000000... and test a whole word at
a time. However, on the last word, if you cannot read past the end, you have
to go one byte at a time. But if you can read past the end, you can mask out
the irrelevant bits and use the same mask as before.
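
A minimal sketch of the masking step, assuming 64-bit words that have
already been loaded (for example with indexWordArray#). The names
highBits, allAscii, validMaskLE, validMaskBE and lastWordAsciiLE are
hypothetical, chosen only for illustration. It does not answer whether the
out-of-bounds read itself is safe; it only shows how the garbage bytes
would be discarded once the word is in hand.

```haskell
import Data.Bits (complement, shiftL, shiftR, (.&.))
import Data.Word (Word64)

-- | 0x80 repeated in every byte: the high bit of each of the 8 bytes.
highBits :: Word64
highBits = 0x8080808080808080

-- | True when no byte in the word has its high bit set,
-- i.e. all 8 bytes are ASCII.
allAscii :: Word64 -> Bool
allAscii w = w .&. highBits == 0

-- | Mask keeping the first n bytes (in memory order) of a word loaded on
-- a little-endian machine, where the lowest address is the least
-- significant byte. Values of n outside 0..8 are clamped.
validMaskLE :: Int -> Word64
validMaskLE n
  | n >= 8    = maxBound
  | n <= 0    = 0
  | otherwise = (1 `shiftL` (8 * n)) - 1

-- | Same for a big-endian machine, where the lowest address is the most
-- significant byte.
validMaskBE :: Int -> Word64
validMaskBE n
  | n >= 8    = maxBound
  | n <= 0    = 0
  | otherwise = complement (maxBound `shiftR` (8 * n))

-- | ASCII check for a final, partially valid word (little-endian):
-- bytes past the end of the array are zeroed before the usual test.
lastWordAsciiLE :: Int -> Word64 -> Bool
lastWordAsciiLE n w = allAscii (w .&. validMaskLE n)
```

For the 19-byte array in the quoted message, the word at index 2 has
19 - 16 = 3 valid bytes, so the last word would be checked with
lastWordAsciiLE 3, whatever the remaining 5 bytes happen to contain.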

On Thu, Mar 8, 2018 at 1:35 PM, David Feuer <david.feuer at gmail.com> wrote:

> What do you gain from this?
>
> On Mar 8, 2018 9:19 AM, "Andrew Martin" <andrew.thaddeus at gmail.com> wrote:
>
>> Let's say I have a gc-managed byte array of length 19. GHC promises that
>> byte arrays are machine-word-aligned on the front end. That is, on a 64-bit
>> machine, this array starts on a memory address that is divisible by 8.
>> However, the back end will certainly be unaligned. So, these two calls will
>> be fine:
>>
>> - indexWordArray# myArr# 0#
>> - indexWordArray# myArr# 1#
>>
>> But this one is non-deterministic:
>>
>> - indexWordArray# myArr# 2#
>>
>> Some of the bytes in the word will have garbage in them. However, this
>> could always be masked out with a bit mask (you have to know the platform
>> endianness for this to work right). Is this safe? I doubt this could
>> ever cause a segfault but I wanted to check.
>>
>> --
>> -Andrew Thaddeus Martin


-- 
-Andrew Thaddeus Martin