[Haskell-cafe] Optimising UTF8-CString -> String marshaling, plus comments on withCStringLen/peekCStringLen

Mon Jun 4 06:25:10 EDT 2007

On Mon, 2007-06-04 at 09:43 +0100, Alistair Bayley wrote:

> After some expriments with the simplifier, I think I have a portable
> version of a direct-from-buffer decoder which seems to perform nearly
> as well as one written directly against GHC primitive unboxed functions.
> I'm wondering if there's anything further I can do to improve performance.
> The "portable" unboxed version is within about 15% of the unboxed version
> in terms of time and allocation.

Well done.

> Changes I made:
>  - added strictness "annotations" in the form of a strictness guards
>    that are always False
>  - unrolled loops (they were always short loops anyway, with a maximum
>    of 3 or 4 iterations)
>  - replaced shiftL with multiplication, because multiplication unboxes,
>    while shiftL doesn't.
> 
> Some things I've noticed in the simplifier output:
>  - the shiftL call hasn't unboxed or inlined into a call to
>    uncheckedShiftL#, which I would prefer.
>    Would this be possible if we added unchecked versions of
>    the shiftL/R functions to Data.Bits?

I think this was discussed before. Check the archives. I don't think
there was any resolution however. As I recall there was some attempt to
get the ordinary shiftL/R functions to inline and eliminate the bounds
checks when the shifts were constants less than the bounds.

>  - Ptrs don't get unboxed. Why is this? Some IO monad thing?

Got any more detail?

>  - the chr function tests that its Int argument is less than 1114111,
>    before constructing the Char. It'd be nice to avoid this test.

Yeah. In Data.ByteString.Char8 we invent this w2c & c2w functions to
avoid the test. There should probably be a standard version of this
unchecked conversion.

>  - why does this code:
> 
>       | x <= 0xF7 = remaining 3 (bAND x 0x07) xs
>       | otherwise = err x
> 
>    turn into this
>    i.e. the <= turns into two identical case-branches, using eqword#
>    and ltword#, rather than one case-branch using leword# ?

I thought it might be to do with the instance declaration not giving a
method for <= but letting it default to being defined in terms of < and
==, however it's hard to say what's really going on, since the instance
declaration is just:

data Word = W# Word# deriving (Eq, Ord)

I have no idea how ghc derives Eq and Ord instances for a type when it
contains unboxed components, which cannot themselves be members of a
type class.

> BTW, what's the difference between the indexXxxxOffAddr# and
> readXxxxOffAddr# functions in GHC.Prim? AFAICT they are equivalent,
> except that the read* functions take an extra State# s parameter.
> Presumably this is to thread the IO monad's RealWorld value through,
> to create some sort of data dependency between the functions (and so
> to ensure ordered evaluation?)

Right. So it'd only be safe to use the index ones on immutable arrays
because there's no way to enforce sequencing with respect to array
writes when using the index version.

Duncan