[Haskell-cafe] Optimising UTF8-CString -> String marshaling,
plus comments on withCStringLen/peekCStringLen
Stefan O'Rear
stefanor at cox.net
Mon Jul 23 01:05:55 EDT 2007
On Mon, Jun 04, 2007 at 09:43:32AM +0100, Alistair Bayley wrote:
> (The docs tell me that using GHC.Exts is the "approved" way of
> accessing GHC-specific extensions, but all of the useful stuff seems
> to be in GHC.Prim.)
All of the useful stuff *is* exported from GHC.Exts; it even says so in the Haddock:
Synopsis
...
module GHC.Prim
That is, GHC.Exts re-exports everything GHC.Prim exports, using the
standard H98 module re-export syntax. Besides, user code can't import
GHC.Prim at all in GHC HEAD builds from the last couple of months
(arguably a bug, but it only breaks bad code, so...)
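
To make the re-export point concrete, here is a minimal sketch that pulls a
primop in through GHC.Exts rather than GHC.Prim. The wrapper name
unsafeShiftLInt is made up for illustration, and the MagicHash pragma assumes
a GHC newer than 6.6 (on 6.6 you would pass -fglasgow-exts instead).

```haskell
{-# LANGUAGE MagicHash #-}
-- The primop comes from GHC.Exts; no import of GHC.Prim is needed.
import GHC.Exts (Int (I#), uncheckedIShiftL#)

-- | Left shift on Int without the range check that 'shiftL' performs.
-- Hypothetical helper: the result is undefined unless the shift amount
-- is non-negative and smaller than the word size, so the caller must
-- guarantee that.
unsafeShiftLInt :: Int -> Int -> Int
unsafeShiftLInt (I# x) (I# n) = I# (uncheckedIShiftL# x n)
```

Because the arguments are pattern-matched on I#, the wrapper should compile
down to the bare primop once it is inlined at a call site.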
> Some things I've noticed in the simplifier output:
> - the shiftL call hasn't unboxed or inlined into a call to
> uncheckedShiftL#, which I would prefer.
> Would this be possible if we added unchecked versions of
> the shiftL/R functions to Data.Bits?
> - Ptrs don't get unboxed. Why is this? Some IO monad thing?
fromUTF8Ptr unboxes fine for me with HEAD and 6.6.1.
> - the chr function tests that its Int argument is less than 1114111,
> before constructing the Char. It'd be nice to avoid this test.
You want unsafeChr from the (undocumented) GHC.Base module.
http://darcs.haskell.org/ghc-6.6/packages/base/GHC/Base.lhs for
reference (but don't copy the file, it's already an importable module).
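
As a rough sketch of how unsafeChr might be used in a decoder, the function
below assembles the code point of a 2-byte UTF-8 sequence. The name
twoByteChar is hypothetical, and it assumes the caller has already checked
that the bytes form a well-formed sequence.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import GHC.Base (unsafeChr)

-- | Combine the lead byte and continuation byte of a 2-byte UTF-8
-- sequence into a Char. After masking, the value is at most 0x7FF,
-- which is always a valid code point, so skipping the range check
-- done by 'chr' is safe here.
twoByteChar :: Int -> Int -> Char
twoByteChar b0 b1 =
  unsafeChr (((b0 .&. 0x1F) `shiftL` 6) .|. (b1 .&. 0x3F))
```

For longer sequences the same idea applies, but the decoder itself then has to
rule out values above 0x10FFFF, since unsafeChr will not.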
> - why does this code:
>
> | x <= 0xF7 = remaining 3 (bAND x 0x07) xs
> | otherwise = err x
>
> turn into this
> i.e. the <= turns into two identical case-branches, using eqword#
> and ltword#, rather than one case-branch using leword# ?
>
> case GHC.Prim.eqWord# a11_a2PJ __word 247 of wild25_X2SU {
> GHC.Base.False ->
> case GHC.Prim.ltWord# a11_a2PJ __word 247 of wild6_Xcw {
> GHC.Base.False -> <error call>
> GHC.Base.True ->
> $wremaining_r3dD
> 3
> (__scc {fromUTF8 main:Foreign.C.UTF8 !}
> GHC.Base.I# (GHC.Prim.word2Int# (GHC.Prim.and# a11_a2PJ __word
> 7)))
> xs_aVm
> };
> GHC.Base.True ->
> $wremaining_r3dD
> 3
> (__scc {fromUTF8 main:Foreign.C.UTF8 !}
> GHC.Base.I# (GHC.Prim.word2Int# (GHC.Prim.and# a11_a2PJ __word 7)))
> xs_aVm
> };
I seem to recall seeing a bug report about this a while back; we know
it's dumb. You could probably use x < 0xF8 instead, which accepts
exactly the same values for an integral x.
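
A sketch of the guard chain written with strict comparisons throughout. The
function continuationCount is a hypothetical stand-in for the original
fromUTF8 dispatch, not the code from the thread. Whether the generated Core
actually improves depends on the GHC version, so check with -ddump-simpl.

```haskell
import Data.Word (Word8)

-- | Number of continuation bytes implied by a UTF-8 lead byte, or
-- Nothing if the byte cannot start a sequence. Each guard uses (<)
-- against the next boundary instead of (<=) against the last value
-- in the range.
continuationCount :: Word8 -> Maybe Int
continuationCount x
  | x < 0x80  = Just 0   -- ASCII
  | x < 0xC0  = Nothing  -- stray continuation byte
  | x < 0xE0  = Just 1
  | x < 0xF0  = Just 2
  | x < 0xF8  = Just 3   -- same set of bytes as x <= 0xF7
  | otherwise = Nothing
```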
> BTW, what's the difference between the indexXxxxOffAddr# and
> readXxxxOffAddr# functions in GHC.Prim? AFAICT they are equivalent,
> except that the read* functions take an extra State# s parameter.
> Presumably this is to thread the IO monad's RealWorld value through,
> to create some sort of data dependency between the functions (and so
> to ensure ordered evaluation?)
Exactly. readFoo won't be reordered, indexFoo will - which matters when
doing reads and writes at addresses that might alias.
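
The difference shows up in the types. Below is a minimal sketch using the
Char variants of the two primops; the wrapper names indexByteChar,
readByteChar and demo are made up for illustration, and the pragmas assume a
GHC newer than 6.6.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import Foreign.C.String (withCString)
import GHC.Base (IO (IO))
import GHC.Exts (Char (C#), Int (I#), Ptr (Ptr),
                 indexCharOffAddr#, readCharOffAddr#)

-- | Pure read: there is no state token, so GHC is free to float or
-- reorder it. Only appropriate when the memory does not change while
-- the pointer is in use.
indexByteChar :: Ptr a -> Int -> Char
indexByteChar (Ptr addr) (I# i) = C# (indexCharOffAddr# addr i)

-- | Sequenced read: the State# token threaded through IO orders it
-- relative to other reads and writes.
readByteChar :: Ptr a -> Int -> IO Char
readByteChar (Ptr addr) (I# i) = IO (\s ->
  case readCharOffAddr# addr i s of
    (# s', c #) -> (# s', C# c #))

-- | Read two bytes of a temporary C string, one with each primop.
-- The pure read is forced before the buffer is released.
demo :: IO (Char, Char)
demo = withCString "hi" $ \p -> do
  a <- readByteChar p 0
  let b = indexByteChar p 1
  b `seq` return (a, b)
```

Note that demo has to force the indexed value itself. Nothing in the type of
indexByteChar ties it to the lifetime of the buffer, which is the practical
cost of dropping the state token.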
Stefan