UVector overallocating for (Word/Int)(8/16/32)

Fri Jan 30 03:23:24 EST 2009

Thanks Tyson.  Not only for finding the problem, but for fixing it too!  We love that.

Simon

| -----Original Message-----
| From: glasgow-haskell-users-bounces at haskell.org [mailto:glasgow-haskell-users-
| bounces at haskell.org] On Behalf Of Tyson Whitehead
| Sent: Friday, January 30, 2009 5:44 AM
| To: GHC users
| Subject: UVector overallocating for (Word/Int)(8/16/32)
|
| I believe the arrays for (Word/Int)(8/16/32) are currently taking eight, four,
| and two times, respectively, as much memory as actually required.  That is,
|
| newMBU n = ST $ \s1# ->
|   case sizeBU n (undefined::e) of {I# len#          ->
|   case newByteArray# len# s1#   of {(# s2#, marr# #) ->
|   (# s2#, MBUArr n marr# #) }}
|
| sizeBU (I# n#) _ = I# (wORD_SCALE n#)
| wORD_SCALE   n# = scale# *# n# where I# scale# = SIZEOF_HSWORD
|
| (sizeBU is a class member, but all the instances for (Word/Int)(8/16/32) are
| as given above, and SIZEOF_HSWORD is defined as 8 in MachDeps.h on my x86_64)
|
| which would seem to always allocate memory assuming an underlying alignment
| that is always eight bytes.  It seems like the readWord(8/16/32)Array#
| functions may have once operated that way, but, when I dumped the assembler
| associated with them under ghc 6.8.2 (both native and C), I get
|
| readWord8Array
|   leaq 16(%rsi),%rax
|   movzbl (%rax,%rdi,1),%ebx
|   jmp *(%rbp)
|
| readWord16Array
|   leaq 16(%rsi),%rax
|   movzwl (%rax,%rdi,2),%ebx
|   jmp *(%rbp)
|
| readWord32Array
|   leaq 16(%rsi),%rax
|   movl (%rax,%rdi,4),%ebx
|   jmp *(%rbp)
|
| readWord64Array
|   leaq 16(%rsi),%rax
|   movq (%rax,%rdi,8),%rbx
|   jmp *(%rbp)
|
| which is using alignments of one, two, four, and eight bytes respectively.
|
| I'll attach a patch (which I haven't tested beyond compiling and looking at
| the generated assembler).
|
| Cheers!  -Tyson
|
|
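
To make the arithmetic in the quoted message concrete, here is a minimal sketch (not the attached patch, which is not reproduced in this message) comparing a word-scaled byte count with a per-element byte count. The names bytesFor, wordScaledBytes and overallocation are hypothetical helpers introduced only for illustration, and Storable's sizeOf stands in for the per-type scaling that the sizeBU instances would need.

```haskell
import Data.Int (Int8, Int16, Int32, Int64)
import Data.Word (Word8, Word16, Word32, Word64)
import Foreign.Storable (Storable, sizeOf)

-- Hypothetical helper: bytes actually needed for n elements of type e.
-- The second argument is only a type proxy; sizeOf never evaluates it.
bytesFor :: Storable e => Int -> e -> Int
bytesFor n e = n * sizeOf e

-- Bytes requested when every element is scaled by the machine word size,
-- which is what the quoted sizeBU does via wORD_SCALE.
wordScaledBytes :: Int -> Int
wordScaledBytes n = n * sizeOf (undefined :: Word)

-- Hypothetical helper: how many times too large the word-scaled request is.
overallocation :: Storable e => e -> Int
overallocation e = sizeOf (undefined :: Word) `div` sizeOf e
```

On a machine where a word is eight bytes, overallocation (undefined :: Word8) is 8, and the corresponding factors for Word16 and Word32 are 4 and 2, matching the figures in the report. The same element sizes appear as the scale factors (1, 2, 4, 8) in the addressing modes of the assembler listing.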