haskell array access
Lex Stein
stein@eecs.harvard.edu
Thu, 26 Jun 2003 19:14:39 -0400 (EDT)
Actually, to be completely correct, the access is the movl 2 instructions
above the "-->". The "-->" points to the accumulation in the local
variable k.
Any ideas how I can further narrow this 20x gap and improve GHC's
observed relative array access performance ?
int main () {
int i, k = 0;
int a [16];
for (i=0; i<100000000; i++) {
k += a [i % 16];
}
return k;
}
Thanks!
Lex
--
Lex Stein http://www.eecs.harvard.edu/~stein/
stein@eecs.harvard.edu TEL: 617-233-0246
On Thu, 26 Jun 2003, Lex Stein wrote:
>
> You must be using my old example, with gcc -O3 my new example does access
> the array and is 20x faster than the ghc -O code. Scroll down for
> "-->" that I inserted beside the instruction to see the array access.
>
> .file "arrayloop.c"
> .version "01.01"
> gcc2_compiled.:
> .text
> .p2align 2,0x90
> .globl main
> .type main,@function
> main:
> pushl %ebp
> movl %esp,%ebp
> subl $80,%esp
> pushl %esi
> pushl %ebx
> xorl %ebx,%ebx
> xorl %ecx,%ecx
> leal -64(%ebp),%esi
> .p2align 2,0x90
> .L6:
> movl %ecx,%edx
> testl %ecx,%ecx
> jge .L7
> leal 15(%ecx),%edx
> .L7:
> andl $-16,%edx
> movl %ecx,%eax
> subl %edx,%eax
> --> addl (%esi,%eax,4),%ebx
> incl %ecx
> cmpl $99999999,%ecx
> jle .L6
> movl %ebx,%eax
> popl %ebx
> popl %esi
> leave
> ret
> .Lfe1:
> .size main,.Lfe1-main
> .ident "GCC: (GNU) c 2.95.4 20020320 [FreeBSD]"
>
> On Thu, 26 Jun 2003, Hal Daume wrote:
>
> > This has been hinted to before: gcc gets rid of the loop.
> >
> > bash-2.05b$ gcc -O3 arr.c -S -o arr.s
> > bash-2.05b$ cat arr.s
> > .file "arr.c"
> > .def ___main; .scl 2; .type 32; .endef
> > .text
> > .align 2
> > .align 16
> > .globl _main
> > .def _main; .scl 2; .type 32; .endef
> > _main:
> > pushl %ebp
> > movl %esp, %ebp
> > subl $72, %esp
> > xorl %eax, %eax
> > andl $-16, %esp
> > call __alloca
> > call ___main
> > xorl %eax, %eax
> > .align 16
> > L7:
> > incl %eax
> > cmpl $100000000, %eax
> > jle L7
> > leave
> > ret
> >
> > the inside of the loop is exactly:
> >
> > L7:
> > incl %eax
> > cmpl $100000000, %eax
> > jle L7
> >
> > which doesn't read the array! essentially you are being unfare to ghc
> > by making it seq the array element, but not doing the same to GCC. even
> > with -O0, gcc gets rid of the loop body.
> >
> > one solution: add in the assembly to actually read the array (i'm too
> > lazy and busy to do this).
> >
> > - hal
> >
> >
> > --
> > Hal Daume III | hdaume@isi.edu
> > "Arrest this man, he talks in maths." | www.isi.edu/~hdaume
> >
> >
> > > -----Original Message-----
> > > From: haskell-cafe-admin@haskell.org
> > > [mailto:haskell-cafe-admin@haskell.org] On Behalf Of Lex Stein
> > > Sent: Thursday, June 26, 2003 3:46 PM
> > > To: Sven Panne
> > > Cc: haskell-cafe@haskell.org
> > > Subject: Re: haskell array access
> > >
> > >
> > >
> > > Great, thanks. Those suggestions narrow the gap from GHC -O being 330x
> > > slower than GCC -O3 to it being 20x slower. Here are the new results:
> > >
> > > gcc -O3 0.54s
> > > ocamlopt 1.11s
> > > ghc -O 10.76s
> > > ocamlc 14.10s
> > >
> > > GHC is still pretty slow for native x86 instruction code. Is
> > > there any way
> > > to further explain the performance gap ? (new code below)
> > >
> > > Thanks!
> > > Lex
> > >
> > > Haskell (GHC) source:
> > >
> > > import Array
> > > k :: Array Int Int
> > > k = Array.array (0,15) [(i, i+1) | i <- [0 .. 15] ]
> > > acc :: Int -> Int
> > > acc 0 = 0
> > > acc n = seq (k Array.! (n `mod` 16)) (acc (n-1))
> > > main = do
> > > print (acc 100000000)
> > >
> > > Caml test source:
> > >
> > > let a = Array.make 16 0;;
> > > let rec do1 i z =
> > > if (i>100000000) then z else
> > > do1 (i+1) (z + a.(i mod 16));;
> > > do1 0 0;;
> > >
> > > C (gcc) test source:
> > >
> > > int main () {
> > > int i, k = 0;
> > > int a [16];
> > > for (i=0; i<100000000; i++) {
> > > k += a [i % 16];
> > > }
> > > return (k);
> > > }
> > >
> > >
> > > On Fri, 27 Jun 2003, Sven Panne wrote:
> > >
> > > > Well, part of the answer is definitely that the Haskell
> > > program is the
> > > > *only* one which really uses the array elements. :-) I
> > > guess that the
> > > > compilers for the other languages simply remove the array
> > > access from
> > > > the generated code (gcc definitely does, only an empty loop
> > > remains).
> > > >
> > > > Another reason is that the type signatures are missing, so
> > > Integer is
> > > > used instead of Int, which is a bit unfair. Adding
> > > >
> > > > k :: Array Int Int
> > > > acc :: Int -> Int
> > > >
> > > > cuts down the time by more than factor 5 on my machine.
> > > >
> > > > Cheers,
> > > > S.
> > > >
> > > >
> > > > _______________________________________________
> > > > Haskell-Cafe mailing list
> > > > Haskell-Cafe@haskell.org
> > > > http://www.haskell.org/mailman/listinfo/haskell-cafe
> > > >
> > > _______________________________________________
> > > Haskell-Cafe mailing list
> > > Haskell-Cafe@haskell.org
> > > http://www.haskell.org/mailman/listinfo/haskell-cafe
> > >
> > _______________________________________________
> > Haskell-Cafe mailing list
> > Haskell-Cafe@haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>