haskell array access

Thu, 26 Jun 2003 19:14:39 -0400 (EDT)

Actually, to be completely correct, the access is the movl 2 instructions
above the "-->". The "-->" points to the accumulation in the local
variable k.

Any ideas how I can further narrow this 20x gap and improve GHC's
observed relative array access performance ?

int main () {
  int i, k = 0;
  int a [16];
  for (i=0; i<100000000; i++) {
    k += a [i % 16];
  }
  return k;
}

Thanks!
Lex

--
Lex Stein                     http://www.eecs.harvard.edu/~stein/
stein@eecs.harvard.edu        TEL: 617-233-0246

On Thu, 26 Jun 2003, Lex Stein wrote:

>
> You must be using my old example, with gcc -O3 my new example does access
> the array and is 20x faster than the ghc -O code. Scroll down for
> "-->" that I inserted beside the instruction to see the array access.
>
> 	.file	"arrayloop.c"
> 	.version	"01.01"
> gcc2_compiled.:
> .text
> 	.p2align 2,0x90
> .globl main
> 		.type		 main,@function
> main:
> 	pushl %ebp
> 	movl %esp,%ebp
> 	subl $80,%esp
> 	pushl %esi
> 	pushl %ebx
> 	xorl %ebx,%ebx
> 	xorl %ecx,%ecx
> 	leal -64(%ebp),%esi
> 	.p2align 2,0x90
> .L6:
> 	movl %ecx,%edx
> 	testl %ecx,%ecx
> 	jge .L7
> 	leal 15(%ecx),%edx
> .L7:
> 	andl $-16,%edx
> 	movl %ecx,%eax
> 	subl %edx,%eax
> -->	addl (%esi,%eax,4),%ebx
> 	incl %ecx
> 	cmpl $99999999,%ecx
> 	jle .L6
> 	movl %ebx,%eax
> 	popl %ebx
> 	popl %esi
> 	leave
> 	ret
> .Lfe1:
> 		.size		 main,.Lfe1-main
> 	.ident	"GCC: (GNU) c 2.95.4 20020320 [FreeBSD]"
>
> On Thu, 26 Jun 2003, Hal Daume wrote:
>
> > This has been hinted to before: gcc gets rid of the loop.
> >
> > bash-2.05b$ gcc -O3 arr.c -S -o arr.s
> > bash-2.05b$ cat arr.s
> > 	.file	"arr.c"
> > 	.def	___main;	.scl	2;	.type	32;	.endef
> > 	.text
> > 	.align 2
> > 	.align 16
> > .globl _main
> > 	.def	_main;	.scl	2;	.type	32;	.endef
> > _main:
> > 	pushl	%ebp
> > 	movl	%esp, %ebp
> > 	subl	$72, %esp
> > 	xorl	%eax, %eax
> > 	andl	$-16, %esp
> > 	call	__alloca
> > 	call	___main
> > 	xorl	%eax, %eax
> > 	.align 16
> > L7:
> > 	incl	%eax
> > 	cmpl	$100000000, %eax
> > 	jle	L7
> > 	leave
> > 	ret
> >
> > the inside of the loop is exactly:
> >
> > L7:
> > 	incl	%eax
> > 	cmpl	$100000000, %eax
> > 	jle	L7
> >
> > which doesn't read the array!  essentially you are being unfare to ghc
> > by making it seq the array element, but not doing the same to GCC.  even
> > with -O0, gcc gets rid of the loop body.
> >
> > one solution: add in the assembly to actually read the array (i'm too
> > lazy and busy to do this).
> >
> >  - hal
> >
> >
> > --
> >  Hal Daume III                                   | hdaume@isi.edu
> >  "Arrest this man, he talks in maths."           | www.isi.edu/~hdaume
> >
> >
> > > -----Original Message-----
> > > From: haskell-cafe-admin@haskell.org
> > > [mailto:haskell-cafe-admin@haskell.org] On Behalf Of Lex Stein
> > > Sent: Thursday, June 26, 2003 3:46 PM
> > > To: Sven Panne
> > > Cc: haskell-cafe@haskell.org
> > > Subject: Re: haskell array access
> > >
> > >
> > >
> > > Great, thanks. Those suggestions narrow the gap from GHC -O being 330x
> > > slower than GCC -O3 to it being 20x slower. Here are the new results:
> > >
> > > gcc -O3      0.54s
> > > ocamlopt     1.11s
> > > ghc -O      10.76s
> > > ocamlc      14.10s
> > >
> > > GHC is still pretty slow for native x86 instruction code. Is
> > > there any way
> > > to further explain the performance gap ? (new code below)
> > >
> > > Thanks!
> > > Lex
> > >
> > > Haskell (GHC) source:
> > >
> > > import Array
> > > k :: Array Int Int
> > > k = Array.array (0,15) [(i, i+1) | i <- [0 .. 15] ]
> > > acc :: Int -> Int
> > > acc 0 = 0
> > > acc n = seq (k Array.! (n `mod` 16)) (acc (n-1))
> > > main = do
> > >     print (acc 100000000)
> > >
> > > Caml test source:
> > >
> > > let a = Array.make 16 0;;
> > > let rec do1 i z =
> > >     if (i>100000000) then z else
> > >         do1 (i+1) (z + a.(i mod 16));;
> > > do1 0 0;;
> > >
> > > C (gcc) test source:
> > >
> > > int main () {
> > >     int i, k = 0;
> > >     int a [16];
> > >     for (i=0; i<100000000; i++) {
> > >         k += a [i % 16];
> > >     }
> > >     return (k);
> > > }
> > >
> > >
> > > On Fri, 27 Jun 2003, Sven Panne wrote:
> > >
> > > > Well, part of the answer is definitely that the Haskell
> > > program is the
> > > > *only* one which really uses the array elements. :-) I
> > > guess that the
> > > > compilers for the other languages simply remove the array
> > > access from
> > > > the generated code (gcc definitely does, only an empty loop
> > > remains).
> > > >
> > > > Another reason is that the type signatures are missing, so
> > > Integer is
> > > > used instead of Int, which is a bit unfair. Adding
> > > >
> > > >     k :: Array Int Int
> > > >     acc :: Int -> Int
> > > >
> > > > cuts down the time by more than factor 5 on my machine.
> > > >
> > > > Cheers,
> > > >     S.
> > > >
> > > >
> > > > _______________________________________________
> > > > Haskell-Cafe mailing list
> > > > Haskell-Cafe@haskell.org
> > > > http://www.haskell.org/mailman/listinfo/haskell-cafe
> > > >
> > > _______________________________________________
> > > Haskell-Cafe mailing list
> > > Haskell-Cafe@haskell.org
> > > http://www.haskell.org/mailman/listinfo/haskell-cafe
> > >
> > _______________________________________________
> > Haskell-Cafe mailing list
> > Haskell-Cafe@haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>