haskell array access

Thu, 26 Jun 2003 15:58:19 -0700

This has been hinted to before: gcc gets rid of the loop.

bash-2.05b$ gcc -O3 arr.c -S -o arr.s
bash-2.05b$ cat arr.s
	.file	"arr.c"
	.def	___main;	.scl	2;	.type	32;	.endef
	.text
	.align 2
	.align 16
.globl _main
	.def	_main;	.scl	2;	.type	32;	.endef
_main:
	pushl	%ebp
	movl	%esp, %ebp
	subl	$72, %esp
	xorl	%eax, %eax
	andl	$-16, %esp
	call	__alloca
	call	___main
	xorl	%eax, %eax
	.align 16
L7:
	incl	%eax
	cmpl	$100000000, %eax
	jle	L7
	leave
	ret

the inside of the loop is exactly:

L7:
	incl	%eax
	cmpl	$100000000, %eax
	jle	L7

which doesn't read the array!  essentially you are being unfare to ghc
by making it seq the array element, but not doing the same to GCC.  even
with -O0, gcc gets rid of the loop body.

one solution: add in the assembly to actually read the array (i'm too
lazy and busy to do this).

 - hal

--
 Hal Daume III                                   | hdaume@isi.edu
 "Arrest this man, he talks in maths."           | www.isi.edu/~hdaume

> -----Original Message-----
> From: haskell-cafe-admin@haskell.org=20
> [mailto:haskell-cafe-admin@haskell.org] On Behalf Of Lex Stein
> Sent: Thursday, June 26, 2003 3:46 PM
> To: Sven Panne
> Cc: haskell-cafe@haskell.org
> Subject: Re: haskell array access
>=20
>=20
>=20
> Great, thanks. Those suggestions narrow the gap from GHC -O being 330x
> slower than GCC -O3 to it being 20x slower. Here are the new results:
>=20
> gcc -O3      0.54s
> ocamlopt     1.11s
> ghc -O      10.76s
> ocamlc      14.10s
>=20
> GHC is still pretty slow for native x86 instruction code. Is=20
> there any way
> to further explain the performance gap ? (new code below)
>=20
> Thanks!
> Lex
>=20
> Haskell (GHC) source:
>=20
> import Array
> k :: Array Int Int
> k =3D Array.array (0,15) [(i, i+1) | i <- [0 .. 15] ]
> acc :: Int -> Int
> acc 0 =3D 0
> acc n =3D seq (k Array.! (n `mod` 16)) (acc (n-1))
> main =3D do
>     print (acc 100000000)
>=20
> Caml test source:
>=20
> let a =3D Array.make 16 0;;
> let rec do1 i z =3D
>     if (i>100000000) then z else
>         do1 (i+1) (z + a.(i mod 16));;
> do1 0 0;;
>=20
> C (gcc) test source:
>=20
> int main () {
>     int i, k =3D 0;
>     int a [16];
>     for (i=3D0; i<100000000; i++) {
>         k +=3D a [i % 16];
>     }
>     return (k);
> }
>=20
>=20
> On Fri, 27 Jun 2003, Sven Panne wrote:
>=20
> > Well, part of the answer is definitely that the Haskell=20
> program is the
> > *only* one which really uses the array elements. :-) I=20
> guess that the
> > compilers for the other languages simply remove the array=20
> access from
> > the generated code (gcc definitely does, only an empty loop=20
> remains).
> >
> > Another reason is that the type signatures are missing, so=20
> Integer is
> > used instead of Int, which is a bit unfair. Adding
> >
> >     k :: Array Int Int
> >     acc :: Int -> Int
> >
> > cuts down the time by more than factor 5 on my machine.
> >
> > Cheers,
> >     S.
> >
> >
> > _______________________________________________
> > Haskell-Cafe mailing list
> > Haskell-Cafe@haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>=20