-O and -prof
Simon Marlow
simonmar@microsoft.com
Fri, 16 May 2003 13:04:18 +0100

> John Meacham <john@repetae.net> writes:
>
> > I always use them in combination, simply because optimization can
> > drastically change the memory usage/profile. Profiling the unoptimized
> > version seems rather moot.
>
> Exactly. If this is indeed the intention, I guess the stack overflow
> is a bug?
Yes, strictly speaking. It may be a spurious case of the code generated
with -O -prof being slightly different from that generated by -O alone -
this happens because having cost centres floating around in the
intermediate code disables some transformations that would normally be
applicable.
Nevertheless, if you can provide a test case we'll look into it.
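For anyone putting together such a test case, a minimal sketch might look like
the following. The function names and cost-centre labels are hypothetical and
are not taken from the program discussed in this thread. The idea is only that
a lazy left fold relies on the optimiser to avoid building a long chain of
thunks, so it is the kind of code whose stack behaviour could differ if a cost
centre gets in the way of a transformation.

```haskell
import Data.List (foldl')

-- Hypothetical example functions; the SCC labels are illustrative only.

-- Strict left fold: the accumulator is forced at every step, so stack use
-- stays flat whether or not the optimiser helps.
strictSum :: Int -> Int
strictSum n = {-# SCC "strictSum" #-} foldl' (+) 0 [1 .. n]

-- Lazy left fold: without help from the optimiser this builds a chain of
-- n suspended additions, which is only evaluated when the result is demanded.
lazySum :: Int -> Int
lazySum n = {-# SCC "lazySum" #-} foldl (+) 0 [1 .. n]
```

Building once with `ghc -O` and once with `ghc -O -prof -auto-all`, then calling
`lazySum` on a large argument from `main`, would show whether the two builds
behave differently. The SCC pragmas are ignored when profiling is off.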
> > BTW, does -O2 still not do much more than -O? It seems to reduce the
> > memory footprint of some of my apps pretty noticeably.
>
> Not sure. The docs seem to indicate the speed improvement is
> negligible.
It's normally negligible. -O2 turns on one more optimisation pass,
which *might* have a significant impact on your program if it hits the
inner loop, but in most cases just costs you extra compilation time for
not much benefit.
> I thought I saw some hints in the GHC docs, including using
> -fvia-C, but I couldn't find them, and I'm not sure if they would be
> still current.
>
> Memory footprint is a problem, I wonder if GHC makes any
> effort to pack strict data types? I.e.
>
> data D1 = A | B
> data D2 = A2 | B2 | C2
>
> data D3 = D !D1 !D2 -- could fit inside e.g. a Word8?
We don't do any useful optimisation of the representation of D3 here, but
we could (it's been on my ToDo list since I implemented
-funbox-strict-fields some time ago). In a similar vein, the strictness
analyser doesn't take advantage of strict enumerated types - it could
map them to Int#, for example.
Semitagging (an optimisation in the works along with optimistic
evaluation) will provide some of the benefit that a more efficient
representation would yield here.
> Is there an elegant way to achieve this manually (if I know I'll need
> large arrays of D3s, for instance -- can I map them to arrays of Word8
> or a similar type?)
If you map D1 and D2 to Int by hand, then use
data D3 = D !Int !Int
you'll get the speed benefit, but not all of the space saving (each field
will still take up a 32-bit word).
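To recover the space as well, one option is to do the packing by hand. The
sketch below is only an illustration under stated assumptions, not something
GHC does for you: `mkD3`, `unD3`, `packD3` and `unpackD3` are hypothetical
helper names, and the encoding (first field times 3 plus second field) is just
one possible choice that happens to fit the 2 * 3 = 6 combinations in a byte.

```haskell
import Data.Word (Word8)

data D1 = A | B deriving (Eq, Show, Enum, Bounded)
data D2 = A2 | B2 | C2 deriving (Eq, Show, Enum, Bounded)

-- The hand-mapped representation from the reply: two strict Int fields.
data D3 = D !Int !Int deriving (Eq, Show)

-- Hypothetical helpers converting between the enumerations and D3.
mkD3 :: D1 -> D2 -> D3
mkD3 x y = D (fromEnum x) (fromEnum y)

unD3 :: D3 -> (D1, D2)
unD3 (D x y) = (toEnum x, toEnum y)

-- Hypothetical byte encoding: D2 has 3 constructors, so x * 3 + y is a
-- unique value in the range 0..5 for every valid D3.
packD3 :: D3 -> Word8
packD3 (D x y) = fromIntegral (x * 3 + y)

-- Inverse of packD3 for bytes in the range 0..5.
unpackD3 :: Word8 -> D3
unpackD3 w = D q r
  where
    (q, r) = fromIntegral w `divMod` 3
```

An unboxed array of Word8 (for example `UArray Int Word8` from
Data.Array.Unboxed) could then hold one byte per D3 value, with `packD3` and
`unpackD3` applied on the way in and out.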
Cheers,
Simon