computer language shootout

Simon Marlow simonmar@microsoft.com
Mon, 30 Jul 2001 18:11:03 +0100


This is a multi-part message in MIME format.

------_=_NextPart_001_01C1191A.9CB81489
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

> On Mon, 30 Jul 2001, Simon Marlow wrote:
>=20
> > I've looked at a few.  The most common factor in the worst=20
> performers
> > seems to be the performance of String-related operations.  I tried
> > converting a few to use PackedStrings but that didn't help much.  I
> > suspect our PackedString implementation could do with some=20
> tuning, and
> > we could gain considerable benefit from having an=20
> hGetLinePS function
> > (like hGetLine but gets a PackedString).
> >
> Well, I think the common factor is I/O. The lazy I/O seems to=20
> be a real
> bottleneck here. Trying to improve that would gain even more, I think.
> I'm saying this because I did quite a bit of fiddling with some of the
> examples (including trying PackedStrings). The conclusion I drew from
> doing this was that the I/O performance was the problem.

Agreed - I/O is indeed a bottleneck, but I believe it's the
strings-as-lists-of-characters aspect of I/O rather than the lazy aspect
that's the killer.

Enclosed is a version of the spell-checker program that is about a
factor of 3 faster than the one that Doug currently has; it uses
PackedStrings and a home-grown hash table.  The only down side is that
it reads the entire input into a big PackedString before producing
anything, and profiling suggests that it spends about half its time
splitting the big string into lines.  I think if I hack up an hGetLinePS
I might be able to get another factor of 2 out of it.

Cheers,
	Simon

------_=_NextPart_001_01C1191A.9CB81489
Content-Type: application/octet-stream;
	name="spell-checker.hs"
Content-Transfer-Encoding: base64
Content-Description: spell-checker.hs
Content-Disposition: attachment;
	filename="spell-checker.hs"

LS0gY29tcGlsZSB3aXRoOiBnaGMgLU8gLXBhY2thZ2UgbGFuZwotLSBydW4gd2l0aDogICAgIC4v
YS5vdXQgK1JUUyAtSDEwbSAtSzRtIDxJbnB1dC50eHQKCmltcG9ydCBNQXJyYXkKaW1wb3J0IFBh
Y2tlZFN0cmluZwppbXBvcnQgSU9FeHRzCmltcG9ydCBJTwppbXBvcnQgQ2hhcgppbXBvcnQgTW9u
YWQKCmFycl9zaXplID0gMjAwMDAKCm1haW4gPSBkbwogIGggPC0gb3BlbkZpbGUgIlVzci5EaWN0
LldvcmRzIiBSZWFkTW9kZQogIHN6IDwtIGhGaWxlU2l6ZSBoCiAgcHMgPC0gaEdldFBTIGggKGZy
b21JbnRlZ3JhbCBzeikKCiAgdGJsIDwtIG5ld0FycmF5ICgwLGFycl9zaXplKSBbXQogIG1hcE0g
KGFkZFRvSGFzaFRhYmxlIHRibCkgKGxpbmVzUFMgcHMpCgogIGggPC0gb3BlbkZpbGUgIklucHV0
LnR4dCIgUmVhZE1vZGUKICBzeiA8LSBoRmlsZVNpemUgaAogIHBzIDwtIGhHZXRQUyBoIChmcm9t
SW50ZWdyYWwgc3opCgogIGxldCB0ZXN0IHMgPSBkbyBiIDwtIGVsZW1IYXNoVGFibGUgcyB0YmwK
CQkgIHdoZW4gKG5vdCBiKSAocHV0U3RyTG4gKHVucGFja1BTIHMpKQogIG1hcE0gdGVzdCAobGlu
ZXNQUyBwcykKCnR5cGUgSGFzaFRhYmxlID0gSU9BcnJheSBJbnQgW1BhY2tlZFN0cmluZ10KCi0t
IExvb2tzIGJhZCwgYnV0IEdIQyBkb2VzIGEgZ3JlYXQgam9iIG9mIG9wdGltaXNpbmcgaXQ6Cmhh
c2hQUyA6OiBQYWNrZWRTdHJpbmcgLT4gSW50Cmhhc2hQUyBwcyA9IGZvbGRyIGYgMCAobWFwIChv
cmQuaW5kZXhQUyBwcykgWzEuLmxlbmd0aFBTIHBzXSkKICB3aGVyZSBmIG4gbSA9IG4gKyBtICog
MTI4IGBtb2RgIDEwNDg1ODMKCmFkZFRvSGFzaFRhYmxlIDo6IEhhc2hUYWJsZSAtPiBQYWNrZWRT
dHJpbmcgLT4gSU8gKCkKYWRkVG9IYXNoVGFibGUgdGJsIHMgPSBkbwogIGxldCBoID0gaGFzaFBT
IHMKICAgICAgaW5kZXggPSBoIGBtb2RgIGFycl9zaXplCiAgciA8LSByZWFkQXJyYXkgdGJsIGlu
ZGV4CiAgaWYgKHMgYGVsZW1gIHIpIHRoZW4gcmV0dXJuICgpIGVsc2UgZG8KICB3cml0ZUFycmF5
IHRibCBpbmRleCAocyA6IHIpCgplbGVtSGFzaFRhYmxlIDo6IFBhY2tlZFN0cmluZyAtPiBIYXNo
VGFibGUgLT4gSU8gQm9vbAplbGVtSGFzaFRhYmxlIHMgdGJsID0gZG8KICBsZXQgaCA9IGhhc2hQ
UyBzCiAgICAgIGluZGV4ID0gaCBgbW9kYCBhcnJfc2l6ZQogIHIgPC0gcmVhZEFycmF5IHRibCBp
bmRleAogIHJldHVybiAocyBgZWxlbWAgcikK

------_=_NextPart_001_01C1191A.9CB81489--