[Haskell-cafe] WideFinder
Don Stewart
dons at galois.com
Sun Nov 11 00:31:45 EST 2007
s.clover:
> http://www.tbray.org/tmp/o10k.ap is the basic data set. For heavier
> duty testing, folks seem to be appending it to itself 99 more times
> to yield a "o1000k.ap" dataset. I'd be curious for comments on my
> code or other suggestions to speed things up -- the strictness
> semantics of the mapUnionPar function seem pretty decent to me, but
> I'd like to find a way to give higher preference to evaluating later
> iterations of until as opposed to earlier ones (so as to improve
> memory performance) but can't think of any way to do that without
> explicit threads. Implementing memory mapped reads, as was suggested
> here recently in a different context, might be another big
> performance gain.
>
> On my decidedly not powerful machine (Mac PowerPC G5, 1.8GHz) I can't
> get much lower than 12.25s for the 1000k dataset (out of which,
> roughly 3s in GC), which is 192M, which is actually slower than his
> sample ruby implementation. :-(. I'm sure parallel processing will
> help quite a bit, however, as profiling indicates that most time is
> spent in the "count" function. Maps are a good choice for parallelism
> because they merge efficiently, but for the iterative aspect their
> performance leaves a lot to be desired. This seems evident in that
> even on a single processor, lower sizes of chunks, at least to a
> point, still improve overall performance, although this may possibly
> be equally an issue with space efficiency.
>
> I wonder if Haskell's lack of an efficient hashtable isn't hurting it
> here again too, but on the other hand for a real efficiency gain,
> switching to a custom-built trie that combined pattern matching and
> insertion into a single operation would probably be a significant
> win, and it would let us force unboxing ints too, for whatever that
> gains.
Did you also try Bryan O'Sullivan's smp code, btw?
http://www.serpentine.com/blog/2007/09/25/what-the-heck-is-a-wide-finder-anyway/
-- Don
More information about the Haskell-Cafe
mailing list