[Haskell-cafe] WideFinder

Don Stewart dons at galois.com
Sun Nov 11 00:31:45 EST 2007

> http://www.tbray.org/tmp/o10k.ap is the basic data set. For heavier  
> duty testing, folks seem to be appending it to itself 99 more times  
> to yield a "o1000k.ap" dataset. I'd be curious for comments on my  
> code or other suggestions to speed things up -- the strictness  
> semantics of the mapUnionPar function seem pretty decent to me, but  
> I'd like to find a way to give higher preference to evaluating later  
> iterations of until as opposed to earlier ones (so as to improve  
> memory performance) but can't think of any way to do that without  
> explicit threads. Implementing memory mapped reads, as was suggested  
> here recently in a different context, might be another big  
> performance gain.
> On my decidedly not powerful machine (Mac PowerPC G5, 1.8GHz) I can't  
> get much lower than 12.25s for the 1000k dataset (out of which,  
> roughly 3s in GC), which is 192M, which is actually slower than his  
> sample ruby implementation. :-(. I'm sure parallel processing will  
> help quite a bit, however, as profiling indicates that most time is  
> spent in the "count" function. Maps are a good choice for parallelism  
> because they merge efficiently, but for the iterative aspect their  
> performance leaves a lot to be desired. This seems evident in that  
> even on a single processor, lower sizes of chunks, at least to a  
> point, still improve overall performance, although this may possibly  
> be equally an issue with space efficiency.
> I wonder if Haskell's lack of an efficient hashtable isn't hurting it  
> here again too, but on the other hand for a real efficiency gain,  
> switching to a custom-built trie that combined pattern matching and  
> insertion into a single operation would probably be a significant  
> win, and it would let us force unboxing ints too, for whatever that  
> gains.

Did you also try Bryan O'Sullivan's smp code, btw?


-- Don

More information about the Haskell-Cafe mailing list