Optimising words

Neil Mitchell ndmitchell at gmail.com
Mon Jul 9 11:10:32 EDT 2007


Hi

While benchmarking a word count program I found that it wasn't running
as fast as it could. I traced this back to the original definition of
words, which isn't as good as it could be:

This is the version in Base:

words			:: String -> [String]
words s			=  case dropWhile isSpace s of
				"" -> []
				s' -> w : words s''
				      where (w, s'') =
                                             break isSpace s'

We can instantiate s' with s:ss, since we already pattern match:

words			:: String -> [String]
words s			=  case dropWhile isSpace s of
				"" -> []
				s:ss -> w : words s''
				      where (w, s'') =
                                             break isSpace (s:ss)

Now we know that s is not a space, since if it was the dropWhile would
have removed it. This means that we don't need to test it, and can put
it straight on to w:

words			:: String -> [String]
words s			=  case dropWhile isSpace s of
				"" -> []
				s:ss -> (s:w) : words s''
				      where (w, s'') = break isSpace ss

Now we have eliminated an extra isSpace test per character, and as it
happens, isSpace is very slow.

Is my reasoning correct? If so, can we make this optimisation?

Thanks

Neil


More information about the Libraries mailing list