[Haskell-cafe] wordsBy in the base libraries?

Richard A. O'Keefe ok at cs.otago.ac.nz
Tue Oct 23 00:04:33 EDT 2007

On 22 Oct 2007, at 10:50 pm, Neil Mitchell wrote:
> wordsBy :: (a -> Bool) -> [a] -> [[a]]
> wordsBy p s = case dropWhile p s of
>    []      -> []
>    s':rest -> (s':w) : wordsBy p (drop 1 s'')
>           where (w, s'') = break p rest

For things like this, I think it may be easier to just write down
a finite state automaton, so that it is *obvious* that each character
is fetched and tested exactly once.

wordsBy :: (a -> Bool) -> [a] -> [[a]]

wordsBy sep str = s_skip str
   where s_skip []     = []
         s_skip (c:cs) = if sep c then s_skip cs else s_word cs [c]
         s_word []     w = [reverse w]
         s_word (c:cs) w = if sep c then reverse w : s_skip cs
			  else s_word cs (c:w)
   -- s_skip state: not in a word, skipping separators
   -- s_word state: in a word, accumulating "letters"

Of course the s_skip state corresponds to the dropWhile call, but is
there any guarantee about how often dropWhile or break do their tests?
The FSA approach remains easy to use in more complex cases where
dropWhile and break are awkward.


More information about the Haskell-Cafe mailing list