Tokenizing Strings

Shae Matijs Erisson shae@ScannedInAvian.com
Wed, 02 Apr 2003 20:46:25 +0200


<jamesd@mena.org.au> writes:

> I have a string that needs to be split/tokenized based on a delimiter. This
> can easily be accomplished using 'break' if the delimiter is only 1
> character (i.e. break isSpace "this is a string"), but I can't see any way
> of using this for a delimiter with multiple characters.
>
> in this case, I have a string containing multiple fields separated by *two*
> blank lines (\n\n). I can't just break on the newline character, as single
> newline characters can be found inside each field.
>
> any idea how I can do this without too much hassle?

There's a split function that does this in lambdabot's CVS tree:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/haskell-libs/libs/lambdabot/Util.hs

Here's a demo:
Prelude Util> split "foo" "bazfoobarfooblipp"
["baz","bar","blipp"]
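
If you'd rather not pull in Util.hs, here is a minimal sketch of a
multi-character split. The name splitOnDelim is made up for this example,
and this is not the code from lambdabot; it may treat edge cases (empty
delimiter, trailing delimiter) differently from Util.split.

```haskell
import Data.List (isPrefixOf)

-- | Split a list on every occurrence of a delimiter sublist.
-- Hypothetical helper for illustration, not lambdabot's Util.split.
splitOnDelim :: Eq a => [a] -> [a] -> [[a]]
splitOnDelim []    s = [s]              -- empty delimiter: return input unsplit
splitOnDelim delim s = go [] s
  where
    n = length delim
    -- acc holds the current field in reverse order
    go acc [] = [reverse acc]           -- end of input: emit the last field
    go acc rest@(c:cs)
      | delim `isPrefixOf` rest = reverse acc : go [] (drop n rest)
      | otherwise               = go (c : acc) cs
```

For the "\n\n" case from the question, single newlines stay inside the fields:
Prelude Main> splitOnDelim "\n\n" "field one\nstill one\n\nfield two"
["field one\nstill one","field two"]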

-- 
Shae Matijs Erisson - 2 days older than RFC0226
#haskell on irc.freenode.net - We Put the Funk in Funktion
10 PRINT "HELLO" 20 GOTO 10 ; putStr $ fix ("HELLO\n"++)