[Haskell-cafe] "Parsing" a string

Dmitri Pissarenko mailing-lists at dapissarenko.com
Tue Jan 25 08:40:32 EST 2005


Hello!

I have to read a PGM image and transform it into a list of Int values.

I read the image (it is in the ASCII PGM format) using the readFile function
and get a string with the contents of the file.

This string contains the width and height of the image at the beginning,
followed by the pixel values.

I need to read the width and height, then "cut" them from the string, create
an array (or finite map) of Ints (for this I need to know the width and
height), and then recursively process the pixel values (i.e. put them into
the array).
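A minimal sketch of the array step, assuming the width, height and pixel list
have already been extracted. Instead of inserting the pixels one by one
recursively, it uses listArray from Data.Array, which fills the array from a
list in row-major order. The name toImage and the (row, column) indexing are
choices made for this illustration only.

```haskell
import Data.Array (Array, listArray, (!))

-- | Build a 2-D array indexed by (row, column) from a flat pixel list.
-- Hypothetical helper: returns Nothing when the dimensions are not
-- positive or the number of pixels does not match width * height.
toImage :: Int -> Int -> [Int] -> Maybe (Array (Int, Int) Int)
toImage width height pixels
  | width > 0 && height > 0 && length pixels == width * height =
      Just (listArray ((0, 0), (height - 1, width - 1)) pixels)
  | otherwise = Nothing

-- | Look up one pixel; row and column are zero-based.
-- Like (!) itself, this fails on an out-of-range index.
pixelAt :: Array (Int, Int) Int -> Int -> Int -> Int
pixelAt img row col = img ! (row, col)
```

With this layout, the pixel in row r and column c of the file is found at
index (r, c), both counted from zero.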

The string is structured as follows:

<string>
P2
# comment
# comment
320<WHITESPACE>243<WHITESPACE>255
130 130 130 130 130
130 130 130 130 130
</string>

P2 is the magic number. All lines starting with # are comments and must be
ignored. 320 and 243 are the width and height, and 255 is the maximum gray
value. <WHITESPACE> is either a space, a tab, or a newline.

At the moment, I have no idea how to read the width and height.

One possible approach would be to convert the string into a list of strings A,
using newline as separator. Then, I could create list A' with comments removed.

Then, A' can be transformed into a string A'' again. From A'' I know that
the width is the first whitespace-delimited (Char.isSpace) token after P2,
and that the height is the second such token. I could use some analogue of
regular expressions to access the width and height.
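The approach just described can be sketched with standard Prelude functions
alone: lines and unlines handle the conversion to and from a list of lines,
and words splits on any whitespace. The names stripComments and parsePGM are
made up for this illustration, and the sketch assumes that comments occupy
whole lines, as in the sample above.

```haskell
-- | Remove every line that starts with '#'.
stripComments :: String -> String
stripComments = unlines . filter (not . isComment) . lines
  where
    isComment l = take 1 l == "#"

-- | Split the comment-free text into (width, height, maxGray, pixels).
-- Returns Nothing when the magic number or the header is missing.
-- Note: read raises a runtime error on a token that is not a number;
-- a more careful version would use reads instead.
parsePGM :: String -> Maybe (Int, Int, Int, [Int])
parsePGM contents =
  case words (stripComments contents) of
    ("P2" : w : h : m : rest) -> Just (read w, read h, read m, map read rest)
    _                         -> Nothing
```

Because words treats spaces, tabs and newlines alike, the header values can
be separated by any of them, and the rest of the token list is the pixel
data in file order.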

I have several questions concerning this approach:

1) How can I transform a string into a list of strings, separated by some
character (in Java one uses StringTokenizer for this)?

2) How do the aforementioned analogues of regular expressions work in Haskell?
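One possible answer to both questions, as a sketch rather than the only way
to do it: the Prelude functions break and span can play the role of a simple
tokenizer. For whitespace and newlines specifically, the built-in words and
lines already do the splitting. The names splitBy and readInt below are
hypothetical helpers written for this example.

```haskell
import Data.Char (isDigit, isSpace)

-- | Split a string at every occurrence of a separator character.
-- Adjacent separators yield empty strings, unlike Java's StringTokenizer.
splitBy :: Char -> String -> [String]
splitBy sep s =
  case break (== sep) s of
    (chunk, [])       -> [chunk]
    (chunk, _ : rest) -> chunk : splitBy sep rest

-- | Skip leading whitespace, then read one non-negative integer.
-- Returns the number together with the unconsumed rest of the string,
-- or Nothing when no digit follows the whitespace.
readInt :: String -> Maybe (Int, String)
readInt s =
  case span isDigit (dropWhile isSpace s) of
    ([], _)        -> Nothing
    (digits, rest) -> Just (read digits, rest)
```

Calling readInt repeatedly on the returned rest would yield the width, the
height and then the remaining header and pixel values one after another.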

Thanks in advance

Dmitri Pissarenko
--
Dmitri Pissarenko
Software Engineer
http://dapissarenko.com

