[Haskell-beginners] Convert String to List/Array of Numbers

Daniel Fischer daniel.is.fischer at web.de
Wed Sep 8 10:06:22 EDT 2010


On Wednesday 08 September 2010 15:31:19, Lorenzo Isella wrote:
> Dear All,
> I must be stuck on something pretty basic (I am struggling badly with
> I/O). Let us assume you have a rather simple file mydata.dat (3 columns
> of integer numbers), see below.
>
>
>
> 1246191122 1336 1337
> 1246191142 1336 1337
> 1246191162 1336 1337
> 1246191182 1336 1337
> 1246191202 1336 1337
> 1246191222 1336 1337
> 1246191242 1336 1337
> 1246191262 1336 1337
> 1246191282 1336 1337
> 1246191302 1336 1337
> 1246191322 1336 1337
> 1246191342 1336 1337
> 1246191362 1336 1337
> 1246191382 1336 1337
> 1246191402 1336 1337
> 1246191422 1336 1337
>
> Now, my intended pipeline could be
>
> read file as string--> convert to list of integers-->pass it to hmatrix
> (or try to convert it into a matrix/array).
> Leaving aside the last step, I can easily do something like
>
> let dat=readFile "mydata.dat"
>
>
> in the interactive shell and get a string,

Not quite. `dat' is the IO-action that reads the file, of type (IO String) 
and not a String.
In a programme, you'd do something like

main = do
    ... -- argument parsing perhaps
    txt <- readFile "mydata.dat"
    let dat = convert txt
    doSomething with dat
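
To make the difference between the action and its result concrete, here is a minimal sketch. The names `loadNumbers` and `convert` are only illustrative, and `convert` is assumed to be the flat-list variant; any function from String to the type you want would do in its place.

```haskell
-- Assumed conversion: every whitespace-separated token becomes an Integer.
convert :: String -> [Integer]
convert = map read . words

-- Hypothetical helper: an IO action that, when run, yields the numbers.
loadNumbers :: FilePath -> IO [Integer]
loadNumbers path = do
    txt <- readFile path     -- txt :: String, the result of running the action
    let dat = convert txt    -- pure conversion, no IO involved
    return dat
```

In ghci, `xs <- loadNumbers "mydata.dat"` runs the action and binds the list to xs, whereas `let xs = loadNumbers "mydata.dat"` only gives the action a name.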

> but I am having problems in
> converting this to a list or anything more manageable (where every entry
> is an integer number i.e. something which can be summed, subtracted
> etc...). Ideally even a list where every entry is a row (a list in
> itself) would do.

Depending on what the result type should be, different solutions are 
required.
The simplest solutions for such a file format are built from

read :: Read a => String -> a  -- to convert e.g. "135" to 135
lines :: String -> [String]
words :: String -> [String]
map :: (a -> b) -> [a] -> [b]

If you want a flat list of Integers from that file,

convert :: String -> [Integer]
convert = map read . words

will do. First, `words' splits the String on whitespace (spaces and 
newlines), producing a list of digit-strings; those are then read as 
Integers.
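
As a small worked example, the sketch below applies this to the first two rows of the sample file. The explicit type signature pins down the result type; without one (or some later use that fixes the type) `read' cannot know what to produce. The name `sample' is just an illustration standing in for the String obtained from readFile.

```haskell
convert :: String -> [Integer]
convert = map read . words

-- First two rows of the sample file, as readFile would deliver them.
sample :: String
sample = "1246191122 1336 1337\n1246191142 1336 1337\n"

flat :: [Integer]
flat = convert sample
-- flat == [1246191122,1336,1337,1246191142,1336,1337]
```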

If you want a list of lists, each line its own list inside the top level 
list,

convert :: String -> [[Integer]]
convert = map (map read . words) . lines

is what you want.
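
A short sketch of the row-wise version on the same two sample rows. The names `convertRows', `sample', `rows' and `columns' are illustrative only, and the use of transpose (from Data.List) to view the data by column is an optional extra, not something the conversion itself needs.

```haskell
import Data.List (transpose)

convertRows :: String -> [[Integer]]
convertRows = map (map read . words) . lines

sample :: String
sample = "1246191122 1336 1337\n1246191142 1336 1337\n"

rows :: [[Integer]]
rows = convertRows sample
-- rows == [[1246191122,1336,1337],[1246191142,1336,1337]]

-- Optional extra: the same data column by column.
columns :: [[Integer]]
columns = transpose rows
-- columns == [[1246191122,1246191142],[1336,1336],[1337,1337]]
```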

If you want to convert each line into a different data structure, say 
(Integer, Double, Int64), the general form would still be

convert = map parseLine . lines

and parseLine would depend on the structure you want. For the above,

-- Int64 comes from Data.Int
parseLine :: String -> (Integer, Double, Int64)
parseLine str
    = case words str of
        (a : b : c : _) -> (read a, read b, read c)
        _ -> error "Bad line format"

would be a solution.
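
Put together as a loadable module, this could look like the sketch below. The type signatures are what select the three different `read' instances. Note that a line with fewer than three fields hits the error branch, while a field that is not a number makes `read' itself fail with a "no parse" error.

```haskell
import Data.Int (Int64)

parseLine :: String -> (Integer, Double, Int64)
parseLine str
    = case words str of
        (a : b : c : _) -> (read a, read b, read c)
        _ -> error "Bad line format"

convert :: String -> [(Integer, Double, Int64)]
convert = map parseLine . lines

-- convert "1246191122 1336 1337\n" == [(1246191122,1336.0,1337)]
```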

For any but the simplest formats, you should write a real parser to deal 
with possible bad formatting though (writing parsers is fun in Haskell).
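
Short of a full parser, one small step in that direction is to report bad input instead of crashing. The sketch below assumes readMaybe from Text.Read (available in reasonably recent versions of base); the names `parseRow' and `parseFile' are hypothetical, and unlike the parseLine above this version insists on exactly three integer fields per line.

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper: parse one line of exactly three integers.
parseRow :: String -> Either String (Integer, Integer, Integer)
parseRow str =
    case mapM readMaybe (words str) of
        Just [a, b, c] -> Right (a, b, c)
        _              -> Left ("Bad line format: " ++ show str)

-- Parse all lines, stopping at the first malformed one.
parseFile :: String -> Either String [(Integer, Integer, Integer)]
parseFile = mapM parseRow . lines
```

The caller can then pattern match on the result and decide how to handle a `Left' (print the message, skip the file, and so on) rather than having the whole program abort.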

> I found online this suggestion
> http://bit.ly/9jv1WG
> but I am not sure if it really applies to this case.
> Many thanks
>
> Lorenzo


