[Haskell-beginners] Convert String to List/Array of Numbers

Lorenzo Isella lorenzo.isella at gmail.com
Wed Sep 8 13:24:12 EDT 2010


Hi Daniel,
Thanks for your help.
I have a couple of questions left.
(1) The first one is quite down to earth.
The snippet below

---------------------------------------------------
main :: IO ()
main = do
   txt <- readFile "mydata.dat"
   let dat = convert txt
   print dat -- this prints out my chunk of data

convert :: String -> [String]
convert x = lines x

-----------------------------------------------

pretty much does what it is supposed to do, but if I use this definition 
of convert x

convert x = map (map read . words) . lines x

I bump into compilation errors. Is that the way I am supposed to deal 
with your function?
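
A likely cause, offered as a sketch rather than a diagnosis of the exact
error message: function application binds tighter than `.`, so
`map (map read . words) . lines x` tries to compose with the list `lines x`
instead of the function `lines`. Either drop the argument on both sides or
apply the whole composition to x. The type signatures below are an addition
not present in the thread; they assume Integer entries and give `read` a
target type, which `print dat` alone does not provide.

```haskell
-- Point-free form, as in Daniel's message: no argument on either side.
-- The signature is an assumption about the wanted element type.
convert :: String -> [[Integer]]
convert = map (map read . words) . lines

-- Equivalent form with an explicit argument (convert' is a hypothetical
-- name): the composed function is applied to x as a whole.
convert' :: String -> [[Integer]]
convert' x = (map (map read . words) . lines) x
```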

(2) This is a bit more about I/O in general. I start an action with "do" 
to read some files and I define outside the action some functions which 
are supposed to operate (within the do action) on the read data.
Is this the way it always has to be? I read something about monads but 
did not get very far (and hope that they are not badly needed for simple 
I/O). Is there a way in Haskell to have the action return to the outside 
world e.g. the value of dat and then work with it elsewhere?
That is what I would do in Python or R, but I think I understood that 
Haskell's philosophy is different...
Am I on the right track here? And what is the benefit of this?
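
One way to picture the split, as a minimal sketch under stated assumptions:
pure functions live outside the `do` block and know nothing about I/O, while
an action can pass its result on by having a type such as `IO [[Integer]]`.
The names `loadData` and `columnSums` are hypothetical, and the file is
assumed to hold whitespace-separated integers as in the quoted message.

```haskell
import Data.List (transpose)

-- Pure: turns file contents into rows of integers.
convert :: String -> [[Integer]]
convert = map (map read . words) . lines

-- Pure: sums each column; works on any list of rows.
columnSums :: [[Integer]] -> [Integer]
columnSums = map sum . transpose

-- An action that yields the parsed data to whichever action runs it.
loadData :: FilePath -> IO [[Integer]]
loadData path = fmap convert (readFile path)

main :: IO ()
main = do
  dat <- loadData "mydata.dat"   -- dat is an ordinary [[Integer]] here
  print (columnSums dat)
```

The value does not leave IO as such; the pure functions are applied inside
the action, so they can be tried out on plain strings without any file.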

Cheers

Lorenzo


On 09/08/2010 04:06 PM, Daniel Fischer wrote:
> On Wednesday 08 September 2010 15:31:19, Lorenzo Isella wrote:
>> Dear All,
>> I must be stuck on something pretty basic (I am struggling badly with
>> I/O). Let us assume you have a rather simple file mydata.dat (3 columns
>> of integer numbers), see below.
>>
>>
>>
>> 1246191122 1336 1337
>> 1246191142 1336 1337
>> 1246191162 1336 1337
>> 1246191182 1336 1337
>> 1246191202 1336 1337
>> 1246191222 1336 1337
>> 1246191242 1336 1337
>> 1246191262 1336 1337
>> 1246191282 1336 1337
>> 1246191302 1336 1337
>> 1246191322 1336 1337
>> 1246191342 1336 1337
>> 1246191362 1336 1337
>> 1246191382 1336 1337
>> 1246191402 1336 1337
>> 1246191422 1336 1337
>>
>> Now, my intended pipeline could be
>>
>> read file as string --> convert to list of integers --> pass it to hmatrix
>> (or try to convert it into a matrix/array).
>> Leaving aside the last step, I can easily do something like
>>
>> let dat=readFile "mydata.dat"
>>
>>
>> in the interactive shell and get a string,
>
> Not quite. `dat' is the IO-action that reads the file, of type (IO String)
> and not a String.
> In a programme, you'd do something like
>
> main = do
>      ... -- argument parsing perhaps
>      txt <- readFile "mydata.dat"
>      let dat = convert txt
>      doSomething with dat
>
>> but I am having problems in
>> converting this to a list or anything more manageable (where every entry
>> is an integer number i.e. something which can be summed, subtracted
>> etc...). Ideally even a list where every entry is a row (a list in
>> itself) would do.
>
> Depending on what the result type should be, different solutions are
> required.
> The simplest solutions for such a file format are built from
>
> read  -- to convert e.g. "135" to 135
> lines :: String -> [String]
> words :: String -> [String]
> map :: (a -> b) -> [a] -> [b]
>
> If you want a flat list of Integers from that file,
>
> convert = map read . words
>
> will do. First, `words' splits the String on whitespace (spaces and
> newlines), producing a list of digit-strings, those are then read as
> Integers.
>
> If you want a list of lists, each line its own list inside the top level
> list,
>
> convert = map (map read . words) . lines
>
> is what you want.
>
> If you want to convert each line into a different data structure, say
> (Integer, Double, Int64), the general form would still be
>
> convert = map parseLine . lines
>
> and parseLine would depend on the structure you want. For the above,
>
> parseLine str
>      = case words str of
>          (a : b : c : _) -> (read a, read b, read c)
>          _ -> error "Bad line format"
>
> would be a solution.
>
> For any but the simplest formats, you should write a real parser to deal
> with possible bad formatting though (writing parsers is fun in Haskell).
>
>> I found online this suggestion
>> http://bit.ly/9jv1WG
>> but I am not sure if it really applies to this case.
>> Many thanks
>>
>> Lorenzo
>
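
The quoted advice about bad formatting can be sketched without a full
parser, assuming a base library recent enough to provide
`Text.Read.readMaybe`. The name `convertSafe` is hypothetical, and the
all-Integer tuple is an assumption based on the sample file. Unlike the
quoted parseLine, this version accepts exactly three fields and returns
Nothing rather than calling `error`.

```haskell
import Text.Read (readMaybe)

-- Parse one line of exactly three whitespace-separated integers.
-- Returns Nothing on a wrong field count or a non-numeric field.
parseLine :: String -> Maybe (Integer, Integer, Integer)
parseLine str = case words str of
  [a, b, c] -> (,,) <$> readMaybe a <*> readMaybe b <*> readMaybe c
  _         -> Nothing

-- Parse every line; the whole result is Nothing if any line fails.
convertSafe :: String -> Maybe [(Integer, Integer, Integer)]
convertSafe = mapM parseLine . lines
```

Blank lines count as bad lines here; filtering them out before parsing
would be a reasonable variation.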


