[Haskell-beginners] Seeking some clarification...
Daniel Fischer
daniel.is.fischer at googlemail.com
Sun Jun 5 21:35:07 CEST 2011
On Sonntag, 5. Juni 2011, 20:46, Sean Charles wrote:
> I have a CSV file containing some data about trains: stations from, to,
> times etc. and I wanted to 'learn some more Haskell' and, to my
> astonishment,
> I have gotten thus far but I am not sure *why* it works or *how* I got
> there! LMAO Here is the relevant code ...
>
> ====>
> trains :: String -> IO ()
> trains csvfile = do
> legodata <- parseCSVFromFile csvfile
> case legodata of
> Left error -> print error
> Right legodata -> mapM_ putStrLn (trainCodes legodata)
>
> -- Assumes row 1 contains cell header information
> -- Note: the train-code is always the third cell
>
> trainCodes :: [Record] -> [String]
> trainCodes = nub . map (!! 2) . tail
That'll bomb of course on malformed input, but that's probably okay in this
scenario.
> ====>
>
> I was chuffed with writing the trainCodes as a point-free function,
> that sort of thing is getting a little easier to work with but I still
> have real head-banging frustrations sometimes with seemingly simple
> things, like looping and just printing stuff out, despite having taught
> myself
> LISP six years ago and Erlang in recent years! I quit!! I really do!!!
>
> My confusion arises over: mapM_ putStrLn (traincodes legodata)
> Given that: mapM_ :: Monad
> <http://haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Control-
> Monad.html#t%3AMonad> m => (a -> m b) -> [a] -> m ()
> <http://haskell.org/ghc/docs/6.12.2/html/libraries/ghc-prim-0.2.0.0/GHC-
> Unit.html#t%3A%28%29>
>
> Here's how I read it: For any m that is a Monad, mapM_ takes a function
> that "takes
> an 'a' and returns it wrapped in a monad",
Not it, but a value based on it (the mapM_'ed function has type (a -> m b))
"A value wrapped in a monad" is kind of a skewed picture, doesn't really do
justice to State or Cont for example.
mapM_ is the composition of
map :: (a -> c) -> [a] -> [c],
restricted to types c = m b for some Monad m -
that part produces a list [m b], then - and
sequence_ :: (Monad m) => [m b] -> m ()
sequence_ "runs" all the actions in the list and discards their results.
If you want to collect the results, there's
sequence :: (Monad m) => [m b] -> m [b]
and the composition of sequence and map,
mapM :: (Monad m) => (a -> m b) -> [a] -> m [b]
> a "list of a's" and returns a
> "monad containing
> 'unit'", the empty-list LISP-() undefined voidy thing.
>
> Given that: putStrLn :: String
> <http://hackage.haskell.org/packages/archive/base/latest/doc/html/Prelud
> e.html#t:String> -> IO
> <http://hackage.haskell.org/packages/archive/base/latest/doc/html/Prelud
> e.html#t:IO> ()
> <http://hackage.haskell.org/packages/archive/base/latest/doc/ghc-prim-0.
> 2.0.0/GHC-Unit.html#t:-40--41->, this means that 'a' is String and 'm b'
> is
> IO () and my list of [a] is the result of calling 'traincodes legodata'.
Right.
>
> trainCodes = nub . map (!! 2) . tail
>
> legodata is [Record] (from Text.CSV) and so, 'tail' removes the header
> row from
> the data, 'map (!! 2)' extracts the third field from each row and
> finally 'nub'
> removes the duplicates. Thus the return type from trainCodes is
> [String].
Yup.
>
> Gluing it all together:
>
> mapM_ :: Monad
> <http://haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Control-
> Monad.html#t%3AMonad> m => (a -> m b) -> [a] -> m ()
> <http://haskell.org/ghc/docs/6.12.2/html/libraries/ghc-prim-0.2.0.0/GHC-
> Unit.html#t%3A%28%29> putStrLn :: String
> <http://hackage.haskell.org/packages/archive/base/latest/doc/html/Prelud
> e.html#t:String> -> IO
> <http://hackage.haskell.org/packages/archive/base/latest/doc/html/Prelud
> e.html#t:IO> ()
> <http://hackage.haskell.org/packages/archive/base/latest/doc/ghc-prim-0.
> 2.0.0/GHC-Unit.html#t:-40--41-> trainCodes :: [Record] -> [String]
>
> the type of my call then would seem to be:
>
> String -> IO () -> [String] -> IO ()
Missing parentheses, it's (String -> IO ()) -> [String] -> IO ()
> "putStrLn" -> (trainCodes legodata) -> IO ()
>
> which means that not only have I got the types correct for the call but
> the result
> type of 'IO ()' also satisfies the type return for my function and hence
> it executes
> from 'main' (where it is called from) with no issues.
>
> So, am I finally beginning to get a grip on it all ?
Looks quite so.
> This list is a constant source
> of education and I don't post very often as you guys give me far too
> much stuff to be reading all the time! :)
>
> I am using Text.CSV to read my file and all I wanted to do was to output
> a list of unique codes from column three of the spreadsheet data, one
> per line, so that I can use this Haskell as part of a bigger 'bash'
> script.
And that's what your code does :)
>
> Any detailed explanations that might help me better understand my
> solution would be welcome; right now I feel I 'just got lucky' although
> there must be a glimmer of understanding somewhere! LOL
ToDos:
1. parse file to get a list of rows -- parseCSVFromFile
2. remove header row -- tail
3. extract the field(s) of interest -- (!! 2) for one, map (!! 2) for the
list
4. remove duplicates -- nub
5. output -- mapM_ putStrLn
1. is delegated to a library function, how that works need not concern us
at the moment
2. should be clear
3. also clear
4. library, you need not care how it does what it does (unless performance
becomes an issue; nub is O(n^2), if that's too slow, you have to use faster
variants exploiting that in your case you have more than the Eq constraint
nub can only work with; an Ord constraint gives easy O(n*log n)
implementations [using Data.Set, for example]; in a few special cases O(n)
is possible)
5. putStrLn is clear, for mapM_ see above
1. and 5. involve IO (reading a file resp. printing to stdout), 2., 3. and
4. operate only on data, so those steps can be combined into a pipeline
like you did.
>
> Thanks,
> Sean.
>
> PS: Phew!
More information about the Beginners
mailing list