[Haskell-beginners] PHP vs Haskell... a challenge!

Thu Mar 17 19:07:28 CET 2011

Perhaps a simpler (more Prelude oriented) solution can clarify. Using
Sean's partial solution, I created the one attached. It uses String's,
which slide easier into the brain than bytestrings (I highly encourage
studying BS's, though), and performe quite similarly for short strings
of chars.
There are optimizations that can be performed in the code I show (lots
of appends presently, for instance), but I'd rather keep it simple.
After all, I'd bet my head is faster than PHP's version as is.
I added a single line note in the code, regarding the usage of
hGetContents. Check it out, and play with hClose to spot the quirks of
lazy IO.
Notice that processRows is nothing but a left fold... i love FP :D

module Main where

import Text.CSV           ( parseCSVFromFile, Record, Field, CSV)
import System.Environment ( getArgs)
import System.IO          ( putStrLn, openFile, hGetContents ,
IOMode(ReadMode), hClose)
import Data.Char          ( isDigit)

main = do
   args <- getArgs
   case length args of
     2 -> do
       csvdata <- parseCSVFromFile (last args)
       either
        (\error -> print error)
        (\csvdata -> processData (head args) csvdata)
        csvdata
     _ -> print "Usage: csvsql template datafile.csv (options later!)"

{-
     csvdata is [CSV]
     a CSV is [Record]
     a Record is [Field]
     a Field is String
-}
processData :: String -> CSV -> IO ()
processData filename csvdata = do
   tin <- openFile filename ReadMode
   template <- hGetContents tin
   let processedTemplate = processRows $ zip (lines template) csvdata
   putStrLn processedTemplate
   hClose tin -- NOTE: close handle AFTER IO action that forces the file
read.
   return ()

processRows :: [(String, Record)] -> String
processRows = foldl fn ""
  where
    fn :: String -> (String,Record) -> String
    fn acc (line, record) = acc ++ ('\n' : (processRecord line record))

    processRecord :: String -> [Field] -> String
    processRecord "" _ = "" -- just in case
    processRecord line recs =
      let (pre, after) = break (=='@') line
          (numberS, rest) = span isDigit (drop 1 after)
          n = read numberS :: Int
      in if null after -- we're done with the line
         then pre
         else pre ++ (recs !! n) ++ (processRecord rest recs)

El jue, 17-03-2011 a las 00:34 -0400, David McBride escribió:
> I was bored, so I wrote this as a solution.  I wrote the replace
> string function using attoparsec.  Uses bytestrings throughout so that
> it will be fast.
> 
> I intially tried regexes and soon abandoned that.  Then I tried
> parsec, but parsec doesn't do bytestrings very well, so I went to
> attoparsec and that was the best code.
> 
> Requires csv-bytestring and attoparsec libraries.
> 
> import Data.ByteString as B
> 
> import Data.ByteString.Internal (c2w)
> import Text.CSV.ByteString
> 
> import Data.Attoparsec.Char8 as AC
> import Control.Applicative ((<|>))
> import Control.Monad (liftM2)
> 
> loadcsv = do
>   temp <- B.readFile "blah.temp"
>   fmap parseCSV (B.readFile "blah.csv") >>= parse' temp
>   where
>     parse' _ Nothing = return ()
>     parse' temp (Just csv) = mapM_ (B.putStrLn . replaceInStr temp) csv
> 
> replaceInStr :: ByteString -> [ByteString] -> ByteString
> replaceInStr str xs = case feed (parse replacePat str) B.empty of
>     Done _ r -> r
>     otherwise -> error "Something wrong with my parser."
>   where
>     replacePat = liftM2 B.append (fmap B.concat (many replace)) rest
>     replace  = do
>       beg <- takeTill (== '@')
>       char '@'
>       i <- decimal
>       return $ beg `B.append` (xs !! i)
>     rest = do
>       x <- takeTill (isEndOfLine . c2w)
>       endOfLine
>       return x
> 
> On Wed, Mar 16, 2011 at 8:00 PM, Sean Charles <sean at objitsu.com> wrote:
> > I tried in vain using Text.CSV to write a small utility for myself that
> > would *simply* map a CSV file across a template, specifically for
> > mass-producing Drupal nodes from a CSV file... Here's what the Haskell code
> > has to compete with :-
> >
> > TESTDATA.CSV
> > eric,42,"Hacker"
> > bert,20,"Janitor"
> > harry,15,"Web nerd"
> >
> > TEST.SQL
> > UPDATE foo SET rating = @1, role = '@2' where name = '@0';
> >
> > And the PHP code that is used to produce ANY output from the template and
> > CSV data (forget injection attacks and all that for now, this is
> > proof-of-concept remember)...
> >
> > <?php
> > $template = file_get_contents($argv[1]);
> > if ($fh=fopen($argv[2],"r")) {
> >  while($line = fgetcsv($fh)) {
> >    foreach ($line as $k => $v ) $line['@'.$k] = $v;
> >    echo strtr($template,$line)."\n";
> >  }
> >  fclose($fh);
> >  }
> > ?>
> >
> > Note that I have had to remap the index keys to @n in order not to
> > accidentally hit any numerical data that may be in the template. This is my
> > test, the *real* template creates a Drupal node dynamically and has lots of
> > [0]['value'] bits in it so I had to do this, I will allow any shortcuts to
> > be used though as I would have liked to have coded the utility in Haskell to
> > get better at it but tonight it defeats me!
> >
> > Here then is my unsatisfactory code and no, it doesn't compile or work very
> > well as I am still trying hard with it!
> > I want to learn!
> >
> > Still my biggest headache is type matching and deciphering the sometimes
> > complete gibberish output from the type inference system when I screw up...
> > I am finding it very hard to work out what I did wrong at times. Sitting
> > here with Real World Haskell is proving fruitless tonight. :(
> >
> > Here's the rubbish code...
> >
> > import Text.CSV
> > import System.Environment
> > import System.IO
> > import Text.ParserCombinators.Parsec.Error
> >
> >
> > main = do
> >  args <- getArgs
> >  case length args of
> >    2 -> do
> >      csvdata <- parseCSVFromFile (last args)
> >      either
> >    (\error -> print error)
> >    (\csvdata -> processData (head args) csvdata)
> >    csvdata
> >    _ -> print "Usage: csvsql template datafile.csv (options later!)"
> >
> > {-
> >    csvdata is [CSV]
> >    a CSV is [Record]
> >    a Record is [Field]
> >    a Field is String
> > -}
> > processData :: String -> CSV -> IO ()
> > processData filename csvdata = do
> >  tin <- openFile filename ReadMode
> >  template <- hGetContents tin
> >  processRow template csvdata
> >  hClose tin
> >  return ()
> >
> > processRow :: String -> [Record] -> String
> > processRow template row = "eric" ++ template
> >
> >
> >
> >
> >
> >
> > _______________________________________________
> > Beginners mailing list
> > Beginners at haskell.org
> > http://www.haskell.org/mailman/listinfo/beginners
> >
> 
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://www.haskell.org/mailman/listinfo/beginners