[Haskell-cafe] Speedy parsing
Tillmann Rendel
rendel at rbg.informatik.tu-darmstadt.de
Thu Jul 19 20:47:52 EDT 2007
Re, Joseph (IT) wrote:
> At this point I'm out of ideas, so I was hoping someone could identify
> something stupid I've done (I'm still novice of FP in general, let alone
> for high performance) or direct me to a guide,website,paper,library, or
> some other form of help.
Two ideas about your approaches:
(1) try to avoid explicit recursion by using some standard library
functions instead. it's easier (once you have learned the library) and may
be faster (since the library may be written in an easy-to-optimize style).
(2) try lazy ByteStrings, they should be faster.
http://www.cse.unsw.edu.au/~dons/fps.html
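To illustrate both points, here is a minimal sketch (not taken from your code) that counts the lines of some input twice: once with hand-written recursion over a String, and once with a single library call on a lazy ByteString. The names countLinesRec and countLinesLib are made up for this example.

```haskell
import qualified Data.ByteString.Lazy.Char8 as B

-- Explicit recursion over a String: counts newline characters by hand.
countLinesRec :: String -> Int
countLinesRec []          = 0
countLinesRec ('\n' : cs) = 1 + countLinesRec cs
countLinesRec (_ : cs)    = countLinesRec cs

-- The same count with a library function on a lazy ByteString.
-- B.count returns an Int64, hence the fromIntegral.
countLinesLib :: B.ByteString -> Int
countLinesLib = fromIntegral . B.count '\n'
```

Whether the second version is actually faster for your data is something only profiling can tell; it is merely shorter and leaves the loop to the library.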
As an example, here is a program that sorts the key=value fields within
each line of a csv file by key. csv parses the csv format, uncsv produces
it. these functions can't handle '=' in the key or ',' in the key or
value. treesort sorts by inserting stuff into a map and removing it in
ascending order:
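> import qualified Data.ByteString.Lazy.Char8 as B
> import qualified Data.Map as Map
> import Control.Arrow (second)
>
> -- Split the input into lines, each line into "key=value" fields,
> -- and each field into a (key, value) pair.
> csv :: B.ByteString -> [[(B.ByteString, B.ByteString)]]
> csv = (map $ map $ second B.tail . B.break (== '=')) .
>       (map $ B.split ',') .
>       (B.split '\n')
>
> -- Inverse of csv. B.intercalate was called B.join in older
> -- bytestring releases.
> uncsv :: [[(B.ByteString, B.ByteString)]] -> B.ByteString
> uncsv = (B.intercalate $ B.pack "\n") .
>         (map $ B.intercalate $ B.pack ",") .
>         (map $ map $ \(key, val) -> B.concat [key, B.pack "=", val])
>
> -- Sort by key via a Map; for duplicate keys only the last value is kept.
> treesort :: Ord k => [(k, v)] -> [(k, v)]
> treesort = Map.toAscList . Map.fromList
>
> main :: IO ()
> main = B.interact $ uncsv . map treesort . csv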
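To see what treesort does to a single line, here is a small standalone sketch using plain Strings and Ints instead of ByteStrings. The value example is made up for illustration. Note that going through Map.fromList also drops duplicate keys, keeping the last value, which may or may not be what you want.

```haskell
import qualified Data.Map as Map

-- Sort key/value pairs by key by round-tripping through a Map.
treesort :: Ord k => [(k, v)] -> [(k, v)]
treesort = Map.toAscList . Map.fromList

-- A hypothetical parsed line with an out-of-order and a duplicate key.
example :: [(String, Int)]
example = treesort [("b", 2), ("a", 1), ("b", 3)]
-- example == [("a",1),("b",3)]
```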
Tillmann