[Haskell-cafe] Speedy parsing
Re, Joseph (IT)
Joseph.Re at MorganStanley.com
Fri Jul 20 13:54:16 EDT 2007
Ah, I thought I might have to resort to one of the ByteStrings modules.
I've heard of them but was more or less avoiding them due to some
complexities with installing extra libraries in my current dev
environment. I'll try to work that out with the sysadmins and try it
out.
Thanks for the advice
-----Original Message-----
From: Tillmann Rendel [mailto:rendel at rbg.informatik.tu-darmstadt.de]
Sent: Thursday, July 19, 2007 8:48 PM
To: Re, Joseph (IT)
Cc: haskell-cafe at haskell.org
Subject: Re: [Haskell-cafe] Speedy parsing
Re, Joseph (IT) wrote:
> At this point I'm out of ideas, so I was hoping someone could identify
> something stupid I've done (I'm still novice of FP in general, let
> alone for high performance) or direct me to a
> guide,website,paper,library, or some other form of help.
Two ideas about your aproaches:
(1) try to avoid explicit recursion by using some standard library
functions instead. it's easier (once you learned the library) and may be
faster (since the library may be written in a easy to optimize style).
(2) try lazy ByteStrings, they should be faster.
http://www.cse.unsw.edu.au/~dons/fps.html
As an example, sorting of the individual lines of a csv files by key.
csv parses the csv format, uncsv produces it. these functions can't
handle '=' in the key or ',' in the key or value. treesort sorts by
inserting stuff into a map and removing it in ascending order:
> import System.Environment
> import qualified Data.ByteString.Lazy.Char8 as B import qualified
> Data.Map as Map import Control.Arrow (second)
>
> csv = (map $ map $ second B.tail . B.break (== '=')) .
> (map $ B.split ',') .
> (B.split '\n')
>
> uncsv = (B.join $ B.pack "\n") .
> (map $ B.join $ B.pack ",") .
> (map $ map $ \(key, val) -> B.concat [key, B.pack "=", val])
>
> treesort = Map.toAscList . Map.fromList
>
> main = B.interact $ uncsv . map treesort . csv
Tillmann
--------------------------------------------------------
NOTICE: If received in error, please destroy and notify sender. Sender does not intend to waive confidentiality or privilege. Use of this email is prohibited when received in error.
More information about the Haskell-Cafe
mailing list