[Haskell-cafe] strict version of Haskell - does it exist?

Marc Weber marco-oweber at gmx.de
Mon Jan 30 17:12:19 CET 2012


Replying to all replies at once:

> Malcolm Wallace
>  At work, we have a strict version of Haskell
:-) which proofs that it is worth thinking about it.

> Ertugrul
> If you want to save the time to learn how to write efficient Haskell
> programs, you may want to have a look into the Disciple language. 
Yes - I will. Its on my TODO list for at least 12 month :(
Not sure whether there are parser combinator libraries yet.

@  Herbert Valerio Riedel  (suggesting aeson)
I gave it yet another try - see below.
It still fails.

@ Felipe Almeida Lessa  (suggesting conduits and atto parsec)
I mentioned that I already tried it. Counting lines only was a lot slower than
counting lines and parsing JSON using PHP.

@ Chris Wong
> flag is that strictness isn't just something you turn on and off willy-nilly
You're right. But those features are not required for all tasks :)
Eg have a look at Data.Map. Would a strict implementation look that different?




I came up with this now. Trying strict bytestrings and Aeson.
note the LB.fromChunks usage below to turn a strict into a lazy bytestring.

Result: PHP script doing the same runs in 33secs (using the
AFindCheckoutsAndContacts branch) The haskell program below - I stopped it
after 8 min. (At least it does no longer cause my system to swap ..

You're right: I could start profiling. I could learn about how to optimize it.)
But why? The conclusion is that if I ever use yesod - and if I ever want to
analyse logs - I should call PHP from yesod as external process !? :-(

Even if I had a core i7 (8 threads in parallel) I still would not be sure
whether Haskell would be the right choice. I agree that I assume that all data
fits into memory so that piecewise evaluation doesn't matter too much.

Thanks for all your help - it proofs once again that the Haskell
community is the most helpful I know about. Yet I think me being the
programmer Haskell is the wrong choice for this task.

Thanks Marc Weber

my new attempt - now I should start profiling.. Looks like I haven't
built all libs with profiling support ..



  import Data.Aeson.Types
  import Data.Aeson
  import Data.List
  import Control.Applicative
  import Debug.Trace
  import qualified Data.Map as M
  import Action
  import Data.Aeson.Parser as AP

  import qualified Data.ByteString.Lazy as LB

  import qualified Data.ByteString as BS
  import qualified Data.ByteString.Char8 as LBC8

  data Action = ACountLine | AFindCheckoutsAndContacts

  -- lines look like this:
  -- {"id":"4ee535f01550c","r":"","ua":"Mozilla\/5.0 (compatible; bingbot\/2.0; +http:\/\/www.bing.com\/bingbot.htm)","t":1323644400,"k":"XXX","v":"YY"}

  data Item = Item {
    id :: SB.ByteString,
    ua :: SB.ByteString,
    t  :: Int,
    k  :: SB.ByteString,
    v  :: SB.ByteString
  }

  instance FromJSON Item where
      parseJSON (Object v) = Item <$>
                             v .: "id" <*>
                             v .: "ua" <*>
                             v .: "t" <*>
                             v .: "k" <*>
                             v .: "v"
      parseJSON _ = empty


  run :: Action -> [FilePath] -> IO ()

  run AFindCheckoutsAndContacts files = do
    -- find all ids quering the server more than 100 times.
    -- do so by building a map counting queries

    (lines :: [BS.ByteString]) <- fmap (concat . (map LBC8.lines) ) $ mapM BS.readFile files
    let processLine :: (M.Map BS.ByteString Int) -> BS.ByteString -> (M.Map BS.ByteString Int)
        processLine st line = case decode' (LB.fromChunks [line]) of
                                  Nothing -> st -- traceShow ("bad line " ++ (LBC8.unpack line)) st
                                  Just (Item id ua t k v) -> M.insertWith (+) k 1 st
    let count_by_id = foldl' processLine (M.empty) lines
    mapM_ print $ M.toList $ M.filter (> 100) count_by_id



More information about the Haskell-Cafe mailing list