[Haskell-cafe] 1st attempt at parallelizing
Felipe Lessa
felipe.lessa at gmail.com
Mon Jul 26 18:25:09 EDT 2010
2010/7/26 Günther Schmidt <gue.schmidt at web.de>:
> Hi all,
Hello!
> I'm spidering web pages, the implementation currently is synchronous. I'd
> like to parallelize this for speed-up, ie. get up to 6 pages in parallel and
> recycle those threads.
This is usually called concurrent programming, not parallel.
> Now I have come across good examples for this on the web before, but I doubt
> I'd find it again right away.
>
> I'd appreciate some good pointers.
There's a simple way of doing this with Chans, for example:
import Control.Applicative
import Control.Concurrent.STM
import Control.Monad
import qualified Data.Map as M
data Page = ...
data Info = ...
download :: Page -> IO Info
download = ...
getOneByOne :: [Page] -> IO (M.Map Page Info)
getOneByOne = M.fromList <$> mapM (\p -> (,) p <$> download p)
downloader :: TChan (Maybe Page) -> TChan (Page, Info) -> IO ()
downloader in out = do
mp <- atomically (readTChan in)
case mp of
Nothing -> return ()
Just p -> download p >>= atomically . writeTChan out
getConcurrent :: Int -> [Page] -> IO [M.Map Page Info]
getConcurrent n xs = do
in <- newTChanIO
out <- newTChanIO
replicateM_ n (forkIO $ downloader in out) -- create n threads
mapM (writeTChan in . Just) xs
replicateM_ n (writeTChan in Nothing) -- kill n threads
M.fromList <$> mapM (\_ -> readTChan out) xs
This code doesn't take exceptions into account, which you should, but
this works. Well, I guess, didn't try, if it compiles then it should
;).
HTH,
--
Felipe.
More information about the Haskell-Cafe
mailing list