.
> (Just ask http://validator.w3.org/ .)

Indeed. The original is even worse, with overlapping nodes and other
such treasures, which make navigation in HXT tricky at times.

> I trust that you are parsing
> this because you realize it is all wrong and you want to
> programmatically convert it to proper markup.

Yep! I sure wouldn't be doing this if I had control of the
original HTML.

>
> Since the file is unstructured, I choose not to sweat over restoring
> the structure in an HXT arrow. The HXT arrow will return a flat list,
> just as the file is a flat ensemble.

I was about to write a follow-up just as your mail came in... I've ended
up with the same solution as you've kindly suggested. Another option I
came across is Control.Arrow.ArrowTree.changeChildren, which could be
used to restore a more normalised structure ready for more processing.

Thanks

Daniel

From isto.aho at dnainternet.net Wed Nov 1 17:16:55 2006
From: isto.aho at dnainternet.net (isto)
Date: Wed Nov 1 17:16:42 2006
Subject: [Haskell-cafe] How to improve speed? (MersenneTwister is several times slower than C version)
Message-ID: <1162419415.26204.26.camel@localhost.localdomain>

Hi all,

On HaWiki there was an announcement of MersenneTwister made by Lennart
Augustsson. On a typical run to find the 10000000th random number, the
output is (code shown below):

$ time ./testMTla
Testing Mersenne Twister.
Result is [3063349438]

real 0m4.925s
user 0m4.856s

I was experimenting with the very same algorithm and tried to make it
efficient (by using IOUArray). Now a typical run looks like this (code
shown below):

$ time ./testMT
Testing Mersenne Twister.
3063349438

real 0m3.032s
user 0m3.004s

The original C version (modified so that only the last number is
shown) typically gives

$ time ./mt19937ar
outputs of genrand_int32()
3063349438

real 0m0.624s
user 0m0.616s

Results are similar with 64 bit IOUArray against 64 bit C variant.
C seems to work about 5 to 10 times faster in this case.

I have tried different things but now I'm stuck. unsafeRead
and unsafeWrite improved the lazy (STUArray) version and the
IOUArray versions a bit, but not very much. I took a look at the Core
file, but I'm not sure where boxed values are ok. E.g. should IOUArray
Int Word64 be replaced with something else?

Any hints and comments on how to improve the efficiency and make
everything better will be appreciated a lot!

br, Isto

----------------------------- testMTla.hs (MersenneTwister, see HaWiki)
module Main where

-- ghc -O3 -optc-O3 -optc-ffast-math -fexcess-precision --make testMTla

import MersenneTwister

main = do
    putStrLn "Testing Mersenne Twister."
    let mt = mersenneTwister 100
        w = take 1 (drop 9999999 mt)
        -- w = take 1 (drop 99 mt)
    putStrLn $ "Result is " ++ (show w)
-----------------------------

----------------------------- testMT.hs
module Main where

-- Compile eg with
-- ghc -O3 -optc-O3 -optc-ffast-math -fexcess-precision --make testMT

import Mersenne

genRNums32 :: MT32 -> Int -> IO (MT32)
genRNums32 mt nCnt = gRN mt nCnt
  where
    gRN :: MT32 -> Int -> IO (MT32)
    gRN mt nCnt | mt `seq` nCnt `seq` False = undefined
    gRN mt 1 = do
        (r,mt') <- next32 mt
        putStrLn $ (show r)
        return mt'
    gRN mt nCnt = do
        (r,mt') <- next32 mt
        gRN mt' $! (nCnt-1)


main = do
    putStrLn "Testing Mersenne Twister."
    mt32 <- initialiseGenerator32 100
    genRNums32 mt32 10000000
-----------------------------

----------------------------- Mersenne.hs (sorry for linewraps)
module Mersenne where

import Data.Bits
import Data.Word
import Data.Array.Base
import Data.Array.MArray
import Data.Array.IO
-- import System.Random


data MT32 = MT32 (IOUArray Int Word32) Int
data MT64 = MT64 (IOUArray Int Word64) Int


last32bitsof :: Word32 -> Word32
last32bitsof a = a .&. 0xffffffff -- == (2^32-1)

lm32 = 0x7fffffff :: Word32
um32 = 0x80000000 :: Word32
mA32 = 0x9908b0df :: Word32 -- == 2567483615

-- Array of length 624.
initialiseGenerator32 :: Int -> IO MT32
initialiseGenerator32 seed = do
    let s = last32bitsof (fromIntegral seed)::Word32
    mt <- newArray (0,623) (0::Word32)
    unsafeWrite mt 0 s
    iG mt s 1
    mt' <- generateNumbers32 mt
    return (MT32 mt' 0)
  where
    iG :: (IOUArray Int Word32) -> Word32 -> Int -> IO (IOUArray Int Word32)
    iG mt lastNro n
        | n == 624 = return mt
        | otherwise = do let n1 = lastNro `xor` (shiftR lastNro 30)
                             new = (1812433253 * n1 + (fromIntegral n)::Word32)
                         unsafeWrite mt n new
                         iG mt new (n+1)


generateNumbers32 :: (IOUArray Int Word32) -> IO (IOUArray Int Word32)
generateNumbers32 mt = gLoop 0 mt
  where
    gLoop :: Int -> (IOUArray Int Word32) -> IO (IOUArray Int Word32)
    gLoop i mt
        | i==623 = do
            wL <- unsafeRead mt 623
            w0 <- unsafeRead mt 0
            w396 <- unsafeRead mt 396
            let y = (wL .&. um32) .|. (w0 .&. lm32) :: Word32
            if even y
                then unsafeWrite mt 623 (w396 `xor` (shiftR y 1))
                else unsafeWrite mt 623 (w396 `xor` (shiftR y 1) `xor` mA32)
            return mt
        | otherwise = do
            wi <- unsafeRead mt i
            wi1 <- unsafeRead mt (i+1)
            w3 <- unsafeRead mt ((i+397) `mod` 624)
            let y = (wi .&. um32) .|. (wi1 .&. lm32)
            if even y
                then unsafeWrite mt i (w3 `xor` (shiftR y 1))
                else unsafeWrite mt i (w3 `xor` (shiftR y 1) `xor` mA32)
            gLoop (i+1) mt


next32 :: MT32 -> IO (Word32, MT32)
next32 (MT32 mt i)
    | i >= 624 = do mt' <- generateNumbers32 mt
                    let m = MT32 mt' (i `mod` 624)
                    (w,m') <- next32 m
                    return (w,m')
    | otherwise = do
        y <- unsafeRead mt i
        let y1 = y `xor` (shiftR y 11)
            y2 = y1 `xor` ((shiftL y1 7 ) .&. 0x9d2c5680) -- == 2636928640
            y3 = y2 `xor` ((shiftL y2 15) .&. 0xefc60000) -- == 4022730752
            y4 = y3 `xor` (shiftR y3 18)
        return $ (y4, MT32 mt (i+1))


mA64 = 0xB5026F5AA96619E9 :: Word64
um64 = 0xFFFFFFFF80000000 :: Word64
lm64 = 0x7FFFFFFF :: Word64

initialiseGenerator64 :: Int -> IO (MT64)
initialiseGenerator64 seed = do
    let s = (fromIntegral seed)::Word64
    mt <- newArray (0,311) (0::Word64)
    unsafeWrite mt 0 s
    iG mt s 1
    generateNumbers64 mt
    return (MT64 mt 0)
  where
    iG :: (IOUArray Int Word64) -> Word64 -> Int -> IO (IOUArray Int Word64)
    iG mt lN i | mt `seq` lN `seq` i `seq` False = undefined
    iG mt lastNro 312 = return mt
    iG mt lastNro n = do
        let n1 = lastNro `xor` (shiftR lastNro 62)
            new = (6364136223846793005 * n1 + (fromIntegral n)::Word64)
        unsafeWrite mt n new
        iG mt new $! (n+1)

generateNumbers64 :: (IOUArray Int Word64) -> IO ()
generateNumbers64 mt = gLoop 0
  where
    gLoop :: Int -> IO ()
    gLoop i | i `seq` False = undefined
    gLoop 311 = do
        wL <- unsafeRead mt 311
        w0 <- unsafeRead mt 0
        w155 <- unsafeRead mt 155
        let y = (wL .&. um64) .|. (w0 .&. lm64) :: Word64
        if even y
            then unsafeWrite mt 311 (w155 `xor` (shiftR y 1))
            else unsafeWrite mt 311 (w155 `xor` (shiftR y 1) `xor` mA64)
        return ()
    gLoop i = do
        wi <- unsafeRead mt i
        wi1 <- unsafeRead mt (i+1)
        w3 <- unsafeRead mt ((i+156) `mod` 312)
        let y = (wi .&. um64) .|. (wi1 .&. lm64)
        if even y
            then unsafeWrite mt i (w3 `xor` (shiftR y 1))
            else unsafeWrite mt i (w3 `xor` (shiftR y 1) `xor` mA64)
        gLoop $! (i+1)


next64 :: MT64 -> IO (Word64, MT64)
next64 (MT64 mt 312) = do generateNumbers64 mt
                          let m = MT64 mt 0
                          (w,m') <- next64 m
                          return (w,m')
next64 (MT64 mt i) = do
    y <- unsafeRead mt i
    let y1 = y `xor` ((shiftR y 29) .&. 0x5555555555555555)
        y2 = y1 `xor` ((shiftL y1 17) .&. 0x71D67FFFEDA60000)
        y3 = y2 `xor` ((shiftL y2 37) .&. 0xFFF7EEE000000000)
        y4 = y3 `xor` (shiftR y3 43)
    return $! (y4, MT64 mt (i+1))

From slavomir.kaslev at gmail.com Wed Nov 1 18:06:05 2006
From: slavomir.kaslev at gmail.com (Slavomir Kaslev)
Date: Wed Nov 1 18:06:00 2006
Subject: [Haskell-cafe] Class
Message-ID: <171dfd0a0611011506y2ad9c14ev33b126d9cc4160be@mail.gmail.com>

Hello.

I am new to Haskell and I am going through "Haskell: The Craft of
Functional Programming". I am trying to grasp Haskell's classes and
instances, so here is slightly modified code from the book:

class Show a => Visible a where
    toString :: a -> String
    toString = show
    size :: a -> Int
    size = length . show

instance Visible a => Visible [a] where
    toString = concat . map toString
    size = foldl (+) 0 . map size

vSort :: (Visible a, Ord a) => [a] -> String
vSort = toString . List.sort

s = vSort [1..3]

Unfortunately, in ghc it gives the following type error:

    Ambiguous type variable `a' in the constraints:
      `Visible a' arising from use of `vSort' at d:/tmp.hs:83:4-8
      `Enum a' arising from the arithmetic sequence `1 .. 3' at d:/tmp.hs:83:10-15
      `Num a' arising from the literal `3' at d:/tmp.hs:83:14
      `Ord a' arising from use of `vSort' at d:/tmp.hs:83:4-8
    Probable fix: add a type signature that fixes these type variable(s)
Failed, modules loaded: none.

As you can see, Visible is nothing more than an adapter to the Show
class. As I understand things so far, [1..3] :: (Num a, Enum a) => [a]
has a Show instance, and so does class Num (which 'subclasses' Show).
Therefore, I can't see any reason why the toString function can't call
show from those instances.

Can someone please enlighten my (still) C++ thinking head?
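
A note on the error above: the superclass constraint in `class Show a => Visible a` says that every Visible type must also be a Show type, not the reverse, so a type with a Show instance is not automatically Visible. In addition, Haskell's defaulting of numeric literals only applies when all the constraints on the type variable involve standard classes, so the user-defined `Visible a` constraint leaves `a` ambiguous. What follows is a minimal sketch of one possible fix, under two assumptions that are not part of the original post: an `instance Visible Int` is declared, and the literal is annotated with `Int`. It also imports `sort` from `Data.List` instead of the `List.sort` used in the post.

```haskell
module Main where

import Data.List (sort)

class Show a => Visible a where
    toString :: a -> String
    toString = show
    size :: a -> Int
    size = length . show

-- Assumed instance, not in the original post. The empty body picks up
-- the class defaults, so toString = show and size = length . show.
instance Visible Int

instance Visible a => Visible [a] where
    toString = concatMap toString
    size = sum . map size

vSort :: (Visible a, Ord a) => [a] -> String
vSort = toString . sort

-- The annotation fixes the element type, so there is no ambiguous
-- type variable left for the compiler to resolve.
s :: String
s = vSort [1 .. 3 :: Int]

main :: IO ()
main = putStrLn s
```

Any other type could be chosen in place of `Int`, as long as it has Visible, Ord, Num and Enum instances in scope.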

--
Slavomir Kaslev

From slavomir.kaslev at gmail.com Wed Nov 1 18:08:01 2006
From: slavomir.kaslev at gmail.com (Slavomir Kaslev)
Date: Wed Nov 1 18:07:51 2006
Subject: [Haskell-cafe] Re: Class
In-Reply-To: <171dfd0a0611011506y2ad9c14ev33b126d9cc4160be@mail.gmail.com>
References: <171dfd0a0611011506y2ad9c14ev33b126d9cc4160be@mail.gmail.com>
Message-ID: <171dfd0a0611011508g7324d6beu4d75d4557bcca3c9@mail.gmail.com>

Err, sorry for the meaningless mail subject. Should be 'Newbie class
problem' or something like that.

--
Slavomir Kaslev

From dons at cse.unsw.edu.au Wed Nov 1 20:17:35 2006
From: dons at cse.unsw.edu.au (Donald Bruce Stewart)
Date: Wed Nov 1 20:17:33 2006
Subject: [Haskell-cafe] How to improve speed? (MersenneTwister is several times slower than C version)
In-Reply-To: <1162419415.26204.26.camel@localhost.localdomain>
References: <1162419415.26204.26.camel@localhost.localdomain>
Message-ID: <20061102011735.GB6957@cse.unsw.EDU.AU>

Now, this will be hard to get close to the highly tuned C. Possibly
it's doable.

The main tricks are documented here:
    http://haskell.org/haskellwiki/Performance/GHC#Unboxed_types

Inspecting the Core to ensure the math is being inlined and unboxed will
be the most crucial issue, I'd imagine.

Then again, an FFI binding to mersenne.c is also a good idea :)

-- Don

isto.aho:
> Hi all,
>
> On HaWiki was an announcement of MersenneTwister made by Lennart
> Augustsson. On a typical run to find out 10000000th rnd num the output
> is (code shown below):
>
> $ time ./testMTla
> Testing Mersenne Twister.
> Result is [3063349438]
>
> real 0m4.925s
> user 0m4.856s
>
> I was exercising with the very same algorithm and tried to make it
> efficient (by using IOUArray): now a typical run looks like (code shown
> below):
>
> $ time ./testMT
> Testing Mersenne Twister.
> 3063349438
>
> real 0m3.032s
> user 0m3.004s
>
> The original C-version (modified so that only the last number is
> shown) gives typically
>
> $ time ./mt19937ar
> outputs of genrand_int32()
> 3063349438
>
> real 0m0.624s
> user 0m0.616s
>
> Results are similar with 64 bit IOUArray against 64 bit C variant.
> C seems to work about 5 to 10 times faster in this case.
>
> I have tried to do different things but now I'm stuck. unsafeRead
> and unsafeWrite improved a bit the lazy (STUArray-version) and
> IOUArray-versions but not very much. I took a look of Core file but
> then, I'm not sure where the boxed values are ok. E.g. should IOUArray
> Int Word64 be replaced with something else?
>
> Any hints and comments on how to improve the efficiency and make
> everything better will be appreciated a lot!
>
> br, Isto
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe

From nuno at hotmail.co.uk Wed Nov 1 20:22:34 2006
From: nuno at hotmail.co.uk (Nuno Pinto)
Date: Wed Nov 1 20:22:26 2006
Subject: [Haskell-cafe] Basic Binary IO
Message-ID: