[Haskell-cafe] Space leak

Daniel Fischer daniel.is.fischer at web.de
Thu Mar 11 17:46:11 EST 2010


Am Donnerstag 11 März 2010 00:24:28 schrieb Daniel Fischer:
> Hmm, offhand, I don't see why that isn't strict enough.

Turns out, mapM_ was a red herring. The villain was (zip and map).
I must confess, I don't know why it sort-of worked without the mapM_, 
though. "sort-of", because that also hung on to unnecessarily much memory, 
the space leak was just smaller than with the mapM_.

A very small change that eliminates the space leak, is

readFasta :: Int -> [Char] -> [Window]
readFasta windowSize sequence =
    -- get the header
    let (header,rest) = span (/= '\n') sequence
        chr = parseChromosome header
        go i (w:ws) = Window w chr i : go (i+1) ws
        go _ [] = []
    in go 0 $ slideWindow windowSize $ filter (/= '\n') rest

You can improve performance by eliminating slideWindow and the intermediate 
Window list (merging fastaExtractor and readFasta), 

{-# LANGUAGE BangPatterns #-}

readFasta2 :: (String -> Bool) -> Int -> String
readFasta2 test windowSize sequence =
    let (header,rest) = span (/= '\n') sequence
        chr = parseChromosome header
        schr = show chr
        go !i st@(_:tl)
            | test w    = w ++ '\t' : schr ++ '\t' : show i ++ '\n' : go 
(i+1) tl
            | otherwise = go (i+1) tl
              where
                w = take windowSize st
        go _ [] = []
    in go 0 (filter (/= '\n')) rest



More information about the Haskell-Cafe mailing list