[Haskell] Probably a trivial thing for people knowing Haskell

Friedrich frido at q-software-solutions.de
Tue Oct 21 03:18:04 EDT 2008


Udo Stenzel <u.stenzel at web.de> writes:

>> Friedrich wrote:
>> >Ok to  be more concrete is the laziness "hidden" here?
>> >
>> >check_line line sum count =  
>> >    let match = matchRegex regexp line
>> >        in case match of
>> >               Just strs -> (sum + read (head strs) :: Integer, count + 1)
>> >               Nothing -> (sum, count)
>
> Yes, part of it.  To see why, put yourself into the role of an evaluator
> for your program.  An application of check_line will not be evaluated
> until necessary, and it becomes necessary only if the result is bound to
> a pattern (and that binding is needed for some reason).  At that point,
> enough has to be evaluated to determine whether the result is actually a
> pair or bottom.
>
> So what will you do?  The body of check_line is a case expression, so
> you need to sufficiently evaluate its scrutinee.  You evaluate enough of
> matchRegex to see whether the result is Nothing or Just.  Let's say it's
> Just.  So you descent into the Just branch, and you see the result is a
> pair (and not bottom).  The elements of the pair have not been
> evaluated, there was no need to.  Also, the arguments to check_line have
> not been evaluated, except for line.
>
> You need to force the evaluation of the elements of the result pair
> whenever the pair itself is demanded, for example:
>
>> >check_line line sum count =  
>> >    let match = matchRegex regexp line
>> >        in case match of
>> >               Just strs -> ((,) $! (sum + read (head strs) :: Integer)) $! count + 1
>> >               Nothing -> ((,) $! sum) $! count)
>
> (The associativity of ($!) is inconvenient here.  I want
> left-associative ($!).  Actually, a strict pair type would be even more
> convenient here.)
>
> On recent GHC with bang-patterns, this short-cut works, too.  It's not
> quite equivalent, because it will create unevaluated thunks, though they
> won't pile up:
>
>> >check_line line !sum !count =  
>> >    let match = matchRegex regexp line
>> >        in case match of
>> >               Just strs -> (sum + read (head strs) :: Integer, count + 1)
>> >               Nothing -> (sum, count)

Ok, I followed the suggestions. Now I have the following code:

module Main where
import System
import System.IO
import System.Directory
import System.IO.Error
import Text.Regex
import Control.Monad

regexp = mkRegex ("([0-9]+) Windows ex")

main = do
       files <- show_dir "[0-9].*"
       (sum,count) <- run_on_all_files (0,0) files
       let dd = (fromIntegral (sum::Integer))/ (fromIntegral (count::Int))
           in
            putStr("Download = " ++ show sum ++ " in " ++ show count ++ " days are " ++ show dd ++ " downloads/day\n") 




run_on_all_files (a,b) [] = return (a,b)
run_on_all_files (a,b) (x:xs) = do (s,c) <- run_on(a,b) x
                                   run_on_all_files (s,c) xs


run_on (a,b) file_name = do
    handle <- openFile file_name ReadMode
    (sum,count) <- for_each_line (a,b) handle
    hClose handle
    return ((sum,count))
                              
for_each_line (sum, count) handle = do
                       l <- try (hGetLine handle)
                       case l of
                              Left err 
                                  | isEOFError err -> return(sum,count)
                                  | otherwise -> ioError err
                              Right line  -> do 
                                             let (nsum, ncount) = count_downloads line (sum, count) 
                                             for_each_line (nsum,ncount) handle 
                                                                                
                       
          
count_downloads line (!sum, !count) =  
    let match = matchRegex regexp line
        in case match of
               Just strs -> (sum + read (head strs) :: Integer, count + 1)
               Nothing -> (sum, count)
                    
                            
    
show_dir regmatch = do   
                    files <- getDirectoryContents "."
                    let reg = mkRegex regmatch in
                              return(filter (\file_name -> let fm = matchRegex reg file_name
                                      in case fm of
                                      Just strs -> True
                                      Nothing -> False) files)




But it still  sucks  memor as wild and more or less crashes the
system. So why's that  than?

Regards
Friedrich



More information about the Haskell mailing list