[Haskell-cafe] Re: hGetContents and lazyness

Max Vasin max.vasin at gmail.com
Tue Sep 23 10:03:08 EDT 2008


Micah Cowan <micah at cowan.name> writes:

> Max Vasin wrote:
>> Hello, haskellers!
>> 
>> Suppose we have a function (it filters package filenames from an apt Packages file):
>> 
>>> getPackageList :: FilePath -> IO [FilePath]
>>> getPackageList packageFile = withFile packageFile ReadMode $
>>>                              \h -> do c <- hGetContents h
>>>                                       return $ map (drop 10) $ filter (startsWith "Filename:") $ lines c -- (1)
>>>     where startsWith [] _ = True
>>>           startsWith _ [] = False
>>>           startsWith (x:xs) (y:ys) | x == y    = startsWith xs ys
>>>                                    | otherwise = False
>> 
>> When I apply it to a Packages file I (surely) get an empty list. This is an expected result due to
>> the laziness of hGetContents.
>
> Combined with the fact that you're not evaluating its non-strict result
> until after the file handle has been closed, yes.
>
> Your current set of IO actions is probably similar to:
>   . open file
>   . process file
>   . close file
>   . use results from processing the file.
> where the first three steps are all handled by your getPackageList. To
> avoid either getting incomplete (or empty) results, or having to
> strictify everything with $!, it'd be better for you to use a process
> more like:
>   . open file
>   . process file
>   . use results from processing the file.
>   . close file
> probably by moving the withFile outside of getPackageList, to wrap a
> function that prints the results after they've been obtained. The
> function passed to withFile should generally include all the processing
> related to the file and its results, I believe.
Yes. I should probably leave closing the file to the GC and use readFile; this
seems to be the simplest way.
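
A minimal sketch of the readFile variant, assuming the same "Filename:" field
layout as in the function above. The name packageFilenames is introduced here
only for illustration, and isPrefixOf from Data.List stands in for the
hand-written startsWith. readFile reads lazily as well, but it never closes
the handle behind the consumer's back: the handle is closed once the whole
contents have been read (or when it is garbage collected).

```haskell
import Data.List (isPrefixOf)

-- Pure part (hypothetical helper): pick the "Filename:" fields out of the
-- text of a Packages file and drop the 10-character "Filename: " prefix.
packageFilenames :: String -> [FilePath]
packageFilenames = map (drop 10) . filter ("Filename:" `isPrefixOf`) . lines

-- IO part: readFile keeps the handle open until the contents have been
-- consumed, so the lazily produced list stays valid for the caller.
getPackageList :: FilePath -> IO [FilePath]
getPackageList = fmap packageFilenames . readFile
```

Keeping the parsing in a pure function also makes it possible to try it on a
literal string in GHCi without touching any file.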

>> I tried changing line (1) to
>> 
>>> return $ map (drop 10) $ filter (startsWith "Filename:") $! lines c
>
> The $! forces strictness, but since it's deep in the result, it isn't
> evaluated until it's too late.
>
>> Changing it to
>> 
>>> return $! map (drop 10) $ filter (startsWith "Filename:") $ lines c
>> 
>> makes getPackageList function return several (but not all) filenames.
>
> I think we'd need to see the actual input and expected output, to
> understand what's going wrong here. It worked fine for me, for small tests.
The gzipped example file is here:
ftp://ftp.debian.org/debian/dists/lenny/contrib/binary-i386/Packages.gz

>
> By the way, it's good policy to always post complete, runnable examples.
> Requiring anyone who wants to help you to write additional code just to
> get it to run decreases the chances that someone will bother to do so.
Sorry. I've just omitted module imports:

> import Control.Monad (filterM, mapM)
> import System.IO (withFile, IOMode (ReadMode), hGetContents)
> import qualified System.Posix.Files as SPF (isDirectory, getFileStatus)

Running in GHCi:

GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
Loading package base ... linking ... done.
Prelude> :load Foo.hs 
[1 of 1] Compiling Foo              ( Foo.hs, interpreted )
Ok, modules loaded: Foo.
*Foo> getPackageList "Packages" >>= mapM_ putStrLn
Loading package old-locale-1.0.0.0 ... linking ... done.
Loading package old-time-1.0.0.0 ... linking ... done.
Loading package filepath-1.1.0.0 ... linking ... done.
Loading package directory-1.0.0.0 ... linking ... done.
Loading package unix-2.3.0.0 ... linking ... done.
pool/contrib/a/acx100/acx100-source_20070101-3_all.deb
pool/contrib/a/alien-arena/alien-arena_7.0-1_i386.deb
pool/contrib/a/alien-arena/alien-arena-browser_7.0-1_all.deb
pool/contrib/a/alien-arena/alien-arena-server_7.0-1_i386.deb
pool/contrib/a/alsa-tools/alsa-firmware-loaders_1.0.16-2_i386.deb
pool/contrib/a/amoeba/amoeba_1.1-19_i386.deb
pool/contrib/a/apple2/apple2_0.7.4-5_i386
*Foo> 

The printed list of package files is incomplete.
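
The output above is consistent with how ($!) works: it evaluates its argument
only to weak head normal form, which for a list means the first cons cell.
Forcing that one cell makes hGetContents read a first chunk of the file, and
presumably everything after that chunk is lost when withFile closes the
handle. A sketch of one way to keep withFile inside the function, assuming it
is acceptable to hold all the filenames in memory, is to force every element
before returning. The names getPackageListStrict and packageFilenames are
hypothetical.

```haskell
import Control.Exception (evaluate)
import Data.List (isPrefixOf)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Pure part (hypothetical helper): extract the "Filename:" fields.
packageFilenames :: String -> [FilePath]
packageFilenames = map (drop 10) . filter ("Filename:" `isPrefixOf`) . lines

-- Force the whole result while the handle is still open.
getPackageListStrict :: FilePath -> IO [FilePath]
getPackageListStrict packageFile =
  withFile packageFile ReadMode $ \h -> do
    c <- hGetContents h
    let names = packageFilenames c
    -- Walking the spine of names makes lines/filter consume all of c, and
    -- length forces every character of each name, so nothing is left
    -- unevaluated when withFile closes the handle.
    mapM_ (evaluate . length) names
    return names
```

With this version, getPackageListStrict "Packages" >>= mapM_ putStrLn should
no longer depend on when the caller consumes the list.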

--
WBR,
Max Vasin.


