[Haskell-cafe] Re: hGetContents and lazyness
Max Vasin
max.vasin at gmail.com
Tue Sep 23 10:03:08 EDT 2008
Micah Cowan <micah at cowan.name> writes:
> Max Vasin wrote:
>> Hello, haskellers!
>>
>> Suppose we have function (it filters package filenames from apt Packages file):
>>
>>> getPackageList :: FilePath -> IO [FilePath]
>>> getPackageList packageFile = withFile packageFile ReadMode $
>>>     \h -> do c <- hGetContents h
>>>              return $ map (drop 10) $ filter (startsWith "Filename:") $ lines c -- (1)
>>>   where startsWith [] _ = True
>>>         startsWith _ [] = False
>>>         startsWith (x:xs) (y:ys) | x == y = startsWith xs ys
>>>                                  | otherwise = False
>>
>> When I apply it to a Packages file, I (surely) get an empty list. This is the expected result, due to
>> the laziness of hGetContents.
>
> Combined with the fact that you're not evaluating its non-strict result
> until after the file handle has been closed, yes.
>
> Your current set of IO actions is probably similar to:
> . open file
> . process file
> . close file
> . use results from processing the file.
> where the first three steps are all handled by your getPackageList. To
> avoid either getting incomplete (or empty) results, or having to
> strictify everything with $!, it'd be better for you to use a process
> more like:
> . open file
> . process file
> . use results from processing the file.
> . close file
> probably by moving the withFile outside of getPackageList, to wrap a
> function that prints the results after they've been obtained. The
> function passed to withFile should generally include all the processing
> related to the file and its results, I believe.
Yes. I should probably leave closing the file to the GC and use readFile; this
seems to be the simplest way.
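
A minimal sketch of both options discussed above, not tested against the real
Packages file. The names packageFilenames, withPackageList and
getPackageListLazy are made up for illustration, and isPrefixOf from Data.List
stands in for my hand-written startsWith. The first function follows the
"use the results before closing the file" ordering; the second relies on
readFile, whose handle is closed once the contents have been read to the end
(or when it is garbage collected).

```haskell
import Data.List (isPrefixOf)
import System.IO (withFile, IOMode (ReadMode), hGetContents)

-- Pure part of getPackageList: keep the "Filename:" lines and drop the
-- 10-character "Filename: " prefix from each of them.
packageFilenames :: String -> [FilePath]
packageFilenames = map (drop 10) . filter ("Filename:" `isPrefixOf`) . lines

-- Option 1 (hypothetical name): the caller's action runs inside withFile,
-- so the lazy contents are consumed while the handle is still open.
-- The action must finish using the list before it returns; handing the
-- list back with 'return' would reintroduce the original problem.
withPackageList :: FilePath -> ([FilePath] -> IO a) -> IO a
withPackageList packageFile use =
  withFile packageFile ReadMode $ \h -> do
    c <- hGetContents h
    use (packageFilenames c)

-- Option 2 (hypothetical name): readFile opens the file and reads it
-- lazily; nothing closes the handle before the contents are consumed.
getPackageListLazy :: FilePath -> IO [FilePath]
getPackageListLazy packageFile = fmap packageFilenames (readFile packageFile)
```

In GHCi these would be used as
withPackageList "Packages" (mapM_ putStrLn) and
getPackageListLazy "Packages" >>= mapM_ putStrLn respectively.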
>> I tried changing line (1) to
>>
>>> return $ map (drop 10) $ filter (startsWith "Filename:") $! lines c
>
> The $! forces strictness, but since it's deep in the result, it isn't
> evaluated until it's too late.
>
>> Changing it to
>>
>>> return $! map (drop 10) $ filter (startsWith "Filename:") $ lines c
>>
>> makes getPackageList function return several (but not all) filenames.
>
> I think we'd need to see the actual input and expected output, to
> understand what's going wrong here. It worked fine for me, for small tests.
The gzipped example file is here:
ftp://ftp.debian.org/debian/dists/lenny/contrib/binary-i386/Packages.gz
>
> By the way, it's good policy to always post complete, runnable examples.
> Requiring anyone who wants to help you to write additional code just to
> get it to run decreases the chances that someone will bother to do so.
Sorry. I've just omitted module imports:
> import Control.Monad (filterM, mapM)
> import System.IO (withFile, IOMode (ReadMode), hGetContents)
> import qualified System.Posix.Files as SPF (isDirectory, getFileStatus)
Running in GHCi:
GHCi, version 6.8.2: http://www.haskell.org/ghc/ :? for help
Loading package base ... linking ... done.
Prelude> :load Foo.hs
[1 of 1] Compiling Foo ( Foo.hs, interpreted )
Ok, modules loaded: Foo.
*Foo> getPackageList "Packages" >>= mapM_ putStrLn
Loading package old-locale-1.0.0.0 ... linking ... done.
Loading package old-time-1.0.0.0 ... linking ... done.
Loading package filepath-1.1.0.0 ... linking ... done.
Loading package directory-1.0.0.0 ... linking ... done.
Loading package unix-2.3.0.0 ... linking ... done.
pool/contrib/a/acx100/acx100-source_20070101-3_all.deb
pool/contrib/a/alien-arena/alien-arena_7.0-1_i386.deb
pool/contrib/a/alien-arena/alien-arena-browser_7.0-1_all.deb
pool/contrib/a/alien-arena/alien-arena-server_7.0-1_i386.deb
pool/contrib/a/alsa-tools/alsa-firmware-loaders_1.0.16-2_i386.deb
pool/contrib/a/amoeba/amoeba_1.1-19_i386.deb
pool/contrib/a/apple2/apple2_0.7.4-5_i386
*Foo>
The printed list of package files is incomplete.
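
A likely explanation, offered as an assumption rather than a diagnosis:
return $! evaluates the list only to weak head normal form, that is, up to its
first cons cell. The rest of the list stays unevaluated, and after withFile
has closed the handle it can only yield what hGetContents had already
buffered, which would account for the output stopping in the middle of a line.
The sketch below (getPackageListStrict is a hypothetical name, and isPrefixOf
from Data.List replaces startsWith) forces the whole file contents while the
handle is still open, at the cost of holding the entire file in memory.

```haskell
import Control.Exception (evaluate)
import Data.List (isPrefixOf)
import System.IO (withFile, IOMode (ReadMode), hGetContents)

-- Hypothetical strict variant of getPackageList.
getPackageListStrict :: FilePath -> IO [FilePath]
getPackageListStrict packageFile =
  withFile packageFile ReadMode $ \h -> do
    c <- hGetContents h
    -- length walks the whole string, so every character is read from
    -- the file before withFile closes the handle.
    _ <- evaluate (length c)
    -- The result is still built lazily, but only from the in-memory string.
    return $ map (drop 10) $ filter ("Filename:" `isPrefixOf`) $ lines c
```

With this version, getPackageListStrict "Packages" >>= mapM_ putStrLn should
no longer depend on when the result list happens to be evaluated.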
--
WBR,
Max Vasin.