[Haskell-beginners] lazy IO in readFile

Andrew Sackville-West andrew at swclan.homelinux.org
Fri May 14 19:51:17 EDT 2010


Sorry, I got distracted for a couple of days.


On Sat, May 08, 2010 at 02:16:27PM +0200, Daniel Fischer wrote:
> On Saturday 08 May 2010 04:47:14, Andrew Sackville-West wrote:
> >
> > Please ignore, for the moment, whatever *other* problems (idiomatic or
> > just idiotic) that may exist above and focus on the IO problem.
> >
> 
> Sorry, can't entirely. Unless the number of rss items remains low, don't 
> use lists, use a Set.

I knew people wouldn't be able to entirely, but I was trying to focus
on just one problem for the moment. Thanks for the tip on Set. I don't
anticipate it really being a need, but it's a good idea. Ideally, I'd
like to come up with something more robust than just matching on
titles as well, but it works for the moment.

> 
> > This code works fine *if* the file "testfile" has in it some subset of
> > the testData list. If it has the complete set, it fails with a "resource
> > busy" exception.
> >
> > Okay, I can more or less understand what's going on here. Each letter
> > in the testData list gets compared to the contents of the file, but
> > because they are *all* found, the readFile call never has to try and
> > fail to read the last line of the file. Thus the file handle is kept
> > open lazily waiting around not having reached EOF.  Fair enough.
> >
> > But what is the best solution? One obvious one, and the one I'm using
> > now, is to move the appendFile call into a function with guards to
> > prevent trying to append an empty list to the end of the file. This
> > solves the problem not by forcing the read on EOF, but by not
> > bothering to open the file for appending:
> >
> > writeHistory [] = return ()
> > writeHistory ni = appendFile "testfile" . unlines $ ni
> >
> > And this makes some sense. It's silly to try to write nothing to a
> > file.
> 
> Yes. In any case,
> 
>     unless (null newItems) $ appendFile "testfile" $ unlines newItems
> 
> seems cleaner.

indeed. Thanks.
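
For reference, a minimal sketch of how the guarded append might look as a
standalone function. The FilePath parameter is an assumption added for
illustration; the code in the thread hard-codes "testfile".

```haskell
import Control.Monad (unless)

-- Append new items to the history file, one per line.
-- When there is nothing new, the file is never opened for appending.
-- (The path argument is hypothetical; the thread uses a fixed "testfile".)
writeHistory :: FilePath -> [String] -> IO ()
writeHistory path newItems =
  unless (null newItems) $ appendFile path (unlines newItems)
```

Note that this only avoids the append when the list is empty. It does not
close a handle that a lazy readFile still holds open.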

> 
> >
> > But it also rubs me the wrong way. It's not solving the problem
> > directly -- closing that file handle. So there's my question, how can
> > I close that thing? Is there some way to force it?
> 
> For almost all practical purposes, there is (despite the fact that what 
> Stephen said is basically right, although a little overstated in my 
> opinion).
> You have to force the entire file to be read, the standard idiom is using
> 
>   x `seq` doSomethingElse
> 
> where x is a value that requires the entire file to be read, in your case
> x = length currItems is a natural choice.

d'oh. 

> That way, you effectively have made readFile strict without sacrificing too 
> much niceness of the code (withFile and hGetLine mostly are much uglier 
> IMO).

I know I'm probably not really thinking about it in the right way, but
withFile and hGetLine don't seem to be a good fit because of the
current structure. I'm using the existing data in the file as a filter
to determine what new data to put in the file. like this:

items <- oldTitles `seq` liftM (filterItems oldTitles) $ buildItems srcs

where buildItems :: [String] -> IO [Item], and filterItems :: [String] ->
[Item] -> [Item] (I suppose I should post the whole file. You can see
it here: http://git.swclan.homelinux.org/rss2email.git)

Anyway, filterItems is doing 'filter (isNew oldItems) items',
effectively, and putting that into a withFile do structure seems not
right. 
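
One caveat about the seq line above: seq evaluates its first argument only
to weak head normal form, so forcing oldTitles itself inspects just the
first cons cell and need not read the whole file. Forcing the length, as
Daniel suggests, walks the entire list. A rough sketch of that idea
follows. The names readFileStrict, newTitles and updateHistory are made up
for illustration, and plain String titles stand in for the Item type.

```haskell
import Control.Monad (unless)

-- Read a file and force its full contents before returning.
-- Computing the length walks every character, so the lazy read reaches
-- EOF and the underlying handle is closed.
readFileStrict :: FilePath -> IO String
readFileStrict path = do
  contents <- readFile path
  length contents `seq` return contents

-- Hypothetical stand-in for filterItems: keep titles not already recorded.
newTitles :: [String] -> [String] -> [String]
newTitles old = filter (`notElem` old)

-- Read the old titles strictly, filter the fetched ones against them,
-- and append whatever is new to the same file.
updateHistory :: FilePath -> [String] -> IO [String]
updateHistory path fetched = do
  oldTitles <- fmap lines (readFileStrict path)
  let fresh = newTitles oldTitles fetched
  unless (null fresh) $ appendFile path (unlines fresh)
  return fresh
```

Because the read is finished before the filter runs, the append no longer
depends on how much of the old list the filter happened to consume.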

Thanks for the help

A