lazy file reading in H98
Simon Marlow
simonmar@microsoft.com
Tue, 3 Apr 2001 16:29:00 +0100
I admit the existing behaviour is unsatisfactory. However, I'd like to
point out that a program using the sequence
s <- readFile f
...
writeFile f s'
is arguably wrong, even given semantics (2) for readFile, because it's
non-atomic. A more correct sequence is
s <- readFile f
...
writeFile f' s'
renameFile f' f
where f' is a temporary file name.
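
A minimal sketch of that sequence follows. The name updateFile and the
".tmp" template are hypothetical, chosen only for illustration. Forcing the
input with evaluate before the rename is an added assumption, not part of
the sequence above: it ensures the lazily read file has been consumed and
its handle closed before f is replaced.

```haskell
import Control.Exception (evaluate)
import System.Directory (renameFile)
import System.FilePath (takeDirectory, takeFileName)
import System.IO (hClose, hPutStr, openTempFile)

-- | Rewrite the contents of a file via a temporary file and a rename.
-- 'updateFile' is a hypothetical helper, not a library function.
updateFile :: FilePath -> (String -> String) -> IO ()
updateFile f transform = do
  s <- readFile f
  -- Force the whole lazy string so the read handle is closed before
  -- the rename (an assumption added to the sequence in the text).
  _ <- evaluate (length s)
  -- Create the temporary file next to f so that the rename stays on
  -- one filesystem.
  (f', h) <- openTempFile (takeDirectory f) (takeFileName f ++ ".tmp")
  hPutStr h (transform s)
  hClose h
  -- Replace the original only after the new contents are fully written.
  renameFile f' f
```

For example, updateFile "state.txt" (show . update . read) would cover the
persistent-data-structure case quoted below, where update stands for
whatever change the program makes to the value.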
Cheers,
Simon
> -----Original Message-----
> From: Simon Peyton-Jones
> Sent: Tuesday, April 03, 2001 12:01 PM
> To: Libraries for Haskell List
> Subject: FW: lazy file reading in H98
>
>
> Here's a library issue.
>
> The conclusion of this conversation was that H98 already specifies
> option (1) below, and I will clarify that in revising the library
> report.
> Nevertheless, the absence of a simple way to read-modify-write a file
> is a pain in the neck.
>
> Question: should one of our extended-IO libraries support a version of
> openFile that guarantees option (2)?
>
> Simon
>
> -----Original Message-----
> From: Manuel M. T. Chakravarty [mailto:chak@cse.unsw.edu.au]
> Sent: 05 September 2000 02:10
> To: haskell@haskell.org
> Subject: lazy file reading in H98
>
>
> In an assignment, in my class, we came across a lack of specification of
> the behaviour of `Prelude.readFile' and `IO.hGetContents' and IMHO also
> a lack of functionality. As both operations read a file lazily,
> subsequent writes to the same file are potentially disastrous. In this
> assignment, the file was used to make a Haskell data structure
> persistent over multiple runs of the program - ie,
>
> readFile fname >>= return . read
>
> at the start of the program and
>
> writeFile fname . show
>
> at the end of the program. For certain inputs, where the
> data structure stored in the file was only partially used,
> the file was overwritten before it was fully read.
>
> H98 doesn't really specify what happens in this situation.
> I think, there are two ways to solve that:
>
> (1) At least, the definition should say that the behaviour
> is undefined if a program ever writes to a file that it
> has read with `readFile' or `hGetContents' before.
>
> (2) Alternatively, it could demand more sophistication from
> the implementation and require that upon opening of a
> file for writing that is currently semi-closed, the
> implementation has to make sure that the contents of the
> semi-closed file is not corrupted before it is fully
> read.[1]
>
> In the case that solution (1) is chosen, I think, we should also have
> something like `strictReadFile' (and
> `hStrictGetContents') which reads the whole file before proceeding to
> the next IO action. Otherwise, in situations like in the mentioned
> assignment, you have to resort to reading the file character by
> character, which seems very awkward.
>
> So, overall, I think solution (2) is more elegant.
>
> Cheers,
> Manuel
>
> [1] On Unix-like (POSIX?) systems, unlinking the file and
> then opening the writable file would be sufficient. On
> certain legacy OSes, the implementation would have to
> read the rest of the file into memory before creating
> a new file under the same name.