Proposal for a new I/O library design

Tomasz Zielonka t.zielonka@students.mimuw.edu.pl
Mon, 28 Jul 2003 19:29:42 +0200


On Mon, Jul 28, 2003 at 12:56:04PM -0500, Tim Sweeney wrote:
> Ben,
> 
> I live in a different universe, but over here I prefer to represent files
> purely as memory-mapped objects.  In this view, there is no difference
> between a read-only file and an immutable array of bytes (a byte being a
> natural number between 0 and 255).  A read-write file is then equivalent to a
> mutable array (or a reference to a mutable array on a heap) of the same.
> Treating these all as heap references tends to be cleaner, because you can
> compare the references for equality, which is significant even for read-only
> files, because two files which contain the same exact data are not
> necessarily the same file, whereas opening the same file in two different
> places should result in equal references.
>
> [...]
> 
> In this manner, it's possible to get rid of all remnants of Unix-like
> streams from a language's IO interface.

You certainly can't always mmap the whole file into memory at once (on a
32-bit architecture at least), because:
1) there are files that won't fit into a 32-bit address space
2) ... and usually you don't have the whole address space to yourself. I
    mean there are mmapped libraries, the stack, allocated memory, etc., so
    the address space can be somewhat fragmented.
3) after mmapping many big files (each fitting into a 32-bit address space)
    you can run out of address space

I've been bitten by all these problems and now I mmap my files in parts,
mapping and unmapping them as needed. It can be done, but it is no
longer that simple.
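
To illustrate what "in parts" can look like, here is a minimal sketch in
Python (not the code I actually use). The names for_each_window and
window_size are made up for this example. It assumes read-only access and
a window size that is a multiple of mmap.ALLOCATIONGRANULARITY, since the
offset passed to mmap has to be aligned to that value. Only one window is
mapped at any moment, so the address space needed stays bounded no matter
how big the file is.

```python
import mmap
import os


def for_each_window(path, fn, window_size=16 * mmap.ALLOCATIONGRANULARITY):
    """Map `path` one window at a time and call fn(offset, window) on each.

    Hypothetical helper: returns the list of fn's results. Each window is
    unmapped before the next one is mapped.
    """
    # mmap offsets must be multiples of the allocation granularity.
    if window_size <= 0 or window_size % mmap.ALLOCATIONGRANULARITY != 0:
        raise ValueError("window_size must be a positive multiple of "
                         "mmap.ALLOCATIONGRANULARITY")
    size = os.path.getsize(path)
    results = []
    with open(path, "rb") as f:
        offset = 0
        while offset < size:
            # The last window may be shorter than window_size.
            length = min(window_size, size - offset)
            window = mmap.mmap(f.fileno(), length,
                               access=mmap.ACCESS_READ, offset=offset)
            try:
                results.append(fn(offset, window))
            finally:
                window.close()  # unmap before moving on
            offset += length
    return results
```

Note what is lost compared to one big mapping: fn must not keep a
reference to the window after it returns, and any record that straddles
a window boundary has to be handled by the caller. That is the "no longer
that simple" part.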

> -Tim

Best regards,
Tom

-- 
.signature: Too many levels of symbolic links