Raw I/O library proposal, second (more pragmatic) draft
Seth Kurtzberg
seth@cql.com
Tue, 5 Aug 2003 15:42:25 -0700
On Tuesday, August 5, 2003, at 02:00 PM, John Meacham wrote:
> On Tue, Aug 05, 2003 at 12:34:03AM -0700, Seth Kurtzberg wrote:
>> On Tuesday, August 5, 2003, at 12:30 AM, Ben Rudiak-Gould wrote:
>>> On Tue, 5 Aug 2003, Seth Kurtzberg wrote:
>>>> For my purposes (transaction logging for my database server) I need to
>>>> be able to guarantee that data is written to disk. That is, it isn't
>>>> enough to disable buffering in the compiler libraries (all libraries,
>>>> more accurately), I need to also force the O/S to flush the data to
>>>> disk.
>>>>
>>>> This is difficult to do in a portable manner, obviously, but if a
>>>> practical way can be found it would have many uses in systems using
>>>> transactional semantics. It would also get rid of an FFI dependency
>>>> for my code.
>>>
>>> My intended semantics for the osFlush function was always that it would
>>> do its best to ensure that the data was "pushed as far as possible"
>>> toward its final destination.
>>>
>>> If you need a guarantee, the function could be made to return a Bool,
>>> with True indicating that it was absolutely sure that the data had made
>>> it all the way. But I don't think that it could ever return True. It
>>> might be running in a VMware sandbox without realizing it, for example.
>>> So you'll probably have to run tests on your particular setup to see
>>> how well it works.
>>
>> That is certainly true, but to get even that far the semantics have to
>> exist. You've answered my question; osFlush means (assuming that the
>> O/S can provide the functionality) flush to permanent storage.
>
> There are three useful levels of flush that I can think of: flush all
> userspace buffers to the OS, flush all data to disk, and flush all data
> and metadata to disk. The OS interfaces would be fflush (well, the
> internal Haskell equivalent), fdatasync, and fsync.
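To make the three levels concrete, here is a minimal sketch in Python rather than Haskell, since it exposes the same OS calls directly. The function names (flush_to_os, flush_data_to_disk, flush_all_to_disk) are invented for this illustration and are not part of any proposed API. os.fdatasync is not available on every platform, so the sketch assumes a fallback to os.fsync is acceptable there.

```python
import os
import tempfile


def flush_to_os(f):
    """Level 1: hand the userspace buffer to the kernel (the fflush analogue)."""
    f.flush()


def flush_data_to_disk(f):
    """Level 2: additionally ask the OS to write the file's data (fdatasync)."""
    f.flush()
    # Assumption: where fdatasync is missing, fsync is an acceptable substitute.
    sync = getattr(os, "fdatasync", os.fsync)
    sync(f.fileno())


def flush_all_to_disk(f):
    """Level 3: ask the OS to write data and metadata (fsync)."""
    f.flush()
    os.fsync(f.fileno())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.dat")
        with open(path, "wb") as f:
            f.write(b"record")
            flush_to_os(f)         # visible to other processes, may not be on disk
            flush_data_to_disk(f)  # data requested to be on stable storage
            flush_all_to_disk(f)   # data and metadata requested to be on stable storage
```

None of these calls can prove that the bytes reached the physical medium; they only ask the OS to push them as far as it can.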
There is another very important case, which is to flush, _selectively_,
some but not all data to disk. For example, when doing transaction
logging you must flush log data to disk but you _don't_ want to flush
other data (because it destroys performance).
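A rough sketch of what I mean by selective flushing, written in Python only because it exposes fsync per file descriptor. The names append_log_record and write_data_page are hypothetical, and how much unrelated data the kernel writes out during an fsync depends on the filesystem, so treat this as an illustration of the intent rather than a guarantee.

```python
import os
import tempfile


def append_log_record(log_file, record):
    """Write one log record and force only the log file toward the disk."""
    log_file.write(record)
    log_file.flush()              # userspace buffer -> OS
    os.fsync(log_file.fileno())   # OS -> disk, for this descriptor only


def write_data_page(data_file, offset, page):
    """Write a data page without forcing it out; the OS may write it later."""
    data_file.seek(offset)
    data_file.write(page)
    # Deliberately no flush or fsync here: syncing every page is the cost
    # the transaction log is meant to avoid.


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "txn.log")
        data_path = os.path.join(tmp, "table.dat")
        with open(log_path, "ab") as log, open(data_path, "wb") as data:
            append_log_record(log, b"BEGIN;UPDATE;COMMIT\n")
            write_data_page(data, 0, b"page contents")
```

The point is that the sync request is tied to one file, so the library needs a flush operation that takes a handle rather than one that flushes everything.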
>
> I think there is a use for all of them. In particular, being able to
> flush to the OS without doing an fsync is good for network traffic,
> because always fsyncing can interfere with the normal TCP packet
> consolidation logic (on some OSes). The fdatasync vs. fsync distinction
> is useful too: fdatasync can be much faster on many types of filesystems
> and is usually what people want.
> John
>
>
> --
> ---------------------------------------------------------------------------
> John Meacham - California Institute of Technology, Alum. - john@foo.net
> ---------------------------------------------------------------------------
-----------------------------------------------------------------
Seth Kurtzberg
CTO
ISEC Research and Network Operations Center
480-314-1540
888-879-5206
seth@isec.us
-----------------------------------------------------------------