openFile gives "file is locked" error on Linux when creating a non-existing file

Viktor Dukhovni ietf-dane at dukhovni.org
Wed Oct 9 23:02:32 UTC 2024


On Wed, Oct 09, 2024 at 12:15:32PM +0530, Harendra Kumar wrote:
> We do use low level C APIs and GHC APIs to create a Handle in the
> event watching module. But that is for the watch-root and not for the
> file that is experiencing this problem. So here is how it works. We
> have a top level directory which is watched for events using inotify.
> We first create this directory; it is then opened using
> inotify_init, which returns a C file descriptor.  We then create a
> Handle from this fd; this Handle is used for watching inotify events.
> We are then creating a file inside this directory which is being
> watched while we are reading events from the parent directory. The
> resource-busy issue occurs when creating a file inside this directory.
> So we are not creating the Handle for the file in question in a
> non-standard manner, but the parent directory Handle is being created
> in that manner. I do not know if that somehow affects anything. Or if
> the fact that the directory is being watched using inotify makes any
> difference?
> 
> The code for creating the watch Handle is here:
> https://github.com/composewell/streamly/blob/bbac52d9e09fa5ad760ab6ee5572c701e198d4ee/src/Streamly/Internal/FileSystem/Event/Linux.hs#L589
> . Viktor, you may want to take a quick look at this to see if it can
> make any difference to the issue at hand.

I don't have the cycles to isolate the problem.  I still suspect that
your code is somehow directly closing file descriptors associated with a
Handle.  This then orphans the associated logical reader/writer lock,
which then gets inherited by the next incarnation of the same (dev, ino)
pair.  However, if the filesystem underlying "/tmp" were actually "tmpfs",
inode reuse would be quite unlikely, because tmpfs inodes are assigned
from a strictly incrementing counter:

    $ for i in {1..10}; do touch /tmp/foobar; ls -i /tmp/foobar; rm /tmp/foobar; done
    3830 /tmp/foobar
    3831 /tmp/foobar
    3832 /tmp/foobar
    3833 /tmp/foobar
    3834 /tmp/foobar
    3835 /tmp/foobar
    3836 /tmp/foobar
    3837 /tmp/foobar
    3838 /tmp/foobar
    3839 /tmp/foobar

but IIRC you mentioned that on GitHub "/tmp" is ext4, not "tmpfs"
(perhaps because RAM-backed storage is a scarcer resource there), in
which case inode reuse is indeed quite likely:

    $ for i in {1..10}; do touch /var/tmp/foobar; ls -i /var/tmp/foobar; rm /var/tmp/foobar; done
    25854141 /var/tmp/foobar
    25854142 /var/tmp/foobar
    25854141 /var/tmp/foobar
    25854142 /var/tmp/foobar
    25854141 /var/tmp/foobar
    25854142 /var/tmp/foobar
    25854141 /var/tmp/foobar
    25854142 /var/tmp/foobar
    25854141 /var/tmp/foobar
    25854142 /var/tmp/foobar
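As an aside, whether a given mount is tmpfs or ext4 is easy to check on
the runner itself; the exact flags below assume GNU coreutils:

```shell
# Report the filesystem type backing /tmp (tmpfs vs ext4).  The
# lock-inheritance theory only needs inode reuse, which ext4 exhibits.
df -T /tmp | tail -1

# stat -f reports the same information for a single path.
stat -f -c 'type=%T' /tmp
```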

But since the normal Handle lifecycle acquires the lock just after
open, and releases it just before close, the evidence points to
something bypassing the normal open-file lifecycle.
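For what it's worth, the (dev, ino) pair that the RTS keys its logical
reader/writer locks on is easy to inspect from the shell; the stat
flags below assume GNU coreutils, and the path is just a scratch file:

```shell
# Show the (device, inode) pair for a file -- the same identity GHC's
# RTS uses to index its internal lock table for open regular files.
touch /tmp/lockkey-demo
stat -c 'dev=%d ino=%i' /tmp/lockkey-demo
rm /tmp/lockkey-demo
```

If a lock entry for that (dev, ino) pair is orphaned, any later file
that happens to receive the same inode number inherits it.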

Your codebase contains a fair amount of custom file-management logic,
which could be the source of the problem.  To find the offending code
path, you'd probably need to instrument the RTS lock/unlock code to
log its activity: the (mode, descriptor, dev, ino) tuples being added
and removed.  You'd also want to run the test under strace, to be able
to identify the descriptor open and close events.  Ideally the problem
will still be reproducible under strace.
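A rough sketch of the strace side, with "ls" standing in for the
actual reproducer binary (an assumption; substitute your test
executable), and a fallback for when strace or ptrace is unavailable:

```shell
# Trace descriptor open/close events for a stand-in command ("ls" here;
# substitute the real reproducer).  Falls back gracefully when strace
# is not installed or ptrace is not permitted in the environment.
if command -v strace >/dev/null 2>&1 &&
   strace -f -e trace=openat,close -o trace.log ls /tmp >/dev/null 2>&1
then
    # Pair each close(fd) seen here against the instrumented RTS
    # unlock log; an fd closed with no matching unlock entry marks
    # the lifecycle bypass.
    grep -E '(openat|close)\(' trace.log | head -5
else
    echo "strace unavailable; run the tracing on the CI runner instead"
fi
```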

Good luck.

-- 
    Viktor.
