openFile gives "file is locked" error on Linux when creating a non-existing file

Harendra Kumar harendra.kumar at gmail.com
Wed Oct 9 06:45:32 UTC 2024


We do use low-level C APIs and GHC APIs to create a Handle in the
event watching module, but that is for the watch root, not for the
file that is experiencing the problem. Here is how it works. We have
a top-level directory which is watched for events using inotify. We
first create this directory; it is then opened for watching using
inotify_init, which returns a C file descriptor. We then create a
Handle from this fd, and that Handle is used for reading inotify
events. While we are reading events from the parent directory, we
create a file inside the watched directory, and that is where the
resource-busy error occurs. So the Handle for the file in question is
not created in a non-standard manner; only the parent directory's
Handle is. I do not know whether that somehow affects anything, or
whether the fact that the directory is being watched using inotify
makes any difference.

The code for creating the watch Handle is here:
https://github.com/composewell/streamly/blob/bbac52d9e09fa5ad760ab6ee5572c701e198d4ee/src/Streamly/Internal/FileSystem/Event/Linux.hs#L589
Viktor, you may want to take a quick look at it to see whether it
could have any bearing on the issue at hand.
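
For reference, the overall shape of that code is roughly as follows.
This is only a minimal sketch, not the actual streamly code; the
module name, openWatchHandle, and the buffering choices are
illustrative assumptions:

{-# LANGUAGE ForeignFunctionInterface #-}

-- Sketch only: build a Handle from an inotify file descriptor.
module WatchSketch (openWatchHandle) where

import Foreign.C.Error (throwErrnoIfMinus1)
import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CInt (..), CUInt (..))
import GHC.IO.Handle.FD (fdToHandle)
import System.IO (BufferMode (NoBuffering), Handle, hSetBinaryMode, hSetBuffering)

foreign import ccall unsafe "sys/inotify.h inotify_init"
    c_inotify_init :: IO CInt

foreign import ccall unsafe "sys/inotify.h inotify_add_watch"
    c_inotify_add_watch :: CInt -> CString -> CUInt -> IO CInt

-- | Create an inotify instance, add a watch on the given directory
-- for the given event mask, and wrap the inotify fd in a Handle that
-- is used to read the event stream.
openWatchHandle :: FilePath -> CUInt -> IO Handle
openWatchHandle dir eventMask = do
    fd <- throwErrnoIfMinus1 "inotify_init" c_inotify_init
    _wd <- withCString dir $ \cdir ->
        throwErrnoIfMinus1 "inotify_add_watch" $
            c_inotify_add_watch fd cdir eventMask
    -- Hand the raw fd over to GHC's Handle machinery so events can be
    -- read with the ordinary Handle APIs.
    h <- fdToHandle (fromIntegral fd)
    hSetBinaryMode h True
    hSetBuffering h NoBuffering
    return h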

-harendra

On Wed, 9 Oct 2024 at 10:31, Viktor Dukhovni <ietf-dane at dukhovni.org> wrote:
>
> On Wed, Oct 09, 2024 at 10:24:30AM +0530, Harendra Kumar wrote:
>
> > I just noticed that cabal seems to be running test suites in parallel.
> > We have two test suites. Even though each test suite generates its
> > temp names randomly, they use the same prefix; if the generated names
> > can collide because of the PRNG, that could cause a problem. That is
> > perhaps the more likely cause, rather than something to hunt down in
> > GHC. cabal running tests in parallel without being explicitly told to
> > came as a surprise to me. In fact I found an issue in the cabal repo
> > asking for a "feature" to run them sequentially; the issue is still
> > open - https://github.com/haskell/cabal/issues/6751 . Hopefully this
> > is it.
>
> Just parallel execution is not sufficient to explain the observed
> problem; you would still need the same dev/ino to already be open in
> the same process, or the bookkeeping of which dev/ino pairs are in
> use to be incorrect.
>
> So either the GitHub filesystem is reusing inodes of already deleted,
> but still open files (a deviation from expected Unix behaviour), or
> somehow GHC fails to correctly track the dev/ino pairs of open handles.
>
> My best guess is that something is manipulating file descriptors
> directly, bypassing the Handle layer, and *then* parallel execution
> could exacerbate the resulting inconsistent state.
>
> --
>     Viktor.
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
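
For reference, the locking behaviour described above can be
reproduced in isolation: GHC refuses a second writable Handle on the
same dev/ino within one process. A minimal sketch (the file name is
made up):

import Control.Exception (IOException, try)
import System.IO (Handle, IOMode (WriteMode), hClose, openFile)

main :: IO ()
main = do
    h1 <- openFile "/tmp/lock-demo.txt" WriteMode
    -- GHC tracks the (dev, ino) of regular files opened through the
    -- Handle API; a second writable Handle on the same pair in the
    -- same process fails with "resource busy (file is locked)".
    r <- try (openFile "/tmp/lock-demo.txt" WriteMode)
             :: IO (Either IOException Handle)
    case r of
        Left e   -> putStrLn ("second open failed: " ++ show e)
        Right h2 -> hClose h2 >> putStrLn "second open unexpectedly succeeded"
    hClose h1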

