openFile gives "file is locked" error on Linux when creating a non-existing file
Harendra Kumar
harendra.kumar at gmail.com
Tue Oct 8 13:17:57 UTC 2024
Some more information:
The root file system which contains the /tmp directory, where the file
is being created, is of type ext4:
df -T output:
Filesystem Type 1K-blocks Used Available Use% Mounted on
/dev/root ext4 76026616 53915404 22094828 71% /
cat /proc/mounts output:
/dev/root / ext4 rw,relatime,discard,errors=remount-ro 0 0
Also, I added directory and file existence checks before creating the
file in all places, and the problem has stopped happening. It may just
have become less likely, though, and could surface again later.
-harendra
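The existence-check workaround described above can be sketched roughly as follows (a hypothetical helper, not the actual code from the test suite; the paths are made up for illustration):

```haskell
import System.Directory (createDirectoryIfMissing, doesFileExist)
import System.IO (IOMode (WriteMode), hClose, openFile)

-- Hypothetical helper: create the file only if it does not already
-- exist, creating the parent directory first.
createFresh :: FilePath -> FilePath -> IO ()
createFresh dir name = do
  createDirectoryIfMissing True dir
  let path = dir ++ "/" ++ name
  exists <- doesFileExist path
  if exists
    then putStrLn (path ++ " already exists, skipping")
    else openFile path WriteMode >>= hClose

main :: IO ()
main = do
  createFresh "/tmp/watch-demo" "watch-root"
  createFresh "/tmp/watch-demo" "watch-root" -- second call skips creation
```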
On Tue, 8 Oct 2024 at 18:08, Harendra Kumar <harendra.kumar at gmail.com> wrote:
>
> What if we closed a file and created another one and the inode of the
> previous file got reused for the new one? Is it possible that there is
> a small window after the deletion of the old one in which GHC keeps
> the lock in its hash table? If that happens then the newly created
> file will see that there is already a lock on the file. Could it be
> that the lock gets released when the handle is cleaned by GC or
> something like that?
>
> I can try adding a delay and/or performMajorGC before creating the new file.
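That experiment might look something like this (a hypothetical sketch: leak a handle without closing it, delete the file, then force a major GC and wait briefly so the handle's finalizer can release the RTS lock before a new file possibly reuses the inode):

```haskell
import Control.Concurrent (threadDelay)
import System.Directory (removeFile)
import System.IO (IOMode (WriteMode), hClose, openFile)
import System.Mem (performMajorGC)

-- Open a file for writing and drop the handle without closing it, so
-- the RTS lock is only released when the handle's finalizer runs.
leakHandle :: FilePath -> IO ()
leakHandle path = do
  _ <- openFile path WriteMode
  return ()

main :: IO ()
main = do
  leakHandle "/tmp/lock-repro"
  removeFile "/tmp/lock-repro"
  -- Without this GC (and a small delay for finalizers to run), a new
  -- file that happens to reuse the old inode could still appear locked
  -- to the RTS.
  performMajorGC
  threadDelay 100000
  h <- openFile "/tmp/lock-repro" WriteMode
  putStrLn "reopened without lock error"
  hClose h
  removeFile "/tmp/lock-repro"
```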
>
> -harendra
>
> On Tue, 8 Oct 2024 at 15:57, Viktor Dukhovni <ietf-dane at dukhovni.org> wrote:
> >
> > On Tue, Oct 08, 2024 at 01:15:40PM +0530, Harendra Kumar wrote:
> > > On Tue, 8 Oct 2024 at 11:50, Viktor Dukhovni <ietf-dane at dukhovni.org> wrote:
> > >
> > > > What sort of filesystem is "/tmp/fsevent_dir-.../watch-root" located in?
> > >
> > > This happens on GitHub's Linux CI; I am not sure which filesystem
> > > they use. Earlier I wondered whether something odd was happening in
> > > case they use NFS. But NFS usually causes issues through caching of
> > > directory entries in cross-node operations; here we are on a single
> > > node and the operations are not running in parallel (or so I
> > > believe). I will remove the hspec layer from the tests to make sure
> > > the code is simpler and our understanding is correct.
> > >
> > > I will also run the tests on circle-ci to check if the problem occurs
> > > there. I have never seen this problem in testing this on a Linux
> > > machine on AWS even if I ran the tests for days in a loop.
> >
> > Looking more closely at the GHC code, we see that there is an internal
> > (RTS-level, not OS-level) exclusive lock on the (device, inode) pair,
> > taken when opening a Unix file for writing, or a shared lock when
> > opening for reading.
> >
> > rts/FileLock.c:
> > int
> > lockFile(StgWord64 id, StgWord64 dev, StgWord64 ino, int for_writing)
> > {
> >     Lock key, *lock;
> >
> >     ACQUIRE_LOCK(&file_lock_mutex);
> >
> >     key.device = dev;
> >     key.inode  = ino;
> >
> >     lock = lookupHashTable_(obj_hash, (StgWord)&key, hashLock, cmpLocks);
> >
> >     if (lock == NULL)
> >     {
> >         lock = stgMallocBytes(sizeof(Lock), "lockFile");
> >         lock->device = dev;
> >         lock->inode  = ino;
> >         lock->readers = for_writing ? -1 : 1;
> >         insertHashTable_(obj_hash, (StgWord)lock, (void *)lock, hashLock);
> >         insertHashTable(key_hash, id, lock);
> >         RELEASE_LOCK(&file_lock_mutex);
> >         return 0;
> >     }
> >     else
> >     {
> >         // single-writer/multi-reader locking:
> >         if (for_writing || lock->readers < 0) {
> >             RELEASE_LOCK(&file_lock_mutex);
> >             return -1;
> >         }
> >         insertHashTable(key_hash, id, lock);
> >         lock->readers++;
> >         RELEASE_LOCK(&file_lock_mutex);
> >         return 0;
> >     }
> > }
> >
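The single-writer/multi-reader behaviour above is easy to observe from Haskell (a small demonstration; the file path is arbitrary):

```haskell
import Control.Exception (IOException, try)
import System.IO (Handle, IOMode (ReadMode, WriteMode), hClose, openFile)

main :: IO ()
main = do
  writeFile "/tmp/lock-demo" "hello"
  -- Shared read locks: two readers on the same (device, inode) are fine.
  r1 <- openFile "/tmp/lock-demo" ReadMode
  r2 <- openFile "/tmp/lock-demo" ReadMode
  mapM_ hClose [r1, r2]
  -- Exclusive write lock: a second writer on the same file is rejected
  -- by the RTS with "file is locked".
  w1 <- openFile "/tmp/lock-demo" WriteMode
  res <- try (openFile "/tmp/lock-demo" WriteMode)
           :: IO (Either IOException Handle)
  case res of
    Left e  -> putStrLn ("second writer failed: " ++ show e)
    Right h -> hClose h >> putStrLn "second writer unexpectedly succeeded"
  hClose w1
```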
> > This is obtained in "libraries/base/GHC/IO/FD.hs", via:
> >
> > mkFD fd iomode mb_stat is_socket is_nonblock = do
> >     ...
> >     case fd_type of
> >         Directory ->
> >            ioException (IOError Nothing InappropriateType "openFile"
> >                         "is a directory" Nothing Nothing)
> >
> >         -- regular files need to be locked
> >         RegularFile -> do
> >            -- On Windows we need an additional call to get a unique device id
> >            -- and inode, since fstat just returns 0 for both.
> >            -- See also Note [RTS File locking]
> >            (unique_dev, unique_ino) <- getUniqueFileInfo fd dev ino
> >            r <- lockFile (fromIntegral fd) unique_dev unique_ino
> >                          (fromBool write)
> >            when (r == -1) $
> >                ioException (IOError Nothing ResourceBusy "openFile"
> >                             "file is locked" Nothing Nothing)
> >     ...
> >
> > This suggests that when the file in question is opened, there is
> > already a read lock for the same dev/ino. Perhaps the GitHub CI
> > filesystem fails to ensure uniqueness of dev+ino across open files
> > (possibly when some open files have already been unlinked)?
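One way to probe that hypothesis is to stat two simultaneously open files and compare their (device, inode) pairs (a quick check; the paths are arbitrary):

```haskell
import System.Directory (removeFile)
import System.IO (IOMode (WriteMode), hClose, openFile)
import System.Posix.Files (deviceID, fileID, getFileStatus)

main :: IO ()
main = do
  h1 <- openFile "/tmp/uniq-a" WriteMode
  h2 <- openFile "/tmp/uniq-b" WriteMode
  s1 <- getFileStatus "/tmp/uniq-a"
  s2 <- getFileStatus "/tmp/uniq-b"
  -- On a well-behaved filesystem, two distinct open files must have
  -- distinct (device, inode) pairs.
  putStrLn ("distinct dev/ino: "
            ++ show ((deviceID s1, fileID s1) /= (deviceID s2, fileID s2)))
  mapM_ hClose [h1, h2]
  mapM_ removeFile ["/tmp/uniq-a", "/tmp/uniq-b"]
```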
> >
> > --
> > Viktor.
> > _______________________________________________
> > ghc-devs mailing list
> > ghc-devs at haskell.org
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs