[Gluster-users] Setting gfid failed on slave geo-rep node
mlnospam at yahoo.com
Mon Feb 1 08:44:53 UTC 2016
I just set up distributed geo-replication from my two-node replicated volume to a slave and noticed quite a few error messages (around 70) in the slave's brick log file:
The exact log file is: /var/log/glusterfs/bricks/data-myvolume-geo-brick.log
[2016-01-31 22:19:29.524370] E [MSGID: 113020] [posix.c:1221:posix_mknod] 0-myvolume-geo-posix: setting gfid on /data/myvolume-geo/brick/data/username/files/shared/logo-login-09.svg.ocTransferId1789604916.part failed
[2016-01-31 22:19:29.535478] W [MSGID: 113026] [posix.c:1338:posix_mkdir] 0-myvolume-geo-posix: mkdir (/data/username/files_encryption/keys/files/shared/logo-login-09.svg.ocTransferId1789604916.part): gfid (15bbcec6-a332-4c21-81e4-c52472b1e13d) isalready associated with directory (/data/myvolume-geo/brick/.glusterfs/49/5d/495d6868-4844-4632-8ff9-ad9646a878fe/logo-login-09.svg). Hence,both directories will share same gfid and thiscan lead to inconsistencies.
This doesn't look good at all, because the file mentioned in the error message (logo-login-09.svg.ocTransferId1789604916.part) is left there with 0 bytes and is not deleted or cleaned up by glusterfs. That leaves my geo-rep slave node in an inconsistent state which does not reflect the master nodes. The master nodes no longer have that file (which is correct). Below is an "ls" of the file in question, with the correct file on top:
-rw-r--r-- 2 www-data www-data 24312 Jan 6 2014 logo-login-09.svg
-rw-r--r-- 1 root root 0 Jan 31 23:19 logo-login-09.svg.ocTransferId1789604916.part
So at least I have the correct file (the first one in the list), but gluster leaves this second "temporary" or "transient" file behind although it should delete it.
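
To get an idea of how many such leftovers exist on the slave, something like the following Python sketch could be used. It only lists zero-byte regular files whose names end in ".part" and does not delete anything. The function name find_empty_part_files is made up for illustration, and skipping the .glusterfs directory is my own assumption that only the user-visible files matter here. The root path would be the slave's brick directory or a mount of the slave volume.

```python
import os
import stat


def find_empty_part_files(root):
    """Return a sorted list of zero-byte regular files ending in '.part' under root."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: gluster's internal .glusterfs tree is not of interest,
        # so do not descend into it.
        if ".glusterfs" in dirnames:
            dirnames.remove(".glusterfs")
        for name in filenames:
            if not name.endswith(".part"):
                continue
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except OSError:
                # The file may vanish while the tree is being walked.
                continue
            # Only count regular files that are exactly 0 bytes long.
            if stat.S_ISREG(info.st_mode) and info.st_size == 0:
                matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    import sys

    for path in find_empty_part_files(sys.argv[1]):
        print(path)
```

Running the same script against the master volume and comparing the two lists would show which of these files exist only on the slave.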