[Gluster-users] Input/output error

Pranith Kumar K pranithk at gluster.com
Thu Aug 25 09:08:29 UTC 2011


Hi siga hiro,
        The gfids of the directory /home/syncdata/testdata differ between 
the two machines, most probably because of bug 2921.
This could have happened for the following reason: you created the 
directory testdata before the volume was mounted, and then accessed it in 
parallel from more than one mount.
The gfid is assigned to an entry when it is first accessed. Please 
reply if this is not the case.
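
To make the mismatch easier to see, here is a small sketch that decodes the trusted.gfid values into hex. It assumes GNU coreutils (base64, od, tr); the function name gfid_to_hex is only an illustration and not part of GlusterFS. The "0s" prefix is how getfattr marks a base64-encoded value.

```shell
#!/bin/sh
# Sketch: turn a getfattr base64 value ("0s" prefix) into plain hex.
# gfid_to_hex is a hypothetical helper name, not a GlusterFS tool.
gfid_to_hex() {
    # Strip the "0s" marker, decode the base64 payload, dump the bytes as hex.
    printf '%s' "${1#0s}" | base64 -d | od -An -tx1 | tr -d ' \n'
}

# trusted.gfid of /home/syncdata/testdata as posted in this thread
gfid_1=$(gfid_to_hex '0st0UDRLu7TEqt2W8wc30mCQ==')   # brick on 172.23.0.1
gfid_2=$(gfid_to_hex '0shkqegy6JT0KgZjAlx3Db0w==')   # brick on 172.23.0.2

if [ "$gfid_1" = "$gfid_2" ]; then
    echo "gfids match"
else
    echo "gfid mismatch: $gfid_1 vs $gfid_2"
fi
```

The root directory /home/syncdata carries the same value on both bricks, which is why only testdata is affected.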
If you are sure both directories contain the same files, you can 
remove the gfid xattr by executing
setfattr -x trusted.gfid /home/syncdata/testdata
on both machines, and then access the directory from just one of the 
mounts.
Then this problem goes away.
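
A rough sketch of that sequence, in case it helps. It only prints the commands unless DRY_RUN=0 is set. The helper names (run, remove_gfid, trigger_lookup) and the use of stat to access the directory are my assumptions; the paths are the ones from this thread.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute (as root).
DRY_RUN=${DRY_RUN:-1}
BRICK_DIR=/home/syncdata/testdata   # directory on the brick, on each server
MOUNT_DIR=/syncdata/testdata        # the same directory seen through the mount

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1: run on BOTH servers (172.23.0.1 and 172.23.0.2).
remove_gfid() {
    run setfattr -x trusted.gfid "$BRICK_DIR"
}

# Step 2: run on ONE client only, after step 1 is done on both servers.
# Accessing the directory lets a single gfid be assigned again.
trigger_lookup() {
    run stat "$MOUNT_DIR"
}

case "${1:-}" in
    remove) remove_gfid ;;
    lookup) trigger_lookup ;;
esac
```

Saved under a name of your choice, it would be called with "remove" on each server first, and then with "lookup" on a single client.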
     If you use more than one mount to access fresh data under the 
volume, first mount one client and run "find <mount-point>", and only 
then mount the rest of the clients and use them.
The find command assigns the gfids, so they won't conflict.
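
As a sketch of that order of operations on the first client: the server address, volume name and mount point are taken from this thread, the helper names are hypothetical, and the plain "mount -t glusterfs" form without extra options is my assumption. It only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute (as root).
DRY_RUN=${DRY_RUN:-1}
SERVER=172.23.0.1
VOLUME=syncdata
MOUNT_POINT=/syncdata

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Run on the FIRST client only, before any other client mounts the volume.
first_client() {
    run mount -t glusterfs "$SERVER:/$VOLUME" "$MOUNT_POINT"
    # Walking the whole tree accesses every entry once, so each entry
    # gets its gfid through this single mount.
    run find "$MOUNT_POINT"
}

case "${1:-}" in
    first) first_client ;;
esac
```

Once find has finished on the first client, the remaining clients can mount the volume as usual.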

Pranith.

On 08/24/2011 04:28 PM, siga hiro wrote:
> Thank you Pranith Kumar K.
>
> ------172.23.0.1-------------------------------
> # getfattr -d -m . /home/syncdata/
> getfattr: Removing leading '/' from absolute path names
> # file: home/syncdata
> trusted.afr.syncdata-client-0=0sAAAAAAAAAAAAAAAA
> trusted.afr.syncdata-client-1=0sAAAAAAAAAAAAAAAA
> trusted.gfid=0sAAAAAAAAAAAAAAAAAAAAAQ==
> trusted.glusterfs.quota.dirty=0sMAA=
> trusted.glusterfs.quota.size=0sAAAAAAAAAAA=
> trusted.glusterfs.test="working\000"
>
> # getfattr -d -m . /home/syncdata/testdata/
> getfattr: Removing leading '/' from absolute path names
> # file: home/syncdata/testdata
> trusted.afr.syncdata-client-0=0sAAAAAAAAAAAAAAAA
> trusted.afr.syncdata-client-1=0sAAAAAAAAAAAAAAAA
> trusted.gfid=0st0UDRLu7TEqt2W8wc30mCQ==
>
> ------172.23.0.2-------------------------------
> # getfattr -d -m . /home/syncdata/
> getfattr: Removing leading '/' from absolute path names
> # file: home/syncdata
> trusted.afr.syncdata-client-0=0sAAAAAAAAAAAAAAAA
> trusted.afr.syncdata-client-1=0sAAAAAAAAAAAAAAAA
> trusted.gfid=0sAAAAAAAAAAAAAAAAAAAAAQ==
> trusted.glusterfs.quota.dirty=0sMAA=
> trusted.glusterfs.quota.size=0sAAAAAAAAAAA=
> trusted.glusterfs.test="working\000"
>
> # getfattr -d -m . /home/syncdata/testdata/
> getfattr: Removing leading '/' from absolute path names
> # file: home/syncdata/testdata
> trusted.afr.syncdata-client-0=0sAAAAAAAAAAAAAAAA
> trusted.afr.syncdata-client-1=0sAAAAAAAAAAAAAAAA
> trusted.gfid=0shkqegy6JT0KgZjAlx3Db0w==
>
>
> thanks.
>
> 2011/8/24 Pranith Kumar K<pranithk at gluster.com>:
>> hi siga hiro,
>>      Can you provide the output of:
>> getfattr -d -m . /home/syncdata
>> getfattr -d -m . /home/syncdata/testdata
>>
>> On both the machines.
>> Pranith
>>
>> On 08/24/2011 02:11 PM, siga hiro wrote:
>>> Thank you for the quick answer.
>>>
>>>> 1) http://bugs.gluster.com/show_bug.cgi?id=2921 (most likely this)
>>> Isn't this solved in GlusterFS 3.2.3?
>>>
>>> I have installed GlusterFS 3.2.3 in 172.23.0.2.
>>> (get from
>>> http://download.gluster.com/pub/gluster/glusterfs/LATEST/CentOS/)
>>> And It confirmed that md5sum corresponded with 172.23.0.1 and 172.23.0.2.
>>> # md5sum *
>>> 8012eaf68e8ee8153d1b4f317dea385d  error_log.txt
>>> 88f70311135f82578a69866bce0564ba  error.log
>>>
>>> mount 172.23.0.2
>>>    ->    mount -t glusterfs -o tcp,soft,timeo=3 172.23.0.2:/syncdata
>>> /syncdata
>>>
>>> But...
>>> [root at 172.23.0.2 /]# ls -al /syncdata/testdata/
>>> ls: reading directory /syncdata/testdata/: Input/output error
>>>
>>> /var/log/glusterfs/nfs.log
>>> [2011-08-24 17:06:14.447688] I [rpc-clnt.c:1531:rpc_clnt_reconfig]
>>> 0-syncdata-client-0: changing port to 24009 (from 0)
>>> [2011-08-24 17:06:17.453688] I
>>>
>>> [client-handshake.c:1082:select_server_supported_programs]0-syncdata-client-1:
>>> Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
>>> [2011-08-24 17:06:17.456448] I
>>> [client-handshake.c:913:client_setvolume_cbk] 0-syncdata-client-1:
>>> Connected to 172.23.11.121:24009, attached to remote volume
>>> '/home/syncdata'.
>>> [2011-08-24 17:06:17.456517] I [afr-common.c:2611:afr_notify]
>>> 0-syncdata-replicate-0: Subvolume 'syncdata-client-1' came back up;
>>> going online.
>>> [2011-08-24 17:06:17.456957] I
>>>
>>> [client-handshake.c:1082:select_server_supported_programs]0-syncdata-client-0:
>>> Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
>>> [2011-08-24 17:06:17.457937] I
>>> [client-handshake.c:913:client_setvolume_cbk] 0-syncdata-client-0:
>>> Connected to 172.23.3.4:24009, attached to remote volume
>>> '/home/syncdata'.
>>> [2011-08-24 17:06:17.458478] I [afr-common.c:912:afr_fresh_lookup_cbk]
>>> 0-syncdata-replicate-0: added root inode
>>> [2011-08-24 17:06:52.479588] W
>>> [afr-common.c:656:afr_lookup_self_heal_check] 0-syncdata-replicate-0:
>>> /fastask: gfid different on subvolume
>>> [2011-08-24 17:06:52.480560] I
>>> [client3_1-fops.c:411:client3_1_stat_cbk] 0-syncdata-client-0: remote
>>> operation failed: No such file or directory
>>> [2011-08-24 17:06:52.481555] I
>>> [client3_1-fops.c:1099:client3_1_access_cbk] 0-syncdata-client-0:
>>> remote operation failed: No such file or directory
>>> [2011-08-24 17:06:52.482554] I
>>> [client3_1-fops.c:2132:client3_1_opendir_cbk] 0-syncdata-client-0:
>>> remote operation failed: No such file or directory
>>> [2011-08-24 17:06:52.482577] W
>>> [client3_1-fops.c:5136:client3_1_readdir] 0-syncdata-client-0:
>>> (689897478): failed to get fd ctx. EBADFD
>>> [2011-08-24 17:06:52.482592] W
>>> [client3_1-fops.c:5201:client3_1_readdir] 0-syncdata-client-0: failed
>>> to send the fop: File descriptor in bad state
>>> [2011-08-24 17:06:52.482608] I
>>> [afr-dir-read.c:120:afr_examine_dir_readdir_cbk]
>>> 0-syncdata-replicate-0: /fastask: failed to do opendir on
>>> syncdata-client-0
>>> [2011-08-24 17:06:52.482811] I
>>> [afr-dir-read.c:174:afr_examine_dir_readdir_cbk]
>>> 0-syncdata-replicate-0:  entry self-heal triggered. path: /fastask,
>>> reason: checksums of directory differ, forced merge option set
>>> [2011-08-24 17:06:52.483553] I
>>> [client3_1-fops.c:1303:client3_1_entrylk_cbk] 0-syncdata-client-0:
>>> remote operation failed: No such file or directory
>>> [2011-08-24 17:06:52.483642] E
>>> [afr-self-heal-entry.c:2292:afr_sh_post_nonblocking_entry_cbk]
>>> 0-syncdata-replicate-0: Non Blocking entrylks failed for /fastask.
>>> [2011-08-24 17:06:52.483839] W [afr-common.c:122:afr_set_split_brain]
>>>
>>> (-->/opt/glusterfs/3.2.3/lib64/glusterfs/3.2.3/xlator/cluster/replicate.so(afr_sh_post_nonblocking_entry_cbk+0xf5)
>>> [0x2aaaaad137f5]
>>>
>>> (-->/opt/glusterfs/3.2.3/lib64/glusterfs/3.2.3/xlator/cluster/replicate.so(afr_sh_entry_done+0x46)
>>> [0x2aaaaad13646]
>>>
>>> (-->/opt/glusterfs/3.2.3/lib64/glusterfs/3.2.3/xlator/cluster/replicate.so(afr_self_heal_completion_cbk+0x246)
>>> [0x2aaaaad0cac6]))) 0-syncdata-replicate-0: invalid argument: inode
>>> [2011-08-24 17:06:52.483864] E
>>> [afr-self-heal-common.c:1554:afr_self_heal_completion_cbk]
>>> 0-syncdata-replicate-0: background  entry entry self-heal failed on
>>> /fastask
>>> [2011-08-24 17:06:52.483898] W
>>> [client3_1-fops.c:5253:client3_1_readdirp] 0-syncdata-client-0:
>>> (689897478): failed to get fd ctx. EBADFD
>>> [2011-08-24 17:06:52.483913] W
>>> [client3_1-fops.c:5317:client3_1_readdirp] 0-syncdata-client-0: failed
>>> to send the fop: File descriptor in bad state
>>>
>>> thanks.
>>>
>>>> hi siga hiro,
>>>>     I see the following warning:
>>>> [2011-08-24 11:36:04.695145] W
>>>> [afr-common.c:656:afr_lookup_self_heal_check]
>>>> 0-syncdata-replicate-0: /testdata: gfid different on subvolume
>>>>
>>>> I also see that you have more than one mount on the volume. Most probably
>>>> you are running into one of the following bugs:
>>>> 1) http://bugs.gluster.com/show_bug.cgi?id=2921 (most likely this)
>>>> 2) http://bugs.gluster.com/show_bug.cgi?id=2745
>>>>
>>>> If it is not the bug 2745, you can confirm it is the bug 2921 if the
>>>> md5sums
>>>> on the files match on both the machines 172.23.0.1, 172.23.0.2
>>>>
>>>> pranith.
>>>>
>>>> On 08/24/2011 11:48 AM, siga hiro wrote:
>>>>
>>>> Hi, everyone.
>>>> Its nice meeting you.
>>>> I am poor at English....
>>>>
>>>> I am writing this because I'd like to update GlusterFS to 3.2.2-1,and I
>>>> want
>>>> to change from gluster mount to nfs mount.
>>>>
>>>> I have installed GlusterFS 3.2.1 one week ago,and replication 2 server.
>>>>
>>>> OS:CentOS5.5 64bit
>>>> RPM:glusterfs-core-3.2.1-1
>>>>      glusterfs-fuse-3.2.1-1
>>>>
>>>> command
>>>>   gluster volume create syncdata replica 2  transport tcp
>>>> 172.23.0.1:/home/syncdata 172.23.0.2:/home/syncdata
>>>>
>>>> mount command
>>>>   172.23.0.1 ->    mount -t glusterfs -o tcp,soft,timeo=3
>>>> 172.23.0.1:/syncdata
>>>> /syncdata
>>>>   172.23.0.2 ->    mount -t glusterfs -o tcp,soft,timeo=3
>>>> 172.23.0.2:/syncdata
>>>> /syncdata
>>>>
>>>> So,Yesterday I update GlusterFS to 3.2.2-1 and use nfs mount.
>>>>   172.23.0.2 ->    mount -t nfs  -o nolock,nfsvers=3,tcp,hard,intr
>>>> 172.23.0.2:/syncdata /syncdata
>>>>
>>>> [root at 172.23.0.2 /]# ls -al /syncdata/testdata/
>>>> ls: reading directory /syncdata/testdata/: Input/output error
>>>>
>>>> /var/log/glusterfs/nfs.log
>>>> [2011-08-24 11:35:16.319379] I
>>>> [client-handshake.c:1082:select_server_supported_programs]
>>>> 0-syncdata-client-1: Using Program GlusterFS-3.1.0, Num (1298437),
>>>> Version
>>>> (310)
>>>> [2011-08-24 11:35:16.322126] I
>>>> [client-handshake.c:913:client_setvolume_cbk]
>>>> 0-syncdata-client-1: Connected to 172.23.0.2:24009, attached to remote
>>>> volume '/home/syncdata'.
>>>> [2011-08-24 11:35:16.322191] I [afr-common.c:2611:afr_notify]
>>>> 0-syncdata-replicate-0: Subvolume 'syncdata-client-1' came back up; going
>>>> online.
>>>> [2011-08-24 11:35:16.323281] I
>>>> [client-handshake.c:1082:select_server_supported_programs]
>>>> 0-syncdata-client-0: Using Program GlusterFS-3.1.0, Num (1298437),
>>>> Version
>>>> (310)
>>>> [2011-08-24 11:35:16.324274] I
>>>> [client-handshake.c:913:client_setvolume_cbk]
>>>> 0-syncdata-client-0: Connected to 172.23.0.1:24009, attached to remote
>>>> volume '/home/syncdata'.
>>>> [2011-08-24 11:35:16.324801] I [afr-common.c:912:afr_fresh_lookup_cbk]
>>>> 0-syncdata-replicate-0: added root inode
>>>> [2011-08-24 11:36:04.695145] W
>>>> [afr-common.c:656:afr_lookup_self_heal_check]
>>>> 0-syncdata-replicate-0: /testdata: gfid different on subvolume
>>>> [2011-08-24 11:36:04.696121] I [client3_1-fops.c:411:client3_1_stat_cbk]
>>>> 0-syncdata-client-0: remote operation failed: No such file or directory
>>>> [2011-08-24 11:36:04.697121] I
>>>> [client3_1-fops.c:1099:client3_1_access_cbk]
>>>> 0-syncdata-client-0: remote operation failed: No such file or directory
>>>> [2011-08-24 11:36:04.698118] I
>>>> [client3_1-fops.c:2132:client3_1_opendir_cbk]
>>>> 0-syncdata-client-0: remote operation failed: No such file or directory
>>>> [2011-08-24 11:36:04.698140] W [client3_1-fops.c:5136:client3_1_readdir]
>>>> 0-syncdata-client-0: (689897478): failed to get fd ctx. EBADFD
>>>> [2011-08-24 11:36:04.698155] W [client3_1-fops.c:5201:client3_1_readdir]
>>>> 0-syncdata-client-0: failed to send the fop: File descriptor in bad state
>>>> [2011-08-24 11:36:04.698168] I
>>>> [afr-dir-read.c:120:afr_examine_dir_readdir_cbk] 0-syncdata-replicate-0:
>>>> /fastask: failed to do opendir on syncdata-client-0
>>>>
>>>> # gluster volume info all
>>>>
>>>> Volume Name: syncdata
>>>> Type: Replicate
>>>> Status: Started
>>>> Number of Bricks: 2
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: 172.23.0.1:/home/syncdata
>>>> Brick2: 172.23.0.2:/home/syncdata
>>>>
>>>>
>>>> After an 172.23.0.2 server is made to work as usual, I want to do the
>>>> work
>>>> of the 172.23.0.1 server.
>>>>
>>>> Any ideas?
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>>>
>>>>
>>



