[Gluster-users] FW: fix-layout stalls with xattr errors
Dan Bretherton
d.a.bretherton at reading.ac.uk
Fri Dec 30 13:48:57 UTC 2011
Hello Shylesh,
Thanks for looking into this for me. I think the ext4 features are
missing because the filesystems were accidentally formatted as ext3 and
then mounted as ext4. I didn't realise that was possible until I
started investigating this fix-layout problem. I don't know how I
managed to make the same mistake on both replicated bricks but I can't
think of any other explanation. I mounted the filesystems as ext3 and
tried the rebalance again, but the result was the same. Then I tried
converting the filesystems to ext4, as described in various CentOS
forums and blogs including this one:
http://blog.secaserver.com/2011/08/linux-converting-ext3-ext4-for-centos-5.
Unfortunately the "Operation not supported" errors were still there
during the fix-layout, so it seems that the damage has already been done
by mounting the ext3 filesystems as ext4. Perhaps xattrs on new files
would be created correctly in the converted bricks, but I really need to
find a way to repair the GlusterFS xattrs on the existing files. Is
there a way of doing this?
Regards
Dan.
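
The feature lists quoted further down this thread can be compared mechanically. What follows is only a minimal sketch in Python, assuming the text comes from `dumpe2fs` as in Shylesh's message; the function names `parse_features` and `missing_features` are made up for illustration, and the "Filesystem features:" prefix on the second list has been added here so that both lists parse the same way.

```python
def parse_features(dumpe2fs_output):
    """Extract the set of feature names from `dumpe2fs` output text."""
    lines = dumpe2fs_output.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("Filesystem features:"):
            names = line.split(":", 1)[1].split()
            # The list is wrapped in this thread, so also read continuation
            # lines until the next "key: value" line or a blank line.
            for cont in lines[i + 1:]:
                if not cont.strip() or ":" in cont:
                    break
                names.extend(cont.split())
            return set(names)
    return set()


def missing_features(reference_output, brick_output):
    """Return features present on the reference brick but not on the other."""
    return sorted(parse_features(reference_output) - parse_features(brick_output))


# Feature lists copied from the messages in this thread.
# REFERENCE is Shylesh's fully featured brick1; PROBLEM is the list
# reported for the two affected bricks (prefix added for parsing).
REFERENCE = """Filesystem features: has_journal ext_attr resize_inode dir_index
filetype needs_recovery extent flex_bg sparse_super large_file
huge_file uninit_bg dir_nlink extra_isize
"""
PROBLEM = """Filesystem features: has_journal ext_attr resize_inode dir_index
filetype needs_recovery sparse_super large_file
"""

if __name__ == "__main__":
    print(missing_features(REFERENCE, PROBLEM))
```

With these two lists the difference is dir_nlink, extent, extra_isize, flex_bg, huge_file and uninit_bg, which matches the features reported as missing in the original message. Note that ext_attr appears in both lists, so this comparison alone does not show which missing feature, if any, is responsible for the errors.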
> Hi Dan,
>
> I created two bricks, both with ext4 file systems.
>
> The issue seems to be in fs features that you have disabled.
>
> Formatted the *brick1* with ext4:
>
> [root at SERVER1 mnt]# dumpe2fs /dev/sda| grep 'Filesystem features'
> dumpe2fs 1.41.12 (17-May-2010)
> Filesystem features: has_journal ext_attr resize_inode dir_index
> filetype needs_recovery extent flex_bg sparse_super large_file
> huge_file uninit_bg dir_nlink extra_isize
>
> Formatted *brick 2* with ext4:
> [root at SERVER2 ~]# dumpe2fs /dev/sda| grep 'Filesystem features'
> dumpe2fs 1.41.12 (17-May-2010)
> Filesystem features: has_journal ext_attr resize_inode dir_index
> filetype extent flex_bg sparse_super large_file
>
> As you said, I have disabled some of the features on *brick2*.
>
> I created a distribute volume with these two bricks, created some
> files on the mount point and tried setting xattrs on these files.
>
> I got these error messages:
> =======================================================================================
> [2011-12-30 01:57:22.551634] I
> [client3_1-fops.c:818:client3_1_setxattr_cbk] 1-test-client-1: remote
> operation failed: Operation not supported
> [2011-12-30 01:57:22.551658] W [fuse-bridge.c:850:fuse_err_cbk]
> 0-glusterfs-fuse: 201305: SETXATTR() /92 => -1 (Operation not supported)
> [2011-12-30 01:57:22.556490] I
> [client3_1-fops.c:818:client3_1_setxattr_cbk] 1-test-client-1: remote
> operation failed: Operation not supported
> [2011-12-30 01:57:22.556520] W [fuse-bridge.c:850:fuse_err_cbk]
> 0-glusterfs-fuse: 201311: SETXATTR() /95 => -1 (Operation not supported)
> [2011-12-30 01:57:22.564089] I
> [client3_1-fops.c:818:client3_1_setxattr_cbk] 1-test-client-1: remote
> operation failed: Operation not supported
> [2011-12-30 01:57:22.564114] W [fuse-bridge.c:850:fuse_err_cbk]
> 0-glusterfs-fuse: 201321: SETXATTR() /100 => -1 (Operation not supported)
> ========================================================================================
>
> Whereas when I created another volume with only *brick1*, everything
> went smoothly.
> So I suspect the problem is not with rebalance but with the ext4
> features that are disabled on *brick2*.
>
> Please let me know if I am missing anything that can be tried.
>
>
>
>
> Thanks,
> Shylesh
>
>> ------------------------------------------------------------------------
>> *From:* gluster-users-bounces at gluster.org
>> [gluster-users-bounces at gluster.org] on behalf of Dan Bretherton
>> [d.a.bretherton at reading.ac.uk]
>> *Sent:* Thursday, December 29, 2011 6:05 AM
>> *To:* gluster-users
>> *Subject:* [Gluster-users] fix-layout stalls with xattr errors
>>
>> Hello All-
>> I am having problems with rebalance ... fix-layout in version 3.2.5.
>> I extended a volume with add-brick but the fix-layout stalls after a
>> small number of layout fixes and does not make any more progress. I
>> have tried the operation twice on different servers with the same
>> result. The following errors are found in the fuse mount log file on
>> the server carrying out the operation.
>>
>> [2011-12-28 21:38:14.840013] I
>> [afr-common.c:1038:afr_launch_self_heal] 0-nemo2-replicate-4:
>> background data self-heal triggered. path:
>> /users/hzu/DATA/ERAINT/ORCA025/2010/snow_ERAINT_2010.nc
>> [2011-12-28 21:38:15.93079] E
>> [client3_1-fops.c:1498:client3_1_fxattrop_cbk] 0-nemo2-client-8:
>> remote operation failed: Operation not supported
>> [2011-12-28 21:38:15.93141] E
>> [client3_1-fops.c:1498:client3_1_fxattrop_cbk] 0-nemo2-client-9:
>> remote operation failed: Operation not supported
>> [2011-12-28 21:38:15.93385] I
>> [client3_1-fops.c:1187:client3_1_fstat_cbk] 0-nemo2-client-8:
>> remote operation failed: Operation not supported
>> [2011-12-28 21:38:15.93521] I
>> [client3_1-fops.c:1187:client3_1_fstat_cbk] 0-nemo2-client-9:
>> remote operation failed: Operation not supported
>>
>>
>> The file in the error message is a link, and it is not broken as seen
>> from the volume mount point or the bricks.
>>
>> There are some worrying error messages in the brick log files for
>> nemo2-client-8 and nemo2-client-9. Here are some excerpts from the
>> nemo2-client-8 log, which is similar to the nemo2-client-9 log.
>>
>> [2011-12-28 21:23:05.827877] W [posix.c:3928:do_xattrop]
>> 0-nemo2-posix: Extended attributes not supported by filesystem
>> [2011-12-28 21:23:05.827932] I
>> [server3_1-fops.c:1705:server_fxattrop_cbk] 0-nemo2-server: 8438:
>> FXATTROP 0 (-2111276040) ==> -1 (Operation not supported)
>> [2011-12-28 21:23:05.828848] E [posix.c:4200:posix_fstat]
>> 0-nemo2-posix: fstat failed on fd=0x2aaaac703804: Operation not
>> supported
>> [2011-12-28 21:23:05.828879] I
>> [server3_1-fops.c:1113:server_fstat_cbk] 0-nemo2-server: 8439:
>> FSTAT 0 (-2111276040) ==> -1 (Operation not supported)
>> [2011-12-28 21:29:29.871213] W
>> [socket.c:1494:__socket_proto_state_machine] 0-tcp.nemo2-server:
>> reading from socket failed. Error (Transport endpoint is not
>> connected), peer (192.171.166.81:1003)
>> [2011-12-28 21:29:29.871305] I
>> [server-helpers.c:360:do_lock_table_cleanup] 0-nemo2-server:
>> inodelk released on
>> /users/hzu/DATA/ERAINT/ORCA025/2010/snow_ERAINT_2010.nc
>> [2011-12-28 21:29:29.871345] I
>> [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
>> on /users/hzu/DATA/ERAINT/ORCA025/2010/snow_ERAINT_2010.nc
>>
>> [2011-12-28 21:34:36.190023] I
>> [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup on /
>> [2011-12-28 21:34:36.190055] I
>> [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
>> on /users
>> [2011-12-28 21:34:36.190086] I
>> [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
>> on /users/hzu
>> [2011-12-28 21:34:36.190102] I
>> [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
>> on /users/hzu/DATA
>> [2011-12-28 21:34:36.190135] I
>> [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
>> on /users/hzu/DATA/ERAINT
>> [2011-12-28 21:34:36.190154] I
>> [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
>> on /users/hzu/DATA/ERAINT/ORCA025
>> [2011-12-28 21:34:36.190171] I
>> [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
>> on /users/hzu/DATA/ERAINT/ORCA025/2009
>>
>> [2011-12-28 21:38:15.92433] I
>> [server3_1-fops.c:1705:server_fxattrop_cbk] 0-nemo2-server:
>> 12228: FXATTROP 7 (-2111276040) ==> -1 (Operation not supported)
>> [2011-12-28 21:38:15.92743] E [posix.c:4200:posix_fstat]
>> 0-nemo2-posix: fstat failed on fd=0x2aaaac703804: Operation not
>> supported
>> [2011-12-28 21:38:15.92775] I
>> [server3_1-fops.c:1113:server_fstat_cbk] 0-nemo2-server: 12229:
>> FSTAT 7 (-2111276040) ==> -1 (Operation not supported)
>>
>>
>> The backend filesystems are ext4 and they are mounted with options
>> "acl,user_xattr". I tested extended attribute support (as suggested
>> here:
>> http://gluster.org/pipermail/gluster-users/2010-December/006257.html)
>> and could not find any problems, so I don't understand the "Extended
>> attributes not supported by filesystem" error. The only unusual
>> thing about the filesystems is the reduced number of filesystem
>> features enabled compared to other bricks. These are the ext4
>> features enabled.
>>
>> has_journal ext_attr resize_inode dir_index filetype needs_recovery
>> sparse_super large_file
>>
>> All the other bricks in the volume have these features plus extent,
>> flex_bg, huge_file, uninit_bg, dir_nlink and extra_isize. I don't
>> know if any of these missing ext4 features are part of the problem.
>> Does anybody know what's going on here?
>>
>> Regards
>> Dan.
>>
>>
>
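
To see whether the "Operation not supported" errors are still present after a change to the bricks, the log lines can be tallied per component. This is a rough sketch in Python, assuming each log entry has been joined back onto a single line (the entries quoted above are wrapped by the mail client) and follows the `[timestamp] LEVEL [file:line:function] component: message` layout seen in this thread. The name `count_not_supported` and the regular expression are illustrative assumptions, not part of GlusterFS.

```python
import re
from collections import Counter

# Layout assumed from the log excerpts in this thread:
# [timestamp] LEVEL [source-file:line:function] component: message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<source>[^\]]+)\]\s+(?P<component>[^:\s]+):\s+(?P<message>.*)$"
)


def count_not_supported(lines):
    """Count 'Operation not supported' messages per logging component."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and "Operation not supported" in match.group("message"):
            counts[match.group("component")] += 1
    return counts


# Entries from the fuse mount log quoted in this thread, unwrapped.
SAMPLE = [
    "[2011-12-28 21:38:14.840013] I [afr-common.c:1038:afr_launch_self_heal] "
    "0-nemo2-replicate-4: background data self-heal triggered. path: "
    "/users/hzu/DATA/ERAINT/ORCA025/2010/snow_ERAINT_2010.nc",
    "[2011-12-28 21:38:15.93079] E [client3_1-fops.c:1498:client3_1_fxattrop_cbk] "
    "0-nemo2-client-8: remote operation failed: Operation not supported",
    "[2011-12-28 21:38:15.93141] E [client3_1-fops.c:1498:client3_1_fxattrop_cbk] "
    "0-nemo2-client-9: remote operation failed: Operation not supported",
    "[2011-12-28 21:38:15.93385] I [client3_1-fops.c:1187:client3_1_fstat_cbk] "
    "0-nemo2-client-8: remote operation failed: Operation not supported",
    "[2011-12-28 21:38:15.93521] I [client3_1-fops.c:1187:client3_1_fstat_cbk] "
    "0-nemo2-client-9: remote operation failed: Operation not supported",
]

if __name__ == "__main__":
    for component, count in sorted(count_not_supported(SAMPLE).items()):
        print(component, count)
```

For the five sample entries the tally is two errors each for 0-nemo2-client-8 and 0-nemo2-client-9, the two clients named in the original report. Running the same count on a log taken before and after a change gives a quick way to check whether the errors have stopped.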