[Gluster-devel] Rebalance failure wrt trashcan
Nithya Balachandran
nbalacha at redhat.com
Thu May 14 15:44:03 UTC 2015
Hi Anoop,
It is a specific use case. Please see http://review.gluster.org/#/c/10786/ for more details.
The issue is not related to the trash translator.
To hit the issue you would need to create a distributed-replicate (distrep) volume such that the first brick of each replica set is on one node and the second brick is on the other node, i.e.,
gluster v create vol1 replica 2 <node1>:/path_to_brick1 <node2>:/path_to_brick1 <node1>:/path_to_brick2 <node2>:/path_to_brick2
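You can confirm the ordering with 'gluster v info vol1': with replica 2, consecutive pairs of bricks in that listing form the replica sets, so the layout above keeps the first brick of every set on node1 and the second brick on node2.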
Regards,
Nithya
----- Anoop C S <achiraya at redhat.com> wrote:
> Hi,
>
> I tried to reproduce the situation using master by adding some bricks
> and initiating the rebalance operation (I had created some empty files
> through the mount before adding the bricks). I couldn't find any errors
> in the volume status output or in the rebalance/brick logs.
>
> [root at dhcp43-4 master]# gluster v create vol 10.70.43.4:/home/brick1
> 10.70.43.66:/home/brick2 force
> volume create: vol: success: please start the volume to access data
> [root at dhcp43-4 master]# gluster v start vol
> volume start: vol: success
> [root at dhcp43-4 master]# gluster v add-brick vol 10.70.43.66:/home/brick3
> 10.70.43.66:/home/brick4 force
> volume add-brick: success
> [root at dhcp43-4 master]# gluster v rebalance vol start
> volume rebalance: vol: success: Rebalance on vol has been started
> successfully. Use rebalance status command to check status of the
> rebalance process.
> ID: f4f86e5e-e042-424b-a155-687b88cd6d26
>
> [root at dhcp43-4 master]# gluster v rebalance vol status
> Node          Rebalanced-files    size      scanned    failures    skipped    status       run time in secs
> ------------  ----------------    ------    -------    --------    -------    ---------    ----------------
> localhost     0                   0Bytes    5          0           1          completed    0.00
> 10.70.43.66   0                   0Bytes    6          0           2          completed    0.00
> volume rebalance: vol: success:
> [root at dhcp43-4 master]# gluster v status vol
> Status of volume: vol
> Gluster process TCP Port RDMA Port Online Pid
> ------------------------------------------------------------------------------
> Brick 10.70.43.4:/home/brick1 49152 0 Y 6983
> Brick 10.70.43.66:/home/brick2 49152 0 Y 12853
> Brick 10.70.43.66:/home/brick3 49153 0 Y 12888
> Brick 10.70.43.66:/home/brick4 49154 0 Y 12905
> NFS Server on localhost 2049 0 Y 7027
> NFS Server on 10.70.43.66 2049 0 Y 12923
>
> Task Status of Volume vol
> ------------------------------------------------------------------------------
> Task : Rebalance
> ID : f4f86e5e-e042-424b-a155-687b88cd6d26
> Status : completed
>
> However, I could see the following in the rebalance logs.
>
> [2015-05-14 11:40:14.474644] I [dht-layout.c:697:dht_layout_normalize]
> 0-vol-dht: Found anomalies in /.trashcan (gfid =
> 00000000-0000-0000-0000-000000000005). Holes=1 overlaps=0
>
> [2015-05-14 11:40:14.485028] I [MSGID: 109036]
> [dht-common.c:6690:dht_log_new_layout_for_dir_selfheal] 0-vol-dht:
> Setting layout of /.trashcan with [Subvol_name: vol-client-0, Err: -1 ,
> Start: 0 , Stop: 1073737911 , Hash: 1 ], [Subvol_name: vol-client-1,
> Err: -1 , Start: 1073737912 , Stop: 2147475823 , Hash: 1 ],
> [Subvol_name: vol-client-2, Err: -1 , Start: 2147475824 , Stop:
> 3221213735 , Hash: 1 ], [Subvol_name: vol-client-3, Err: -1 , Start:
> 3221213736 , Stop: 4294967295 , Hash: 1 ],
>
> [2015-05-14 11:40:14.485958] I [dht-common.c:3539:dht_setxattr]
> 0-vol-dht: fixing the layout of /.trashcan
>
> . . .
>
> [2015-05-14 11:40:14.488222] I
> [dht-rebalance.c:2113:gf_defrag_process_dir] 0-vol-dht: migrate data
> called on /.trashcan
>
> [2015-05-14 11:40:14.488966] I
> [dht-rebalance.c:2322:gf_defrag_process_dir] 0-vol-dht: Migration
> operation on dir /.trashcan took 0.00 secs
>
> [2015-05-14 11:40:14.494033] I [dht-layout.c:697:dht_layout_normalize]
> 0-vol-dht: Found anomalies in /.trashcan/internal_op (gfid =
> 00000000-0000-0000-0000-000000000006). Holes=1 overlaps=0
>
> [2015-05-14 11:40:14.495608] I [MSGID: 109036]
> [dht-common.c:6690:dht_log_new_layout_for_dir_selfheal] 0-vol-dht:
> Setting layout of /.trashcan/internal_op with [Subvol_name:
> vol-client-0, Err: -1 , Start: 2147475824 , Stop: 3221213735 , Hash: 1
> ], [Subvol_name: vol-client-1, Err: -1 , Start: 3221213736 , Stop:
> 4294967295 , Hash: 1 ], [Subvol_name: vol-client-2, Err: -1 , Start: 0 ,
> Stop: 1073737911 , Hash: 1 ], [Subvol_name: vol-client-3, Err: -1 ,
> Start: 1073737912 , Stop: 2147475823 , Hash: 1 ],
>
> [2015-05-14 11:40:14.501198] I [dht-common.c:3539:dht_setxattr]
> 0-vol-dht: fixing the layout of /.trashcan/internal_op
>
> . . .
>
> [2015-05-14 11:40:14.508264] I
> [dht-rebalance.c:2113:gf_defrag_process_dir] 0-vol-dht: migrate data
> called on /.trashcan/internal_op
>
> [2015-05-14 11:40:14.509493] I
> [dht-rebalance.c:2322:gf_defrag_process_dir] 0-vol-dht: Migration
> operation on dir /.trashcan/internal_op took 0.00 secs
>
> [2015-05-14 11:40:14.513020] I [dht-common.c:3539:dht_setxattr]
> 0-vol-dht: fixing the layout of /.trashcan/internal_op
>
> [2015-05-14 11:40:14.525227] I [dht-common.c:3539:dht_setxattr]
> 0-vol-dht: fixing the layout of /.trashcan
>
> . . .
>
> [2015-05-14 11:40:14.529157] I
> [dht-rebalance.c:2793:gf_defrag_start_crawl] 0-DHT: crawling file-system
> completed
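>
> (For reference, the messages above are from the rebalance log --
> /var/log/glusterfs/vol-rebalance.log on each node, assuming the default
> log location. Something like
> grep -E 'anomalies|fixing the layout' /var/log/glusterfs/vol-rebalance.log
> pulls out just these lines.)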
>
>
> On 05/14/2015 04:20 PM, SATHEESARAN wrote:
> > On 05/14/2015 12:55 PM, Vijay Bellur wrote:
> >> On 05/14/2015 09:00 AM, SATHEESARAN wrote:
> >>> Hi All,
> >>>
> >>> I was using the glusterfs-3.7 beta2 build
> >>> (glusterfs-3.7.0beta2-0.0.el6.x86_64) and have seen a rebalance
> >>> failure on one of the nodes.
> >>>
> >>> [2015-05-14 12:17:03.695156] E
> >>> [dht-rebalance.c:2368:gf_defrag_settle_hash] 0-vmstore-dht: fix layout
> >>> on /.trashcan/internal_op failed
> >>> [2015-05-14 12:17:03.695636] E [MSGID: 109016]
> >>> [dht-rebalance.c:2528:gf_defrag_fix_layout] 0-vmstore-dht: Fix layout
> >>> failed for /.trashcan
> >>>
> >>> Does it have any impact?
> >>>
> >>
> >> I don't think there should be any impact due to this. Rebalance should
> >> continue fine without any problems. Do let us know if you observe the
> >> behavior to be otherwise.
> >>
> >> -Vijay
> > I tested the same functionality and I don't find any impact as such, but
> > 'gluster volume status <vol-name>' reports the rebalance as a FAILURE.
> > Any tool (for example oVirt) consuming the output from 'gluster
> > volume status <vol> --xml' would report the rebalance operation as a FAILURE.
> >
> > [root@ ~]# gluster volume rebalance vmstore start
> > volume rebalance: vmstore: success: Rebalance on vmstore has been
> > started successfully. Use rebalance status command to check status of
> > the rebalance process.
> > ID: 68a12fc9-acd5-4f24-ba2d-bfc070ad5668
> >
> > [root@~]# gluster volume rebalance vmstore status
> > Node          Rebalanced-files    size      scanned    failures    skipped    status       run time in secs
> > ------------  ----------------    ------    -------    --------    -------    ---------    ----------------
> > localhost     0                   0Bytes    2          0           0          completed    0.00
> > 10.70.37.58   0                   0Bytes    0          3           0          failed       0.00
> > volume rebalance: vmstore: success:
> >
> > [root@~]# gluster volume status vmstore
> > Status of volume: vmstore
> > Gluster process TCP Port RDMA Port Online Pid
> > ------------------------------------------------------------------------------
> >
> > ......
> >
> > Task Status of Volume vmstore
> > ------------------------------------------------------------------------------
> >
> > Task : Rebalance
> > ID : 68a12fc9-acd5-4f24-ba2d-bfc070ad5668
> > Status : failed
> >
> > Snip from --xml tasks :
> > <tasks>
> > <task>
> > <type>Rebalance</type>
> > <id>68a12fc9-acd5-4f24-ba2d-bfc070ad5668</id>
> > <status>4</status>
> > <statusStr>failed</statusStr>
> > </task>
> > </tasks>
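> >
> > (A consumer could read that field directly; as a rough sketch, assuming
> > xmllint is installed:
> > gluster volume status vmstore --xml | xmllint --xpath 'string(//task/statusStr)' -
> > prints "failed" here.)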
> >
> > This is the case with remove-brick with data migration as well.
> >
> > -- sas
> >
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel