[Gluster-users] Failures during rebalance on gluster distributed disperse volume
Mauro Tridici
mauro.tridici at cmcc.it
Thu Sep 13 13:04:55 UTC 2018
Hi Nithya,
Thank you for involving the EC group.
I will wait for your suggestions.
Regards,
Mauro
> On 13 Sep 2018, at 13:38, Nithya Balachandran <nbalacha at redhat.com> wrote:
>
> This looks like a failure caused by rebalance switching to fallocate, which EC did not have implemented at that point.
>
> @Pranith, @Ashish, which version of Gluster added support for fallocate in EC?
>
>
> Regards,
> Nithya
>
> On 12 September 2018 at 19:24, Mauro Tridici <mauro.tridici at cmcc.it> wrote:
> Dear All,
>
> I recently added 3 servers (each with 12 bricks) to an existing Gluster Distributed Disperse volume.
> The volume extension completed without errors, and I have already run the rebalance procedure with the fix-layout option without problems.
> I have just launched the rebalance procedure without the fix-layout option but, as the output below shows, some failures have been reported.
>
> [root@s01 glusterfs]# gluster v rebalance tier2 status
>      Node  Rebalanced-files    size  scanned  failures  skipped       status  run time in h:m:s
> ---------  ----------------  ------  -------  --------  -------  -----------  -----------------
> localhost             71176   3.2MB  2137557   1530391     8128  in progress           13:59:05
>   s02-stg                 0  0Bytes        0         0        0    completed           11:53:28
>   s03-stg                 0  0Bytes        0         0        0    completed           11:53:32
>   s04-stg                 0  0Bytes        0         0        0    completed            0:00:06
>   s05-stg                15  0Bytes    17055         0       18    completed           10:48:01
>   s06-stg                 0  0Bytes        0         0        0    completed            0:00:06
> Estimated time left for rebalance to complete : 0:46:53
> volume rebalance: tier2: success
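
To put the failure count in proportion, the status output can be parsed per node. The following is only a minimal sketch, assuming the column order shown in the output above (other Gluster versions may print different columns); the names `parse_status`, `failure_ratio` and `STATUS_SAMPLE` are hypothetical helpers, not part of any Gluster tooling.

```python
import re

# One data row of the status table. Header, separator and summary lines
# do not match because their second field is not a number.
ROW_RE = re.compile(
    r"^(?P<node>\S+)\s+(?P<files>\d+)\s+(?P<size>\S+)\s+(?P<scanned>\d+)\s+"
    r"(?P<failures>\d+)\s+(?P<skipped>\d+)\s+(?P<status>.+?)\s+"
    r"(?P<runtime>\d+:\d{2}:\d{2})$"
)


def parse_status(text):
    """Return one dict per node row found in `gluster v rebalance <vol> status` output."""
    rows = []
    for raw in text.splitlines():
        # Drop an optional mail-quote prefix ("> ") and surrounding whitespace.
        line = raw.lstrip("> ").strip()
        match = ROW_RE.match(line)
        if not match:
            continue
        row = match.groupdict()
        for key in ("files", "scanned", "failures", "skipped"):
            row[key] = int(row[key])
        rows.append(row)
    return rows


def failure_ratio(row):
    """Fraction of scanned files that failed; 0.0 when nothing was scanned."""
    return row["failures"] / row["scanned"] if row["scanned"] else 0.0


# Rows copied from the status output quoted in this message.
STATUS_SAMPLE = """\
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
localhost 71176 3.2MB 2137557 1530391 8128 in progress 13:59:05
s02-stg 0 0Bytes 0 0 0 completed 11:53:28
s05-stg 15 0Bytes 17055 0 18 completed 10:48:01
volume rebalance: tier2: success
"""

if __name__ == "__main__":
    for row in parse_status(STATUS_SAMPLE):
        print(f"{row['node']}: {row['status']}, failure ratio {failure_ratio(row):.1%}")
```

On the rows above, this shows that most of the files scanned on localhost are being reported as failures, which is what prompted the look at the rebalance log.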
>
> In the volume rebalance log file, I found many error messages similar to the following:
>
> [2018-09-12 13:15:50.756703] E [MSGID: 0] [dht-rebalance.c:1696:dht_migrate_file] 0-tier2-dht: Create dst failed on - tier2-disperse-6 for file - /CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/sps_200508_003.cam.h0.2005-12_grid.nc
> [2018-09-12 13:15:50.757025] E [MSGID: 109023] [dht-rebalance.c:2733:gf_defrag_migrate_single_file] 0-tier2-dht: migrate-data failed for /CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/sps_200508_003.cam.h0.2005-12_grid.nc
> [2018-09-12 13:15:50.759183] E [MSGID: 109023] [dht-rebalance.c:844:__dht_rebalance_create_dst_file] 0-tier2-dht: fallocate failed for /CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/sps_200508_003.cam.h0.2005-09_grid.nc on tier2-disperse-9 (Operation not supported)
> [2018-09-12 13:15:50.759206] E [MSGID: 0] [dht-rebalance.c:1696:dht_migrate_file] 0-tier2-dht: Create dst failed on - tier2-disperse-9 for file - /CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/sps_200508_003.cam.h0.2005-09_grid.nc
> [2018-09-12 13:15:50.759536] E [MSGID: 109023] [dht-rebalance.c:2733:gf_defrag_migrate_single_file] 0-tier2-dht: migrate-data failed for /CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/sps_200508_003.cam.h0.2005-09_grid.nc
> [2018-09-12 13:15:50.777219] E [MSGID: 109023] [dht-rebalance.c:844:__dht_rebalance_create_dst_file] 0-tier2-dht: fallocate failed for /CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/sps_200508_003.cam.h0.2006-01_grid.nc on tier2-disperse-10 (Operation not supported)
> [2018-09-12 13:15:50.777241] E [MSGID: 0] [dht-rebalance.c:1696:dht_migrate_file] 0-tier2-dht: Create dst failed on - tier2-disperse-10 for file - /CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/sps_200508_003.cam.h0.2006-01_grid.nc
> [2018-09-12 13:15:50.777676] E [MSGID: 109023] [dht-rebalance.c:2733:gf_defrag_migrate_single_file] 0-tier2-dht: migrate-data failed for /CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/sps_200508_003.cam.h0.2006-01_grid.nc
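
To check whether every failure has the same cause, the "fallocate failed" entries can be tallied per disperse subvolume and error reason. This is a rough sketch: the regular expression is modeled only on the log lines quoted above, and the names `count_fallocate_failures` and `LOG_SAMPLE` are hypothetical. The rebalance log path used in the usage note is likewise an assumption about this installation.

```python
import re
from collections import Counter

# Modeled on the "fallocate failed" lines in the excerpt above. The optional
# "<http...>" group tolerates an auto-link that a mail client or archive may
# append after the file path.
FALLOCATE_RE = re.compile(
    r"fallocate failed for (?P<path>.+?)"
    r"(?: <http[^>]*>)?"
    r" on (?P<subvol>\S+) \((?P<reason>[^)]*)\)\s*$"
)


def count_fallocate_failures(lines):
    """Count fallocate failures keyed by (subvolume, error reason)."""
    counts = Counter()
    for line in lines:
        match = FALLOCATE_RE.search(line)
        if match:
            counts[(match.group("subvol"), match.group("reason"))] += 1
    return counts


# Three entries copied from the excerpt above; the second one is not a
# fallocate failure and must be ignored by the counter.
_DIR = "/CSP/sp1/CESM/archive/sps_200508_003/atm/hist/postproc/"
LOG_SAMPLE = [
    "[2018-09-12 13:15:50.759183] E [MSGID: 109023] "
    "[dht-rebalance.c:844:__dht_rebalance_create_dst_file] 0-tier2-dht: "
    "fallocate failed for " + _DIR + "sps_200508_003.cam.h0.2005-09_grid.nc "
    "on tier2-disperse-9 (Operation not supported)",
    "[2018-09-12 13:15:50.759206] E [MSGID: 0] "
    "[dht-rebalance.c:1696:dht_migrate_file] 0-tier2-dht: "
    "Create dst failed on - tier2-disperse-9 for file - "
    + _DIR + "sps_200508_003.cam.h0.2005-09_grid.nc",
    "[2018-09-12 13:15:50.777219] E [MSGID: 109023] "
    "[dht-rebalance.c:844:__dht_rebalance_create_dst_file] 0-tier2-dht: "
    "fallocate failed for " + _DIR + "sps_200508_003.cam.h0.2006-01_grid.nc "
    "on tier2-disperse-10 (Operation not supported)",
]

if __name__ == "__main__":
    for (subvol, reason), n in sorted(count_fallocate_failures(LOG_SAMPLE).items()):
        print(f"{subvol}: {n} x {reason}")
```

Against a real log, the same function could be fed an open file object, for example `open("/var/log/glusterfs/tier2-rebalance.log")` if the log lives at that (assumed) location. If every counted entry reports "Operation not supported", that would suggest a single underlying cause rather than per-file problems.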
>
> Could you please help me to understand what is happening and how to solve it?
>
> Our Gluster installation runs Gluster v3.10.5.
>
> Thank you in advance,
> Mauro
-------------------------
Mauro Tridici
Fondazione CMCC
CMCC Supercomputing Center
presso Complesso Ecotekne - Università del Salento -
Strada Prov.le Lecce - Monteroni sn
73100 Lecce IT
http://www.cmcc.it
mobile: (+39) 327 5630841
email: mauro.tridici at cmcc.it