[Gluster-users] heaps split-brains during back-transfert

Krutika Dhananjay kdhananj at redhat.com
Fri Jul 31 09:37:55 UTC 2015


Niels, 

I heard from Geoffrey on IRC last evening about this issue; he told me he was able to resolve all the split-brains manually.

-Krutika 
----- Original Message -----

> From: "Niels de Vos" <ndevos at redhat.com>
> To: "Geoffrey Letessier" <geoffrey.letessier at cnrs.fr>
> Cc: gluster-users at gluster.org
> Sent: Friday, July 31, 2015 2:56:26 PM
> Subject: Re: [Gluster-users] heaps split-brains during back-transfert

> On Wed, Jul 29, 2015 at 12:44:38AM +0200, Geoffrey Letessier wrote:
> > OK, thank you Niels for this explanation. Now, this makes sense.
> >
> > And concerning all the split-brains that appeared during the back-transfer,
> > do you have any idea where they are coming from?

> Sorry, no, I don't know how that is happening in your environment. I'll
> try to find someone who understands more about it and can help you with
> that.

> Niels

> >
> > Best,
> > Geoffrey
> > ------------------------------------------------------
> > Geoffrey Letessier
> > IT manager & systems engineer
> > UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
> > Institut de Biologie Physico-Chimique
> > 13, rue Pierre et Marie Curie - 75005 Paris
> > Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
> >
> > On 29 Jul 2015 at 00:02, Niels de Vos <ndevos at redhat.com> wrote:
> >
> > > On Tue, Jul 28, 2015 at 03:46:37PM +0200, Geoffrey Letessier wrote:
> > >> Hi,
> > >>
> > >> In addition to all the split-brains reported, is it normal to see
> > >> thousands upon thousands (several tens, if not hundreds, of thousands)
> > >> of broken symlinks when browsing the .glusterfs directory on each brick?
> > >
> > > Yes, I think it is normal. A symlink points to a particular filename,
> > > possibly in a different directory. If the target file is located on a
> > > different brick, the symlink points to a non-local file.
> > >
> > > Consider this example with two bricks in a distributed volume:
> > > - file: README
> > > - symlink: IMPORTANT -> README
> > >
> > > When the distribution algorithm is applied, README 'hashes' to brick-A. The
> > > symlink 'hashes' to brick-B. This means that README will be located on
> > > brick-A, and the symlink named IMPORTANT will be located on
> > > brick-B. Because README is not on the same brick as IMPORTANT, the
> > > symlink points to the non-existent file README on brick-B.
> > >
> > > However, when a Gluster client reads the target of symlink IMPORTANT,
> > > the Gluster client calculates the location of README and knows that
> > > README can be found on brick-A.
> > >
> > > I hope that makes sense?
> > >
> > > Niels
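The README / IMPORTANT example above can be played with in a few lines of Python. This is only a toy sketch: it uses an MD5-based modulo placement as a stand-in, which is an assumption for illustration and not the hash Gluster's distribution layer really uses. The brick names come from the example; `place` and `symlink_dangles_on_brick` are hypothetical helper names.

```python
import hashlib

# Hypothetical brick names, taken from the two-brick example above.
BRICKS = ["brick-A", "brick-B"]


def place(name, bricks=BRICKS):
    """Toy placement: hash the entry name and pick a brick from the result.

    This is a stand-in for the real distribution hash, which is NOT
    reproduced here; only the idea "the name decides the brick" is shown.
    """
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return bricks[int.from_bytes(digest[:4], "big") % len(bricks)]


def symlink_dangles_on_brick(link_name, target_name):
    """True when the symlink and its target land on different bricks.

    In that case the link looks broken when inspected directly on the
    brick, although a client resolving the target by name still finds it.
    """
    return place(link_name) != place(target_name)


if __name__ == "__main__":
    for name in ("README", "IMPORTANT"):
        print(name, "is placed on", place(name))
    print("link dangles on its own brick:",
          symlink_dangles_on_brick("IMPORTANT", "README"))
```

With only two bricks, roughly half of all link/target pairs would end up on different bricks under such a scheme, which is in line with seeing very many "broken" links when looking at a single brick.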
> > >
> > >
> > >> For the moment, I have only synchronized one remote directory (around
> > >> 30TB and a few million files) into my new volume. No other operations
> > >> on files in this volume have been done yet.
> > >> How can I fix it? Can I delete these dead symlinks? How can I fix all
> > >> my split-brains?
> > >>
> > >> Here is an example of an ls listing:
> > >> [root at cl-storage3 ~]# cd
> > >> /export/brick_home/brick1/data/.glusterfs/7b/d2/
> > >> [root at cl-storage3 d2]# ll
> > >> total 8,7M
> > >> 13706 drwx------ 2 root root 8,0K 26 juil. 17:22 .
> > >> 2147483784 drwx------ 258 root root 8,0K 20 juil. 23:07 ..
> > >> 2148444137 -rwxrwxrwx 2 baaden baaden_team 173K 22 mai 2008
> > >> 7bd200dd-1774-4395-9065-605ae30ec18b
> > >> 1559384 -rw-rw-r-- 2 tarus amyloid_team 4,3K 19 juin 2013
> > >> 7bd2155c-7a05-4edc-ae77-35ed7e16afbc
> > >> 287295 lrwxrwxrwx 1 root root 58 20 juil. 23:38
> > >> 7bd2370a-100b-411e-89a4-d184da9f0f88 ->
> > >> ../../a7/59/a759de6f-cdf5-43dd-809a-baf81d103bf7/prop-base
> > >> 2149090201 -rw-rw-r-- 2 tarus amyloid_team 76K 8 mars 2014
> > >> 7bd2497f-d24b-4b19-a1c5-80a4956e56a1
> > >> 2148561174 -rw-r--r-- 2 tran derreumaux_team 575 14 févr. 07:54
> > >> 7bd25db0-67f5-43e5-a56a-52cf8c4c60dd
> > >> 1303943 -rw-r--r-- 2 tran derreumaux_team 576 10 févr. 06:06
> > >> 7bd25e97-18be-4faf-b122-5868582b4fd8
> > >> 1308607 -rw-r--r-- 2 tran derreumaux_team 414K 16 juin 11:05
> > >> 7bd2618f-950a-4365-a753-723597ef29f5
> > >> 45745 -rw-r--r-- 2 letessier admin_team 585 5 janv. 2012
> > >> 7bd265c7-e204-4ee8-8717-e4a0c393fb0f
> > >> 2148144918 -rw-rw-r-- 2 tarus amyloid_team 107K 28 févr. 2014
> > >> 7bd26c5b-d48a-481a-9ca6-2dc27768b5ad
> > >> 13705 -rw-rw-r-- 2 tarus amyloid_team 25K 4 juin 2014
> > >> 7bd27e4c-46ba-4f21-a766-389bfa52fd78
> > >> 1633627 -rw-rw-r-- 2 tarus amyloid_team 75K 12 mars 2014
> > >> 7bd28631-90af-4c16-8ff0-c3d46d5026c6
> > >> 1329165 -rw-r--r-- 2 tran derreumaux_team 175 15 juin 23:40
> > >> 7bd2957e-a239-4110-b3d8-b4926c7f060b
> > >> 797803 lrwxrwxrwx 2 baaden baaden_team 26 2 avril 2007
> > >> 7bd29933-1c80-4c6b-ae48-e64e4da874cb -> ../divided/a7/2a7o.pdb1.gz
> > >> 1532463 -rw-rw-rw- 2 baaden baaden_team 1,8M 2 nov. 2009
> > >> 7bd29d70-aeb4-4eca-ac55-fae2d46ba911
> > >> 1411112 -rw-r--r-- 2 sterpone sterpone_team 3,1K 2 mai 2012
> > >> 7bd2a5eb-62a4-47fc-b149-31e10bd3c33d
> > >> 2148865896 -rw-r--r-- 2 tran derreumaux_team 2,1M 15 juin 23:46
> > >> 7bd2ae9c-18ca-471f-a54a-6e4aec5aea89
> > >> 2148762578 -rw-rw-r-- 2 tarus amyloid_team 154K 11 mars 2014
> > >> 7bd2b7d7-7745-4842-b7b4-400791c1d149
> > >> 149216 -rw-r--r-- 2 vamparys sacquin_team 241K 17 mai 2013
> > >> 7bd2ba98-6a42-40ea-87ea-acb607d73cb5
> > >> 2148977923 -rwxr-xr-x 2 murail baaden_team 23K 18 juin 2012
> > >> 7bd2cf57-19e7-451c-885d-fd02fd988d43
> > >> 1176623 -rw-rw-r-- 2 tarus amyloid_team 227K 8 mars 2014
> > >> 7bd2d92c-7ec8-4af8-9043-49d1908a99dc
> > >> 1172122 lrwxrwxrwx 2 sterpone sterpone_team 61 17 avril 12:49
> > >> 7bd2d96e-e925-45f0-a26a-56b95c084122 ->
> > >> ../../../../../src/libs/ck-libs/ParFUM-Tops-Dev/ParFUM_TOPS.h
> > >> 1385933 -rw-r--r-- 2 tran derreumaux_team 2,9M 16 juin 05:29
> > >> 7bd2df54-17d2-4644-96b7-f8925a67ec1e
> > >> 745899 lrwxrwxrwx 1 root root 58 22 juil. 09:50
> > >> 7bd2df83-ce58-4a17-aca8-a32b71e953d4 ->
> > >> ../../5c/39/5c39010f-fa77-49df-8df6-8d72cf74fd64/model_009
> > >> 2149100186 -rw-rw-r-- 2 tarus amyloid_team 494K 17 mars 2014
> > >> 7bd2e865-a2f4-4d90-ab29-dccebe2e3440
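The listing above suggests a layout where each entry lives under `.glusterfs/<first two hex digits>/<next two hex digits>/<full GFID>`. A minimal sketch, assuming that layout holds everywhere on the brick: the first helper rebuilds such a path from a GFID, and the second lists (without deleting anything) the symlinks in one directory whose target does not resolve locally. Both function names are hypothetical.

```python
import os
import uuid
from pathlib import PurePosixPath


def gfid_backend_path(brick_root, gfid):
    """Build <brick_root>/.glusterfs/<aa>/<bb>/<gfid> from a GFID string.

    The layout is inferred from the listing above, not from Gluster sources.
    """
    canonical = str(uuid.UUID(gfid))  # validates the GFID and lower-cases it
    return PurePosixPath(brick_root, ".glusterfs",
                         canonical[:2], canonical[2:4], canonical)


def dangling_symlinks(directory):
    """Return the names of symlinks in `directory` whose target is missing.

    Read-only: nothing is removed or modified.
    """
    names = []
    for entry in os.scandir(directory):
        # os.path.exists() follows the link, so it is False for a broken one.
        if entry.is_symlink() and not os.path.exists(entry.path):
            names.append(entry.name)
    return sorted(names)


if __name__ == "__main__":
    print(gfid_backend_path("/export/brick_home/brick1/data",
                            "7bd2370a-100b-411e-89a4-d184da9f0f88"))
```

Per the explanation given earlier in the thread, a link that dangles on one brick is not necessarily a problem, so a count from `dangling_symlinks` is a diagnostic, not a list of things to delete.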
> > >>
> > >>
> > >>
> > >> Best.
> > >> Geoffrey
> > >>
> > >> On 27 Jul 2015 at 22:57, Geoffrey Letessier
> > >> <geoffrey.letessier at cnrs.fr> wrote:
> > >>
> > >>> Dear all,
> > >>>
> > >>> For several weeks now (more than a month), our production computing
> > >>> has been stopped because of several surprising problems with GlusterFS.
> > >>>
> > >>> After noticing a serious problem with incorrect quota sizes accounted
> > >>> for a great many files, I decided, under the guidance of the Gluster
> > >>> support team, to upgrade my storage cluster from version 3.5.3 to the
> > >>> latest (3.7.2-3), because these bugs are theoretically fixed in this
> > >>> branch. Since I did this upgrade, everything is a mess and I cannot
> > >>> restart production.
> > >>> Specifically:
> > >>> 1 - The RDMA protocol is not working and hangs my system / shell
> > >>> commands; only the TCP protocol (over InfiniBand) is more or less
> > >>> operational. It is not a blocking point, but…
> > >>> 2 - Read/write performance is relatively low.
> > >>> 3 - Thousands of split-brains have appeared.
> > >>>
> > >>> So, for the moment, I believe GlusterFS 3.7 is not actually
> > >>> production-ready.
> > >>>
> > >>> Concerning the third point: after destroying all my volumes (RAID
> > >>> re-init, new partition, GlusterFS volumes, etc.) and recreating the
> > >>> main one, I tried to transfer my data back from the archive/backup
> > >>> server into this new volume, and I see a lot of errors in my mount log
> > >>> file, as you can read in this extract:
> > >>> [2015-07-26 22:35:16.962815] I
> > >>> [afr-self-heal-entry.c:565:afr_selfheal_entry_do]
> > >>> 0-vol_home-replicate-0: performing entry selfheal on
> > >>> 865083fa-984e-44bd-aacf-b8195789d9e0
> > >>> [2015-07-26 22:35:16.965896] E
> > >>> [afr-self-heal-entry.c:249:afr_selfheal_detect_gfid_and_type_mismatch]
> > >>> 0-vol_home-replicate-0: Gfid mismatch detected for
> > >>> <865083fa-984e-44bd-aacf-b8195789d9e0/job.pbs>,
> > >>> e944d444-66c5-40a4-9603-7c190ad86013 on vol_home-client-1 and
> > >>> 820f9bcc-a0f6-40e0-bcec-28a76b4195ea on vol_home-client-0. Skipping
> > >>> conservative merge on the file.
> > >>> [2015-07-26 22:35:16.975206] I
> > >>> [afr-self-heal-entry.c:565:afr_selfheal_entry_do]
> > >>> 0-vol_home-replicate-0: performing entry selfheal on
> > >>> 29382d8d-c507-4d2e-b74d-dbdcb791ca65
> > >>> [2015-07-26 22:35:28.719935] E
> > >>> [afr-self-heal-entry.c:249:afr_selfheal_detect_gfid_and_type_mismatch]
> > >>> 0-vol_home-replicate-0: Gfid mismatch detected for
> > >>> <29382d8d-c507-4d2e-b74d-dbdcb791ca65/res_1BVK_r_u_1IBR_l_u_Cond.1IBR_l_u.1BVK_r_u.UB.global.dat.txt>,
> > >>> 951c5ffb-ca38-4630-93f3-8e4119ab0bd8 on vol_home-client-1 and
> > >>> 5ae663ca-e896-4b92-8ec5-5b15422ab861 on vol_home-client-0. Skipping
> > >>> conservative merge on the file.
> > >>> [2015-07-26 22:35:29.764891] I
> > >>> [afr-self-heal-entry.c:565:afr_selfheal_entry_do]
> > >>> 0-vol_home-replicate-0: performing entry selfheal on
> > >>> 865083fa-984e-44bd-aacf-b8195789d9e0
> > >>> [2015-07-26 22:35:29.768339] E
> > >>> [afr-self-heal-entry.c:249:afr_selfheal_detect_gfid_and_type_mismatch]
> > >>> 0-vol_home-replicate-0: Gfid mismatch detected for
> > >>> <865083fa-984e-44bd-aacf-b8195789d9e0/job.pbs>,
> > >>> e944d444-66c5-40a4-9603-7c190ad86013 on vol_home-client-1 and
> > >>> 820f9bcc-a0f6-40e0-bcec-28a76b4195ea on vol_home-client-0. Skipping
> > >>> conservative merge on the file.
> > >>> [2015-07-26 22:35:29.775037] I
> > >>> [afr-self-heal-entry.c:565:afr_selfheal_entry_do]
> > >>> 0-vol_home-replicate-0: performing entry selfheal on
> > >>> 29382d8d-c507-4d2e-b74d-dbdcb791ca65
> > >>> [2015-07-26 22:35:29.776857] E
> > >>> [afr-self-heal-entry.c:249:afr_selfheal_detect_gfid_and_type_mismatch]
> > >>> 0-vol_home-replicate-0: Gfid mismatch detected for
> > >>> <29382d8d-c507-4d2e-b74d-dbdcb791ca65/res_1BVK_r_u_1IBR_l_u_Cond.1IBR_l_u.1BVK_r_u.UB.global.dat.txt>,
> > >>> 951c5ffb-ca38-4630-93f3-8e4119ab0bd8 on vol_home-client-1 and
> > >>> 5ae663ca-e896-4b92-8ec5-5b15422ab861 on vol_home-client-0. Skipping
> > >>> conservative merge on the file.
> > >>> [2015-07-26 22:35:29.800535] W [MSGID: 108008]
> > >>> [afr-self-heal-name.c:353:afr_selfheal_name_gfid_mismatch_check]
> > >>> 0-vol_home-replicate-0: GFID mismatch for
> > >>> <gfid:29382d8d-c507-4d2e-b74d-dbdcb791ca65>/res_1BVK_r_u_1IBR_l_u_Cond.1IBR_l_u.1BVK_r_u.UB.global.dat.txt
> > >>> 951c5ffb-ca38-4630-93f3-8e4119ab0bd8 on vol_home-client-1 and
> > >>> 5ae663ca-e896-4b92-8ec5-5b15422ab861 on vol_home-client-0
> > >>>
> > >>> And when I try to browse some folders (still from the mount log file):
> > >>> [2015-07-27 09:00:19.005763] I
> > >>> [afr-self-heal-entry.c:565:afr_selfheal_entry_do]
> > >>> 0-vol_home-replicate-0: performing entry selfheal on
> > >>> 2ac27442-8be0-4985-b48f-3328a86a6686
> > >>> [2015-07-27 09:00:22.322316] E
> > >>> [afr-self-heal-entry.c:249:afr_selfheal_detect_gfid_and_type_mismatch]
> > >>> 0-vol_home-replicate-0: Gfid mismatch detected for
> > >>> <2ac27442-8be0-4985-b48f-3328a86a6686/md0012588.gro>,
> > >>> 9c635868-054b-4a13-b974-0ba562991586 on vol_home-client-1 and
> > >>> 1943175c-b336-4b33-aa1c-74a1c51f17b9 on vol_home-client-0. Skipping
> > >>> conservative merge on the file.
> > >>> [2015-07-27 09:00:23.008771] I
> > >>> [afr-self-heal-entry.c:565:afr_selfheal_entry_do]
> > >>> 0-vol_home-replicate-0: performing entry selfheal on
> > >>> 2ac27442-8be0-4985-b48f-3328a86a6686
> > >>> [2015-07-27 08:59:50.359187] W [MSGID: 108008]
> > >>> [afr-self-heal-name.c:353:afr_selfheal_name_gfid_mismatch_check]
> > >>> 0-vol_home-replicate-0: GFID mismatch for
> > >>> <gfid:2ac27442-8be0-4985-b48f-3328a86a6686>/md0012588.gro
> > >>> 9c635868-054b-4a13-b974-0ba562991586 on vol_home-client-1 and
> > >>> 1943175c-b336-4b33-aa1c-74a1c51f17b9 on vol_home-client-0
> > >>> [2015-07-27 09:00:02.500419] W [MSGID: 108008]
> > >>> [afr-self-heal-name.c:353:afr_selfheal_name_gfid_mismatch_check]
> > >>> 0-vol_home-replicate-0: GFID mismatch for
> > >>> <gfid:2ac27442-8be0-4985-b48f-3328a86a6686>/md0012590.gro
> > >>> b22aec09-2be3-41ea-a976-7b8d0e6f61f0 on vol_home-client-1 and
> > >>> ec100f9e-ec48-4b29-b75e-a50ec6245de6 on vol_home-client-0
> > >>> [2015-07-27 09:00:02.506925] W [MSGID: 108008]
> > >>> [afr-self-heal-name.c:353:afr_selfheal_name_gfid_mismatch_check]
> > >>> 0-vol_home-replicate-0: GFID mismatch for
> > >>> <gfid:2ac27442-8be0-4985-b48f-3328a86a6686>/md0009059.gro
> > >>> 0485c093-11ca-4829-b705-e259668ebd8c on vol_home-client-1 and
> > >>> e83a492b-7f8c-4b32-a76e-343f984142fe on vol_home-client-0
> > >>> [2015-07-27 09:00:23.001121] W [MSGID: 108008]
> > >>> [afr-read-txn.c:241:afr_read_txn] 0-vol_home-replicate-0: Unreadable
> > >>> subvolume -1 found with event generation 2. (Possible split-brain)
> > >>> [2015-07-27 09:00:26.231262] E
> > >>> [afr-self-heal-entry.c:249:afr_selfheal_detect_gfid_and_type_mismatch]
> > >>> 0-vol_home-replicate-0: Gfid mismatch detected for
> > >>> <2ac27442-8be0-4985-b48f-3328a86a6686/md0012588.gro>,
> > >>> 9c635868-054b-4a13-b974-0ba562991586 on vol_home-client-1 and
> > >>> 1943175c-b336-4b33-aa1c-74a1c51f17b9 on vol_home-client-0. Skipping
> > >>> conservative merge on the file.
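The "Gfid mismatch detected" entries in the extracts above repeat for the same files, so it can help to reduce them to one record per affected file. Below is a minimal sketch that assumes the message wording is exactly as it appears in these extracts; the regular expression and the function name `find_gfid_mismatches` are illustrative, not part of any Gluster tool. Whitespace is collapsed first so that it also works on text that a mail client has wrapped (with any quote prefixes removed).

```python
import re

# Pattern derived only from the log lines quoted above (an assumption
# about the message format, not an official specification).
MISMATCH_RE = re.compile(
    r"Gfid mismatch detected for "
    r"<(?P<parent>[0-9a-f-]{36})/(?P<name>[^>]+)>, "
    r"(?P<gfid_a>[0-9a-f-]{36}) on (?P<client_a>[\w-]+) and "
    r"(?P<gfid_b>[0-9a-f-]{36}) on (?P<client_b>[\w-]+)"
)


def find_gfid_mismatches(log_text):
    """Return unique (parent, name, gfid_a, client_a, gfid_b, client_b) tuples.

    Order of first appearance is preserved; repeated entries are dropped.
    """
    flat = " ".join(log_text.split())  # undo line wrapping
    records = []
    for m in MISMATCH_RE.finditer(flat):
        records.append((m.group("parent"), m.group("name"),
                        m.group("gfid_a"), m.group("client_a"),
                        m.group("gfid_b"), m.group("client_b")))
    return list(dict.fromkeys(records))


# One entry copied from the extract above, wrapped the way the mail shows it.
SAMPLE = """[2015-07-26 22:35:16.965896] E
[afr-self-heal-entry.c:249:afr_selfheal_detect_gfid_and_type_mismatch]
0-vol_home-replicate-0: Gfid mismatch detected for
<865083fa-984e-44bd-aacf-b8195789d9e0/job.pbs>,
e944d444-66c5-40a4-9603-7c190ad86013 on vol_home-client-1 and
820f9bcc-a0f6-40e0-bcec-28a76b4195ea on vol_home-client-0. Skipping
conservative merge on the file.
"""

if __name__ == "__main__":
    for record in find_gfid_mismatches(SAMPLE * 2):
        print(record)
```

The resulting list gives, per file, the parent directory GFID, the file name, and the GFID each replica holds, which is the information one needs when deciding by hand which copy to keep.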
> > >>>
> > >>> And, above all, when browsing folders I get a lot of input/output errors.
> > >>>
> > >>> Currently I have 6.2M inodes and roughly 30TB in my "new" volume.
> > >>>
> > >>> For the moment, quota is disabled to improve I/O performance during
> > >>> the back-transfer.
> > >>>
> > >>> You can also find attached:
> > >>> - an "ls" result
> > >>> - a split-brain search result
> > >>> - the volume information and status
> > >>> - a complete volume heal info
> > >>>
> > >>> I hope this helps you help me fix all my problems and resume
> > >>> production computing.
> > >>>
> > >>> Thanks in advance,
> > >>> Geoffrey
> > >>>
> > >>> PS: « Erreur d’Entrée/Sortie » = « Input / Output Error »
> > >>>
> > >>> <ls_example.txt>
> > >>> <split_brain__20150725.txt>
> > >>> <vol_home_healinfo.txt>
> > >>> <vol_home_info.txt>
> > >>> <vol_home_status.txt>
> > >>
> >