[Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting

Raghavendra Gowdappa rgowdapp at redhat.com
Mon Jun 27 09:05:06 UTC 2016



----- Original Message -----
> From: "Avra Sengupta" <asengupt at redhat.com>
> To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> Cc: "Vijay Bellur" <vbellur at redhat.com>, "Alastair Neil" <ajneil.tech at gmail.com>, "gluster-users"
> <gluster-users at gluster.org>, "Niels de Vos" <ndevos at redhat.com>
> Sent: Monday, June 27, 2016 2:31:20 PM
> Subject: Re: [Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting
> 
> On 06/27/2016 01:08 PM, Raghavendra Gowdappa wrote:
> >
> > ----- Original Message -----
> >> From: "Avra Sengupta" <asengupt at redhat.com>
> >> To: "Vijay Bellur" <vbellur at redhat.com>, "Alastair Neil"
> >> <ajneil.tech at gmail.com>, "gluster-users"
> >> <gluster-users at gluster.org>, "Niels de Vos" <ndevos at redhat.com>,
> >> "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> >> Sent: Monday, June 27, 2016 12:53:41 PM
> >> Subject: Re: [Gluster-users] Fedora upgrade to f24 installed 3.8.0 client
> >> and broke mounting
> >>
> >> On 06/27/2016 12:04 PM, Avra Sengupta wrote:
> >>> On 06/25/2016 01:19 AM, Vijay Bellur wrote:
> >>>> On 06/24/2016 02:12 PM, Alastair Neil wrote:
> >>>>> I upgraded my fedora 23 system to f24 a couple of days ago, now I am
> >>>>> unable to mount my gluster cluster.
> >>>>>
> >>>>> The update installed:
> >>>>>
> >>>>> glusterfs-3.8.0-1.fc24.x86_64
> >>>>> glusterfs-libs-3.8.0-1.fc24.x86_64
> >>>>> glusterfs-fuse-3.8.0-1.fc24.x86_64
> >>>>> glusterfs-client-xlators-3.8.0-1.fc24.x86_64
> >>>>>
> >>>>> the gluster is running 3.7.11
> >>>>>
> >>>>> The volume is replica 3
> >>>>>
> >>>>> I see these errors in the mount log:
> >>>>>
> >>>>>      [2016-06-24 17:55:34.016462] I [MSGID: 100030]
> >>>>>      [glusterfsd.c:2408:main] 0-/usr/sbin/glusterfs: Started running
> >>>>>      /usr/sbin/glusterfs version 3.8.0 (args: /usr/sbin/glusterfs
> >>>>>      --volfile-server=gluster1 --volfile-id=homes /mnt/homes)
> >>>>>      [2016-06-24 17:55:34.094345] I [MSGID: 101190]
> >>>>>      [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
> >>>>>      thread with index 1
> >>>>>      [2016-06-24 17:55:34.240135] I [MSGID: 101190]
> >>>>>      [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
> >>>>>      thread with index 2
> >>>>>      [2016-06-24 17:55:34.240130] I [MSGID: 101190]
> >>>>>      [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
> >>>>>      thread with index 4
> >>>>>      [2016-06-24 17:55:34.240130] I [MSGID: 101190]
> >>>>>      [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
> >>>>>      thread with index 3
> >>>>>      [2016-06-24 17:55:34.241499] I [MSGID: 114020]
> >>>>>      [client.c:2356:notify] 0-homes-client-2: parent translators are
> >>>>>      ready, attempting connect on transport
> >>>>>      [2016-06-24 17:55:34.249172] I [MSGID: 114020]
> >>>>>      [client.c:2356:notify] 0-homes-client-5: parent translators are
> >>>>>      ready, attempting connect on transport
> >>>>>      [2016-06-24 17:55:34.250186] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
> >>>>>      0-homes-client-2: changing port to 49171 (from 0)
> >>>>>      [2016-06-24 17:55:34.253347] I [MSGID: 114020]
> >>>>>      [client.c:2356:notify] 0-homes-client-6: parent translators are
> >>>>>      ready, attempting connect on transport
> >>>>>      [2016-06-24 17:55:34.254213] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
> >>>>>      0-homes-client-5: changing port to 49154 (from 0)
> >>>>>      [2016-06-24 17:55:34.255115] I [MSGID: 114057]
> >>>>>      [client-handshake.c:1441:select_server_supported_programs]
> >>>>>      0-homes-client-2: Using Program GlusterFS 3.3, Num (1298437),
> >>>>>      Version (330)
> >>>>>      [2016-06-24 17:55:34.255861] W [MSGID: 114007]
> >>>>>      [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-2:
> >>>>>      failed to find key 'child_up' in the options
> >>>>>      [2016-06-24 17:55:34.259097] I [MSGID: 114057]
> >>>>>      [client-handshake.c:1441:select_server_supported_programs]
> >>>>>      0-homes-client-5: Using Program GlusterFS 3.3, Num (1298437),
> >>>>>      Version (330)
> >>>>>      Final graph:
> >>>>> +------------------------------------------------------------------------------+
> >>>>>
> >>>>>        1: volume homes-client-2
> >>>>>        2:     type protocol/client
> >>>>>        3:     option clnt-lk-version 1
> >>>>>        4:     option volfile-checksum 0
> >>>>>        5:     option volfile-key homes
> >>>>>        6:     option client-version 3.8.0
> >>>>>        7:     option process-uuid Island-29185-2016/06/24-17:55:34:10054-homes-client-2-0-0
> >>>>>        8:     option fops-version 1298437
> >>>>>        9:     option ping-timeout 20
> >>>>>       10:     option remote-host gluster-2
> >>>>>       11:     option remote-subvolume /export/brick2/home
> >>>>>       12:     option transport-type socket
> >>>>>       13:     option event-threads 4
> >>>>>       14:     option send-gids true
> >>>>>       15: end-volume
> >>>>>       16:
> >>>>>       17: volume homes-client-5
> >>>>>       18:     type protocol/client
> >>>>>       19:     option clnt-lk-version 1
> >>>>>       20:     option volfile-checksum 0
> >>>>>       21:     option volfile-key homes
> >>>>>       22:     option client-version 3.8.0
> >>>>>       23:     option process-uuid Island-29185-2016/06/24-17:55:34:10054-homes-client-5-0-0
> >>>>>       24:     option fops-version 1298437
> >>>>>       25:     option ping-timeout 20
> >>>>>       26:     option remote-host gluster1.vsnet.gmu.edu
> >>>>>       27:     option remote-subvolume /export/brick2/home
> >>>>>       28:     option transport-type socket
> >>>>>       29:     option event-threads 4
> >>>>>       30:     option send-gids true
> >>>>>       31: end-volume
> >>>>>       32:
> >>>>>       33: volume homes-client-6
> >>>>>       34:     type protocol/client
> >>>>>       35:     option ping-timeout 20
> >>>>>       36:     option remote-host gluster0
> >>>>>       37:     option remote-subvolume /export/brick2/home
> >>>>>       38:     option transport-type socket
> >>>>>       39:     option event-threads 4
> >>>>>       40:     option send-gids true
> >>>>>       41: end-volume
> >>>>>       42:
> >>>>>       43: volume homes-replicate-0
> >>>>>       44:     type cluster/replicate
> >>>>>       45:     option background-self-heal-count 20
> >>>>>       46:     option metadata-self-heal on
> >>>>>       47:     option data-self-heal off
> >>>>>       48:     option entry-self-heal on
> >>>>>       49:     option data-self-heal-window-size 8
> >>>>>       50:     option data-self-heal-algorithm diff
> >>>>>       51:     option eager-lock on
> >>>>>       52:     option quorum-type auto
> >>>>>       53:     option self-heal-readdir-size 64KB
> >>>>>       54:     subvolumes homes-client-2 homes-client-5 homes-client-6
> >>>>>       55: end-volume
> >>>>>       56:
> >>>>>       57: volume homes-dht
> >>>>>       58:     type cluster/distribute
> >>>>>       59:     option min-free-disk 5%
> >>>>>       60:     option rebalance-stats on
> >>>>>       61:     option readdir-optimize on
> >>>>>       62:     subvolumes homes-replicate-0
> >>>>>       63: end-volume
> >>>>>       64:
> >>>>>       65: volume homes-read-ahead
> >>>>>       66:     type performance/read-ahead
> >>>>>       67:     subvolumes homes-dht
> >>>>>       68: end-volume
> >>>>>       69:
> >>>>>       70: volume homes-io-cache
> >>>>>       71:     type performance/io-cache
> >>>>>       72:     subvolumes homes-read-ahead
> >>>>>       73: end-volume
> >>>>>       74:
> >>>>>       75: volume homes-quick-read
> >>>>>       76:     type performance/quick-read
> >>>>>       77:     subvolumes homes-io-cache
> >>>>>       78: end-volume
> >>>>>       79:
> >>>>>       80: volume homes-open-behind
> >>>>>       81:     type performance/open-behind
> >>>>>       82:     subvolumes homes-quick-read
> >>>>>       83: end-volume
> >>>>>       84:
> >>>>>       85: volume homes-md-cache
> >>>>>       86:     type performance/md-cache
> >>>>>       87:     subvolumes homes-open-behind
> >>>>>       88: end-volume
> >>>>>       89:
> >>>>>       90: volume homes
> >>>>>       91:     type debug/io-stats
> >>>>>       92:     option log-level INFO
> >>>>>       93:     option latency-measurement off
> >>>>>       94:     option count-fop-hits on
> >>>>>       95:     subvolumes homes-md-cache
> >>>>>       96: end-volume
> >>>>>       97:
> >>>>>       98: volume meta-autoload
> >>>>>       99:     type meta
> >>>>>      100:     subvolumes homes
> >>>>>      101: end-volume
> >>>>>      102:
> >>>>> +------------------------------------------------------------------------------+
> >>>>>
> >>>>>      [2016-06-24 17:55:34.261219] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
> >>>>>      0-homes-client-6: changing port to 49153 (from 0)
> >>>>>      [2016-06-24 17:55:34.266096] I [MSGID: 114057]
> >>>>>      [client-handshake.c:1441:select_server_supported_programs]
> >>>>>      0-homes-client-6: Using Program GlusterFS 3.3, Num (1298437),
> >>>>>      Version (330)
> >>>>>      [2016-06-24 17:55:34.266905] W [MSGID: 114007]
> >>>>>      [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-6:
> >>>>>      failed to find key 'child_up' in the options
> >>>>>      [2016-06-24 17:55:34.273618] W [MSGID: 114007]
> >>>>>      [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-5:
> >>>>>      failed to find key 'child_up' in the options
> >>>>>
> >>>>>
> >>>>> I checked the release notes for 3.8.0 but I did not see any caveats or
> >>>>> compatibility warnings.
> >>>>>
> >>>>> Anyone else seeing issues with 3.8 clients mounting 3.7 volumes?
> >>>>>
> >>>> Seems like it is due to this commit:
> >>>>
> >>>> commit 2bfdc30e0e7fba6f97d8829b2618a1c5907dc404
> >>>> Author: Avra Sengupta
> >>>> Date:   Mon Feb 29 14:43:58 2016 +0530
> >>>>
> >>>>      protocol client/server: Fix client-server handshake
> >>>>
> >>>> This commit introduced a new check to determine the existence of a
> >>>> key in the dictionary that gets exchanged between clients and servers
> >>>> during a handshake. Upon not finding the key, the clients bail out.
> >>>>
> >>>> Avra - would it be possible to avoid a hard check of 'child_up'
> >>>> during a handshake?
> >>> Yes Vijay, This particular failure is because the client is expecting
> >>> a 'child_up' from the server during a handshake, to determine if all
> >>> children in the server are up and it's not just a handshake. Although
> >>> this is the ideal behaviour in which the handshake should work, it is
> >>> currently breaking backward compatibility with 3.7 volumes, as those
> >>> servers are not sending the appropriate key which the newer client is
> >>> expecting.
> >>>
> >>> I would prefer not to bypass this check in the client, but rather
> >>> enforce this check only for connections coming from servers running 3.8.
> >>>
> >>> + Adding Raghavendra Gowdappa
> >>>
> >>> Raghavendra,
> >>>
> >>> Would it be possible to keep this check in the client specific to
> >>> servers running 3.8 and beyond?
> >> I have raised a bug for this :
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1350326 (3.8)
> >>
> >> and I have sent a patch for this in master :
> >> http://review.gluster.org/#/c/14811/1
> > This approach fixes the current issue. Is there any reason for propagating
> > CHILD_UP from server to client? Couldn't this be abstracted in server
> > itself, i.e., fail all setvolume requests on brick till protocol/server on
> > brick has received a CHILD_UP (with an optional error being sent for cause
> > of failure). That way we could've fixed the original issue of clients
> > connecting when the xlator stack on brick is not up yet even for older
> > clients and newer server too.
> The reason for doing so is that when the client has connected to the
> brick and failed because the brick translators are not ready to serve
> it, the client polls again after some time to check whether the status
> has changed. So even though the brick might be ready to serve right
> after it failed the setvolume request, it has to wait until the client
> sends the request again. This causes a delay and is very easily
> reproducible.
> 
> To prevent this we don't fail the rpc connection and handshake even
> though the brick is not ready to serve. We know at this point that the
> connection is made but the children are not up. Once the protocol server
> receives a child up it immediately propagates it to the client, thus
> causing no delay. This also separates two different aspects: the
> client-server connection and the translators being ready. Just because
> the translators aren't ready, we should ideally not be purging an
> otherwise successful client-server connection/handshake.

Sounds fine. I've acked the patch over gerrit.
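
For anyone following the trade-off discussed in the quoted text above, here is a toy model of the delay argument. It is only a sketch under stated assumptions: the function names and the fixed retry interval are hypothetical, and nothing in it is taken from the GlusterFS sources. It compares when a client can start using a brick if the brick rejects setvolume until it is ready (the client has to retry) against when it can if the handshake succeeds at once and the brick pushes CHILD_UP later.

```c
#include <assert.h>

/* Hypothetical model. All times are whole seconds counted from the client's
 * first connection attempt. The names below are illustrative only. */

/* Design A: the brick rejects setvolume until its translators are up, and the
 * client retries every retry_interval seconds (attempts at 0, i, 2i, ...).
 * Returns the time of the first attempt made at or after brick_ready_at. */
static long usable_at_with_retry(long brick_ready_at, long retry_interval)
{
    assert(brick_ready_at >= 0 && retry_interval > 0);
    /* Round brick_ready_at up to the next multiple of retry_interval. */
    long attempts_needed =
        (brick_ready_at + retry_interval - 1) / retry_interval;
    return attempts_needed * retry_interval;
}

/* Design B: the handshake succeeds immediately and the brick notifies the
 * client as soon as it becomes ready, so no retry wait is added. */
static long usable_at_with_notify(long brick_ready_at)
{
    assert(brick_ready_at >= 0);
    return brick_ready_at;
}
```

In this model, a brick that becomes ready one second after a rejected attempt is usable at second 1 under design B, but only at the next retry under design A. The real reconnect interval is not stated in this thread.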

> >
> >> I will backport it to 3.8 branch as soon as it is merged in master. With
> >> this patch we are treating the absence of the said key as an indication
> >> that the server this client is connecting to is running an older
> >> version, and hence in such a case we are setting conf->child_up as
> >> _gf_true explicitly. This should suffice in emulating the older behavior.
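
The fallback described in the quoted paragraph above can be sketched roughly as follows. This is a minimal illustration, not the actual patch: `struct kv`, `struct client_conf`, `reply_lookup` and `apply_child_up` are hypothetical stand-ins for the real dictionary and client configuration types, and the "1"/"0" value encoding is an assumption. Only the key name 'child_up' and the idea of setting child_up to true when the key is absent come from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of the handshake reply dictionary. */
struct kv {
    const char *key;
    const char *value;
};

/* Hypothetical stand-in for the client's per-connection configuration. */
struct client_conf {
    bool child_up;
};

/* Return the value stored under key, or NULL if the key is absent. */
static const char *reply_lookup(const struct kv *reply, size_t n,
                                const char *key)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(reply[i].key, key) == 0)
            return reply[i].value;
    return NULL;
}

/* Apply the 'child_up' key from a setvolume reply to the client state. */
static void apply_child_up(struct client_conf *conf,
                           const struct kv *reply, size_t n)
{
    assert(conf != NULL);
    const char *value = reply_lookup(reply, n, "child_up");
    if (value == NULL) {
        /* Key absent: assume an older server that never sends it, and
         * emulate the older behaviour by treating the brick as up. */
        conf->child_up = true;
        return;
    }
    /* Assumed encoding: "1" means the brick's children are up. */
    conf->child_up = (strcmp(value, "1") == 0);
}
```

With this shape, a reply from an older server that lacks the key leaves the client in the "up" state instead of bailing out, while a newer server can still report that its children are not yet up.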
> >>>> Note that if servers are upgraded ahead of the clients, this problem
> >>>> should not be seen.
> >>>>
> >>>> Thanks,
> >>>> Vijay
> >>>>
> >>>>
> >>
> 
> 

