[Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting

Avra Sengupta asengupt at redhat.com
Mon Jun 27 09:01:20 UTC 2016


On 06/27/2016 01:08 PM, Raghavendra Gowdappa wrote:
>
> ----- Original Message -----
>> From: "Avra Sengupta" <asengupt at redhat.com>
>> To: "Vijay Bellur" <vbellur at redhat.com>, "Alastair Neil" <ajneil.tech at gmail.com>, "gluster-users"
>> <gluster-users at gluster.org>, "Niels de Vos" <ndevos at redhat.com>, "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>> Sent: Monday, June 27, 2016 12:53:41 PM
>> Subject: Re: [Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting
>>
>> On 06/27/2016 12:04 PM, Avra Sengupta wrote:
>>> On 06/25/2016 01:19 AM, Vijay Bellur wrote:
>>>> On 06/24/2016 02:12 PM, Alastair Neil wrote:
>>>>> I upgraded my fedora 23 system to f24 a couple of days ago, now I am
>>>>> unable to mount my gluster cluster.
>>>>>
>>>>> The update installed:
>>>>>
>>>>> glusterfs-3.8.0-1.fc24.x86_64
>>>>> glusterfs-libs-3.8.0-1.fc24.x86_64
>>>>> glusterfs-fuse-3.8.0-1.fc24.x86_64
>>>>> glusterfs-client-xlators-3.8.0-1.fc24.x86_64
>>>>>
>>>>> the gluster is running 3.7.11
>>>>>
>>>>> The volume is replica 3
>>>>>
>>>>> I see these errors in the mount log:
>>>>>
>>>>>      [2016-06-24 17:55:34.016462] I [MSGID: 100030]
>>>>>      [glusterfsd.c:2408:main] 0-/usr/sbin/glusterfs: Started running
>>>>>      /usr/sbin/glusterfs version 3.8.0 (args: /usr/sbin/glusterfs
>>>>>      --volfile-server=gluster1 --volfile-id=homes /mnt/homes)
>>>>>      [2016-06-24 17:55:34.094345] I [MSGID: 101190]
>>>>>      [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>>>>      thread with index 1
>>>>>      [2016-06-24 17:55:34.240135] I [MSGID: 101190]
>>>>>      [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>>>>      thread with index 2
>>>>>      [2016-06-24 17:55:34.240130] I [MSGID: 101190]
>>>>>      [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>>>>      thread with index 4
>>>>>      [2016-06-24 17:55:34.240130] I [MSGID: 101190]
>>>>>      [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>>>>      thread with index 3
>>>>>      [2016-06-24 17:55:34.241499] I [MSGID: 114020]
>>>>>      [client.c:2356:notify] 0-homes-client-2: parent translators are
>>>>>      ready, attempting connect on transport
>>>>>      [2016-06-24 17:55:34.249172] I [MSGID: 114020]
>>>>>      [client.c:2356:notify] 0-homes-client-5: parent translators are
>>>>>      ready, attempting connect on transport
>>>>>      [2016-06-24 17:55:34.250186] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
>>>>>      0-homes-client-2: changing port to 49171 (from 0)
>>>>>      [2016-06-24 17:55:34.253347] I [MSGID: 114020]
>>>>>      [client.c:2356:notify] 0-homes-client-6: parent translators are
>>>>>      ready, attempting connect on transport
>>>>>      [2016-06-24 17:55:34.254213] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
>>>>>      0-homes-client-5: changing port to 49154 (from 0)
>>>>>      [2016-06-24 17:55:34.255115] I [MSGID: 114057]
>>>>>      [client-handshake.c:1441:select_server_supported_programs]
>>>>>      0-homes-client-2: Using Program GlusterFS 3.3, Num (1298437),
>>>>>      Version (330)
>>>>>      [2016-06-24 17:55:34.255861] W [MSGID: 114007]
>>>>>      [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-2:
>>>>>      failed to find key 'child_up' in the options
>>>>>      [2016-06-24 17:55:34.259097] I [MSGID: 114057]
>>>>>      [client-handshake.c:1441:select_server_supported_programs]
>>>>>      0-homes-client-5: Using Program GlusterFS 3.3, Num (1298437),
>>>>>      Version (330)
>>>>>      Final graph:
>>>>> +------------------------------------------------------------------------------+
>>>>>
>>>>>        1: volume homes-client-2
>>>>>        2:     type protocol/client
>>>>>        3:     option clnt-lk-version 1
>>>>>        4:     option volfile-checksum 0
>>>>>        5:     option volfile-key homes
>>>>>        6:     option client-version 3.8.0
>>>>>        7:     option process-uuid
>>>>>      Island-29185-2016/06/24-17:55:34:10054-homes-client-2-0-0
>>>>>        8:     option fops-version 1298437
>>>>>        9:     option ping-timeout 20
>>>>>       10:     option remote-host gluster-2
>>>>>       11:     option remote-subvolume /export/brick2/home
>>>>>       12:     option transport-type socket
>>>>>       13:     option event-threads 4
>>>>>       14:     option send-gids true
>>>>>       15: end-volume
>>>>>       16:
>>>>>       17: volume homes-client-5
>>>>>       18:     type protocol/client
>>>>>       19:     option clnt-lk-version 1
>>>>>       20:     option volfile-checksum 0
>>>>>       21:     option volfile-key homes
>>>>>       22:     option client-version 3.8.0
>>>>>       23:     option process-uuid
>>>>>      Island-29185-2016/06/24-17:55:34:10054-homes-client-5-0-0
>>>>>       24:     option fops-version 1298437
>>>>>       25:     option ping-timeout 20
>>>>>       26:     option remote-host gluster1.vsnet.gmu.edu
>>>>>       27:     option remote-subvolume /export/brick2/home
>>>>>       28:     option transport-type socket
>>>>>       29:     option event-threads 4
>>>>>       30:     option send-gids true
>>>>>       31: end-volume
>>>>>       32:
>>>>>       33: volume homes-client-6
>>>>>       34:     type protocol/client
>>>>>       35:     option ping-timeout 20
>>>>>       36:     option remote-host gluster0
>>>>>       37:     option remote-subvolume /export/brick2/home
>>>>>       38:     option transport-type socket
>>>>>       39:     option event-threads 4
>>>>>       40:     option send-gids true
>>>>>       41: end-volume
>>>>>       42:
>>>>>       43: volume homes-replicate-0
>>>>>       44:     type cluster/replicate
>>>>>       45:     option background-self-heal-count 20
>>>>>       46:     option metadata-self-heal on
>>>>>       47:     option data-self-heal off
>>>>>       48:     option entry-self-heal on
>>>>>       49:     option data-self-heal-window-size 8
>>>>>       50:     option data-self-heal-algorithm diff
>>>>>       51:     option eager-lock on
>>>>>       52:     option quorum-type auto
>>>>>       53:     option self-heal-readdir-size 64KB
>>>>>       54:     subvolumes homes-client-2 homes-client-5 homes-client-6
>>>>>       55: end-volume
>>>>>       56:
>>>>>       57: volume homes-dht
>>>>>       58:     type cluster/distribute
>>>>>       59:     option min-free-disk 5%
>>>>>       60:     option rebalance-stats on
>>>>>       61:     option readdir-optimize on
>>>>>       62:     subvolumes homes-replicate-0
>>>>>       63: end-volume
>>>>>       64:
>>>>>       65: volume homes-read-ahead
>>>>>       66:     type performance/read-ahead
>>>>>       67:     subvolumes homes-dht
>>>>>       68: end-volume
>>>>>       69:
>>>>>       70: volume homes-io-cache
>>>>>       71:     type performance/io-cache
>>>>>       72:     subvolumes homes-read-ahead
>>>>>       73: end-volume
>>>>>       74:
>>>>>       75: volume homes-quick-read
>>>>>       76:     type performance/quick-read
>>>>>       77:     subvolumes homes-io-cache
>>>>>       78: end-volume
>>>>>       79:
>>>>>       80: volume homes-open-behind
>>>>>       81:     type performance/open-behind
>>>>>       82:     subvolumes homes-quick-read
>>>>>       83: end-volume
>>>>>       84:
>>>>>       85: volume homes-md-cache
>>>>>       86:     type performance/md-cache
>>>>>       87:     subvolumes homes-open-behind
>>>>>       88: end-volume
>>>>>       89:
>>>>>       90: volume homes
>>>>>       91:     type debug/io-stats
>>>>>       92:     option log-level INFO
>>>>>       93:     option latency-measurement off
>>>>>       94:     option count-fop-hits on
>>>>>       95:     subvolumes homes-md-cache
>>>>>       96: end-volume
>>>>>       97:
>>>>>       98: volume meta-autoload
>>>>>       99:     type meta
>>>>>      100:     subvolumes homes
>>>>>      101: end-volume
>>>>>      102:
>>>>> +------------------------------------------------------------------------------+
>>>>>
>>>>>      [2016-06-24 17:55:34.261219] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
>>>>>      0-homes-client-6: changing port to 49153 (from 0)
>>>>>      [2016-06-24 17:55:34.266096] I [MSGID: 114057]
>>>>>      [client-handshake.c:1441:select_server_supported_programs]
>>>>>      0-homes-client-6: Using Program GlusterFS 3.3, Num (1298437),
>>>>>      Version (330)
>>>>>      [2016-06-24 17:55:34.266905] W [MSGID: 114007]
>>>>>      [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-6:
>>>>>      failed to find key 'child_up' in the options
>>>>>      [2016-06-24 17:55:34.273618] W [MSGID: 114007]
>>>>>      [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-5:
>>>>>      failed to find key 'child_up' in the options
>>>>>
>>>>>
>>>>> I checked the release notes for 3.8.0 but I did not see any caveats or
>>>>> compatibility warnings.
>>>>>
>>>>> Anyone else seeing issues with 3.8 clients mounting 3.7 volumes?
>>>>>
>>>> Seems like it is due to this commit:
>>>>
>>>> commit 2bfdc30e0e7fba6f97d8829b2618a1c5907dc404
>>>> Author: Avra Sengupta
>>>> Date:   Mon Feb 29 14:43:58 2016 +0530
>>>>
>>>>      protocol client/server: Fix client-server handshake
>>>>
>>>> This commit introduced a new check to determine the existence of a
>>>> key in the dictionary that gets exchanged between clients and servers
>>>> during a handshake. Upon not finding the key, the clients bail out.
>>>>
>>>> Avra - would it be possible to avoid a hard check of 'child_up'
>>>> during a handshake?
>>> Yes Vijay, This particular failure is because the client is expecting
>>> a 'child_up' from the server during a handshake, to determine if all
>>> children in the server are up and it's not just a handshake. Although
>>> this is the ideal behaviour in which the handshake should work, it is
>>> currently breaking backward compatibility with 3.7 volumes, as those
>>> servers are not sending the appropriate key which the newer client is
>>> expecting.
>>>
>>> I would prefer not to bypass this check in the client, but rather
>>> enforce this check only for connections coming from servers running 3.8.
>>>
>>> + Adding Raghavendra Gowdappa
>>>
>>> Raghavendra,
>>>
>>> Would it be possible to keep this check in the client specific to
>>> servers running on 3.8 and beyond?
>> I have raised a bug for this :
>> https://bugzilla.redhat.com/show_bug.cgi?id=1350326 (3.8)
>>
>> and I have sent a patch for this in master :
>> http://review.gluster.org/#/c/14811/1
> This approach fixes the current issue. Is there any reason for propagating CHILD_UP from server to client? Couldn't this be abstracted in server itself, i.e., fail all setvolume requests on brick till protocol/server on brick has received a CHILD_UP (with an optional error being sent for cause of failure). That way we could've fixed the original issue of clients connecting when the xlator stack on brick is not up yet even for older clients and newer server too.
The reason for doing so is that when the client connects to the brick
and the setvolume request fails because the brick translators are not
yet ready to serve it, the client polls again after some time to check
whether the status has changed. So even though the brick might be ready
to serve right after it failed the setvolume request, it has to wait
until the client sends the request again. This causes a delay and is
very easily reproducible.

To prevent this, we don't fail the RPC connection and handshake even
though the brick is not ready to serve. We know at this point that the
connection is made but the children are not up. Once protocol/server
receives a CHILD_UP, it immediately propagates it to the client, thus
causing no delay. This also separates two different aspects: the
client-server connection and the translators being ready. Just because
the translators aren't ready, we should ideally not be purging an
otherwise successful client-server connection/handshake.
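
As a rough illustration of the delay argument above, here is a toy model
in Python. It is not GlusterFS code: the function names and the retry
interval are hypothetical, chosen only to compare the two designs.

```python
import math

def ready_time_with_retry(brick_ready_at, retry_interval):
    """Design A (hypothetical model): the server fails setvolume until the
    brick is ready, and the client retries at t = 0, interval, 2*interval...
    Returns the time of the first attempt at or after brick_ready_at."""
    attempts = math.ceil(brick_ready_at / retry_interval)
    return attempts * retry_interval

def ready_time_with_notify(brick_ready_at):
    """Design B (hypothetical model): the handshake succeeds right away and
    the server pushes CHILD_UP as soon as the brick is ready, so the client
    learns about it with no polling delay."""
    return brick_ready_at

# Brick becomes ready 0.5s after the first attempt; assumed 3s retry interval.
print(ready_time_with_retry(0.5, 3))   # client waits for the next retry
print(ready_time_with_notify(0.5))     # client is notified immediately
```

In this model the retry design always rounds the wait up to the next
retry attempt, while the notification design does not.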
>
>> I will backport it to the 3.8 branch as soon as it is merged in master.
>> With this patch we treat the absence of the said key as an indication
>> that the server this client is connecting to is running an older
>> version, and in that case we set conf->child_up to _gf_true
>> explicitly. This should suffice in emulating the older behavior.
>>>> Note that if servers are upgraded ahead of the clients, this problem
>>>> should not be seen.
>>>>
>>>> Thanks,
>>>> Vijay
>>>>
>>>>
>>


