[Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting

Avra Sengupta asengupt at redhat.com
Mon Jun 27 07:23:41 UTC 2016


On 06/27/2016 12:04 PM, Avra Sengupta wrote:
> On 06/25/2016 01:19 AM, Vijay Bellur wrote:
>> On 06/24/2016 02:12 PM, Alastair Neil wrote:
>>> I upgraded my fedora 23 system to f24 a couple of days ago, now I am
>>> unable to mount my gluster cluster.
>>>
>>> The update installed:
>>>
>>> glusterfs-3.8.0-1.fc24.x86_64
>>> glusterfs-libs-3.8.0-1.fc24.x86_64
>>> glusterfs-fuse-3.8.0-1.fc24.x86_64
>>> glusterfs-client-xlators-3.8.0-1.fc24.x86_64
>>>
>>> the gluster is running 3.7.11
>>>
>>> The volume is replica 3
>>>
>>> I see these errors in the mount log:
>>>
>>>     [2016-06-24 17:55:34.016462] I [MSGID: 100030]
>>>     [glusterfsd.c:2408:main] 0-/usr/sbin/glusterfs: Started running
>>>     /usr/sbin/glusterfs version 3.8.0 (args: /usr/sbin/glusterfs
>>>     --volfile-server=gluster1 --volfile-id=homes /mnt/homes)
>>>     [2016-06-24 17:55:34.094345] I [MSGID: 101190]
>>>     [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>>     thread with index 1
>>>     [2016-06-24 17:55:34.240135] I [MSGID: 101190]
>>>     [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>>     thread with index 2
>>>     [2016-06-24 17:55:34.240130] I [MSGID: 101190]
>>>     [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>>     thread with index 4
>>>     [2016-06-24 17:55:34.240130] I [MSGID: 101190]
>>>     [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>>     thread with index 3
>>>     [2016-06-24 17:55:34.241499] I [MSGID: 114020]
>>>     [client.c:2356:notify] 0-homes-client-2: parent translators are
>>>     ready, attempting connect on transport
>>>     [2016-06-24 17:55:34.249172] I [MSGID: 114020]
>>>     [client.c:2356:notify] 0-homes-client-5: parent translators are
>>>     ready, attempting connect on transport
>>>     [2016-06-24 17:55:34.250186] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
>>>     0-homes-client-2: changing port to 49171 (from 0)
>>>     [2016-06-24 17:55:34.253347] I [MSGID: 114020]
>>>     [client.c:2356:notify] 0-homes-client-6: parent translators are
>>>     ready, attempting connect on transport
>>>     [2016-06-24 17:55:34.254213] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
>>>     0-homes-client-5: changing port to 49154 (from 0)
>>>     [2016-06-24 17:55:34.255115] I [MSGID: 114057]
>>>     [client-handshake.c:1441:select_server_supported_programs]
>>>     0-homes-client-2: Using Program GlusterFS 3.3, Num (1298437),
>>>     Version (330)
>>>     [2016-06-24 17:55:34.255861] W [MSGID: 114007]
>>>     [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-2:
>>>     failed to find key 'child_up' in the options
>>>     [2016-06-24 17:55:34.259097] I [MSGID: 114057]
>>>     [client-handshake.c:1441:select_server_supported_programs]
>>>     0-homes-client-5: Using Program GlusterFS 3.3, Num (1298437),
>>>     Version (330)
>>>     Final graph:
>>> +------------------------------------------------------------------------------+ 
>>>
>>>       1: volume homes-client-2
>>>       2:     type protocol/client
>>>       3:     option clnt-lk-version 1
>>>       4:     option volfile-checksum 0
>>>       5:     option volfile-key homes
>>>       6:     option client-version 3.8.0
>>>       7:     option process-uuid
>>>     Island-29185-2016/06/24-17:55:34:10054-homes-client-2-0-0
>>>       8:     option fops-version 1298437
>>>       9:     option ping-timeout 20
>>>      10:     option remote-host gluster-2
>>>      11:     option remote-subvolume /export/brick2/home
>>>      12:     option transport-type socket
>>>      13:     option event-threads 4
>>>      14:     option send-gids true
>>>      15: end-volume
>>>      16:
>>>      17: volume homes-client-5
>>>      18:     type protocol/client
>>>      19:     option clnt-lk-version 1
>>>      20:     option volfile-checksum 0
>>>      21:     option volfile-key homes
>>>      22:     option client-version 3.8.0
>>>      23:     option process-uuid
>>>     Island-29185-2016/06/24-17:55:34:10054-homes-client-5-0-0
>>>      24:     option fops-version 1298437
>>>      25:     option ping-timeout 20
>>>      26:     option remote-host gluster1.vsnet.gmu.edu
>>>      27:     option remote-subvolume /export/brick2/home
>>>      28:     option transport-type socket
>>>      29:     option event-threads 4
>>>      30:     option send-gids true
>>>      31: end-volume
>>>      32:
>>>      33: volume homes-client-6
>>>      34:     type protocol/client
>>>      35:     option ping-timeout 20
>>>      36:     option remote-host gluster0
>>>      37:     option remote-subvolume /export/brick2/home
>>>      38:     option transport-type socket
>>>      39:     option event-threads 4
>>>      40:     option send-gids true
>>>      41: end-volume
>>>      42:
>>>      43: volume homes-replicate-0
>>>      44:     type cluster/replicate
>>>      45:     option background-self-heal-count 20
>>>      46:     option metadata-self-heal on
>>>      47:     option data-self-heal off
>>>      48:     option entry-self-heal on
>>>      49:     option data-self-heal-window-size 8
>>>      50:     option data-self-heal-algorithm diff
>>>      51:     option eager-lock on
>>>      52:     option quorum-type auto
>>>      53:     option self-heal-readdir-size 64KB
>>>      54:     subvolumes homes-client-2 homes-client-5 homes-client-6
>>>      55: end-volume
>>>      56:
>>>      57: volume homes-dht
>>>      58:     type cluster/distribute
>>>      59:     option min-free-disk 5%
>>>      60:     option rebalance-stats on
>>>      61:     option readdir-optimize on
>>>      62:     subvolumes homes-replicate-0
>>>      63: end-volume
>>>      64:
>>>      65: volume homes-read-ahead
>>>      66:     type performance/read-ahead
>>>      67:     subvolumes homes-dht
>>>      68: end-volume
>>>      69:
>>>      70: volume homes-io-cache
>>>      71:     type performance/io-cache
>>>      72:     subvolumes homes-read-ahead
>>>      73: end-volume
>>>      74:
>>>      75: volume homes-quick-read
>>>      76:     type performance/quick-read
>>>      77:     subvolumes homes-io-cache
>>>      78: end-volume
>>>      79:
>>>      80: volume homes-open-behind
>>>      81:     type performance/open-behind
>>>      82:     subvolumes homes-quick-read
>>>      83: end-volume
>>>      84:
>>>      85: volume homes-md-cache
>>>      86:     type performance/md-cache
>>>      87:     subvolumes homes-open-behind
>>>      88: end-volume
>>>      89:
>>>      90: volume homes
>>>      91:     type debug/io-stats
>>>      92:     option log-level INFO
>>>      93:     option latency-measurement off
>>>      94:     option count-fop-hits on
>>>      95:     subvolumes homes-md-cache
>>>      96: end-volume
>>>      97:
>>>      98: volume meta-autoload
>>>      99:     type meta
>>>     100:     subvolumes homes
>>>     101: end-volume
>>>     102:
>>> +------------------------------------------------------------------------------+ 
>>>
>>>     [2016-06-24 17:55:34.261219] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
>>>     0-homes-client-6: changing port to 49153 (from 0)
>>>     [2016-06-24 17:55:34.266096] I [MSGID: 114057]
>>>     [client-handshake.c:1441:select_server_supported_programs]
>>>     0-homes-client-6: Using Program GlusterFS 3.3, Num (1298437),
>>>     Version (330)
>>>     [2016-06-24 17:55:34.266905] W [MSGID: 114007]
>>>     [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-6:
>>>     failed to find key 'child_up' in the options
>>>     [2016-06-24 17:55:34.273618] W [MSGID: 114007]
>>>     [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-5:
>>>     failed to find key 'child_up' in the options
>>
>>>
>>>
>>>
>>> I checked the release notes for 3.8.0 but I did not see any caveats or
>>> compatibility warnings.
>>>
>>> Anyone else seeing issues with 3.8 clients mounting 3.7 volumes?
>>>
>>
>> Seems like it is due to this commit:
>>
>> commit 2bfdc30e0e7fba6f97d8829b2618a1c5907dc404
>> Author: Avra Sengupta
>> Date:   Mon Feb 29 14:43:58 2016 +0530
>>
>>     protocol client/server: Fix client-server handshake
>>
>> This commit introduced a new check to determine the existence of a 
>> key in the dictionary that gets exchanged between clients and servers 
>> during a handshake. Upon not finding the key, the clients bail out.
>>
>> Avra - would it be possible to avoid a hard check of 'child_up' 
>> during a handshake?
> Yes Vijay. This particular failure occurs because the client expects a
> 'child_up' key from the server during the handshake, so that it can
> determine whether all children on the server are up, rather than only
> that the handshake succeeded. Although this is the ideal behaviour for
> the handshake, it currently breaks backward compatibility with 3.7
> volumes, because those servers do not send the key that the newer
> client expects.
>
> I would prefer not to bypass this check in the client, but rather 
> enforce this check only for connections coming from servers running 3.8.
>
> + Adding Raghavendra Gowdappa
>
> Raghavendra,
>
> Would it be possible to keep this check in the client specific to 
> servers running 3.8 and beyond?
I have raised a bug for this:
https://bugzilla.redhat.com/show_bug.cgi?id=1350326 (3.8)

and I have sent a patch for this in master:
http://review.gluster.org/#/c/14811/1

I will backport it to the 3.8 branch as soon as it is merged in master.
With this patch, the client treats the absence of the 'child_up' key as
an indication that the server it is connecting to is running an older
version, and in that case it sets conf->child_up to _gf_true explicitly.
This should suffice to emulate the older behavior.
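
To illustrate the fallback described above, here is a minimal sketch in Python. It is not the actual patch (the real change is in the C client translator); the names ClientConf and handle_setvolume_reply, the plain dict standing in for the handshake dictionary, and the strict flag are all hypothetical and only model the behaviour described in this thread.

```python
from typing import Mapping

# Key the 3.8 client looks for in the handshake reply (name taken from the log).
CHILD_UP_KEY = "child_up"


class ClientConf:
    """Hypothetical stand-in for the client's per-connection state."""

    def __init__(self) -> None:
        self.child_up = False


def handle_setvolume_reply(conf: ClientConf,
                           reply_options: Mapping[str, object],
                           strict: bool = False) -> bool:
    """Return True if the handshake may proceed.

    strict=True models the 3.8.0 behaviour described in this thread
    (bail out when the key is missing). strict=False models the proposed
    fallback: a missing key is taken to mean an older server, and
    child_up is set to True explicitly.
    """
    if CHILD_UP_KEY not in reply_options:
        if strict:
            # Hard check: the client gives up on the connection.
            return False
        # Assumed older server that never sends the key.
        conf.child_up = True
        return True

    # Newer server: honour whatever value it reported.
    conf.child_up = bool(reply_options[CHILD_UP_KEY])
    return True


if __name__ == "__main__":
    conf = ClientConf()
    ok = handle_setvolume_reply(conf, {})  # reply from a 3.7-style server
    print(ok, conf.child_up)
```

With the fallback, a reply that lacks the key no longer fails the handshake, while a reply that carries the key is still honoured as before.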
>>
>> Note that if servers are upgraded ahead of the clients, this problem 
>> should not be seen.
>>
>> Thanks,
>> Vijay
>>
>>
>


