[Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting

Avra Sengupta asengupt at redhat.com
Tue Jun 28 10:17:56 UTC 2016


Hi,

The patch (http://review.gluster.org/#/c/14811/) passed all regressions.
If any of you could merge it, I will backport it to 3.8.

Regards,
Avra

On 06/27/2016 12:04 PM, Avra Sengupta wrote:
> On 06/25/2016 01:19 AM, Vijay Bellur wrote:
>> On 06/24/2016 02:12 PM, Alastair Neil wrote:
>>> I upgraded my Fedora 23 system to F24 a couple of days ago, and now I
>>> am unable to mount my gluster cluster.
>>>
>>> The update installed:
>>>
>>> glusterfs-3.8.0-1.fc24.x86_64
>>> glusterfs-libs-3.8.0-1.fc24.x86_64
>>> glusterfs-fuse-3.8.0-1.fc24.x86_64
>>> glusterfs-client-xlators-3.8.0-1.fc24.x86_64
>>>
>>> The gluster cluster is running 3.7.11
>>>
>>> The volume is replica 3
>>>
>>> I see these errors in the mount log:
>>>
>>>     [2016-06-24 17:55:34.016462] I [MSGID: 100030]
>>>     [glusterfsd.c:2408:main] 0-/usr/sbin/glusterfs: Started running
>>>     /usr/sbin/glusterfs version 3.8.0 (args: /usr/sbin/glusterfs
>>>     --volfile-server=gluster1 --volfile-id=homes /mnt/homes)
>>>     [2016-06-24 17:55:34.094345] I [MSGID: 101190]
>>>     [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>>     thread with index 1
>>>     [2016-06-24 17:55:34.240135] I [MSGID: 101190]
>>>     [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>>     thread with index 2
>>>     [2016-06-24 17:55:34.240130] I [MSGID: 101190]
>>>     [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>>     thread with index 4
>>>     [2016-06-24 17:55:34.240130] I [MSGID: 101190]
>>>     [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>>     thread with index 3
>>>     [2016-06-24 17:55:34.241499] I [MSGID: 114020]
>>>     [client.c:2356:notify] 0-homes-client-2: parent translators are
>>>     ready, attempting connect on transport
>>>     [2016-06-24 17:55:34.249172] I [MSGID: 114020]
>>>     [client.c:2356:notify] 0-homes-client-5: parent translators are
>>>     ready, attempting connect on transport
>>>     [2016-06-24 17:55:34.250186] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
>>>     0-homes-client-2: changing port to 49171 (from 0)
>>>     [2016-06-24 17:55:34.253347] I [MSGID: 114020]
>>>     [client.c:2356:notify] 0-homes-client-6: parent translators are
>>>     ready, attempting connect on transport
>>>     [2016-06-24 17:55:34.254213] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
>>>     0-homes-client-5: changing port to 49154 (from 0)
>>>     [2016-06-24 17:55:34.255115] I [MSGID: 114057]
>>>     [client-handshake.c:1441:select_server_supported_programs]
>>>     0-homes-client-2: Using Program GlusterFS 3.3, Num (1298437),
>>>     Version (330)
>>>     [2016-06-24 17:55:34.255861] W [MSGID: 114007]
>>>     [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-2:
>>>     failed to find key 'child_up' in the options
>>>     [2016-06-24 17:55:34.259097] I [MSGID: 114057]
>>>     [client-handshake.c:1441:select_server_supported_programs]
>>>     0-homes-client-5: Using Program GlusterFS 3.3, Num (1298437),
>>>     Version (330)
>>>     Final graph:
>>> +------------------------------------------------------------------------------+ 
>>>
>>>       1: volume homes-client-2
>>>       2:     type protocol/client
>>>       3:     option clnt-lk-version 1
>>>       4:     option volfile-checksum 0
>>>       5:     option volfile-key homes
>>>       6:     option client-version 3.8.0
>>>       7:     option process-uuid
>>>     Island-29185-2016/06/24-17:55:34:10054-homes-client-2-0-0
>>>       8:     option fops-version 1298437
>>>       9:     option ping-timeout 20
>>>      10:     option remote-host gluster-2
>>>      11:     option remote-subvolume /export/brick2/home
>>>      12:     option transport-type socket
>>>      13:     option event-threads 4
>>>      14:     option send-gids true
>>>      15: end-volume
>>>      16:
>>>      17: volume homes-client-5
>>>      18:     type protocol/client
>>>      19:     option clnt-lk-version 1
>>>      20:     option volfile-checksum 0
>>>      21:     option volfile-key homes
>>>      22:     option client-version 3.8.0
>>>      23:     option process-uuid
>>>     Island-29185-2016/06/24-17:55:34:10054-homes-client-5-0-0
>>>      24:     option fops-version 1298437
>>>      25:     option ping-timeout 20
>>>      26:     option remote-host gluster1.vsnet.gmu.edu
>>>      27:     option remote-subvolume /export/brick2/home
>>>      28:     option transport-type socket
>>>      29:     option event-threads 4
>>>      30:     option send-gids true
>>>      31: end-volume
>>>      32:
>>>      33: volume homes-client-6
>>>      34:     type protocol/client
>>>      35:     option ping-timeout 20
>>>      36:     option remote-host gluster0
>>>      37:     option remote-subvolume /export/brick2/home
>>>      38:     option transport-type socket
>>>      39:     option event-threads 4
>>>      40:     option send-gids true
>>>      41: end-volume
>>>      42:
>>>      43: volume homes-replicate-0
>>>      44:     type cluster/replicate
>>>      45:     option background-self-heal-count 20
>>>      46:     option metadata-self-heal on
>>>      47:     option data-self-heal off
>>>      48:     option entry-self-heal on
>>>      49:     option data-self-heal-window-size 8
>>>      50:     option data-self-heal-algorithm diff
>>>      51:     option eager-lock on
>>>      52:     option quorum-type auto
>>>      53:     option self-heal-readdir-size 64KB
>>>      54:     subvolumes homes-client-2 homes-client-5 homes-client-6
>>>      55: end-volume
>>>      56:
>>>      57: volume homes-dht
>>>      58:     type cluster/distribute
>>>      59:     option min-free-disk 5%
>>>      60:     option rebalance-stats on
>>>      61:     option readdir-optimize on
>>>      62:     subvolumes homes-replicate-0
>>>      63: end-volume
>>>      64:
>>>      65: volume homes-read-ahead
>>>      66:     type performance/read-ahead
>>>      67:     subvolumes homes-dht
>>>      68: end-volume
>>>      69:
>>>      70: volume homes-io-cache
>>>      71:     type performance/io-cache
>>>      72:     subvolumes homes-read-ahead
>>>      73: end-volume
>>>      74:
>>>      75: volume homes-quick-read
>>>      76:     type performance/quick-read
>>>      77:     subvolumes homes-io-cache
>>>      78: end-volume
>>>      79:
>>>      80: volume homes-open-behind
>>>      81:     type performance/open-behind
>>>      82:     subvolumes homes-quick-read
>>>      83: end-volume
>>>      84:
>>>      85: volume homes-md-cache
>>>      86:     type performance/md-cache
>>>      87:     subvolumes homes-open-behind
>>>      88: end-volume
>>>      89:
>>>      90: volume homes
>>>      91:     type debug/io-stats
>>>      92:     option log-level INFO
>>>      93:     option latency-measurement off
>>>      94:     option count-fop-hits on
>>>      95:     subvolumes homes-md-cache
>>>      96: end-volume
>>>      97:
>>>      98: volume meta-autoload
>>>      99:     type meta
>>>     100:     subvolumes homes
>>>     101: end-volume
>>>     102:
>>> +------------------------------------------------------------------------------+ 
>>>
>>>     [2016-06-24 17:55:34.261219] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
>>>     0-homes-client-6: changing port to 49153 (from 0)
>>>     [2016-06-24 17:55:34.266096] I [MSGID: 114057]
>>>     [client-handshake.c:1441:select_server_supported_programs]
>>>     0-homes-client-6: Using Program GlusterFS 3.3, Num (1298437),
>>>     Version (330)
>>>     [2016-06-24 17:55:34.266905] W [MSGID: 114007]
>>>     [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-6:
>>>     failed to find key 'child_up' in the options
>>>     [2016-06-24 17:55:34.273618] W [MSGID: 114007]
>>>     [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-5:
>>>     failed to find key 'child_up' in the options
>>
>>>
>>>
>>>
>>> I checked the release notes for 3.8.0 but I did not see any caveats or
>>> compatibility warnings.
>>>
>>> Anyone else seeing issues with 3.8 clients mounting 3.7 volumes?
>>>
>>
>> Seems like it is due to this commit:
>>
>> commit 2bfdc30e0e7fba6f97d8829b2618a1c5907dc404
>> Author: Avra Sengupta
>> Date:   Mon Feb 29 14:43:58 2016 +0530
>>
>>     protocol client/server: Fix client-server handshake
>>
>> This commit introduced a new check to determine the existence of a 
>> key in the dictionary that gets exchanged between clients and servers 
>> during a handshake. Upon not finding the key, the clients bail out.
>>
>> Avra - would it be possible to avoid a hard check of 'child_up' 
>> during a handshake?
> Yes Vijay. This particular failure occurs because the client expects
> a 'child_up' key from the server during a handshake, to determine
> whether all children in the server are up and that it's not just a
> handshake. Although this is the ideal way for the handshake to work,
> it currently breaks backward compatibility with 3.7 volumes, as those
> servers do not send the key that the newer client expects.
>
> I would prefer not to bypass this check in the client, but rather to
> enforce it only for connections coming from servers running 3.8.
>
> + Adding Raghavendra Gowdappa
>
> Raghavendra,
>
> Would it be possible to keep this check in the client specific to
> servers running 3.8 and beyond?
>>
>> Note that if servers are upgraded ahead of the clients, this problem 
>> should not be seen.
>>
>> Thanks,
>> Vijay
>>
>>
>
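
For readers following the thread, the difference between the hard check and the version-gated check proposed above can be illustrated with a small sketch. This is not the code from the patch under review, and it is not GlusterFS code (which is C). The function names, the CHILD_UP_MIN_VERSION constant, and the idea that the client is handed the server's version as a string are all assumptions made for illustration only.

```python
# Illustrative sketch only; every name here is hypothetical.

# Assumption: 3.8 is the first release whose servers send 'child_up'
# in the handshake reply, as described in the thread.
CHILD_UP_MIN_VERSION = (3, 8)


def parse_version(version):
    """Turn a version string such as '3.7.11' into (3, 7, 11)."""
    return tuple(int(part) for part in version.split("."))


def handshake_hard_check(reply):
    """Hard check: fail whenever 'child_up' is missing or false.

    This mirrors the behaviour reported in the thread, where a 3.8
    client bails out against a 3.7 server that never sends the key.
    """
    return "child_up" in reply and bool(reply["child_up"])


def handshake_version_gated(reply, server_version):
    """Version-gated check: require 'child_up' only from newer servers.

    Assumption: the client can learn the server's version somehow;
    how it would do so is not covered in the thread.
    """
    if parse_version(server_version)[:2] < CHILD_UP_MIN_VERSION:
        # Older servers never send the key, so do not require it.
        return True
    return handshake_hard_check(reply)
```

With the hard check, an empty reply from a 3.7.11 server is rejected, which matches the mount failure reported. With the gated variant, the same reply is accepted, while a 3.8 server that omits the key is still rejected.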


