[Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting

Avra Sengupta asengupt at redhat.com
Tue Jul 5 11:48:26 UTC 2016


The 3.8 patch (http://review.gluster.org/#/c/14810/) has passed all
regressions. Could someone please merge it?

Regards,
Avra

On 06/29/2016 12:45 PM, Avra Sengupta wrote:
> Thanks Jeff for merging the patch.
>
> I have backported it to 3.8 (http://review.gluster.org/#/c/14810). I 
> will notify once the regressions have passed.
>
> Regards,
> Avra
>
> On 06/28/2016 03:47 PM, Avra Sengupta wrote:
>> Hi,
>>
>> The patch (http://review.gluster.org/#/c/14811/) passed all
>> regressions. If any of you could merge it, I would then backport it to 3.8.
>>
>> Regards,
>> Avra
>>
>> On 06/27/2016 12:04 PM, Avra Sengupta wrote:
>>> On 06/25/2016 01:19 AM, Vijay Bellur wrote:
>>>> On 06/24/2016 02:12 PM, Alastair Neil wrote:
>>>>> I upgraded my Fedora 23 system to F24 a couple of days ago, and now I
>>>>> am unable to mount my Gluster cluster.
>>>>>
>>>>> The update installed:
>>>>>
>>>>> glusterfs-3.8.0-1.fc24.x86_64
>>>>> glusterfs-libs-3.8.0-1.fc24.x86_64
>>>>> glusterfs-fuse-3.8.0-1.fc24.x86_64
>>>>> glusterfs-client-xlators-3.8.0-1.fc24.x86_64
>>>>>
>>>>> The Gluster cluster is running 3.7.11.
>>>>>
>>>>> The volume is replica 3
>>>>>
>>>>> I see these errors in the mount log:
>>>>>
>>>>>     [2016-06-24 17:55:34.016462] I [MSGID: 100030]
>>>>>     [glusterfsd.c:2408:main] 0-/usr/sbin/glusterfs: Started running
>>>>>     /usr/sbin/glusterfs version 3.8.0 (args: /usr/sbin/glusterfs
>>>>>     --volfile-server=gluster1 --volfile-id=homes /mnt/homes)
>>>>>     [2016-06-24 17:55:34.094345] I [MSGID: 101190]
>>>>>     [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>>>>     thread with index 1
>>>>>     [2016-06-24 17:55:34.240135] I [MSGID: 101190]
>>>>>     [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>>>>     thread with index 2
>>>>>     [2016-06-24 17:55:34.240130] I [MSGID: 101190]
>>>>>     [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>>>>     thread with index 4
>>>>>     [2016-06-24 17:55:34.240130] I [MSGID: 101190]
>>>>>     [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>>>>     thread with index 3
>>>>>     [2016-06-24 17:55:34.241499] I [MSGID: 114020]
>>>>>     [client.c:2356:notify] 0-homes-client-2: parent translators are
>>>>>     ready, attempting connect on transport
>>>>>     [2016-06-24 17:55:34.249172] I [MSGID: 114020]
>>>>>     [client.c:2356:notify] 0-homes-client-5: parent translators are
>>>>>     ready, attempting connect on transport
>>>>>     [2016-06-24 17:55:34.250186] I
>>>>>     [rpc-clnt.c:1855:rpc_clnt_reconfig]
>>>>>     0-homes-client-2: changing port to 49171 (from 0)
>>>>>     [2016-06-24 17:55:34.253347] I [MSGID: 114020]
>>>>>     [client.c:2356:notify] 0-homes-client-6: parent translators are
>>>>>     ready, attempting connect on transport
>>>>>     [2016-06-24 17:55:34.254213] I
>>>>>     [rpc-clnt.c:1855:rpc_clnt_reconfig]
>>>>>     0-homes-client-5: changing port to 49154 (from 0)
>>>>>     [2016-06-24 17:55:34.255115] I [MSGID: 114057]
>>>>>     [client-handshake.c:1441:select_server_supported_programs]
>>>>>     0-homes-client-2: Using Program GlusterFS 3.3, Num (1298437),
>>>>>     Version (330)
>>>>>     [2016-06-24 17:55:34.255861] W [MSGID: 114007]
>>>>>     [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-2:
>>>>>     failed to find key 'child_up' in the options
>>>>>     [2016-06-24 17:55:34.259097] I [MSGID: 114057]
>>>>>     [client-handshake.c:1441:select_server_supported_programs]
>>>>>     0-homes-client-5: Using Program GlusterFS 3.3, Num (1298437),
>>>>>     Version (330)
>>>>>     Final graph:
>>>>> +------------------------------------------------------------------------------+ 
>>>>>
>>>>>       1: volume homes-client-2
>>>>>       2:     type protocol/client
>>>>>       3:     option clnt-lk-version 1
>>>>>       4:     option volfile-checksum 0
>>>>>       5:     option volfile-key homes
>>>>>       6:     option client-version 3.8.0
>>>>>       7:     option process-uuid Island-29185-2016/06/24-17:55:34:10054-homes-client-2-0-0
>>>>>       8:     option fops-version 1298437
>>>>>       9:     option ping-timeout 20
>>>>>      10:     option remote-host gluster-2
>>>>>      11:     option remote-subvolume /export/brick2/home
>>>>>      12:     option transport-type socket
>>>>>      13:     option event-threads 4
>>>>>      14:     option send-gids true
>>>>>      15: end-volume
>>>>>      16:
>>>>>      17: volume homes-client-5
>>>>>      18:     type protocol/client
>>>>>      19:     option clnt-lk-version 1
>>>>>      20:     option volfile-checksum 0
>>>>>      21:     option volfile-key homes
>>>>>      22:     option client-version 3.8.0
>>>>>      23:     option process-uuid Island-29185-2016/06/24-17:55:34:10054-homes-client-5-0-0
>>>>>      24:     option fops-version 1298437
>>>>>      25:     option ping-timeout 20
>>>>>      26:     option remote-host gluster1.vsnet.gmu.edu
>>>>>      27:     option remote-subvolume /export/brick2/home
>>>>>      28:     option transport-type socket
>>>>>      29:     option event-threads 4
>>>>>      30:     option send-gids true
>>>>>      31: end-volume
>>>>>      32:
>>>>>      33: volume homes-client-6
>>>>>      34:     type protocol/client
>>>>>      35:     option ping-timeout 20
>>>>>      36:     option remote-host gluster0
>>>>>      37:     option remote-subvolume /export/brick2/home
>>>>>      38:     option transport-type socket
>>>>>      39:     option event-threads 4
>>>>>      40:     option send-gids true
>>>>>      41: end-volume
>>>>>      42:
>>>>>      43: volume homes-replicate-0
>>>>>      44:     type cluster/replicate
>>>>>      45:     option background-self-heal-count 20
>>>>>      46:     option metadata-self-heal on
>>>>>      47:     option data-self-heal off
>>>>>      48:     option entry-self-heal on
>>>>>      49:     option data-self-heal-window-size 8
>>>>>      50:     option data-self-heal-algorithm diff
>>>>>      51:     option eager-lock on
>>>>>      52:     option quorum-type auto
>>>>>      53:     option self-heal-readdir-size 64KB
>>>>>      54:     subvolumes homes-client-2 homes-client-5 homes-client-6
>>>>>      55: end-volume
>>>>>      56:
>>>>>      57: volume homes-dht
>>>>>      58:     type cluster/distribute
>>>>>      59:     option min-free-disk 5%
>>>>>      60:     option rebalance-stats on
>>>>>      61:     option readdir-optimize on
>>>>>      62:     subvolumes homes-replicate-0
>>>>>      63: end-volume
>>>>>      64:
>>>>>      65: volume homes-read-ahead
>>>>>      66:     type performance/read-ahead
>>>>>      67:     subvolumes homes-dht
>>>>>      68: end-volume
>>>>>      69:
>>>>>      70: volume homes-io-cache
>>>>>      71:     type performance/io-cache
>>>>>      72:     subvolumes homes-read-ahead
>>>>>      73: end-volume
>>>>>      74:
>>>>>      75: volume homes-quick-read
>>>>>      76:     type performance/quick-read
>>>>>      77:     subvolumes homes-io-cache
>>>>>      78: end-volume
>>>>>      79:
>>>>>      80: volume homes-open-behind
>>>>>      81:     type performance/open-behind
>>>>>      82:     subvolumes homes-quick-read
>>>>>      83: end-volume
>>>>>      84:
>>>>>      85: volume homes-md-cache
>>>>>      86:     type performance/md-cache
>>>>>      87:     subvolumes homes-open-behind
>>>>>      88: end-volume
>>>>>      89:
>>>>>      90: volume homes
>>>>>      91:     type debug/io-stats
>>>>>      92:     option log-level INFO
>>>>>      93:     option latency-measurement off
>>>>>      94:     option count-fop-hits on
>>>>>      95:     subvolumes homes-md-cache
>>>>>      96: end-volume
>>>>>      97:
>>>>>      98: volume meta-autoload
>>>>>      99:     type meta
>>>>>     100:     subvolumes homes
>>>>>     101: end-volume
>>>>>     102:
>>>>> +------------------------------------------------------------------------------+ 
>>>>>
>>>>>     [2016-06-24 17:55:34.261219] I
>>>>>     [rpc-clnt.c:1855:rpc_clnt_reconfig]
>>>>>     0-homes-client-6: changing port to 49153 (from 0)
>>>>>     [2016-06-24 17:55:34.266096] I [MSGID: 114057]
>>>>>     [client-handshake.c:1441:select_server_supported_programs]
>>>>>     0-homes-client-6: Using Program GlusterFS 3.3, Num (1298437),
>>>>>     Version (330)
>>>>>     [2016-06-24 17:55:34.266905] W [MSGID: 114007]
>>>>>     [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-6:
>>>>>     failed to find key 'child_up' in the options
>>>>>     [2016-06-24 17:55:34.273618] W [MSGID: 114007]
>>>>>     [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-5:
>>>>>     failed to find key 'child_up' in the options
>>>>>
>>>>> I checked the release notes for 3.8.0 but did not see any caveats or
>>>>> compatibility warnings.
>>>>>
>>>>> Anyone else seeing issues with 3.8 clients mounting 3.7 volumes?
>>>>>
>>>>
>>>> Seems like it is due to this commit:
>>>>
>>>> commit 2bfdc30e0e7fba6f97d8829b2618a1c5907dc404
>>>> Author: Avra Sengupta
>>>> Date:   Mon Feb 29 14:43:58 2016 +0530
>>>>
>>>>     protocol client/server: Fix client-server handshake
>>>>
>>>> This commit introduced a new check to determine the existence of a 
>>>> key in the dictionary that gets exchanged between clients and 
>>>> servers during a handshake. Upon not finding the key, the clients 
>>>> bail out.
>>>>
>>>> Avra - would it be possible to avoid a hard check of 'child_up' 
>>>> during a handshake?
>>> Yes Vijay, this particular failure occurs because the client expects
>>> a 'child_up' key from the server during the handshake, to determine
>>> whether all children on the server are up rather than merely that the
>>> handshake succeeded. Although this is how the handshake should
>>> ideally work, it currently breaks backward compatibility with 3.7
>>> volumes, because those servers do not send the key that the newer
>>> client expects.
>>>
>>> I would prefer not to bypass this check in the client, but rather to
>>> enforce it only for connections coming from servers running 3.8.
>>>
>>> + Adding Raghavendra Gowdappa
>>>
>>> Raghavendra,
>>>
>>> Would it be possible to keep this check in the client specific to
>>> servers running 3.8 and beyond?
>>>>
>>>> Note that if servers are upgraded ahead of the clients, this 
>>>> problem should not be seen.
>>>>
>>>> Thanks,
>>>> Vijay
>>>>
>>>>
>>>
>>
>
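
The difference between the hard 'child_up' check described in the thread and
the version-gated check proposed in the reply can be illustrated with a small
sketch. This is not the GlusterFS implementation (which is written in C) and
it does not reflect the contents of either review; it is a minimal Python
model, assuming the handshake reply is a plain dictionary and that the client
somehow knows the server's version string. The function names, the
CHILD_UP_MIN_VERSION constant and the way the version is obtained are all
hypothetical.

```python
# Hypothetical model of the handshake check discussed in the thread.
# None of these names are GlusterFS APIs; the reply is modelled as a dict.

# Assumption: 3.8.0 is the first server release that sends 'child_up'.
CHILD_UP_MIN_VERSION = (3, 8, 0)


def parse_version(text):
    """Turn a dotted version such as '3.7.11' into a tuple (3, 7, 11)."""
    return tuple(int(part) for part in text.split("."))


def strict_handshake(reply):
    """Hard check: the client bails out whenever 'child_up' is missing."""
    if "child_up" not in reply:
        return False
    return bool(reply["child_up"])


def gated_handshake(reply, server_version):
    """Version-gated check: require 'child_up' only from 3.8+ servers."""
    if "child_up" in reply:
        return bool(reply["child_up"])
    # Key is missing: tolerate it only if the server predates the key.
    return parse_version(server_version) < CHILD_UP_MIN_VERSION


if __name__ == "__main__":
    reply_from_3_7 = {}  # a 3.7 server does not send the key
    print(strict_handshake(reply_from_3_7))
    print(gated_handshake(reply_from_3_7, "3.7.11"))
```

With the hard check, a reply from a 3.7.11 server is rejected because the key
is absent, which matches the "failed to find key 'child_up'" warnings in the
mount log. With the gated check, the same reply is accepted, while a 3.8
server that omits the key is still treated as a failure.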


