[Gluster-devel] Release 4.0: Unable to complete rolling upgrade tests

Fri Mar 2 04:41:05 UTC 2018

+ Anoop.

It looks like clients on the old (3.12) nodes are not able to talk to 
the upgraded (4.0) node. I see messages like these on the old clients:

  [2018-03-02 03:49:13.483458] W [MSGID: 114007] [client-handshake.c:1197:client_setvolume_cbk] 0-testvol-client-2: failed to find key 'clnt-lk-version' in the options

Is there something more to be done on BZ 1544366?
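
To see how often an old client hits that warning, a minimal sketch that counts matching lines in a client log. The function name is made up for illustration, and the log path in the usage line is an assumption (client logs usually live under /var/log/glusterfs/, named after the mount point):

```shell
# Count occurrences of the 'clnt-lk-version' handshake warning in a log file.
# Usage: count_lk_version_warnings <logfile>
count_lk_version_warnings() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are none, so "|| true" keeps callers running under "set -e".
    grep -c -F "failed to find key 'clnt-lk-version'" "$1" || true
}
```

Usage (hypothetical path): count_lk_version_warnings /var/log/glusterfs/mnt-testvol.log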

-Ravi
On 03/02/2018 08:44 AM, Ravishankar N wrote:
>
> On 03/02/2018 07:26 AM, Shyam Ranganathan wrote:
>> Hi Pranith/Ravi,
>>
>> To keep a long story short: after upgrading 1 node in a 3-node 3.13
>> cluster, self-heal is not able to catch up on the heal backlog. This is
>> a very simple synthetic test, but the end result is that upgrade testing
>> is failing.
>
> Let me try this now and get back. I had done something similar when
> testing the FIPS patch and the rolling upgrade had worked.
> Thanks,
> Ravi
>>
>> Here are the details,
>>
>> - Using
>> https://hackmd.io/GYIwTADCDsDMCGBaArAUxAY0QFhBAbIgJwCMySIwJmAJvGMBvNEA#
>> I set up 3 server containers to install 3.13 first as follows (within the
>> containers)
>>
>> (inside the 3 server containers)
>> yum -y update; yum -y install centos-release-gluster313; yum install
>> glusterfs-server; glusterd
>>
>> (inside centos-glfs-server1)
>> gluster peer probe centos-glfs-server2
>> gluster peer probe centos-glfs-server3
>> gluster peer status
>> gluster v create patchy replica 3 centos-glfs-server1:/d/brick1
>> centos-glfs-server2:/d/brick2 centos-glfs-server3:/d/brick3
>> centos-glfs-server1:/d/brick4 centos-glfs-server2:/d/brick5
>> centos-glfs-server3:/d/brick6 force
>> gluster v start patchy
>> gluster v status
>>
>> Create a client container as per the document above, and mount the above
>> volume and create 1 file, 1 directory and a file within that directory.
>>
>> Now we start the upgrade process (as laid out for 3.13 here
>> http://docs.gluster.org/en/latest/Upgrade-Guide/upgrade_to_3.13/ ):
>> - killall glusterfs glusterfsd glusterd
>> - yum install
>> http://cbs.centos.org/kojifiles/work/tasks/1548/311548/centos-release-gluster40-0.9-1.el7.centos.x86_64.rpm 
>>
>> - yum upgrade --enablerepo=centos-gluster40-test glusterfs-server
>>
>> < Go back to the client and edit the contents of one of the files and
>> change the permissions of a directory, so that there are things to heal
>> when we bring up the newly upgraded server>
>>
>> - gluster --version
>> - glusterd
>> - gluster v status
>> - gluster v heal patchy
>>
>> The above starts failing as follows,
>> [root@centos-glfs-server1 /]# gluster v heal patchy
>> Launching heal operation to perform index self heal on volume patchy has
>> been unsuccessful:
>> Commit failed on centos-glfs-server2.glfstest20. Please check log file
>> for details.
>> Commit failed on centos-glfs-server3. Please check log file for details.
>>
>> From here, if further files or directories are created from the client,
>> they just get added to the heal backlog, and heal does not catch up.
>>
>> As is obvious, I cannot proceed, as the upgrade procedure is broken. The
>> issue itself may not be the self-heal daemon but something around
>> connections. Since the process fails here, I am looking to you guys to
>> unblock this as soon as possible, as we are already running a day's slip
>> in the release.
>>
>> Thanks,
>> Shyam
>
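
To watch whether the heal backlog described above drains once the upgraded node is back, a minimal sketch that totals the per-brick counts printed by `gluster volume heal <volname> info`. The function name is made up for illustration, and it assumes the per-brick "Number of entries: N" lines of that output; non-numeric values (such as a brick that is not connected) are skipped:

```shell
# Sum the "Number of entries: N" lines from heal info output read on stdin.
# Lines whose value does not start with a digit are ignored.
sum_heal_entries() {
    awk -F': ' '
        /^Number of entries:/ && $2 ~ /^[0-9]+/ { total += $2 }
        END { print total + 0 }
    '
}
```

Usage: gluster volume heal patchy info | sum_heal_entries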