[Gluster-Maintainers] Release 4.0: Unable to complete rolling upgrade tests

Fri Mar 2 03:14:09 UTC 2018

On 03/02/2018 07:26 AM, Shyam Ranganathan wrote:
> Hi Pranith/Ravi,
>
> So, to keep a long story short: after upgrading 1 node in a 3-node 3.13
> cluster, self-heal is not able to catch up on the heal backlog. This is a
> very simple synthetic test, but the end result is that upgrade
> testing is failing.

Let me try this now and get back. I did something similar when
testing the FIPS patch, and the rolling upgrade worked.
Thanks,
Ravi
>
> Here are the details,
>
> - Using
> https://hackmd.io/GYIwTADCDsDMCGBaArAUxAY0QFhBAbIgJwCMySIwJmAJvGMBvNEA#
> I set up 3 server containers and installed 3.13 first, as follows
> (within the containers):
>
> (inside the 3 server containers)
> yum -y update; yum -y install centos-release-gluster313; yum install
> glusterfs-server; glusterd
>
> (inside centos-glfs-server1)
> gluster peer probe centos-glfs-server2
> gluster peer probe centos-glfs-server3
> gluster peer status
> gluster v create patchy replica 3 centos-glfs-server1:/d/brick1
> centos-glfs-server2:/d/brick2 centos-glfs-server3:/d/brick3
> centos-glfs-server1:/d/brick4 centos-glfs-server2:/d/brick5
> centos-glfs-server3:/d/brick6 force
> gluster v start patchy
> gluster v status
>
> Create a client container as per the document above, and mount the above
> volume and create 1 file, 1 directory and a file within that directory.
>
> Now we start the upgrade process (as laid out for 3.13 here
> http://docs.gluster.org/en/latest/Upgrade-Guide/upgrade_to_3.13/ ):
> - killall glusterfs glusterfsd glusterd
> - yum install
> http://cbs.centos.org/kojifiles/work/tasks/1548/311548/centos-release-gluster40-0.9-1.el7.centos.x86_64.rpm
> - yum upgrade --enablerepo=centos-gluster40-test glusterfs-server
>
> < Go back to the client and edit the contents of one of the files and
> change the permissions of a directory, so that there are things to heal
> when we bring up the newly upgraded server>
>
> - gluster --version
> - glusterd
> - gluster v status
> - gluster v heal patchy
>
> The above starts failing as follows:
> [root@centos-glfs-server1 /]# gluster v heal patchy
> Launching heal operation to perform index self heal on volume patchy has
> been unsuccessful:
> Commit failed on centos-glfs-server2.glfstest20. Please check log file
> for details.
> Commit failed on centos-glfs-server3. Please check log file for details.
>
> From here, if further files or directories are created from the client,
> they just get added to the heal backlog, and heal does not catch up.
>
> As is obvious, I cannot proceed, as the upgrade procedure is broken. The
> issue itself may not be in the self-heal daemon but in something around
> connections. Since the process fails here, I am looking to you to
> unblock this as soon as possible, as we are already running a day's slip
> in the release.
>
> Thanks,
> Shyam
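
For anyone reproducing the steps above, the heal backlog can be tracked
as a single number rather than by reading the per-brick output. The
following is only a minimal sketch: the helper name count_heal_entries
is hypothetical, and it assumes that "gluster volume heal <VOLNAME> info"
prints one "Number of entries: N" line per brick, with "-" in place of N
for a brick that is not connected.

```shell
#!/bin/sh
# Hypothetical helper: sum the per-brick "Number of entries: N" lines
# read from stdin. Non-numeric values (for example "-" reported for a
# brick that is not connected) are skipped instead of being counted.
count_heal_entries() {
    awk '/^Number of entries:/ {
             if ($4 ~ /^[0-9]+$/) total += $4
         }
         END { print total + 0 }'
}

# Intended use on a server node (volume name "patchy" as in the test above):
#   gluster volume heal patchy info | count_heal_entries
```

If the number keeps growing while the client creates files, and never
drops after "gluster v heal patchy", the self-heal daemon is not draining
the backlog, which matches the behaviour described in the report.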