[Gluster-Maintainers] Upgrade issue when new mem type is added in libglusterfs

Mon Jul 11 07:49:51 UTC 2016

On Mon, Jul 11, 2016 at 12:56:24PM +0530, Kaushal M wrote:
> On Sat, Jul 9, 2016 at 10:02 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
> > We have hit bug 1347250 downstream (applicable upstream too), where
> > glusterd didn't regenerate the volfiles when yum temporarily brought it up
> > in upgrade mode. The log file shows that gsyncd --version failed to
> > execute, so glusterd init couldn't proceed as far as the volfile
> > regeneration. Since the return code is not handled in the spec file, users
> > won't come to know about this. Going forward this is going to cause
> > major issues in healing, and ultimately it raises the risk of
> > split brains.
> >
> > Further analysis by Kotresh & Raghavendra Talur reveals that gsyncd failed
> > here because of a compatibility issue: gsyncd had not yet been upgraded
> > whereas glusterfs-server had, and the failure was mainly due to a change
> > in the mem type enum. We have seen a similar issue for RDMA as well
> > (probably a year back). So, in general, this can happen in any upgrade
> > path from one version to another where a new mem type is introduced. We have
> > seen this from 3.7.8 to 3.7.12 and 3.8. People upgrading from 3.6 to 3.7/3.8
> > will also experience this issue.
> >
> > Until we have a fix, I suggest that all release managers highlight
> > this in the release notes of the latest releases, with the following
> > workaround after yum update:
> >
> > 1. grep -irns "geo-replication module not working as desired"
> > /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | wc -l
> >
> >    If the output is non-zero, go to step 2; otherwise follow the rest of
> >    the steps as per the guide.
> >
> > 2. Check whether a glusterd instance is running with
> >    'ps aux | grep glusterd'; if it is, stop the glusterd service.
> >
> > 3. glusterd --xlator-option *.upgrade=on -N
> >
> > and then proceed with the rest of the steps as per the guide.
> >
> > Thoughts?
> 
> Proper .so versioning of libglusterfs should help with problems like
> this. I don't know how to do this though.
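
For the release notes, step 1 of the workaround quoted above could be
wrapped in a small helper. This is only a sketch: the function names are
made up for illustration, while the log path and the message text are
taken verbatim from the quoted steps. It only inspects the log and does
not stop or start glusterd.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of glusterfs.

# Default log file and message, copied from step 1 of the workaround.
GLUSTERD_LOG=/var/log/glusterfs/etc-glusterfs-glusterd.vol.log
GEOREP_MSG="geo-replication module not working as desired"

# Print the number of log lines containing the failure message
# (case-insensitive). Prints 0 if the log file cannot be read.
count_georep_failures() {
    [ -r "$1" ] || { echo 0; return 0; }
    # grep -c prints the count even when it is 0, but exits non-zero
    # in that case, hence the "|| true".
    grep -ic -- "$GEOREP_MSG" "$1" || true
}

# Succeed (exit status 0) if the log shows that volfile regeneration
# was skipped, i.e. steps 2 and 3 of the workaround are needed.
needs_volfile_regen() {
    [ "$(count_georep_failures "$1")" -gt 0 ]
}
```

Usage would be something like `needs_volfile_regen "$GLUSTERD_LOG" &&
echo "run steps 2 and 3"`. When running step 3 by hand it is safer to
quote the option, as in `glusterd --xlator-option '*.upgrade=on' -N`, so
the shell cannot expand the `*` against files in the current directory.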

We could provide the 'current' version of libglusterfs with the same
number as the op-version. For 3.7.13 it would be 030713; dropping the
leading 0 makes that 30713, so libglusterfs.so.30713. The same should
probably be done for all other internal libraries.
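
To make the arithmetic concrete, here is a rough sketch of how the
number could be derived from a release string. The function names are
hypothetical, and it assumes a plain X.Y.Z version without leading
zeros in any component.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of glusterfs.

# Turn "X.Y.Z" into the op-version style number: X, then Y and Z each
# padded to two digits (3.7.13 -> 30713). The leading 0 of 030713 is
# dropped because the major number is printed without padding.
version_to_opversion() {
    gv_major=${1%%.*}        # text before the first dot
    gv_rest=${1#*.}          # text after the first dot
    gv_minor=${gv_rest%%.*}  # middle component
    gv_patch=${gv_rest#*.}   # last component
    printf '%d%02d%02d\n' "$gv_major" "$gv_minor" "$gv_patch"
}

# Build the versioned library file name proposed above.
libglusterfs_soname() {
    printf 'libglusterfs.so.%s\n' "$(version_to_opversion "$1")"
}
```

Calling `libglusterfs_soname 3.7.13` gives libglusterfs.so.30713, which
matches the example above.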

Some more details about library versioning can be found here:
  https://github.com/gluster/glusterfs/blob/master/doc/developer-guide/versioning.md

Note that libgfapi uses symbol versioning, which is a more fine-grained
solution. It removes the need to recompile applications that use the
library. Details about that, and the more involved changes needed to get
it to work correctly, are in this document:
  https://github.com/gluster/glusterfs/blob/master/doc/developer-guide/gfapi-symbol-versions.md

Is there already a bug filed to get this fixed?

Thanks,
Niels