[Gluster-users] Issue when upgrading from 3.6 to 3.7

B.K.Raghuram bkrram at gmail.com
Wed Jul 27 11:35:47 UTC 2016


Thanks a lot! Yes, I did upgrade to 3.7.13 but was unaware of the new
cluster op-version. Could this incorrect op-version have been the cause for
some of the peers being in the rejected state after an upgrade?

On Wed, Jul 27, 2016 at 3:49 PM, Manikandan Selvaganesh <mselvaga at redhat.com
> wrote:

> Hi,
>
> Sorry for the delay. Apparently, from your config files, the
> operating-version in /var/lib/glusterd/glusterd.info is still 30700.
> Quota versioning was implemented in 3.7.6, and another feature (quota
> enable/disable performance improvements) was implemented in 3.7.12.
>
> To use these features, you need to bump up the op-version after the
> upgrade by running 'gluster volume set all cluster.op-version 30712'
> (in the case of 3.7.12). I expect this will fix the problem you
> reported; let us know otherwise. If it does not fix the issue, please
> get back to us with the logs.
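> For example, a minimal sequence on any one node (30712 is the
> op-version corresponding to 3.7.12, as above):
>
>     # check the op-version glusterd currently records
>     grep operating-version /var/lib/glusterd/glusterd.info
>
>     # bump the cluster op-version (takes effect cluster-wide)
>     gluster volume set all cluster.op-version 30712
>
>     # verify
>     grep operating-version /var/lib/glusterd/glusterd.info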
>
> --
> Regards,
> Manikandan Selvaganesh.
>
>
> On Wed, Jul 27, 2016 at 10:51 AM, Manikandan Selvaganesh <
> mselvaga at redhat.com> wrote:
>
>> Hi Ram,
>>
>> Apologies. I was stuck on something else. I will update you by the
>> end of the day.
>>
>> On Wed, Jul 27, 2016 at 10:11 AM, B.K.Raghuram <bkrram at gmail.com> wrote:
>>
>>> Hi Manikandan,
>>>
>>> Did you have a chance to look at the glusterd config files? We've
>>> tried a couple of times to upgrade from 3.6.1, and the vol info files
>>> never seem to get a quota-version flag. One of our installations is
>>> stuck at the old version because of these potential issues when
>>> upgrading to 3.7.13.
>>>
>>> Thanks,
>>> -Ram
>>>
>>> On Mon, Jul 25, 2016 at 6:40 PM, Manikandan Selvaganesh <
>>> mselvaga at redhat.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> A fresh install would work fine, as you saw. And yes, if
>>>> quota-version is not present it would cause malfunctions such as
>>>> checksum issues and peer rejection, and quota would not work
>>>> properly. This quota-version was introduced recently; it adds a
>>>> suffix to the quota-related extended attributes.
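>>>> For instance, you can see this directly on a brick (the brick path
>>>> is illustrative, and the exact suffix value depends on the setup):
>>>>
>>>>     # dump the quota-related xattrs on a brick directory
>>>>     getfattr -d -m 'trusted.glusterfs.quota' -e hex /bricks/brick1
>>>>
>>>> With versioning in place the attributes carry a numeric suffix,
>>>> e.g. trusted.glusterfs.quota.size.1 instead of
>>>> trusted.glusterfs.quota.size.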
>>>>
>>>> On Jul 25, 2016 6:36 PM, "B.K.Raghuram" <bkrram at gmail.com> wrote:
>>>>
>>>>> Manikandan,
>>>>>
>>>>> We just overwrote the setup with a fresh install and there I see the
>>>>> quota-version in the volume info file. For the upgraded setup, I only have
>>>>> the /var/lib/glusterd, which I'm attaching. Once we recreate this, I'll
>>>>> send you the rest of the info.
>>>>>
>>>>> However, is there an issue if the quota-version is not present in
>>>>> the info file? Will it cause the quota functionality to malfunction?
>>>>>
>>>>> On Mon, Jul 25, 2016 at 5:41 PM, Manikandan Selvaganesh <
>>>>> mselvaga at redhat.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Could you please attach the vol files, log files and the output of
>>>>>> gluster v info?
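>>>>>> For example, one way to collect those (default paths assumed):
>>>>>>
>>>>>>     gluster volume info > vol-info.txt
>>>>>>     tar czf glusterd-config.tar.gz /var/lib/glusterd
>>>>>>     tar czf gluster-logs.tar.gz /var/log/glusterfs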
>>>>>>
>>>>>> On Mon, Jul 25, 2016 at 5:35 PM, Atin Mukherjee <amukherj at redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jul 25, 2016 at 4:37 PM, B.K.Raghuram <bkrram at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Atin,
>>>>>>>>
>>>>>>>> Couple of quick questions about the upgrade and in general about
>>>>>>>> the meaning of some of the parameters in the glusterd dir..
>>>>>>>>
>>>>>>>> - I don't see the quota-version in the volume info file post
>>>>>>>> upgrade, so did the upgrade not go through properly?
>>>>>>>>
>>>>>>>
>>>>>>> If you are seeing a checksum issue, you'd need to copy the same
>>>>>>> volume info file to the node where the checksum went wrong and
>>>>>>> then restart the glusterd service.
>>>>>>> And yes, this looks like a bug in quota. @Mani - time to chip in :)
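>>>>>>> A rough sketch of that repair (VOLNAME and good-node are
>>>>>>> placeholders):
>>>>>>>
>>>>>>>     # on the rejected node, keep a backup of the current file
>>>>>>>     cp /var/lib/glusterd/vols/VOLNAME/info /root/info.bak
>>>>>>>
>>>>>>>     # pull the info file from a healthy node
>>>>>>>     scp good-node:/var/lib/glusterd/vols/VOLNAME/info \
>>>>>>>         /var/lib/glusterd/vols/VOLNAME/info
>>>>>>>
>>>>>>>     # restart glusterd so the checksum is recomputed
>>>>>>>     service glusterd restart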
>>>>>>>
>>>>>>>> - What does the op-version in the volume info file mean? Does
>>>>>>>> this have any correlation with the cluster op-version? Does it
>>>>>>>> change with an upgrade?
>>>>>>>>
>>>>>>>
>>>>>>> The volume's op-version is different. It is basically used to
>>>>>>> check client compatibility, and it shouldn't change with an
>>>>>>> upgrade, AFAIK and as far as I remember from the code.
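>>>>>>> For reference, the fields in question live in
>>>>>>> /var/lib/glusterd/vols/VOLNAME/info (VOLNAME is a placeholder and
>>>>>>> the values are illustrative):
>>>>>>>
>>>>>>>     op-version=3
>>>>>>>     client-op-version=3
>>>>>>>     quota-version=1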
>>>>>>>
>>>>>>>
>>>>>>>> - A more basic question - should all peer probes always be done
>>>>>>>> from the same node, or can they be done from any node that is
>>>>>>>> already in the cluster? The reason I ask: when I tried to do what
>>>>>>>> was said in
>>>>>>>> http://gluster-documentations.readthedocs.io/en/latest/Administrator%20Guide/Resolving%20Peer%20Rejected/
>>>>>>>> the initial cluster was created from node A with 5 other peers.
>>>>>>>> Then, post upgrade, node B, which was in the cluster, got peer
>>>>>>>> rejected. So I deleted all the files except glusterd.info and
>>>>>>>> then did a peer probe of A from B. But when I then ran a peer
>>>>>>>> status on A, it only showed one node, B. Should I have probed B
>>>>>>>> from A instead?
>>>>>>>>
>>>>>>>
>>>>>>> Peer probe can be done from any node in the trusted storage pool,
>>>>>>> so that's really not the issue. Ensure that the peer file contents
>>>>>>> (/var/lib/glusterd/peers) are kept the same across nodes, with
>>>>>>> only the node's own uuid differing, and then restarting the
>>>>>>> glusterd service should solve the problem.
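>>>>>>> A quick way to check that (good-node is a placeholder; a node's
>>>>>>> peers directory holds one file per *other* peer, keyed by uuid):
>>>>>>>
>>>>>>>     # compare the peer files between this node and a healthy one
>>>>>>>     ls /var/lib/glusterd/peers/
>>>>>>>     ssh good-node ls /var/lib/glusterd/peers/
>>>>>>>
>>>>>>>     # this node's own uuid - the one expected difference
>>>>>>>     grep UUID /var/lib/glusterd/glusterd.info
>>>>>>>
>>>>>>>     # after aligning any mismatch, restart glusterd
>>>>>>>     service glusterd restart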
>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, Jul 23, 2016 at 10:48 AM, Atin Mukherjee <
>>>>>>>> amukherj at redhat.com> wrote:
>>>>>>>>
>>>>>>>>> I suspect it to be the new quota-version introduced in the
>>>>>>>>> volume info file, which may have resulted in a checksum mismatch
>>>>>>>>> and hence the peer rejection. But we can confirm that from the
>>>>>>>>> log files and the respective info file contents.
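>>>>>>>>> One way to confirm (VOLNAME and other-node are placeholders;
>>>>>>>>> process substitution assumes bash):
>>>>>>>>>
>>>>>>>>>     # compare the volume info file across two nodes
>>>>>>>>>     diff /var/lib/glusterd/vols/VOLNAME/info \
>>>>>>>>>         <(ssh other-node cat /var/lib/glusterd/vols/VOLNAME/info)
>>>>>>>>>
>>>>>>>>>     # the stored checksum glusterd compares during the handshake
>>>>>>>>>     cat /var/lib/glusterd/vols/VOLNAME/cksum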
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Saturday 23 July 2016, B.K.Raghuram <bkrram at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Unfortunately, the setup is at a customer site that is not
>>>>>>>>>> remotely accessible. We will try to get the logs by early next
>>>>>>>>>> week. But could it just be a mismatch of the /var/lib/glusterd
>>>>>>>>>> files?
>>>>>>>>>>
>>>>>>>>>> On Fri, Jul 22, 2016 at 8:07 PM, Atin Mukherjee <
>>>>>>>>>> amukherj at redhat.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Glusterd logs from all the nodes please?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Friday 22 July 2016, B.K.Raghuram <bkrram at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> When we upgrade some nodes from 3.6.1 to 3.7.13, some of the
>>>>>>>>>>>> nodes give a peer status of "peer rejected" while some don't.
>>>>>>>>>>>> Is there a reason for this discrepancy, and will the steps
>>>>>>>>>>>> mentioned in
>>>>>>>>>>>> http://gluster-documentations.readthedocs.io/en/latest/Administrator%20Guide/Resolving%20Peer%20Rejected/
>>>>>>>>>>>> work for this as well?
>>>>>>>>>>>>
>>>>>>>>>>>> Just out of curiosity, why the line "Try the whole procedure a
>>>>>>>>>>>> couple more times if it doesn't work right away." in the link above?
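>>>>>>>>>>>> (For context, that procedure is roughly the following, run on
>>>>>>>>>>>> the rejected node - a paraphrase, with GOOD_NODE standing in
>>>>>>>>>>>> for any healthy peer:)
>>>>>>>>>>>>
>>>>>>>>>>>>     service glusterd stop
>>>>>>>>>>>>     # remove everything under /var/lib/glusterd except
>>>>>>>>>>>>     # glusterd.info, which holds this node's uuid
>>>>>>>>>>>>     find /var/lib/glusterd -mindepth 1 \
>>>>>>>>>>>>         ! -name glusterd.info -delete
>>>>>>>>>>>>     service glusterd start
>>>>>>>>>>>>     gluster peer probe GOOD_NODE
>>>>>>>>>>>>     service glusterd restart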
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Atin
>>>>>>>>>>> Sent from iPhone
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Atin
>>>>>>>>> Sent from iPhone
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> --Atin
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Gluster-users mailing list
>>>>>>> Gluster-users at gluster.org
>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Manikandan Selvaganesh.
>>>>>>
>>>>>
>>>>>
>>>
>>
>>
>> --
>> Regards,
>> Manikandan Selvaganesh.
>>
>
>
>
>
>