[Gluster-users] NFS crashes - bug 1010241

Thu Nov 20 03:45:11 UTC 2014

Also, if the OP is on unsupported gluster 3.4.x rather than RHSS or at 
least 3.5.x, and has sufficient spare space, how about this: take enough 
hosts out of the cluster to hold a copy of the data, bring them fully up 
to date, sync the data across, upgrade the original hosts, sync the data 
back, and then add back the hosts you took out for the first backup?

On 20/11/14 01:53, Ravishankar N wrote:
> On 11/19/2014 10:11 PM, Shawn Heisey wrote:
>> We are running into this crash stacktrace on 3.4.2.
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1010241
>>
>> The NFS process dies with no predictability.  I've written a shell
>> script that detects the crash and runs a process to completely kill all
>> gluster processes and restart glusterd, which has eliminated
>> customer-facing fallout from these problems.
>
> No kill required. `gluster volume start <volname> force` should re-spawn the dead processes.
>
>
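
A gentler watchdog along those lines might look like the sketch below. It
is untested against 3.4.2. The volume name `myvol` is a placeholder, and
the parsing assumes that `gluster volume status <volname> nfs` prints a
line starting with "NFS Server on localhost" with a Y/N "Online" column;
that layout differs between releases, so check it against your own output.

```shell
#!/bin/sh
# Hypothetical volume name; override via the environment.
VOLNAME="${VOLNAME:-myvol}"

# Reads `gluster volume status` text on stdin and succeeds only if the
# local NFS server line has a field that is exactly "Y" (Online column).
# The column position is assumed to vary, so every field is scanned.
nfs_online() {
    awk '/^NFS Server on localhost/ {
             for (i = 1; i <= NF; i++) if ($i == "Y") ok = 1
         }
         END { exit (ok ? 0 : 1) }'
}

# Re-spawn dead volume processes only when the NFS server is not online.
# Running processes are left alone by `start ... force`.
respawn_if_down() {
    if ! gluster volume status "$VOLNAME" nfs | nfs_online; then
        gluster volume start "$VOLNAME" force
    fi
}
```

Calling `respawn_if_down` from cron every minute or so would replace the
kill-everything step, assuming the status output matches the pattern.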
>> Because of continual stability problems from day one, the gluster
>> storage is being phased out, but there are many terabytes of data still
>> used there.  It would be nice to have it remain stable while we still
>> use it.  As soon as we can fully migrate all data to another storage
>> solution, the gluster machines will be decommissioned.
>>
>> That BZ id is specific to version 3.6, and it's always difficult for
>> mere mortals to determine which fixes have been backported to earlier
>> releases.
>>
> A (not so?) easy way is to clone the source, check out the desired branch, and grep the git log for the commit message you're interested in.
>
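
In case it helps anyone else, the grep described above could be sketched
roughly as follows. `bug_in_branch` is a hypothetical helper, and it
assumes the commit messages carry a "BUG: <id>" line; adjust the pattern
if the project's convention differs.

```shell
#!/bin/sh
# Succeeds if any commit reachable from ref $1 has a message line
# containing "BUG: $2". Must be run inside a clone of the repository.
bug_in_branch() {
    [ -n "$(git log --oneline --grep="BUG: $2" "$1")" ]
}

# Example usage (branch name is an assumption about the repo layout):
#   git clone https://github.com/gluster/glusterfs.git && cd glusterfs
#   bug_in_branch origin/release-3.4 1010241 && echo "fix is present"
```

An empty result only means no commit message matched the pattern, so it
is worth also searching for a phrase from the fix's subject line.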
>> Has the fix for bug 1010241 been backported to any 3.4 release?
> I just did the grep, and no, it has not been. I don't know if a backport is possible (CC'ed the respective devs). The (two) fixes are present in 3.5, though.
>
>
>> If so,
>> is it possible for me to upgrade my servers without being concerned
>> about the distributed+replicated volume going offline?  When we upgraded
>> from 3.3 to 3.4, the volume was not fully functional as soon as we
>> upgraded one server, and did not become fully functional until all
>> servers were upgraded and rebooted.
>>
>> Assuming again that there is a 3.4 version with the fix ... the gluster
>> peers that I use for NFS do not have any bricks.  Would I need to
>> upgrade ALL the servers, or could I get away with just upgrading the
>> servers that are being used for NFS?
>
> A cluster with heterogeneous op-versions is not supported. You would need to upgrade all servers.
>
> http://www.gluster.org/community/documentation/index.php/Upgrade_to_3.5
>
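
Since mixed versions are not supported, it may be worth confirming that
every peer reports the same version before and after the upgrade. A rough
sketch: `all_same_version` is a hypothetical helper that reads one version
string per line on stdin and succeeds only if they are all identical. The
host names in the usage comment are placeholders.

```shell
#!/bin/sh
# Succeeds only if stdin contains exactly one distinct non-empty line,
# i.e. every host reported the same version string.
all_same_version() {
    [ "$(sort -u | grep -c .)" -eq 1 ]
}

# Example usage (assumes ssh access and that the first line of
# `gluster --version` identifies the installed release):
#   for h in server1 server2 server3; do
#       ssh "$h" gluster --version | head -n 1
#   done | all_same_version || echo "peers are on different versions"
```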
> Thanks,
> Ravi
>
>> Thanks,
>> Shawn
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>
