[Gluster-users] VM going down

Pranith Kumar Karampuri pkarampu at redhat.com
Thu May 11 14:15:09 UTC 2017


On Thu, May 11, 2017 at 7:19 PM, Alessandro Briosi <ab1 at metalit.com> wrote:

> On 11/05/2017 14:09, Niels de Vos wrote:
>
> On Thu, May 11, 2017 at 12:35:42PM +0530, Krutika Dhananjay wrote:
>
> Niels,
>
> Alessandro's configuration does not have shard enabled, so it definitely
> has nothing to do with shard not supporting the seek FOP.
>
> Yes, but in case sharding would have been enabled, the seek FOP would be
> handled correctly (detected as not supported at all).
>
> I'm still not sure how arbiter would prevent the use of shards, though. We
> normally advise using sharding **and** (optionally) arbiter for VM
> workloads; arbiter without sharding has not been tested much. In addition,
> the seek functionality is only available in recent kernels, so there has
> been little testing on CentOS or similar enterprise Linux distributions.
>
>
> Where is it stated that arbiter should be used with sharding?
>

This information is inaccurate. Arbiter can be used independently of sharding.
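If you do want to try sharding for the VM images at some point, it is just a
per-volume option (a rough sketch, assuming a volume named datastore1; note
that sharding only applies to files created after it is enabled, and the
block size here is only an example):

    # enable sharding on an existing volume (only newly created files are sharded)
    gluster volume set datastore1 features.shard on
    gluster volume set datastore1 features.shard-block-size 64MB

    # or, if your packages ship the virt profile, apply the whole
    # recommended VM-workload option group in one go
    gluster volume set datastore1 group virt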


> Or that arbiter functionality without sharding is still in the "testing"
> phase?
> I thought that having a full 3-way replica on a 3-node cluster would have
> been a waste of space. (I only need to tolerate losing 1 host at a time,
> and that's fine.)
>
> Anyway, this had also happened before with the same VM when there was no
> arbiter, and I thought it was for some strange reason a "quorum" thing
> which made the file unavailable in gluster, though there were no clues in
> the logs.
> So I added the arbiter brick, but it happened again last week.
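To rule the quorum/split-brain theory in or out next time, it may be worth
capturing the heal state right when a VM goes down (a rough sketch, assuming
the volume name datastore1):

    gluster volume heal datastore1 info
    gluster volume heal datastore1 info split-brain
    gluster volume status datastore1

If all of these are clean at the time of the crash, quorum enforcement on the
gluster side becomes much less likely as the cause.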
>
> The first VM I reported going down was created on a volume with arbiter
> enabled from the start, so I doubt it has something to do with arbiter.
>
> I think it might have something to do with a load problem? Though the
> hosts are really not being used that much.
>
> Anyway, here is a brief description of my setup.
>
> 3 Dell servers with RAID 10 SAS disks.
> Each server has 2 bonded 1 Gbps Ethernet ports dedicated to gluster (plus
> 2 dedicated to the Proxmox cluster and 2 for communication with the hosts
> on the LAN), each bond on its own VLAN in the switch.
> Jumbo frames are also enabled on the NICs and switches.
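With jumbo frames it is worth double-checking that the 9000-byte MTU really
survives end to end across the bond, VLAN and switch; an MTU mismatch tends
to show up as stalls rather than clear errors (a rough sketch; bond0 and the
peer hostname are placeholders):

    # 8972 = 9000-byte MTU minus 28 bytes of IP/ICMP headers;
    # -M do forbids fragmentation, so an MTU mismatch makes this fail
    ping -M do -s 8972 -c 3 srv2-storage
    ip link show bond0 | grep -o 'mtu [0-9]*'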
>
> Each server is a Proxmox host which has gluster installed and configured
> as both server and client.
>
> The RAID array holds an LVM thin pool which is divided into 3 bricks (2
> big ones for the data and 1 small one for the arbiter).
> Each thin LV is XFS-formatted and mounted as a brick.
> There are 3 volumes configured as replica 3 with arbiter (so only 2 bricks
> really hold the data).
> Volumes are:
> datastore1: data on srv1 and srv2, arbiter srv3
> datastore2: data on srv2 and srv3, arbiter srv1
> datastore3: data on srv1 and srv3, arbiter srv2
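For reference, that layout corresponds to a plain "replica 3 arbiter 1"
volume per datastore, where the last brick listed becomes the arbiter (a
rough sketch for datastore1; the brick paths are made up):

    gluster volume create datastore1 replica 3 arbiter 1 \
        srv1:/bricks/datastore1/brick \
        srv2:/bricks/datastore1/brick \
        srv3:/bricks/datastore1-arbiter/brick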
>
> On each datastore there is basically one main VM (plus some others which
> are not so important), so 3 VMs matter most.
>
> datastore1 was converted from replica 2 to replica 3 with arbiter; the
> other 2 were created as described above.
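For what it's worth, that kind of conversion is normally a single add-brick
call that turns the replica 2 volume into replica 3 with arbiter (a rough
sketch; the brick path is made up and the exact syntax depends on the
gluster version), followed by waiting for the heal to drain:

    gluster volume add-brick datastore1 replica 3 arbiter 1 \
        srv3:/bricks/datastore1-arbiter/brick
    # the arbiter only protects against split-brain once healing has finished
    gluster volume heal datastore1 info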
>
> The VM on the first datastore has crashed several times (even when there
> was no arbiter yet; I thought that for some reason there was a split-brain
> which gluster could not handle).
>
> Last week the 2nd VM (on datastore2) also crashed, and that's when I
> started this thread (before that, since no special errors were logged, I
> thought it could have been caused by something inside the VM).
>
> So far the 3rd VM has never crashed.
>
> Still, any help on this would be really appreciated.
>
> I know the problem could also be somewhere else, but I have other setups
> without gluster which simply work.
> That's why I want to run the VM under gdb, to check next time why the kvm
> process shuts down.
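If you go the gdb route, you do not need to start the VM under gdb; attaching
to the running qemu/kvm process is usually enough to catch the exit (a rough
sketch; the process may be named kvm or qemu-system-x86_64 depending on the
distribution, and qemu debug symbols should be installed first):

    # attach to the running VM process (pick the right PID if several VMs run here)
    gdb -p $(pidof qemu-system-x86_64)
    (gdb) continue
    ... wait for the VM to die ...
    (gdb) thread apply all bt

If qemu exits cleanly instead of crashing, a breakpoint on exit may be needed
to get a useful backtrace; alternatively, enabling core dumps for the kvm
process (ulimit -c unlimited before starting it, or the coredumpctl
equivalent) avoids keeping gdb attached all the time.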
>
> Alessandro
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>



-- 
Pranith