[Gluster-users] VM going down

Alessandro Briosi ab1 at metalit.com
Thu May 11 13:49:27 UTC 2017


On 11/05/2017 14:09, Niels de Vos wrote:
> On Thu, May 11, 2017 at 12:35:42PM +0530, Krutika Dhananjay wrote:
>> Niels,
>>
>> Allesandro's configuration does not have shard enabled. So it has
>> definitely not got anything to do with shard not supporting seek fop.
> Yes, but in case sharding would have been enabled, the seek FOP would be
> handled correctly (detected as not supported at all).
>
> I'm still not sure how arbiter prevents doing shards though. We normally
> advise to use sharding *and* (optional) arbiter for VM workloads,
> arbiter without sharding has not been tested much. In addition, the seek
> functionality is only available in recent kernels, so there has been
> little testing on CentOS or similar enterprise Linux distributions.

Where is it stated that arbiter should be used with sharding?
Or that arbiter functionality without sharding is still in a "testing" phase?
I thought that having 3 full replicas on a 3-node cluster would have been
a waste of space. (I only need to survive losing 1 host at a time, and
that's fine.)
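
For reference, from what I understand enabling sharding would be
something along these lines (volume name just as an example, I have not
tested this myself; as far as I know only files created after enabling
get sharded):

    # enable sharding on an existing volume
    gluster volume set datastore1 features.shard on
    # optionally change the shard size (64MB is the default)
    gluster volume set datastore1 features.shard-block-size 64MB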

Anyway I had this happen before with the same VM when there was no
arbiter, and I thought it was for some strange reason a "quorum" thing
which made the file unavailable in gluster, though there were no clues
in the logs.
So I added the arbiter brick, but it happened again last week.

The first VM I reported going down was created on a volume with arbiter
enabled from the start, so I doubt it has anything to do with arbiter.

Could it have something to do with load? The hosts are really not being
used that much, though.

Anyway, here is a brief description of my setup.

3 Dell servers with RAID 10 SAS disks.
Each server has 2 bonded 1 Gbps NICs dedicated to gluster (2 more
dedicated to the Proxmox cluster and 2 for communication with the hosts
on the LAN), each bond on its own VLAN in the switch.
Jumbo frames are also enabled on the NICs and switches.
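
Roughly, the MTU setup looks like this (interface and host names are
just examples):

    # enable jumbo frames on the gluster bond
    ip link set dev bond0 mtu 9000
    # verify the 9000-byte path end to end: 8972 = 9000 - 20 (IP) - 8 (ICMP),
    # and -M do forbids fragmentation
    ping -M do -s 8972 -c 3 srv2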

Each server is a Proxmox host with gluster installed and configured as
both server and client.

The RAID array holds an LVM thin pool, which is divided into 3 bricks
(2 large ones for the data and 1 small one for the arbiter).
Each thin LV is XFS-formatted and mounted as a brick.
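
The bricks were prepared more or less like this (VG, pool and mount
names here are just placeholders):

    # carve a thin LV out of the thin pool, one per brick
    lvcreate --thin -V 2T -n brick_ds1 vg0/thinpool
    # XFS with 512-byte inodes, as the gluster docs recommend for bricks
    mkfs.xfs -i size=512 /dev/vg0/brick_ds1
    mkdir -p /bricks/ds1
    mount /dev/vg0/brick_ds1 /bricks/ds1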
There are 3 volumes configured as replica 3 with arbiter (so only 2
bricks really hold the data).
Volumes are:
datastore1: data on srv1 and srv2, arbiter srv3
datastore2: data on srv2 and srv3, arbiter srv1
datastore3: data on srv1 and srv3, arbiter srv2
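
So each volume was created roughly like this (brick paths are
placeholders; the last brick of the replica set is the arbiter):

    gluster volume create datastore1 replica 3 arbiter 1 \
        srv1:/bricks/ds1/brick \
        srv2:/bricks/ds1/brick \
        srv3:/bricks/ds1-arbiter/brick
    gluster volume start datastore1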

On each datastore there is basically one main VM (plus some others
which are not as important), so there are 3 VMs that really matter.

datastore1 was converted from replica 2 to replica 3 with arbiter; the
other 2 were created as described above.
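
The conversion was done with add-brick, something like this (the path
again just a placeholder):

    # turn an existing replica 2 volume into replica 3 with arbiter
    gluster volume add-brick datastore1 replica 3 arbiter 1 \
        srv3:/bricks/ds1-arbiter/brick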

The VM on the first datastore has crashed several times (even back when
there was no arbiter, which made me think there was a split-brain that
gluster could not handle).

Last week the 2nd VM (on datastore2) also crashed, and that's when I
started this thread (before that, since there were no special errors
logged, I thought it could have been caused by something inside the VM).

So far the 3rd VM has never crashed.

Still, any help on this would be really appreciated.

I know the problem could also be somewhere else, but I have other
setups without gluster which simply work.
That's why I want to start the VM under gdb, to check next time why the
kvm process shuts down.
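
Something along these lines is what I have in mind (the pgrep pattern
and VM id are just examples, to be adjusted to the actual kvm command
line):

    # attach gdb to the running kvm process of the VM
    gdb --pid "$(pgrep -f '/usr/bin/kvm.*-id 101' | head -n 1)"
    # then inside gdb, let qemu's internal signals through and wait:
    #   handle SIGUSR1 SIGPIPE nostop noprint pass
    #   continue
    #   bt        (once it stops on the fatal signal, to get a backtrace)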

Alessandro