[Gluster-users] corruption using gluster and iSCSI with LIO

Olivier Lambert lambert.olivier at gmail.com
Fri Nov 18 01:43:12 UTC 2016


Attached, bricks log. Where could I find the fuse client log?

On Fri, Nov 18, 2016 at 2:22 AM, Krutika Dhananjay <kdhananj at redhat.com> wrote:
> Could you attach the fuse client and brick logs?
>
> -Krutika
>
> On Fri, Nov 18, 2016 at 6:12 AM, Olivier Lambert <lambert.olivier at gmail.com>
> wrote:
>>
>> Okay, used the exact same config you provided, and adding an arbiter
>> node (node3)
>>
>> After halting node2, VM continues to work after a small "lag"/freeze.
>> I restarted node2 and it was back online: OK
>>
>> Then, after waiting few minutes, halting node1. And **just** at this
>> moment, the VM is corrupted (segmentation fault, /var/log folder empty
>> etc.)
>>
>> dmesg of the VM:
>>
>> [ 1645.852905] EXT4-fs error (device xvda1):
>> htree_dirblock_to_tree:988: inode #19: block 8286: comm bash: bad
>> entry in directory: rec_len is smaller than minimal - offset=0(0),
>> inode=0, rec_len=0, name_len=0
>> [ 1645.854509] Aborting journal on device xvda1-8.
>> [ 1645.855524] EXT4-fs (xvda1): Remounting filesystem read-only
>>
>> And got a lot of " comm bash: bad entry in directory" messages then...
>>
>> Here is the current config with all Node back online:
>>
>> # gluster volume info
>>
>> Volume Name: gv0
>> Type: Replicate
>> Volume ID: 5f15c919-57e3-4648-b20a-395d9fe3d7d6
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: 10.0.0.1:/bricks/brick1/gv0
>> Brick2: 10.0.0.2:/bricks/brick1/gv0
>> Brick3: 10.0.0.3:/bricks/brick1/gv0 (arbiter)
>> Options Reconfigured:
>> nfs.disable: on
>> performance.readdir-ahead: on
>> transport.address-family: inet
>> features.shard: on
>> features.shard-block-size: 16MB
>> network.remote-dio: enable
>> cluster.eager-lock: enable
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> performance.stat-prefetch: on
>> performance.strict-write-ordering: off
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> cluster.data-self-heal: on
>>
>>
>> # gluster volume status
>> Status of volume: gv0
>> Gluster process                             TCP Port  RDMA Port  Online
>> Pid
>>
>> ------------------------------------------------------------------------------
>> Brick 10.0.0.1:/bricks/brick1/gv0           49152     0          Y
>> 1331
>> Brick 10.0.0.2:/bricks/brick1/gv0           49152     0          Y
>> 2274
>> Brick 10.0.0.3:/bricks/brick1/gv0           49152     0          Y
>> 2355
>> Self-heal Daemon on localhost               N/A       N/A        Y
>> 2300
>> Self-heal Daemon on 10.0.0.3                N/A       N/A        Y
>> 10530
>> Self-heal Daemon on 10.0.0.2                N/A       N/A        Y
>> 2425
>>
>> Task Status of Volume gv0
>>
>> ------------------------------------------------------------------------------
>> There are no active volume tasks
>>
>>
>>
>> On Thu, Nov 17, 2016 at 11:35 PM, Olivier Lambert
>> <lambert.olivier at gmail.com> wrote:
>> > It's planned to have an arbiter soon :) It was just preliminary tests.
>> >
>> > Thanks for the settings, I'll test this soon and I'll come back to you!
>> >
>> > On Thu, Nov 17, 2016 at 11:29 PM, Lindsay Mathieson
>> > <lindsay.mathieson at gmail.com> wrote:
>> >> On 18/11/2016 8:17 AM, Olivier Lambert wrote:
>> >>>
>> >>> gluster volume info gv0
>> >>>
>> >>> Volume Name: gv0
>> >>> Type: Replicate
>> >>> Volume ID: 2f8658ed-0d9d-4a6f-a00b-96e9d3470b53
>> >>> Status: Started
>> >>> Snapshot Count: 0
>> >>> Number of Bricks: 1 x 2 = 2
>> >>> Transport-type: tcp
>> >>> Bricks:
>> >>> Brick1: 10.0.0.1:/bricks/brick1/gv0
>> >>> Brick2: 10.0.0.2:/bricks/brick1/gv0
>> >>> Options Reconfigured:
>> >>> nfs.disable: on
>> >>> performance.readdir-ahead: on
>> >>> transport.address-family: inet
>> >>> features.shard: on
>> >>> features.shard-block-size: 16MB
>> >>
>> >>
>> >>
>> >> When hosting VM's its essential to set these options:
>> >>
>> >> network.remote-dio: enable
>> >> cluster.eager-lock: enable
>> >> performance.io-cache: off
>> >> performance.read-ahead: off
>> >> performance.quick-read: off
>> >> performance.stat-prefetch: on
>> >> performance.strict-write-ordering: off
>> >> cluster.server-quorum-type: server
>> >> cluster.quorum-type: auto
>> >> cluster.data-self-heal: on
>> >>
>> >> Also with replica two and quorum on (required) your volume will become
>> >> read-only when one node goes down to prevent the possibility of
>> >> split-brain
>> >> - you *really* want to avoid that :)
>> >>
>> >> I'd recommend a replica 3 volume, that way 1 node can go down, but the
>> >> other
>> >> two still form a quorum and will remain r/w.
>> >>
>> >> If the extra disks are not possible, then a Arbiter volume can be setup
>> >> -
>> >> basically dummy files on the third node.
>> >>
>> >>
>> >>
>> >> --
>> >> Lindsay Mathieson
>> >>
>> >> _______________________________________________
>> >> Gluster-users mailing list
>> >> Gluster-users at gluster.org
>> >> http://www.gluster.org/mailman/listinfo/gluster-users
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bricks-brick1-gv0.log
Type: text/x-log
Size: 60687 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20161118/8e4484dd/attachment.bin>


More information about the Gluster-users mailing list