[Gluster-users] [URGENT] Add-bricks to a volume corrupted the files
Krutika Dhananjay
kdhananj at redhat.com
Mon Oct 17 05:02:39 UTC 2016
Hi,
No. I did run add-brick on a volume with the same configuration as
Kevin's while I/O was running, except that I wasn't running a VM
workload. I compared the file checksums against the original source
files from which they were copied, and they matched.
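(The comparison was simply a checksum of each copied file against its
source, along the lines of the following, with the paths being purely
illustrative:

    md5sum /path/to/source/file /path/to/gluster/mount/file
)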
@Kevin,
I see that network.ping-timeout on your setup is set to 15 seconds, which is
too low. Could you reconfigure it to 30 seconds?
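That is, something along the lines of:

    gluster volume set VMs network.ping-timeout 30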
-Krutika
On Fri, Oct 14, 2016 at 9:07 PM, David Gossage <dgossage at carouselchecks.com>
wrote:
> Sorry to resurrect an old email, but was any resolution or cause ever
> found for this? I see this as a task I may also need to run through some
> day, and if there are pitfalls to watch for it would be good to know.
>
> *David Gossage*
> *Carousel Checks Inc. | System Administrator*
> *Office* 708.613.2284
>
> On Tue, Sep 6, 2016 at 5:38 AM, Kevin Lemonnier <lemonnierk at ulrar.net>
> wrote:
>
>> Hi,
>>
>> Here is the info :
>>
>> Volume Name: VMs
>> Type: Replicate
>> Volume ID: c5272382-d0c8-4aa4-aced-dd25a064e45c
>> Status: Started
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: ips4adm.name:/mnt/storage/VMs
>> Brick2: ips5adm.name:/mnt/storage/VMs
>> Brick3: ips6adm.name:/mnt/storage/VMs
>> Options Reconfigured:
>> performance.readdir-ahead: on
>> cluster.quorum-type: auto
>> cluster.server-quorum-type: server
>> network.remote-dio: enable
>> cluster.eager-lock: enable
>> performance.quick-read: off
>> performance.read-ahead: off
>> performance.io-cache: off
>> performance.stat-prefetch: off
>> features.shard: on
>> features.shard-block-size: 64MB
>> cluster.data-self-heal-algorithm: full
>> network.ping-timeout: 15
>>
>>
>> As for the logs, I'm sending those over to you in private.
>>
>>
>> On Tue, Sep 06, 2016 at 09:48:07AM +0530, Krutika Dhananjay wrote:
>> > Could you please attach the glusterfs client and brick logs?
>> > Also provide output of `gluster volume info`.
>> > -Krutika
>> > On Tue, Sep 6, 2016 at 4:29 AM, Kevin Lemonnier <lemonnierk at ulrar.net>
>> > wrote:
>> >
>> > > - What was the original (and current) geometry? (status and info)
>> >
>> > It was a 1x3 that I was trying to bump to 2x3.
>> >
>> > > - What parameters did you use when adding the bricks?
>> >
>> > Just a simple add-brick node1:/path node2:/path node3:/path,
>> > then a fix-layout when everything started going wrong.
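>> > (In full gluster syntax that would be something along these lines,
>> > with the real hostnames and brick paths in place of the placeholders:
>> >
>> >     gluster volume add-brick VMs node1:/path node2:/path node3:/path
>> >     gluster volume rebalance VMs fix-layout start
>> > )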
>> >
>> > I was able to salvage some VMs by stopping them and then starting them
>> > again, but most won't start for various reasons (disk corrupted, grub
>> > not found ...). For those we are deleting the disks and importing them
>> > from backups. That's a huge loss, but everything has been down for so
>> > long that there is no choice.
>> > > On 6/09/2016 8:00 AM, Kevin Lemonnier wrote:
>> > >
>> > > I tried a fix-layout, and since that didn't work I removed the bricks
>> > > (start, then commit when it showed completed). It's no better: the
>> > > volume is now running on the 3 original bricks (replica 3), but the
>> > > VMs are still corrupted. For some reason I have 880 MB of shards left
>> > > on the bricks I removed; those shards do exist (and are bigger) on
>> > > the "live" volume. I don't understand why everything isn't working
>> > > like before now that I have removed the new bricks.
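>> > > (That is, the usual remove-brick sequence, again with placeholder
>> > > hostnames and paths:
>> > >
>> > >     gluster volume remove-brick VMs node1:/path node2:/path node3:/path start
>> > >     gluster volume remove-brick VMs node1:/path node2:/path node3:/path status
>> > >     gluster volume remove-brick VMs node1:/path node2:/path node3:/path commit
>> > > )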
>> > >
>> > > On Mon, Sep 05, 2016 at 11:06:16PM +0200, Kevin Lemonnier wrote:
>> > >
>> > > Hi,
>> > >
>> > > I just added 3 bricks to a volume and all the VMs are getting I/O
>> > > errors now. I rebooted a VM to check and it can't start again. Am I
>> > > missing something? Is the rebalance required to make everything run?
>> > >
>> > > That's urgent, thanks.
>> > >
>> > > --
>> > > Kevin Lemonnier
>> > > PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > --
>> > > Lindsay Mathieson
>> >
>> >
>> > --
>> > Kevin Lemonnier
>> > PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
>>
>> --
>> Kevin Lemonnier
>> PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
>>
>>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>