[Gluster-users] Rebuild Distributed/Replicated Setup

Remi Broemeling remi at goclio.com
Mon May 16 17:17:33 UTC 2011


I've got a distributed/replicated GlusterFS v3.1.2 (installed via RPM) setup
across two servers (web01 and web02) with the following vol config:

volume shared-application-data-client-0
    type protocol/client
    option remote-host web01
    option remote-subvolume /var/glusterfs/bricks/shared
    option transport-type tcp
    option ping-timeout 5

volume shared-application-data-client-1
    type protocol/client
    option remote-host web02
    option remote-subvolume /var/glusterfs/bricks/shared
    option transport-type tcp
    option ping-timeout 5

volume shared-application-data-replicate-0
    type cluster/replicate
    subvolumes shared-application-data-client-0 shared-application-data-client-1

volume shared-application-data-write-behind
    type performance/write-behind
    subvolumes shared-application-data-replicate-0

volume shared-application-data-read-ahead
    type performance/read-ahead
    subvolumes shared-application-data-write-behind

volume shared-application-data-io-cache
    type performance/io-cache
    subvolumes shared-application-data-read-ahead

volume shared-application-data-quick-read
    type performance/quick-read
    subvolumes shared-application-data-io-cache

volume shared-application-data-stat-prefetch
    type performance/stat-prefetch
    subvolumes shared-application-data-quick-read

volume shared-application-data
    type debug/io-stats
    subvolumes shared-application-data-stat-prefetch
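
As a sanity check on a volfile like the one above, a small script can list
how many subvolumes each cluster/replicate volume names; a two-way mirror
across web01 and web02 would be expected to name two. This is only a sketch:
parse_volfile and replicate_widths are hypothetical helpers written for this
illustration, not part of GlusterFS, and SAMPLE is an abbreviated stand-in
for the real config rather than a copy of it.

```python
def parse_volfile(text):
    """Parse volfile text into {name: {"type", "options", "subvolumes"}}."""
    volumes = {}
    current = None
    for raw in text.splitlines():
        parts = raw.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        key = parts[0]
        if key == "volume" and len(parts) >= 2:
            # A "volume <name>" line opens a new block.
            current = {"type": None, "options": {}, "subvolumes": []}
            volumes[parts[1]] = current
        elif key == "end-volume":
            current = None
        elif current is None:
            continue  # ignore anything outside a volume block
        elif key == "type" and len(parts) >= 2:
            current["type"] = parts[1]
        elif key == "option" and len(parts) >= 3:
            current["options"][parts[1]] = " ".join(parts[2:])
        elif key == "subvolumes":
            current["subvolumes"] = parts[1:]
    return volumes


def replicate_widths(volumes):
    """Return {name: subvolume count} for every cluster/replicate volume."""
    return {
        name: len(vol["subvolumes"])
        for name, vol in volumes.items()
        if vol["type"] == "cluster/replicate"
    }


# Abbreviated, illustrative sample (assumption: both clients are listed).
SAMPLE = """
volume shared-application-data-client-0
    type protocol/client
    option remote-host web01

volume shared-application-data-client-1
    type protocol/client
    option remote-host web02

volume shared-application-data-replicate-0
    type cluster/replicate
    subvolumes shared-application-data-client-0 shared-application-data-client-1
"""

if __name__ == "__main__":
    for name, width in replicate_widths(parse_volfile(SAMPLE)).items():
        print(name, "replicates across", width, "subvolume(s)")
```

A replicate volume that reports a single subvolume is not mirroring anything,
so that would be the first thing to rule out before looking at the bricks.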

In total, four servers mount this via GlusterFS FUSE.  For whatever reason
(I'm really not sure why), the GlusterFS filesystem has run into a bit of a
split-brain nightmare (although to my knowledge an actual split-brain
situation has never occurred in this environment), and I have been getting
serious corruption issues across the filesystem as well as complaints that
the filesystem cannot be self-healed.

What I would like to do is completely empty one of the two servers (here I
am trying to empty server web01), making the other one (in this case web02)
the authoritative source for the data; and then have web01 completely
rebuild its mirror directly from web02.

What's the easiest/safest way to do this?  Is there a command that I can run
that will force web01 to re-initialize its mirror directly from web02 (and
thus completely eradicate all of the split-brain errors and data

