[Gluster-users] Geo-Replication - Changelog socket is not present - Falling back to xsync
PEPONNET, Cyril N (Cyril)
cyril.peponnet at alcatel-lucent.com
Thu May 28 15:54:54 UTC 2015
Hi Kotresh,
Inline.
Again, thanks for your time.
--
Cyril Peponnet
> On May 27, 2015, at 10:47 PM, Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>
> Hi Cyril,
>
> Replies inline.
>
> Thanks and Regards,
> Kotresh H R
>
> ----- Original Message -----
>> From: "Cyril N PEPONNET (Cyril)" <cyril.peponnet at alcatel-lucent.com>
>> To: "Kotresh Hiremath Ravishankar" <khiremat at redhat.com>
>> Cc: "gluster-users" <gluster-users at gluster.org>
>> Sent: Wednesday, May 27, 2015 9:28:00 PM
>> Subject: Re: [Gluster-users] Geo-Replication - Changelog socket is not present - Falling back to xsync
>>
>> Hi and thanks again for those explanation.
>>
>> Due to a lot of files being missing or not up to date (with gfid mismatches
>> in some cases), I reset the index (or at least I think I did) by:
>>
>> deleting the geo-rep session, resetting geo-replication.indexing (setting it to
>> off does not work for me), and recreating the session.
>>
> Resetting the index does not re-initiate geo-replication in versions where changelog
> is available. It works only for versions prior to it.
>
> NOTE 1: Recreation of the geo-rep session will work only if the slave doesn't contain
> files with mismatched gfids. If there are any, the slave should be cleaned up
> before recreating.
I started it again to transfer the missing files; I'll take care of the gfid mismatches afterward. Our volume is almost 5TB and it took almost 2 months to crawl to the slave, so I didn't want to start over :/
>
> NOTE 2: Another method exists now to initiate a full sync. It also expects that
> slave files are not in a gfid mismatch state (meaning the slave volume should not
> be written to by any means other than geo-replication). The method is to
> reset the stime on all the bricks of the master.
>
>
> Following are the steps to trigger a full sync. Let me know if you have any comments/doubts.
> ================================================
> 1. Stop geo-replication.
> 2. Remove the stime extended attribute from all the master brick roots using the following command:
>    setfattr -x trusted.glusterfs.<MASTER_VOL_UUID>.<SLAVE_VOL_UUID>.stime <brick-root>
> NOTE: 1. If AFR is set up, do this for all replicated sets.
>
> 2. The above-mentioned stime key can be found as follows: using
> 'gluster volume info <mastervol>', get all the brick paths, then dump all the
> extended attributes using 'getfattr -d -m . -e hex <brick-path>', which will
> show the stime key that should be removed. (A concrete sketch of this follows
> below the steps.)
>
> 3. This technique re-triggers a complete sync. It involves a complete xsync crawl.
> If there are rename issues, it might hit the rsync error on the complete re-sync as well.
> So, if the problematic files on the slave are known, it is recommended to remove
> them first and then initiate the complete sync.
Will a complete sync send the data again even if it is already present on the slave, or not? And how can I track down rename issues? The master is a living volume with lots of creations / renames / deletions.
>
> 3. Start geo-replication.
>
> The above technique can also be used to trigger a data sync on only one particular
> brick. Just removing the stime extended attribute on the root of the master brick
> to be synced will do. If AFR is set up, remove the stime on all bricks of the
> replicated set.
>
> ================================
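>
> As a rough sketch of steps 1 and 2 above (the brick path below is a
> placeholder and the UUIDs must be read from your own bricks; run as root
> on each brick node):
>
>     BRICK=/export/raid/vol   # example master brick root
>     # dump all extended attributes and locate the stime key
>     getfattr -d -m . -e hex $BRICK | grep stime
>     # suppose it prints trusted.glusterfs.<muuid>.<suuid>.stime=0x...
>     setfattr -x trusted.glusterfs.<muuid>.<suuid>.stime $BRICK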
>
>
>> So for now it's still in the hybrid crawl process.
>>
>> I ended up with that because some entire folders were not synced by the
>> first hybrid crawl (and touching them does nothing afterward in changelog mode).
>> In fact, touching any file doesn't trigger a resync; only delete/rename/change do.
>>
>
> In newer geo-replication, from the version where history crawl was introduced, the
> xsync crawl is minimized. Once it reaches the timestamp from which historical
> changelogs are available, it starts using the history changelogs. A touch is recorded
> as SETATTR in the changelog, so geo-rep will not sync the data. That is why the new
> virtual setxattr interface mentioned in the previous mail was introduced.
>
>> 1/
>>> 1. Directories:
>>> #setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <DIR>
>>> 2. Files:
>>> #setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <file-path>
>>
>> Is it recursive (for directories), or do I have to do that on each mismatching
>> file? Should I do that on the master or the slave?
>>
>
> No, it is not recursive; it should be done for every missing file and directory,
> and directories should be done before the files inside them.
> It should be done on the master. A rough scripted sketch follows.
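>
> As an illustration only (an untested sketch; /mnt/master is a placeholder
> for a fuse mount of the master volume, and paths with spaces would need
> find -print0), a whole missing subtree could be triggered with directories
> first, then files:
>
>     cd /mnt/master/path/to/missing/subtree
>     # find lists a directory before its contents, so parents are triggered first
>     find . -type d | while read d; do
>         setfattr -n glusterfs.geo-rep.trigger-sync -v "1" "$d"
>     done
>     find . -type f | while read f; do
>         setfattr -n glusterfs.geo-rep.trigger-sync -v "1" "$f"
>     done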
I don't understand the difference between setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <DIR> (vol level) and setfattr -x trusted.glusterfs.<MASTER_VOL_UUID>.<SLAVE_VOL_UUID>.stime <brick-root> (brick level).
>
>> 2/ For the RO, I can set the option nfs.volume-access to read-only; this
>> will make the volume RO for NFS mounts and glusterfs mounts. Correct?
>>
> Yes, that should do.
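> For example (a sketch; <slavevol> is a placeholder, and note that
> nfs.volume-access applies to the Gluster NFS server, while a separate
> features.read-only option exists for native fuse clients, depending on
> version):
>
>     gluster volume set <slavevol> nfs.volume-access read-only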
Cool ! Thanks!
>
>> Thank you so much for your help.
>> --
>> Cyril Peponnet
>>
>>> On May 26, 2015, at 11:29 PM, Kotresh Hiremath Ravishankar
>>> <khiremat at redhat.com> wrote:
>>>
>>> Hi Cyril,
>>>
>>> Need some clarifications. Comments inline.
>>>
>>> Thanks and Regards,
>>> Kotresh H R
>>>
>>> ----- Original Message -----
>>>> From: "Cyril N PEPONNET (Cyril)" <cyril.peponnet at alcatel-lucent.com>
>>>> To: "Kotresh Hiremath Ravishankar" <khiremat at redhat.com>
>>>> Cc: "gluster-users" <gluster-users at gluster.org>
>>>> Sent: Tuesday, May 26, 2015 11:43:44 PM
>>>> Subject: Re: [Gluster-users] Geo-Replication - Changelog socket is not
>>>> present - Falling back to xsync
>>>>
>>>> So, changelog is still active, but I noticed that some files were missing.
>>>>
>>>> So I'm running an rsync -avn between the two volumes (master and slave) to
>>>> sync them again by touching the missing files (hoping geo-rep will do the
>>>> rest).
>>>>
>>>>
>>> Are you running rsync -avn for the missing files between the master and slave
>>> volumes?
>>> If yes, that is dangerous and it should not be done. Geo-replication demands
>>> that the gfids of files on master and slave stay intact (meaning the gfid of
>>> 'file1' in the master volume should be the same as that of 'file1' on the
>>> slave). This is required because the data sync happens using the 'gfid', not
>>> the 'pathname' of the file. So if a manual rsync is used to sync files between
>>> master and slave by pathname, the gfids will change and further syncing of
>>> those files through geo-rep will fail.
>>>
>>> A virtual setxattr interface is provided to sync missing files through
>>> geo-replication.
>>> It makes sure gfids are intact.
>>>
>>> NOTE: Directories have to be synced to the slave before trying the setxattr for
>>> the files inside them.
>>>
>>> 1. Directories:
>>> #setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <DIR>
>>> 2. Files:
>>> #setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <file-path>
>>>
>>>> One question: can I make the slave volume RO? Because if somebody changes a
>>>> file on the slave, it's no longer synced (changes and deletes break, but
>>>> renames keep being synced between master and slave).
>>>>
>>>> Will it have an impact on the geo-replication process if I make the slave
>>>> volume RO?
>>>
>>> Again, if the slave volume is modified by something other than geo-rep, we
>>> might end up with mismatched gfids. So exposing the slave volume to consumers
>>> as RO is always a good idea. It doesn't affect geo-rep, as it internally
>>> mounts in RW.
>>>
>>> Hope this helps. Let us know if you need anything else. We are happy to help.
>>>>
>>>> Thanks again.
>>>>
>>>>
>>>> --
>>>> Cyril Peponnet
>>>>
>>>> On May 25, 2015, at 12:43 AM, Kotresh Hiremath Ravishankar
>>>> <khiremat at redhat.com> wrote:
>>>>
>>>> Hi Cyril,
>>>>
>>>> Answers inline
>>>>
>>>> Thanks and Regards,
>>>> Kotresh H R
>>>>
>>>> ----- Original Message -----
>>>> From: "Cyril N PEPONNET (Cyril)"
>>>> <cyril.peponnet at alcatel-lucent.com<mailto:cyril.peponnet at alcatel-lucent.com>>
>>>> To: "Kotresh Hiremath Ravishankar"
>>>> <khiremat at redhat.com<mailto:khiremat at redhat.com>>
>>>> Cc: "gluster-users"
>>>> <gluster-users at gluster.org<mailto:gluster-users at gluster.org>>
>>>> Sent: Friday, May 22, 2015 9:34:47 PM
>>>> Subject: Re: [Gluster-users] Geo-Replication - Changelog socket is not
>>>> present - Falling back to xsync
>>>>
>>>> One last question, correct me if I'm wrong.
>>>>
>>>> When you start a geo-rep process, it starts with xsync aka hybrid crawling
>>>> (sending files every 60s, with the file window set to 8192 files per batch).
>>>>
>>>> When the crawl is done, it should switch to the changelog detector and
>>>> dynamically propagate changes to the slave.
>>>>
>>>> 1/ During the hybrid crawl, if we delete files from the master (that were
>>>> already transferred to the slave), the xsync process will not delete them
>>>> from the slave (and we can't change that, as the option is hardcoded).
>>>> When it switches to changelog, will it remove the folders and files on the
>>>> slave that no longer exist on the master?
>>>>
>>>>
>>>> You are right, xsync does not sync deletes once a file has already been
>>>> synced.
>>>> After xsync, when it switches to changelog, it doesn't delete the entries
>>>> on the slave that are no longer on the master. Changelog is capable of
>>>> deleting files only from the time it switched over to changelog.
>>>>
>>>> 2/ With changelog, if I add a 10GB file and then a 1KB file, will the
>>>> changelog process queue (waiting for the 10GB file to be sent), or are the
>>>> transfers done in threads?
>>>> (e.g. if I add a 10GB file and delete it after 1 min, what will happen?)
>>>>
>>>> Changelog records the operations that happen on the master, and
>>>> geo-replication replays them onto the slave volume. Geo-replication syncs
>>>> files in two phases.
>>>>
>>>> 1. Phase-1: Create entries through RPC (0-byte files on the slave, keeping
>>>>    the gfid intact as on the master)
>>>> 2. Phase-2: Sync data, through rsync/tar_over_ssh (multi-threaded)
>>>>
>>>> Now, keeping that in mind: Phase-1 happens serially, and Phase-2 happens in
>>>> parallel. Zero-byte files for the 10GB and 1KB files get created on the
>>>> slave serially, and the data for them syncs in parallel. Another thing to
>>>> remember: geo-rep makes sure that syncing data to a file is attempted only
>>>> after the zero-byte file for it has already been created.
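>>>>
>>>> (As a rough way to observe Phase-1: the zero-byte placeholder entries can
>>>> be spotted on the slave brick with something like
>>>>     find /slave/brick/root -type f -size 0
>>>> where the brick path is a placeholder; keep in mind that genuinely empty
>>>> files will match too.)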
>>>>
>>>>
>>>> In the latest release, 3.7, the xsync crawl is minimized by a feature
>>>> called history crawl, introduced in 3.6.
>>>> So the chances of missing deletes/renames are lower.
>>>>
>>>> Thanks.
>>>>
>>>> --
>>>> Cyril Peponnet
>>>>
>>>> On May 21, 2015, at 10:22 PM, Kotresh Hiremath Ravishankar
>>>> <khiremat at redhat.com> wrote:
>>>>
>>>> Great, hope that works. Let's see.
>>>>
>>>> Thanks and Regards,
>>>> Kotresh H R
>>>>
>>>> ----- Original Message -----
>>>> From: "Cyril N PEPONNET (Cyril)"
>>>> <cyril.peponnet at alcatel-lucent.com<mailto:cyril.peponnet at alcatel-lucent.com>>
>>>> To: "Kotresh Hiremath Ravishankar"
>>>> <khiremat at redhat.com<mailto:khiremat at redhat.com>>
>>>> Cc: "gluster-users"
>>>> <gluster-users at gluster.org<mailto:gluster-users at gluster.org>>
>>>> Sent: Friday, May 22, 2015 5:31:13 AM
>>>> Subject: Re: [Gluster-users] Geo-Replication - Changelog socket is not
>>>> present - Falling back to xsync
>>>>
>>>> Thanks to JoeJulian / Kaushal, I managed to re-enable the changelog option
>>>> and the socket is now present.
>>>>
>>>> For the record, I had some clients running the RHS gluster-fuse client
>>>> while our nodes are running the upstream glusterfs release, and their
>>>> op-versions are not "compatible".
>>>>
>>>> Now I have to wait for the initial crawl to finish and see if it switches
>>>> to changelog detector mode.
>>>>
>>>> Thanks Kotresh
>>>> --
>>>> Cyril Peponnet
>>>>
>>>> On May 21, 2015, at 8:39 AM, Cyril Peponnet
>>>> <cyril.peponnet at alcatel-lucent.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Unfortunately,
>>>>
>>>> # gluster vol set usr_global changelog.changelog off
>>>> volume set: failed: Staging failed on
>>>> mvdcgluster01.us.alcatel-lucent.com.
>>>> Error: One or more connected clients cannot support the feature being
>>>> set.
>>>> These clients need to be upgraded or disconnected before running this
>>>> command again
>>>>
>>>>
>>>> I don't really know why; I have some clients using 3.6 as the fuse client,
>>>> while others are running 3.5.2.
>>>>
>>>> Any advice ?
>>>>
>>>> --
>>>> Cyril Peponnet
>>>>
>>>> On May 20, 2015, at 5:17 AM, Kotresh Hiremath Ravishankar
>>>> <khiremat at redhat.com> wrote:
>>>>
>>>> Hi Cyril,
>>>>
>>>> From the brick logs, it seems the changelog-notifier thread has been
>>>> killed for some reason, as notify is failing with EPIPE.
>>>>
>>>> Try the following. It should probably help:
>>>> 1. Stop geo-replication.
>>>> 2. Disable changelog: gluster vol set <master-vol-name>
>>>> changelog.changelog off
>>>> 3. Enable changelog: gluster vol set <master-vol-name>
>>>> changelog.changelog on
>>>> 4. Start geo-replication.
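>>>>
>>>> As a concrete sketch of the above (the volume and slave names are
>>>> placeholders; adjust to your session):
>>>>
>>>>     gluster volume geo-replication <mastervol> <slavehost>::<slavevol> stop
>>>>     gluster volume set <mastervol> changelog.changelog off
>>>>     gluster volume set <mastervol> changelog.changelog on
>>>>     gluster volume geo-replication <mastervol> <slavehost>::<slavevol> start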
>>>>
>>>> Let me know if it works.
>>>>
>>>> Thanks and Regards,
>>>> Kotresh H R
>>>>
>>>> ----- Original Message -----
>>>> From: "Cyril N PEPONNET (Cyril)"
>>>> <cyril.peponnet at alcatel-lucent.com<mailto:cyril.peponnet at alcatel-lucent.com>>
>>>> To: "gluster-users"
>>>> <gluster-users at gluster.org<mailto:gluster-users at gluster.org>>
>>>> Sent: Tuesday, May 19, 2015 3:16:22 AM
>>>> Subject: [Gluster-users] Geo-Replication - Changelog socket is not
>>>> present - Falling back to xsync
>>>>
>>>> Hi Gluster Community,
>>>>
>>>> I have a 3-node setup at location A and a 2-node setup at location B.
>>>>
>>>> All running 3.5.2 under CentOS 7.
>>>>
>>>> I have one volume that I sync through the geo-replication process.
>>>>
>>>> So far so good, the first step of geo-replication is done
>>>> (hybrid-crawl).
>>>>
>>>> Now I'd like to use the changelog detector in order to delete files on the
>>>> slave when they are gone on the master.
>>>>
>>>> But it always falls back to the xsync mechanism (even when I force it
>>>> using config changelog_detector changelog):
>>>>
>>>> [2015-05-18 12:29:49.543922] I [monitor(monitor):129:monitor] Monitor:
>>>> ------------------------------------------------------------
>>>> [2015-05-18 12:29:49.544018] I [monitor(monitor):130:monitor] Monitor:
>>>> starting gsyncd worker
>>>> [2015-05-18 12:29:49.614002] I [gsyncd(/export/raid/vol):532:main_i]
>>>> <top>:
>>>> syncing: gluster://localhost:vol ->
>>>> ssh://root@x.x.x.x:gluster://localhost:vol
>>>> [2015-05-18 12:29:54.696532] I
>>>> [master(/export/raid/vol):58:gmaster_builder]
>>>> <top>: setting up xsync change detection mode
>>>> [2015-05-18 12:29:54.696888] I [master(/export/raid/vol):357:__init__]
>>>> _GMaster: using 'rsync' as the sync engine
>>>> [2015-05-18 12:29:54.697930] I
>>>> [master(/export/raid/vol):58:gmaster_builder]
>>>> <top>: setting up changelog change detection mode
>>>> [2015-05-18 12:29:54.698160] I [master(/export/raid/vol):357:__init__]
>>>> _GMaster: using 'rsync' as the sync engine
>>>> [2015-05-18 12:29:54.699239] I [master(/export/raid/vol):1104:register]
>>>> _GMaster: xsync temp directory:
>>>> /var/run/gluster/vol/ssh%3A%2F%2Froot%40x.x.x.x%3Agluster%3A%2F%2F127.0.0.1%3Avol/ce749a38ba30d4171cd674ec00ab24f9/xsync
>>>> [2015-05-18 12:30:04.707216] I
>>>> [master(/export/raid/vol):682:fallback_xsync]
>>>> _GMaster: falling back to xsync mode
>>>> [2015-05-18 12:30:04.742422] I
>>>> [syncdutils(/export/raid/vol):192:finalize]
>>>> <top>: exiting.
>>>> [2015-05-18 12:30:05.708123] I [monitor(monitor):157:monitor] Monitor:
>>>> worker(/export/raid/vol) died in startup phase
>>>> [2015-05-18 12:30:05.708369] I [monitor(monitor):81:set_state] Monitor:
>>>> new
>>>> state: faulty
>>>> [201
>>>>
>>>> After some Python debugging and stack trace printing, I figured out that:
>>>>
>>>> /var/run/gluster/vol/ssh%3A%2F%2Froot%40x.x.x.x%3Agluster%3A%2F%2F127.0.0.1%3Avol/ce749a38ba30d4171cd674ec00ab24f9/changes.log
>>>>
>>>> [2015-05-18 19:41:24.511423] I
>>>> [gf-changelog.c:179:gf_changelog_notification_init] 0-glusterfs:
>>>> connecting
>>>> to changelog socket:
>>>> /var/run/gluster/changelog-ce749a38ba30d4171cd674ec00ab24f9.sock
>>>> (brick:
>>>> /export/raid/vol)
>>>> [2015-05-18 19:41:24.511445] W
>>>> [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs:
>>>> connection
>>>> attempt 1/5...
>>>> [2015-05-18 19:41:26.511556] W
>>>> [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs:
>>>> connection
>>>> attempt 2/5...
>>>> [2015-05-18 19:41:28.511670] W
>>>> [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs:
>>>> connection
>>>> attempt 3/5...
>>>> [2015-05-18 19:41:30.511790] W
>>>> [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs:
>>>> connection
>>>> attempt 4/5...
>>>> [2015-05-18 19:41:32.511890] W
>>>> [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs:
>>>> connection
>>>> attempt 5/5...
>>>> [2015-05-18 19:41:34.512016] E
>>>> [gf-changelog.c:204:gf_changelog_notification_init] 0-glusterfs: could
>>>> not
>>>> connect to changelog socket! bailing out...
>>>>
>>>>
>>>> /var/run/gluster/changelog-ce749a38ba30d4171cd674ec00ab24f9.sock doesn't
>>>> exist. So
>>>> https://github.com/gluster/glusterfs/blob/release-3.5/xlators/features/changelog/lib/src/gf-changelog.c#L431
>>>> is failing because
>>>> https://github.com/gluster/glusterfs/blob/release-3.5/xlators/features/changelog/lib/src/gf-changelog.c#L153
>>>> cannot open the socket file.
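>>>>
>>>> (As a quick sanity check, something like
>>>>     ls -l /var/run/gluster/changelog-*.sock
>>>> on the brick node shows whether any changelog socket exists at all.)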
>>>>
>>>> And I don't find any error related to changelog in the log files, except
>>>> in the brick logs on node 2 (site A):
>>>>
>>>> bricks/export-raid-vol.log-20150517:[2015-05-14 17:06:52.636908] E
>>>> [changelog-helpers.c:168:changelog_rollover_changelog] 0-vol-changelog:
>>>> Failed to send file name to notify thread (reason: Broken pipe)
>>>> bricks/export-raid-vol.log-20150517:[2015-05-14 17:06:52.636949] E
>>>> [changelog-helpers.c:280:changelog_handle_change] 0-vol-changelog:
>>>> Problem
>>>> rolling over changelog(s)
>>>>
>>>> gluster vol status is all fine, and the changelog options are enabled in
>>>> the vol file:
>>>>
>>>> volume vol-changelog
>>>> type features/changelog
>>>> option changelog on
>>>> option changelog-dir /export/raid/vol/.glusterfs/changelogs
>>>> option changelog-brick /export/raid/vol
>>>> subvolumes vol-posix
>>>> end-volume
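>>>>
>>>> (To double-check from the CLI, assuming the option was set with volume
>>>> set, something like
>>>>     gluster volume info vol | grep -i changelog
>>>> should list changelog.changelog among the reconfigured options.)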
>>>>
>>>> Any help will be appreciated :)
>>>>
>>>> Oh, BTW, it's hard to stop/restart the volume, as I have around 4k clients
>>>> connected.
>>>>
>>>> Thanks !
>>>>
>>>> --
>>>> Cyril Peponnet
>>>>
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>
>>>>
>>
>>