[Gluster-users] Geo_replication to Faulty

Kotresh Hiremath Ravishankar khiremat at redhat.com
Tue Nov 19 06:05:34 UTC 2019


Hi,

Those issues are fixed in gluster v6.6; please upgrade to 6.6. Non-root
geo-rep is stable in that release.
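
As a quick way to confirm the slave-side permission problem shown in the logs
further down this thread, something like the sketch below can be run on the
slave node as the geo-rep sync user (the brick paths in the logs suggest "sas").
This is illustrative only and not part of glusterfs; the gfid path is copied
from the "Permission denied" line in the quoted slave log and should be
replaced with the path from your own log.

    #!/usr/bin/env python3
    # Illustrative sketch only (not part of glusterfs): run on the slave node
    # AS the geo-rep sync user to see where access to the failing backend
    # gfid path is blocked.
    import os
    import sys

    # Path copied from the "Permission denied" line in the quoted slave log;
    # replace it with the path from your own log.
    GFID_PATH = ("/home/sas/gluster/data/code-misc6/.glusterfs/"
                 "6a/90/6a9008b1-a4aa-4c30-9ae7-92a33e05d0bb")

    def first_blocked_component(path):
        """Walk the path from / downwards and return the first component that
        cannot be stat'ed, or None if the whole path is reachable."""
        cur = "/"
        for part in path.strip("/").split("/"):
            cur = os.path.join(cur, part)
            try:
                os.lstat(cur)
            except OSError as err:
                return cur, err
        return None

    if __name__ == "__main__":
        blocked = first_blocked_component(GFID_PATH)
        if blocked is None:
            print("reachable: %s" % GFID_PATH)
            sys.exit(0)
        print("blocked at %s: %s" % blocked)
        sys.exit(1)

If it reports a blocked component, fixing the ownership or permissions of that
entry (or upgrading to 6.6 as suggested above) is the direction to look in.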

On Tue, Nov 19, 2019 at 11:24 AM deepu srinivasan <sdeepugd at gmail.com>
wrote:

> Hi
> We are using Gluster 5.6 now.
> We tested 6.2 earlier, but it had the gluster-mountbroker issue
> (https://bugzilla.redhat.com/show_bug.cgi?id=1709248).
>
> On Tue, Nov 19, 2019 at 11:22 AM Kotresh Hiremath Ravishankar <
> khiremat at redhat.com> wrote:
>
>> Which version of gluster are you using?
>>
>> On Tue, Nov 19, 2019 at 11:00 AM deepu srinivasan <sdeepugd at gmail.com>
>> wrote:
>>
>>> Hi Kotresh,
>>> Is there a stable release in the 6.x series?
>>>
>>>
>>> On Tue, Nov 19, 2019, 10:44 AM Kotresh Hiremath Ravishankar <
>>> khiremat at redhat.com> wrote:
>>>
>>>> This issue was recently fixed with the following patch and should be
>>>> available in the latest gluster 6.x release:
>>>>
>>>> https://review.gluster.org/#/c/glusterfs/+/23570/
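
For anyone wondering why the session drops straight to Faulty instead of
retrying: the slave traceback quoted further down ends in errno_wrap, which is
called with only ENOENT and ESTALE as tolerated errors, so the Errno 13 raised
on the root-owned entry is re-raised to entry_ops and the worker dies. The
following is a minimal sketch of that pattern, not the actual glusterfs code;
the path in the usage line is a placeholder.

    # Minimal sketch of the errno-filtering pattern seen in the tracebacks
    # below (it is NOT the actual glusterfs syncdutils.errno_wrap): only the
    # listed errnos are tolerated, so an unexpected EACCES ("Permission
    # denied") propagates out of entry_ops, the RPC call fails on the slave,
    # and the worker on the master turns Faulty.
    import errno
    import os

    def errno_tolerant(call, args, ignore_errnos, retry_errnos, retries=3):
        for _ in range(retries):
            try:
                return call(*args)
            except OSError as err:
                if err.errno in ignore_errnos:
                    return None        # e.g. ENOENT: treated as "nothing to do"
                if err.errno in retry_errnos:
                    continue           # e.g. ESTALE: assumed transient, retry
                raise                  # anything else (EACCES here) is fatal

    if __name__ == "__main__":
        # Placeholder path; with only ENOENT/ESTALE tolerated, a permission
        # error on a real path would be re-raised to the caller.
        errno_tolerant(os.lstat, ("/tmp/nonexistent-example",),
                       [errno.ENOENT], [errno.ESTALE])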
>>>>
>>>> On Tue, Nov 19, 2019 at 10:26 AM deepu srinivasan <sdeepugd at gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>> Hi Aravinda,
>>>>> *The logs below are from the master end:*
>>>>>
>>>>> [2019-11-16 17:29:43.536881] I [gsyncdstatus(worker
>>>>> /home/sas/gluster/data/code-misc6):281:set_active] GeorepStatus: Worker
>>>>> Status Change       status=Active
>>>>> [2019-11-16 17:29:43.629620] I [gsyncdstatus(worker
>>>>> /home/sas/gluster/data/code-misc6):253:set_worker_crawl_status]
>>>>> GeorepStatus: Crawl Status Change   status=History Crawl
>>>>> [2019-11-16 17:29:43.630328] I [master(worker
>>>>> /home/sas/gluster/data/code-misc6):1517:crawl] _GMaster: starting history
>>>>> crawl   turns=1 stime=(1573924576, 0)   entry_stime=(1573924576, 0)
>>>>> etime=1573925383
>>>>> [2019-11-16 17:29:44.636725] I [master(worker
>>>>> /home/sas/gluster/data/code-misc6):1546:crawl] _GMaster: slave's time
>>>>> stime=(1573924576, 0)
>>>>> [2019-11-16 17:29:44.778966] I [master(worker
>>>>> /home/sas/gluster/data/code-misc6):898:fix_possible_entry_failures]
>>>>> _GMaster: Fixing ENOENT error in slave. Parent does not exist on master.
>>>>> Safe to ignore, take out entry       retry_count=1   entry=({'uid': 0,
>>>>> 'gfid': 'c02519e0-0ead-4fe8-902b-dcae72ef83a3', 'gid': 0, 'mode': 33188,
>>>>> 'entry': '.gfid/d60aa0d5-4fdf-4721-97dc-9e3e50995dab/368307802', 'op':
>>>>> 'CREATE'}, 2, {'slave_isdir': False, 'gfid_mismatch': False, 'slave_name':
>>>>> None, 'slave_gfid': None, 'name_mismatch': False, 'dst': False})
>>>>> [2019-11-16 17:29:44.779306] I [master(worker
>>>>> /home/sas/gluster/data/code-misc6):942:handle_entry_failures] _GMaster:
>>>>> Sucessfully fixed entry ops with gfid mismatch    retry_count=1
>>>>> [2019-11-16 17:29:44.779516] I [master(worker
>>>>> /home/sas/gluster/data/code-misc6):1194:process_change] _GMaster: Retry
>>>>> original entries. count = 1
>>>>> [2019-11-16 17:29:44.879321] E [repce(worker
>>>>> /home/sas/gluster/data/code-misc6):214:__call__] RepceClient: call failed
>>>>>  call=151945:140353273153344:1573925384.78       method=entry_ops
>>>>>  error=OSError
>>>>> [2019-11-16 17:29:44.879750] E [syncdutils(worker
>>>>> /home/sas/gluster/data/code-misc6):338:log_raise_exception] <top>: FAIL:
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 322,
>>>>> in main
>>>>>     func(args)
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 82,
>>>>> in subcmd_worker
>>>>>     local.service_loop(remote)
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line
>>>>> 1277, in service_loop
>>>>>     g3.crawlwrap(oneshot=True)
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 599,
>>>>> in crawlwrap
>>>>>     self.crawl()
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line
>>>>> 1555, in crawl
>>>>>     self.changelogs_batch_process(changes)
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line
>>>>> 1455, in changelogs_batch_process
>>>>>     self.process(batch)
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line
>>>>> 1290, in process
>>>>>     self.process_change(change, done, retry)
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line
>>>>> 1195, in process_change
>>>>>     failures = self.slave.server.entry_ops(entries)
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 233,
>>>>> in __call__
>>>>>     return self.ins(self.meth, *a)
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 215,
>>>>> in __call__
>>>>>     raise res
>>>>> OSError: [Errno 13] Permission denied:
>>>>> '/home/sas/gluster/data/code-misc6/.glusterfs/6a/90/6a9008b1-a4aa-4c30-9ae7-92a33e05d0bb'
>>>>> [2019-11-16 17:29:44.911767] I [repce(agent
>>>>> /home/sas/gluster/data/code-misc6):97:service_loop] RepceServer:
>>>>> terminating on reaching EOF.
>>>>> [2019-11-16 17:29:45.509344] I [monitor(monitor):278:monitor] Monitor:
>>>>> worker died in startup phase     brick=/home/sas/gluster/data/code-misc6
>>>>> [2019-11-16 17:29:45.511806] I
>>>>> [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status
>>>>> Change status=Faulty
>>>>>
>>>>>
>>>>>
>>>>> *The logs below are from the slave end:*
>>>>>
>>>>> [2019-11-16 17:24:42.281599] I [resource(slave
>>>>> 192.168.185.106/home/sas/gluster/data/code-misc6):580:entry_ops]
>>>>> <top>: Special case: rename on mkdir
>>>>>  gfid=6a9008b1-a4aa-4c30-9ae7-92a33e05d0bb
>>>>> entry='.gfid/a8921d78-a078-46d3-aca5-8b078eb62cac/8878061b-d5b3-47a6-b01c-8310fee39b20'
>>>>> [2019-11-16 17:24:42.370582] E [repce(slave
>>>>> 192.168.185.106/home/sas/gluster/data/code-misc6):122:worker]
>>>>> <top>: call failed:
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 118,
>>>>> in worker
>>>>>     res = getattr(self.obj, rmeth)(*in_data[2:])
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line
>>>>> 581, in entry_ops
>>>>>     src_entry = get_slv_dir_path(slv_host, slv_volume, gfid)
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line
>>>>> 690, in get_slv_dir_path
>>>>>     [ENOENT], [ESTALE])
>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line
>>>>> 546, in errno_wrap
>>>>>     return call(*arg)
>>>>> OSError: [Errno 13] Permission denied:
>>>>> '/home/sas/gluster/data/code-misc6/.glusterfs/6a/90/6a9008b1-a4aa-4c30-9ae7-92a33e05d0bb'
>>>>> [2019-11-16 17:24:42.400402] I [repce(slave
>>>>> 192.168.185.106/home/sas/gluster/data/code-misc6):97:service_loop]
>>>>> RepceServer: terminating on reaching EOF.
>>>>> [2019-11-16 17:24:53.403165] W [gsyncd(slave
>>>>> 192.168.185.106/home/sas/gluster/data/code-misc6):304:main]
>>>>> <top>: Session config file not exists, using the default config
>>>>>  path=/var/lib/glusterd/geo-replication/code-misc_192.168.185.107_code-misc/gsyncd.con
>>>>>
>>>>>
>>>>> On Sat, Nov 16, 2019, 9:26 PM Aravinda Vishwanathapura Krishna Murthy <
>>>>> avishwan at redhat.com> wrote:
>>>>>
>>>>>> Hi Deepu,
>>>>>>
>>>>>> Please share the reason for the Faulty state from the geo-rep logs of
>>>>>> the respective master node.
>>>>>>
>>>>>>
>>>>>> On Sat, Nov 16, 2019 at 1:01 AM deepu srinivasan <sdeepugd at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Users/Development Team,
>>>>>>> We have set up a geo-replication session with a non-root user on the
>>>>>>> slave side in our DC.
>>>>>>> It was working well, with Active status and Changelog Crawl.
>>>>>>>
>>>>>>> The master volume was mounted and files were being written to it.
>>>>>>> Some processes were running as the root user, so they wrote files and
>>>>>>> folders with root permissions.
>>>>>>> After stopping the geo-replication and starting it again, the session
>>>>>>> went into the Faulty state.
>>>>>>> How do we recover?
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> regards
>>>>>> Aravinda VK
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Thanks and Regards,
>>>> Kotresh H R
>>>>
>>>
>>
>> --
>> Thanks and Regards,
>> Kotresh H R
>>
>

-- 
Thanks and Regards,
Kotresh H R