[Bugs] [Bug 1200372] New: Geo-rep fails with disperse volume

bugzilla at redhat.com
Tue Mar 10 12:32:32 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1200372

            Bug ID: 1200372
           Summary: Geo-rep fails with disperse volume
           Product: GlusterFS
           Version: mainline
         Component: disperse
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: smanjara at redhat.com
                CC: bugs at gluster.org, gluster-bugs at redhat.com



Description of problem: A geo-rep session goes into the faulty state when it is
created on a disperse volume.


Version-Release number of selected component (if applicable):
glusterfs-3.7dev-0.667.gitadef0c8.el6.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create and start a volume with disperse = 3, redundancy = 1 on both the
master and the slave cluster (see the command sketch after these steps).
2. Create geo-rep session and start it.
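
For reference, a minimal command sequence matching these steps (host names and
brick paths are placeholders, not taken from this report; six bricks with
disperse 3 / redundancy 1 give the 2 x (2 + 1) layout shown under Additional
info):

# gluster volume create master disperse 3 redundancy 1 \
      h1:/rhs/brick1/m1 h2:/rhs/brick1/m2 h3:/rhs/brick1/m3 \
      h1:/rhs/brick1/m4 h2:/rhs/brick1/m5 h3:/rhs/brick1/m6
# gluster volume start master

The same create/start is repeated for the slave volume on the slave cluster.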


Actual results:

Files fail to sync and the geo-rep status is faulty.


Expected results:

Files sync to the slave volume and the geo-rep session does not go faulty.

Additional info:

1. On the master cluster:
# gluster volume info

Volume Name: master
Type: Distributed-Disperse
Volume ID: fad80ec1-2ef8-47b5-a356-baa3f4e9c039
Status: Started
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: 10.x.x.x:/rhs/brick1/m1
Brick2: 10.x.x.x:/rhs/brick1/m2
Brick3: 10.x.x.x:/rhs/brick1/m3
Brick4: 10.x.x.x:/rhs/brick1/m4
Brick5: 10.x.x.x:/rhs/brick1/m5
Brick6: 10.x.x.x:/rhs/brick1/m6
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on

3. Set up passwordless SSH between the master and one slave node and run:

# gluster system:: execute gsec_create
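
The passwordless SSH here is the usual key-based setup, run from the same
master node (the slave address is a placeholder):

# ssh-keygen
# ssh-copy-id root@10.x.x.x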

4. Create and start a geo-rep session from the master to the slave volume:

# gluster volume geo-rep master 10.x.x.x::slave create push-pem
# gluster volume geo-rep master 10.x.x.x::slave start

5. Check the status of geo-rep:

# gluster volume geo-rep master 10.70.37.56::slave status

MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE              STATUS    CHECKPOINT STATUS    CRAWL STATUS
---------------------------------------------------------------------------------------------------------------------------
ecnode1        master        /rhs/brick1/m1    root          10.x.x.x::slave    faulty    N/A                  N/A
ecnode1        master        /rhs/brick1/m4    root          10.x.x.x::slave    faulty    N/A                  N/A
ecnode2        master        /rhs/brick1/m2    root          10.x.x.x::slave    faulty    N/A                  N/A
ecnode2        master        /rhs/brick1/m5    root          10.x.x.x::slave    faulty    N/A                  N/A
ecnode3        master        /rhs/brick1/m3    root          10.x.x.x::slave    faulty    N/A                  N/A
ecnode3        master        /rhs/brick1/m6    root          10.x.x.x::slave    faulty    N/A                  N/A

Logs from /var/log/glusterfs/geo-replication/master:

 E [syncdutils(/rhs/brick1/m4):275:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 164, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 646, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1333, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 486, in crawlwrap
    volinfo_sys = self.volinfo_hook()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 165, in volinfo_hook
    return self.get_sys_volinfo()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 331, in get_sys_volinfo
    self.master.server.aggregated.native_volume_info())
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1004, in native_volume_info
    'volume-mark']))
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 961, in _attr_unpack_dict
    buf = Xattr.lgetxattr('.', xattr, struct.calcsize(fmt_string))
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", line 55, in lgetxattr
    return cls._query_xattr(path, siz, 'lgetxattr', attr)
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", line 47, in _query_xattr
    cls.raise_oserr()
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", line 37, in raise_oserr
    raise OSError(errn, os.strerror(errn))
OSError: [Errno 22] Invalid argument

From etc-glusterfs-glusterd.vol.log:

[2015-03-10 06:53:47.130372] I [glusterd-geo-rep.c:3586:glusterd_read_status_file] 0-: Using passed config template(/var/lib/glusterd/geo-replication/master_10.70.37.56_slave/gsyncd.conf).
[2015-03-10 06:53:47.275735] E [glusterd-geo-rep.c:3266:glusterd_gsync_read_frm_status] 0-: Unable to read gsyncd status file
[2015-03-10 06:53:47.275789] E [glusterd-geo-rep.c:3673:glusterd_read_status_file] 0-: Unable to read the statusfile for /rhs/brick1/m1 brick for master(master), 10.70.37.56::slave(slave) session
[2015-03-10 06:53:47.419816] E [glusterd-geo-rep.c:3266:glusterd_gsync_read_frm_status] 0-: Unable to read gsyncd status file
[2015-03-10 06:53:47.419868] E [glusterd-geo-rep.c:3673:glusterd_read_status_file] 0-: Unable to read the statusfile for /rhs/brick1/m4 brick for master(master), 10.70.37.56::slave(slave) session
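
The status files glusterd fails to read live in the session directory named in
the first log line above; listing it shows whether gsyncd ever created them:

# ls /var/lib/glusterd/geo-replication/master_10.70.37.56_slave/

Since the worker crashes at startup (traceback above), the per-brick status
files are presumably never written.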


From the FUSE client log:

[2015-03-10 01:44:23.725119] W [fuse-bridge.c:3327:fuse_xattr_cbk] 0-glusterfs-fuse: 6: GETXATTR(trusted.glusterfs.volume-mark) / => -1 (Invalid argument)
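
The failing call can be checked by hand on a client mount of the master volume
(the mount point below is illustrative):

# mount -t glusterfs 10.x.x.x:/master /mnt/master
# getfattr -n trusted.glusterfs.volume-mark -e hex /mnt/master

On this disperse volume the getxattr fails with Invalid argument, matching both
the FUSE log entry above and the OSError(22) in the gsyncd traceback: gsyncd
issues the same lgetxattr for trusted.glusterfs.volume-mark from its auxiliary
mount to read the volume info, so the worker dies and the session stays faulty.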
