[Gluster-users] Géo-rep fail
Csaba Henk
csaba at gluster.com
Tue May 17 08:30:03 UTC 2011
On 05/17/11 13:04, anthony garnier wrote:
> Hi,
> I've put the Client log in Debug mod :
> # gluster volume geo-replication /soft/venus config log-level DEBUG
> geo-replication config updated successfully
>
> # gluster volume geo-replication /soft/venus config log-file
> /usr/local/var/log/glusterfs/geo-replication-slaves/${session_owner}:file%3A%2F%2F%2Fsoft%2Fvenus.log
>
> # gluster volume geo-replication athena /soft/venus config session-owner
> 28cbd261-3a3e-4a5a-b300-ea468483c944
>
> # gluster volume geo-replication athena /soft/venus start
> Starting geo-replication session between athena & /soft/venus has been
> successful
>
> # gluster volume geo-replication athena /soft/venus status
> MASTER SLAVE STATUS
> --------------------------------------------------------------------------------
> athena /soft/venus starting...
>
> and then :
>
> # gluster volume geo-replication athena /soft/venus status
> MASTER SLAVE STATUS
> --------------------------------------------------------------------------------
> athena /soft/venus faulty
Is this edited output? I'd expect to see the full
slave URL, i.e. file:///soft/venus, in the status output.
> For client :
> cat
> /usr/local/var/log/glusterfs/geo-replication-slaves/28cbd261-3a3e-4a5a-b300-ea468483c944:file%3A%2F%2F%2Fsoft%2Fvenus.log
>
>
> [2011-05-17 09:20:40.519731] I [gsyncd(slave):287:main_i] <top>:
> syncing: file:///soft/venus
> [2011-05-17 09:20:40.520587] I [resource(slave):200:service_loop] FILE:
> slave listening
> [2011-05-17 09:20:40.532951] I [repce(slave):61:service_loop]
> RepceServer: terminating on reaching EOF.
> [2011-05-17 09:21:50.528803] I [gsyncd(slave):287:main_i] <top>:
> syncing: file:///soft/venus
> [2011-05-17 09:21:50.529666] I [resource(slave):200:service_loop] FILE:
> slave listening
> [2011-05-17 09:21:50.542349] I [repce(slave):61:service_loop]
> RepceServer: terminating on reaching EOF.
>
>
>
> For server :
> # cat
> /usr/local/var/log/glusterfs/geo-replication/athena/file%3A%2F%2F%2Fsoft%2Fvenus.log
>
> [2011-05-17 09:30:04.431369] I [monitor(monitor):42:monitor] Monitor:
> ------------------------------------------------------------
> [2011-05-17 09:30:04.431669] I [monitor(monitor):43:monitor] Monitor:
> starting gsyncd worker
> [2011-05-17 09:30:04.486852] I [gsyncd:287:main_i] <top>: syncing:
> gluster://localhost:athena -> file:///soft/venus
[...]
> raise RuntimeError("command failed: " + " ".join(argv))
> RuntimeError: command failed: /usr/local/sbin/glusterfs --xlator-option
> *-dht.assert-no-child-down=true -l
> /usr/local/var/log/glusterfs/geo-replication/athena/file%3A%2F%2F%2Fsoft%2Fvenus.gluster.log
> -s localhost --volfile-id athena --client-pid=-1
> /tmp/gsyncd-aux-mount-TEqjwY
> [2011-05-17 09:30:04.647973] D [monitor(monitor):57:monitor] Monitor:
> worker got connected in 0 sec, waiting 59 more to make sure it's fine
This is interesting in that the error you get now is not the same as in
your first post. More precisely, the _symptoms_ are different; the
underlying error may well be the same. I can imagine there is a race
between exceptional events, and it's accidental which one interrupts
the event flow.
So, it seems that the auxiliary glusterfs instance used by the master
gsyncd fails. (Side note: if you prefer client/server terminology to
master/slave, that's fine, but then the master should be called the
client and the slave the server, i.e. the reverse of your usage :) )
To see what's wrong with it, I again ask for the respective logs:
## setting DEBUG loglevel for master's aux glusterfs
# gluster volume geo-replication athena /soft/venus config \
gluster-log-level DEBUG
## getting the path of the logfile of aux glusterfs
# gluster volume geo-replication athena /soft/venus config \
gluster-log-file
So please post the contents of that latter log file.
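For reference, the whole sequence on the master side might look like the sketch below. This is only illustrative: the log path shown is the one from your error message above, your installation may report a different one, and a session restart may be needed for the new log level to take effect.

```shell
## raise the log level of the auxiliary glusterfs client (master side)
gluster volume geo-replication athena /soft/venus config \
    gluster-log-level DEBUG

## ask for the path of the aux glusterfs client's log file
gluster volume geo-replication athena /soft/venus config \
    gluster-log-file

## restart the session so the new log level is picked up
gluster volume geo-replication athena /soft/venus stop
gluster volume geo-replication athena /soft/venus start

## watch the reported log file for mount failures
## (illustrative path, taken from the traceback quoted above)
tail -f /usr/local/var/log/glusterfs/geo-replication/athena/file%3A%2F%2F%2Fsoft%2Fvenus.gluster.log
```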
Csaba