[Gluster-users] geo-rep will not initialize

Karl Kleinpaste karl at kleinpaste.org
Fri Aug 30 20:29:30 UTC 2024


On 8/30/24 04:17, Strahil Nikolov wrote:
> Have you done the following setup on the receiving gluster volume:

Yes. For completeness' sake:
grep geoacct /etc/passwd /etc/group
/etc/passwd:geoacct:x:5273:5273:gluster geo-replication:/var/lib/glusterd/geoacct:/bin/bash
/etc/group:geoacct:x:5273:

gluster-mountbroker status
+-----------+-------------+---------------------------+-------------+----------------+
|    NODE   | NODE STATUS |         MOUNT ROOT        |    GROUP    |     USERS      |
+-----------+-------------+---------------------------+-------------+----------------+
|    pms    |          UP | /var/mountbroker-root(OK) | geoacct(OK) | geoacct(j, n)  |
| localhost |          UP | /var/mountbroker-root(OK) | geoacct(OK) | geoacct(n, j)  |
+-----------+-------------+---------------------------+-------------+----------------+

restarted glusterd
ssh-keyed, no-password access established.
gluster system:: execute gsec_create
reports success, and /var/lib/glusterd/geo-replication/common_secret.pem.pub
exists on both systems, containing 4 lines.
created geo-rep session, and configured ssh-port, as before, successful.
Issued start command. Status report still says Created.
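
For anyone retracing the steps above, they correspond roughly to the command sequence printed by this dry-run sketch. The exact invocations were not shown in this message, so the "create push-pem" and "config ssh-port" forms are assumptions based on the usual geo-replication CLI, and georep_cmds is a hypothetical helper name. It only prints the commands; it does not run them.

```shell
#!/bin/sh
# Dry run: print the geo-replication commands for one session.
# Nothing is executed. The command forms are assumptions; check them
# against the CLI help of the installed gluster version before use.
georep_cmds() {
    primary_vol=$1
    user=$2
    host=$3
    secondary_vol=$4
    port=$5
    base="gluster volume geo-replication ${primary_vol} ${user}@${host}::${secondary_vol}"
    echo "${base} create push-pem"
    echo "${base} config ssh-port ${port}"
    echo "${base} start"
    echo "${base} status"
}

# Values taken from this thread: volumes j and n, user geoacct, host pms.
georep_cmds j geoacct pms n 6427
```

Printing the sequence first makes it easy to compare against what was actually typed on the primary node.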

PRIMARY NODE    PRIMARY VOL    PRIMARY BRICK    SECONDARY USER    SECONDARY         SECONDARY NODE    STATUS     CRAWL STATUS    LAST_SYNCED
--------------------------------------------------------------------------------------------------------------------------------------------
pjs             j              /xx/brick/j      geoacct           geoacct@pms::n    N/A               Created    N/A             N/A
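
When checking repeatedly whether the session has left the Created state, it can help to reduce status output like the above to just the node and its status. The following is a minimal sketch; it assumes the nine-column layout shown here, with STATUS as the 7th whitespace-separated field, and georep_status is a hypothetical helper name.

```shell
#!/bin/sh
# Read geo-rep status output on stdin and print "<primary node> <status>"
# for each data row. Rows are taken to be the lines after the dashed
# separator; STATUS is assumed to be the 7th whitespace-separated field.
georep_status() {
    awk '/^-+$/ { body = 1; next }
         body && NF >= 7 { print $1, $7 }'
}

# Possible usage (requires a configured session):
#   gluster volume geo-replication j geoacct@pms::n status | georep_status
```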

> Also, I think the source node must be able to reach the gluster slave. 
> Try to mount the slave vol on the master via the fuse client in order 
> to verify the status.

Each has mounted the other's volume. A single file, /gluster/j/stuff, is 
seen by both and is not replicated into /gluster/n.

> Also, check with the following command (found it in
> https://access.redhat.com/solutions/2616791 ):
> sh -x /usr/libexec/glusterfs/gverify.sh masterVol slaveUser slaveHost slaveVol sshPort logFileName

That must be the wrong URL; "libexec" doesn't appear there.
However, running it with locally-appropriate args:
/usr/libexec/glusterfs/gverify.sh j geoacct pms n 6427 /tmp/verify.log
...generates a great deal of regular logging output, logs nothing in 
/tmp/verify.log, but the shell execution trace made this complaint:
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
So I went looking for directories that might be restricted (it didn't 
tell me which one it didn't like), thus:
ls -ld / /gluster /gluster/? /xx /xx/brick /xx/brick/j
find /var/lib/glu* -type d | xargs ls -ld
The only directory, on both systems, that was at all restricted was 
/var/lib/glusterd/geo-replication, so...
chmod ugo+rx /var/lib/glusterd/geo-replication
...to take care of that. I attempted the start again, to no effect; the
session remains in Created state.
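
The directory hunt above can be made less manual. The sketch below is not part of gluster: walk_up and restricted_dirs are hypothetical helper names. The first lists a directory and every ancestor with its permissions, and the second prints only directories that lack read or search permission for "other". Whether missing permissions are actually what triggers the getcwd complaint is an assumption; that message can also appear when a shell starts in a directory that no longer exists.

```shell
#!/bin/sh
# List an absolute directory path and each of its ancestors with ls -ld,
# so a restrictive component anywhere on the path stands out.
walk_up() {
    dir=$1
    while :; do
        ls -ld "$dir"
        case $dir in
            /|.) break ;;   # stop at the root (or at "." for a relative path)
        esac
        dir=$(dirname "$dir")
    done
}

# Print directories under the given roots that are missing o+r or o+x.
restricted_dirs() {
    find "$@" -type d ! -perm -005 -print
}

# Possible usage on each node:
#   walk_up /xx/brick/j
#   restricted_dirs /var/lib/glusterd
```

Run as root so that find can descend into restricted directories instead of stopping at them.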

I wish there was a single, exhaustive description of "problems causing 
georep not to initiate."
The fact that it reports apparent success without moving forward to 
Active state is odd, and maddening.
Which part of the process is waiting, and for what event or state? Having
started (in principle), what evaluates the conditions for moving to Active?