[Gluster-devel] Gluster 3.5.0 geo-replication with multiple bricks
system admin
sysbcn74 at gmail.com
Mon Jun 30 15:06:43 UTC 2014
Hi all.
I've recently installed three Gluster 3.5 servers: two masters and one
geo-replication slave, each with 2 bricks. After some configuration
problems it seems that everything is working OK, but I've found some problems
with geo-replication.
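For reference, the geo-replication session was created with the standard
commands, more or less as below (exact create options may have differed
slightly on my side):

# roughly how the session was set up; exact create options may have differed
gluster volume geo-replication jbpre01vol filepre05::jbpre01slvol create push-pem
gluster volume geo-replication jbpre01vol filepre05::jbpre01slvol start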
First I'd like to ask one question, because I couldn't find the answer
either in the documentation or on any mailing list.
This is the volume configuration:
root@filepre03:/gluster/jbossbricks/pre01/disk01/b01/.glusterfs/changelogs# gluster v info
(master)
Volume Name: jbpre01vol
Type: Distributed-Replicate
Volume ID: 316231f7-20bf-44f6-9d9b-20d4e3b27c2c
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: filepre03:/gluster/jbossbricks/pre01/disk01/b01
Brick2: filepre04:/gluster/jbossbricks/pre01/disk01/b01
Brick3: filepre03:/gluster/jbossbricks/pre01/disk02/b02
Brick4: filepre04:/gluster/jbossbricks/pre01/disk02/b02
Options Reconfigured:
diagnostics.brick-log-level: WARNING
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
(geo-replica)
Volume Name: jbpre01slvol
Type: Distribute
Volume ID: 0a4d2f3e-c803-4cfe-971b-2f8107180a69
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: filepre05:/gluster/jbossbricks/pre01/disk01/b01
Brick2: filepre05:/gluster/jbossbricks/pre01/disk02/b02
Options Reconfigured:
diagnostics.brick-log-level: WARNING
And geo-replication is running on bricks b01 and b02:
root@filepre03:/gluster/jbossbricks/pre01/disk01/b01/.glusterfs/changelogs# gluster v g jbpre01vol filepre05::jbpre01slvol status
MASTER NODE    MASTER VOL    MASTER BRICK                             SLAVE                      STATUS     CHECKPOINT STATUS    CRAWL STATUS
----------------------------------------------------------------------------------------------------------------------------------------------
filepre03      jbpre01vol    /gluster/jbossbricks/pre01/disk01/b01    filepre05::jbpre01slvol    Active     N/A                  Changelog Crawl
filepre03      jbpre01vol    /gluster/jbossbricks/pre01/disk02/b02    filepre05::jbpre01slvol    Active     N/A                  Changelog Crawl
filepre04      jbpre01vol    /gluster/jbossbricks/pre01/disk01/b01    filepre05::jbpre01slvol    Passive    N/A                  N/A
filepre04      jbpre01vol    /gluster/jbossbricks/pre01/disk02/b02    filepre05::jbpre01slvol    Passive    N/A                  N/A
Tests are done from another server that mounts both the master and the slave volumes:
root@testgluster:/mnt/gluster# mount | grep gluster
filepre03:/jbpre01vol on /mnt/gluster/pre01filepre03 type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
filepre04:/jbpre01vol on /mnt/gluster/pre01filepre04 type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
filepre05:/jbpre01slvol on /mnt/gluster/pre01filepre05 type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
My question is about directory dates on the geo-replica: in all my tests,
the directory date on the remote server shows the time when replication was
executed, not the original date. Is this the usual behaviour?
For example:
root@testgluster:/mnt/gluster# mkdir /mnt/gluster/pre01filepre03/TESTDIR1
root@testgluster:/mnt/gluster# echo TEST > pre01filepre03/TESTDIR1/TESTING1
After a while, Gluster has created the directory and the file, but the
directory's date on the geo-replica is the current date, not the original one:
root@testgluster:/mnt/gluster# ls -d --full-time pre01filepre0*/TESTDIR1
drwxr-xr-x 2 root root 8192 2014-06-30 11:55:18.651528230 +0200 pre01filepre03/TESTDIR1
drwxr-xr-x 2 root root 8192 2014-06-30 11:55:18.652637248 +0200 pre01filepre04/TESTDIR1
drwxr-xr-x 2 root root 8192 2014-06-30 11:56:14.087626822 +0200 pre01filepre05/TESTDIR1   (geo-replica)
However, the file is replicated with its original date:
root@testgluster:/mnt/gluster# find . -type f -exec ls --full-time {} \;
-rw-r--r-- 1 root root 5 2014-06-30 11:55:18.664637725 +0200 ./pre01filepre04/TESTDIR1/TESTING1
-rw-r--r-- 1 root root 5 2014-06-30 11:55:18.663528750 +0200 ./pre01filepre03/TESTDIR1/TESTING1
-rw-r--r-- 1 root root 5 2014-06-30 11:55:18.000000000 +0200 ./pre01filepre05/TESTDIR1/TESTING1   (geo-replica)
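The mtime difference is easy to see side by side with stat, run from the same
client mount point (any matching master/slave directory pair shows the same thing):

# master copy vs geo-replica copy of the same directory
stat -c '%n  mtime: %y' pre01filepre03/TESTDIR1 pre01filepre05/TESTDIR1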
This makes it difficult to validate syncing between the masters and the slave
with tools like rsync, because the directories always show as different:
root@testgluster:/mnt/gluster# rsync -avn pre01filepre03/TESTDIR1/ pre01filepre05
sending incremental file list
./
TESTING1
sent 49 bytes  received 18 bytes  134.00 bytes/sec
total size is 5  speedup is 0.07 (DRY RUN)
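A possible workaround for validation runs seems to be telling rsync to skip
directory times with -O / --omit-dir-times and comparing the same directory on
both sides (not tested in depth, just from the rsync man page):

# dry run, ignoring directory mtimes (-O), same TESTDIR1 on master and slave mounts
rsync -avn -O pre01filepre03/TESTDIR1/ pre01filepre05/TESTDIR1/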
Next, I'd like to ask whether anyone has seen the following problem when a
brick on the remote server goes down.
After checking the current status, I killed one brick process on the remote server:
root@filepre03:~# gluster v status jbpre01vol
Status of volume: jbpre01vol
Gluster process                                          Port     Online    Pid
------------------------------------------------------------------------------
Brick filepre03:/gluster/jbossbricks/pre01/disk01/b01    49169    Y         8167
Brick filepre04:/gluster/jbossbricks/pre01/disk01/b01    49172    Y         7027
Brick filepre03:/gluster/jbossbricks/pre01/disk02/b02    49170    Y         8180
Brick filepre04:/gluster/jbossbricks/pre01/disk02/b02    49173    Y         7040
NFS Server on localhost                                  2049     Y         2088
Self-heal Daemon on localhost                            N/A      Y         30873
NFS Server on filepre04                                  2049     Y         9171
Self-heal Daemon on filepre04                            N/A      Y         7061
NFS Server on filepre05                                  2049     Y         1128
Self-heal Daemon on filepre05                            N/A      Y         1137
root@filepre03:~# gluster v status jbpre01slvol
Status of volume: jbpre01slvol
Gluster process Port Online Pid
------------------------------------------------------------------------------
Brick filepre05:/gluster/jbossbricks/pre01/disk01/b01 49152 Y 6321
Brick filepre05:/gluster/jbossbricks/pre01/disk02/b02 49155 Y 6375
NFS Server on localhost 2049 Y 2088
NFS Server on filepre04 2049 Y 9171
NFS Server on filepre05 2049 Y 1128
root@filepre03:~# gluster v g status
MASTER NODE    MASTER VOL    MASTER BRICK                             SLAVE                      STATUS     CHECKPOINT STATUS    CRAWL STATUS
----------------------------------------------------------------------------------------------------------------------------------------------
filepre03      jbpre01vol    /gluster/jbossbricks/pre01/disk01/b01    filepre05::jbpre01slvol    Active     N/A                  Changelog Crawl
filepre03      jbpre01vol    /gluster/jbossbricks/pre01/disk02/b02    filepre05::jbpre01slvol    Active     N/A                  Changelog Crawl
filepre04      jbpre01vol    /gluster/jbossbricks/pre01/disk01/b01    filepre05::jbpre01slvol    Passive    N/A                  N/A
filepre04      jbpre01vol    /gluster/jbossbricks/pre01/disk02/b02    filepre05::jbpre01slvol    Passive    N/A                  N/A
root@filepre05:/gluster/jbossbricks/pre01# kill -9 6375   (slave: brick 02 process)
root@filepre03:~# gluster v status jbpre01slvol
Gluster process                                          Port     Online    Pid
------------------------------------------------------------------------------
Brick filepre05:/gluster/jbossbricks/pre01/disk01/b01    49152    Y         6321
Brick filepre05:/gluster/jbossbricks/pre01/disk02/b02    N/A      N         6375
NFS Server on localhost                                  2049     Y         2088
NFS Server on filepre04                                  2049     Y         9171
NFS Server on filepre05                                  2049     Y         1128
If I kill only one brick process, geo-replication doesn't show any problem,
and it doesn't detect anything wrong when a client writes to that brick.
I write some files:
root@testgluster:/mnt/gluster# echo "TEST2" >pre01filepre03/TESTFILE2
root@testgluster:/mnt/gluster# echo "TEST3" >pre01filepre03/TESTFILE3
root@testgluster:/mnt/gluster# echo "TEST4" >pre01filepre03/TESTFILE4
root@testgluster:/mnt/gluster# echo "TEST5" >pre01filepre03/TESTFILE5
Then, I check where they are:
root@filepre03:~# find /gluster -name "TESTFILE*" -exec ls -l {} \;
-rw-r--r-- 2 root root 6 Jun 30 16:11 /gluster/jbossbricks/pre01/disk01/b01/TESTFILE3
-rw-r--r-- 2 root root 6 Jun 30 16:11 /gluster/jbossbricks/pre01/disk01/b01/TESTFILE4
-rw-r--r-- 2 root root 6 Jun 30 16:09 /gluster/jbossbricks/pre01/disk01/b01/TESTFILE2
-rw-r--r-- 2 root root 6 Jun 30 16:12 /gluster/jbossbricks/pre01/disk02/b02/TESTFILE5   <--- brick 02
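As a cross-check, which brick a file hashes to can also be read from the
client mount through the pathinfo virtual xattr (the mount path below is just
my test client's):

# shows the backend brick path(s) for the file, as seen from the FUSE mount
getfattr -n trusted.glusterfs.pathinfo /mnt/gluster/pre01filepre03/TESTFILE5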
Finally, I check the geo-replication status:
root@filepre03:~# gluster v g jbpre01vol filepre05::jbpre01slvol status detail
MASTER NODE    MASTER VOL    MASTER BRICK                             SLAVE                      STATUS     CHECKPOINT STATUS    CRAWL STATUS       FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
filepre03      jbpre01vol    /gluster/jbossbricks/pre01/disk01/b01    filepre05::jbpre01slvol    Active     N/A                  Changelog Crawl    3738           0                0                0                  0
filepre03      jbpre01vol    /gluster/jbossbricks/pre01/disk02/b02    filepre05::jbpre01slvol    Active     N/A                  Changelog Crawl    3891           0                0                0                  0
filepre04      jbpre01vol    /gluster/jbossbricks/pre01/disk01/b01    filepre05::jbpre01slvol    Passive    N/A                  N/A                0              0                0                0                  0
filepre04      jbpre01vol    /gluster/jbossbricks/pre01/disk02/b02    filepre05::jbpre01slvol    Passive    N/A                  N/A                0              0                0                0                  0
No problem is shown and no files are pending sync, but on the remote server:
root@testgluster:/mnt/gluster# ll pre01filepre05
total 14
drwxr-xr-x 4 root root 4096 Jun 30 16:12 ./
drwxr-xr-x 5 root root 4096 Jun 26 12:57 ../
-rw-r--r-- 1 root root 6 Jun 30 16:09 TESTFILE2
-rw-r--r-- 1 root root 6 Jun 30 16:11 TESTFILE3
-rw-r--r-- 1 root root 6 Jun 30 16:11 TESTFILE4
As you can see, only the files written to brick b01 have been replicated;
TESTFILE5, which lives on brick b02, is missing on the slave.
I can't find any gluster command that returns a failed status.
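Is grepping the gsyncd worker logs the only way to spot this? On my install
they seem to live under paths like these (locations may differ per
distribution/version):

# master side (per master volume)
tail -n 50 /var/log/glusterfs/geo-replication/jbpre01vol/*.log
# slave side
ls /var/log/glusterfs/geo-replication-slaves/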
One more comment: if I kill all the brick processes on the remote server,
then the geo-replication status changes to Faulty, as I expected.
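(For completeness: as far as I know, a forced volume start should bring the
killed brick processes back after these tests:)

# restarts any brick processes of the volume that are not running
gluster volume start jbpre01slvol force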
Any additional info will be appreciated.
Thank you
Eva