[Gluster-users] In a replica 2 server, file-updates on one server missing on the other server #Personal#

Fri Jan 23 09:17:45 UTC 2015

Oh, I didn't. I only read a fragment of the IRC log and assumed 
--xlator-option would be enough. Apparently it's a lot more work.

I do have a query, though. Are these connections, from one of our setups, 
on secure ports? Or maybe I didn't get it the first time.

root at serv0:/root> ps -ef | grep replicated_vol
root      8851 25307  0 10:03 pts/2    00:00:00 grep replicated_vol
root     29751     1  4 Jan21 ?        01:47:20 /usr/sbin/glusterfsd -s 
serv0 --volfile-id replicated_vol.serv0.mnt-bricks-replicated_vol-brick -p 
/var/lib/glusterd/vols/_replicated_vol/run/serv0-mnt-bricks-replicated_vol-brick.pid 
-S /var/run/dff9fa3c93e82f20103f2a3d91adc4a8.socket --brick-name 
/mnt/bricks/replicated_vol/brick -l 
/var/log/glusterfs/bricks/mnt-bricks-replicated_vol-brick.log 
--xlator-option *-posix.glusterd-uuid=1a1d1ebc-4b92-428f-b66b-9c5efa49574d 
--brick-port 49185 --xlator-option replicated_vol-server.listen-port=49185
root     30399     1  0 Jan21 ?        00:19:06 /usr/sbin/glusterfs 
--volfile-id=replicated_vol --volfile-server=serv0 /mnt/replicated_vol

root at serv0:/root> netstat -p | grep 30399
tcp        0      0 serv0:969           serv0:49185         ESTABLISHED 
30399/glusterfs
tcp        0      0 serv0:999           serv1:49159         ESTABLISHED 
30399/glusterfs
tcp        0      0 serv0:1023          serv0:24007         ESTABLISHED 
30399/glusterfs
root at serv0:/root>
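
Going by the definition used elsewhere in this thread (a secure port is one below 1024), the local ports in the netstat output above can be checked mechanically. What follows is only a minimal Python sketch, assuming netstat's usual "host:port" column layout with the wrapped lines joined back together; the helper name local_ports is made up for illustration.

```python
# Sample lines copied from the netstat output above (wrapped lines joined).
NETSTAT = """\
tcp        0      0 serv0:969           serv0:49185         ESTABLISHED 30399/glusterfs
tcp        0      0 serv0:999           serv1:49159         ESTABLISHED 30399/glusterfs
tcp        0      0 serv0:1023          serv0:24007         ESTABLISHED 30399/glusterfs
"""

PRIVILEGED_CEILING = 1024  # "secure" ports, as used in this thread, are below this


def local_ports(netstat_text):
    """Return (local_port, remote_endpoint) pairs for each tcp line."""
    pairs = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip anything that is not a tcp connection line.
        if len(fields) < 5 or not fields[0].startswith("tcp"):
            continue
        local, remote = fields[3], fields[4]
        port = int(local.rsplit(":", 1)[1])
        pairs.append((port, remote))
    return pairs


for port, remote in local_ports(NETSTAT):
    kind = "secure (<1024)" if port < PRIVILEGED_CEILING else "insecure"
    print(f"{port:>5} -> {remote}: {kind}")
```

On this input all three local ports (969, 999 and 1023) fall below 1024.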

Thanks again,
Anirban

From:   Pranith Kumar Karampuri <pkarampu at redhat.com>
To:     A Ghoshal <a.ghoshal at tcs.com>
Cc:     gluster-users at gluster.org, Niels de Vos <ndevos at redhat.com>
Date:   01/23/2015 01:58 PM
Subject:        Re: [Gluster-users] In a replica 2 server, file-updates on 
one server missing on the other server #Personal#

On 01/23/2015 01:54 PM, A Ghoshal wrote:
Thanks a lot, Pranith. 

We'll set this option on our test servers and keep the setup under 
observation. 
How did you get the bind-insecure option working?
I guess I will send a patch to make it a 'volume set' option.

Pranith

Thanks, 
Anirban 

From:        Pranith Kumar Karampuri <pkarampu at redhat.com> 
To:        A Ghoshal <a.ghoshal at tcs.com> 
Cc:        gluster-users at gluster.org, Niels de Vos <ndevos at redhat.com> 
Date:        01/23/2015 01:28 PM 
Subject:        Re: [Gluster-users] In a replica 2 server, file-updates on 
one server missing on the other server #Personal# 

On 01/22/2015 02:07 PM, A Ghoshal wrote: 
Hi Pranith, 

Yes, the very same (chalcogen_eg_oxygen at yahoo.com). Justin Clift sent me a 
mail a while back telling me that it is better if we all use our business 
email addresses so I made me a new profile. 

GlusterFS complains about /proc/sys/net/ipv4/ip_local_reserved_ports 
because we use a really old Linux kernel (2.6.34) in which this feature is 
not present. We plan to upgrade our Linux every so often, but each time we 
are dissuaded by some compatibility issue or other. So we get this log 
every time, on both good volumes and bad ones. What bothered me 
was this (on serv1): 
Basically, to connect to the servers (i.e. the bricks), clients need to 
bind to secure ports, i.e. ports below 1024. Since this file is not 
present, the client is not binding to any port, as per the code I just 
checked. There is an option called client-bind-insecure which bypasses 
this check. I feel that is one way (probably the only way) to get around 
this. You have to set both the "volume set server.allow-insecure on" 
option and the bind-insecure option.
CC ndevos, who seems to have helped someone set the bind-insecure option 
correctly here (http://irclog.perlgeek.de/gluster/2014-04-09/text)

Pranith 

[2015-01-20 09:37:49.151744] T 
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 
456, payload: 360, rpc hdr: 96 
[2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit] 
0-rpc-clnt: submitted request (XID: 0x39620x Program: GlusterFS 3.3, 
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0) 
[2015-01-20 09:37:49.151810] T [rpc-clnt.c:1302:rpc_clnt_record] 
0-replicated_vol-client-1: Auth Info: pid: 7599, uid: 0, gid: 0, owner: 
0000000000000000 
[2015-01-20 09:37:49.151824] T 
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 
456, payload: 360, rpc hdr: 96 
[2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit] 
0-rpc-clnt: submitted request (XID: 0x39563x Program: GlusterFS 3.3, 
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-1) 
[2015-01-20 09:37:49.152239] T [rpc-clnt.c:669:rpc_clnt_reply_init] 
0-replicated_vol-client-1: received rpc message (RPC XID: 0x39563x 
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport 
(replicated_vol-client-1) 
[2015-01-20 09:37:49.152484] T [rpc-clnt.c:669:rpc_clnt_reply_init] 
0-replicated_vol-client-0: received rpc message (RPC XID: 0x39620x 
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport 
(replicated_vol-client-0) 

When I write on the good server (serv1), we see that an RPC request is 
sent to both client-0 and client-1. When I write on the bad server 
(serv0), however, the RPC request is sent only to client-0, so it is no 
wonder that the writes are not synced over to serv1. Somehow I could not 
make the daemon on serv0 understand that there are two up-children and not 
just one. 

One additional detail: since we are using a kernel that's too old, we do 
not have the (Anand Avati's?) FUSE readdirplus patches, either. I've 
noticed that the fixes in the readdirplus version of glusterfs aren't 
always guaranteed to be present in the non-readdirplus version of the 
patches. I filed a bug about one such anomaly a while back, but never got 
around to writing a patch for it (sorry!). Here it is: 
https://bugzilla.redhat.com/show_bug.cgi?id=1062287 
I don't think this has anything to do with readdirplus. 

Maybe something on similar lines here? 

Thanks, 
Anirban 

P.s. Please ignore the #Personal# in the subject line - we need to do that 
to push mails to the public domain past the email filter safely. 

From:        Pranith Kumar Karampuri <pkarampu at redhat.com> 
To:        A Ghoshal <a.ghoshal at tcs.com>, gluster-users at gluster.org 
Date:        01/22/2015 12:09 AM 
Subject:        Re: [Gluster-users] In a replica 2 server, file-updates on 
one server missing on the other server 

hi,
  Responses inline.

PS: You are chalkogen_oxygen?

Pranith 
On 01/20/2015 05:34 PM, A Ghoshal wrote: 
Hello, 

I am using the following replicated volume: 

root at serv0:~> gluster v info replicated_vol 

Volume Name: replicated_vol 
Type: Replicate 
Volume ID: 26d111e3-7e4c-479e-9355-91635ab7f1c2 
Status: Started 
Number of Bricks: 1 x 2 = 2 
Transport-type: tcp 
Bricks: 
Brick1: serv0:/mnt/bricks/replicated_vol/brick 
Brick2: serv1:/mnt/bricks/replicated_vol/brick 
Options Reconfigured: 
diagnostics.client-log-level: INFO 
network.ping-timeout: 10 
nfs.enable-ino32: on 
cluster.self-heal-daemon: on 
nfs.disable: off 

replicated_vol is mounted at /mnt/replicated_vol on both serv0 and serv1. 
If I do the following on serv0: 

root at serv0:~>echo "cranberries" > /mnt/replicated_vol/testfile 
root at serv0:~>echo "tangerines" >> /mnt/replicated_vol/testfile 

When I then check the state of the replicas in the bricks, I find 
that: 

root at serv0:~>cat /mnt/bricks/replicated_vol/brick/testfile 
cranberries 
tangerines 
root at serv0:~> 

root at serv1:~>cat /mnt/bricks/replicated_vol/brick/testfile 
root at serv1:~> 

As may be seen, the replica on serv1 is blank, when I write into testfile 
from serv0 (even though the file is created on both bricks). 
Interestingly, if I write something to the file at serv1, then the two 
replicas become identical. 

root at serv1:~>echo "artichokes" >> /mnt/replicated_vol/testfile 

root at serv1:~>cat /mnt/bricks/replicated_vol/brick/testfile 
cranberries 
tangerines 
artichokes 
root at serv1:~> 

root at serv0:~>cat /mnt/bricks/replicated_vol/brick/testfile 
cranberries 
tangerines 
artichokes 
root at serv0:~> 

So I dug into the logs a little bit, after raising the diagnostic 
level, and this is what I saw: 

When I write on serv0 (bad case): 

[2015-01-20 09:21:52.197704] T [fuse-bridge.c:546:fuse_lookup_resume] 
0-glusterfs-fuse: 53027: LOOKUP 
/testfl(f0a76987-8a42-47a2-b027-a823254b736b) 
[2015-01-20 09:21:52.197959] D 
[afr-common.c:131:afr_lookup_xattr_req_prepare] 
0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict 
[2015-01-20 09:21:52.198006] T [rpc-clnt.c:1302:rpc_clnt_record] 
0-replicated_vol-client-0: Auth Info: pid: 28151, uid: 0, gid: 0, owner: 
0000000000000000 
[2015-01-20 09:21:52.198024] T 
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 
456, payload: 360, rpc hdr: 96 
[2015-01-20 09:21:52.198108] T [rpc-clnt.c:1499:rpc_clnt_submit] 
0-rpc-clnt: submitted request (XID: 0x78163x Program: GlusterFS 3.3, 
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0) 
[2015-01-20 09:21:52.198565] T [rpc-clnt.c:669:rpc_clnt_reply_init] 
0-replicated_vol-client-0: received rpc message (RPC XID: 0x78163x 
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport 
(replicated_vol-client-0) 
[2015-01-20 09:21:52.198640] D 
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 
0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ] 
[2015-01-20 09:21:52.198669] D 
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 
0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ] 
[2015-01-20 09:21:52.198681] D 
[afr-self-heal-common.c:887:afr_mark_sources] 
0-replicated_vol-replicate-0: Number of sources: 1 
[2015-01-20 09:21:52.198694] D 
[afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type] 
0-replicated_vol-replicate-0: returning read_child: 0 
[2015-01-20 09:21:52.198705] D 
[afr-common.c:1380:afr_lookup_select_read_child] 
0-replicated_vol-replicate-0: Source selected as 0 for /testfl 
[2015-01-20 09:21:52.198720] D 
[afr-common.c:1117:afr_lookup_build_response_params] 
0-replicated_vol-replicate-0: Building lookup response from 0 
[2015-01-20 09:21:52.198732] D 
[afr-common.c:1732:afr_lookup_perform_self_heal] 
0-replicated_vol-replicate-0: Only 1 child up - do not attempt to detect 
self heal 

When I write on serv1 (good case): 

[2015-01-20 09:37:49.151506] T [fuse-bridge.c:546:fuse_lookup_resume] 
0-glusterfs-fuse: 31212: LOOKUP 
/testfl(f0a76987-8a42-47a2-b027-a823254b736b) 
[2015-01-20 09:37:49.151683] D 
[afr-common.c:131:afr_lookup_xattr_req_prepare] 
0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict 
[2015-01-20 09:37:49.151726] T [rpc-clnt.c:1302:rpc_clnt_record] 
0-replicated_vol-client-0: Auth Info: pid: 7599, uid: 0, gid: 0, owner: 
0000000000000000 
[2015-01-20 09:37:49.151744] T 
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 
456, payload: 360, rpc hdr: 96 
[2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit] 
0-rpc-clnt: submitted request (XID: 0x39620x Program: GlusterFS 3.3, 
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0) 
[2015-01-20 09:37:49.151810] T [rpc-clnt.c:1302:rpc_clnt_record] 
0-replicated_vol-client-1: Auth Info: pid: 7599, uid: 0, gid: 0, owner: 
0000000000000000 
[2015-01-20 09:37:49.151824] T 
[rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 
456, payload: 360, rpc hdr: 96 
[2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit] 
0-rpc-clnt: submitted request (XID: 0x39563x Program: GlusterFS 3.3, 
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-1) 
[2015-01-20 09:37:49.152239] T [rpc-clnt.c:669:rpc_clnt_reply_init] 
0-replicated_vol-client-1: received rpc message (RPC XID: 0x39563x 
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport 
(replicated_vol-client-1) 
[2015-01-20 09:37:49.152484] T [rpc-clnt.c:669:rpc_clnt_reply_init] 
0-replicated_vol-client-0: received rpc message (RPC XID: 0x39620x 
Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport 
(replicated_vol-client-0) 
[2015-01-20 09:37:49.152582] D 
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 
0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ] 
[2015-01-20 09:37:49.152596] D 
[afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 
0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ] 
[2015-01-20 09:37:49.152621] D 
[afr-self-heal-common.c:887:afr_mark_sources] 
0-replicated_vol-replicate-0: Number of sources: 1 
[2015-01-20 09:37:49.152633] D 
[afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type] 
0-replicated_vol-replicate-0: returning read_child: 0 
[2015-01-20 09:37:49.152644] D 
[afr-common.c:1380:afr_lookup_select_read_child] 
0-replicated_vol-replicate-0: Source selected as 0 for /testfl 
[2015-01-20 09:37:49.152657] D 
[afr-common.c:1117:afr_lookup_build_response_params] 
0-replicated_vol-replicate-0: Building lookup response from 0 

We see that when I write on serv1, the RPC request is sent to both 
replicated_vol-client-0 and replicated_vol-client-1, while when I write 
on serv0, the request is sent only to replicated_vol-client-0, and the 
FUSE client is unaware of the presence of client-1 in the latter case. 
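
The comparison above (which client-N transports a request was actually submitted to) can be automated when trawling through a large trace log. This is only a rough Python sketch, assuming the "submitted request ... to rpc-transport (...)" wording shown in the excerpts in this mail; the function name transports_used is hypothetical, and the sample strings are shortened copies of the lines quoted above.

```python
import re
from collections import Counter

# Matches the rpc_clnt_submit trace lines quoted in this thread and captures
# the transport name inside the trailing parentheses.
SUBMIT_RE = re.compile(r"submitted request \(.*?\) to rpc-transport \(([^)]+)\)")


def transports_used(log_text):
    """Count submitted requests per rpc-transport in a chunk of trace log."""
    # Collapse whitespace so entries wrapped by the mail client still match.
    flat = " ".join(log_text.split())
    return Counter(SUBMIT_RE.findall(flat))


# Shortened excerpts from the bad case (serv0) and the good case (serv1).
BAD = """
[2015-01-20 09:21:52.198108] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x78163x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
"""
GOOD = """
[2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x39620x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x39563x Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-1)
"""

for label, text in (("serv0 (bad)", BAD), ("serv1 (good)", GOOD)):
    print(label, sorted(transports_used(text)))
```

A mount whose count never includes the second client is one that believes only one child is up.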

I checked a bit more in the logs. When I turned on trace logging, I found 
many instances of these logs on serv0 but NOT on serv1: 

[2015-01-20 09:21:15.520784] T [fuse-bridge.c:681:fuse_attr_cbk] 
0-glusterfs-fuse: 53011: LOOKUP() / => 1 
[2015-01-20 09:21:17.683088] T [rpc-clnt.c:422:rpc_clnt_reconnect] 
0-replicated_vol-client-1: attempting reconnect 
[2015-01-20 09:21:17.683159] D [name.c:155:client_fill_address_family] 
0-replicated_vol-client-1: address-family not specified, guessing it to be 
inet from (remote-host: serv1) 
[2015-01-20 09:21:17.683178] T 
[name.c:225:af_inet_client_get_remote_sockaddr] 0-replicated_vol-client-1: 
option remote-port missing in volume replicated_vol-client-1. Defaulting 
to 24007 
[2015-01-20 09:21:17.683191] T [common-utils.c:188:gf_resolve_ip6] 
0-resolver: flushing DNS cache 
[2015-01-20 09:21:17.683202] T [common-utils.c:195:gf_resolve_ip6] 
0-resolver: DNS cache not present, freshly probing hostname: serv1 
[2015-01-20 09:21:17.683814] D [common-utils.c:237:gf_resolve_ip6] 
0-resolver: returning ip-192.168.24.81 (port-24007) for hostname: serv1 
and port: 24007 
[2015-01-20 09:21:17.684139] D [common-utils.c:257:gf_resolve_ip6] 
0-resolver: next DNS query will return: ip-192.168.24.81 port-24007 
[2015-01-20 09:21:17.684164] T [socket.c:731:__socket_nodelay] 
0-replicated_vol-client-1: NODELAY enabled for socket 10 
[2015-01-20 09:21:17.684177] T [socket.c:790:__socket_keepalive] 
0-replicated_vol-client-1: Keep-alive enabled for socket 10, interval 2, 
idle: 20 
[2015-01-20 09:21:17.684236] W [common-utils.c:2247:gf_get_reserved_ports] 
0-glusterfs: could not open the file 
/proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports info 
(No such file or directory) 
[2015-01-20 09:21:17.684253] W 
[common-utils.c:2280:gf_process_reserved_ports] 0-glusterfs: Not able to 
get reserved ports, hence there is a possibility that glusterfs may 
consume reserved port 
The logs above suggest that the mount process couldn't assign a reserved 
port because it couldn't find the file 
/proc/sys/net/ipv4/ip_local_reserved_ports

I guess the reboot of the machine fixed it. I wonder why it was not found 
in the first place.

Pranith. 
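
Whether a given machine has /proc/sys/net/ipv4/ip_local_reserved_ports at all is easy to check from a script. The following is a minimal Python sketch and not what glusterfs itself does; it assumes the kernel's documented format for that file (a comma-separated list of ports and port ranges, possibly empty), and the helper names are made up for illustration.

```python
import os

RESERVED_PORTS_FILE = "/proc/sys/net/ipv4/ip_local_reserved_ports"


def parse_reserved_ports(text):
    """Parse a comma-separated list of ports and ranges, e.g. '8080,9148-9150'."""
    ports = set()
    for item in text.strip().split(","):
        item = item.strip()
        if not item:
            continue  # an empty file means nothing is reserved
        if "-" in item:
            low, high = item.split("-", 1)
            ports.update(range(int(low), int(high) + 1))
        else:
            ports.add(int(item))
    return ports


def read_reserved_ports(path=RESERVED_PORTS_FILE):
    """Return the set of reserved ports, or None if the file does not exist."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return parse_reserved_ports(handle.read())


reserved = read_reserved_ports()
if reserved is None:
    print(f"{RESERVED_PORTS_FILE} is missing on this kernel")
else:
    print(f"{len(reserved)} reserved port(s)")
```

On a kernel without the feature, such as the 2.6.34 one discussed in this thread, read_reserved_ports() would return None.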
[2015-01-20 09:21:17.684660] D [socket.c:605:__socket_shutdown] 
0-replicated_vol-client-1: shutdown() returned -1. Transport endpoint is 
not connected 
[2015-01-20 09:21:17.684699] T 
[rpc-clnt.c:519:rpc_clnt_connection_cleanup] 0-replicated_vol-client-1: 
cleaning up state in transport object 0x68a630 
[2015-01-20 09:21:17.684731] D [socket.c:486:__socket_rwv] 
0-replicated_vol-client-1: EOF on socket 
[2015-01-20 09:21:17.684750] W [socket.c:514:__socket_rwv] 
0-replicated_vol-client-1: readv failed (No data available) 
[2015-01-20 09:21:17.684766] D 
[socket.c:1962:__socket_proto_state_machine] 0-replicated_vol-client-1: 
reading from socket failed. Error (No data available), peer 
(192.168.24.81:49198) 

I could not find a 'remote-port' option in /var/lib/glusterd on either 
peer. Could somebody tell me where this configuration is looked up from? 
Also, sometime later, I rebooted serv0 and that seemed to solve the 
problem. However, stop+start of replicated_vol and restart of 
/etc/init.d/glusterd did NOT solve the problem. 
Ignore that log. If no port is given in the volfile, it picks 24007, 
which is the default port on which glusterd listens.

Any help on this matter will be greatly appreciated as I need to provide 
robustness assurances for our setup. 

Thanks a lot, 
Anirban 

P.s. Additional details: 
glusterfs version: 3.4.2 
Linux kernel version: 2.6.34 
