[Gluster-users] In a replica 2 server, file-updates on one server missing on the other server #Personal#

Pranith Kumar Karampuri pkarampu at redhat.com
Tue Jan 27 01:34:44 UTC 2015


On 01/23/2015 02:47 PM, A Ghoshal wrote:
> Oh, I didn't. I only read a fragment of the IRC log and assumed 
> --xlator-option would be enough. Apparently it's a lot more work...
>
> I do have a query, though. These connections, taken from one of our 
> setups: are these on secure ports? Or maybe I didn't get it the first time...
>
> root at serv0:/root> ps -ef | grep replicated_vol
> root      8851 25307  0 10:03 pts/2    00:00:00 grep replicated_vol
> root     29751 1  4 Jan21 ?        01:47:20 /usr/sbin/glusterfsd -s 
> serv0 --volfile-id 
> replicated_vol.serv0.mnt-bricks-replicated_vol-brick -p 
> /var/lib/glusterd/vols/replicated_vol/run/serv0-mnt-bricks-replicated_vol-brick.pid 
> -S /var/run/dff9fa3c93e82f20103f2a3d91adc4a8.socket --brick-name 
> /mnt/bricks/replicated_vol/brick -l 
> /var/log/glusterfs/bricks/mnt-bricks-replicated_vol-brick.log 
> --xlator-option 
> *-posix.glusterd-uuid=1a1d1ebc-4b92-428f-b66b-9c5efa49574d 
> --brick-port 49185 --xlator-option 
> replicated_vol-server.listen-port=49185
> root     30399 1  0 Jan21 ?        00:19:06 /usr/sbin/glusterfs 
> --volfile-id=replicated_vol --volfile-server=serv0 /mnt/replicated_vol
>
> root at serv0:/root> netstat -p | grep 30399
> tcp        0    0 serv0:969           serv0:49185         ESTABLISHED 
> 30399/glusterfs
> tcp        0    0 serv0:999           serv1:49159         ESTABLISHED 
> 30399/glusterfs
> tcp        0    0 serv0:1023          serv0:24007         ESTABLISHED 
> 30399/glusterfs
Seems to be. The ports are 969, 999, and 1023, all of which are < 1024.

Pranith
> root at serv0:/root>
>
> Thanks again,
> Anirban
>
>
>
> From: Pranith Kumar Karampuri <pkarampu at redhat.com>
> To: A Ghoshal <a.ghoshal at tcs.com>
> Cc: gluster-users at gluster.org, Niels de Vos <ndevos at redhat.com>
> Date: 01/23/2015 01:58 PM
> Subject: Re: [Gluster-users] In a replica 2 server, file-updates on 
> one server missing on the other server #Personal#
> ------------------------------------------------------------------------
>
>
>
>
> On 01/23/2015 01:54 PM, A Ghoshal wrote:
> Thanks a lot, Pranith.
>
> We'll set this option on our test servers and keep the setup under 
> observation.
> How did you get the bind-insecure option working?
> I guess I will send a patch to make it a 'volume set' option.
>
> Pranith
>
> Thanks,
> Anirban
>
>
>
> From: Pranith Kumar Karampuri <pkarampu at redhat.com>
> To: A Ghoshal <a.ghoshal at tcs.com>
> Cc: gluster-users at gluster.org, Niels de Vos <ndevos at redhat.com>
> Date: 01/23/2015 01:28 PM
> Subject: Re: [Gluster-users] In a replica 2 server, file-updates on 
> one server missing on the other server #Personal#
> ------------------------------------------------------------------------
>
>
>
>
> On 01/22/2015 02:07 PM, A Ghoshal wrote:
> Hi Pranith,
>
> Yes, the very same (chalcogen_eg_oxygen at yahoo.com). Justin Clift sent 
> me a mail a while back telling me it is better if we all use our 
> business email addresses, so I made myself a new profile.
>
> Glusterfs complains about /proc/sys/net/ipv4/ip_local_reserved_ports 
> because we use a really old Linux kernel (2.6.34) in which this feature 
> is not present. We plan to upgrade our Linux every so often, but each 
> time we are dissuaded from it by some compatibility issue or other. So 
> we get this log every time - on both good volumes and bad ones. What 
> bothered me was this (on serv1):
> Basically, to make connections to the servers (i.e. the bricks), clients 
> need to choose secure ports, i.e. ports less than 1024. Since this file 
> is not present, the client is not binding to any port, as per the code I 
> just checked. There is an option called client-bind-insecure which 
> bypasses this check; I feel that is one way (probably the only way) to 
> get around this. You have to set the "server.allow-insecure on" volume 
> option and the bind-insecure option.
> CC'ing ndevos, who seems to have helped someone set the bind-insecure 
> option correctly here (http://irclog.perlgeek.de/gluster/2014-04-09/text).
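>
> A rough sketch of what that usually involves - the two server-side 
> settings below are the standard ones; the client-side step is an 
> assumption on my part, since bind-insecure is not a 'volume set' option 
> yet, so please verify it against the IRC log:
>
> # allow the brick processes to accept connections from ports >= 1024
> gluster volume set replicated_vol server.allow-insecure on
>
> # allow glusterd itself to accept insecure RPC connections: add
> #     option rpc-auth-allow-insecure on
> # to /etc/glusterfs/glusterd.vol on both servers, then restart glusterd
>
> # client side (assumption): the client-bind-insecure option has to be
> # supplied to the mount process / client volfile rather than via
> # 'gluster volume set'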
>
> Pranith
>
> [2015-01-20 09:37:49.151744] T 
> [rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request 
> fraglen 456, payload: 360, rpc hdr: 96
> [2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit] 
> 0-rpc-clnt: submitted request (XID: 0x39620x Program: GlusterFS 3.3, 
> ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
> [2015-01-20 09:37:49.151810] T [rpc-clnt.c:1302:rpc_clnt_record] 
> 0-replicated_vol-client-1: Auth Info: pid: 7599, uid: 0, gid: 0, 
> owner: 0000000000000000
> [2015-01-20 09:37:49.151824] T 
> [rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request 
> fraglen 456, payload: 360, rpc hdr: 96
> [2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit] 
> 0-rpc-clnt: submitted request (XID: 0x39563x Program: GlusterFS 3.3, 
> ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-1)
> [2015-01-20 09:37:49.152239] T [rpc-clnt.c:669:rpc_clnt_reply_init] 
> 0-replicated_vol-client-1: received rpc message (RPC XID: 0x39563x 
> Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport 
> (replicated_vol-client-1)
> [2015-01-20 09:37:49.152484] T [rpc-clnt.c:669:rpc_clnt_reply_init] 
> 0-replicated_vol-client-0: received rpc message (RPC XID: 0x39620x 
> Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport 
> (replicated_vol-client-0)
>
> When I write on the good server (serv1), we see that an RPC request is 
> sent to both client-0 and client-1, while when I write on the bad server 
> (serv0) the RPC request is sent only to client-0 - so it is no wonder 
> that the writes are not synced over to serv1. Somehow I could not make 
> the client daemon on serv0 understand that there are two up children and 
> not just one.
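>
> (One quick way to cross-check this, assuming 'gluster volume status' 
> with the clients keyword behaves the same way on 3.4.2, is to ask each 
> brick which clients it currently sees connected:
>
> root at serv0:/root> gluster volume status replicated_vol clients
>
> If the mount process on serv0 shows up under only one of the two bricks, 
> that would match what the trace logs indicate.)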
>
> One additional detail - since we are using a kernel that's too old, we 
> do not have (Anand Avati's?) FUSE readdirplus patches either. I've 
> noticed that fixes present in the readdirplus version of glusterfs 
> aren't always guaranteed to be present in the non-readdirplus version of 
> the patches. I'd filed a bug around one such anomaly a while back, but 
> never got around to writing a patch for it (sorry!). Here it is: 
> https://bugzilla.redhat.com/show_bug.cgi?id=1062287
> I don't think this has anything to do with readdirplus.
>
> Maybe something on similar lines here?
>
> Thanks,
> Anirban
>
> P.s. Please ignore the #Personal# in the subject line - we need to do 
> that to push mails to the public domain past the email filter safely.
>
>
>
> From: Pranith Kumar Karampuri <pkarampu at redhat.com>
> To: A Ghoshal <a.ghoshal at tcs.com>, gluster-users at gluster.org
> Date: 01/22/2015 12:09 AM
> Subject: Re: [Gluster-users] In a replica 2 server, file-updates on 
> one server missing on the other server
> ------------------------------------------------------------------------
>
>
>
> hi,
>  Responses inline.
>
> PS: You are chalkogen_oxygen?
>
> Pranith
> On 01/20/2015 05:34 PM, A Ghoshal wrote:
> Hello,
>
> I am using the following replicated volume:
>
> root at serv0:~> gluster v info replicated_vol
>
> Volume Name: replicated_vol
> Type: Replicate
> Volume ID: 26d111e3-7e4c-479e-9355-91635ab7f1c2
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: serv0:/mnt/bricks/replicated_vol/brick
> Brick2: serv1:/mnt/bricks/replicated_vol/brick
> Options Reconfigured:
> diagnostics.client-log-level: INFO
> network.ping-timeout: 10
> nfs.enable-ino32: on
> cluster.self-heal-daemon: on
> nfs.disable: off
>
> replicated_vol is mounted at /mnt/replicated_vol on both serv0 and 
> serv1. If I do the following on serv0:
>
> root at serv0:~>echo "cranberries" > /mnt/replicated_vol/testfile
> root at serv0:~>echo "tangerines" >> /mnt/replicated_vol/testfile
>
> And when I then check the state of the replicas on the bricks, I find 
> that
>
> root at serv0:~>cat /mnt/bricks/replicated_vol/brick/testfile
> cranberries
> tangerines
> root at serv0:~>
>
> root at serv1:~>cat /mnt/bricks/replicated_vol/brick/testfile
> root at serv1:~>
>
> As may be seen, the replica on serv1 is blank when I write into testfile 
> from serv0 (even though the file is created on both bricks). 
> Interestingly, if I write something to the file from serv1, then the two 
> replicas become identical.
>
> root at serv1:~>echo "artichokes" >> /mnt/replicated_vol/testfile
>
> root at serv1:~>cat /mnt/bricks/replicated_vol/brick/testfile
> cranberries
> tangerines
> artichokes
> root at serv1:~>
>
> root at serv0:~>cat /mnt/bricks/replicated_vol/brick/testfile
> cranberries
> tangerines
> artichokes
> root at serv0:~>
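>
> (The same asymmetry can be seen from the brick side, assuming the usual 
> AFR changelog xattrs on 3.4.x, by dumping the extended attributes of the 
> file on each brick:
>
> root at serv0:~>getfattr -d -m . -e hex /mnt/bricks/replicated_vol/brick/testfile
>
> A non-zero trusted.afr.replicated_vol-client-1 value on serv0's copy 
> would mean serv0 is accruing pending operations against the other brick, 
> i.e. its client believes the second brick is down.)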
>
> So I dug into the logs a little bit after upping the diagnostic level, 
> and this is what I saw.
>
> When I write on serv0 (bad case):
>
> [2015-01-20 09:21:52.197704] T [fuse-bridge.c:546:fuse_lookup_resume] 
> 0-glusterfs-fuse: 53027: LOOKUP 
> /testfl(f0a76987-8a42-47a2-b027-a823254b736b)
> [2015-01-20 09:21:52.197959] D 
> [afr-common.c:131:afr_lookup_xattr_req_prepare] 
> 0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict
> [2015-01-20 09:21:52.198006] T [rpc-clnt.c:1302:rpc_clnt_record] 
> 0-replicated_vol-client-0: Auth Info: pid: 28151, uid: 0, gid: 0, 
> owner: 0000000000000000
> [2015-01-20 09:21:52.198024] T 
> [rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request 
> fraglen 456, payload: 360, rpc hdr: 96
> [2015-01-20 09:21:52.198108] T [rpc-clnt.c:1499:rpc_clnt_submit] 
> 0-rpc-clnt: submitted request (XID: 0x78163x Program: GlusterFS 3.3, 
> ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
> [2015-01-20 09:21:52.198565] T [rpc-clnt.c:669:rpc_clnt_reply_init] 
> 0-replicated_vol-client-0: received rpc message (RPC XID: 0x78163x 
> Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport 
> (replicated_vol-client-0)
> [2015-01-20 09:21:52.198640] D 
> [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 
> 0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ]
> [2015-01-20 09:21:52.198669] D 
> [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 
> 0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ]
> [2015-01-20 09:21:52.198681] D 
> [afr-self-heal-common.c:887:afr_mark_sources] 
> 0-replicated_vol-replicate-0: Number of sources: 1
> [2015-01-20 09:21:52.198694] D 
> [afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type] 
> 0-replicated_vol-replicate-0: returning read_child: 0
> [2015-01-20 09:21:52.198705] D 
> [afr-common.c:1380:afr_lookup_select_read_child] 
> 0-replicated_vol-replicate-0: Source selected as 0 for /testfl
> [2015-01-20 09:21:52.198720] D 
> [afr-common.c:1117:afr_lookup_build_response_params] 
> 0-replicated_vol-replicate-0: Building lookup response from 0
> [2015-01-20 09:21:52.198732] D 
> [afr-common.c:1732:afr_lookup_perform_self_heal] 
> 0-replicated_vol-replicate-0: Only 1 child up - do not attempt to 
> detect self heal
>
> When I write on serv1 (good case):
>
> [2015-01-20 09:37:49.151506] T [fuse-bridge.c:546:fuse_lookup_resume] 
> 0-glusterfs-fuse: 31212: LOOKUP 
> /testfl(f0a76987-8a42-47a2-b027-a823254b736b)
> [2015-01-20 09:37:49.151683] D 
> [afr-common.c:131:afr_lookup_xattr_req_prepare] 
> 0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict
> [2015-01-20 09:37:49.151726] T [rpc-clnt.c:1302:rpc_clnt_record] 
> 0-replicated_vol-client-0: Auth Info: pid: 7599, uid: 0, gid: 0, 
> owner: 0000000000000000
> [2015-01-20 09:37:49.151744] T 
> [rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request 
> fraglen 456, payload: 360, rpc hdr: 96
> [2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit] 
> 0-rpc-clnt: submitted request (XID: 0x39620x Program: GlusterFS 3.3, 
> ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
> [2015-01-20 09:37:49.151810] T [rpc-clnt.c:1302:rpc_clnt_record] 
> 0-replicated_vol-client-1: Auth Info: pid: 7599, uid: 0, gid: 0, 
> owner: 0000000000000000
> [2015-01-20 09:37:49.151824] T 
> [rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request 
> fraglen 456, payload: 360, rpc hdr: 96
> [2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit] 
> 0-rpc-clnt: submitted request (XID: 0x39563x Program: GlusterFS 3.3, 
> ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-1)
> [2015-01-20 09:37:49.152239] T [rpc-clnt.c:669:rpc_clnt_reply_init] 
> 0-replicated_vol-client-1: received rpc message (RPC XID: 0x39563x 
> Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport 
> (replicated_vol-client-1)
> [2015-01-20 09:37:49.152484] T [rpc-clnt.c:669:rpc_clnt_reply_init] 
> 0-replicated_vol-client-0: received rpc message (RPC XID: 0x39620x 
> Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport 
> (replicated_vol-client-0)
> [2015-01-20 09:37:49.152582] D 
> [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 
> 0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ]
> [2015-01-20 09:37:49.152596] D 
> [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 
> 0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ]
> [2015-01-20 09:37:49.152621] D 
> [afr-self-heal-common.c:887:afr_mark_sources] 
> 0-replicated_vol-replicate-0: Number of sources: 1
> [2015-01-20 09:37:49.152633] D 
> [afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type] 
> 0-replicated_vol-replicate-0: returning read_child: 0
> [2015-01-20 09:37:49.152644] D 
> [afr-common.c:1380:afr_lookup_select_read_child] 
> 0-replicated_vol-replicate-0: Source selected as 0 for /testfl
> [2015-01-20 09:37:49.152657] D 
> [afr-common.c:1117:afr_lookup_build_response_params] 
> 0-replicated_vol-replicate-0: Building lookup response from 0
>
> We see that when we write on serv1, the RPC request is sent to both 
> replicated_vol-client-0 and replicated_vol-client-1, while when we write 
> on serv0 the request is sent only to replicated_vol-client-0; the FUSE 
> client seems unaware of the presence of client-1 in the latter case.
>
> I checked the logs a bit more. With trace turned on, I found many 
> instances of these logs on serv0 but NOT on serv1:
>
> [2015-01-20 09:21:15.520784] T [fuse-bridge.c:681:fuse_attr_cbk] 
> 0-glusterfs-fuse: 53011: LOOKUP() / => 1
> [2015-01-20 09:21:17.683088] T [rpc-clnt.c:422:rpc_clnt_reconnect] 
> 0-replicated_vol-client-1: attempting reconnect
> [2015-01-20 09:21:17.683159] D [name.c:155:client_fill_address_family] 
> 0-replicated_vol-client-1: address-family not specified, guessing it 
> to be inet from (remote-host: serv1)
> [2015-01-20 09:21:17.683178] T 
> [name.c:225:af_inet_client_get_remote_sockaddr] 
> 0-replicated_vol-client-1: option remote-port missing in volume 
> replicated_vol-client-1. Defaulting to 24007
> [2015-01-20 09:21:17.683191] T [common-utils.c:188:gf_resolve_ip6] 
> 0-resolver: flushing DNS cache
> [2015-01-20 09:21:17.683202] T [common-utils.c:195:gf_resolve_ip6] 
> 0-resolver: DNS cache not present, freshly probing hostname: serv1
> [2015-01-20 09:21:17.683814] D [common-utils.c:237:gf_resolve_ip6] 
> 0-resolver: returning ip-192.168.24.81 (port-24007) for hostname: 
> serv1 and port: 24007
> [2015-01-20 09:21:17.684139] D [common-utils.c:257:gf_resolve_ip6] 
> 0-resolver: next DNS query will return: ip-192.168.24.81 port-24007
> [2015-01-20 09:21:17.684164] T [socket.c:731:__socket_nodelay] 
> 0-replicated_vol-client-1: NODELAY enabled for socket 10
> [2015-01-20 09:21:17.684177] T [socket.c:790:__socket_keepalive] 
> 0-replicated_vol-client-1: Keep-alive enabled for socket 10, interval 
> 2, idle: 20
> [2015-01-20 09:21:17.684236] W 
> [common-utils.c:2247:gf_get_reserved_ports] 0-glusterfs: could not 
> open the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting 
> reserved ports info (No such file or directory)
> [2015-01-20 09:21:17.684253] W 
> [common-utils.c:2280:gf_process_reserved_ports] 0-glusterfs: Not able 
> to get reserved ports, hence there is a possibility that glusterfs may 
> consume reserved port
> The logs above suggest that the mount process couldn't assign itself a 
> reserved port because it couldn't find the file 
> /proc/sys/net/ipv4/ip_local_reserved_ports.
>
> I guess the reboot of the machine fixed it. I wonder why it was not 
> found in the first place.
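>
> (A quick check for the kernel side of this on both servers - the file 
> only exists on kernels that carry the ip_local_reserved_ports sysctl, 
> which 2.6.34 predates:
>
> root at serv0:/root> cat /proc/sys/net/ipv4/ip_local_reserved_ports
>
> On 2.6.34 this should fail with 'No such file or directory'; in that 
> case glusterfs just logs the warning above and skips the reserved-port 
> filtering, so the warning itself is harmless on good and bad volumes 
> alike.)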
>
> Pranith.
> [2015-01-20 09:21:17.684660] D [socket.c:605:__socket_shutdown] 
> 0-replicated_vol-client-1: shutdown() returned -1. Transport endpoint 
> is not connected
> [2015-01-20 09:21:17.684699] T 
> [rpc-clnt.c:519:rpc_clnt_connection_cleanup] 
> 0-replicated_vol-client-1: cleaning up state in transport object 0x68a630
> [2015-01-20 09:21:17.684731] D [socket.c:486:__socket_rwv] 
> 0-replicated_vol-client-1: EOF on socket
> [2015-01-20 09:21:17.684750] W [socket.c:514:__socket_rwv] 
> 0-replicated_vol-client-1: readv failed (No data available)
> [2015-01-20 09:21:17.684766] D 
> [socket.c:1962:__socket_proto_state_machine] 
> 0-replicated_vol-client-1: reading from socket failed. Error (No data 
> available), peer (192.168.24.81:49198)
>
> I could not find a 'remote-port' option in /var/lib/glusterd on either 
> peer. Could somebody tell me where this configuration is looked up? 
> Also, some time later I rebooted serv0 and that seemed to solve the 
> problem. However, stop+start of replicated_vol and a restart of 
> /etc/init.d/glusterd did NOT solve the problem.
> Ignore that log. If no port is given in the volfile, it picks 24007 as 
> the port, which is the default port on which glusterd listens.
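>
> To illustrate - assuming the generated fuse volfile is in the usual 
> location, /var/lib/glusterd/vols/replicated_vol/replicated_vol-fuse.vol 
> - the protocol/client stanza normally carries no remote-port at all, 
> something like:
>
> volume replicated_vol-client-1
>     type protocol/client
>     option remote-host serv1
>     option remote-subvolume /mnt/bricks/replicated_vol/brick
>     option transport-type tcp
> end-volume
>
> With remote-port absent, the client first contacts glusterd on 24007 and 
> asks its portmapper for the port the brick process actually listens on, 
> which is why that 'Defaulting to 24007' message can be ignored.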
>
>
> Any help on this matter will be greatly appreciated as I need to 
> provide robustness assurances for our setup.
>
> Thanks a lot,
> Anirban
>
> P.s. Additional details:
> glusterfs version: 3.4.2
> Linux kernel version: 2.6.34
>
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
