[Gluster-users] 3.1.2 Debian - client_rpc_notify "failed to get the port number for remote subvolume"
phil cryer
phil at cryer.us
Fri Feb 4 20:58:05 UTC 2011
>>> However, if I do a gluster volume info I see that it's listed:
>>> # gluster volume info | grep 98
>>> Brick98: clustr-02:/mnt/data17
But now I'm thinking this is wrong, because while it says clustr-02,
the error stops occurring when I stop clustr-03. So how do I really
know, not only which host it's on, but which brick (/mnt/data* in my
case) each of these client subvolumes maps to?
In other words, is it possible that
bhl-volume-client-98 != Brick98: clustr-02:/mnt/data17 ?
And if they don't match, how can I tell which brick
bhl-volume-client-98 really is?
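Is the right way to check that simply to read the client volfile that
glusterd generates? Something like this, guessing at the 3.1.x path --
adjust if your volfiles live somewhere other than
/etc/glusterd/vols/bhl-volume/:

# grep -A4 'volume bhl-volume-client-98' /etc/glusterd/vols/bhl-volume/bhl-volume-fuse.vol

The remote-host and remote-subvolume options in that stanza should name
the actual server and export directory. (Or is it just that the
client-N subvolumes are numbered from 0, so bhl-volume-client-98 would
line up with Brick99 rather than Brick98 -- which might explain why
stopping clustr-03 makes the error go away?)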
P
On Fri, Feb 4, 2011 at 1:49 PM, phil cryer <phil at cryer.us> wrote:
> On Fri, Feb 4, 2011 at 12:33 PM, Anand Avati <anand.avati at gmail.com> wrote:
>> It is very likely the brick process is failing to start. Please look at the
>> brick log on that server. (in /var/log/glusterfs/bricks/* )
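>> For example, something along these lines (adjust the glob if your
>> brick logs sit elsewhere) will show any error-level lines -- the 'E'
>> after the timestamp -- from the brick logs on that server:
>>
>> # grep ' E \[' /var/log/glusterfs/bricks/*.log | tail -n 20
>>
>> If the brick for /mnt/data17 is failing to start, it should show up
>> there.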
>> Avati
>
> Thanks, so if I'm looking at it right, the 'bhl-volume-client-98' is
> really Brick98: clustr-02:/mnt/data17 - I'm figuring that from this:
>
>>> [2011-02-04 13:09:28.407300] I [client.c:1590:client_rpc_notify]
>>> bhl-volume-client-98: disconnected
>>>
>>> However, if I do a gluster volume info I see that it's listed:
>>> # gluster volume info | grep 98
>>> Brick98: clustr-02:/mnt/data17
>
> But on that server I don't see any issues with that brick starting:
>
> # head mnt-data17.log -n50
> [2011-02-03 23:29:24.235648] W [graph.c:274:gf_add_cmdline_options]
> bhl-volume-server: adding option 'listen-port' for volume
> 'bhl-volume-server' with value '24025'
> [2011-02-03 23:29:24.236017] W
> [rpc-transport.c:566:validate_volume_options] tcp.bhl-volume-server:
> option 'listen-port' is deprecated, preferred is
> 'transport.socket.listen-port', continuing with correction
> Given volfile:
> +------------------------------------------------------------------------------+
> 1: volume bhl-volume-posix
> 2: type storage/posix
> 3: option directory /mnt/data17
> 4: end-volume
> 5:
> 6: volume bhl-volume-access-control
> 7: type features/access-control
> 8: subvolumes bhl-volume-posix
> 9: end-volume
> 10:
> 11: volume bhl-volume-locks
> 12: type features/locks
> 13: subvolumes bhl-volume-access-control
> 14: end-volume
> 15:
> 16: volume bhl-volume-io-threads
> 17: type performance/io-threads
> 18: subvolumes bhl-volume-locks
> 19: end-volume
> 20:
> 21: volume /mnt/data17
> 22: type debug/io-stats
> 23: subvolumes bhl-volume-io-threads
> 24: end-volume
> 25:
> 26: volume bhl-volume-server
> 27: type protocol/server
> 28: option transport-type tcp
> 29: option auth.addr./mnt/data17.allow *
> 30: subvolumes /mnt/data17
> 31: end-volume
>
> +------------------------------------------------------------------------------+
> [2011-02-03 23:29:28.575630] I
> [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
> client from 128.128.164.219:724
> [2011-02-03 23:29:28.583169] I
> [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
> client from 127.0.1.1:985
> [2011-02-03 23:29:28.603357] I
> [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
> client from 128.128.164.218:726
> [2011-02-03 23:29:28.605650] I
> [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
> client from 128.128.164.217:725
> [2011-02-03 23:29:28.608033] I
> [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
> client from 128.128.164.215:725
> [2011-02-03 23:29:31.161985] I
> [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
> client from 128.128.164.74:697
> [2011-02-04 00:40:11.600314] I
> [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
> client from 128.128.164.74:805
>
> Plus, looking at the tail of this log, it's still working; the latest
> messages (from 4 seconds ago) are from me moving some things around on
> the cluster:
>
> [2011-02-04 23:13:35.53685] W [server-resolve.c:565:server_resolve]
> bhl-volume-server: pure path resolution for
> /www/d/dasobstdertropen00schrrich (INODELK)
> [2011-02-04 23:13:35.57107] W [server-resolve.c:565:server_resolve]
> bhl-volume-server: pure path resolution for
> /www/d/dasobstdertropen00schrrich (SETXATTR)
> [2011-02-04 23:13:35.59699] W [server-resolve.c:565:server_resolve]
> bhl-volume-server: pure path resolution for
> /www/d/dasobstdertropen00schrrich (INODELK)
>
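> I can also check whether the glusterfsd process for /mnt/data17 is
> actually up and listening on the port the log mentions (24025) --
> roughly like this, assuming netstat -p is available here:
>
> # ps ax | grep '[g]lusterfsd' | grep data17
> # netstat -ltnp | grep 24025
>
> I'll follow up if either of those turns up anything odd.
>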
> Thanks!
>
> P
>
>
>
>>
>> On Fri, Feb 4, 2011 at 10:19 AM, phil cryer <phil at cryer.us> wrote:
>>>
>>> I have glusterfs 3.1.2 running on Debian; I'm able to start the volume
>>> and mount it via mount -t glusterfs, and I can see everything. But I am
>>> still seeing the following error in /var/log/glusterfs/nfs.log:
>>>
>>> [2011-02-04 13:09:16.404851] E
>>> [client-handshake.c:1079:client_query_portmap_cbk]
>>> bhl-volume-client-98: failed to get the port number for remote
>>> subvolume
>>> [2011-02-04 13:09:16.404909] I [client.c:1590:client_rpc_notify]
>>> bhl-volume-client-98: disconnected
>>> [2011-02-04 13:09:20.405843] E
>>> [client-handshake.c:1079:client_query_portmap_cbk]
>>> bhl-volume-client-98: failed to get the port number for remote
>>> subvolume
>>> [2011-02-04 13:09:20.405938] I [client.c:1590:client_rpc_notify]
>>> bhl-volume-client-98: disconnected
>>> [2011-02-04 13:09:24.406634] E
>>> [client-handshake.c:1079:client_query_portmap_cbk]
>>> bhl-volume-client-98: failed to get the port number for remote
>>> subvolume
>>> [2011-02-04 13:09:24.406711] I [client.c:1590:client_rpc_notify]
>>> bhl-volume-client-98: disconnected
>>> [2011-02-04 13:09:28.407249] E
>>> [client-handshake.c:1079:client_query_portmap_cbk]
>>> bhl-volume-client-98: failed to get the port number for remote
>>> subvolume
>>> [2011-02-04 13:09:28.407300] I [client.c:1590:client_rpc_notify]
>>> bhl-volume-client-98: disconnected
>>>
>>> However, if I do a gluster volume info I see that it's listed:
>>> # gluster volume info | grep 98
>>> Brick98: clustr-02:/mnt/data17
>>>
>>> I've gone to that host, unmounted the specific drive, and run fsck.ext4
>>> on it, and it came back clean. Remounting it and then restarting gluster
>>> on all the nodes hasn't changed anything; I keep getting that error.
>>> Also, I don't understand why it can't get the port number when the 23
>>> other bricks (drives) on that server are working fine, which leads me
>>> to believe that it's not an accurate error.
>>>
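>>> (As I understand it, the client asks glusterd on the server for the
>>> brick's port, so one thing I can still try is confirming that glusterd
>>> on clustr-02 is reachable on its usual port -- 24007, if I have that
>>> right:
>>>
>>> # telnet clustr-02 24007
>>>
>>> though given that the other bricks on the same server work, I'd expect
>>> that to succeed.)
>>>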
>>> I searched the mailing lists and bug-tracker, and only found this similar
>>> bug:
>>> http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1640
>>>
>>> Any idea what's going on? Is this just a benign error, since the
>>> cluster still seems to be working, or is it something I need to fix?
>>>
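>>> (To judge whether it's benign, I suppose I can also grep the nfs log
>>> for any successful connects from that subvolume in between the
>>> retries, something like:
>>>
>>> # grep 'client-98' /var/log/glusterfs/nfs.log | grep -viE 'disconnect|failed'
>>>
>>> If that never shows a connect for bhl-volume-client-98, then that one
>>> brick is effectively missing from the NFS export even though the rest
>>> works.)
>>>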
>>> Thanks
>>>
>>> P
>>> --
>>> http://philcryer.com
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>
>>
>
>
>
> --
> http://philcryer.com
>
--
http://philcryer.com