[Gluster-users] Change transport-type on volume from tcp to rdma, tcp
Geoffrey Letessier
geoffrey.letessier at cnrs.fr
Wed Jul 22 08:06:38 UTC 2015
Oops, I forgot to CC everyone.
Yes, I guessed as much.
With the TCP protocol, all my volumes seem OK and, for the moment, I have not noticed any hang.
mount command:
- with RDMA: mount -t glusterfs -o transport=rdma,direct-io-mode=disable,enable-ino32 ib-storage1:vol_home /mnt
- with TCP: mount -t glusterfs -o transport=tcp,direct-io-mode=disable,enable-ino32 ib-storage1:vol_home /mnt
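The two commands differ only in the transport option. As a rough sketch, a small helper can build either one; the function name gluster_mount_cmd is hypothetical (it is not part of GlusterFS), and the mount options are simply the ones from the commands above:

```shell
#!/bin/sh
# Hypothetical helper (not a GlusterFS tool): print the mount command
# for a given transport, reusing the options from the commands above.
gluster_mount_cmd() {
    transport=$1 server=$2 volume=$3 mountpoint=$4
    case $transport in
        tcp|rdma) ;;   # the only two transports discussed in this thread
        *) echo "unsupported transport: $transport" >&2; return 1 ;;
    esac
    printf 'mount -t glusterfs -o transport=%s,direct-io-mode=disable,enable-ino32 %s:%s %s\n' \
        "$transport" "$server" "$volume" "$mountpoint"
}
```

For example, `gluster_mount_cmd rdma ib-storage1 vol_home /mnt` prints the RDMA variant, which can then be reviewed before running it.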
volume status:
# gluster volume status all
Status of volume: vol_home
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick ib-storage1:/export/brick_home/brick1
/data                                       49159     49165      Y       6547
Brick ib-storage2:/export/brick_home/brick1
/data                                       49161     49173      Y       24348
Brick ib-storage3:/export/brick_home/brick1
/data                                       49152     49156      Y       5616
Brick ib-storage4:/export/brick_home/brick1
/data                                       49152     49162      Y       5424
Brick ib-storage1:/export/brick_home/brick2
/data                                       49160     49166      Y       6548
Brick ib-storage2:/export/brick_home/brick2
/data                                       49162     49174      Y       24355
Brick ib-storage3:/export/brick_home/brick2
/data                                       49153     49157      Y       5635
Brick ib-storage4:/export/brick_home/brick2
/data                                       49153     49163      Y       5443
Self-heal Daemon on localhost               N/A       N/A        Y       6534
Self-heal Daemon on ib-storage3             N/A       N/A        Y       7656
Self-heal Daemon on ib-storage2             N/A       N/A        Y       24519
Self-heal Daemon on ib-storage4             N/A       N/A        Y       7288

Task Status of Volume vol_home
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vol_shared
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick ib-storage1:/export/brick_shared/data 49152     49164      Y       6554
Brick ib-storage2:/export/brick_shared/data 49152     49172      Y       24362
Self-heal Daemon on localhost               N/A       N/A        Y       6534
Self-heal Daemon on ib-storage3             N/A       N/A        Y       7656
Self-heal Daemon on ib-storage2             N/A       N/A        Y       24519
Self-heal Daemon on ib-storage4             N/A       N/A        Y       7288

Task Status of Volume vol_shared
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vol_workdir_amd
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick ib-storage1:/export/brick_workdir/bri
ck1/data                                    49191     49192      Y       6555
Brick ib-storage3:/export/brick_workdir/bri
ck1/data                                    49164     49165      Y       6368
Brick ib-storage1:/export/brick_workdir/bri
ck2/data                                    49193     49194      Y       6576
Brick ib-storage3:/export/brick_workdir/bri
ck2/data                                    49166     49167      Y       6387

Task Status of Volume vol_workdir_amd
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vol_workdir_intel
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick ib-storage2:/export/brick_workdir/bri
ck1/data                                    49175     49176      Y       24371
Brick ib-storage2:/export/brick_workdir/bri
ck2/data                                    49177     49178      Y       24372
Brick ib-storage4:/export/brick_workdir/bri
ck1/data                                    49164     49165      Y       5571
Brick ib-storage4:/export/brick_workdir/bri
ck2/data                                    49166     49167      Y       5590

Task Status of Volume vol_workdir_intel
------------------------------------------------------------------------------
There are no active volume tasks

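To check which volume owns a given port in the status output above (for instance a port that shows up in a client log), something like the following might help. It is only a sketch: brick_ports and port_owner are hypothetical helper names, and the parsing assumes the column layout shown above, where long brick paths are wrapped onto a second line.

```shell
#!/bin/sh
# Sketch: flatten `gluster volume status` output (read from stdin) into
# "volume brick tcp_port rdma_port" lines. Assumes the layout shown above.
brick_ports() {
    awk '
        /^Status of volume:/ { vol = $4; next }
        /^Brick / {
            brick = $2
            # Short brick paths carry the ports on the same line.
            if (NF >= 6) { print vol, brick, $3, $4; brick = "" }
            next
        }
        # Continuation line of a wrapped brick path: rest-of-path + 4 columns.
        brick != "" && NF == 5 { print vol, brick $1, $2, $3; brick = "" }
    '
}

# Sketch: print the volume, brick and transport that use port $1.
port_owner() {
    brick_ports | awk -v p="$1" '
        $3 == p { print $1, $2, "tcp" }
        $4 == p { print $1, $2, "rdma" }
    '
}
```

Usage would be along the lines of `gluster volume status all | port_owner 49161`, which lists every brick that currently advertises that port, together with its volume.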
Concerning the brick logs, do you want the logs of all bricks on every server?
Geoffrey
------------------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
On 22 Jul 2015, at 10:00, Mohammed Rafi K C <rkavunga at redhat.com> wrote:
>
>
> On 07/22/2015 12:55 PM, Geoffrey Letessier wrote:
>> Concerning the hang, I have seen it only once with the TCP protocol; RDMA actually seems to be the cause.
>
> If you are mounting a tcp,rdma volume using tcp protocol, all the communication will go through the tcp connection and rdma won't come in between client and server.
>
>> And, after a while (a few minutes after restarting my back-transfer of around 40 TB), my volume went down (and all my rsync processes with it):
>> [root@atlas ~]# df -h /mnt
>> df: ‘/mnt’: Transport endpoint is not connected
>> df: no file systems processed
>
> Can you send me the following details, if possible?
> 1) the mount command used, 2) the volume status, 3) the client and brick logs
>
> Regards
> Rafi KC
>
>>
>> Geoffrey
>>
>>
>>
>> On 22 Jul 2015, at 09:17, Geoffrey Letessier <geoffrey.letessier at cnrs.fr> wrote:
>>
>>> Hi Rafi,
>>>
>>> That is what I do. But I notice this kind of trouble particularly when I mount my volumes manually.
>>>
>>> In addition, when I changed my transport-type from tcp or rdma to tcp,rdma, I had to restart my volume for the change to take effect.
>>>
>>> I wonder whether these troubles are due to the RDMA protocol, because things look more stable with TCP.
>>>
>>> Any other ideas?
>>> Thanks for replying, and thanks in advance,
>>> Geoffrey
>>>
>>> On 22 Jul 2015, at 07:33, Mohammed Rafi K C <rkavunga at redhat.com> wrote:
>>>
>>>>
>>>>
>>>> On 07/22/2015 04:51 AM, Geoffrey Letessier wrote:
>>>>> Hi Niels,
>>>>>
>>>>> Thanks for replying.
>>>>>
>>>>> In fact, after checking the log, I discovered that GlusterFS tried to connect to a brick on a TCP (or RDMA) port allocated to another volume (bug?).
>>>>> For example, here is an extract of my workdir.log file:
>>>>> [2015-07-21 21:34:01.820188] E [socket.c:2332:socket_connect_finish] 0-vol_workdir_amd-client-0: connection to 10.0.4.1:49161 failed (Connection refused)
>>>>> [2015-07-21 21:34:01.822563] E [socket.c:2332:socket_connect_finish] 0-vol_workdir_amd-client-2: connection to 10.0.4.1:49162 failed (Connection refused)
>>>>>
>>>>> But these two ports (49161 and 49162) belong only to my vol_home volume, not to vol_workdir_amd.
>>>>>
>>>>> Now, after restarting glusterd on all servers simultaneously (pdsh -w cl-storage[1-4] service glusterd restart), everything seems to be back to normal (size, write permission, etc.).
>>>>>
>>>>> But, a few minutes later, I noticed a strange thing that I have been seeing since I upgraded my storage cluster from 3.5.3 to 3.7.2-3: when I try to mount some volumes (particularly my vol_shared volume, which is replicated), my system can hang. And, because I use that volume in my bashrc file for my environment modules, I need to restart my node. The same happens if I run df on the mounted volume (if it doesn't hang during the mount).
>>>>>
>>>>> With the TCP transport-type, the situation seems to be more stable.
>>>>>
>>>>> In addition, if I restart a storage node, I can't use the Gluster CLI (it also hangs).
>>>>>
>>>>> Do you have any idea?
>>>>
>>>> Are you using a bash script to start/mount the volume? If so, add a sleep after the volume start and after the mount, to allow all the processes to start properly, because the RDMA protocol takes some time to initialize its resources.
>>>>
>>>> Regards
>>>> Rafi KC
>>>>
>>>>
>>>>
>>>>>
>>>>> Once again, thanks a lot for your help,
>>>>> Geoffrey
>>>>>
>>>>>
>>>>> On 21 Jul 2015, at 23:49, Niels de Vos <ndevos at redhat.com> wrote:
>>>>>
>>>>>> On Tue, Jul 21, 2015 at 11:20:20PM +0200, Geoffrey Letessier wrote:
>>>>>>> Hello Soumya, Hello everybody,
>>>>>>>
>>>>>>> network.ping-timeout was set to 42 seconds. I set it to 0 but it made no
>>>>>>> difference. The problem was that, after resetting the transport-type to
>>>>>>> rdma,tcp, some bricks went down after a few minutes. Despite restarting
>>>>>>> the volumes, after a few minutes some [other/different] bricks went down
>>>>>>> again.
>>>>>>
>>>>>> I'm not sure whether the ping-timeout is handled differently when RDMA is
>>>>>> used. Adding two of the guys that know RDMA well on CC.
>>>>>>
>>>>>>> Now, after re-creating my volume, the bricks stay alive but, oddly, I'm
>>>>>>> not able to write to my volume. In addition, I defined a distributed
>>>>>>> volume with 2 servers and 4 bricks of 250GB each, yet my final volume
>>>>>>> seems to be sized at only 500GB. It's surprising.
>>>>>>
>>>>>> As seen further below, the 500GB volume is caused by two unreachable
>>>>>> bricks. When the bricks are not reachable, the size of the bricks can
>>>>>> not be detected by the client and therefore 2x 250 GB is missing.
>>>>>>
>>>>>> It is unclear to me why writing to a pure distributed volume fails. When
>>>>>> a brick is not reachable, and the file should be created there, it
>>>>>> would normally get created on another brick. When the brick that should
>>>>>> have the file gets online, and a new lookup for the file is done, a so
>>>>>> called "link file" is created, which points to the file on the other
>>>>>> brick. I guess the failure has to do with the connection issues, and I
>>>>>> would suggest to get that solved first.
>>>>>>
>>>>>> HTH,
>>>>>> Niels
>>>>>>
>>>>>>
>>>>>>> Here you can find some information:
>>>>>>> # gluster volume status vol_workdir_amd
>>>>>>> Status of volume: vol_workdir_amd
>>>>>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>>>>> ------------------------------------------------------------------------------
>>>>>>> Brick ib-storage1:/export/brick_workdir/bri
>>>>>>> ck1/data                                    49185     49186      Y       23098
>>>>>>> Brick ib-storage3:/export/brick_workdir/bri
>>>>>>> ck1/data                                    49158     49159      Y       3886
>>>>>>> Brick ib-storage1:/export/brick_workdir/bri
>>>>>>> ck2/data                                    49187     49188      Y       23117
>>>>>>> Brick ib-storage3:/export/brick_workdir/bri
>>>>>>> ck2/data                                    49160     49161      Y       3905
>>>>>>>
>>>>>>> # gluster volume info vol_workdir_amd
>>>>>>>
>>>>>>> Volume Name: vol_workdir_amd
>>>>>>> Type: Distribute
>>>>>>> Volume ID: 087d26ea-c6df-4cbe-94af-ecd87b59aedb
>>>>>>> Status: Started
>>>>>>> Number of Bricks: 4
>>>>>>> Transport-type: tcp,rdma
>>>>>>> Bricks:
>>>>>>> Brick1: ib-storage1:/export/brick_workdir/brick1/data
>>>>>>> Brick2: ib-storage3:/export/brick_workdir/brick1/data
>>>>>>> Brick3: ib-storage1:/export/brick_workdir/brick2/data
>>>>>>> Brick4: ib-storage3:/export/brick_workdir/brick2/data
>>>>>>> Options Reconfigured:
>>>>>>> performance.readdir-ahead: on
>>>>>>>
>>>>>>> # pdsh -w storage[1,3] df -h /export/brick_workdir/brick{1,2}
>>>>>>> storage3: Filesystem            Size  Used Avail Use% Mounted on
>>>>>>> storage3: /dev/mapper/st--block1-blk1--workdir
>>>>>>> storage3:                       250G   34M  250G   1% /export/brick_workdir/brick1
>>>>>>> storage3: /dev/mapper/st--block2-blk2--workdir
>>>>>>> storage3:                       250G   34M  250G   1% /export/brick_workdir/brick2
>>>>>>> storage1: Filesystem            Size  Used Avail Use% Mounted on
>>>>>>> storage1: /dev/mapper/st--block1-blk1--workdir
>>>>>>> storage1:                       250G   33M  250G   1% /export/brick_workdir/brick1
>>>>>>> storage1: /dev/mapper/st--block2-blk2--workdir
>>>>>>> storage1:                       250G   33M  250G   1% /export/brick_workdir/brick2
>>>>>>>
>>>>>>> # df -h /workdir/
>>>>>>> Filesystem            Size  Used Avail Use% Mounted on
>>>>>>> localhost:vol_workdir_amd.rdma
>>>>>>>                       500G   67M  500G   1% /workdir
>>>>>>>
>>>>>>> # touch /workdir/test
>>>>>>> touch: cannot touch ‘/workdir/test’: No such file or directory
>>>>>>>
>>>>>>> # tail -30l /var/log/glusterfs/workdir.log
>>>>>>> Host Unreachable, Check your connection with IPoIB
>>>>>>> [2015-07-21 21:10:33.927673] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-2: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1020 peer:10.0.4.1:49174)
>>>>>>> Host Unreachable, Check your connection with IPoIB
>>>>>>> [2015-07-21 21:10:37.877231] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-0: changing port to 49173 (from 0)
>>>>>>> [2015-07-21 21:10:37.880556] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-2: changing port to 49174 (from 0)
>>>>>>> [2015-07-21 21:10:37.914661] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-0: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1021 peer:10.0.4.1:49173)
>>>>>>> Host Unreachable, Check your connection with IPoIB
>>>>>>> [2015-07-21 21:10:37.923535] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-2: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1020 peer:10.0.4.1:49174)
>>>>>>> Host Unreachable, Check your connection with IPoIB
>>>>>>> [2015-07-21 21:10:41.883925] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-0: changing port to 49173 (from 0)
>>>>>>> [2015-07-21 21:10:41.887085] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-2: changing port to 49174 (from 0)
>>>>>>> [2015-07-21 21:10:41.919394] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-0: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1021 peer:10.0.4.1:49173)
>>>>>>> Host Unreachable, Check your connection with IPoIB
>>>>>>> [2015-07-21 21:10:41.932622] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-2: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1020 peer:10.0.4.1:49174)
>>>>>>> Host Unreachable, Check your connection with IPoIB
>>>>>>> [2015-07-21 21:10:44.682636] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
>>>>>>> [2015-07-21 21:10:44.682947] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
>>>>>>> [2015-07-21 21:10:44.683240] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
>>>>>>> [2015-07-21 21:10:44.683472] W [dht-diskusage.c:48:dht_du_info_cbk] 0-vol_workdir_amd-dht: failed to get disk info from vol_workdir_amd-client-0
>>>>>>> [2015-07-21 21:10:44.683506] W [dht-diskusage.c:48:dht_du_info_cbk] 0-vol_workdir_amd-dht: failed to get disk info from vol_workdir_amd-client-2
>>>>>>> [2015-07-21 21:10:44.683532] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
>>>>>>> [2015-07-21 21:10:44.683551] W [fuse-bridge.c:1970:fuse_create_cbk] 0-glusterfs-fuse: 18: /test => -1 (No such file or directory)
>>>>>>> [2015-07-21 21:10:44.683619] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
>>>>>>> [2015-07-21 21:10:44.683846] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
>>>>>>> [2015-07-21 21:10:45.886807] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-0: changing port to 49173 (from 0)
>>>>>>> [2015-07-21 21:10:45.893059] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-2: changing port to 49174 (from 0)
>>>>>>> [2015-07-21 21:10:45.920434] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-0: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1021 peer:10.0.4.1:49173)
>>>>>>> Host Unreachable, Check your connection with IPoIB
>>>>>>> [2015-07-21 21:10:45.925292] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-2: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1020 peer:10.0.4.1:49174)
>>>>>>> Host Unreachable, Check your connection with IPoIB
>>>>>>>
>>>>>>> I have been using GlusterFS in production for around 3 years without any
>>>>>>> blocking problem, but the situation has been awful for more than 3 weeks.
>>>>>>> Indeed, our production has been down for roughly 3.5 weeks (with many
>>>>>>> different problems, first with GlusterFS v3.5.3 and now with 3.7.2-3) and
>>>>>>> I need to get it running again.
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>> Geoffrey
>>>>>>>
>>>>>>> On 21 Jul 2015, at 19:36, Soumya Koduri <skoduri at redhat.com> wrote:
>>>>>>>
>>>>>>>> From the following errors,
>>>>>>>>
>>>>>>>> [2015-07-21 14:36:30.495321] I [MSGID: 114020] [client.c:2118:notify] 0-vol_shared-client-0: parent translators are ready, attempting connect on transport
>>>>>>>> [2015-07-21 14:36:30.498989] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT 0 on socket 12, Protocol not available
>>>>>>>> [2015-07-21 14:36:30.499004] E [socket.c:3015:socket_connect] 0-vol_shared-client-0: Failed to set keep-alive: Protocol not available
>>>>>>>>
>>>>>>>> looks like setting TCP_USER_TIMEOUT value to 0 on the socket failed with error (IIUC) "Protocol not available".
>>>>>>>> Could you check whether 'network.ping-timeout' is set to zero for that volume using 'gluster volume info'? Anyway, from the code it looks like 'TCP_USER_TIMEOUT' can take the value zero. Not sure why it failed.
>>>>>>>>
>>>>>>>> Niels, any thoughts?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Soumya
>>>>>>>>
>>>>>>>> On 07/21/2015 08:15 PM, Geoffrey Letessier wrote:
>>>>>>>>> [2015-07-21 14:36:30.495321] I [MSGID: 114020] [client.c:2118:notify]
>>>>>>>>> 0-vol_shared-client-0: parent translators are ready, attempting connect
>>>>>>>>> on transport
>>>>>>>>> [2015-07-21 14:36:30.498989] W [socket.c:923:__socket_keepalive]
>>>>>>>>> 0-socket: failed to set TCP_USER_TIMEOUT 0 on socket 12, Protocol not
>>>>>>>>> available
>>>>>>>>> [2015-07-21 14:36:30.499004] E [socket.c:3015:socket_connect]
>>>>>>>>> 0-vol_shared-client-0: Failed to set keep-alive: Protocol not available