[Gluster-users] Gluster Infiniband/RDMA Help

Pranith Kumar Karampuri pkarampu at redhat.com
Thu Aug 11 06:53:26 UTC 2016


Added Rafi and Raghavendra, who work on RDMA.

On Mon, Aug 8, 2016 at 7:58 AM, Dan Lavu <dan at redhat.com> wrote:

> Hello,
>
> I'm having some major problems with Gluster and oVirt; I've been ripping
> my hair out over this, so if anybody can provide insight, that would be
> fantastic. I've tried both transports, TCP and RDMA, and both are showing
> instability problems.
>
> The first thing I'm running into: intermittently, one specific node
> gets spammed with the following message:
>
> "[2016-08-08 00:42:50.837992] E [rpc-clnt.c:357:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fb728b0f293] (-->
> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1d1)[0x7fb7288d73d1] (-->
> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fb7288d74ee] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7fb7288d8d0e]
> (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7fb7288d9528] )))))
> 0-vmdata1-client-0: forced unwinding frame type(GlusterFS 3.3)
> op(WRITE(13)) called at 2016-08-08 00:42:43.620710 (xid=0x6800b)"
>
> Then the InfiniBand device gets bounced and VMs get stuck.
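
A note on the log above: saved_frames_unwind messages like this usually mean
the client gave up on a brick connection after network.ping-timeout expired,
and with the 5-second ping-timeout shown in the volume options further down, a
brief RDMA hiccup is enough to unwind in-flight writes. A minimal thing to try
(volume names taken from the listings below; 42 is the stock default):

    # check what the volume currently uses
    gluster volume get vmdata_ha network.ping-timeout

    # raise it back toward the default on both volumes
    gluster volume set vmdata_ha network.ping-timeout 42
    gluster volume set vmdata1 network.ping-timeout 42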
>
> Another problem I'm seeing: once a day or every two days, an oVirt node
> will hang on its Gluster mounts. Issuing a df to check the mounts just
> stalls; this occurs hourly if RDMA is used. Most of the time I can log
> into the hypervisor and remount the Gluster volumes.
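
When a FUSE mount wedges like that, the usual recovery is a lazy unmount
followed by an explicit remount with the transport spelled out. A rough sketch
(the oVirt storage-domain path here is only illustrative; substitute the real
mount point from df or /proc/mounts):

    # force the dead mount out of the way
    umount -l /rhev/data-center/mnt/glusterSD/spidey.ib.runlevelone.lan:_vmdata1

    # remount, stating the transport explicitly
    mount -t glusterfs -o transport=rdma \
        spidey.ib.runlevelone.lan:/vmdata1 \
        /rhev/data-center/mnt/glusterSD/spidey.ib.runlevelone.lan:_vmdata1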
>
> This is on Fedora 23 with Gluster 3.8.1-1. The InfiniBand gear is 40Gb/s
> QDR QLogic using the ib_qib module; this configuration was working with
> our old InfiniHost III. I couldn't get OFED to compile, so all the
> InfiniBand modules are the ones Fedora ships.
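
Since the RDMA stack is the in-distro one rather than OFED, it may be worth
confirming it is healthy on each node before digging into Gluster itself. A
quick sanity check with the standard tools (infiniband-diags, libibverbs-utils
and perftest packages on Fedora):

    # HCA and link state; expect State: Active and Physical state: LinkUp
    ibstat

    # the verbs device should be visible to userspace
    ibv_devinfo

    # raw RDMA throughput between two nodes
    ib_write_bw                              # on one node (server side)
    ib_write_bw deadpool.ib.runlevelone.lan  # on another node (client side)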
>
> A volume looks like the following (please point out anything I need to
> adjust; the settings were pulled from several examples):
>
> Volume Name: vmdata_ha
> Type: Replicate
> Volume ID: 325a5fda-a491-4c40-8502-f89776a3c642
> Status: Started
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp,rdma
> Bricks:
> Brick1: deadpool.ib.runlevelone.lan:/gluster/vmdata_ha
> Brick2: spidey.ib.runlevelone.lan:/gluster/vmdata_ha
> Brick3: groot.ib.runlevelone.lan:/gluster/vmdata_ha (arbiter)
> Options Reconfigured:
> performance.least-prio-threads: 4
> performance.low-prio-threads: 16
> performance.normal-prio-threads: 24
> performance.high-prio-threads: 24
> cluster.self-heal-window-size: 32
> cluster.self-heal-daemon: on
> performance.md-cache-timeout: 1
> performance.cache-max-file-size: 2MB
> performance.io-thread-count: 32
> network.ping-timeout: 5
> performance.write-behind-window-size: 4MB
> performance.cache-size: 256MB
> performance.cache-refresh-timeout: 10
> server.allow-insecure: on
> network.remote-dio: enable
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> storage.owner-gid: 36
> storage.owner-uid: 36
> performance.readdir-ahead: on
> nfs.disable: on
> config.transport: tcp,rdma
> performance.stat-prefetch: off
> cluster.eager-lock: enable
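
Nothing in the option list jumps out as obviously wrong, but since the
settings were collected from several examples, one approach is to strip the
tuning back and reintroduce options one at a time. A hedged sketch (resetting
options does not touch data; the virt group only applies if your packages ship
/var/lib/glusterd/groups/virt):

    # revert a single tunable to its default
    gluster volume reset vmdata_ha performance.write-behind-window-size

    # or apply the stock virt profile that oVirt setups normally use
    gluster volume set vmdata_ha group virt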
>
> Volume Name: vmdata1
> Type: Distribute
> Volume ID: 3afefcb3-887c-4315-b9dc-f4e890f786eb
> Status: Started
> Number of Bricks: 2
> Transport-type: tcp,rdma
> Bricks:
> Brick1: spidey.ib.runlevelone.lan:/gluster/vmdata1
> Brick2: deadpool.ib.runlevelone.lan:/gluster/vmdata1
> Options Reconfigured:
> config.transport: tcp,rdma
> network.remote-dio: enable
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> nfs.disable: on
> storage.owner-gid: 36
> storage.owner-uid: 36
> performance.readdir-ahead: on
> server.allow-insecure: on
> performance.stat-prefetch: off
> performance.cache-refresh-timeout: 10
> performance.cache-size: 256MB
> performance.write-behind-window-size: 4MB
> network.ping-timeout: 5
> performance.io-thread-count: 32
> performance.cache-max-file-size: 2MB
> performance.md-cache-timeout: 1
> performance.high-prio-threads: 24
> performance.normal-prio-threads: 24
> performance.low-prio-threads: 16
> performance.least-prio-threads: 4
>
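
For a tcp,rdma volume it is also worth confirming the bricks actually came up
with an RDMA port and that the clients are really using rdma rather than
silently falling back to tcp. Something along these lines (volume name from
above):

    # every brick should show a non-zero RDMA Port alongside its TCP Port
    gluster volume status vmdata1

    # on a client, the running fuse process shows which volfile it requested;
    # a volfile-id ending in .rdma normally indicates the rdma transport
    ps ax | grep '[g]lusterfs'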
>
> /etc/glusterfs/glusterd.vol
> volume management
>     type mgmt/glusterd
>     option working-directory /var/lib/glusterd
>     option transport-type socket,tcp
>     option transport.socket.keepalive-time 10
>     option transport.socket.keepalive-interval 2
>     option transport.socket.read-fail-log off
>     option ping-timeout 0
>     option event-threads 1
> #    option rpc-auth-allow-insecure on
>     option transport.socket.bind-address 0.0.0.0
> #   option transport.address-family inet6
> #   option base-port 49152
> end-volume
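
One thing that stands out: server.allow-insecure is on for both volumes, but
the matching rpc-auth-allow-insecure line in glusterd.vol is still commented
out. If the intent is to let clients connect from unprivileged ports (as
oVirt/libgfapi setups commonly need), the usual pairing is roughly:

    # /etc/glusterfs/glusterd.vol -- uncomment inside the management volume
        option rpc-auth-allow-insecure on

    # then restart glusterd on each node for the change to take effect
    systemctl restart glusterd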
>
> I think that's a good start. Thank you so much for taking the time to
> look at this. You can find me on freenode (nick: side_control) if you
> want to chat; I'm GMT-5.
>
> Cheers,
>
> Dan
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>



-- 
Pranith