[Gluster-users] Performance issues with striped volume over Infiniband

Ionescu, A. a.ionescu at student.vu.nl
Wed Apr 18 10:05:33 UTC 2012


Dear Gluster Users,

We are facing severe performance issues with GlusterFS and would very much appreciate any help in identifying the cause.

Our setup is extremely simple: 2 nodes interconnected with 40 Gb/s InfiniBand (plus 1 Gb/s Ethernet), running CentOS 6.2 and GlusterFS 3.2.6.
Each node has 4 SATA drives in a RAID 0 array that delivers ~750 MB/s of random-read bandwidth. The tool we use to measure I/O performance relies on O_DIRECT access, so we patched the FUSE kernel module: http://marc.info/?l=linux-fsdevel&m=132950081331043&w=2.
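Our benchmark is an in-house tool, but the kind of measurement we run can be approximated with fio (shown here purely as an illustrative sketch; the exact parameters of our tool differ):

  # Illustrative only: random 2 MB reads with O_DIRECT against the Gluster mount.
  # Our real benchmark is a different tool; fio is used here just to show the access pattern,
  # and the size/numjobs values are arbitrary example numbers.
  fio --name=randread --directory=/mnt/gfs --rw=randread \
      --direct=1 --bs=2M --size=4G --numjobs=4 --group_reporting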

We created the following volume and mounted it at /mnt/gfs/.

Volume Name: GFS_RDMA_VOLUME
Type: Stripe
Status: Started
Number of Bricks: 2
Transport-type: rdma
Bricks:
Brick1: node01:/mnt/md0/gfs_storage
Brick2: node02:/mnt/md0/gfs_storage
Options Reconfigured:
cluster.stripe-block-size: *:2MB
performance.quick-read: on
performance.io-cache: on
performance.cache-size: 256MB
performance.cache-max-file-size: 128MB
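For completeness, the volume was created and tuned with commands along these lines (reconstructed from the volume info above, so the exact invocation may have differed slightly):

  # Reconstructed from 'gluster volume info'; not a verbatim transcript of what we typed.
  gluster volume create GFS_RDMA_VOLUME stripe 2 transport rdma \
      node01:/mnt/md0/gfs_storage node02:/mnt/md0/gfs_storage
  gluster volume set GFS_RDMA_VOLUME cluster.stripe-block-size '*:2MB'
  gluster volume set GFS_RDMA_VOLUME performance.quick-read on
  gluster volume set GFS_RDMA_VOLUME performance.io-cache on
  gluster volume set GFS_RDMA_VOLUME performance.cache-size 256MB
  gluster volume set GFS_RDMA_VOLUME performance.cache-max-file-size 128MB
  gluster volume start GFS_RDMA_VOLUME

  # Native-client mount on the test machine; since the volume only has an rdma
  # transport, the fetched client volfile uses rdma (as seen in the log below).
  mount -t glusterfs node01:/GFS_RDMA_VOLUME /mnt/gfs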

We expected an aggregate I/O bandwidth of about 1500 MB/s from the two bricks combined (measured with the exact same tool and parameters), but unfortunately we only get ~100 MB/s, which is very disappointing.

Please find below the output of #cat /var/log/glusterfs/mnt-gfs-.log. If you need any other information that I forgot to mention, please let me know.

Thanks,
Adrian

________________________________
[2012-04-18 11:59:42.847818] I [glusterfsd.c:1493:main] 0-/opt/glusterfs/3.2.6/sbin/glusterfs: Started running /opt/glusterfs/3.2.6/sbin/glusterfs version 3.2.6
[2012-04-18 11:59:42.862610] W [write-behind.c:3023:init] 0-GFS_RDMA_VOLUME-write-behind: disabling write-behind for first 0 bytes
[2012-04-18 11:59:43.318188] I [client.c:1935:notify] 0-GFS_RDMA_VOLUME-client-0: parent translators are ready, attempting connect on transport
[2012-04-18 11:59:43.321287] I [client.c:1935:notify] 0-GFS_RDMA_VOLUME-client-1: parent translators are ready, attempting connect on transport
Given volfile:
+------------------------------------------------------------------------------+
  1: volume GFS_RDMA_VOLUME-client-0
  2:     type protocol/client
  3:     option remote-host node01
  4:     option remote-subvolume /mnt/md0/gfs_storage
  5:     option transport-type rdma
  6: end-volume
  7:
  8: volume GFS_RDMA_VOLUME-client-1
  9:     type protocol/client
 10:     option remote-host node02
 11:     option remote-subvolume /mnt/md0/gfs_storage
 12:     option transport-type rdma
 13: end-volume
 14:
 15: volume GFS_RDMA_VOLUME-stripe-0
 16:     type cluster/stripe
 17:     option block-size *:2MB
 18:     subvolumes GFS_RDMA_VOLUME-client-0 GFS_RDMA_VOLUME-client-1
 19: end-volume
 20:
 21: volume GFS_RDMA_VOLUME-write-behind
 22:     type performance/write-behind
 23:     subvolumes GFS_RDMA_VOLUME-stripe-0
 24: end-volume
 25:
 26: volume GFS_RDMA_VOLUME-read-ahead
 27:     type performance/read-ahead
 28:     subvolumes GFS_RDMA_VOLUME-write-behind
 29: end-volume
 30:
 31: volume GFS_RDMA_VOLUME-io-cache
 32:     type performance/io-cache
 33:     option max-file-size 128MB
 34:     option cache-size 256MB
 35:     subvolumes GFS_RDMA_VOLUME-read-ahead
 36: end-volume
 37:
 38: volume GFS_RDMA_VOLUME-quick-read
 39:     type performance/quick-read
 40:     option cache-size 256MB
 41:     subvolumes GFS_RDMA_VOLUME-io-cache
 42: end-volume
 43:
 44: volume GFS_RDMA_VOLUME-stat-prefetch
 45:     type performance/stat-prefetch
 46:     subvolumes GFS_RDMA_VOLUME-quick-read
 47: end-volume
 48:
 49: volume GFS_RDMA_VOLUME
 50:     type debug/io-stats
 51:     option latency-measurement off
 52:     option count-fop-hits off
 53:     subvolumes GFS_RDMA_VOLUME-stat-prefetch
 54: end-volume

+------------------------------------------------------------------------------+
[2012-04-18 11:59:43.326287] E [client-handshake.c:1171:client_query_portmap_cbk] 0-GFS_RDMA_VOLUME-client-1: failed to get the port number for remote subvolume
[2012-04-18 11:59:43.764287] E [client-handshake.c:1171:client_query_portmap_cbk] 0-GFS_RDMA_VOLUME-client-0: failed to get the port number for remote subvolume
[2012-04-18 11:59:46.868595] I [rpc-clnt.c:1536:rpc_clnt_reconfig] 0-GFS_RDMA_VOLUME-client-0: changing port to 24009 (from 0)
[2012-04-18 11:59:46.879292] I [rpc-clnt.c:1536:rpc_clnt_reconfig] 0-GFS_RDMA_VOLUME-client-1: changing port to 24009 (from 0)
[2012-04-18 11:59:50.872346] I [client-handshake.c:1090:select_server_supported_programs] 0-GFS_RDMA_VOLUME-client-0: Using Program GlusterFS 3.2.6, Num (1298437), Version (310)
[2012-04-18 11:59:50.872760] I [client-handshake.c:913:client_setvolume_cbk] 0-GFS_RDMA_VOLUME-client-0: Connected to 192.168.0.101:24009, attached to remote volume '/mnt/md0/gfs_storage'.
[2012-04-18 11:59:50.874975] I [client-handshake.c:1090:select_server_supported_programs] 0-GFS_RDMA_VOLUME-client-1: Using Program GlusterFS 3.2.6, Num (1298437), Version (310)
[2012-04-18 11:59:50.875290] I [client-handshake.c:913:client_setvolume_cbk] 0-GFS_RDMA_VOLUME-client-1: Connected to 192.168.0.103:24009, attached to remote volume '/mnt/md0/gfs_storage'.
[2012-04-18 11:59:50.878013] I [fuse-bridge.c:3339:fuse_graph_setup] 0-fuse: switched to graph 0
[2012-04-18 11:59:50.878321] I [fuse-bridge.c:2927:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
________________________________