[Gluster-users] gluster and LIO, fairly basic setup, having major issues

Michael Ciccarelli mikecicc01 at gmail.com
Thu Oct 6 20:40:25 UTC 2016


here is the profile for about 30 seconds.. I didn't let it run a full 60:

Brick: media2-be:/gluster/brick1/gluster_volume_0
-------------------------------------------------
Cumulative Stats:
   Block Size:                512b+                1024b+                2048b+
 No. of Reads:                    0                     0                     0
No. of Writes:                31133                 37339                 35573

   Block Size:               4096b+                8192b+               16384b+
 No. of Reads:                    0                     0                     0
No. of Writes:               284535                 91431                 43838

   Block Size:              32768b+               65536b+              131072b+
 No. of Reads:                    0                     0                181121
No. of Writes:                27764                 22258                226187

   Block Size:             262144b+
 No. of Reads:                    0
No. of Writes:                    7

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us             53     RELEASE
      0.00       0.00 us       0.00 us       0.00 us            272  RELEASEDIR
      0.08     217.02 us      17.00 us   46751.00 us            837        STAT
      0.54     487.22 us       5.00 us  150634.00 us           2675    FINODELK
      1.07    2591.53 us      24.00 us  186199.00 us           1001        READ
      1.77    3224.61 us      16.00 us  113361.00 us           1322       WRITE
      3.02      84.19 us       8.00 us  186102.00 us          86693     INODELK
      5.10   11293.23 us      20.00 us  153002.00 us           1090    FXATTROP
     88.42  395188.99 us    2771.00 us 2378742.00 us            540       FSYNC

    Duration: 82547 seconds
   Data Read: 23739891712 bytes
Data Written: 36058159104 bytes

Interval 1 Stats:
   Block Size:                512b+                1024b+                2048b+
 No. of Reads:                    0                     0                     0
No. of Writes:                   24                     2                    14

   Block Size:               4096b+                8192b+               16384b+
 No. of Reads:                    0                     0                     0
No. of Writes:                  167                    28                     4

   Block Size:              32768b+               65536b+              131072b+
 No. of Reads:                    0                     0                   309
No. of Writes:                    8                     1                     0

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.13      61.58 us      17.00 us    2074.00 us            224        STAT
      0.19      41.50 us       7.00 us    4143.00 us            498    FINODELK
      0.53     188.37 us      27.00 us   11377.00 us            309        READ
      3.04    1337.96 us      18.00 us   48765.00 us            248       WRITE
      5.28    2594.87 us      21.00 us   47939.00 us            222    FXATTROP
     14.58      20.41 us       8.00 us   47905.00 us          77937     INODELK
     76.25   74945.14 us   23687.00 us  199942.00 us            111       FSYNC

    Duration: 53 seconds
   Data Read: 40501248 bytes
Data Written: 1512448 bytes

Brick: media1-be:/gluster/brick1/gluster_volume_0
-------------------------------------------------
Cumulative Stats:
   Block Size:                512b+                1024b+                2048b+
 No. of Reads:                    0                     0                     0
No. of Writes:                 2831                  4699                  6142

   Block Size:               4096b+                8192b+               16384b+
 No. of Reads:                    0                     0                     0
No. of Writes:                46751                 16712                  7972

   Block Size:              32768b+               65536b+              131072b+
 No. of Reads:                    0                     0                     0
No. of Writes:                 4462                  2938                 27952

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us              7     RELEASE
      0.00       0.00 us       0.00 us       0.00 us             11  RELEASEDIR
      1.75     245.15 us      46.00 us   19886.00 us           1321       WRITE
      6.99    1191.45 us     114.00 us  215838.00 us           1089    FXATTROP
     10.07     698.36 us      18.00 us  286316.00 us           2674    FINODELK
     24.51      52.44 us      23.00 us  171166.00 us          86694     INODELK
     56.69   19472.96 us    1568.00 us  249274.00 us            540       FSYNC

    Duration: 2224 seconds
   Data Read: 0 bytes
Data Written: 4669031424 bytes

Interval 1 Stats:
   Block Size:                512b+                1024b+                2048b+
 No. of Reads:                    0                     0                     0
No. of Writes:                   24                     2                    14

   Block Size:               4096b+                8192b+               16384b+
 No. of Reads:                    0                     0                     0
No. of Writes:                  167                    28                     4

   Block Size:              32768b+               65536b+
 No. of Reads:                    0                     0
No. of Writes:                    8                     1

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      1.12     302.43 us     114.00 us   10132.00 us            222    FXATTROP
      1.18     285.74 us      56.00 us    7058.00 us            248       WRITE
      4.31     519.43 us      21.00 us  188427.00 us            498    FINODELK
     32.76   17714.02 us    5018.00 us  205904.00 us            111       FSYNC
     60.63      46.69 us      23.00 us    9550.00 us          77936     INODELK

    Duration: 53 seconds
   Data Read: 0 bytes
Data Written: 1512448 bytes
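
For reference, profile output like the above comes from gluster's built-in profiling; presumably something along these lines was used (assuming the volume name gvol0):

gluster volume profile gvol0 start   # turns on diagnostics.latency-measurement and diagnostics.count-fop-hits
gluster volume profile gvol0 info    # prints cumulative stats plus the interval since the last info call
gluster volume profile gvol0 stop    # turns profiling off again when done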


[root@media2 ~]# gluster volume status
Status of volume: gvol0
Gluster process                                   TCP Port  RDMA Port  Online  Pid
-----------------------------------------------------------------------------------
Brick media1-be:/gluster/brick1/gluster_volume_0  49152     0          Y       2829
Brick media2-be:/gluster/brick1/gluster_volume_0  49152     0          Y       1456
NFS Server on localhost                           N/A       N/A        N       N/A
Self-heal Daemon on localhost                     N/A       N/A        Y       1451
NFS Server on media1                              N/A       N/A        N       N/A
Self-heal Daemon on media1                        N/A       N/A        Y       2824

Task Status of Volume gvol0
-----------------------------------------------------------------------------------
There are no active volume tasks
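
Note: the NFS Server rows above show N (not online) on both nodes. Since this setup exports storage over iSCSI via LIO rather than gluster NFS, that is not necessarily a problem in itself, and if the built-in NFS server really isn't wanted it can be switched off per volume, which also quiets the "nfs has disconnected from glusterd" messages quoted further down. A minimal sketch (assuming NFS is indeed unused here):

gluster volume set gvol0 nfs.disable on   # stop glusterd from spawning and monitoring the gluster NFS server for this volume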



On Thu, Oct 6, 2016 at 4:25 PM, Michael Ciccarelli <mikecicc01 at gmail.com>
wrote:

> this is the info file contents.. is there another file you would want to
> see for config?
> type=2
> count=2
> status=1
> sub_count=2
> stripe_count=1
> replica_count=2
> disperse_count=0
> redundancy_count=0
> version=3
> transport-type=0
> volume-id=98c258e6-ae9e-4407-8f25-7e3f7700e100
> username=removed just cause
> password=removed just cause
> op-version=3
> client-op-version=3
> quota-version=0
> parent_volname=N/A
> restored_from_snap=00000000-0000-0000-0000-000000000000
> snap-max-hard-limit=256
> diagnostics.count-fop-hits=on
> diagnostics.latency-measurement=on
> performance.readdir-ahead=on
> brick-0=media1-be:-gluster-brick1-gluster_volume_0
> brick-1=media2-be:-gluster-brick1-gluster_volume_0
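>
> For readability, the layout that info file describes is just a 2-brick
> replica volume with profiling enabled; a rough sketch of how an equivalent
> volume would be created (reconstructed from the options above, not the
> actual commands that were run):
>
> gluster volume create gvol0 replica 2 media1-be:/gluster/brick1/gluster_volume_0 media2-be:/gluster/brick1/gluster_volume_0
> gluster volume start gvol0
> gluster volume set gvol0 performance.readdir-ahead on
> gluster volume profile gvol0 start   # sets the two diagnostics.* options shown above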
>
> here are some log entries from etc-glusterfs-glusterd.vol.log:
> The message "I [MSGID: 106006] [glusterd-svc-mgmt.c:323:glusterd_svc_common_rpc_notify] 0-management: nfs has disconnected from glusterd." repeated 39 times between [2016-10-06 20:10:14.963402] and [2016-10-06 20:12:11.979684]
> [2016-10-06 20:12:14.980203] I [MSGID: 106006] [glusterd-svc-mgmt.c:323:glusterd_svc_common_rpc_notify] 0-management: nfs has disconnected from glusterd.
> [2016-10-06 20:13:50.993490] W [socket.c:596:__socket_rwv] 0-nfs: readv on /var/run/gluster/360710d59bc4799f8c8a6374936d2b1b.socket failed (Invalid argument)
>
> I can provide any specific details you would like to see.. Last night I
> tried 1 more time and it appeared to be working ok running 1 VM under
> VMware, but as soon as I had 3 running the targets became unresponsive. I
> believe the gluster volume is ok, but for whatever reason the iSCSI target
> daemon seems to be having some issues...
>
> here is from the messages file:
> Oct  5 23:13:00 media2 kernel: MODE SENSE: unimplemented page/subpage: 0x1c/0x02
> Oct  5 23:13:00 media2 kernel: MODE SENSE: unimplemented page/subpage: 0x1c/0x02
> Oct  5 23:13:35 media2 kernel: iSCSI/iqn.1998-01.com.vmware:vmware4-0941d552: Unsupported SCSI Opcode 0x4d, sending CHECK_CONDITION.
> Oct  5 23:13:35 media2 kernel: iSCSI/iqn.1998-01.com.vmware:vmware4-0941d552: Unsupported SCSI Opcode 0x4d, sending CHECK_CONDITION.
>
> and here are some more VMware iscsi errors:
> 2016-10-06T20:22:11.496Z cpu2:32825)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x89 (0x412e808532c0, 32801) to dev "naa.6001405c0d86944f3d2468d80c7d1540" on
> 2016-10-06T20:22:11.635Z cpu2:32787)ScsiDeviceIO: 2338: Cmd(0x412e808532c0) 0x89, CmdSN 0x4f05 from world 32801 to dev "naa.6001405c0d86944f3d2468d80c7d1
> 2016-10-06T20:22:11.635Z cpu3:35532)Fil3: 15389: Max timeout retries exceeded for caller Fil3_FileIO (status 'Timeout')
>
> 2016-10-06T20:22:11.635Z cpu2:196414)HBX: 2832: Waiting for timed out [HB state abcdef02 offset 3928064 gen 25 stampUS 49571997650 uuid 57f5c142-45632d75
> 2016-10-06T20:22:11.635Z cpu3:35532)HBX: 2832: Waiting for timed out [HB state abcdef02 offset 3928064 gen 25 stampUS 49571997650 uuid 57f5c142-45632d75-
> 2016-10-06T20:22:11.635Z cpu0:32799)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x28 (0x412e80848580, 32799) to dev "naa.6001405c0d86944f3d2468d80c7d1540" on
> 2016-10-06T20:22:11.635Z cpu0:32799)ScsiDeviceIO: 2325: Cmd(0x412e80848580) 0x28, CmdSN 0x4f06 from world 32799 to dev "naa.6001405c0d86944f3d2468d80c7d1
> 2016-10-06T20:22:11.773Z cpu0:32843)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x28 (0x412e80848580, 32799) to dev "naa.6001405c0d86944f3d2468d80c7d1540" on
> 2016-10-06T20:22:11.916Z cpu0:35549)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x28 (0x412e80848580, 32799) to dev "naa.6001405c0d86944f3d2468d80c7d1540" on
> 2016-10-06T20:22:12.000Z cpu2:33431)iscsi_vmk: iscsivmk_ConnNetRegister: socket 0x410987bf0800 network resource pool netsched.pools.persist.iscsi associa
> 2016-10-06T20:22:12.000Z cpu2:33431)iscsi_vmk: iscsivmk_ConnNetRegister: socket 0x410987bf0800 network tracker id 16 tracker.iSCSI.172.16.1.40 associated
> 2016-10-06T20:22:12.056Z cpu0:35549)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x28 (0x412e80848580, 32799) to dev "naa.6001405c0d86944f3d2468d80c7d1540" on
> 2016-10-06T20:22:12.194Z cpu0:35549)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x28 (0x412e80848580, 32799) to dev "naa.6001405c0d86944f3d2468d80c7d1540" on
> 2016-10-06T20:22:12.253Z cpu2:33431)WARNING: iscsi_vmk: iscsivmk_StartConnection: vmhba38:CH:1 T:1 CN:0: iSCSI connection is being marked "ONLINE"
> 2016-10-06T20:22:12.253Z cpu2:33431)WARNING: iscsi_vmk: iscsivmk_StartConnection: Sess [ISID: 00023d000004 TARGET: iqn.2016-09.iscsi.gluster:shared TPGT:
> 2016-10-06T20:22:12.253Z cpu2:33431)WARNING: iscsi_vmk: iscsivmk_StartConnection: Conn [CID: 0 L: 172.16.1.53:49959 R: 172.16.1.40:3260]
>
> Is it that the gluster overhead is just killing LIO/target?
>
> thanks,
> Mike
>
>
>
> On Thu, Oct 6, 2016 at 12:22 PM, Vijay Bellur <vbellur at redhat.com> wrote:
>
>> Hi Mike,
>>
>> Can you please share your gluster volume configuration?
>>
>> Also do you notice anything in client logs on the node where fileio
>> backstore is configured?
>>
>> Thanks,
>> Vijay
>>
>> On Wed, Oct 5, 2016 at 8:56 PM, Michael Ciccarelli <mikecicc01 at gmail.com>
>> wrote:
>> > So I have a fairly basic setup using glusterfs between 2 nodes. The nodes
>> > have 10 gig connections and the bricks reside on SSD LVM LUNs:
>> >
>> > Brick1: media1-be:/gluster/brick1/gluster_volume_0
>> > Brick2: media2-be:/gluster/brick1/gluster_volume_0
>> >
>> >
>> > On this volume I have an LIO iSCSI target with 1 fileio backstore that's
>> > being shared out to VMware ESXi hosts. The volume is around 900 gig and the
>> > fileio store is around 850g:
>> >
>> > -rw-r--r-- 1 root root 912680550400 Oct  5 20:47 iscsi.disk.3
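>> >
>> > Roughly, the LIO side looks like the following in targetcli terms (just a
>> > sketch, not the exact commands I ran; the backstore name and the
>> > /mnt/gluster mount point are placeholders, and the step that gives both
>> > nodes the same WWN/serial is left out):
>> >
>> > targetcli /backstores/fileio create gluster_lun /mnt/gluster/iscsi.disk.3
>> > targetcli /iscsi create iqn.2016-09.iscsi.gluster:shared
>> > targetcli /iscsi/iqn.2016-09.iscsi.gluster:shared/tpg1/luns create /backstores/fileio/gluster_lun
>> > targetcli /iscsi/iqn.2016-09.iscsi.gluster:shared/tpg1/portals create 172.16.1.40 3260
>> > targetcli saveconfig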
>> >
>> > I set the WWN to be the same so the ESXi hosts see the nodes as 2 paths to
>> > the same target. I believe this is what I want. The issue I'm seeing is
>> > that while IO wait is low, CPU usage is high with only 3 VMs running on
>> > just 1 of the ESX servers:
>> >
>> > this is media2-be:
>> >   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
>> >  1474 root      20   0 1396620  37912   5980 S 135.0  0.1 157:01.84 glusterfsd
>> >  1469 root      20   0  747996  13724   5424 S   2.0  0.0   1:10.59 glusterfs
>> >
>> > And this morning it seemed like I had to restart the LIO service on
>> > media1-be because VMware was seeing time-out issues. I'm seeing errors like
>> > this on the VMware ESX servers:
>> >
>> > 2016-10-06T00:51:41.100Z cpu0:32785)WARNING: ScsiDeviceIO: 1223: Device naa.600140501ce79002e724ebdb66a6756d performance has deteriorated. I/O latency increased from average value of 33420 microseconds to 732696 microseconds.
>> >
>> > Are there any special settings I need for gluster+LIO+VMware to work? Has
>> > anyone gotten this working well enough that it is stable? What am I
>> > missing?
>> >
>> > thanks,
>> > Mike
>> >
>> >
>> >
>> > _______________________________________________
>> > Gluster-users mailing list
>> > Gluster-users at gluster.org
>> > http://www.gluster.org/mailman/listinfo/gluster-users
>>
>
>