[Gluster-users] High load on glusterfs client

Sebastian.Gumprich at t-systems.com
Tue Mar 1 19:07:36 UTC 2016


Hello everyone,

I'm experiencing high load on our glusterfs clients.

Here's the setup:

There are two glusterfs servers, nfs01 and nfs02, with the following configuration:

[root@nfs01 ~]# gluster volume info opt

Volume Name: opt
Type: Replicate
Volume ID: 5b77070f-5378-45ec-9eda-5f7dd007ff8a
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1:  nfs01:/opt/bkk
Brick2: nfs02:/opt/bkk
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.cache-size: 512MB
performance.cache-refresh-timeout: 10
performance.read-ahead: off
performance.write-behind-window-size: 4MB
network.ping-timeout: 2
performance.io-thread-count: 16
performance.cache-max-file-size: 2MB
performance.md-cache-timeout: 1

Then there are two clients (web01 and web02) that mount the volume via a virtual IP address (nfs-VIP):
nfs-VIP:/opt on /opt/bkk type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

The operating system on all servers is CentOS Linux release 7.2.1511 (Core).
The glusterfs version is glusterfs 3.7.6, built on Nov  9 2015 15:20:26.

The volume holds the dynamic PHP web content of a TYPO3 CMS.

On the client (web01) the following is logged in the gluster.log:

iner_08850598886fb5f39c9cf1d269d7e20677f97ede.php>, e09948dd-1e9b-4430-8f55-3df64cda2385 on opt-client-1 and ba80a475-7b83-4c83-bd0c-798a108bfb63 on opt-client-0. Skipping conservative merge on the file.
[2016-03-01 18:40:50.570040] E [MSGID: 108008] [afr-self-heal-entry.c:253:afr_selfheal_detect_gfid_and_type_mismatch] 0-opt-replicate-0: Gfid mismatch detected for <4c6dda77-6a2b-4996-bca4-9ace4cee45cc/News_News_layout_Detail_html_bd113d9c433c8f88376e47547db3b94e698a5ecd.php>, 739ee14c-2d5d-458b-bffd-83595bfcbe6a on opt-client-1 and 5a311733-731e-4478-ad3c-a70fbf66ba30 on opt-client-0. Skipping conservative merge on the file.
[2016-03-01 18:40:50.572992] E [MSGID: 108008] [afr-self-heal-entry.c:253:afr_selfheal_detect_gfid_and_type_mismatch] 0-opt-replicate-0: Gfid mismatch detected for <4c6dda77-6a2b-4996-bca4-9ace4cee45cc/News_News_partial_Detail_FalMediaContainer_9c1b3fd40fca9019726b3f6b8bc04618ffadab7b.php>, bb6907a1-ce80-4e03-92df-6fbc69d24a4d on opt-client-1 and 6f88aa67-cb81-4c26-94b2-e3aaa8704e8d on opt-client-0. Skipping conservative merge on the file.
[2016-03-01 18:40:50.791704] E [MSGID: 108008] [afr-self-heal-entry.c:253:afr_selfheal_detect_gfid_and_type_mismatch] 0-opt-replicate-0: Gfid mismatch detected for <4c6dda77-6a2b-4996-bca4-9ace4cee45cc/Powermail_Form_action_create_f40464a6a7f73d86cda514065167d59a7ddece73.php>, 5e5b224b-ea20-4d38-8504-61b24f5d6a3b on opt-client-1 and fab07af5-2aa5-4873-a6e2-6265ec78e304 on opt-client-0. Skipping conservative merge on the file.
[2016-03-01 18:40:54.085964] E [MSGID: 108008] [afr-self-heal-entry.c:253:afr_selfheal_detect_gfid_and_type_mismatch] 0-opt-replicate-0: Gfid mismatch detected for <4c6dda77-6a2b-4996-bca4-9ace4cee45cc/News_News_action_detail_8d30b654cd8343fe40616b8a2f8a5343b1ed776e.php>, 4d75a687-b9ab-4f97-b698-38668d1981ae on opt-client-1 and 110b315e-2e28-4859-a8b9-e0f1629faa3c on opt-client-0. Skipping conservative merge on the file.
[2016-03-01 18:40:56.153651] E [MSGID: 108008] [afr-self-heal-entry.c:253:afr_selfheal_detect_gfid_and_type_mismatch] 0-opt-replicate-0: Gfid mismatch detected for <4c6dda77-6a2b-4996-bca4-9ace4cee45cc/Powermail_Form_layout_Default_aae217b167ad82f4b1258bb01fa73f305844dbd8.php>, 6f7e2709-8c14-486a-85a2-a3cb48af4ca5 on opt-client-1 and 6ab62408-0406-4834-96b9-a51e18441d4c on opt-client-0. Skipping conservative merge on the file.
[2016-03-01 18:41:05.476126] I [MSGID: 108026] [afr-self-heal-entry.c:593:afr_selfheal_entry_do] 0-opt-replicate-0: performing entry selfheal on 7a922c37-48d0-4dfb-8abb-18a435c948af
[2016-03-01 18:41:05.597093] I [MSGID: 108026] [afr-self-heal-common.c:651:afr_log_selfheal] 0-opt-replicate-0: Completed entry selfheal on 7a922c37-48d0-4dfb-8abb-18a435c948af. source=1 sinks=0
[2016-03-01 18:41:05.790944] E [MSGID: 108008] [afr-self-heal-entry.c:253:afr_selfheal_detect_gfid_and_type_mismatch] 0-opt-replicate-0: Gfid mismatch detected for <4c6dda77-6a2b-4996-bca4-9ace4cee45cc/Powermail_Form_action_form_f0755f8526150f023fd98252b510a40c49586dbd.php>, 118668d9-608a-477a-b655-bcc6c2298bf4 on opt-client-1 and a87943a4-e18a-4642-adff-1ad765496533 on opt-client-0. Skipping conservative merge on the file.
[2016-03-01 18:41:06.649695] W [MSGID: 108008] [afr-self-heal-name.c:359:afr_selfheal_name_gfid_mismatch_check] 0-opt-replicate-0: GFID mismatch for <gfid:4c6dda77-6a2b-4996-bca4-9ace4cee45cc>/Powermail_Form_action_form_f0755f8526150f023fd98252b510a40c49586dbd.php 118668d9-608a-477a-b655-bcc6c2298bf4 on opt-client-1 and a87943a4-e18a-4642-adff-1ad765496533 on opt-client-0
[2016-03-01 18:41:06.661277] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 184415191: LOOKUP() /releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_action_form_f0755f8526150f023fd98252b510a40c49586dbd.php => -1 (Input/output error)
[2016-03-01 18:41:06.680968] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 184422672: LOOKUP() /releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_action_form_f0755f8526150f023fd98252b510a40c49586dbd.php => -1 (Input/output error)
[2016-03-01 18:41:06.680222] W [MSGID: 108008] [afr-self-heal-name.c:359:afr_selfheal_name_gfid_mismatch_check] 0-opt-replicate-0: GFID mismatch for <gfid:4c6dda77-6a2b-4996-bca4-9ace4cee45cc>/Powermail_Form_action_form_f0755f8526150f023fd98252b510a40c49586dbd.php 118668d9-608a-477a-b655-bcc6c2298bf4 on opt-client-1 and a87943a4-e18a-4642-adff-1ad765496533 on opt-client-0

There are many more of these entries; this is just a small excerpt. The files with a GFID mismatch are temporary PHP cache files.
When I delete these files, the load goes down and the number of entries in the volume heal info decreases (see below).
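
To see which cache files are affected and how often, the mismatch messages can be tallied per file. The following is a minimal Python sketch, not a gluster tool: the regular expression is an assumption based only on the two message shapes in the excerpt above, and the names `MISMATCH_RE`, `count_mismatches` and `SAMPLE` are made up for illustration.

```python
import re
from collections import Counter

# Assumed to cover both message shapes seen in the excerpt:
#   "Gfid mismatch detected for <PARENT-GFID/NAME>, ..."  (E, afr-self-heal-entry.c)
#   "GFID mismatch for <gfid:PARENT-GFID>/NAME ..."        (W, afr-self-heal-name.c)
MISMATCH_RE = re.compile(
    r"mismatch (?:detected )?for <(?:gfid:)?(?P<parent>[0-9a-f-]{36})>?/(?P<name>[^\s>,]+)",
    re.IGNORECASE,
)


def count_mismatches(lines):
    """Count mismatch messages per (parent directory GFID, file name)."""
    counts = Counter()
    for line in lines:
        match = MISMATCH_RE.search(line)
        if match:
            counts[(match.group("parent"), match.group("name"))] += 1
    return counts


# Three lines copied from the log excerpt; the last one is a FUSE warning
# and is expected to be ignored by the pattern.
SAMPLE = [
    "[2016-03-01 18:41:05.790944] E [MSGID: 108008] [afr-self-heal-entry.c:253:afr_selfheal_detect_gfid_and_type_mismatch] 0-opt-replicate-0: Gfid mismatch detected for <4c6dda77-6a2b-4996-bca4-9ace4cee45cc/Powermail_Form_action_form_f0755f8526150f023fd98252b510a40c49586dbd.php>, 118668d9-608a-477a-b655-bcc6c2298bf4 on opt-client-1 and a87943a4-e18a-4642-adff-1ad765496533 on opt-client-0. Skipping conservative merge on the file.",
    "[2016-03-01 18:41:06.649695] W [MSGID: 108008] [afr-self-heal-name.c:359:afr_selfheal_name_gfid_mismatch_check] 0-opt-replicate-0: GFID mismatch for <gfid:4c6dda77-6a2b-4996-bca4-9ace4cee45cc>/Powermail_Form_action_form_f0755f8526150f023fd98252b510a40c49586dbd.php 118668d9-608a-477a-b655-bcc6c2298bf4 on opt-client-1 and a87943a4-e18a-4642-adff-1ad765496533 on opt-client-0",
    "[2016-03-01 18:41:06.661277] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 184415191: LOOKUP() /releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_action_form_f0755f8526150f023fd98252b510a40c49586dbd.php => -1 (Input/output error)",
]

for (parent, name), hits in count_mismatches(SAMPLE).most_common():
    print(hits, parent, name)
```

For a real log, an open file object can be passed to `count_mismatches` instead of `SAMPLE`, since the function only iterates over lines.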

Here's the output of gluster volume heal opt info. Note that this output is from *after* deleting most of the cache files; before that there were many more entries.

[root@nfs01 fluid_template]# gluster volume heal opt info
Brick nfs01:/opt/bkk
<gfid:23fc1027-0aec-4b84-9ffb-c164a9d43d20>
<gfid:92cb9dde-2721-4c11-93a6-2582ed9edd5d>
<gfid:a0dbcf8a-67f8-4870-ab57-3d5d1218601c>
<gfid:947cbcc4-1978-4b9e-b726-2acd0a4fda5a>
<gfid:440b9b36-bad5-4cb6-b935-8a004132340a>
Number of entries: 5

Brick nfs02:/opt/bkk
/releases/1.0.1/typo3temp/Cache/Code/fluid_template - Possibly undergoing heal

/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_Select_7bf809152d985037de761d8d375d286e44b4f13a.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Misc_GoogleAdwordsConversion_f7254aeb252ea43cd89f9051b5a43109d47938f1.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_File_4b3a3f667c475577847aa77118f3af5666ecb2c6.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_action_form_f0755f8526150f023fd98252b510a40c49586dbd.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_Input_b3e08744b23680f0a14e60e716f9994d9580e3f4.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/News_News_action_detail_8d30b654cd8343fe40616b8a2f8a5343b1ed776e.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/News_News_layout_Detail_html_bd113d9c433c8f88376e47547db3b94e698a5ecd.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/News_News_partial_Detail_Opengraph_b98680f3686dccf00e22181e66d11ca9de7a44bd.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/News_News_partial_Detail_FalMediaContainer_9c1b3fd40fca9019726b3f6b8bc04618ffadab7b.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/News_News_partial_Detail_MediaContainer_08850598886fb5f39c9cf1d269d7e20677f97ede.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_Text_c10766db8d335d5cd9555878aef5d886dcb6926e.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_action_create_f40464a6a7f73d86cda514065167d59a7ddece73.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_layout_Default_aae217b167ad82f4b1258bb01fa73f305844dbd8.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Misc_HoneyPod_fc83c414f744612c3cb44c8827372a30f17791d0.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_Textarea_39d24d8e3e2813636dfff2a89b7cefb8e9117c97.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_Submit_86e69c50ccebf20584db2e3c74859373c53d320f.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_Check_a2a11c64ac58dab16eab29e4cda88518c15a4d25.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Misc_FormError_7cade8e8fc1d23c761360c0efbe8cb145eed2e39.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_Radio_d038a263b5ea81f0f7795e1e47516e5e2937cbd9.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_Hidden_a7651f5498e0d36b4e2eae5fca015ef0e9365067.php
Number of entries: 21
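
The two bricks report pending heals differently here: nfs01 lists bare GFIDs, while nfs02 lists paths. To compare the two sides over time, the output can be split per brick. This is a minimal Python sketch that assumes the text layout shown above ("Brick ..." header, one entry per line, "Number of entries:" footer); `parse_heal_info` and `SAMPLE` are hypothetical names, and the sample is a shortened copy of the output with the counts adjusted to match.

```python
def parse_heal_info(text):
    """Map each 'Brick host:/path' header to the entries listed under it."""
    bricks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            current = line[len("Brick "):]
            bricks[current] = []
        elif current is not None and line and not line.startswith("Number of entries:"):
            # Drop a trailing status note such as " - Possibly undergoing heal".
            # Assumption: entry names themselves do not contain " - ".
            bricks[current].append(line.split(" - ")[0])
    return bricks


# Shortened copy of the heal info output above (not the full listing).
SAMPLE = """\
Brick nfs01:/opt/bkk
<gfid:23fc1027-0aec-4b84-9ffb-c164a9d43d20>
<gfid:92cb9dde-2721-4c11-93a6-2582ed9edd5d>
Number of entries: 2

Brick nfs02:/opt/bkk
/releases/1.0.1/typo3temp/Cache/Code/fluid_template - Possibly undergoing heal

/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_action_form_f0755f8526150f023fd98252b510a40c49586dbd.php
Number of entries: 2
"""

for brick, entries in parse_heal_info(SAMPLE).items():
    gfid_only = [e for e in entries if e.startswith("<gfid:")]
    print(brick, len(entries), "entries,", len(gfid_only), "listed by GFID only")
```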

Here is some heal information from during the high load:
Starting time of crawl: Tue Mar  1 19:09:45 2016

Ending time of crawl: Tue Mar  1 19:09:52 2016

Type of crawl: INDEX
No. of entries healed: 2
No. of entries in split-brain: 0
No. of heal failed entries: 168


And here's the performance monitoring info during 60 seconds of high load:

Brick: nfs01:/opt/bkk
------------------------------
Cumulative Stats:
   Block Size:                  1b+                   2b+                   4b+
No. of Reads:                   18                    33                   308
No. of Writes:                   64                    88                  2994

   Block Size:                  8b+                  16b+                  32b+
No. of Reads:                  117                   154                 15612
No. of Writes:                  370                   369                  1432

   Block Size:                 64b+                 128b+                 256b+
No. of Reads:                 3721                 12884                 19917
No. of Writes:                 7585                900221                135011

   Block Size:                512b+                1024b+                2048b+
No. of Reads:                23929                 12251                 19835
No. of Writes:                63067                 30950                 23540

   Block Size:               4096b+                8192b+               16384b+
No. of Reads:                 9096                  9449                  5566
No. of Writes:                40455                 36397                 13926

   Block Size:              32768b+               65536b+              131072b+
No. of Reads:                 5159                  6055                 34001
No. of Writes:                20722                  6600                 12762

%-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us         212065      FORGET
      0.00       0.00 us       0.00 us       0.00 us        4118713     RELEASE
      0.00       0.00 us       0.00 us       0.00 us        9931097  RELEASEDIR
      0.00      47.00 us      47.00 us      47.00 us              1    GETXATTR
      0.00     124.00 us     124.00 us     124.00 us              1     XATTROP
      0.00     144.00 us     144.00 us     144.00 us              1      UNLINK
      0.00      37.17 us      35.00 us      41.00 us              6      STATFS
      0.01      36.84 us      32.00 us      61.00 us             19       FSTAT
      0.01      48.80 us      44.00 us      62.00 us             20        STAT
      0.05      75.52 us      54.00 us     156.00 us             48 REMOVEXATTR
      0.05      81.69 us      69.00 us     146.00 us             48     SETATTR
      0.05      37.16 us      14.00 us     436.00 us            109       FLUSH
      0.05      38.31 us      18.00 us     115.00 us            108    FINODELK
      0.06      73.16 us      46.00 us     170.00 us             63        OPEN
      0.10      38.70 us      20.00 us     167.00 us            195     INODELK
      0.11     335.37 us      40.00 us     563.00 us             27     READDIR
      0.15      50.58 us      27.00 us     392.00 us            232     OPENDIR
      0.15      81.74 us      35.00 us     215.00 us            144    FXATTROP
      0.16     130.53 us      68.00 us     697.00 us            100       WRITE
      0.93    1532.04 us     171.00 us   15356.00 us             48      CREATE
      1.40     284.90 us      25.00 us    1146.00 us            390    READDIRP
     10.20      52.29 us      28.00 us    2323.00 us          15482    READLINK
     18.94      33.25 us      11.00 us   27242.00 us          45219     ENTRYLK
     67.58      93.74 us      32.00 us   27521.00 us          57223      LOOKUP

    Duration: 6492593 seconds
   Data Read: 5679496942 bytes
Data Written: 4510536316 bytes

Interval 1 Stats:
   Block Size:               4096b+                8192b+               16384b+
No. of Reads:                    0                     0                     0
No. of Writes:                    1                    50                     2

   Block Size:              32768b+
No. of Reads:                    0
No. of Writes:                   47
%-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us             48      FORGET
      0.00       0.00 us       0.00 us       0.00 us            104     RELEASE
      0.00       0.00 us       0.00 us       0.00 us            231  RELEASEDIR
      0.00     124.00 us     124.00 us     124.00 us              1     XATTROP
      0.00     144.00 us     144.00 us     144.00 us              1      UNLINK
      0.00      36.40 us      35.00 us      37.00 us              5      STATFS
      0.01      36.84 us      32.00 us      61.00 us             19       FSTAT
      0.01      48.80 us      44.00 us      62.00 us             20        STAT
      0.05      75.52 us      54.00 us     156.00 us             48 REMOVEXATTR
      0.05      37.69 us      14.00 us     436.00 us            101       FLUSH
      0.06      81.69 us      69.00 us     146.00 us             48     SETATTR
      0.06      74.26 us      46.00 us     170.00 us             53        OPEN
      0.06      38.31 us      18.00 us     115.00 us            108    FINODELK
      0.10     311.33 us      40.00 us     563.00 us             24     READDIR
      0.11      38.70 us      20.00 us     167.00 us            195     INODELK
      0.16      50.58 us      27.00 us     392.00 us            231     OPENDIR
      0.17      81.74 us      35.00 us     215.00 us            144    FXATTROP
      0.18     130.53 us      68.00 us     697.00 us            100       WRITE
      1.03    1532.04 us     171.00 us   15356.00 us             48      CREATE
      1.56     284.90 us      25.00 us    1146.00 us            390    READDIRP
     11.31      52.27 us      28.00 us    2323.00 us          15395    READLINK
     18.24      33.67 us      11.00 us   27242.00 us          38571     ENTRYLK
     66.83      94.49 us      32.00 us   27521.00 us          50338      LOOKUP

    Duration: 68 seconds
   Data Read: 0 bytes
Data Written: 3347998 bytes

Brick: nfs02:/opt/bkk
------------------------------
Cumulative Stats:
   Block Size:                  1b+                   2b+                   4b+
No. of Reads:                   26                    49                   541
No. of Writes:                   64                    94                  3848

   Block Size:                  8b+                  16b+                  32b+
No. of Reads:                  218                   205                  1267
No. of Writes:                  452                   417                  1448

   Block Size:                 64b+                 128b+                 256b+
No. of Reads:                 6097                 39042                 11111
No. of Writes:                 8617                924503                136768

   Block Size:                512b+                1024b+                2048b+
No. of Reads:               120819                 37802                 16506
No. of Writes:                64399                 35996                 24999

   Block Size:               4096b+                8192b+               16384b+
No. of Reads:                76162                 20449                 10948
No. of Writes:                41302                 37488                 14034

   Block Size:              32768b+               65536b+              131072b+
No. of Reads:                 7733                  7306                 31648
No. of Writes:                20849                  6750                 12886

%-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us         231622      FORGET
      0.00       0.00 us       0.00 us       0.00 us        6123626     RELEASE
      0.00       0.00 us       0.00 us       0.00 us       10869781  RELEASEDIR
      0.00     116.00 us     116.00 us     116.00 us              1     XATTROP
      0.00      40.00 us      38.00 us      43.00 us              6      STATFS
      0.01      40.85 us      33.00 us      97.00 us             13       FSTAT
      0.01      46.26 us      29.00 us     100.00 us             23        STAT
      0.04      73.96 us      53.00 us     150.00 us             48 REMOVEXATTR
      0.04      76.85 us      61.00 us      99.00 us             48     SETATTR
      0.04      35.83 us      28.00 us     155.00 us            103      UNLINK
      0.04      36.28 us      15.00 us     142.00 us            112    FINODELK
      0.05      38.68 us      13.00 us     220.00 us            133       FLUSH
      0.09     324.11 us      28.00 us     589.00 us             28     READDIR
      0.12      85.88 us      36.00 us     215.00 us            144    FXATTROP
      0.12     124.55 us      76.00 us     192.00 us            100       WRITE
      0.13      78.76 us      19.00 us     529.00 us            161    GETXATTR
      0.20      78.65 us      43.00 us     384.00 us            261        OPEN
      0.23      54.09 us       2.00 us     260.00 us            426     OPENDIR
      0.60    1261.23 us     174.00 us   10655.00 us             48      CREATE
      0.66      81.15 us      17.00 us    9254.00 us            819     INODELK
      1.21     279.97 us      23.00 us    1587.00 us            434    READDIRP
      8.10      52.61 us      27.00 us    1283.00 us          15496    READLINK
     15.48      34.54 us      10.00 us   13810.00 us          45133     ENTRYLK
     72.84     104.30 us      14.00 us   14613.00 us          70322      LOOKUP

    Duration: 6492593 seconds
   Data Read: 6308054987 bytes
Data Written: 4579768980 bytes

Interval 1 Stats:
   Block Size:               4096b+                8192b+               16384b+
No. of Reads:                    0                     0                     0
No. of Writes:                    1                    50                     2

   Block Size:              32768b+
No. of Reads:                    0
No. of Writes:                   47
%-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us             48      FORGET
      0.00       0.00 us       0.00 us       0.00 us            286     RELEASE
      0.00       0.00 us       0.00 us       0.00 us            400  RELEASEDIR
      0.00     116.00 us     116.00 us     116.00 us              1     XATTROP
      0.00      40.40 us      38.00 us      43.00 us              5      STATFS
      0.01      40.85 us      33.00 us      97.00 us             13       FSTAT
      0.01      46.14 us      29.00 us     100.00 us             22        STAT
      0.04      73.96 us      53.00 us     150.00 us             48 REMOVEXATTR
      0.04      76.85 us      61.00 us      99.00 us             48     SETATTR
      0.04      35.83 us      28.00 us     155.00 us            103      UNLINK
      0.05      36.28 us      15.00 us     142.00 us            112    FINODELK
      0.05      39.64 us      13.00 us     220.00 us            117       FLUSH
      0.10     330.28 us      28.00 us     589.00 us             25     READDIR
      0.14      85.88 us      36.00 us     215.00 us            144    FXATTROP
      0.14     124.55 us      76.00 us     192.00 us            100       WRITE
      0.14      79.13 us      19.00 us     529.00 us            159    GETXATTR
      0.22      79.51 us      43.00 us     384.00 us            235        OPEN
      0.25      54.27 us       2.00 us     260.00 us            400     OPENDIR
      0.70    1261.23 us     174.00 us   10655.00 us             48      CREATE
      0.77      81.15 us      17.00 us    9254.00 us            819     INODELK
      1.26     283.31 us      23.00 us    1587.00 us            386    READDIRP
      7.67      52.88 us      27.00 us    1283.00 us          12595    READLINK
     15.47      34.83 us      10.00 us   13810.00 us          38576     ENTRYLK
     72.91     105.05 us      14.00 us   14613.00 us          60295      LOOKUP

    Duration: 68 seconds
   Data Read: 0 bytes
Data Written: 3347998 bytes
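
In the Interval 1 tables, LOOKUP, ENTRYLK and READLINK together account for about 96% of the latency on both bricks, so it helps to turn the call counts into per-second rates. This is a minimal Python sketch under the assumption that the rows keep the column layout shown above; `FOP_ROW`, `top_fops` and `NFS01_INTERVAL_ROWS` are illustrative names, and the rows are copied from the nfs01 interval table.

```python
import re

# One row of the fop table: %-latency, avg/min/max latency in us, call count, fop name.
FOP_ROW = re.compile(
    r"^\s*(?P<pct>\d+\.\d+)\s+(?P<avg>\d+\.\d+) us\s+(?P<min>\d+\.\d+) us\s+"
    r"(?P<max>\d+\.\d+) us\s+(?P<calls>\d+)\s+(?P<fop>[A-Z]+)\s*$"
)


def top_fops(rows, duration_s, n=3):
    """Return the n fops with the highest %-latency as (fop, pct, calls, calls_per_s)."""
    parsed = []
    for row in rows:
        match = FOP_ROW.match(row)
        if match:
            calls = int(match.group("calls"))
            parsed.append(
                (match.group("fop"), float(match.group("pct")), calls, calls / duration_s)
            )
    return sorted(parsed, key=lambda item: item[1], reverse=True)[:n]


# Rows copied from "Interval 1 Stats" of brick nfs01 (Duration: 68 seconds).
NFS01_INTERVAL_ROWS = [
    "      1.03    1532.04 us     171.00 us   15356.00 us             48      CREATE",
    "     11.31      52.27 us      28.00 us    2323.00 us          15395    READLINK",
    "     18.24      33.67 us      11.00 us   27242.00 us          38571     ENTRYLK",
    "     66.83      94.49 us      32.00 us   27521.00 us          50338      LOOKUP",
]

for fop, pct, calls, rate in top_fops(NFS01_INTERVAL_ROWS, duration_s=68):
    print(f"{fop:<10}{pct:6.2f}% latency {calls:7d} calls {rate:8.1f} calls/s")
```

With the figures above, this works out to roughly 740 LOOKUP calls per second on nfs01 during the 68-second interval.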



Can anybody tell me how to fix the problem with the high load and these cache files?

Thanks in advance!

Regards
Sebastian