<div dir="ltr">Hi,<div><br></div><div>Just to update this thread: we upgraded from Gluster 3.12.2 to 3.12.3, which seems to have resolved the issue. I checked the changelog and didn't see anything that looks related to it, but I'm glad it seems to be OK now.</div><div><br></div><div>Niels Hendriks<br><div class="gmail_extra"><br><div class="gmail_quote">On 14 November 2017 at 09:42, Niels Hendriks <span dir="ltr"><<a href="mailto:niels@nuvini.com" target="_blank">niels@nuvini.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><span style="box-sizing:content-box">Hi,</span><div style="margin:0px;padding:0px;box-sizing:content-box"><br style="box-sizing:content-box"></div><div style="margin:0px;padding:0px;box-sizing:content-box">We're using a 3-node setup where GlusterFS is running as both a client and a server with a FUSE mount point.</div><div style="margin:0px;padding:0px;box-sizing:content-box"><br style="box-sizing:content-box"></div><div style="margin:0px;padding:0px;box-sizing:content-box">We tried to set performance.parallel-readdir to on for a volume, but afterwards the load on all 3 nodes skyrocketed due to the glusterd process, and we saw CPU soft lockup errors on the console.</div><div style="margin:0px;padding:0px;box-sizing:content-box">I had to completely bring down and reboot all 3 nodes and disable the setting again.</div><div style="margin:0px;padding:0px;box-sizing:content-box"><br style="box-sizing:content-box"></div><div style="margin:0px;padding:0px;box-sizing:content-box">There were tons of errors like the one below. Does anyone know what could cause this?</div><div style="margin:0px;padding:0px;box-sizing:content-box"><br style="box-sizing:content-box"></div><div style="margin:0px;padding:0px;box-sizing:content-box"><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960126] BUG: soft lockup 
- CPU#6 stuck for 22s! [touch:25995]</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960168] Modules linked in: xt_multiport binfmt_misc hcpdriver(PO) nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_conntrack iptable_filter ip_tables x_tables xfs libcrc32c crc32c_generic nls_utf8 nls_cp437 vfat fat x86_pkg_temp_thermal coretemp iTCO_wdt iTCO_vendor_support kvm_intel kvm crc32_pclmul ast ttm aesni_intel drm_kms_helper efi_pstore aes_x86_64 lrw gf128mul evdev glue_helper ablk_helper joydev cryptd pcspkr efivars drm i2c_algo_bit mei_me lpc_ich mei mfd_core shpchp tpm_tis wmi tpm ipmi_si ipmi_msghandler acpi_pad processor button acpi_power_meter thermal_sys nf_conntrack_ftp nf_conntrack fuse autofs4 ext4 crc16 mbcache jbd2 dm_mod hid_generic usbhid hid sg sd_mod crc_t10dif crct10dif_generic ehci_pci xhci_hcd ehci_hcd ahci libahci i2c_i801 crct10dif_pclmul crct10dif_common libata ixgbe i2c_core crc32c_intel dca usbcore scsi_mod ptp usb_common pps_core nvme mdio</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960224] CPU: 6 PID: 25995 Comm: touch Tainted: P O 3.16.0-4-amd64 #1 Debian 3.16.43-2+deb8u5</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960226] Hardware name: Supermicro SYS-1028U-TNR4T+/X10DRU-i+, BIOS 2.0c 04/21/2017</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960228] task: ffff88184bf872f0 ti: ffff88182cbc0000 task.ti: ffff88182cbc0000</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960229] RIP: 0010:[<ffffffff81519fe5>] [<ffffffff81519fe5>] _raw_spin_lock+0x25/0x30</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960237] RSP: 0018:ffff88182cbc3b78 EFLAGS: 00000287</div><div 
style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960239] RAX: 0000000000005e5c RBX: ffffffff811646a5 RCX: 0000000000005e69</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960240] RDX: 0000000000005e69 RSI: ffffffff811bffa0 RDI: ffff88182e42bc70</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960241] RBP: ffff88182e42bc18 R08: 0000000000200008 R09: 0000000000000001</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960242] R10: 0000000000012f40 R11: 0000000000000010 R12: ffff88182cbc3af0</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960243] R13: 0000000000000286 R14: 0000000000000010 R15: ffffffff81519fce</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960244] FS: 00007fc005c67700(0000) GS:ffff88187fc80000(0000) knlGS:0000000000000000</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960246] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960247] CR2: 00007f498c1ab148 CR3: 0000000b0fcf4000 CR4: 00000000003407e0</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960248] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960249] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960250] Stack:</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960251] ffffffff81513196 
ffff88182e42bc18 ffff880aa062e8d8 ffff880aa062e858</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960254] ffffffff811c1120 ffff88182cbc3bd0 ffff88182e42bc18 0000000000032b68</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960256] ffff88182943b010 ffffffff811c1988 ffff88182e42bc18 ffff8817de0db218</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960258] Call Trace:</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960263] [<ffffffff81513196>] ? lock_parent.part.17+0x1d/0x43</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960268] [<ffffffff811c1120>] ? shrink_dentry_list+0x1f0/0x240</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960270] [<ffffffff811c1988>] ? check_submounts_and_drop+0x68/<wbr>0x90</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960278] [<ffffffffa017f7f8>] ? fuse_dentry_revalidate+0x1e8/<wbr>0x300 [fuse]</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960281] [<ffffffff811b4e5e>] ? lookup_fast+0x25e/0x2b0</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960283] [<ffffffff811b5ebb>] ? link_path_walk+0x1ab/0x870</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960285] [<ffffffff811ba2ec>] ? path_openat+0x9c/0x680</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960289] [<ffffffff8116c0fc>] ? handle_mm_fault+0x63c/0x1150</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960292] [<ffffffff8117253c>] ? 
mmap_region+0x19c/0x650</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960294] [<ffffffff811bb07a>] ? do_filp_open+0x3a/0x90</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960297] [<ffffffff811c736c>] ? __alloc_fd+0x7c/0x120</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960302] [<ffffffff811aa169>] ? do_sys_open+0x129/0x220</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960305] [<ffffffff8151a48d>] ? system_call_fast_compare_end+<wbr>0x10/0x15</div><div style="margin:0px;padding:0px;box-sizing:content-box">Nov 10 20:55:53 n01c01 kernel: [196591.960306] Code: 84 00 00 00 00 00 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 89 c2 c1 ea 10 66 39 c2 89 d1 75 01 c3 0f b7 07 66 39 d0 74 f7 f3 90 <0f> b7 07 66 39 c8 75 f6 c3 66 90 0f 1f 44 00 00 65 81 04 25 60</div></div><div style="margin:0px;padding:0px;box-sizing:content-box"><br style="box-sizing:content-box"></div><div style="margin:0px;padding:0px;box-sizing:content-box"><br style="box-sizing:content-box"></div><div style="margin:0px;padding:0px;box-sizing:content-box">We are running Debian 8 with Glusterfs 3.12.2</div><div style="margin:0px;padding:0px;box-sizing:content-box"><br style="box-sizing:content-box"></div><div style="margin:0px;padding:0px;box-sizing:content-box">Here are our settings:</div><div style="margin:0px;padding:0px;box-sizing:content-box"><div style="margin:0px;padding:0px;box-sizing:content-box">gluster volume info</div><div style="margin:0px;padding:0px;box-sizing:content-box"><br style="box-sizing:content-box"></div><div style="margin:0px;padding:0px;box-sizing:content-box">Volume Name: ssl</div><div style="margin:0px;padding:0px;box-sizing:content-box">Type: Replicate</div><div style="margin:0px;padding:0px;box-sizing:content-box">Volume ID: 
8aa7bab2-a8dc-4855-ab46-<wbr>bbcb4a9ff174</div><div style="margin:0px;padding:0px;box-sizing:content-box">Status: Started</div><div style="margin:0px;padding:0px;box-sizing:content-box">Snapshot Count: 0</div><div style="margin:0px;padding:0px;box-sizing:content-box">Number of Bricks: 1 x 3 = 3</div><div style="margin:0px;padding:0px;box-sizing:content-box">Transport-type: tcp</div><div style="margin:0px;padding:0px;box-sizing:content-box">Bricks:</div><div style="margin:0px;padding:0px;box-sizing:content-box">Brick1: 10.0.0.1:/storage/gluster/ssl</div><div style="margin:0px;padding:0px;box-sizing:content-box">Brick2: 10.0.0.2:/storage/gluster/ssl</div><div style="margin:0px;padding:0px;box-sizing:content-box">Brick3: 10.0.0.3:/storage/gluster/ssl</div><div style="margin:0px;padding:0px;box-sizing:content-box">Options Reconfigured:</div><div style="margin:0px;padding:0px;box-sizing:content-box">performance.client-io-threads: off</div><div style="margin:0px;padding:0px;box-sizing:content-box">nfs.disable: on</div><div style="margin:0px;padding:0px;box-sizing:content-box">transport.address-family: inet</div><div style="margin:0px;padding:0px;box-sizing:content-box">network.ping-timeout: 5</div><div style="margin:0px;padding:0px;box-sizing:content-box"><br style="box-sizing:content-box"></div><div style="margin:0px;padding:0px;box-sizing:content-box">Volume Name: www</div><div style="margin:0px;padding:0px;box-sizing:content-box">Type: Replicate</div><div style="margin:0px;padding:0px;box-sizing:content-box">Volume ID: d8d87920-f92d-4669-8509-<wbr>acd1936ba365</div><div style="margin:0px;padding:0px;box-sizing:content-box">Status: Started</div><div style="margin:0px;padding:0px;box-sizing:content-box">Snapshot Count: 0</div><div style="margin:0px;padding:0px;box-sizing:content-box">Number of Bricks: 1 x 3 = 3</div><div style="margin:0px;padding:0px;box-sizing:content-box">Transport-type: tcp</div><div 
style="margin:0px;padding:0px;box-sizing:content-box">Bricks:</div><div style="margin:0px;padding:0px;box-sizing:content-box">Brick1: 10.0.0.1:/storage/gluster/www</div><div style="margin:0px;padding:0px;box-sizing:content-box">Brick2: 10.0.0.2:/storage/gluster/www</div><div style="margin:0px;padding:0px;box-sizing:content-box">Brick3: 10.0.0.3:/storage/gluster/www</div><div style="margin:0px;padding:0px;box-sizing:content-box">Options Reconfigured:</div><div style="margin:0px;padding:0px;box-sizing:content-box">features.scrub: Active</div><div style="margin:0px;padding:0px;box-sizing:content-box">features.bitrot: on</div><div style="margin:0px;padding:0px;box-sizing:content-box">performance.flush-behind: on</div><div style="margin:0px;padding:0px;box-sizing:content-box">network.ping-timeout: 3</div><div style="margin:0px;padding:0px;box-sizing:content-box">features.cache-invalidation-<wbr>timeout: 600</div><div style="margin:0px;padding:0px;box-sizing:content-box">performance.write-behind-<wbr>window-size: 4MB</div><div style="margin:0px;padding:0px;box-sizing:content-box">performance.quick-read: off</div><div style="margin:0px;padding:0px;box-sizing:content-box">performance.md-cache-timeout: 600</div><div style="margin:0px;padding:0px;box-sizing:content-box">performance.io-thread-count: 64</div><div style="margin:0px;padding:0px;box-sizing:content-box">cluster.readdir-optimize: on</div><div style="margin:0px;padding:0px;box-sizing:content-box">performance.write-behind: on</div><div style="margin:0px;padding:0px;box-sizing:content-box">performance.readdir-ahead: on</div><div style="margin:0px;padding:0px;box-sizing:content-box">performance.cache-size: 1GB</div><div style="margin:0px;padding:0px;box-sizing:content-box">performance.cache-<wbr>invalidation: on</div><div style="margin:0px;padding:0px;box-sizing:content-box">features.cache-invalidation: on</div><div style="margin:0px;padding:0px;box-sizing:content-box">diagnostics.brick-log-level: WARNING</div><div 
style="margin:0px;padding:0px;box-sizing:content-box">performance.cache-priority: *.php:3,*.temp:3,*:1</div><div style="margin:0px;padding:0px;box-sizing:content-box">network.inode-lru-limit: 90000</div><div style="margin:0px;padding:0px;box-sizing:content-box">transport.address-family: inet</div><div style="margin:0px;padding:0px;box-sizing:content-box">nfs.disable: on</div><div style="margin:0px;padding:0px;box-sizing:content-box">performance.client-io-threads: off</div><div style="margin:0px;padding:0px;box-sizing:content-box">performance.parallel-readdir: off</div></div><div style="margin:0px;padding:0px;box-sizing:content-box"><br style="box-sizing:content-box"></div><div style="margin:0px;padding:0px;box-sizing:content-box">And here are our /etc/fstab entries for the fuse mountpoint:</div><div style="margin:0px;padding:0px;box-sizing:content-box"><div style="margin:0px;padding:0px;box-sizing:content-box">localhost:/ssl /etc/ssl/shared glusterfs backup-volfile-servers=10.0.0.<wbr>2:10.0.0.3,log-level=WARNING 0 0</div><div style="margin:0px;padding:0px;box-sizing:content-box">localhost:/www /var/www glusterfs backup-volfile-servers=10.0.0.<wbr>2:10.0.0.3,log-level=WARNING 0 0</div><div style="margin:0px;padding:0px;box-sizing:content-box"><br></div><div style="margin:0px;padding:0px;box-sizing:content-box">Fuse version:</div><div style="margin:0px;padding:0px;box-sizing:content-box"><div style="margin:0px;padding:0px;box-sizing:content-box">dpkg -l | grep fuse</div><div style="margin:0px;padding:0px;box-sizing:content-box">ii fuse 2.9.3-15+deb8u2 amd64 Filesystem in Userspace</div><div style="margin:0px;padding:0px;box-sizing:content-box">ii libfuse2:amd64 2.9.3-15+deb8u2 amd64 Filesystem in Userspace (library)</div><div><br></div></div><div style="margin:0px;padding:0px;box-sizing:content-box"><br style="box-sizing:content-box"></div></div><div style="margin:0px;padding:0px;box-sizing:content-box">Thank you!</div><span class="HOEnZb"><font color="#888888"><div 
style="margin:0px;padding:0px;box-sizing:content-box">Niels Hendriks</div></font></span></div>
</blockquote></div><br></div></div></div>
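<div><br></div>
<div>For anyone who wants to check what performance.parallel-readdir is set to on each volume before trying it again, here is a minimal sketch. It assumes the output of "gluster volume info" has been saved as text in the layout quoted above. The function name parse_volume_options and the shortened SAMPLE text are made up for illustration and are not part of Gluster.</div>
<pre>
```python
def parse_volume_options(info_text):
    """Map each volume name to its 'Options Reconfigured' key/value pairs."""
    volumes = {}
    current = None
    in_options = False
    for raw in info_text.splitlines():
        line = raw.strip()
        if line.startswith("Volume Name:"):
            # A new volume section starts here.
            current = line.split(":", 1)[1].strip()
            volumes[current] = {}
            in_options = False
        elif line == "Options Reconfigured:":
            in_options = True
        elif not line:
            # A blank line ends the current volume section.
            in_options = False
        elif in_options and current is not None and ":" in line:
            # Split on the first colon only; values may contain colons.
            key, value = line.split(":", 1)
            volumes[current][key.strip()] = value.strip()
    return volumes


# Shortened sample in the layout of the "gluster volume info" output above.
SAMPLE = """\
Volume Name: ssl
Type: Replicate
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on

Volume Name: www
Type: Replicate
Options Reconfigured:
performance.readdir-ahead: on
performance.cache-priority: *.php:3,*.temp:3,*:1
performance.parallel-readdir: off
"""

options = parse_volume_options(SAMPLE)
for name, opts in options.items():
    # Options missing from "Options Reconfigured" were never changed on that volume.
    print(name, opts.get("performance.parallel-readdir", "(not reconfigured)"))
```
</pre>
<div>On a live node the same text could be captured from the real command and passed to the function instead of SAMPLE. The sketch only reads the settings and does not change anything on the cluster.</div>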