<div><div><div><div><div>Hello everyone,</div><div> </div><div>I'm running an oVirt cluster on top of a distributed-replicate gluster volume and one of the bricks cannot be mounted anymore from my oVirt hosts. This morning I also noticed a stack trace and a spike in TCP connections on one of the three gluster nodes (storage2), which I have attached at the end of this mail. Only this particular brick on storage2 seems to be causing trouble:</div><div><em>Brick storage2:/data/glusterfs/hdd/brick3/brick</em></div><div><em>Status: Transport endpoint is not connected</em></div><div> </div><div>I don't know what is causing this or how to resolve it. I would appreciate it if someone could take a look at my logs and point me in the right direction. If any additional logs are required, please let me know. Thank you in advance!</div><div> </div><div>Operating system on all hosts: CentOS 7.9.2009</div><div>oVirt version: 4.3.10.4-1</div><div>Gluster versions:</div><div>- storage1: 6.10-1</div><div>- storage2: 6.7-1</div><div>- storage3: 6.7-1</div><div> </div><div>####################################</div><div># brick is not connected/mounted on the oVirt hosts</div><div> </div><div><em>[xlator.protocol.client.hdd-client-7.priv]</em></div><div><em>fd.0.remote_fd = -1</em></div><div><em>------ = ------</em></div><div><em>granted-posix-lock[0] = owner = 9d673ffe323e25cd, cmd = F_SETLK fl_type = F_RDLCK, fl_start = 100, fl_end = 100, user_flock: l_type = F_RDLCK, l_start = 100, l_len = 1</em></div><div><em>granted-posix-lock[1] = owner = 9d673ffe323e25cd, cmd = F_SETLK fl_type = F_RDLCK, fl_start = 101, fl_end = 101, user_flock: l_type = F_RDLCK, l_start = 101, l_len = 1</em></div><div><em>------ = ------</em></div><div><em>connected = 0</em></div><div><em>total_bytes_read = 11383136800</em></div><div><em>ping_timeout = 10</em></div><div><em>total_bytes_written = 16699851552</em></div><div><em>ping_msgs_sent = 1</em></div><div><em>msgs_sent = 2</em></div><div> 
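<div>For reference, this is roughly how the disconnected client can be picked out of a client statedump like the excerpt above. It is only a minimal sketch: it assumes the dump consists of bracketed section headers followed by "key = value" lines as shown, and the function name disconnected_clients as well as the hdd-client-8 section in the sample are made up for illustration.</div>
<pre>
```python
import re

# Matches section headers such as [xlator.protocol.client.hdd-client-7.priv]
SECTION_RE = re.compile(r"^\[xlator\.protocol\.client\.(.+)\.priv\]$")


def disconnected_clients(statedump_text):
    """Return the client xlator names whose section contains connected = 0."""
    current = None
    down = []
    for raw in statedump_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new section starts; remember its name only if it is a client section.
            match = SECTION_RE.match(line)
            current = match.group(1) if match else None
            continue
        if current is None or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "connected" and value.strip() == "0":
            down.append(current)
    return down


# The first section is abbreviated from the excerpt above;
# the second one (hdd-client-8) is hypothetical, added for contrast.
SAMPLE = """\
[xlator.protocol.client.hdd-client-7.priv]
fd.0.remote_fd = -1
------ = ------
connected = 0
ping_timeout = 10
[xlator.protocol.client.hdd-client-8.priv]
connected = 1
"""

if __name__ == "__main__":
    print(disconnected_clients(SAMPLE))
```
</pre>
<div>With the sample above this reports only hdd-client-7, which is the client that talks to brick3 on storage2.</div>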
</div><div>####################################</div><div># mount log from one of the oVirt hosts</div><div># the IP 172.22.102.142 corresponds to my gluster node "storage2"</div><div># the port 49154 corresponds to the brick storage2:/data/glusterfs/hdd/brick3/brick      </div><div> </div><div><em>[2022-03-24 10:59:28.138178] W [rpc-clnt-ping.c:210:rpc_clnt_ping_cbk] 0-hdd-client-7: socket disconnected</em></div><div><em>[2022-03-24 10:59:38.142698] I [rpc-clnt.c:2028:rpc_clnt_reconfig] 0-hdd-client-7: changing port to 49154 (from 0)</em></div><div><em>The message "I [MSGID: 114018] [client.c:2331:client_rpc_notify] 0-hdd-client-7: disconnected from hdd-client-7. Client process will keep trying to connect to glusterd until brick's port is available" repeated 4 times between [2022-03-24 10:58:04.114741] and [2022-03-24 10:59:28.137380]</em></div><div><em>The message "W [MSGID: 114032] [client-handshake.c:1546:client_dump_version_cbk] 0-hdd-client-7: received RPC status error [Transport endpoint is not connected]" repeated 4 times between [2022-03-24 10:58:04.115169] and [2022-03-24 10:59:28.138052]</em></div><div><em>[2022-03-24 10:59:49.143217] C [rpc-clnt-ping.c:155:rpc_clnt_ping_timer_expired] 0-hdd-client-7: server 172.22.102.142:49154 has not responded in the last 10 seconds, disconnecting.</em></div><div><em>[2022-03-24 10:59:49.143838] I [MSGID: 114018] [client.c:2331:client_rpc_notify] 0-hdd-client-7: disconnected from hdd-client-7. 
Client process will keep trying to connect to glusterd until brick's port is available</em></div><div><em>[2022-03-24 10:59:49.144540] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f6724643adb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f67243ea7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f67243ea8fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f67243eb987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f67243ec518] ))))) 0-hdd-client-7: forced unwinding frame type(GF-DUMP) op(DUMP(1)) called at 2022-03-24 10:59:38.145208 (xid=0x861)</em></div><div><em>[2022-03-24 10:59:49.144557] W [MSGID: 114032] [client-handshake.c:1546:client_dump_version_cbk] 0-hdd-client-7: received RPC status error [Transport endpoint is not connected]</em></div><div><em>[2022-03-24 10:59:49.144653] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f6724643adb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f67243ea7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f67243ea8fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f67243eb987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f67243ec518] ))))) 0-hdd-client-7: forced unwinding frame type(GF-DUMP) op(NULL(2)) called at 2022-03-24 10:59:38.145218 (xid=0x862)</em></div><div><em>[2022-03-24 10:59:49.144665] W [rpc-clnt-ping.c:210:rpc_clnt_ping_cbk] 0-hdd-client-7: socket disconnected</em></div><div> </div><div>####################################</div><div># netcat/telnet to the brick's port of storage2 are working</div><div> </div><div><em>[<a href="mailto:root@storage1" rel="noopener noreferrer" target="_blank">root@storage1</a> ~]#  netcat -z -v 172.22.102.142 49154</em></div><div><em>Connection to 172.22.102.142 49154 port [tcp/*] succeeded!</em></div><div> </div><div><em>[<a href="mailto:root@storage3" rel="noopener noreferrer" target="_blank">root@storage3</a> ~]# netcat -z -v 172.22.102.142 49154</em></div><div><em>Connection to 
172.22.102.142 49154 port [tcp/*] succeeded!</em></div><div> </div><div><em>[<a href="mailto:root@ovirthost1" rel="noopener noreferrer" target="_blank">root@ovirthost1</a> /var/log/glusterfs]#  netcat -z -v 172.22.102.142 49154</em></div><div><em>Connection to 172.22.102.142 49154 port [tcp/*] succeeded!</em></div><div> </div><div>####################################</div><div># gluster peer status - all gluster peers are connected</div><div><em>[<a href="mailto:root@storage3" rel="noopener noreferrer" target="_blank">root@storage3</a> ~]#  gluster peer status</em></div><div><em>Number of Peers: 2</em></div><div> </div><div><em>Hostname: storage1</em></div><div><em>Uuid: 055e79c2-b1ff-4a82-9296-205d6877904e</em></div><div><em>State: Peer in Cluster (Connected)</em></div><div> </div><div><em>Hostname: storage2</em></div><div><em>Uuid: d7adcb92-2e71-41a9-80d4-13180ee673cf</em></div><div><em>State: Peer in Cluster (Connected)</em></div><div> </div><div>####################################</div><div># Configuration of the volume</div><div><em>Volume Name: hdd</em></div><div><em>Type: Distributed-Replicate</em></div><div><em>Volume ID: 1b47c2f8-5024-4b85-aa7f-a3f767bb076c</em></div><div><em>Status: Started</em></div><div><em>Snapshot Count: 0</em></div><div><em>Number of Bricks: 4 x 3 = 12</em></div><div><em>Transport-type: tcp</em></div><div><em>Bricks:</em></div><div><em>Brick1: storage1:/data/glusterfs/hdd/brick1/brick</em></div><div><em>Brick2: storage2:/data/glusterfs/hdd/brick1/brick</em></div><div><em>Brick3: storage3:/data/glusterfs/hdd/brick1/brick</em></div><div><em>Brick4: storage1:/data/glusterfs/hdd/brick2/brick</em></div><div><em>Brick5: storage2:/data/glusterfs/hdd/brick2/brick</em></div><div><em>Brick6: storage3:/data/glusterfs/hdd/brick2/brick</em></div><div><em>Brick7: storage1:/data/glusterfs/hdd/brick3/brick</em></div><div><em>Brick8: storage2:/data/glusterfs/hdd/brick3/brick</em></div><div><em>Brick9: 
storage3:/data/glusterfs/hdd/brick3/brick</em></div><div><em>Brick10: storage1:/data/glusterfs/hdd/brick4/brick</em></div><div><em>Brick11: storage2:/data/glusterfs/hdd/brick4/brick</em></div><div><em>Brick12: storage3:/data/glusterfs/hdd/brick4/brick</em></div><div><em>Options Reconfigured:</em></div><div><em>storage.owner-gid: 36</em></div><div><em>storage.owner-uid: 36</em></div><div><em>server.event-threads: 4</em></div><div><em>client.event-threads: 4</em></div><div><em>cluster.choose-local: off</em></div><div><em>user.cifs: off</em></div><div><em>features.shard: on</em></div><div><em>cluster.shd-wait-qlength: 10000</em></div><div><em>cluster.shd-max-threads: 8</em></div><div><em>cluster.locking-scheme: granular</em></div><div><em>cluster.data-self-heal-algorithm: full</em></div><div><em>cluster.server-quorum-type: server</em></div><div><em>cluster.eager-lock: enable</em></div><div><em>network.remote-dio: enable</em></div><div><em>performance.low-prio-threads: 32</em></div><div><em>performance.io-cache: off</em></div><div><em>performance.read-ahead: off</em></div><div><em>performance.quick-read: off</em></div><div><em>auth.allow: *</em></div><div><em>network.ping-timeout: 10</em></div><div><em>cluster.quorum-type: auto</em></div><div><em>transport.address-family: inet</em></div><div><em>nfs.disable: on</em></div><div><em>performance.client-io-threads: on</em></div><div> </div><div>####################################</div><div># gluster volume status. 
The brick running on port 49154 is supposedly online</div><div> </div><div><em>Status of volume: hdd</em></div><div><em>Gluster process                             TCP Port  RDMA Port  Online  Pid</em></div><div><em>------------------------------------------------------------------------------</em></div><div><em>Brick storage1:/data/gluste</em></div><div><em>rfs/hdd/brick1/brick                        49158     0          Y       9142</em></div><div><em>Brick storage2:/data/gluste</em></div><div><em>rfs/hdd/brick1/brick                        49152     0          Y       115896</em></div><div><em>Brick storage3:/data/gluste</em></div><div><em>rfs/hdd/brick1/brick                        49158     0          Y       131775</em></div><div><em>Brick storage1:/data/gluste</em></div><div><em>rfs/hdd/brick2/brick                        49159     0          Y       9151</em></div><div><em>Brick storage2:/data/gluste</em></div><div><em>rfs/hdd/brick2/brick                        49153     0          Y       115904</em></div><div><em>Brick storage3:/data/gluste</em></div><div><em>rfs/hdd/brick2/brick                        49159     0          Y       131783</em></div><div><em>Brick storage1:/data/gluste</em></div><div><em>rfs/hdd/brick3/brick                        49160     0          Y       9163</em></div><div><em>Brick storage2:/data/gluste</em></div><div><em>rfs/hdd/brick3/brick                        49154     0          Y       115913</em></div><div><em>Brick storage3:/data/gluste</em></div><div><em>rfs/hdd/brick3/brick                        49160     0          Y       131792</em></div><div><em>Brick storage1:/data/gluste</em></div><div><em>rfs/hdd/brick4/brick                        49161     0          Y       9170</em></div><div><em>Brick storage2:/data/gluste</em></div><div><em>rfs/hdd/brick4/brick                        49155     0          Y       115923</em></div><div><em>Brick storage3:/data/gluste</em></div><div><em>rfs/hdd/brick4/brick                     
   49161     0          Y       131800</em></div><div><em>Self-heal Daemon on localhost               N/A       N/A        Y       170468</em></div><div><em>Self-heal Daemon on storage3               N/A       N/A        Y       132263</em></div><div><em>Self-heal Daemon on storage1               N/A       N/A        Y       9512</em></div><div> </div><div><em>Task Status of Volume hdd</em></div><div><em>------------------------------------------------------------------------------</em></div><div><em>There are no active volume tasks</em></div><div> </div><div>####################################</div><div># gluster volume heal hdd info split-brain. All bricks are connected and showing no entries (0), except for brick3 on storage2</div><div><em>Brick storage2:/data/glusterfs/hdd/brick3/brick</em></div><div><em>Status: Transport endpoint is not connected</em></div><div><em>Number of entries in split-brain: -</em></div><div> </div><div>####################################</div><div># gluster volume heal hdd info. Only brick3 seems to be affected and it has lots of entries. 
brick3 on storage2 is not connected</div><div> </div><div><em>Brick storage1:/data/glusterfs/hdd/brick3/brick</em></div><div><em>/538befbf-ffa7-4a8c-8827-cee679d589f4/images/615fa020-9737-4b83-a3c1-a61e32400d59/f4917758-deae-4a62-bf4d-5b9a95a7db5b</em></div><div><em><gfid:f3d0b19a-2544-48c5-90b7-addd561113bc></em></div><div><em>/.shard/753a8a81-bd06-4c8c-9515-d54123f6fe4d.1</em></div><div><em>/.shard/c7f5f88f-dc85-4645-9178-c7df8e46a99d.83</em></div><div><em>/538befbf-ffa7-4a8c-8827-cee679d589f4/images/bc4362e6-cd43-4ab8-b8fa-0ea72405b7da/ea9c0e7c-d2c7-43c8-b19f-7a3076cc6743</em></div><div><em>/.shard/dc46e963-2b68-4802-9537-42f25ea97ae2.10872</em></div><div><em>/.shard/dc46e963-2b68-4802-9537-42f25ea97ae2.1901</em></div><div><em>/538befbf-ffa7-4a8c-8827-cee679d589f4/images/e48e80fb-d42f-47a4-9a56-07fd7ad868b3/31fd839f-85bf-4c42-ac0e-7055d903df40</em></div><div><em>/.shard/82700f9b-c7e0-4568-a565-64c9a770449f.223</em></div><div><em>/.shard/82700f9b-c7e0-4568-a565-64c9a770449f.243</em></div><div><em>/.shard/dc46e963-2b68-4802-9537-42f25ea97ae2.10696</em></div><div><em>/.shard/dc46e963-2b68-4802-9537-42f25ea97ae2.10902</em></div><div><em>..</em></div><div><em>Status: Connected</em></div><div><em>Number of entries: 664</em></div><div> </div><div><em>Brick storage2:/data/glusterfs/hdd/brick3/brick</em></div><div><em>Status: Transport endpoint is not connected</em></div><div><em>Number of entries: -</em></div><div> </div><div><em>Brick storage3:/data/glusterfs/hdd/brick3/brick</em></div><div><em>/538befbf-ffa7-4a8c-8827-cee679d589f4/images/615fa020-9737-4b83-a3c1-a61e32400d59/f4917758-deae-4a62-bf4d-5b9a95a7db5b</em></div><div><em><gfid:f3d0b19a-2544-48c5-90b7-addd561113bc></em></div><div><em>/.shard/753a8a81-bd06-4c8c-9515-d54123f6fe4d.1</em></div><div><em>..</em></div><div><em>Status: Connected</em></div><div><em>Number of entries: 664</em></div><div> </div><div>####################################</div><div># /data/glusterfs/hdd/brick3 on storage2 is running inside 
of a software RAID</div><div> </div><div><em>md6 : active raid6 sdac1[6] sdz1[3] sdx1[1] sdad1[7] sdaa1[4] sdy1[2] sdw1[0] sdab1[5] sdae1[8]</em></div><div><em>      68364119040 blocks super 1.2 level 6, 512k chunk, algorithm 2 [9/9] [UUUUUUUUU]</em></div><div><em>      [============>........]  check = 64.4% (6290736128/9766302720) finish=3220.5min speed=17985K/sec</em></div><div><em>      bitmap: 10/73 pages [40KB], 65536KB chunk</em></div><div> </div><div>####################################</div><div># glfsheal-hdd.log on storage2</div><div> </div><div><em>[2022-03-24 10:15:33.238884] I [MSGID: 114046] [client-handshake.c:1106:client_setvolume_cbk] 0-hdd-client-10: Connected to hdd-client-10, attached to remote volume '/data/glusterfs/hdd/brick4/brick'.</em></div><div><em>[2022-03-24 10:15:33.238931] I [MSGID: 108002] [afr-common.c:5607:afr_notify] 0-hdd-replicate-3: Client-quorum is met</em></div><div><em>[2022-03-24 10:15:33.241616] I [MSGID: 114046] [client-handshake.c:1106:client_setvolume_cbk] 0-hdd-client-11: Connected to hdd-client-11, attached to remote volume '/data/glusterfs/hdd/brick4/brick'.</em></div><div><em>[2022-03-24 10:15:44.078651] C [rpc-clnt-ping.c:155:rpc_clnt_ping_timer_expired] 0-hdd-client-7: server 172.22.102.142:49154 has not responded in the last 10 seconds, disconnecting.</em></div><div><em>[2022-03-24 10:15:44.078891] I [MSGID: 114018] [client.c:2331:client_rpc_notify] 0-hdd-client-7: disconnected from hdd-client-7. 
Client process will keep trying to connect to glusterd until brick's port is available</em></div><div><em>[2022-03-24 10:15:44.079954] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fc6c0cadadb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7fc6c019f7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7fc6c019f8fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7fc6c01a0987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7fc6c01a1518] ))))) 0-hdd-client-7: forced unwinding frame type(GF-DUMP) op(DUMP(1)) called at 2022-03-24 10:15:33.209640 (xid=0x5)</em></div><div><em>[2022-03-24 10:15:44.080008] W [MSGID: 114032] [client-handshake.c:1547:client_dump_version_cbk] 0-hdd-client-7: received RPC status error [Transport endpoint is not connected]</em></div><div><em>[2022-03-24 10:15:44.080526] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fc6c0cadadb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7fc6c019f7e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7fc6c019f8fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7fc6c01a0987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7fc6c01a1518] ))))) 0-hdd-client-7: forced unwinding frame type(GF-DUMP) op(NULL(2)) called at 2022-03-24 10:15:33.209655 (xid=0x6)</em></div><div><em>[2022-03-24 10:15:44.080574] W [rpc-clnt-ping.c:210:rpc_clnt_ping_cbk] 0-hdd-client-7: socket disconnected</em></div><div> </div><div>####################################</div><div># stack trace on storage2 that happened this morning</div><div> </div><div><em>Mar 24 06:24:06 storage2 kernel: INFO: task glfs_iotwr000:115974 blocked for more than 120 seconds.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: glfs_iotwr000   D ffff9b91b8951070     0 115974      1 0x00000080</em></div><div><em>Mar 24 06:24:06 storage2 kernel: Call 
Trace:</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b80a29>] schedule+0x29/0x70</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc05056e1>] _xfs_log_force_lsn+0x2d1/0x310 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14db4d0>] ? wake_up_state+0x20/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc04e5a3d>] xfs_file_fsync+0xfd/0x1c0 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167fbf7>] do_fsync+0x67/0xb0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167ff03>] SyS_fdatasync+0x13/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b8dede>] system_call_fastpath+0x25/0x2a</em></div><div><em>Mar 24 06:24:06 storage2 kernel: INFO: task glfs_iotwr001:121353 blocked for more than 120 seconds.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: glfs_iotwr001   D ffff9b9b7d4dac80     0 121353      1 0x00000080</em></div><div><em>Mar 24 06:24:06 storage2 kernel: Call Trace:</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b80a29>] schedule+0x29/0x70</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc05056e1>] _xfs_log_force_lsn+0x2d1/0x310 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14db4d0>] ? 
wake_up_state+0x20/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc04e5a3d>] xfs_file_fsync+0xfd/0x1c0 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167fbf7>] do_fsync+0x67/0xb0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167ff03>] SyS_fdatasync+0x13/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b8dede>] system_call_fastpath+0x25/0x2a</em></div><div><em>Mar 24 06:24:06 storage2 kernel: INFO: task glfs_iotwr002:121354 blocked for more than 120 seconds.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: glfs_iotwr002   D ffff9b9b7d75ac80     0 121354      1 0x00000080</em></div><div><em>Mar 24 06:24:06 storage2 kernel: Call Trace:</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b80a29>] schedule+0x29/0x70</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc05056e1>] _xfs_log_force_lsn+0x2d1/0x310 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14db4d0>] ? 
wake_up_state+0x20/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc04e5a3d>] xfs_file_fsync+0xfd/0x1c0 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167fbf7>] do_fsync+0x67/0xb0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167ff03>] SyS_fdatasync+0x13/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b8dede>] system_call_fastpath+0x25/0x2a</em></div><div><em>Mar 24 06:24:06 storage2 kernel: INFO: task glfs_iotwr003:121355 blocked for more than 120 seconds.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: glfs_iotwr003   D ffff9b9b7d51ac80     0 121355      1 0x00000080</em></div><div><em>Mar 24 06:24:06 storage2 kernel: Call Trace:</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b80a29>] schedule+0x29/0x70</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b7e531>] schedule_timeout+0x221/0x2d0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14d77a9>] ? ttwu_do_wakeup+0x19/0xe0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14d78df>] ? ttwu_do_activate+0x6f/0x80</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14db210>] ? try_to_wake_up+0x190/0x390</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b80ddd>] wait_for_completion+0xfd/0x140</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14db4d0>] ? wake_up_state+0x20/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14be9aa>] flush_work+0x10a/0x1b0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14bb6c0>] ? 
move_linked_works+0x90/0x90</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc05070ba>] xlog_cil_force_lsn+0x8a/0x210 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc0505484>] _xfs_log_force_lsn+0x74/0x310 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa15bcb1f>] ? filemap_fdatawait_range+0x1f/0x30</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b7fd22>] ? down_read+0x12/0x40</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc04e5a3d>] xfs_file_fsync+0xfd/0x1c0 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167fbf7>] do_fsync+0x67/0xb0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167ff03>] SyS_fdatasync+0x13/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b8dede>] system_call_fastpath+0x25/0x2a</em></div><div><em>Mar 24 06:24:06 storage2 kernel: INFO: task glfs_iotwr004:121356 blocked for more than 120 seconds.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: glfs_iotwr004   D ffff9b9b7d75ac80     0 121356      1 0x00000080</em></div><div><em>Mar 24 06:24:06 storage2 kernel: Call Trace:</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b80a29>] schedule+0x29/0x70</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b7e531>] schedule_timeout+0x221/0x2d0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14d77a9>] ? ttwu_do_wakeup+0x19/0xe0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14d78df>] ? ttwu_do_activate+0x6f/0x80</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14db210>] ? try_to_wake_up+0x190/0x390</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b80ddd>] wait_for_completion+0xfd/0x140</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14db4d0>] ? 
wake_up_state+0x20/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14be9aa>] flush_work+0x10a/0x1b0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14bb6c0>] ? move_linked_works+0x90/0x90</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc05070ba>] xlog_cil_force_lsn+0x8a/0x210 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc0505484>] _xfs_log_force_lsn+0x74/0x310 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa15bcb1f>] ? filemap_fdatawait_range+0x1f/0x30</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b7fd22>] ? down_read+0x12/0x40</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc04e5a3d>] xfs_file_fsync+0xfd/0x1c0 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167fbf7>] do_fsync+0x67/0xb0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167ff03>] SyS_fdatasync+0x13/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b8dede>] system_call_fastpath+0x25/0x2a</em></div><div><em>Mar 24 06:24:06 storage2 kernel: INFO: task glfs_iotwr005:153774 blocked for more than 120 seconds.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: glfs_iotwr005   D ffff9b9b7d61ac80     0 153774      1 0x00000080</em></div><div><em>Mar 24 06:24:06 storage2 kernel: Call Trace:</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b80a29>] schedule+0x29/0x70</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b7e531>] schedule_timeout+0x221/0x2d0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14d77a9>] ? ttwu_do_wakeup+0x19/0xe0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14d78df>] ? ttwu_do_activate+0x6f/0x80</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14db210>] ? 
try_to_wake_up+0x190/0x390</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b80ddd>] wait_for_completion+0xfd/0x140</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14db4d0>] ? wake_up_state+0x20/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14be9aa>] flush_work+0x10a/0x1b0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14bb6c0>] ? move_linked_works+0x90/0x90</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc05070ba>] xlog_cil_force_lsn+0x8a/0x210 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167335b>] ? getxattr+0x11b/0x180</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc0505484>] _xfs_log_force_lsn+0x74/0x310 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b7fd22>] ? down_read+0x12/0x40</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc04e5a3d>] xfs_file_fsync+0xfd/0x1c0 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167fbf7>] do_fsync+0x67/0xb0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167ff03>] SyS_fdatasync+0x13/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b8dede>] system_call_fastpath+0x25/0x2a</em></div><div><em>Mar 24 06:24:06 storage2 kernel: INFO: task glfs_iotwr006:153775 blocked for more than 120 seconds.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: glfs_iotwr006   D ffff9b9b7d49ac80     0 153775      1 0x00000080</em></div><div><em>Mar 24 06:24:06 storage2 kernel: Call Trace:</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b80a29>] schedule+0x29/0x70</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b7e531>] schedule_timeout+0x221/0x2d0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14d77a9>] ? 
ttwu_do_wakeup+0x19/0xe0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14d78df>] ? ttwu_do_activate+0x6f/0x80</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14db210>] ? try_to_wake_up+0x190/0x390</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b80ddd>] wait_for_completion+0xfd/0x140</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14db4d0>] ? wake_up_state+0x20/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14be9aa>] flush_work+0x10a/0x1b0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14bb6c0>] ? move_linked_works+0x90/0x90</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc05070ba>] xlog_cil_force_lsn+0x8a/0x210 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167335b>] ? getxattr+0x11b/0x180</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc0505484>] _xfs_log_force_lsn+0x74/0x310 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b7fd22>] ? 
down_read+0x12/0x40</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc04e5a3d>] xfs_file_fsync+0xfd/0x1c0 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167fbf7>] do_fsync+0x67/0xb0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167ff03>] SyS_fdatasync+0x13/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b8dede>] system_call_fastpath+0x25/0x2a</em></div><div><em>Mar 24 06:24:06 storage2 kernel: INFO: task glfs_iotwr007:153776 blocked for more than 120 seconds.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: glfs_iotwr007   D ffff9b9958c962a0     0 153776      1 0x00000080</em></div><div><em>Mar 24 06:24:06 storage2 kernel: Call Trace:</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b80a29>] schedule+0x29/0x70</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b7e531>] schedule_timeout+0x221/0x2d0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14d7782>] ? check_preempt_curr+0x92/0xa0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14d77a9>] ? ttwu_do_wakeup+0x19/0xe0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14db210>] ? try_to_wake_up+0x190/0x390</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b80ddd>] wait_for_completion+0xfd/0x140</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14db4d0>] ? wake_up_state+0x20/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14be9aa>] flush_work+0x10a/0x1b0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14bb6c0>] ? move_linked_works+0x90/0x90</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc05070ba>] xlog_cil_force_lsn+0x8a/0x210 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167335b>] ? 
getxattr+0x11b/0x180</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc0505484>] _xfs_log_force_lsn+0x74/0x310 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b7fd22>] ? down_read+0x12/0x40</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc04e5a3d>] xfs_file_fsync+0xfd/0x1c0 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167fbf7>] do_fsync+0x67/0xb0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167ff03>] SyS_fdatasync+0x13/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b8dede>] system_call_fastpath+0x25/0x2a</em></div><div><em>Mar 24 06:24:06 storage2 kernel: INFO: task glfs_iotwr008:153777 blocked for more than 120 seconds.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: glfs_iotwr008   D ffff9b9b7d61ac80     0 153777      1 0x00000080</em></div><div><em>Mar 24 06:24:06 storage2 kernel: Call Trace:</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b80a29>] schedule+0x29/0x70</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc05056e1>] _xfs_log_force_lsn+0x2d1/0x310 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14db4d0>] ? 
wake_up_state+0x20/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc04e5a3d>] xfs_file_fsync+0xfd/0x1c0 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167fbf7>] do_fsync+0x67/0xb0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167ff03>] SyS_fdatasync+0x13/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b8dede>] system_call_fastpath+0x25/0x2a</em></div><div><em>Mar 24 06:24:06 storage2 kernel: INFO: task glfs_iotwr009:153778 blocked for more than 120 seconds.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.</em></div><div><em>Mar 24 06:24:06 storage2 kernel: glfs_iotwr009   D ffff9b9958c920e0     0 153778      1 0x00000080</em></div><div><em>Mar 24 06:24:06 storage2 kernel: Call Trace:</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b80a29>] schedule+0x29/0x70</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc05056e1>] _xfs_log_force_lsn+0x2d1/0x310 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa14db4d0>] ? wake_up_state+0x20/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffc04e5a3d>] xfs_file_fsync+0xfd/0x1c0 [xfs]</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167fbf7>] do_fsync+0x67/0xb0</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa167ff03>] SyS_fdatasync+0x13/0x20</em></div><div><em>Mar 24 06:24:06 storage2 kernel: [<ffffffffa1b8dede>] system_call_fastpath+0x25/0x2a</em></div></div></div></div></div>
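<div>To make the trace above easier to digest, here is a small sketch that lists which tasks the kernel reported as blocked. It assumes the log lines have the same "INFO: task NAME:PID blocked for more than N seconds" shape as in the excerpt; the function names blocked_tasks and count_by_prefix are my own and not part of any gluster or kernel tooling.</div>
<pre>
```python
import re
from collections import Counter

# Matches the hung-task header line written by the kernel, for example:
# "INFO: task glfs_iotwr000:115974 blocked for more than 120 seconds."
HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")


def blocked_tasks(log_text):
    """Return a list of (task_name, pid) tuples, one per hung-task report."""
    return [(m.group(1), int(m.group(2))) for m in HUNG_RE.finditer(log_text)]


def count_by_prefix(tasks):
    """Group task names by their prefix, ignoring the trailing thread number."""
    return Counter(re.sub(r"\d+$", "", name) for name, _pid in tasks)


# Two header lines copied from the trace above.
SAMPLE = """\
Mar 24 06:24:06 storage2 kernel: INFO: task glfs_iotwr000:115974 blocked for more than 120 seconds.
Mar 24 06:24:06 storage2 kernel: INFO: task glfs_iotwr001:121353 blocked for more than 120 seconds.
"""

if __name__ == "__main__":
    tasks = blocked_tasks(SAMPLE)
    print(tasks)
    print(count_by_prefix(tasks))
```
</pre>
<div>Run against the full trace, every reported task is a glfs_iotwr thread, and every call trace ends in fdatasync going through xfs_file_fsync.</div>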