[Bugs] [Bug 1564071] directories are invisible on client side

bugzilla@redhat.com
Fri Apr 27 13:33:45 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1564071

g.amedick@uni-luebeck.de changed:

           What    |Removed                            |Added
----------------------------------------------------------------------------
              Flags|needinfo?(g.amedick@uni-luebeck.de) |



--- Comment #8 from g.amedick@uni-luebeck.de ---
Hi,

As I said, the directories seem to heal over time. We currently don't know of a
"hidden" folder, but there are "hidden" files. I'll proceed with those and hope
it helps. I created a new mount at /mnt on a test compute node.
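
For completeness, the mount is done by a systemd mount unit (shown under 1.
below); by hand it would be roughly:

$ mount -t glusterfs -o defaults,_netdev,backupvolfile-server=gluster02.FQDN gluster01.FQDN:/$vol1 /mnt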

1. 
$ mount | grep /mnt
gluster01.FQDN:/$vol1 on /mnt type fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

$ systemctl status mnt.mount
● mnt.mount
   Loaded: loaded (/etc/systemd/system/mnt.mount; disabled)
   Active: active (mounted) since Tue 2018-04-24 16:52:06 CEST; 2 days ago
    Where: /mnt
     What: gluster01.FQDN:/$vol1
  Process: 3104 ExecMount=/bin/mount -n gluster01.FQDN:/$vol1 /mnt -t glusterfs
-o defaults,_netdev,backupvolfile-server=gluster02.FQDN (code=exited,
status=0/SUCCESS)
   CGroup: /system.slice/mnt.mount
           └─3173 /usr/sbin/glusterfs --volfile-server=gluster01.FQDN
--volfile-server=gluster02.FQDN --volfile-id=/$vol1 /mnt


2. See attachment.

3.
$ gluster volume status
Status of volume: $vol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gluster02:/srv/glusterfs/bricks/DATA2
01/data                                     49152     0          Y       4064 
Brick gluster02:/srv/glusterfs/bricks/DATA2
02/data                                     49153     0          Y       4072 
Brick gluster02:/srv/glusterfs/bricks/DATA2
03/data                                     49154     0          Y       4080 
Brick gluster02:/srv/glusterfs/bricks/DATA2
04/data                                     49155     0          Y       4090 
Brick gluster02:/srv/glusterfs/bricks/DATA2
05/data                                     49156     0          Y       4098 
Brick gluster02:/srv/glusterfs/bricks/DATA2
06/data                                     49157     0          Y       4107 
Brick gluster02:/srv/glusterfs/bricks/DATA2
07/data                                     49158     0          Y       4116 
Brick gluster02:/srv/glusterfs/bricks/DATA2
08/data                                     49159     0          Y       4125 
Brick gluster01:/srv/glusterfs/bricks/DATA1
10/data                                     49152     0          Y       4418 
Brick gluster01:/srv/glusterfs/bricks/DATA1
11/data                                     49153     0          Y       4426 
Brick gluster01:/srv/glusterfs/bricks/DATA1
12/data                                     49154     0          Y       4434 
Brick gluster01:/srv/glusterfs/bricks/DATA1
13/data                                     49155     0          Y       4444 
Brick gluster01:/srv/glusterfs/bricks/DATA1
14/data                                     49156     0          Y       4452 
Brick gluster02:/srv/glusterfs/bricks/DATA2
09/data                                     49160     0          Y       4134 
Brick gluster01:/srv/glusterfs/bricks/DATA1
01/data                                     49157     0          Y       4461 
Brick gluster01:/srv/glusterfs/bricks/DATA1
02/data                                     49158     0          Y       4470 
Brick gluster01:/srv/glusterfs/bricks/DATA1
03/data                                     49159     0          Y       4479 
Brick gluster01:/srv/glusterfs/bricks/DATA1
04/data                                     49160     0          Y       4488 
Brick gluster01:/srv/glusterfs/bricks/DATA1
05/data                                     49161     0          Y       4498 
Brick gluster01:/srv/glusterfs/bricks/DATA1
06/data                                     49162     0          Y       4507 
Brick gluster01:/srv/glusterfs/bricks/DATA1
07/data                                     49163     0          Y       4516 
Brick gluster01:/srv/glusterfs/bricks/DATA1
08/data                                     49164     0          Y       4525 
Brick gluster01:/srv/glusterfs/bricks/DATA1
09/data                                     49165     0          Y       4533 
Quota Daemon on localhost                   N/A       N/A        Y       4041 
Quota Daemon on gluster03.FQDN              N/A       N/A        Y       701  
Quota Daemon on gluster04.FQDN              N/A       N/A        Y       810  
Quota Daemon on gluster05.FQDN              N/A       N/A        Y       3011 
Quota Daemon on gluster01                   N/A       N/A        Y       4393 

Task Status of Volume $vol1
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : 326d0a79-98e7-4e7a-9ae1-6fc5e33663ae
Status               : failed              

Status of volume: $vol2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gluster02:/srv/glusterfs/bricks/SRV_C
LOUD_201/data                               49161     0          Y       4143 

Task Status of Volume $vol2
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: $vol3
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gluster02:/srv/glusterfs/bricks/SRV_H
OME_201/data                                49162     0          Y       4152 
Quota Daemon on localhost                   N/A       N/A        Y       4041 
Quota Daemon on gluster04.FQDN              N/A       N/A        Y       810  
Quota Daemon on gluster03.FQDN              N/A       N/A        Y       701  
Quota Daemon on gluster01                   N/A       N/A        Y       4393 
Quota Daemon on gluster05.FQDN              N/A       N/A        Y       3011 

Task Status of Volume $vol3
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: $vol4
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gluster02:/srv/glusterfs/bricks/SRV_S
LURM_201/data                               49163     0          Y       4161 

Task Status of Volume $vol4
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: $vol5
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gluster05.FQDN:/srv/glusterfs/bricks/TEST001/data                N/A     
 N/A        N       N/A  

Task Status of Volume $vol5
------------------------------------------------------------------------------
There are no active volume tasks

Volume TEST_DISPERSED is not started


Volumes 1, 2, 3 and 4 are in production. We haven't received any reports of
errors on volumes 2-4, but they are small and don't have a high load, so we
don't know whether they are affected or not.

4. See attachments.

5. The volume is part of a cluster that does genome analysis. I'm afraid I
can't publish the complete path because it contains sensitive information, but
it consists only of alphanumeric characters, "." and "_".

1st file:
root@gluster02:~# getfattr -d -m - /srv/glusterfs/bricks/DATA202/data/$PATH/sd.bin
# file: srv/glusterfs/bricks/DATA202/data/$PATH/sd.bin
trusted.gfid=0s85TbqbmpQoG/3BV5LbJwxg==
trusted.gfid2path.6ccfa9a95c18c513="3847d58a-0225-4be2-8ba6-a7fcaf16dcf2/sd.bin"
trusted.glusterfs.quota.3847d58a-0225-4be2-8ba6-a7fcaf16dcf2.contri.1=0sAAAAAAAAfgAAAAAAAAAAAQ==
trusted.pgfid.3847d58a-0225-4be2-8ba6-a7fcaf16dcf2=0sAAAAAQ==

root@gluster02:~# stat /srv/glusterfs/bricks/DATA202/data/$PATH/sd.bin
  File: /srv/glusterfs/bricks/DATA202/data/$PATH/sd.bin
  Size: 32058         Blocks: 72         IO Block: 4096   regular file
Device: fe11h/65041d    Inode: 34550135635  Links: 2
Access: (0644/-rw-r--r--)  Uid: ( 1029/ $user)   Gid: ( 1039/$group)
Access: 2018-04-24 16:53:47.688932475 +0200
Modify: 2018-03-27 09:11:01.000000000 +0200
Change: 2018-04-24 13:32:26.357256496 +0200
 Birth: -

2nd file:
root@gluster02:~# getfattr -d -m - /srv/glusterfs/bricks/DATA202/data/$PATH/pairtable.bin
# file: srv/glusterfs/bricks/DATA202/data/$PATH/pairtable.bin
trusted.gfid=0sGGS421fzQpquDiz3KTaO1g==
trusted.gfid2path.5b44f1b5ab80e888="3847d58a-0225-4be2-8ba6-a7fcaf16dcf2/pairtable.bin"
trusted.glusterfs.quota.3847d58a-0225-4be2-8ba6-a7fcaf16dcf2.contri.1=0sAAAAAAAABgAAAAAAAAAAAQ==
trusted.pgfid.3847d58a-0225-4be2-8ba6-a7fcaf16dcf2=0sAAAAAQ==

root@gluster02:~# stat /srv/glusterfs/bricks/DATA202/data/$PATH/pairtable.bin
  File: /srv/glusterfs/bricks/DATA202/data/$PATH/pairtable.bin
  Size: 1054          Blocks: 16         IO Block: 4096   regular file
Device: fe11h/65041d    Inode: 34550135634  Links: 2
Access: (0644/-rw-r--r--)  Uid: ( 1029/ $user)   Gid: ( 1039/$group)
Access: 2018-04-24 13:29:51.615393077 +0200
Modify: 2018-03-27 09:11:04.000000000 +0200
Change: 2018-04-24 13:32:26.357256496 +0200
 Birth: -

3rd file:
root@gluster02:~# getfattr -d -m - /srv/glusterfs/bricks/DATA201/data/$PATH/seqdata.bin
# file: srv/glusterfs/bricks/DATA201/data/$PATH/seqdata.bin
trusted.gfid=0soL+uP9hOTWyo3Z3+cLOa6w==
trusted.gfid2path.91ad63dbe24d5d40="3847d58a-0225-4be2-8ba6-a7fcaf16dcf2/seqdata.bin"
trusted.glusterfs.quota.3847d58a-0225-4be2-8ba6-a7fcaf16dcf2.contri.1=0sAAAAAAAJ+AAAAAAAAAAAAQ==
trusted.pgfid.3847d58a-0225-4be2-8ba6-a7fcaf16dcf2=0sAAAAAQ==

root@gluster02:~# stat /srv/glusterfs/bricks/DATA201/data/$PATH/seqdata.bin
  File: /srv/glusterfs/bricks/DATA201/data/$PATH/seqdata.bin
  Size: 653142        Blocks: 1288       IO Block: 4096   regular file
Device: fe10h/65040d    Inode: 34385264557  Links: 2
Access: (0644/-rw-r--r--)  Uid: ( 1029/ $user)   Gid: ( 1039/$group)
Access: 2018-04-24 16:53:29.588711695 +0200
Modify: 2018-03-27 09:11:03.000000000 +0200
Change: 2018-04-24 13:32:26.357256496 +0200
 Birth: -
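
In case it helps with mapping these files: the "0s" prefix in the getfattr
output marks a base64-encoded value, so trusted.gfid can be decoded into the
gfid UUID, which also names the hard link under .glusterfs on the brick (the
first two bytes give the two directory levels). Roughly, for sd.bin:

$ echo "85TbqbmpQoG/3BV5LbJwxg==" | base64 -d | xxd -p
f394dba9b9a94281bfdc15792db270c6

i.e. .glusterfs/f3/94/f394dba9-b9a9-4281-bfdc-15792db270c6, if I decoded it
right.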



There's another thing that happened. We started the rebalance and, as you can
see in the status output above, it failed on gluster02.
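
We started and monitored it with the standard CLI, roughly:

$ gluster volume rebalance $vol1 start
$ gluster volume rebalance $vol1 status

This is the part of the rebalance log where it failed: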

[2018-04-24 18:55:08.253990] C
[rpc-clnt-ping.c:166:rpc_clnt_ping_timer_expired] 0-$vol1-client-7: server
$IP_gluster02:49159 has not responded in the last 42 seconds, disconnecting.
[2018-04-24 18:55:08.254210] I [MSGID: 114018]
[client.c:2285:client_rpc_notify] 0-$vol1-client-7: disconnected from
$vol1-client-7. Client process will keep trying to connect to glusterd until
brick's port is available
[2018-04-24 18:55:08.254260] W [MSGID: 109073] [dht-common.c:9315:dht_notify]
0-$vol1-dht: Received CHILD_DOWN. Exiting
[2018-04-24 18:55:08.254283] I [MSGID: 109029]
[dht-rebalance.c:5283:gf_defrag_stop] 0-: Received stop command on rebalance
[2018-04-24 18:55:08.254620] E [rpc-clnt.c:350:saved_frames_unwind] (-->
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13e)[0x7f5ca82e3b6e]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1d1)[0x7f5ca80aa111]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f5ca80aa23e]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7f5ca80ab8d1]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x288)[0x7f5ca80ac3f8]
))))) 0-$vol1-client-7: forced unwinding frame type(GlusterFS 3.3) op(READ(12))
called at 2018-04-24 18:54:25.704755 (xid=0x4150a5)
[2018-04-24 18:55:08.254651] W [MSGID: 114031]
[client-rpc-fops.c:2922:client3_3_readv_cbk] 0-$vol1-client-7: remote operation
failed [Transport endpoint is not connected]
[2018-04-24 18:55:08.254740] E [MSGID: 109023]
[dht-rebalance.c:1820:dht_migrate_file] 0-$vol1-dht: Migrate file failed:
/$PATH1/file1: failed to migrate data
[2018-04-24 18:55:08.254807] W [MSGID: 114061]
[client-common.c:704:client_pre_fstat] 0-$vol1-client-7: 
(7d4a7dd7-db43-428f-9618-add08088d7bb) remote_fd is -1. EBADFD [File descriptor
in bad state]
[2018-04-24 18:55:08.254836] E [MSGID: 109023]
[dht-rebalance.c:1459:__dht_migration_cleanup_src_file] 0-$vol1-dht: Migrate
file cleanup failed: failed to fstat file /$PATH1/file1 on $vol1-client-7 
[File descriptor in bad state]
[2018-04-24 18:55:08.254853] W [MSGID: 109023]
[dht-rebalance.c:2275:dht_migrate_file] 0-$vol1-dht: /$PATH1/file1: failed to
cleanup source file on $vol1-client-7
[2018-04-24 18:55:08.254870] E [rpc-clnt.c:350:saved_frames_unwind] (-->
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13e)[0x7f5ca82e3b6e]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1d1)[0x7f5ca80aa111]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f5ca80aa23e]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7f5ca80ab8d1]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x288)[0x7f5ca80ac3f8]
))))) 0-$vol1-client-7: forced unwinding frame type(GF-DUMP) op(NULL(2)) called
at 2018-04-24 18:54:26.249541 (xid=0x4150a6)
[2018-04-24 18:55:08.254898] W [rpc-clnt-ping.c:223:rpc_clnt_ping_cbk]
0-$vol1-client-7: socket disconnected
[2018-04-24 18:55:14.862395] E [MSGID: 114031]
[client-rpc-fops.c:1508:client3_3_inodelk_cbk] 0-$vol1-client-7: remote
operation failed [Transport endpoint is not connected]
[2018-04-24 18:55:14.862493] W [MSGID: 109023]
[dht-rebalance.c:2300:dht_migrate_file] 0-$vol1-dht: /$PATH1/file1: failed to
unlock file on $vol1-client-7 [Transport endpoint is not connected]
[2018-04-24 18:55:14.862585] E [MSGID: 109023]
[dht-rebalance.c:2790:gf_defrag_migrate_single_file] 0-$vol1-dht: migrate-data
failed for /$PATH1/file1 [Transport endpoint is not connected]
[2018-04-24 18:55:14.862626] W [dht-rebalance.c:3397:gf_defrag_process_dir]
0-$vol1-dht: Found error from gf_defrag_get_entry
[2018-04-24 18:55:14.863078] E [MSGID: 109111]
[dht-rebalance.c:3914:gf_defrag_fix_layout] 0-$vol1-dht: gf_defrag_process_dir
failed for directory: /$PATH2
[2018-04-24 18:55:16.492243] W [MSGID: 114061]
[client-common.c:1197:client_pre_readdirp] 0-$vol1-client-7: 
(12c968a9-4d43-4746-9c16-2e3671b87dd7) remote_fd is -1. EBADFD [File descriptor
in bad state]
[2018-04-24 18:55:18.256351] I [rpc-clnt.c:1986:rpc_clnt_reconfig]
0-$vol1-client-7: changing port to 49159 (from 0)
[2018-04-24 18:55:18.256828] I [MSGID: 114057]
[client-handshake.c:1478:select_server_supported_programs] 0-$vol1-client-7:
Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2018-04-24 18:55:18.257718] I [MSGID: 114046]
[client-handshake.c:1231:client_setvolume_cbk] 0-$vol1-client-7: Connected to
$vol1-client-7, attached to remote volume '/srv/glusterfs/bricks/DATA208/data'.
[2018-04-24 18:55:18.257753] I [MSGID: 114047]
[client-handshake.c:1242:client_setvolume_cbk] 0-$vol1-client-7: Server and
Client lk-version numbers are not same, reopening the fds
[2018-04-24 18:55:18.257771] I [MSGID: 114042]
[client-handshake.c:1047:client_post_handshake] 0-$vol1-client-7: 9 fds open -
Delaying child_up until they are re-opened
[2018-04-24 18:55:18.258105] I [MSGID: 114060]
[client-handshake.c:817:client3_3_reopendir_cbk] 0-$vol1-client-7: reopendir on
<gfid:00000000-0000-0000-0000-000000000001> succeeded (fd = 0)
[2018-04-24 18:55:18.258152] I [MSGID: 114060]
[client-handshake.c:817:client3_3_reopendir_cbk] 0-$vol1-client-7: reopendir on
<gfid:30abdc78-e85b-43fb-aac1-df9be4facf8e> succeeded (fd = 1)
[2018-04-24 18:55:18.258193] I [MSGID: 114060]
[client-handshake.c:817:client3_3_reopendir_cbk] 0-$vol1-client-7: reopendir on
<gfid:87541f65-770a-4cc3-89ab-19f6d0e98aa5> succeeded (fd = 4)
[2018-04-24 18:55:18.258222] I [MSGID: 114060]
[client-handshake.c:817:client3_3_reopendir_cbk] 0-$vol1-client-7: reopendir on
<gfid:9faf9889-0419-4e1e-ade1-2929a8575ce2> succeeded (fd = 5)
[2018-04-24 18:55:18.258248] I [MSGID: 114060]
[client-handshake.c:817:client3_3_reopendir_cbk] 0-$vol1-client-7: reopendir on
<gfid:14d6f9bc-5756-444d-86b7-a55d64753ca7> succeeded (fd = 6)
[2018-04-24 18:55:18.258272] I [MSGID: 114060]
[client-handshake.c:817:client3_3_reopendir_cbk] 0-$vol1-client-7: reopendir on
<gfid:ba3dbaf6-5774-416d-956c-483ddb514f42> succeeded (fd = 2)
[2018-04-24 18:55:18.258328] I [MSGID: 114060]
[client-handshake.c:817:client3_3_reopendir_cbk] 0-$vol1-client-7: reopendir on
<gfid:12c968a9-4d43-4746-9c16-2e3671b87dd7> succeeded (fd = 3)
[2018-04-24 18:55:18.258442] I [MSGID: 114060]
[client-handshake.c:817:client3_3_reopendir_cbk] 0-$vol1-client-7: reopendir on
<gfid:14d6f9bc-5756-444d-86b7-a55d64753ca7> succeeded (fd = 6)
[2018-04-24 18:55:18.258541] I [MSGID: 114041]
[client-handshake.c:678:client_child_up_reopen_done] 0-$vol1-client-7: last fd
open'd/lock-self-heal'd - notifying CHILD-UP
[2018-04-24 18:55:18.258659] I [MSGID: 114035]
[client-handshake.c:202:client_set_lk_version_cbk] 0-$vol1-client-7: Server lk
version = 1
[2018-04-24 18:55:20.089490] I [MSGID: 109081] [dht-common.c:4379:dht_setxattr]
0-$vol1-dht: fixing the layout of /dir1/dir2/dir3/dir4/dir5
[2018-04-24 18:56:02.823510] I [dht-rebalance.c:3223:gf_defrag_process_dir]
0-$vol1-dht: migrate data called on /dir1/dir2/dir3/dir4/dir5
[2018-04-24 18:56:02.854207] W [dht-rebalance.c:3397:gf_defrag_process_dir]
0-$vol1-dht: Found error from gf_defrag_get_entry
[2018-04-24 18:56:02.854759] E [MSGID: 109111]
[dht-rebalance.c:3914:gf_defrag_fix_layout] 0-$vol1-dht: gf_defrag_process_dir
failed for directory: /dir1/dir2/dir3/dir4/dir5
[2018-04-24 18:56:02.855041] E [MSGID: 109016]
[dht-rebalance.c:3851:gf_defrag_fix_layout] 0-$vol1-dht: Fix layout failed for
/dir1/dir2/dir3/dir4
[2018-04-24 18:56:02.855225] E [MSGID: 109016]
[dht-rebalance.c:3851:gf_defrag_fix_layout] 0-$vol1-dht: Fix layout failed for
/dir1/dir2/dir3
[2018-04-24 18:56:02.855565] E [MSGID: 109016]
[dht-rebalance.c:3851:gf_defrag_fix_layout] 0-$vol1-dht: Fix layout failed for
/dir1/dir2
[2018-04-24 18:56:02.855760] E [MSGID: 109016]
[dht-rebalance.c:3851:gf_defrag_fix_layout] 0-$vol1-dht: Fix layout failed for
/dir1
[2018-04-24 18:59:19.254438] I [MSGID: 109022]
[dht-rebalance.c:2218:dht_migrate_file] 0-$vol1-dht: completed migration of
/$PATH3/file3 from subvolume $vol1-client-0 to $vol1-client-1
[2018-04-24 18:59:19.256074] I [MSGID: 109028]
[dht-rebalance.c:5097:gf_defrag_status_get] 0-$vol1-dht: Rebalance is failed.
Time taken is 120378.00 secs
[2018-04-24 18:59:19.256119] I [MSGID: 109028]
[dht-rebalance.c:5101:gf_defrag_status_get] 0-$vol1-dht: Files migrated:
434664, size: 21939317004280, lookups: 935269, failures: 8, skipped: 166223
[2018-04-24 18:59:19.256371] W [glusterfsd.c:1375:cleanup_and_exit]
(-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x7494) [0x7f5ca755b494]
-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xf5) [0x5644710ead45]
-->/usr/sbin/glusterfs(cleanup_and_exit+0x54) [0x5644710eaba4] ) 0-: received
signum (15), shutting down

client-7 seems to be responsible for the brick DATA208 on gluster02 (a note on
how the client index maps to a brick follows the log excerpt). Its log contains
these lines around that time:

[2018-04-24 17:19:33.929281] E [MSGID: 113001]
[posix.c:5983:_posix_handle_xattr_keyvalue_pair] 0-$vol1-posix: setxattr failed
on
/srv/glusterfs/bricks/DATA208/data/.glusterfs/ac/24/ac246c06-bd39-4799-bdbe-7fba9beb4fb7
while doing xattrop:
key=trusted.glusterfs.quota.33157353-a842-48ac-8f84-e0cc55a59eae.contri.1 [No
such file or directory]
[2018-04-24 18:50:33.091715] E [MSGID: 113001]
[posix.c:5983:_posix_handle_xattr_keyvalue_pair] 0-$vol1-posix: setxattr failed
on
/srv/glusterfs/bricks/DATA208/data/.glusterfs/33/c4/33c4ca2e-63cd-4ab6-b56d-b95bb085b9b3
while doing xattrop:
key=trusted.glusterfs.quota.8bc4b5a3-0792-429b-878a-7bcfba5d8360.contri.1 [No
such file or directory]
[2018-04-24 18:56:02.744587] W [socket.c:593:__socket_rwv] 0-tcp.$vol1-server:
writev on $IP_gluster02:49057 failed (Broken pipe)
[2018-04-24 18:56:02.744742] W [inodelk.c:499:pl_inodelk_log_cleanup]
0-$vol1-server: releasing lock on 7d4a7dd7-db43-428f-9618-add08088d7bb held by
{client=0x7f38600ba190, pid=-3 lk-owner=fdffffff}
[2018-04-25 04:38:35.718259] A [MSGID: 120004] [quota.c:4998:quota_log_usage]
0-$vol1-quota: Usage is above soft limit: 199.7TB used by /$some_dir
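
On how client-7 maps to a brick: the trailing index in $vol1-client-N should
follow the brick order of the volume, and the reconnect message above
explicitly says client-7 is attached to /srv/glusterfs/bricks/DATA208/data.
Rough ways to double-check:

$ gluster volume info $vol1 | grep '^Brick'
$ grep 'attached to remote volume' /var/log/glusterfs/$vol1-rebalance.log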


I think the brick somehow lost its connection to something. We're not sure what
port 49057 was used for, though. We can't find anything about it in the logs,
and according to "netstat" it's currently not in use.
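
Roughly what we checked, in case the exact commands matter:

$ netstat -tnp | grep 49057
$ grep -r 49057 /var/log/glusterfs/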

We're also starting to see errors in the brick logs (all bricks) that look like
this (a quick check on them is sketched after the excerpt):

[2018-04-26 11:43:42.244821] W [marker-quota.c:33:mq_loc_copy] 0-marker: src
loc is not valid
[2018-04-26 11:43:42.244854] E [marker-quota.c:1488:mq_initiate_quota_task]
0-$vol1-marker: loc copy failed
[2018-04-26 11:43:34.752298] W [MSGID: 113001]
[posix.c:4430:posix_get_ancestry_non_directory] 0-$vol1-posix: listxattr failed
on/srv/glusterfs/bricks/DATA208/data/.glusterfs/75/79/757961cd-4348-41fa-93cb-2a681f87af96
[No such file or directory]
[2018-04-26 11:43:42.245003] W [MSGID: 113001]
[posix.c:4430:posix_get_ancestry_non_directory] 0-$vol1-posix: listxattr failed
on/srv/glusterfs/bricks/DATA208/data/.glusterfs/e2/8a/e28a69cc-e23a-43ab-998c-f41ef77212b5
[No such file or directory]
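
Since those messages point at gfid handles under .glusterfs, a quick sanity
check is whether the handle still exists on the brick, roughly:

$ stat /srv/glusterfs/bricks/DATA208/data/.glusterfs/75/79/757961cd-4348-41fa-93cb-2a681f87af96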
