[Bugs] [Bug 1761350] New: Directories are not healed when dirs are created on the backend bricks and a lookup is performed from the mount path.

bugzilla at redhat.com bugzilla at redhat.com
Mon Oct 14 08:46:40 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1761350

            Bug ID: 1761350
           Summary: Directories are not healed, when dirs are created on
                    the backend bricks and performed lookup from mount
                    path.
           Product: GlusterFS
           Version: 6
          Hardware: x86_64
                OS: Linux
            Status: NEW
         Component: replicate
          Severity: medium
          Assignee: bugs at gluster.org
          Reporter: mwaykole at redhat.com
                CC: bugs at gluster.org
  Target Milestone: ---
    Classification: Community



[afr] Heal is not completed in a (1 * 3) replicated volume after enabling the
client-side healing options below.
Some entries are always left unhealed.

"metadata-self-heal": "on",
"entry-self-heal": "on",
"data-self-heal": "on"

steps (a command sketch follows the list):

        1) Create a replicated volume (1 * 3).
        2) Test the case with the default afr options.
        3) Test the case with the volume option 'self-heal-daemon'.
        4) Create dirs directly on the backend bricks, let's say dir1, dir2 and dir3.
        5) From the mount point:
            echo "hi" > dir1 -> must fail
            touch dir2       -> must pass
            mkdir dir3       -> must fail
        6) From the mount point, ls -l and find must list dir1, dir2 and dir3.
        7) Check on all backend bricks that dir1, dir2 and dir3 have been created.
        8) heal info should show zero entries, and the gfid and other attributes
           must exist.
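
A rough command-level sketch of the steps above (hostnames and brick paths are
taken from the heal info output below; the mount point /mnt/testvol is an
assumption made here for illustration):

# gluster volume create testvol_replicated replica 3 \
      server:/bricks/brick1/testvol_replicated_brick0 \
      server2:/bricks/brick1/testvol_replicated_brick1 \
      server3:/bricks/brick1/testvol_replicated_brick2
# gluster volume start testvol_replicated
# mount -t glusterfs server:/testvol_replicated /mnt/testvol

On each brick node, create the dirs directly on the brick (shown for server;
repeat on server2 and server3 with their respective brick paths):

# mkdir /bricks/brick1/testvol_replicated_brick0/dir1
# mkdir /bricks/brick1/testvol_replicated_brick0/dir2
# mkdir /bricks/brick1/testvol_replicated_brick0/dir3

From the mount point:

# echo "hi" > /mnt/testvol/dir1    -> must fail (dir1 is a directory)
# touch /mnt/testvol/dir2          -> must pass
# mkdir /mnt/testvol/dir3          -> must fail (already exists)
# ls -l /mnt/testvol
# find /mnt/testvol
# gluster volume heal testvol_replicated info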



Actual result:

# gluster volume heal testvol_replicated info
Brick server:/bricks/brick1/testvol_replicated_brick0
Status: Connected
Number of entries: 1

Brick server2:/bricks/brick1/testvol_replicated_brick1
Status: Connected
Number of entries: 1

Brick server3:/bricks/brick1/testvol_replicated_brick2
Status: Connected
Number of entries: 1

Expected result:

Brick server:/bricks/brick1/testvol_replicated_brick0
Status: Connected
Number of entries: 0

Brick server2:/bricks/brick1/testvol_replicated_brick1
Status: Connected
Number of entries: 0

Brick server3:/bricks/brick1/testvol_replicated_brick2
Status: Connected
Number of entries: 0
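
The gfid and other attributes from step 8 can be checked directly on the
bricks with getfattr (a sketch; brick paths are taken from the output above,
and each command is run on the corresponding brick node). The trusted.gfid
value for the same directory must be present and identical on all three
bricks:

# getfattr -d -m . -e hex /bricks/brick1/testvol_replicated_brick0/dir1
# getfattr -d -m . -e hex /bricks/brick1/testvol_replicated_brick1/dir1
# getfattr -d -m . -e hex /bricks/brick1/testvol_replicated_brick2/dir1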



Additional info:

[2019-10-09 09:15:36.822052] W [socket.c:774:__socket_rwv]
0-testvol_replicated-client-0: readv on 10.70.35.132:49152 failed (No data
available)
The message "I [MSGID: 100040] [glusterfsd-mgmt.c:106:mgmt_process_volfile]
0-glusterfs: No change in volfile, continuing" repeated 2 times between
[2019-10-09 09:15:35.989751] and [2019-10-09 09:15:36.303521]
[2019-10-09 09:15:36.822109] I [MSGID: 114018]
[client.c:2398:client_rpc_notify] 0-testvol_replicated-client-0: disconnected
from testvol_replicated-client-0. Client process will keep trying to connect to
glusterd until brick's port is available
[2019-10-09 09:15:38.859761] W [socket.c:774:__socket_rwv]
0-testvol_replicated-client-1: readv on 10.70.35.216:49152 failed (No data
available)
[2019-10-09 09:15:38.859805] I [MSGID: 114018]
[client.c:2398:client_rpc_notify] 0-testvol_replicated-client-1: disconnected
from testvol_replicated-client-1. Client process will keep trying to connect to
glusterd until brick's port is available
[2019-10-09 09:15:38.859834] W [MSGID: 108001] [afr-common.c:5653:afr_notify]
0-testvol_replicated-replicate-0: Client-quorum is not met
[2019-10-09 09:15:38.860994] W [socket.c:774:__socket_rwv]
0-testvol_replicated-client-2: readv on 10.70.35.80:49152 failed (No data
available)
[2019-10-09 09:15:38.861025] I [MSGID: 114018]
[client.c:2398:client_rpc_notify] 0-testvol_replicated-client-2: disconnected
from testvol_replicated-client-2. Client process will keep trying to connect to
glusterd until brick's port is available
[2019-10-09 09:15:38.861046] E [MSGID: 108006]
[afr-common.c:5357:__afr_handle_child_down_event]
0-testvol_replicated-replicate-0: All subvolumes are down. Going offline until
at least one of them comes back up.
[2019-10-09 09:15:39.827168] E [MSGID: 114058]
[client-handshake.c:1268:client_query_portmap_cbk]
0-testvol_replicated-client-0: failed to get the port number for remote
subvolume. Please run 'gluster volume status' on server to see if brick process
is running.
[2019-10-09 09:15:39.827274] I [MSGID: 114018]
[client.c:2398:client_rpc_notify] 0-testvol_replicated-client-0: disconnected
from testvol_replicated-client-0. Client process will keep trying to connect to
glusterd until brick's port is available
[2019-10-09 09:15:39.881864] W [glusterfsd.c:1645:cleanup_and_exit]
(-->/lib64/libpthread.so.0(+0x7dd5) [0x7fd028181dd5]
-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x56243aebc805]
-->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x56243aebc66b] ) 0-: received
signum (15), shutting down



######################### latest logs
[2019-10-11 11:25:29.749047] I [rpc-clnt.c:1967:rpc_clnt_reconfig]
5-testvol_replicated-client-1: changing port to 49155 (from 0)
[2019-10-11 11:25:29.754160] I [rpc-clnt.c:1967:rpc_clnt_reconfig]
5-testvol_replicated-client-2: changing port to 49155 (from 0)
[2019-10-11 11:25:29.754806] I [MSGID: 114057]
[client-handshake.c:1188:select_server_supported_programs]
5-testvol_replicated-client-1: Using Program GlusterFS 4.x v1, Num (1298437),
Version (400) 
[2019-10-11 11:25:29.756036] I [MSGID: 114046]
[client-handshake.c:904:client_setvolume_cbk] 5-testvol_replicated-client-1:
Connected to testvol_replicated-client-1, attached to remote volume
'/bricks/brick1/testvol_replicated_brick1'. 
[2019-10-11 11:25:29.756076] I [MSGID: 108002] [afr-common.c:5648:afr_notify]
5-testvol_replicated-replicate-0: Client-quorum is met 
[2019-10-11 11:25:29.758143] I [MSGID: 114057]
[client-handshake.c:1188:select_server_supported_programs]
5-testvol_replicated-client-2: Using Program GlusterFS 4.x v1, Num (1298437),
Version (400) 
[2019-10-11 11:25:29.759918] I [MSGID: 114046]
[client-handshake.c:904:client_setvolume_cbk] 5-testvol_replicated-client-2:
Connected to testvol_replicated-client-2, attached to remote volume
'/bricks/brick1/testvol_replicated_brick2'. 
[2019-10-11 11:25:30.778455] I [MSGID: 108026]
[afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do]
5-testvol_replicated-replicate-0: performing metadata selfheal on
fb2f6540-41eb-4ed6-9fe1-e821f02bda9e 
[2019-10-11 11:25:30.793172] I [MSGID: 108026]
[afr-self-heal-common.c:1750:afr_log_selfheal]
5-testvol_replicated-replicate-0: Completed metadata selfheal on
fb2f6540-41eb-4ed6-9fe1-e821f02bda9e. sources=[0]  sinks=1 2  
[2019-10-11 11:25:30.797468] I [MSGID: 108026]
[afr-self-heal-entry.c:916:afr_selfheal_entry_do]
5-testvol_replicated-replicate-0: performing entry selfheal on
fb2f6540-41eb-4ed6-9fe1-e821f02bda9e 
[2019-10-11 11:25:30.812701] I [MSGID: 108026]
[afr-self-heal-common.c:1750:afr_log_selfheal]
5-testvol_replicated-replicate-0: Completed entry selfheal on
fb2f6540-41eb-4ed6-9fe1-e821f02bda9e. sources=[0]  sinks=1 2  
Ending Test:
functional.afr.test_gfid_assignment_on_lookup.AssignGfidOnLookup_cplex_replicated_glusterfs.test_gfid_assignment_on_lookup
: 16_55_11_10_2019
[2019-10-11 11:25:31.572199] W [socket.c:774:__socket_rwv]
5-testvol_replicated-client-0: readv on 10.70.35.132:49155 failed (No data
available)
[2019-10-11 11:25:31.572250] I [MSGID: 114018]
[client.c:2398:client_rpc_notify] 5-testvol_replicated-client-0: disconnected
from testvol_replicated-client-0. Client process will keep trying to connect to
glusterd until brick's port is available 
[2019-10-11 11:25:31.820309] W [MSGID: 114031]
[client-rpc-fops_v2.c:911:client4_0_getxattr_cbk]
5-testvol_replicated-client-0: remote operation failed. [{path=/},
{gfid=00000000-0000-0000-0000-000000000001},
{key=glusterfs.xattrop_index_gfid}, {errno=107}, {error=Transport endpoint is
not connected}] 
[2019-10-11 11:25:31.820350] W [MSGID: 114029]
[client-rpc-fops_v2.c:4467:client4_0_getxattr] 5-testvol_replicated-client-0:
failed to send the fop 
[2019-10-11 11:25:31.820366] W [MSGID: 108034]
[afr-self-heald.c:463:afr_shd_index_sweep] 5-testvol_replicated-replicate-0:
unable to get index-dir on testvol_replicated-client-0 
[2019-10-11 11:25:32.601159] I [MSGID: 101218]
[graph.c:1522:glusterfs_process_svc_detach] 0-mgmt: detaching child
shd/testvol_replicated 
[2019-10-11 11:25:32.601338] I [MSGID: 114021] [client.c:2498:notify]
5-testvol_replicated-client-0: current graph is no longer active, destroying
rpc_client  
[2019-10-11 11:25:32.601377] I [MSGID: 114021] [client.c:2498:notify]
5-testvol_replicated-client-1: current graph is no longer active, destroying
rpc_client  
[2019-10-11 11:25:32.601663] I [MSGID: 114018]
[client.c:2398:client_rpc_notify] 5-testvol_replicated-client-1: disconnected
from testvol_replicated-client-1. Client process will keep trying to connect to
glusterd until brick's port is available 
[2019-10-11 11:25:32.601691] W [MSGID: 108001] [afr-common.c:5654:afr_notify]
5-testvol_replicated-replicate-0: Client-quorum is not met 
[2019-10-11 11:25:32.601600] I [MSGID: 114021] [client.c:2498:notify]
5-testvol_replicated-client-2: current graph is no longer active, destroying
rpc_client  
[2019-10-11 11:25:32.602242] I [MSGID: 114018]
[client.c:2398:client_rpc_notify] 5-testvol_replicated-client-2: disconnected
from testvol_replicated-client-2. Client process will keep trying to connect to
glusterd until brick's port is available 
[2019-10-11 11:25:32.602273] E [MSGID: 108006]
[afr-common.c:5358:__afr_handle_child_down_event]
5-testvol_replicated-replicate-0: All subvolumes are down. Going offline until
at least one of them comes back up. 
[2019-10-11 11:25:32.602649] I [io-stats.c:4047:fini] 0-testvol_replicated:
io-stats translator
