[Bugs] [Bug 1749305] New: Failures in remove-brick due to [Input/output error] errors
bugzilla at redhat.com
Thu Sep 5 10:26:28 UTC 2019
https://bugzilla.redhat.com/show_bug.cgi?id=1749305
Bug ID: 1749305
Summary: Failures in remove-brick due to [Input/output error] errors
Product: GlusterFS
Version: 7
Status: NEW
Component: replicate
Severity: high
Assignee: bugs at gluster.org
Reporter: rkavunga at redhat.com
CC: bugs at gluster.org, ksubrahm at redhat.com,
nchilaka at redhat.com, rhs-bugs at redhat.com,
rkavunga at redhat.com, saraut at redhat.com,
spalai at redhat.com, storage-qa-internal at redhat.com
Depends On: 1726673, 1728770
Target Milestone: ---
Classification: Community
+++ This bug was initially created as a clone of Bug #1728770 +++
+++ This bug was initially created as a clone of Bug #1726673 +++
Description of problem:
While performing remove-brick to convert a 3X3 volume into a 2X3 volume, the
remove-brick rebalance reported failures such as:
E [MSGID: 114031] [client-rpc-fops_v2.c:2540:client4_0_opendir_cbk] 0-vol4-client-8: remote operation failed. Path: /dir1/thread0/level03/level13/level23/level33/level43 (69e97af3-d2d7-450a-881e-0c4ef6ac1355) [Input/output error]
Version-Release number of selected component (if applicable):
6.0.7
How reproducible:
1/1
Steps to Reproduce:
1. Create a 1X3 volume.
2. Fuse-mount the volume and start I/O on it.
3. Convert it into a 2X3 volume and trigger rebalance.
4. Let the rebalance complete, then convert it into a 3X3 volume and trigger
rebalance again.
5. After that rebalance completes, start a remove-brick operation to convert
the volume back into a 2X3 volume.
6. Check the remove-brick status.
Actual results:
There are failures in remove-brick rebalance.
Errors from rebalance logs:
E [MSGID: 114031] [client-rpc-fops_v2.c:2540:client4_0_opendir_cbk]
0-vol4-client-2: remote operation failed. Path:
/dir1/thread0/level03/level13/level23/level33/level43
(69e97af3-d2d7-450a-881e-0c4ef6ac1355) [Input/output error]
E [MSGID: 114031] [client-rpc-fops_v2.c:2540:client4_0_opendir_cbk]
0-vol4-client-8: remote operation failed. Path:
/dir1/thread0/level03/level13/level23/level33/level43
(69e97af3-d2d7-450a-881e-0c4ef6ac1355) [Input/output error]
W [MSGID: 114031] [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk]
0-vol4-client-8: remote operation failed. Path:
/dir1/thread0/level03/level13/level23/level33/level43/level53/5d1b1579%%P3TRO7PG35
(558423e2-478e-40e9-9958-31c710e50b89) [Input/output error]
W [MSGID: 114031] [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk]
0-vol4-client-2: remote operation failed. Path:
/dir1/thread0/level03/level13/level23/level33/level43
(69e97af3-d2d7-450a-881e-0c4ef6ac1355) [Input/output error]
Expected results:
Remove-brick should complete successfully.
Remove-brick rebalance status:
==============================
# gluster v remove-brick vol4 replica 3 10.70.47.88:/bricks/brick2/vol4-b2
10.70.47.190:/bricks/brick2/vol4-b2 10.70.47.5:/bricks/brick2/vol4-b2 status
Node            Rebalanced-files    size     scanned    failures    skipped    status       run time in h:m:s
---------       ----------------    -----    -------    --------    -------    ---------    -----------------
10.70.47.190                3463    3.5MB      18425          23          0    completed              0:37:14
10.70.47.5                  3308    3.7MB      21920         136          0    completed              0:32:59
localhost                   3397    3.3MB      21977         138          0    completed              0:33:35
Checking the volume status showed that two bricks are down:
=================================================================
# gluster v status vol4
Status of volume: vol4
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 10.70.47.88:/bricks/brick2/vol4-b1 49159 0 Y 30394
Brick 10.70.47.190:/bricks/brick2/vol4-b1 49159 0 Y 29191
Brick 10.70.47.5:/bricks/brick2/vol4-b1 N/A N/A N N/A
Brick 10.70.46.246:/bricks/brick2/vol4-b1 49158 0 Y 22598
Brick 10.70.47.188:/bricks/brick2/vol4-b1 49158 0 Y 22865
Brick 10.70.46.63:/bricks/brick2/vol4-b1 49158 0 Y 21036
Brick 10.70.47.88:/bricks/brick2/vol4-b2 49160 0 Y 5938
Brick 10.70.47.190:/bricks/brick2/vol4-b2 49160 0 Y 4825
Brick 10.70.47.5:/bricks/brick2/vol4-b2 N/A N/A N N/A
Self-heal Daemon on localhost N/A N/A Y 6330
Self-heal Daemon on 10.70.46.246 N/A N/A Y 5672
Self-heal Daemon on 10.70.47.5 N/A N/A Y 5600
Self-heal Daemon on 10.70.46.63 N/A N/A Y 4593
Self-heal Daemon on 10.70.47.188 N/A N/A Y 4501
Self-heal Daemon on 10.70.47.190 N/A N/A Y 5352
Task Status of Volume vol4
------------------------------------------------------------------------------
Task : Remove brick
ID : 273f04c3-b8bb-4613-a403-0c655de86ca3
Removed bricks:
10.70.47.88:/bricks/brick2/vol4-b2
10.70.47.190:/bricks/brick2/vol4-b2
10.70.47.5:/bricks/brick2/vol4-b2
Status : completed
dmesg:
=====
[161039.214245] XFS (dm-66): Metadata CRC error detected at xfs_dir3_block_read_verify+0x5e/0x110 [xfs], xfs_dir3_block block 0x1dd8568
[161039.214912] XFS (dm-66): Unmount and run xfs_repair
[161039.215126] XFS (dm-66): First 64 bytes of corrupted metadata buffer:
[161039.215426] ffffbb1db27a6000: 20 20 20 20 20 23 20 51 75 69 63 6b 20 4d 61 69       # Quick Mai
[161039.215729] ffffbb1db27a6010: 6c 20 54 72 61 6e 73 66 65 72 20 50 72 6f 74 6f  l Transfer Proto
[161039.216110] ffffbb1db27a6020: 63 6f 6c 0a 71 6d 74 70 20 20 20 20 20 20 20 20  col.qmtp
[161039.216527] ffffbb1db27a6030: 20 20 20 20 32 30 39 2f 75 64 70 20 20 20 20 20      209/udp
[161039.217200] XFS (dm-66): metadata I/O error: block 0x1dd8568 ("xfs_trans_read_buf_map") error 74 numblks 16
[161039.217937] XFS (dm-66): xfs_do_force_shutdown(0x1) called from line 370 of file fs/xfs/xfs_trans_buf.c. Return address = 0xffffffffc057de9a
[161039.344196] XFS (dm-66): I/O Error Detected. Shutting down filesystem
[161039.344495] XFS (dm-66): Please umount the filesystem and rectify the problem(s)
---> Due to the disk issue, one brick is down in two of the replica pairs of
the volume, but since this is a distributed-replicated volume, rebalance should
still not see failures.
Failure reason:
"[2019-07-02 08:32:01.514139] W [MSGID: 109023]
[dht-rebalance.c:626:__is_file_migratable] 0-vol4-dht: Mi
grate file
failed:/dir1/thread0/level04/level14/level24/level34/level44/level54/level64/level74/level84/
symlink_to_files/5d1b15ed%%XS3OMQKQBN: Unable to get lock count for file
"
The key GLUSTERFS_POSIXLK_COUNT is used to fetch the lock count from the
posix-locks translator. This information is used to decide whether or not to
migrate the file.
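To make that decision path concrete, here is a minimal, self-contained sketch
(plain C; the struct, key string, and function names are invented stand-ins for
illustration, not the real GlusterFS dict_t/DHT code) of how the rebalance
check behaves when the lock-count key is requested in lookup: a missing key is
treated as a hard failure, which matches the "Unable to get lock count for
file" message in the rebalance log above.
<code>
/* Toy model of the rebalance-side lock-count check; NOT the real GlusterFS
 * code.  A tiny string->int "dict" stands in for dict_t, and
 * check_migratable() stands in for the check in __is_file_migratable(). */
#include <stdio.h>
#include <string.h>

#define MAX_KEYS 8
#define LOCK_COUNT_KEY "posixlk-count" /* stand-in for GLUSTERFS_POSIXLK_COUNT */

struct toy_dict {
    const char *keys[MAX_KEYS];
    int vals[MAX_KEYS];
    int n;
};

static void dict_set(struct toy_dict *d, const char *key, int val)
{
    d->keys[d->n] = key;
    d->vals[d->n] = val;
    d->n++;
}

/* Returns 0 and fills *val if the key exists, -1 otherwise. */
static int dict_get(const struct toy_dict *d, const char *key, int *val)
{
    for (int i = 0; i < d->n; i++) {
        if (strcmp(d->keys[i], key) == 0) {
            *val = d->vals[i];
            return 0;
        }
    }
    return -1;
}

/* Migration is allowed only when the lookup reply carries a lock count of
 * zero.  A missing key is a failure, matching the rebalance log message. */
static int check_migratable(const struct toy_dict *lookup_reply, const char *path)
{
    int locks = 0;

    if (dict_get(lookup_reply, LOCK_COUNT_KEY, &locks) != 0) {
        fprintf(stderr, "Migrate file failed: %s: Unable to get lock count for file\n", path);
        return -1;
    }
    if (locks > 0) {
        fprintf(stderr, "Migrate file failed: %s: file has %d active locks\n", path, locks);
        return -1;
    }
    return 0;
}

int main(void)
{
    struct toy_dict good_reply = {{0}, {0}, 0};
    struct toy_dict healed_reply = {{0}, {0}, 0}; /* key never populated: the bug here */

    dict_set(&good_reply, LOCK_COUNT_KEY, 0);

    check_migratable(&good_reply, "/dir1/file-a");   /* migratable */
    check_migratable(&healed_reply, "/dir1/file-b"); /* fails: key missing */
    return 0;
}
</code>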
In the current scenario, as Sayalee mentioned, one disk is corrupted on server
*.5, rendering both participating bricks from that server unresponsive (all
operations lead to I/O errors). Given that only one brick from each of the two
replica sets was down, DHT should still have received a valid response. In
fact, the key was missing from the response dictionary altogether.
Moving to AFR component for analysis.
Adding a needinfo on Rafi, as he had done some investigation on the same.
--- Additional comment from Mohammed Rafi KC on 2019-07-10 16:08:22 UTC ---
RCA:
As mentioned in comment 6, the operation failed because the lookup could not
return the lock count requested through GLUSTERFS_POSIXLK_COUNT. While
processing afr_lookup_cbk, if a name heal is required, we perform it in
afr_lookup_selfheal_wrap, wiping all of the current lookup replies, and once
the heal finishes we return fresh data from a new lookup. But that fresh lookup
is issued without the original xdata_req, so posix is never asked to populate
the lock count.
<code>
int
afr_lookup_selfheal_wrap(void *opaque)
{
    int ret = 0;
    call_frame_t *frame = opaque;
    afr_local_t *local = NULL;
    xlator_t *this = NULL;
    inode_t *inode = NULL;
    uuid_t pargfid = {
        0,
    };

    local = frame->local;
    this = frame->this;
    loc_pargfid(&local->loc, pargfid);

    ret = afr_selfheal_name(frame->this, pargfid, local->loc.name,
                            &local->cont.lookup.gfid_req, local->xattr_req);
    if (ret == -EIO)
        goto unwind;

    afr_local_replies_wipe(local, this->private);

    /* The fresh lookup after the name heal is issued with a NULL xdata
     * request, so the keys from the original lookup's xattr_req (such as
     * GLUSTERFS_POSIXLK_COUNT) are never requested from the bricks. */
    inode = afr_selfheal_unlocked_lookup_on(frame, local->loc.parent,
                                            local->loc.name, local->replies,
                                            local->child_up, NULL);
    if (inode)
        inode_unref(inode);

    afr_lookup_metadata_heal_check(frame, this);
    return 0;

unwind:
    AFR_STACK_UNWIND(lookup, frame, -1, EIO, NULL, NULL, NULL, NULL);
    return 0;
}
</code>
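A rough, self-contained sketch of the consequence of that missing xdata_req,
and of what forwarding it achieves (again plain C with invented names; a toy
model only, not the AFR code or the actual patch): the lookup re-issued after
the name heal only gets the lock-count key populated if the caller's request
keys are forwarded to it.
<code>
/* Toy model of the lookup-after-name-heal path; NOT the real AFR code.
 * A "keyset" stands in for xattr_req/xdata, and the reply is the set of
 * keys the (simulated) brick populated in response. */
#include <stdio.h>
#include <string.h>

#define MAX_KEYS 8
#define LOCK_COUNT_KEY "posixlk-count" /* stand-in for GLUSTERFS_POSIXLK_COUNT */

struct keyset {
    const char *keys[MAX_KEYS];
    int n;
};

static void keyset_add(struct keyset *s, const char *key)
{
    s->keys[s->n++] = key;
}

static int keyset_has(const struct keyset *s, const char *key)
{
    for (int i = 0; i < s->n; i++)
        if (strcmp(s->keys[i], key) == 0)
            return 1;
    return 0;
}

/* Simulated brick-side lookup: it only populates keys it was asked for. */
static struct keyset brick_lookup(const struct keyset *request)
{
    struct keyset reply = {{0}, 0};

    if (request && keyset_has(request, LOCK_COUNT_KEY))
        keyset_add(&reply, LOCK_COUNT_KEY);
    return reply;
}

/* Lookup re-issued after a name heal.  With forward_xdata == 0 the request
 * keys are dropped (the bug); with 1 they are forwarded (the fix). */
static struct keyset lookup_after_name_heal(const struct keyset *orig_request,
                                            int forward_xdata)
{
    return brick_lookup(forward_xdata ? orig_request : NULL);
}

int main(void)
{
    struct keyset request = {{0}, 0};
    struct keyset buggy;
    struct keyset fixed;

    keyset_add(&request, LOCK_COUNT_KEY); /* what DHT asked for in the lookup */

    buggy = lookup_after_name_heal(&request, 0);
    fixed = lookup_after_name_heal(&request, 1);

    printf("buggy path reply carries lock count: %s\n",
           keyset_has(&buggy, LOCK_COUNT_KEY) ? "yes" : "no");
    printf("fixed path reply carries lock count: %s\n",
           keyset_has(&fixed, LOCK_COUNT_KEY) ? "yes" : "no");
    return 0;
}
</code>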
--- Additional comment from Worker Ant on 2019-07-10 16:22:14 UTC ---
REVIEW: https://review.gluster.org/23024 (afr/lookup: Pass xattr_req in while
doing a selfheal in lookup) posted (#1) for review on master by mohammed rafi
kc
--- Additional comment from Worker Ant on 2019-09-05 09:53:57 UTC ---
REVIEW: https://review.gluster.org/23024 (afr/lookup: Pass xattr_req in while
doing a selfheal in lookup) merged (#15) on master by Ravishankar N
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1726673
[Bug 1726673] Failures in remove-brick due to [Input/output error] errors
https://bugzilla.redhat.com/show_bug.cgi?id=1728770
[Bug 1728770] Failures in remove-brick due to [Input/output error] errors