[Bugs] [Bug 1734252] Heal not completing after geo-rep session is stopped on EC volumes.

bugzilla at redhat.com bugzilla at redhat.com
Tue Jul 30 05:02:23 UTC 2019


--- Comment #2 from Ashish Pandey <aspandey at redhat.com> ---
Here is the root cause of the issue:

When features.read-only is enabled, ro_fxattrop checks the following
condition:
    if (is_readonly_or_worm_enabled(frame, this) && !allzero)                   
        STACK_UNWIND_STRICT(fxattrop, frame, -1, EROFS, NULL, xdata);  

Here, is_readonly_or_worm_enabled(frame, this) returns "false" for shd when
frame->root->pid < 0; the frame used for healing is expected to have its pid
set to "-6".

However, in this case frame->root->pid arrives with the value "0".
The pid check therefore fails (0 < 0 is false), so the function returns "true"
and the volume is treated as read-only for the shd process as well.
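
As an illustration only, the check described above can be modelled in a few
lines of standalone C. This is a simplified sketch, not the GlusterFS source:
the model_* types, the MODEL_SHD_PID constant and the function names are
hypothetical stand-ins for call_frame_t, the shd pid of -6 and the read-only
xlator's check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the real frame types (assumption). */
struct model_root {
    int pid;
};

struct model_frame {
    struct model_root *root;
};

/* The pid the comment says is used for healing frames. */
#define MODEL_SHD_PID (-6)

/* Models the read-only check: internal clients (negative pid) are exempt. */
static bool
model_is_readonly(const struct model_frame *frame, bool readonly_enabled)
{
    assert(frame && frame->root);

    if (frame->root->pid < 0)
        return false;

    return readonly_enabled;
}

/* Models the fxattrop guard: 0 means "let it through", -1 means "rejected"
 * (the real code unwinds with EROFS). */
static int
model_fxattrop(const struct model_frame *frame, bool readonly_enabled,
               bool allzero)
{
    if (model_is_readonly(frame, readonly_enabled) && !allzero)
        return -1;

    return 0;
}
```

With read-only enabled and a non-zero xattr update, a frame whose pid is
MODEL_SHD_PID passes the guard, while a frame whose pid is 0 is rejected,
which matches the behaviour seen in this bug.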

Why does this happen?

When shd triggers a heal for the file, it finds that there is nothing to
heal, so it calls "ec_data_undo_pending" to remove the dirty flag for data,
which in turn calls syncop_fxattrop->SYNCOP.

We do not pass a frame to this SYNCOP, so it gets the frame from the task:
        task = synctask_get();                                                
        stb->task = task;                                                     
        if (task)                                                             
            frame = task->opframe;   

However, when creating the task we passed the frame as NULL:

ec_launch_heal(ec_t *ec, ec_fop_data_t *fop)
    int ret = 0;

    ret = synctask_new(ec->xl->ctx->env, ec_synctask_heal_wrap, ec_heal_done,
                       NULL, fop);

So synctask_create creates the task with a new frame, but it does not set
frame->root->pid to -6, so the pid stays "0".
This is the value checked in "is_readonly_or_worm_enabled", which therefore
reports read-only as TRUE, and the heal (fxattrop) fails with a "read-only
file system" error.
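
The frame handling described above can be sketched as a small standalone
model. Everything here is hypothetical (the sketch_* names are not GlusterFS
APIs), and it assumes, as a simplification, that a task created without a
caller-supplied frame gets a freshly zeroed frame, while a supplied frame is
used as-is.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified frame and task types (assumption). */
struct sketch_frame {
    int pid;
};

struct sketch_task {
    struct sketch_frame *opframe;
    int owns_frame; /* 1 if the task allocated opframe itself */
};

/* Models task creation: with no caller frame, a new zeroed frame is made,
 * so its pid is 0 rather than the shd pid of -6. */
static struct sketch_task *
sketch_task_new(struct sketch_frame *frame)
{
    struct sketch_task *task = calloc(1, sizeof(*task));

    if (!task)
        return NULL;

    if (frame) {
        task->opframe = frame;
    } else {
        task->opframe = calloc(1, sizeof(*task->opframe));
        if (!task->opframe) {
            free(task);
            return NULL;
        }
        task->owns_frame = 1;
    }

    return task;
}

/* Models what the sync operation sees: the pid of the task's frame. */
static int
sketch_task_pid(const struct sketch_task *task)
{
    assert(task && task->opframe);
    return task->opframe->pid;
}

static void
sketch_task_free(struct sketch_task *task)
{
    if (!task)
        return;
    if (task->owns_frame)
        free(task->opframe);
    free(task);
}
```

In this model, sketch_task_new(NULL) yields a task whose pid reads as 0,
whereas passing a frame whose pid was set to -6 beforehand keeps that value
visible to the operation.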

When features.read-only is not enabled, the read-only xlator is not loaded
and this condition is never checked, so the fxattrop sent by
"ec_data_undo_pending" succeeds and removes the dirty [data part] flag.
