[Gluster-devel] Errors in gf_defrag_handle_migrate_error() ?
Vijay Bellur
vbellur at redhat.com
Mon Jan 5 11:55:29 UTC 2015
On 01/03/2015 05:59 PM, Vijay Bellur wrote:
> gf_defrag_handle_migrate_error() seems broken at the moment.
>
> /* if errno is not ENOSPC or ENOTCONN, we can still continue
> with rebalance process */
> if ((op_errno != ENOSPC) || (op_errno != ENOTCONN))
> return 1;
>
> if (op_errno == ENOTCONN) {
> /* Most probably mount point went missing (mostly due
> to a brick down), say rebalance failure to user,
> let him restart it if everything is fine */
> defrag->defrag_status = GF_DEFRAG_STATUS_FAILED;
> return -1;
> }
>
> if (op_errno == ENOSPC) {
> /* rebalance process itself failed, may be
> remote brick went down, or write failed due to
> disk full etc etc.. */
> defrag->defrag_status = GF_DEFRAG_STATUS_FAILED;
> return -1;
> }
>
> return 0;
>
> The first if condition should have a logical AND instead of an OR. It
> looks like we will still continue (return 1) if op_errno happens to be
> anything, even if it is ENOSPC or ENOTCONN. Do we need any special
> handling for ENOTCONN & ENOSPC? We anyway abort a rebalance process if
> we were to get a CHILD_DOWN and that should handle errors with op_errno
> ENOTCONN. If a write fails because of ENOSPC in one brick, shouldn't we
> let rebalance continue and migrate data to other non-full bricks? That
> seems to be the intent in gf_defrag_migrate_data() where we update the
> skipped counter if an ENOSPC is encountered. If we don't need to accord
> special behavior to these errnos, this function can be retired.
>
Sent [1] to rfc queue. Please check and let me know your comments.
-Vijay
[1] http://review.gluster.org/9384
More information about the Gluster-devel
mailing list