[Gluster-devel] DHT NULL pointer usage

Xavier Hernandez xhernandez at datalab.es
Mon Nov 17 15:50:02 UTC 2014


Hi Shyam,

On 11/17/2014 03:50 PM, Shyam wrote:
> On 11/17/2014 07:20 AM, Emmanuel Dreyfus wrote:
>> Hello
>>
>> I have an almost reliable test that fails on NetBSD:
>>
>>> ./tests/basic/ec/quota.t                 (Wstat: 0 Tests: 22 Failed: 3)
>>>    Failed tests:  19-21
>>
>> This one is a real bug: glusterfsd crashed because of a NULL pointer. I
>> am going to submit a change to avoid touhcing postbuf is op_ret = -1 but
>> if someone has a better idea, please let me know.
>
> This should be fine from a DHT perspective. The current tests check if
> the file is under migration, using the postbuf attrs sticky and SGID
> (for the file). If the FOP errore'd out then the code should look to
> check if this is a dht_inode_missing error and then do a phase2 check to
> see if the file migrated to a new target (based on a good/bad op_ret).
>
> Looking at the test, it seems that we hit a quota error on the write. In
> which case following the file even if it migrated does not make sense.
> We should have received an EDQUOT error as the op_errno (if I am not
> mistaken) and exited this function at the check below,
>
> dht_writev_cbk:
> 35:       if (op_ret == -1 && !dht_inode_missing(op_errno)) {
> 36:                goto out;
> 37:        }

It's very strange. The failing test tries to write a file. There should 
be enough quota for the file to be written and it's not being migrated, 
but it seems that the brick is returning ENOENT (at least this is what 
ec_writev_cbk() is receiving). However all other tests writing to files 
seems to work ok, so I'm not sure what is special with this particular test.

>
> Are we not sending an EDQUOT upwards in this case?
>
> Validating assumptions:
> - As this is an fd based operation, the fd should have been open, and so
> why did the writev fail exactly? EDQUOT?

The op_errno is 2 (ENOENT)

> - Was this an NFS based test, which could explain the use of anon-fds
> which would return an error if the file is no longer on the backend. Is
> this a test run from an NFS mount point?

It's a normal FUSE mount.

Xavi


More information about the Gluster-devel mailing list