[Bugs] [Bug 1360785] Direct io to sharded files fails when on zfs backend

bugzilla at redhat.com bugzilla at redhat.com
Thu Jul 28 17:33:25 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1360785

Krutika Dhananjay <kdhananj at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kdhananj at redhat.com,
                   |                            |pkarampu at redhat.com,
                   |                            |sabose at redhat.com



--- Comment #4 from Krutika Dhananjay <kdhananj at redhat.com> ---
Hi,

open() on the affected files appears to be returning ENOENT. However, going by
the find command output you posted on the ovirt-users ML, both the file and its
gfid handle exist in the backend, so the original failure cannot have been
ENOENT. I looked at the posix code again, and there is evidence that the actual
error code (the real reason open() failed) is being masked by the fallback
open() on the path under the .unlink directory:

 30         if (fd->inode->ia_type == IA_IFREG) {
 29                 _fd = open (real_path, fd->flags);
 28                 if (_fd == -1) {
 27                         POSIX_GET_FILE_UNLINK_PATH (priv->base_path,
 26                                                     fd->inode->gfid,
 25                                                     unlink_path);
 24                         _fd = open (unlink_path, fd->flags);
 23                 }
 22                 if (_fd == -1) {
 21                         op_errno = errno;
 20                         gf_msg (this->name, GF_LOG_ERROR, op_errno,
 19                                 P_MSG_READ_FAILED,
 18                                 "Failed to get anonymous "
 17                                 "real_path: %s _fd = %d", real_path, _fd);
 16                         GF_FREE (pfd);
 15                         pfd = NULL;
 14                         goto out;
 13                 }
 12         }

In your case, the open() on line 29 of
.glusterfs/de/b6/deb61291-5176-4b81-8315-3f1cf8e3534d failed for a reason other
than ENOENT (it can't be ENOENT, because the find output already showed that
the file exists). Control then reaches line 27, which builds the path under the
.unlink directory. Since the file exists at its real path, it must be absent
from the .unlink directory (the gfid handle can't be present in both places).
So it is the open() on line 24 that fails with ENOENT, not the one on line 29,
and that ENOENT overwrites errno before it is read on line 21.

I'll be sending a patch to fix this problem.

Meanwhile, in order to understand why the open on line 29 failed, could you
attach all of your bricks to strace, run the test again, wait for it to fail,
and then attach both the strace output files and the resultant glusterfs client
and brick logs here?

# strace -ff -p <pid-of-the-brick> -o <path-where-you-want-to-capture-the-output>
