[Bugs] [Bug 1332396] New: posix: Set correct d_type for readdirp() calls
bugzilla at redhat.com
Tue May 3 06:29:20 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1332396
Bug ID: 1332396
Summary: posix: Set correct d_type for readdirp() calls
Product: GlusterFS
Version: 3.8.0
Component: posix
Keywords: Performance, Triaged
Severity: low
Priority: low
Assignee: bugs at gluster.org
Reporter: ppai at redhat.com
CC: bugs at gluster.org, ndevos at redhat.com, thiago at redhat.com
Depends On: 1175711
+++ This bug was initially created as a clone of Bug #1175711 +++
Description of problem:
os.walk() in Python walks the entire tree under the path given to it.
Internally it calls stat() on each entry to determine whether the entry is a
file or a directory. This additional stat() is not required to make that
determination. An alternative implementation called "scandir.walk()" exists
which is at least 2-3 times faster, because it reads the d_type member of the
dirent structure returned by readdir(). GlusterFS posix xlator does properly
populate the d_type member. Hence it can be accessed/consumed by applications.
https://github.com/benhoyt/scandir
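For illustration only, the two approaches can be sketched with the Python 3.5+
standard library, where os.scandir() plays the role of the scandir module
linked above. This is not the benchmark's code; the function names
count_dirs_stat and count_dirs_scandir are made up for this example.

```python
import os
import stat


def count_dirs_stat(path):
    """Count subdirectories the os.listdir() way: one lstat() per entry."""
    count = 0
    for name in os.listdir(path):
        st = os.lstat(os.path.join(path, name))
        if stat.S_ISDIR(st.st_mode):
            count += 1
    return count


def count_dirs_scandir(path):
    """Count subdirectories via os.scandir().

    DirEntry.is_dir() can answer from d_type when the filesystem supplies
    it, and falls back to a stat call when d_type is DT_UNKNOWN.
    """
    count = 0
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            count += 1
    return count
```

Both functions return the same count; the difference is only in how many
stat calls are needed when the filesystem fills in d_type.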
Version-Release number of selected component (if applicable):
GlusterFS master branch
How reproducible:
Run the benchmark script on glusterfs mount point vs on a xfs mountpoint.
https://github.com/benhoyt/scandir/blob/master/benchmark.py
Actual results:
On XFS:
# python benchmark.py
Using fast C version of scandir
Comparing against builtin version of os.walk()
Priming the system's cache...
Benchmarking walks on benchtree, repeat 1/3...
Benchmarking walks on benchtree, repeat 2/3...
Benchmarking walks on benchtree, repeat 3/3...
os.walk took 0.035s, scandir.walk took 0.019s -- 1.9x as fast
On GlusterFS:
# python benchmark.py
Using fast C version of scandir
Comparing against builtin version of os.walk()
Priming the system's cache...
Benchmarking walks on benchtree, repeat 1/3...
Benchmarking walks on benchtree, repeat 2/3...
Benchmarking walks on benchtree, repeat 3/3...
os.walk took 0.845s, scandir.walk took 0.864s -- 1.0x as fast
Expected results:
scandir.walk() to be faster than os.walk() as it only does readdir() without
doing stat() on each file.
TODO:
Retry with all performance xlators disabled.
--- Additional comment from Niels de Vos on 2014-12-23 07:38:50 EST ---
We need to verify if the 'struct dirent'->d_type is retrieved correctly over
the fuse filesystem. In case it is not, this would be a bug in fuse.
--- Additional comment from Prashanth Pai on 2014-12-23 09:54:04 EST ---
I did check that (using following script) on latest master branch code. It does
fill that.
#!/usr/bin/env python
# Print the d_type of the first entry returned by readdir() for the
# current directory; exit with status 1 if the directory cannot be opened.
import ctypes
import sys

(DT_UNKNOWN, DT_DIR,) = (0, 4,)


class dirent(ctypes.Structure):
    _fields_ = [
        ("d_ino", ctypes.c_long),
        ("d_off", ctypes.c_long),
        ("d_reclen", ctypes.c_ushort),
        ("d_type", ctypes.c_ubyte),
        ("d_name", ctypes.c_char * 256)]

direntp = ctypes.POINTER(dirent)
libc = ctypes.cdll.LoadLibrary("libc.so.6")
# Declare pointer types so the DIR* is not truncated to a C int.
libc.opendir.restype = ctypes.c_void_p
libc.readdir.argtypes = [ctypes.c_void_p]
libc.readdir.restype = direntp

dirp = libc.opendir(b".")
if dirp:
    ep = libc.readdir(dirp)
else:
    sys.exit(1)
print(ep.contents.d_type)
--- Additional comment from Prashanth Pai on 2016-04-27 07:28:51 EDT ---
I was wrong. The above script failed to detect it because d_type is always set
correctly for the "." and ".." entries. GlusterFS correctly propagates d_type
from the posix xlator up the stack to FUSE.
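A check that is not fooled by "." and ".." has to look at every entry. The
following is a rough sketch of such a check, not something that was run as
part of this bug. It assumes 64-bit Linux with glibc (the struct layout and
"libc.so.6" are assumptions), and the name list_d_types is hypothetical.

```python
import ctypes
import os

DT_UNKNOWN = 0


class Dirent(ctypes.Structure):
    # Assumed layout of struct dirent on 64-bit Linux with glibc.
    _fields_ = [
        ("d_ino", ctypes.c_ulong),
        ("d_off", ctypes.c_long),
        ("d_reclen", ctypes.c_ushort),
        ("d_type", ctypes.c_ubyte),
        ("d_name", ctypes.c_char * 256),
    ]


libc = ctypes.CDLL("libc.so.6", use_errno=True)
libc.opendir.argtypes = [ctypes.c_char_p]
libc.opendir.restype = ctypes.c_void_p
libc.readdir.argtypes = [ctypes.c_void_p]
libc.readdir.restype = ctypes.POINTER(Dirent)
libc.closedir.argtypes = [ctypes.c_void_p]
libc.closedir.restype = ctypes.c_int


def list_d_types(path):
    """Return {name: d_type} for every entry except "." and ".."."""
    dirp = libc.opendir(os.fsencode(path))
    if not dirp:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path)
    result = {}
    try:
        while True:
            ep = libc.readdir(dirp)
            if not ep:  # NULL pointer: end of directory
                break
            name = ep.contents.d_name
            if name in (b".", b".."):
                continue
            result[os.fsdecode(name)] = ep.contents.d_type
    finally:
        libc.closedir(dirp)
    return result
```

Any entry that maps to DT_UNKNOWN (0) is one for which the caller would
still need an lstat() to learn the file type.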
It turns out that XFS did not fill in the correct d_type until recently
(Linux >= 3.15 and xfsprogs >= 3.2.0). If the filesystem is formatted with
the newer XFS version 5 on-disk format, d_type is set correctly.
Example: mkfs.xfs -m crc=1 /srv/disk1
However, GlusterFS can fill in the right d_type in readdirp() responses even
when XFS does not, by using the pre-fetched stat information.
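The idea of deriving d_type from stat information can be sketched as below.
This is only an illustration in Python, not the posix xlator code (which is
C); the helper name mode_to_d_type is made up here. It relies on the fact
that on Linux the DT_* values are the file-type bits of st_mode shifted right
by 12, which is what glibc's IFTODT() macro computes.

```python
import os
import stat

# DT_* values as defined in <dirent.h> on Linux.
DT_UNKNOWN, DT_DIR, DT_REG, DT_LNK = 0, 4, 8, 10


def mode_to_d_type(st_mode):
    """Derive a d_type value from st_mode, like glibc's IFTODT() macro."""
    return stat.S_IFMT(st_mode) >> 12


def d_type_of(path):
    """Hypothetical helper: d_type for a path using lstat() data only."""
    return mode_to_d_type(os.lstat(path).st_mode)
```

With this mapping, a layer that already holds the stat result for an entry
can report the type without asking the underlying filesystem for d_type.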
--- Additional comment from Vijay Bellur on 2016-04-27 10:14:05 EDT ---
REVIEW: http://review.gluster.org/14095 (posix: Set correct d_type for
readdirp() calls) posted (#1) for review on master by Prashanth Pai
(ppai at redhat.com)
--- Additional comment from Prashanth Pai on 2016-04-27 10:32:35 EDT ---
Created a nested fs tree of depth = 4 on glusterfs mountpoint.
In the example below, the ls command from coreutils avoids the additional
lstat() when it finds d_type set correctly.
BEFORE http://review.gluster.org/14095:
root# strace -fc -e getdents,lstat ls -fR /mnt/gluster-object/gsmetadata >> /dev/null
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 55.95    0.672307          30     22226           getdents
 44.05    0.529388          24     22224           lstat
------ ----------- ----------- --------- --------- ----------------
100.00    1.201695                 44450           total
AFTER http://review.gluster.org/14095:
root# strace -fc -e getdents,lstat ls -fR /mnt/gluster-object/gsmetadata >> /dev/null
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.595680          27     22226           getdents
------ ----------- ----------- --------- --------- ----------------
100.00    0.595680                 22226           total
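As a quick sanity check on the two strace summaries, the difference can be
worked out as follows. The figures are copied from the BEFORE and AFTER
tables; they come from a single run each, so the ratio is indicative only.

```python
# (calls, seconds) per syscall, copied from the strace -c summaries.
before = {"getdents": (22226, 0.672307), "lstat": (22224, 0.529388)}
after = {"getdents": (22226, 0.595680)}


def totals(table):
    """Sum calls and seconds over all syscalls in one summary."""
    calls = sum(c for c, _ in table.values())
    seconds = sum(s for _, s in table.values())
    return calls, seconds


calls_before, secs_before = totals(before)
calls_after, secs_after = totals(after)

calls_saved = calls_before - calls_after  # every lstat() call goes away
speedup = secs_before / secs_after        # ratio of traced syscall time

print("syscalls saved: %d of %d" % (calls_saved, calls_before))
print("syscall time ratio: %.2fx" % speedup)
```

In other words, the patch removes the lstat() calls entirely, which is about
half of the traced syscalls in this run.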
--- Additional comment from Vijay Bellur on 2016-05-02 01:32:25 EDT ---
REVIEW: http://review.gluster.org/14095 (posix: Set correct d_type for
readdirp() calls) posted (#2) for review on master by Prashanth Pai
(ppai at redhat.com)
--- Additional comment from Vijay Bellur on 2016-05-02 07:48:49 EDT ---
COMMIT: http://review.gluster.org/14095 committed in master by Jeff Darcy
(jdarcy at redhat.com)
------
commit 77def44d497d090ef3f393b6d9403c1a29dcf993
Author: Prashanth Pai <ppai at redhat.com>
Date: Wed Apr 27 13:37:07 2016 +0530
posix: Set correct d_type for readdirp() calls
dirent.d_type can contain the type of the directory entry. The 'd_type'
struct member in dirent is present in Linux and many BSD flavours.
However, filling d_type with correct value requires support from the
underlying filesystem. If not, d_type is set to DT_UNKNOWN. XFS added
support for d_type as part of their newer version 5 on-disk format.
However, this requires Linux >= 3.15, xfsprogs >= 3.2.0 and the bricks
to be formatted using the new format.
This patch enables posix xlator to set d_type to the right value even
when the underlying filesystem does not support it. d_type can be set
using information previously fetched by stat() on the dir entry.
This will aid FUSE applications to leverage d_type to avoid the expense
of calling lstat() if further actions depend on the type of the file.
Refer `man 3 readdir` and `man 2 getdents`
BUG: 1175711
Change-Id: Ic5a262fe4c64122726b4fae2d1bea375c559ca04
Signed-off-by: Prashanth Pai <ppai at redhat.com>
Reviewed-on: http://review.gluster.org/14095
Smoke: Gluster Build System <jenkins at build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
CentOS-regression: Gluster Build System <jenkins at build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy at redhat.com>
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1175711
[Bug 1175711] posix: Set correct d_type for readdirp() calls
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.