[Gluster-devel] Dht readdir filtering out names

Soumya Koduri skoduri at redhat.com
Fri Sep 30 05:33:58 UTC 2016



On 09/30/2016 10:08 AM, Pranith Kumar Karampuri wrote:
> Does samba/gfapi/nfs-ganesha have options to disable readdirp?

AFAIK, there is currently no option to enable or disable readdirp in
gfapi or nfs-ganesha (I am not sure about samba). However, nfs-ganesha
seems to always use readdir, which I plan to change to readdirp in the
near future to check whether it improves stat performance on small
files. Could you please summarize the issues with using readdirp?

Thanks,
Soumya

>
> On Fri, Sep 30, 2016 at 10:04 AM, Pranith Kumar Karampuri
> <pkarampu at redhat.com> wrote:
>
>     What if the lower xlators want to set entry->inode to NULL and
>     clear entry->d_stat to force a lookup on the name, e.g. for
>     gfid split-brain or ia_type mismatches?
>
>     On Fri, Sep 30, 2016 at 10:00 AM, Raghavendra Gowdappa
>     <rgowdapp at redhat.com> wrote:
>
>
>
>         ----- Original Message -----
>         > From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>         > To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>         > Cc: "Shyam Ranganathan" <srangana at redhat.com>,
>         > "Nithya Balachandran" <nbalacha at redhat.com>,
>         > "Gluster Devel" <gluster-devel at gluster.org>
>         > Sent: Friday, September 30, 2016 9:58:34 AM
>         > Subject: Re: Dht readdir filtering out names
>         >
>         >
>         >
>         > ----- Original Message -----
>         > > From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>         > > To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>         > > Cc: "Shyam Ranganathan" <srangana at redhat.com>,
>         > > "Nithya Balachandran" <nbalacha at redhat.com>,
>         > > "Gluster Devel" <gluster-devel at gluster.org>
>         > > Sent: Friday, September 30, 2016 9:53:44 AM
>         > > Subject: Re: Dht readdir filtering out names
>         > >
>         > > On Fri, Sep 30, 2016 at 9:50 AM, Raghavendra Gowdappa
>         > > <rgowdapp at redhat.com> wrote:
>         > >
>         > > >
>         > > >
>         > > > ----- Original Message -----
>         > > > > From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>         > > > > To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>         > > > > Cc: "Shyam Ranganathan" <srangana at redhat.com>,
>         > > > > "Nithya Balachandran" <nbalacha at redhat.com>,
>         > > > > "Gluster Devel" <gluster-devel at gluster.org>
>         > > > > Sent: Friday, September 30, 2016 9:15:04 AM
>         > > > > Subject: Re: Dht readdir filtering out names
>         > > > >
>         > > > > On Fri, Sep 30, 2016 at 9:13 AM, Raghavendra Gowdappa
>         > > > > <rgowdapp at redhat.com> wrote:
>         > > > >
>         > > > > > dht_readdirp_cbk has different behaviour for directories and files.
>         > > > > >
>         > > > > > 1. If file, pick the dentry (passed from subvols as part of the
>         > > > > > readdirp response) if it corresponds to a data file.
>         > > > > > 2. If directory, pick the dentry if the readdirp response is from
>         > > > > > the hashed-subvol.
>         > > > > >
>         > > > > > In all other cases, the dentry is skipped and not passed to higher
>         > > > > > layers/application. To elaborate, the dentries which are ignored are:
>         > > > > > 1. dentries corresponding to linkto files.
>         > > > > > 2. dentries from non-hashed subvols corresponding to directories.
>         > > > > >
>         > > > > > Since the behaviour is different for different filesystem objects,
>         > > > > > dht needs ia_type to choose its behaviour.
>         > > > > >
>         > > > > > ----- Original Message -----
>         > > > > > > From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>         > > > > > > To: "Shyam Ranganathan" <srangana at redhat.com>,
>         > > > > > > "Raghavendra Gowdappa" <rgowdapp at redhat.com>,
>         > > > > > > "Nithya Balachandran" <nbalacha at redhat.com>
>         > > > > > > Cc: "Gluster Devel" <gluster-devel at gluster.org>
>         > > > > > > Sent: Friday, September 30, 2016 8:39:28 AM
>         > > > > > > Subject: Dht readdir filtering out names
>         > > > > > >
>         > > > > > > hi,
>         > > > > > >        In dht_readdirp_cbk() there is a check about skipping files
>         > > > > > > without ia_type. Could you help me understand why this check was
>         > > > > > > added? There are times when users have to delete the gfid of the
>         > > > > > > entries and trigger something like 'find . | xargs stat' to heal
>         > > > > > > the gfids. This case would fail if we skip entries without gfid,
>         > > > > > > if the lower xlators don't send stat information for them.
>         > > > > >
>         > > > > > Probably we can make readdirp_cbk not rely on ia_type and pass _all_
>         > > > > > dentries received from subvols to the application without filtering.
>         > > > > > However, we should make this behaviour optional and use it only for
>         > > > > > recovery setups. If we don't rely on ia_type (during non-error
>         > > > > > scenarios), applications end up seeing duplicate dentries in the
>         > > > > > readdir listing.
>         > > > > >
>         > > > >
>         > > > > That means dht_readdir() gives duplicate entries? As per the code it
>         > > > > seems like it...
>         > > >
>         > > > No. It follows the filtering logic of "pick dentry only from hashed
>         > > > subvol". This logic doesn't need ia_type. Now that you brought up the
>         > > > topic of dht_readdir, I have another solution for your use case
>         > > > (basically, don't use readdirp :) ):
>         > > >
>         > > > 1. mount glusterfs with "--use-readdirp=no" option.
>         > > > 2. disable md-cache/stat-prefetch as it converts all readdir calls into
>         > > > readdirp calls
>         > > >
>         > >
>         > > Probably the ones in dht as well? i.e. use-readdirp option.
>         >
>         > No. dht doesn't convert a readdir into readdirp. The option you are
>         > referring to might be "readdir-optimize" which is something different.
>
>         Sorry. I was wrong. There is an option in dht too, to force
>         using readdirp. As you said, we should disable that too.
>
>         >
>         > >
>         > >
>         > > >
>         > > > Use this only for recovery setups :).
>         > > >
>         > > > >
>         > > > >
>         > > > > >
>         > > > > > >
>         > > > > > > --
>         > > > > > > Pranith
>         > > > > > >
>         > > > > >
>         > > > > > regards,
>         > > > > > Raghavendra
>         > > > > >
>         > > > >
>         > > > >
>         > > > >
>         > > > > --
>         > > > > Pranith
>         > > > >
>         > > >
>         > >
>         > >
>         > >
>         > > --
>         > > Pranith
>         > >
>         >
>
>
>
>
>     --
>     Pranith
>
>
>
>
> --
> Pranith
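
The filtering rule described in the quoted thread above can be sketched
in a few lines of Python. This is only an illustrative model, not the
actual dht_readdirp_cbk code: the Dentry class, the filter_readdirp
function, and the hashed_subvol table are all hypothetical names, and
the real DHT picks the hashed subvol from the directory layout rather
than from a lookup table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dentry:
    """Hypothetical stand-in for one entry of a readdirp response."""
    name: str
    ia_type: str            # "file", "dir", or "" when no stat was sent
    is_linkto: bool = False


def filter_readdirp(responses, hashed_subvol_of):
    """Return the names that would be passed up to the application.

    responses        -- dict: subvol name -> list of Dentry from that subvol
    hashed_subvol_of -- callable: entry name -> its hashed subvol (assumed)
    """
    listing = []
    for subvol, entries in responses.items():
        for entry in entries:
            if not entry.ia_type:
                # No stat information: the entry is skipped. This is the
                # check the original question in the thread is about.
                continue
            if entry.ia_type == "dir":
                # Directories exist on every subvol; keep only the copy
                # reported by the hashed subvol.
                if hashed_subvol_of(entry.name) == subvol:
                    listing.append(entry.name)
            elif not entry.is_linkto:
                # Files: keep data files, drop linkto files.
                listing.append(entry.name)
    return listing


# Made-up example: directory "d" exists on both subvols, file "f" has its
# data on subvol-1 and a linkto file on subvol-0, and "nogfid" came back
# without stat information.
responses = {
    "subvol-0": [Dentry("d", "dir"), Dentry("f", "file", is_linkto=True),
                 Dentry("nogfid", "")],
    "subvol-1": [Dentry("d", "dir"), Dentry("f", "file")],
}
hashed_subvol = {"d": "subvol-0", "f": "subvol-0", "nogfid": "subvol-0"}

filtered = filter_readdirp(responses, hashed_subvol.get)
unfiltered = [e.name for entries in responses.values() for e in entries]
```

In this model, filtered contains each of "d" and "f" once and drops
"nogfid", while unfiltered lists "d" and "f" twice. That matches the two
concerns raised in the thread: entries without stat information vanish
from the listing, and passing everything through unfiltered produces
duplicate dentries.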

