[Gluster-devel] Bitrot stub forget()
Venky Shankar
vshankar at redhat.com
Thu Feb 18 04:14:27 UTC 2016
On Tue, Feb 16, 2016 at 11:38:24PM -0500, FNU Raghavendra Manjunath wrote:
> Venky,
>
> Yes. You are right. We should not remove the quarantine entry in forget.
>
> We have to remove it upon getting -ve lookups in bit-rot-stub and upon
> getting an unlink.
>
> I have attached a patch for it.
>
> Unfortunately rfc.sh is failing for me with the below error.
Thanks for the patch. One of us will take care of putting it up for review.
>
>
> ssh: connect to host git.gluster.com port 22: Connection timed out
> fatal: Could not read from remote repository.
>
> Please make sure you have the correct access rights
> and the repository exists."
>
>
> Regards,
> Raghavendra
>
>
> On Tue, Feb 16, 2016 at 10:53 AM, Venky Shankar <vshankar at redhat.com> wrote:
>
> > Hey Raghu,
> >
> > Bitrot stub inode forget implementation (br_stub_forget()) deletes the bad
> > object
> > marker (under quarantine directory) if present. This looks incorrect as
> > ->forget()
> > can be trigerred when inode table LRU size exceeeds configured limit -
> > check bug
> > #1308961 which tracks this issue. I recall that protocol/server calls
> > inode_forget()
> > on negative lookup (that might not invoke ->forget() though) and that's
> > the reason
> > why br_stub_forget() has this code.
> >
> > So, would it make sense to purge bad object marker just in lookup()? There
> > might be
> > a need to do the same in unlink() in case the object was removed by the
> > client.
> >
> > Thoughts?
> >
> > Thanks,
> >
> > Venky
> >
> From a0cc49172df24e263e0db25c53b57f58c19d2cab Mon Sep 17 00:00:00 2001
> From: Raghavendra Bhat <raghavendra at redhat.com>
> Date: Tue, 16 Feb 2016 20:22:36 -0500
> Subject: [PATCH] features/bitrot: do not remove the quarantine handle in
> forget
>
> If an object is marked as bad, then an entry is corresponding to the
> bad object is created in the .glusterfs/quarantine directory to help
> scrub status. The entry name is the gfid of the corrupted object.
> The quarantine handle is removed in below 2 cases.
>
> 1) When protocol/server revceives the -ve lookup on an entry whose inode
> is there in the inode table (it can happen when the corrupted object
> is deleted directly from the backend for recovery purpose) it sends a
> forget on the inode and bit-rot-stub removes the quarantine handle in
> upon getting the forget.
> refer to the below commit
> f853ed9c61bf65cb39f859470a8ffe8973818868:
> http://review.gluster.org/12743)
>
> 2) When bit-rot-stub itself realizes that lookup on a corrupted object
> has failed with ENOENT.
>
> But with step1, there is a problem when the bit-rot-stub receives forget
> due to lru limit exceeding in the inode table. In such cases, though the
> corrupted object is not deleted (either from the mount point or from the
> backend), the handle in the quarantine directory is removed and that object
> is not shown in the bad objects list in the scrub status command.
>
> So it is better to follow only 2nd step (i.e. bit-rot-stub removing the handle
> from the quarantine directory in -ve lookups). Also the handle has to be removed
> when a corrupted object is unlinked from the mount point itself.
>
> Change-Id: Ibc3bbaf4bc8a5f8986085e87b729ab912cbf8cf9
> Signed-off-by: Raghavendra Bhat <raghavendra at redhat.com>
> ---
> xlators/features/bit-rot/src/stub/bit-rot-stub.c | 103 +++++++++++++++++++++--
> 1 file changed, 95 insertions(+), 8 deletions(-)
>
> diff --git a/xlators/features/bit-rot/src/stub/bit-rot-stub.c b/xlators/features/bit-rot/src/stub/bit-rot-stub.c
> index 9c0f622..920aa7c 100644
> --- a/xlators/features/bit-rot/src/stub/bit-rot-stub.c
> +++ b/xlators/features/bit-rot/src/stub/bit-rot-stub.c
> @@ -2841,19 +2841,46 @@ unwind:
>
> /** {{{ */
>
> -/* forget() */
> +/* unlink() */
>
> int
> -br_stub_forget (xlator_t *this, inode_t *inode)
> +br_stub_unlink_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
> + int32_t op_ret, int32_t op_errno, struct iatt *preparent,
> + struct iatt *postparent, dict_t *xdata)
> {
> - uint64_t ctx_addr = 0;
> - br_stub_inode_ctx_t *ctx = NULL;
> + br_stub_local_t *local = NULL;
> + inode_t *inode = NULL;
> + uint64_t ctx_addr = 0;
> + br_stub_inode_ctx_t *ctx = NULL;
> + int32_t ret = -1;
>
> - inode_ctx_del (inode, this, &ctx_addr);
> - if (!ctx_addr)
> - return 0;
> + if (op_ret < 0)
> + goto unwind;
>
> - ctx = (br_stub_inode_ctx_t *) (long) ctx_addr;
> + local = frame->local;
> + frame->local = NULL;
> +
> + inode = local->u.context.inode;
> +
> + ret = br_stub_get_inode_ctx (this, inode, &ctx_addr);
> + if (ret) {
> + /**
> + * If the inode is bad AND context is not there, then there
> + * is a possibility of the gfid of the object being listed
> + * in the quarantine directory and will be shown in the
> + * bad objects list. So continuing with the fop with a
> + * warning log. The entry from the quarantine directory
> + * has to be removed manually. Its not a good idea to fail
> + * the fop, as the object has already been deleted.
> + */
> + gf_msg (this->name, GF_LOG_WARNING, 0,
> + BRS_MSG_GET_INODE_CONTEXT_FAILED,
> + "failed to get the context for the inode %s",
> + uuid_utoa (inode->gfid));
> + goto unwind;
> + }
> +
> + ctx = (br_stub_inode_ctx_t *)(long)ctx_addr;
>
> LOCK (&inode->lock);
> {
> @@ -2868,6 +2895,65 @@ br_stub_forget (xlator_t *this, inode_t *inode)
> }
> UNLOCK (&inode->lock);
>
> +unwind:
> + STACK_UNWIND_STRICT (unlink, frame, op_ret, op_errno, preparent, postparent, xdata);
> + br_stub_cleanup_local (local);
> + br_stub_dealloc_local (local);
> + return 0;
> +}
> +
> +int
> +br_stub_unlink (call_frame_t *frame, xlator_t *this, loc_t *loc, int flag,
> + dict_t *xdata)
> +{
> + br_stub_local_t *local = NULL;
> + int32_t op_ret = -1;
> + int32_t op_errno = 0;
> +
> + local = br_stub_alloc_local (this);
> + if (!local) {
> + op_ret = -1;
> + op_errno = ENOMEM;
> + gf_msg (this->name, GF_LOG_ERROR, ENOMEM, BRS_MSG_NO_MEMORY,
> + "failed to allocate memory for local (path: %s, gfid: %s)",
> + loc->path, uuid_utoa (loc->inode->gfid));
> + goto unwind;
> + }
> +
> + br_stub_fill_local (local, NULL, NULL, loc->inode,
> + loc->inode->gfid,
> + BR_STUB_NO_VERSIONING, 0);
> +
> + frame->local = local;
> +
> + STACK_WIND (frame, br_stub_unlink_cbk, FIRST_CHILD (this),
> + FIRST_CHILD (this)->fops->unlink, loc, flag, xdata);
> + return 0;
> +
> +unwind:
> + STACK_UNWIND_STRICT (unlink, frame, op_ret, op_errno, NULL, NULL, NULL);
> + return 0;
> +}
> +
> +
> +/** }}} */
> +
> +/** {{{ */
> +
> +/* forget() */
> +
> +int
> +br_stub_forget (xlator_t *this, inode_t *inode)
> +{
> + uint64_t ctx_addr = 0;
> + br_stub_inode_ctx_t *ctx = NULL;
> +
> + inode_ctx_del (inode, this, &ctx_addr);
> + if (!ctx_addr)
> + return 0;
> +
> + ctx = (br_stub_inode_ctx_t *) (long) ctx_addr;
> +
> GF_FREE (ctx);
>
> return 0;
> @@ -3133,6 +3219,7 @@ struct xlator_fops fops = {
> .setxattr = br_stub_setxattr,
> .opendir = br_stub_opendir,
> .readdir = br_stub_readdir,
> + .unlink = br_stub_unlink,
> };
>
> struct xlator_cbks cbks = {
> --
> 2.5.0
>
More information about the Gluster-devel
mailing list