[Gluster-devel] add-brick crashes client

Amar Tumballi amarts at redhat.com
Fri Aug 3 08:57:34 UTC 2012


On 08/03/2012 02:23 PM, Emmanuel Dreyfus wrote:
> On Fri, Aug 03, 2012 at 05:13:02AM +0000, Emmanuel Dreyfus wrote:
>> It seems there is a race condition here. Can someone knowledgeable
>> confirm?
>
> I tried this. It does not crash anymore, but the volume gets broken,
> with lookups returning EINVAL (log below). It is therefore probably
> the wrong approach, but hints are welcome.
>
> --- syncop.c.orig       2012-08-03 08:02:35.000000000 +0200
> +++ syncop.c    2012-08-03 10:43:28.000000000 +0200
> @@ -116,8 +116,10 @@
>          /* Do not trust the pointer received. It may be
>             wrong and can lead to crashes. */
>
>          task = synctask_get ();
> +        assert (task != NULL);
> +
>          task->ret = task->syncfn (task->opaque);
>          if (task->synccbk)
>                  task->synccbk (task->ret, task->frame, task->opaque);
>
> @@ -211,8 +213,14 @@
>
>          newtask->ctx.uc_stack.ss_sp   = newtask->stack;
>          newtask->ctx.uc_stack.ss_size = env->stacksize;
>
> +        /*
> +         * synctask_wrap does not trust its argument, and
> +         * uses synctask_get ()
> +         */
> +        synctask_set (newtask);
> +
>          makecontext (&newtask->ctx, (void *) synctask_wrap, 2, newtask);
>
>          newtask->state = SYNCTASK_INIT;
>
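> For context: synctask_get () and synctask_set () look like thin
> wrappers around pthread thread-specific data, and the "do not trust
> the pointer" comment presumably exists because makecontext () passes
> its arguments as ints, which can truncate a pointer on LP64
> platforms. A minimal sketch of what I assume the plumbing looks like
> (the key setup is assumed, not copied from syncop.c):
>
>         #include <pthread.h>
>
>         /* assumed: created once with pthread_key_create () */
>         static pthread_key_t synctask_key;
>
>         void
>         synctask_set (void *synctask)
>         {
>                 /* tag the *calling* thread with this task */
>                 pthread_setspecific (synctask_key, synctask);
>         }
>
>         void *
>         synctask_get (void)
>         {
>                 /* NULL if the calling thread was never tagged */
>                 return pthread_getspecific (synctask_key);
>         }
>
> If that is how it works, synctask_set () tags whichever thread calls
> it, so calling it at creation time tags the creating thread rather
> than the worker thread that later runs synctask_wrap (). That would
> only paper over the race, which may be why lookups start failing.
> (Note also that assert () needs <assert.h> if syncop.c does not
> already include it.)
>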
>
> [2012-08-03 10:46:03.709177] E [afr-common.c:3664:afr_notify] 0-pfs-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
> [2012-08-03 10:46:03.825505] W [dht-layout.c:186:dht_layout_search] 1-pfs-dht: no subvolume for hash (value) = 4177819066
> [2012-08-03 10:46:03.825652] E [dht-common.c:1372:dht_lookup] 1-pfs-dht: Failed to get hashed subvol for /manu
> [2012-08-03 10:46:03.826315] W [fuse-bridge.c:292:fuse_entry_cbk] 0-glusterfs-fuse: 12944: LOOKUP() /manu => -1 (Invalid argument)
> [2012-08-03 10:46:03.827107] W [dht-layout.c:186:dht_layout_search] 1-pfs-dht: no subvolume for hash (value) = 4177819066
>

Looking at the logs, I feel it's because of bug 815227. Can you run a 
'rebalance' operation and see if everything returns to normal?
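
Something like the following, assuming the volume is named 'pfs' as 
the logs suggest:

        # re-spread the layout over the new brick set, then watch progress
        gluster volume rebalance pfs start
        gluster volume rebalance pfs status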

-Amar
