[Gluster-devel] Regression failure: glusterd segfault in rcu_read_unlock_bp

Kaushal M kshlmster at gmail.com
Thu Apr 16 11:34:06 UTC 2015


Hey Kotresh,

This is a known issue. We are evaluating some possible solutions. The
failure is because of changes introduced by
https://review.gluster.org/10147 .

The GlusterD rpcsvc uses the synctask framework to provide
multi-threading, and GlusterD uses the synclock_t provided by the
synctask framework to implement its big lock.
The synctask framework provides userspace M:N multiplexing, where M
tasks are mapped onto N threads. When a task tries to acquire an
already locked synclock, the framework yields the task and puts it to
sleep so that other tasks can execute. Once the lock can be acquired,
the framework resumes the swapped-out task. The task can be resumed on
a completely different thread from the one on which it was put to
sleep.
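
To make the yield/resume behaviour concrete, here is a minimal sketch of
a synctask-based handler taking the big lock. It assumes the
synclock_t/synclock_lock()/synclock_unlock() API from
libglusterfs/src/syncop.h, so treat it as illustrative only, not actual
glusterd code:

```
/* Illustrative sketch only, not actual glusterd code.  Assumes the
 * synclock_t / synclock_lock() / synclock_unlock() API from
 * libglusterfs/src/syncop.h. */
#include "syncop.h"

static int
handler_task (void *opaque)
{
        synclock_t *big_lock = opaque;

        /* If big_lock is already held, the synctask framework parks this
         * task (saving its user-space stack) and runs other tasks on the
         * same OS thread in the meantime. */
        synclock_lock (big_lock);

        /* ... work done under the big lock ... */

        /* By the time the task is woken up again it may be running on a
         * different OS thread, so per-thread (TLS) state cannot be relied
         * upon across the lock call. */
        synclock_unlock (big_lock);

        return 0;
}
```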

Review 10147 introduced changes to the GlusterD transaction framework
to make peerinfo access within it RCU compatible. In the transaction
framework, GlusterD iterates over the list of peers and sends requests
to the other peers. The pseudo code for this is as below:

```
Transaction Starts
Get BIG_LOCK
.
.
Do other stuff
.
.
rcu_read_lock
for each peer in peers list, do
  Release BIG_LOCK
  Send request to peer
  Get BIG_LOCK
done
rcu_read_unlock
.
Do other stuff
.
.
Release BIG_LOCK
```
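
A slightly more concrete sketch of that loop is below. The peer list
layout, send_req_to_peer() and the lock variable are stand-ins rather
than the exact glusterd code; the urcu list helper
cds_list_for_each_entry_rcu comes from <urcu/rculist.h>:

```
/* Illustrative sketch of the transaction loop.  struct peerinfo,
 * send_req_to_peer() and big_lock are stand-ins for glusterd internals. */
#include <urcu-bp.h>        /* rcu_read_lock()/rcu_read_unlock(), bp flavour */
#include <urcu/rculist.h>   /* cds_list_for_each_entry_rcu() */
#include "syncop.h"         /* synclock_t (assumed) */

struct peerinfo {
        struct cds_list_head list;
        /* ... */
};

static void send_req_to_peer (struct peerinfo *peer);  /* stub */

static void
send_to_all_peers (struct cds_list_head *peers, synclock_t *big_lock)
{
        struct peerinfo *peer = NULL;

        rcu_read_lock ();       /* enter the RCU read-side critical section */
        cds_list_for_each_entry_rcu (peer, peers, list) {
                /* Drop the big lock while submitting the RPC to avoid a
                 * deadlock against the RPC layer. */
                synclock_unlock (big_lock);

                send_req_to_peer (peer);

                /* Reacquiring the big lock may yield this synctask; it can
                 * later resume on a different OS thread... */
                synclock_lock (big_lock);
        }
        rcu_read_unlock ();     /* ...so this may run on a thread that never
                                 * called rcu_read_lock(), which liburcu
                                 * does not support. */
}
```
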
During the iteration, we give up the big lock when sending an RPC
request to prevent a deadlock, and reacquire it after sending the
request. While the transaction thread has given up the big lock,
another thread could have obtained it. The transaction runs as one of
the tasks started by the GlusterD rpcsvc using synctask. So when it
tries to reacquire the big lock after sending the RPC request, synctask
could swap it out and resume it on another thread (as explained above).

If this thread swapping happens, we end up calling rcu_read_lock() on
one thread but rcu_read_unlock() on another. This by itself is a
problem, as liburcu doesn't support a read-side critical section that
starts on one thread and ends on another. With the particular flavour
of liburcu we are using, bulletproof/bp (which no longer seems so
bulletproof), it leads to the crash.
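
For reference, a correct urcu-bp read side looks like the generic
sketch below (not glusterd code); the invariant that the swapped task
violates is that the lock/unlock pair must execute on the same OS
thread:

```
/* Minimal, correct urcu-bp reader: rcu_read_lock() and rcu_read_unlock()
 * are paired on the same thread.  With the bp flavour, thread registration
 * happens lazily inside rcu_read_lock(), so no explicit registration call
 * is needed. */
#include <urcu-bp.h>
#include <stdio.h>

static int *shared_data;     /* updated elsewhere via rcu_assign_pointer() */

static void
reader (void)
{
        int *p = NULL;

        rcu_read_lock ();            /* registers this thread on first use */
        p = rcu_dereference (shared_data);
        if (p)
                printf ("value: %d\n", *p);
        rcu_read_unlock ();          /* must run on the SAME thread; running
                                      * it on another, unregistered thread is
                                      * what crashes glusterd */
}
```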

liburcu requires every thread that will enter a read-side critical
section to register itself first (by calling rcu_register_thread). The
BP flavour does this registration automatically, if required, when
rcu_read_lock is called. In this case rcu_read_lock was called on one
thread, but rcu_read_unlock was called on another thread, which was
unregistered. rcu_read_unlock tried to access TLS variables that would
have been created during thread registration, and segfaulted.
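
With the other flavours (for example the default flavour from
<urcu.h>), that registration has to be done explicitly, once per
thread, roughly as in this generic sketch:

```
/* Generic illustration of explicit per-thread registration, as required
 * by the non-bp flavours of liburcu. */
#include <urcu.h>
#include <pthread.h>

static void *
worker (void *arg)
{
        (void) arg;

        rcu_register_thread ();      /* sets up the per-thread (TLS) state
                                      * that rcu_read_unlock() relies on */

        rcu_read_lock ();
        /* ... read RCU-protected data ... */
        rcu_read_unlock ();

        rcu_unregister_thread ();
        return NULL;
}
```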

Using an alternative flavour of liburcu and manually registering every
thread with urcu would lead to other problems (RCU deadlocks!), as
rcu_read_lock and rcu_read_unlock could still be called from different
threads due to the thread swapping.

We are currently evaluating some possible solutions to this. We are
trying to see if we can prevent the thread from being swapped, as this
is the only way we can get correct liburcu functionality.

I'll update here once we have a better plan.

Thanks,

Kaushal

On Thu, Apr 16, 2015 at 3:29 PM, Kotresh Hiremath Ravishankar
<khiremat at redhat.com> wrote:
> Hi All,
>
> I see a glusterd SEGFAULT for my patch with the following stack trace. I see that it is not related to my patch.
> Could someone look into this? I will retrigger regression for my patch.
>
> #0  0x00007f86f0968d16 in rcu_read_unlock_bp () from /home/kotresh/Downloads/regression/usr/lib64/liburcu-bp.so.1
> (gdb) bt
> #0  0x00007f86f0968d16 in rcu_read_unlock_bp () from /home/kotresh/Downloads/regression/usr/lib64/liburcu-bp.so.1
> #1  0x00007f86f1235467 in gd_commit_op_phase (op=GD_OP_START_VOLUME, op_ctx=0x7f86f9d5a230, req_dict=0x7f86f9d5bf2c, op_errstr=0x7f86e0244260,
>     txn_opinfo=0x7f86e02441e0) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-syncop.c:1360
> #2  0x00007f86f1236366 in gd_sync_task_begin (op_ctx=0x7f86f9d5a230, req=0xcb6b8c)
>     at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-syncop.c:1736
> #3  0x00007f86f123654b in glusterd_op_begin_synctask (req=0xcb6b8c, op=GD_OP_START_VOLUME, dict=0x7f86f9d5a230)
>     at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-syncop.c:1787
> #4  0x00007f86f1221402 in __glusterd_handle_cli_start_volume (req=0xcb6b8c)
>     at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-volume-ops.c:471
> #5  0x00007f86f1190291 in glusterd_big_locked_handler (req=0xcb6b8c, actor_fn=0x7f86f122110d <__glusterd_handle_cli_start_volume>)
>     at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:83
> #6  0x00007f86f12214a3 in glusterd_handle_cli_start_volume (req=0xcb6b8c)
>     at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-volume-ops.c:489
> #7  0x00007f86fc375f66 in synctask_wrap (old_task=0x7f86e0041760) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/syncop.c:375
> #8  0x00007f86fb1508f0 in ?? () from /home/kotresh/Downloads/regression/lib64/libc.so.6
> #9  0x0000000000000000 in ?? ()
>
>
> Link to the core file:
> http://slave27.cloud.gluster.org/archived_builds/build-install-20150416:07:11:15.tar.bz2
>
>
> Thanks and Regards,
> Kotresh H R
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel

