[Bugs] [Bug 1164218] New: glfs_set_volfile_server() method causes segmentation fault when bad arguments are passed.

bugzilla at redhat.com bugzilla at redhat.com
Fri Nov 14 11:32:05 UTC 2014


https://bugzilla.redhat.com/show_bug.cgi?id=1164218

            Bug ID: 1164218
           Summary: glfs_set_volfile_server() method causes segmentation
                    fault when bad arguments are passed.
           Product: GlusterFS
           Version: 3.5.1
         Component: libgfapi
          Keywords: Regression
          Severity: urgent
          Assignee: bugs at gluster.org
          Reporter: dnarayan at redhat.com
        QA Contact: sdharane at redhat.com
                CC: bazulay at redhat.com, bugs at gluster.org, bugs at ovirt.org,
                    daniel.guettes at interactivedata.com, danken at redhat.com,
                    dnarayan at redhat.com, ecohen at redhat.com,
                    gklein at redhat.com, gluster-bugs at redhat.com,
                    iheim at redhat.com, lsurette at redhat.com,
                    mgoldboi at redhat.com, penguin.wrangler at gmail.com,
                    rbalakri at redhat.com, yeylon at redhat.com
        Depends On: 1162640
            Blocks: 1155170 (oVirt_3.5.1_tracker)



+++ This bug was initially created as a clone of Bug #1162640 +++

Description of problem:
supervdsm encounters a segfault in libgfapi, causing all VMs on that host to
enter a paused state and remain stuck there forever, even though supervdsm is
restarted automatically (by systemd). High-availability VMs are neither
migrated nor restarted. The paused VMs cannot be manually un-paused; they must
be manually killed and then restarted.

Version-Release number of selected component (if applicable):
vdsm-4.16.7-1.gitdb83943.fc20.x86_64
vdsm-gluster-4.16.7-1.gitdb83943.fc20.noarch
glusterfs-api-3.5.2-1.fc20.x86_64

How reproducible:
Consistent since the upgrade to oVirt release 3.5. oVirt nodes remain
operational for anywhere from 8 hours down to as little as 30 minutes between
instances of the above-described segfaults.

The issue appears to occur while supervdsm is looping over all gluster volumes
and retrieving their details. This happens regularly (every 5 minutes, it
seems), usually without crashing, but then randomly does crash. It should also
be noted that the volume being queried when the crash occurs is not consistent.
It looks like each volume is queried twice: once plain (the third argument of
the volumeStatus() call is empty ('')) and then again with detail (the third
argument is 'detail'). The crash always comes during one of the 'detail'
queries.
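
For reference, each pass of that loop boils down to two gluster CLI
invocations per volume. The 'detail' form appears verbatim in the
supervdsm.log excerpt below; the plain form is an assumption based on it. A
rough standalone sketch (not vdsm's actual code):

import subprocess

def query_volume_status(volume):
    # Plain status query (third volumeStatus() argument empty) -- assumed form.
    plain = subprocess.check_output(
        ['/usr/sbin/gluster', '--mode=script',
         'volume', 'status', volume, '--xml'])
    # 'detail' query, exactly as seen in supervdsm.log below.
    detail = subprocess.check_output(
        ['/usr/sbin/gluster', '--mode=script',
         'volume', 'status', volume, 'detail', '--xml'])
    return plain, detail

# supervdsm walks all volumes like this roughly every 5 minutes; per the
# logs, the crash always surfaces during one of the 'detail' queries.
for vol in ('web',):
    query_volume_status(vol)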

Steps to Reproduce:
1. vm storage domain on glusterfs
2. run some VMs
3. wait

Actual results:
Here are the relevant log extracts that I can find:

/var/log/messages:
Nov  5 21:46:29 big kernel: supervdsmServer[11638]: segfault at 18 ip
00007f652c41115a sp 00007f5fd3b7cf40 error 4 in
libgfapi.so.0.0.0[7f652c40a000+19000]

/var/log/vdsm/supervdsm.log: (looping through getting information about each
gluster volume)
MainProcess|Thread-792::DEBUG::2014-11-05
21:46:29,015::supervdsmServer::101::SuperVdsm.ServerCallback::(wrapper) call
wrapper with ('web', '', 'detail') {}
MainProcess|Thread-792::DEBUG::2014-11-05
21:46:29,015::utils::738::root::(execCmd) /usr/sbin/gluster --mode=script
volume status web detail --xml (cwd None)
MainProcess|Thread-792::DEBUG::2014-11-05
21:46:29,323::utils::758::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0
MainProcess|Thread-792::DEBUG::2014-11-05
21:46:29,324::supervdsmServer::108::SuperVdsm.ServerCallback::(wrapper) return
wrapper with {'bricks': [{'sizeTotal': '5722917.992', 'mntOptions':
'rw,relatime,ssd,space_cache', 'hostuuid':
'6956bfd3-7a64-4789-9fda-d83bbea93924', 'device': '/dev/bcache0', 'blockSize':
'4096', 'brick': 'big:/mnt/export/r0/web', 'sizeFree': '5624569.770', 'fsName':
'btrfs'}, {'sizeTotal': '5722917.992', 'mntOptions':
'rw,relatime,ssd,space_cache', 'hostuuid':
'e2a5191a-4d72-430f-9227-feb150163051', 'device': '/dev/bcache0', 'blockSize':
'4096', 'brick': 'bang:/mnt/export/r0/web', 'sizeFree': '5619114.816',
'fsName': 'btrfs'}], 'name': 'web'}
MainProcess|Thread-792::DEBUG::2014-11-05
21:46:29,325::supervdsmServer::101::SuperVdsm.ServerCallback::(wrapper) call
wrapper with ('web',) {}
   [*** supervdsm crashed here and is restarted ***]
MainThread::DEBUG::2014-11-05
22:55:21,655::netconfpersistence::134::root::(_getConfigs) Non-existing config
set.
MainThread::DEBUG::2014-11-05
22:55:21,655::netconfpersistence::134::root::(_getConfigs) Non-existing config
set.
MainThread::DEBUG::2014-11-05
22:55:21,722::supervdsmServer::411::SuperVdsm.Server::(main) Making sure I'm
root - SuperVdsm
MainThread::DEBUG::2014-11-05
22:55:21,722::supervdsmServer::420::SuperVdsm.Server::(main) Parsing cmd args
MainThread::DEBUG::2014-11-05
22:55:21,722::supervdsmServer::423::SuperVdsm.Server::(main) Cleaning old
socket /var/run/vdsm/svdsm.sock
MainThread::DEBUG::2014-11-05
22:55:21,723::supervdsmServer::427::SuperVdsm.Server::(main) Setting up keep
alive thread
MainThread::DEBUG::2014-11-05
22:55:21,723::supervdsmServer::433::SuperVdsm.Server::(main) Creating remote
object manager
MainThread::DEBUG::2014-11-05
22:55:21,724::fileUtils::192::Storage.fileUtils::(chown) Changing owner for
/var/run/vdsm/svdsm.sock, to (36:36)
MainThread::DEBUG::2014-11-05
22:55:21,725::supervdsmServer::444::SuperVdsm.Server::(main) Started serving
super vdsm object

/var/log/vdsm/vdsm.log:
Thread-792::DEBUG::2014-11-05 21:46:28,579::BindingXMLRPC::1132::vds::(wrapper)
client [192.168.21.36]::call volumeStatus with ('web', '', '') {} flowID
[3d9b0b87]
Thread-792::DEBUG::2014-11-05 21:46:29,002::BindingXMLRPC::1139::vds::(wrapper)
return volumeStatus with {'volumeStatus': {'bricks': [{'status': 'ONLINE',
'brick': 'big:/mnt/export/r0/web', 'pid': '4531', 'port': '50154', 'hostuuid':
'6956bfd3-7a64-4789-9fda-d83bbea93924'}, {'status': 'ONLINE', 'brick':
'bang:/mnt/export/r0/web', 'pid': '20996', 'port': '50154', 'hostuuid':
'e2a5191a-4d72-430f-9227-feb150163051'}], 'nfs': [{'status': 'OFFLINE',
'hostname': '192.168.1.83', 'pid': '-1', 'port': 'N/A', 'hostuuid':
'6956bfd3-7a64-4789-9fda-d83bbea93924'}, {'status': 'ONLINE', 'hostname':
'bang', 'pid': '21004', 'port': '2049', 'hostuuid':
'e2a5191a-4d72-430f-9227-feb150163051'}], 'shd': [{'status': 'ONLINE',
'hostname': '192.168.1.83', 'pid': '4547', 'hostuuid':
'6956bfd3-7a64-4789-9fda-d83bbea93924'},
{'status': 'ONLINE', 'hostname': 'bang', 'pid': '21011', 'hostuuid':
'e2a5191a-4d72-430f-9227-feb150163051'}], 'name': 'web'}, 'status': {'message':
'Done', 'code': 0}}
Thread-792::DEBUG::2014-11-05 21:46:29,014::BindingXMLRPC::1132::vds::(wrapper)
client [192.168.21.36]::call volumeStatus with ('web', '', 'detail') {} flowID
[3d9b0b87]
   [*** expecting "BindingXMLRPC::1139::vds::(wrapper) return volumeStatus with
..." to come next, but of course it doesn't come because supervdsm crashed.]


Expected results:
The volumeStatus() query should return successfully (no crash). Also, running
VMs should not be affected by this random failed volume-information query.

Additional info:
I have two identically configured compute nodes, both of which have experienced
this issue since upgrading to oVirt 3.5. Either one may suffer the crash at any
given time, though never both at the same time; eventually both crash and wind
up with all of their VMs paused.

A core dump found in /var/log/core/core.538.1415241989.dump appears to be
truncated and cannot produce a stack trace:

# gdb core.538.1415241989.dump
...
BFD: Warning: /var/log/core/core.538.1415241989.dump is truncated: expected
core file size >= 19657957376, found: 14539485184.
...
Failed to read a valid object file image from memory.
Core was generated by `/usr/bin/python /usr/share/vdsm/supervdsmServer
--sockfile /var/run/vdsm/svdsm.'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f652c41115a in ?? ()
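
As an aside, the /var/log/messages entry above already localizes the fault:
subtracting the reported mapping base of libgfapi.so.0.0.0 from the faulting
instruction pointer gives the offset of the crash inside the library, which
could be resolved against a debuginfo build with a tool such as addr2line. A
small illustrative calculation using the addresses from the kernel message
(the backtrace attached later in the thread names the function as
glfs_set_volfile_server()):

# Addresses reported by the kernel segfault message above.
ip = 0x00007f652c41115a    # faulting instruction pointer
base = 0x7f652c40a000      # start of the libgfapi.so.0.0.0 mapping

offset = ip - base
print(hex(offset))         # 0x715a: offset of the crash within libgfapi.so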

--- Additional comment from Dan Kenigsberg on 2014-11-12 10:09:17 EST ---

Ian, can you attach vdsm.log? It may help me understand why the VMs paused, and
why unpausing them failed.

How often are the crashes? Can you reproduce them when gdb is attached to
supervdsmd, and thus provide the backtrace?

Darshan, I'm 99.9% sure that the segfault occurs in libgfapi, and not in vdsm
or Python. Are you aware of such a bug in libgfapi? If it cannot be solved
there, we may need to disable this new feature in Vdsm.

--- Additional comment from Ian Morgan on 2014-11-12 10:20:29 EST ---

Dan,

Here's the log you requested surrounding the previously described instance of
the failure. This is a 16MB single-file extract (524K compressed) from the
rotated log set.

--- Additional comment from Ian Morgan on 2014-11-12 10:22:32 EST ---

Take two: accidentally selected type text/plain last time, but the file is XZ
compressed.

--- Additional comment from Dan Kenigsberg on 2014-11-12 16:26:05 EST ---

> Ian, how often are the crashes? Can you reproduce them when gdb is attached to
> supervdsmd, and thus provide the backtrace?

--- Additional comment from Ian Morgan on 2014-11-12 18:13:29 EST ---

Dan,

The crashes were frequent, at least a couple of times per day, but never on
cue (not intentionally reproducible), and never while I was watching.

It's been about two weeks since I did the upgrade, and I have been cursing at
it ever since, trying to find any plausible cause and channelling my deepest
Google-fu to find anyone experiencing the same issue. Each of the nodes has
been restarted many times since the upgrade while trying to diagnose this. I
eventually gave up a few days ago and shut the whole cluster down (I actually
considered a switch to OpenStack, but that prospect made my scr*tum shrivel up
real tight).

Taking a deep breath, I started everything back up and, on a lark, stopped all
gluster volumes, disconnected the peers, added a 3rd dummy gluster node to the
stack (no bricks, just glusterd to help with quorum, even though there have
never been any communication issues between the two nodes), re-probed the
peers, started up the volumes, started vdsmd on the nodes, started the engine,
and attached gdb to the supervdsmd on one node.

And of course, now that gdb is attached, it's been running all day without
failing... sigh. The previous pattern was that it would never fail while I was
working with it. Inevitably I would log out of the webadmin portal, and when I
came back later I would find that it had crapped out shortly thereafter. I
could never witness it failing in real time.

I will keep gdb attached (a watched pot/program never boils/crashes??) and hope
for the best. Will report back when it finally decides to crash again, or if it
doesn't. [fingers crossed]

Of course, if it doesn't crash now, it means that either something went
haywire in my gluster volumes during the upgrade from 3.4 to 3.5, or some odd
metadata that 3.4 handled/ignored was making 3.5 unhappy and the teardown and
rebuild of the gluster stack fixed it (maybe?).

--- Additional comment from Ian Morgan on 2014-11-12 22:03:33 EST ---

Backtrace showing crash in glfs_set_volfile_server

--- Additional comment from Ian Morgan on 2014-11-12 22:36:02 EST ---

Backtrace attached.

It looks to me like supervdsm called into glfs_set_volfile_server() with a NULL
fs argument, and then the segfault comes naturally when it tries to dereference
&fs->ctx->cmd_args.

That comes from gfapi.py's glfsInit():

def glfsInit(volumeId, host, port, protocol):
    fs = _glfs_new(volumeId)

    rc = _glfs_set_volfile_server(fs,
                                  protocol,
                                  host,
                                  port)

Symptomatic cause of the segfault: fs is obtained from volumeId via _glfs_new()
and then used without being checked for NULL.

Root cause: still unknown.

I can see that glfs_new() is being passed a valid volumeId ('gv1'). glfs_new()
seems to have 4 places where it could potentially return NULL...

Your insight is greatly appreciated.
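
To make the symptomatic cause concrete, here is a minimal standalone ctypes
sketch of the pattern being described (this is not vdsm's actual gfapi.py
binding; it only assumes that libgfapi.so.0 is loadable and exports glfs_new()
and glfs_set_volfile_server() with their usual signatures). The key point is
checking the pointer returned by glfs_new() before handing it to
glfs_set_volfile_server():

import ctypes

_lib = ctypes.CDLL('libgfapi.so.0')

# glfs_t *glfs_new (const char *volname);
_lib.glfs_new.restype = ctypes.c_void_p
_lib.glfs_new.argtypes = [ctypes.c_char_p]

# int glfs_set_volfile_server (glfs_t *fs, const char *transport,
#                              const char *host, int port);
_lib.glfs_set_volfile_server.restype = ctypes.c_int
_lib.glfs_set_volfile_server.argtypes = [ctypes.c_void_p, ctypes.c_char_p,
                                         ctypes.c_char_p, ctypes.c_int]

def init_volume(volume, host, port=24007, transport='tcp'):
    # 24007 is the default glusterd port; transport 'tcp' is assumed here.
    fs = _lib.glfs_new(volume.encode())
    if not fs:
        # Without this check, a NULL fs goes straight into
        # glfs_set_volfile_server(), which dereferences fs->ctx and segfaults.
        raise RuntimeError('glfs_new(%r) returned NULL' % volume)
    rc = _lib.glfs_set_volfile_server(fs, transport.encode(),
                                      host.encode(), port)
    if rc != 0:
        raise RuntimeError('glfs_set_volfile_server() failed, rc=%d' % rc)
    return fs

The patch proposed later in this thread adds the same kind of guard inside
vdsm's glfsInit().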

--- Additional comment from Ian Morgan on 2014-11-13 09:02:17 EST ---

I've set up another run of gdb attached to supervdsmd, with breakpoints at each
of the 4 places that glfs_new() can return NULL, to find out which one is the
culprit. If you have any hints to help narrow down the cause more quickly, I'm
all ears.

--- Additional comment from Ian Morgan on 2014-11-13 11:24:16 EST ---

I can't seem to get gdb to break where I ask it to. It keeps breaking at some
line just before or after the desired breakpoint, so I can't break at the
point of an actual "return NULL" in order to inspect _which_ "return NULL" is
causing the issue.

Is this because of compiler optimization? What's the point of the -debuginfo
packages if they aren't useful for debugging optimized code? Is there any way
for me to do anything helpful now without having proper debug builds of all the
relevant packages?

I'm stuck at this point and don't know what more I can do without your
development expertise on this stuff.

Your insight is greatly appreciated.

--- Additional comment from Dan Kenigsberg on 2014-11-13 14:18:22 EST ---

Vdsm must check the return value of glfs_new(). That would give us better
hints about where it fails and why. Please try with something like the
following patch:

diff --git a/vdsm/gluster/gfapi.py b/vdsm/gluster/gfapi.py
index 2aee49b..1359091 100644
--- a/vdsm/gluster/gfapi.py
+++ b/vdsm/gluster/gfapi.py
@@ -50,6 +50,8 @@ class StatVfsStruct(ctypes.Structure):

 def glfsInit(volumeId, host, port, protocol):
     fs = _glfs_new(volumeId)
+    if fs is None:
+        raise ge.GlfsInitException(rc=7, err=['glfs_new(%s) failed' % volumeId])

     rc = _glfs_set_volfile_server(fs,
                                   protocol,

--- Additional comment from Ian Morgan on 2014-11-13 14:31:51 EST ---

Implemented the above patch on both compute nodes. I should expect to see the
'glfs_new(<volumeId>) failed' message in vdsm.log, right? So when that happens,
what would you like me to report?


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1155170
[Bug 1155170] Tracker: oVirt 3.5.1 release
https://bugzilla.redhat.com/show_bug.cgi?id=1162640
[Bug 1162640] supervdsm segfault in libgfapi while querying volume status
detail