[Gluster-users] gfapi access not working with 3.7.0

Alessandro De Salvo Alessandro.DeSalvo at roma1.infn.it
Fri May 29 18:53:25 UTC 2015


Hi,
I'm trying to access a volume using gfapi and gluster 3.7.0. This worked
with 3.6.3, but stopped working after the upgrade.
The volume has snapshots enabled and is configured as follows:

# gluster volume info adsnet-vm-01
 
Volume Name: adsnet-vm-01
Type: Replicate
Volume ID: f8f615df-3dde-4ea6-9bdb-29a1706e864c
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: gwads02.sta.adsnet.it:/gluster/vm01/data
Brick2: gwads03.sta.adsnet.it:/gluster/vm01/data
Options Reconfigured:
server.allow-insecure: on
features.file-snapshot: on
features.barrier: disable
nfs.disable: true

Also, my /etc/glusterfs/glusterd.vol has the required option:

# cat /etc/glusterfs/glusterd.vol
# This file is managed by puppet, do not change
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
    option transport.socket.read-fail-log off
    option ping-timeout 30
    option rpc-auth-allow-insecure on
#   option base-port 49152
end-volume
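
For what it's worth, the same gfapi code path that qemu-img exercises can
be reproduced outside qemu with a minimal libgfapi program (just a sketch,
assuming the glusterfs-api development headers are installed; the
host/volume/file names are the ones from my setup above):

```c
/* Minimal libgfapi reproducer: initialize a volume handle, stat a file,
 * then tear down with glfs_fini() -- the call that crashes here.
 * Build (assuming pkg-config knows glusterfs-api):
 *   gcc repro.c -o repro $(pkg-config --cflags --libs glusterfs-api)
 */
#include <stdio.h>
#include <sys/stat.h>
#include <glusterfs/api/glfs.h>

int main(void)
{
    glfs_t *fs = glfs_new("adsnet-vm-01");
    if (!fs)
        return 1;

    /* Same server and management port qemu-img reaches via gluster:// */
    glfs_set_volfile_server(fs, "tcp", "gwads03.sta.adsnet.it", 24007);
    glfs_set_logging(fs, "/dev/stderr", 7);

    if (glfs_init(fs) != 0) {
        perror("glfs_init");
        return 1;
    }

    struct stat st;
    if (glfs_stat(fs, "/images/foreman7.vm.adsnet.it.qcow2", &st) == 0)
        printf("size: %lld bytes\n", (long long)st.st_size);

    /* Teardown: the backtrace later in this mail points at
     * glfs_fini() -> xlator_fini_rec() -> fini(), so if the bug is in
     * graph teardown rather than in qemu, this call should crash too. */
    return glfs_fini(fs);
}
```

If this also segfaults, that would confirm the problem is in the 3.7.0
graph cleanup rather than in qemu's block driver.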

However, when I try, for example, to query an image via qemu-img, it
segfaults:

# qemu-img info gluster://gwads03.sta.adsnet.it/adsnet-vm-01/images/foreman7.vm.adsnet.it.qcow2
[2015-05-29 18:39:41.436951] E [MSGID: 108006]
[afr-common.c:3919:afr_notify] 0-adsnet-vm-01-replicate-0: All
subvolumes are down. Going offline until atleast one of them comes back
up.
[2015-05-29 18:39:41.438234] E [rpc-transport.c:512:rpc_transport_unref]
(--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7fc3851caf16]
(--> /lib64/libgfrpc.so.0(rpc_transport_unref+0xa3)[0x7fc387c855a3]
(--> /lib64/libgfrpc.so.0(rpc_clnt_unref+0x5c)[0x7fc387c888ec]
(--> /lib64/libglusterfs.so.0(+0x21791)[0x7fc3851c7791]
(--> /lib64/libglusterfs.so.0(+0x21725)[0x7fc3851c7725] )))))
0-rpc_transport: invalid argument: this
[2015-05-29 18:39:41.438484] E [rpc-transport.c:512:rpc_transport_unref]
(--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7fc3851caf16]
(--> /lib64/libgfrpc.so.0(rpc_transport_unref+0xa3)[0x7fc387c855a3]
(--> /lib64/libgfrpc.so.0(rpc_clnt_unref+0x5c)[0x7fc387c888ec]
(--> /lib64/libglusterfs.so.0(+0x21791)[0x7fc3851c7791]
(--> /lib64/libglusterfs.so.0(+0x21725)[0x7fc3851c7725] )))))
0-rpc_transport: invalid argument: this
Segmentation fault (core dumped)

The volume is fine:

# gluster volume status adsnet-vm-01
Status of volume: adsnet-vm-01
Gluster process                                 TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gwads02.sta.adsnet.it:/gluster/vm01/data  49159     0          Y       27878
Brick gwads03.sta.adsnet.it:/gluster/vm01/data  49159     0          Y       24638
Self-heal Daemon on localhost                   N/A       N/A        Y       28031
Self-heal Daemon on gwads03.sta.adsnet.it       N/A       N/A        Y       24667

Task Status of Volume adsnet-vm-01
------------------------------------------------------------------------------
There are no active volume tasks


Running under the debugger I see the following:

(gdb) r
Starting program: /usr/bin/qemu-img info
gluster://gwads03.sta.adsnet.it/adsnet-vm-01/images/foreman7.vm.adsnet.it.qcow2
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff176a700 (LWP 30027)]
[New Thread 0x7ffff0f69700 (LWP 30028)]
[New Thread 0x7fffe99ab700 (LWP 30029)]
[New Thread 0x7fffe8fa7700 (LWP 30030)]
[New Thread 0x7fffe3fff700 (LWP 30031)]
[New Thread 0x7fffdbfff700 (LWP 30032)]
[New Thread 0x7fffdb2dd700 (LWP 30033)]
[2015-05-29 18:51:25.656014] E [MSGID: 108006]
[afr-common.c:3919:afr_notify] 0-adsnet-vm-01-replicate-0: All
subvolumes are down. Going offline until atleast one of them comes back
up.
[2015-05-29 18:51:25.657338] E [rpc-transport.c:512:rpc_transport_unref]
(--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7ffff48bcf16]
(--> /lib64/libgfrpc.so.0(rpc_transport_unref+0xa3)[0x7ffff73775a3]
(--> /lib64/libgfrpc.so.0(rpc_clnt_unref+0x5c)[0x7ffff737a8ec]
(--> /lib64/libglusterfs.so.0(+0x21791)[0x7ffff48b9791]
(--> /lib64/libglusterfs.so.0(+0x21725)[0x7ffff48b9725] )))))
0-rpc_transport: invalid argument: this
[2015-05-29 18:51:25.657619] E [rpc-transport.c:512:rpc_transport_unref]
(--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7ffff48bcf16]
(--> /lib64/libgfrpc.so.0(rpc_transport_unref+0xa3)[0x7ffff73775a3]
(--> /lib64/libgfrpc.so.0(rpc_clnt_unref+0x5c)[0x7ffff737a8ec]
(--> /lib64/libglusterfs.so.0(+0x21791)[0x7ffff48b9791]
(--> /lib64/libglusterfs.so.0(+0x21725)[0x7ffff48b9725] )))))
0-rpc_transport: invalid argument: this

Program received signal SIGSEGV, Segmentation fault.
inode_unref (inode=0x7fffd975c06c) at inode.c:499
499	        table = inode->table;
(gdb) bt
#0  inode_unref (inode=0x7fffd975c06c) at inode.c:499
#1  0x00007fffe14d5a61 in fini (this=<optimized out>) at qemu-block.c:1092
#2  0x00007ffff48b9791 in xlator_fini_rec (xl=0x7fffdc00b520) at xlator.c:463
#3  0x00007ffff48b9725 in xlator_fini_rec (xl=0x7fffdc00c940) at xlator.c:453
#4  0x00007ffff48b9725 in xlator_fini_rec (xl=0x7fffdc00dcf0) at xlator.c:453
#5  0x00007ffff48b9725 in xlator_fini_rec (xl=0x7fffdc00f0a0) at xlator.c:453
#6  0x00007ffff48b9725 in xlator_fini_rec (xl=0x7fffdc010470) at xlator.c:453
#7  0x00007ffff48b9725 in xlator_fini_rec (xl=0x7fffdc011820) at xlator.c:453
#8  0x00007ffff48b9725 in xlator_fini_rec (xl=0x7fffdc012bd0) at xlator.c:453
#9  0x00007ffff48b9725 in xlator_fini_rec (xl=0x7fffdc014020) at xlator.c:453
#10 0x00007ffff48b9725 in xlator_fini_rec (xl=0x7fffdc0154b0) at xlator.c:453
#11 0x00007ffff48baeea in xlator_tree_fini (xl=<optimized out>) at xlator.c:545
#12 0x00007ffff48f6b25 in glusterfs_graph_deactivate (graph=<optimized out>) at graph.c:340
#13 0x00007ffff758de3c in pub_glfs_fini (fs=0x555555c13cd0) at glfs.c:1155
#14 0x0000555555568f49 in bdrv_close (bs=bs@entry=0x555555c10790) at block.c:1522
#15 0x0000555555568e08 in bdrv_delete (bs=0x555555c10790) at block.c:1749
#16 bdrv_unref (bs=0x555555c10790) at block.c:5121
#17 0x0000555555568fd3 in bdrv_close (bs=bs@entry=0x555555c0b520) at block.c:1544
#18 0x0000555555568e08 in bdrv_delete (bs=0x555555c0b520) at block.c:1749
#19 bdrv_unref (bs=bs@entry=0x555555c0b520) at block.c:5121
#20 0x00005555555b75d6 in collect_image_info_list (chain=false, fmt=<optimized out>, filename=<optimized out>) at qemu-img.c:1820
#21 img_info (argc=<optimized out>, argv=<optimized out>) at qemu-img.c:1897
#22 0x00007ffff5004af5 in __libc_start_main (main=0x555555561740 <main>, argc=3, ubp_av=0x7fffffffe398, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe388) at libc-start.c:274
#23 0x000055555556185d in _start ()


This looks very similar to what I reported this morning about self-heal:
again a crash in fini during teardown, though I'm not sure that is the
real cause.
Any help?
Many thanks!

	Alessandro
