[Bugs] [Bug 1418650] New: Samba crash when mounting a distributed dispersed volume over CIFS

bugzilla at redhat.com bugzilla at redhat.com
Thu Feb 2 12:23:24 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1418650

            Bug ID: 1418650
           Summary: Samba crash when mounting a distributed dispersed
                    volume over CIFS
           Product: GlusterFS
           Version: 3.10
         Component: disperse
          Keywords: Triaged
          Assignee: bugs at gluster.org
          Reporter: xhernandez at datalab.es
                CC: anoopcs at redhat.com, aspandey at redhat.com,
                    bugs at gluster.org, nbalacha at redhat.com,
                    nigelb at redhat.com, pkarampu at redhat.com,
                    xhernandez at datalab.es
        Depends On: 1402661



+++ This bug was initially created as a clone of Bug #1402661 +++

I noticed this when running glusto. Here's the output of the mount command:

COMMAND: mount -t cifs -o username=root,password=foobar
\\\\172.19.2.47\\gluster-testvol_distributed-dispersed
/mnt/testvol_distributed-dispersed_cifs
RETCODE: 32
STDERR:
mount error(11): Resource temporarily unavailable
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)

I've attached log.smbd here. I'll try to attach the core file or add a link to
cores.

--- Additional comment from Nigel Babu on 2016-12-08 05:58:25 CET ---

The cores are here: http://slave1.cloud.gluster.org/smb-dispersed-cores/

Please copy them to another, more permanent location as soon as possible. They
will be auto-deleted in 15 days.

--- Additional comment from Nigel Babu on 2016-12-08 06:09:26 CET ---



--- Additional comment from Nigel Babu on 2016-12-09 07:49:00 CET ---

I have a feeling this happens over NFS as well. See
https://ci.centos.org/view/Gluster/job/gluster_glusto/67/console

Ashish, let me know what more logs you need for NFS and I can get them.

--- Additional comment from Nigel Babu on 2016-12-19 16:40:52 CET ---

Hey, any idea what's going on and when we can fix this? This is currently
causing the Glusto tests to fail on master.

--- Additional comment from Ashish Pandey on 2016-12-20 05:41:27 CET ---

Hi,

I tried to look into the cores but could not find the binaries that were in
use when this issue occurred.

I cannot debug it without the respective binaries.

Could you please provide a complete sosreport from all the servers, the
configuration of the volumes, and all the steps required to reproduce this
issue?

Ashish

--- Additional comment from Nigel Babu on 2016-12-20 07:19:37 CET ---

Were you able to reproduce the issue manually though? I can consistently
reproduce the issue when trying to mount a disperse volume over CIFS on
release-3.9 and master. You can either try it on your machine or I can redo the
test and get you the binaries + cores. Let me know which works best.

--- Additional comment from Nigel Babu on 2017-01-02 15:12:13 CET ---

Fresh set of cores: http://slave25.cloud.gluster.org/logs/samba.tar.gz

Path to corresponding RPM:
http://artifacts.ci.centos.org/gluster/nightly/master/7/x86_64/glusterfs-3.10dev-0.283.git0805642.el7.centos.x86_64.rpm

All gluster-related RPMs should be in the same folder and should have the same
version string - "283.git0805642"

Link to SOS report from the server:
http://slave25.cloud.gluster.org/logs/sosreport-n33.gusty.ci.centos.org.1402661-20170102140614.tar.xz

Samba version: samba-4.4.4-9.el7.x86_64

The tests are in https://github.com/gluster/glusto-tests. Please try to
reproduce the bug. Let me know if you can't, or if you need clearer steps.

--- Additional comment from Anoop C S on 2017-01-13 06:53:08 CET ---

Following is the backtrace seen in Samba logs:

[2017/01/02 13:40:19.965429,  0] ../source3/lib/util.c:902(log_stack_trace)
  BACKTRACE: 21 stack frames:
   #0 /lib64/libsmbconf.so.0(log_stack_trace+0x1a) [0x7f926b7a5efa]
   #1 /lib64/libsmbconf.so.0(smb_panic_s3+0x20) [0x7f926b7a5fd0]
   #2 /lib64/libsamba-util.so.0(smb_panic+0x2f) [0x7f926dcd259f]
   #3 /lib64/libsamba-util.so.0(+0x247b6) [0x7f926dcd27b6]
   #4 /lib64/libpthread.so.0(+0xf370) [0x7f926df35370]
   #5 /usr/lib64/glusterfs/3.10dev/xlator/cluster/disperse.so(+0x34c61)
[0x7f924cd7ac61]
   #6 /usr/lib64/glusterfs/3.10dev/xlator/cluster/disperse.so(+0x33804)
[0x7f924cd79804]
   #7 /usr/lib64/glusterfs/3.10dev/xlator/cluster/disperse.so(init+0x1f4)
[0x7f924cd53154]
   #8 /lib64/libglusterfs.so.0(xlator_init+0x4b) [0x7f92542b550b]
   #9 /lib64/libglusterfs.so.0(glusterfs_graph_init+0x29) [0x7f92542ecb29]
   #10 /lib64/libglusterfs.so.0(glusterfs_graph_activate+0x3b) [0x7f92542ed44b]
   #11 /lib64/libgfapi.so.0(+0x97cd) [0x7f92547af7cd]
   #12 /lib64/libgfapi.so.0(+0x9986) [0x7f92547af986]
   #13 /lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0x90) [0x7f9254596720]
   #14 /lib64/libgfrpc.so.0(rpc_clnt_notify+0x1df) [0x7f92545969ff]
   #15 /lib64/libgfrpc.so.0(rpc_transport_notify+0x23) [0x7f92545928e3]
   #16 /usr/lib64/glusterfs/3.10dev/rpc-transport/socket.so(+0x72f4)
[0x7f924d22d2f4]
   #17 /usr/lib64/glusterfs/3.10dev/rpc-transport/socket.so(+0x9795)
[0x7f924d22f795]
   #18 /lib64/libglusterfs.so.0(+0x84590) [0x7f9254312590]
   #19 /lib64/libpthread.so.0(+0x7dc5) [0x7f926df2ddc5]
   #20 /lib64/libc.so.6(clone+0x6d) [0x7f9269eed73d]

While looking through various logs, I could see the same crash reported in
glustershd.log as follows:

[2017-01-02 13:39:41.399861] I [MSGID: 100030] [glusterfsd.c:2455:main]
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.10dev
(args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p
/var/lib/glusterd/glustershd/run/glustershd.pid -l
/var/log/glusterfs/glustershd.log -S
/var/run/gluster/8998cabdccb9dec791fa49c3fd0ca055.socket --xlator-option
*replicate*.node-uuid=4d66676a-7e25-49c5-8c18-2d29db0d8d9a)
[2017-01-02 13:39:41.439745] I [MSGID: 101190]
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with
index 1
[2017-01-02 13:39:41.454243] I [MSGID: 122067] [ec-code.c:896:ec_code_detect]
0-testvol_dispersed-disperse-0: Using 'sse' CPU extensions
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash: 
2017-01-02 13:39:41
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.10dev
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7fb1a0cd7da0]
/lib64/libglusterfs.so.0(gf_print_trace+0x324)[0x7fb1a0ce16a4]
/lib64/libc.so.6(+0x35250)[0x7fb19f393250]
/usr/lib64/glusterfs/3.10dev/xlator/cluster/disperse.so(+0x34c61)[0x7fb192fe2c61]
/usr/lib64/glusterfs/3.10dev/xlator/cluster/disperse.so(+0x33804)[0x7fb192fe1804]
/usr/lib64/glusterfs/3.10dev/xlator/cluster/disperse.so(init+0x1f4)[0x7fb192fbb154]
/lib64/libglusterfs.so.0(xlator_init+0x4b)[0x7fb1a0cd550b]
/lib64/libglusterfs.so.0(glusterfs_graph_init+0x29)[0x7fb1a0d0cb29]
/lib64/libglusterfs.so.0(glusterfs_graph_activate+0x3b)[0x7fb1a0d0d44b]
/usr/sbin/glusterfs(glusterfs_process_volfp+0x12d)[0x7fb1a11d858d]
/usr/sbin/glusterfs(mgmt_getspec_cbk+0x3c1)[0x7fb1a11ddf51]
/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0x90)[0x7fb1a0a9e720]
/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1df)[0x7fb1a0a9e9ff]
/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fb1a0a9a8e3]
/usr/lib64/glusterfs/3.10dev/rpc-transport/socket.so(+0x72f4)[0x7fb19555d2f4]
/usr/lib64/glusterfs/3.10dev/rpc-transport/socket.so(+0x9795)[0x7fb19555f795]
/lib64/libglusterfs.so.0(+0x84590)[0x7fb1a0d32590]
/lib64/libpthread.so.0(+0x7dc5)[0x7fb19fb1ddc5]
/lib64/libc.so.6(clone+0x6d)[0x7fb19f45573d]
---------

Moreover, I can't think of anything from Samba's perspective that could lead
to this crash, so this particular crash is more likely an issue with the EC
translator.

I was able to reproduce this crash within the self-heal daemon manually in a
local CentOS 7 setup. I can share it for further debugging.

@Nigel,
Since the cores provided in the links cannot be analysed without the exact
debuginfo packages and binaries, it would be better if you could run the
following basic commands after attaching the core files to gdb.

$ gdb smbd <path-to-coredump-file>

In case gdb complains about missing debuginfo packages, run the suggested
commands to install all dependent debuginfo packages (glusterfs-debuginfo and
samba-debuginfo are required at the very minimum).

While you are inside gdb, save the output of the following gdb commands:
(gdb) bt
. . .
(gdb) thread apply all bt
. . .

--- Additional comment from Ashish Pandey on 2017-01-13 09:04:53 CET ---

As Anoop mentioned, the crash can be seen in the self-heal daemon while trying
to start the volume.

Following is the backtrace and the possible issue:


[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterfs -s localhost --volfile-id
gluster/glustershd -p /var/lib/gl'.
Program terminated with signal 11, Segmentation fault.
#0  list_add_tail (head=<optimized out>, new=<optimized out>) at
    ../../../../libglusterfs/src/list.h:40
40              new->next = head;
(gdb) bt
#0  list_add_tail (head=<optimized out>, new=<optimized out>) at
    ../../../../libglusterfs/src/list.h:40
#1  ec_code_space_alloc (size=400, code=0x7fb44802a7f0) at ec-code.c:428
#2  ec_code_alloc (size=<optimized out>, code=0x7fb44802a7f0) at ec-code.c:448
#3  ec_code_compile (builder=0x7fb44802a890) at ec-code.c:522
#4  ec_code_build (code=<optimized out>, width=width@entry=64,
    values=<optimized out>, count=<optimized out>,
    linear=linear@entry=_gf_true) at ec-code.c:631
#5  0x00007fb44d860f5b in ec_code_build_linear (code=<optimized out>,
    width=width@entry=64, values=<optimized out>, count=<optimized out>) at
    ec-code.c:638
#6  0x00007fb44d85f804 in ec_method_matrix_init (inverse=_gf_false,
    rows=0x7fb44e5038c0, mask=0, matrix=0x7fb44802a510, list=0x7fb448022228)
    at ec-method.c:106
#7  ec_method_setup (gen=<optimized out>, list=0x7fb448022228,
    xl=0x7fb44e503930) at ec-method.c:299
#8  ec_method_init (xl=xl@entry=0x7fb448017450,
    list=list@entry=0x7fb448022228, columns=<optimized out>,
    rows=<optimized out>, max=<optimized out>, gen=<optimized out>) at
    ec-method.c:343
#9  0x00007fb44d839154 in init (this=0x7fb448017450) at ec.c:635
#10 0x00007fb45b4f650b in __xlator_init (xl=0x7fb448017450) at xlator.c:403
#11 xlator_init (xl=xl@entry=0x7fb448017450) at xlator.c:428
#12 0x00007fb45b52db29 in glusterfs_graph_init
    (graph=graph@entry=0x7fb448000af0) at graph.c:320
#13 0x00007fb45b52e44b in glusterfs_graph_activate
    (graph=graph@entry=0x7fb448000af0, ctx=ctx@entry=0x7fb45bc5c010) at
    graph.c:670
#14 0x00007fb45b9f158d in glusterfs_process_volfp
    (ctx=ctx@entry=0x7fb45bc5c010, fp=fp@entry=0x7fb448002d30) at
    glusterfsd.c:2325
#15 0x00007fb45b9f6f51 in mgmt_getspec_cbk (req=<optimized out>,
    iov=<optimized out>, count=<optimized out>, myframe=0x7fb458fd306c) at
    glusterfsd-mgmt.c:1675


ec_code_space_alloc:

    space = mmap(NULL, map_size, PROT_EXEC | PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (space == NULL) {
        return NULL;
    }
    /* It's not important to check the return value of mlock(). If it fails
     * everything will continue to work normally. */
    mlock(space, map_size);

    space->code = code;
    space->size = map_size;
    list_add_tail(&space->list, &code->spaces);      <<<<<<<<<
    INIT_LIST_HEAD(&space->chunks);

Do we need to actually check the mlock() return value?
I think the mmap() was successful, but something has messed up the memory.

--- Additional comment from Xavier Hernandez on 2017-01-13 12:51:31 CET ---

I've tried to reproduce this issue but I've been unable to see the crash.

I've created a dispersed 2+1 volume and started it. The self-heal daemon has
started successfully without crashing.

Is there anything else needed to reproduce the self-heal daemon issue ?

--- Additional comment from Anoop C S on 2017-01-13 13:06:00 CET ---

Hi Xavi,

I created a local CentOS VM and installed the exact version of glusterfs
mentioned in comment #7. After starting a distributed-dispersed volume
2x(4+2), I noticed that the self-heal daemon was not online and found the
core.

Please note that I was unable to reproduce the issue with a corresponding
source install on another VM.

--- Additional comment from Ashish Pandey on 2017-01-13 13:07:24 CET ---


Xavi,

We were also unable to reproduce this issue using the latest master source
code. We tried the following RPMs to reproduce it:

[root at centos /]# rpm -qa | grep gluster
glusterfs-client-xlators-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-server-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-geo-replication-3.10dev-0.283.git0805642.el7.centos.x86_64
python-gluster-3.10dev-0.283.git0805642.el7.centos.noarch
glusterfs-fuse-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-devel-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-rdma-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-libs-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-api-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-extra-xlators-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-api-devel-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-debuginfo-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-cli-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-events-3.10dev-0.283.git0805642.el7.centos.x86_64
samba-vfs-glusterfs-4.4.4-9.el7.x86_64

--- Additional comment from Xavier Hernandez on 2017-01-13 13:36:10 CET ---

I've just tried this exact version on CentOS 7 and created a 2 x (4 + 2)
distributed-dispersed volume. The start worked fine and the self-heal daemon
is running normally.

I'll try to analyze the sos report. Maybe I'll see something there...

--- Additional comment from Xavier Hernandez on 2017-01-13 13:52:31 CET ---

I've got something...

Analyzing the cores from smbd, I've seen this:

   0x00007f924cd7ac33 <+531>:   callq  0x7f924cd4f8a0 <mmap64 at plt>
   0x00007f924cd7ac38 <+536>:   test   %rax,%rax
   0x00007f924cd7ac3b <+539>:   mov    %rax,%r15
   0x00007f924cd7ac3e <+542>:   je     0x7f924cd7af45 <ec_code_build+1317>
   0x00007f924cd7ac44 <+548>:   mov    0x8(%rsp),%r10
   0x00007f924cd7ac49 <+553>:   mov    %rax,%rdi
   0x00007f924cd7ac4c <+556>:   mov    %r10,%rsi
   0x00007f924cd7ac4f <+559>:   callq  0x7f924cd4fdd0 <mlock at plt>
   0x00007f924cd7ac54 <+564>:   mov    0x30(%r13),%rax
   0x00007f924cd7ac58 <+568>:   mov    0x8(%rsp),%r10
   0x00007f924cd7ac5d <+573>:   lea    0x30(%r15),%rdx
=> 0x00007f924cd7ac61 <+577>:   mov    %r14,(%r15)

r15 = 0xffffffffffffffff

So mmap() has returned -1. Looking at the man page I've seen that if mmap()
fails, it returns MAP_FAILED (-1) and not NULL as the code expects. This is a
bug in ec. I'll send a patch.
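
For reference, a minimal sketch of the corrected check (illustrative only,
not the actual patch; the function name is made up and the surrounding logic
mirrors the ec_code_space_alloc() snippet quoted above):

#include <stddef.h>
#include <sys/mman.h>

/* Sketch only: allocate an executable region the way ec_code_space_alloc()
 * does, but compare the result against MAP_FAILED. mmap() returns
 * MAP_FAILED ((void *) -1) on error, never NULL. */
static void *alloc_exec_region(size_t map_size)
{
        void *space;

        space = mmap(NULL, map_size, PROT_EXEC | PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (space == MAP_FAILED) {
                return NULL;    /* caller can inspect errno for the cause */
        }

        /* As in the original code, a failed mlock() is not fatal. */
        mlock(space, map_size);

        return space;
}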

--- Additional comment from Worker Ant on 2017-01-13 13:59:54 CET ---

REVIEW: http://review.gluster.org/16405 (cluster/ec: fix invalid error check
for mmap()) posted (#1) for review on master by Xavier Hernandez
(xhernandez at datalab.es)

--- Additional comment from Anoop C S on 2017-01-13 16:13:45 CET ---

Hi Xavi,

Thanks for the quick analysis and the resulting patch. The same issue is the
reason for the crash in the self-heal daemon. See below for the core analysis
from the locally reproduced self-heal daemon crash. Even though I noticed a -1
for the (ec_code_space_t *) space pointer, I never thought beyond that.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Core was generated by `/usr/sbin/glusterfs -s localhost --volfile-id
gluster/glustershd -p /var/lib/gl'.
Program terminated with signal 11, Segmentation fault.
#0  list_add_tail (head=<optimized out>, new=<optimized out>) at
../../../../libglusterfs/src/list.h:40
40        new->next = head;
(gdb) f 1
#1  ec_code_space_alloc (size=400, code=0x7fc99402a7f0) at ec-code.c:428
428        list_add_tail(&space->list, &code->spaces);
(gdb) l 420
415            map_size = size;
416        }
417        space = mmap(NULL, map_size, PROT_EXEC | PROT_READ | PROT_WRITE,
418                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
419        if (space == NULL) {
420            return NULL;
421        }
422        /* It's not important to check the return value of mlock(). If it
fails
423         * everything will continue to work normally. */
424        mlock(space, map_size);
(gdb) l 430
425    
426        space->code = code;
427        space->size = map_size;
428        list_add_tail(&space->list, &code->spaces);
429        INIT_LIST_HEAD(&space->chunks);
430    
431        chunk = ec_code_chunk_from_space(space);
432        chunk->size = EC_CODE_SIZE - ec_code_space_size() -
ec_code_chunk_size();
433        list_add(&chunk->list, &space->chunks);
434    
(gdb) p space
$1 = (ec_code_space_t *) 0xffffffffffffffff
(gdb) p (int)space
$2 = -1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The above confirms your findings.

@Xavi,
Which tool was used to analyze the core? I hope it's dbx, judging from the
backtrace you provided in the previous comment.

--- Additional comment from Worker Ant on 2017-01-16 11:33:11 CET ---

REVIEW: http://review.gluster.org/16405 (cluster/ec: fix invalid error check
for mmap()) posted (#2) for review on master by Xavier Hernandez
(xhernandez at datalab.es)

--- Additional comment from Worker Ant on 2017-01-16 11:38:24 CET ---

REVIEW: http://review.gluster.org/16405 (cluster/ec: fix invalid error check
for mmap()) posted (#3) for review on master by Xavier Hernandez
(xhernandez at datalab.es)

--- Additional comment from Nigel Babu on 2017-01-17 09:22:50 CET ---

The mmap call was being blocked by SELinux. I gathered the logs for Anoop
today so he could figure out what's wrong.

We found two blocked SYSCALL entries:
type=SYSCALL msg=audit(1484635264.766:2718): arch=c000003e syscall=49
success=no exit=-13 a0=15 a1=7ff6f81ff400 a2=10 a3=7e items=0 ppid=1 pid=25743
auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0
tty=(none) ses=4294967295 comm="glusterd" exe="/usr/sbin/glusterfsd"
subj=system_u:system_r:glusterd_t:s0 key=(null)

type=SYSCALL msg=audit(1484635712.229:3062): arch=c000003e syscall=9 success=no
exit=-13 a0=0 a1=10000 a2=7 a3=22 items=0 ppid=27586 pid=27592 auid=4294967295
uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none)
ses=4294967295 comm="glusterfs" exe="/usr/sbin/glusterfsd"
subj=system_u:system_r:glusterd_t:s0 key=(null)

ausyscall says 49 is for bind and 9 for mmap. Anoop suspected that the second
syscall might be the problem at hand. As per his suggestion, I ran the tests
(just CIFS) with selinux in permissive mode and all tests passed. I'm running
the full test suite with selinux in permissive mode to confirm all is well.

Now we'll have to figure out how to make sure we can apply a specific selinux
policy for this particular access.

--- Additional comment from Xavier Hernandez on 2017-01-17 10:07:17 CET ---

I think the problem could be that the allocated memory will be used to store
code, so the PROT_EXEC flag is passed to mmap. I think this is the only
difference between this particular mmap() call and the other mmap() calls
present in gluster code.

This is probably why selinux makes mmap() fail.

Does "exit=-13" mean that the errno returned by mmap() is 13 (EACCES) ? In that
case I could add a specific error message in the patch to clearly show that
selinux could be the cause.

--- Additional comment from Anoop C S on 2017-01-17 10:34:20 CET ---

Here is the AVC in question:
type=AVC msg=audit(1484635756.506:3152): avc:  denied  { execmem } for 
pid=27918 comm="smbd" scontext=system_u:system_r:smbd_t:s0
tcontext=system_u:system_r:smbd_t:s0 tclass=process

(In reply to Xavier Hernandez from comment #20)
> I think the problem could be that the allocated memory will be used to store
> code, so the PROT_EXEC flag is passed to mmap. I think this is the only
> difference between this particular mmap() call and the other mmap() calls
> present in gluster code.
> 
> Probably this will be the cause that selinux makes mmap() to fail.
> 

So this assumption is correct, as per the following one-line explanation given
for the 'allow_execmem' selinux boolean on
https://wiki.centos.org/TipsAndTricks/SelinuxBooleans:

. . .
allow_execmem (Memory Protection)
    Allow unconfined executables to map a memory region as both executable and
writable, this is dangerous and the executable should be reported in bugzilla
. . .

So selinux will prevent this memory map by default as this particular call from
EC specifies both PROT_EXEC and PROT_WRITE.

> Does "exit=-13" mean that the errno returned by mmap() is 13 (EACCES) ? In
> that case I could add a specific error message in the patch to clearly show
> that selinux could be the cause.

Yes.. you are right. See this:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/uapi/asm-generic/errno-base.h

--- Additional comment from Xavier Hernandez on 2017-01-17 10:45:47 CET ---

If enabling only PROT_EXEC without PROT_WRITE does not trigger the selinux
check, maybe we could change the way it's done. For example we could mmap only
with PROT_WRITE, generate the code and then change the protection to only
PROT_EXEC.

That would require considerable changes since the current implementation uses
the same allocated memory to create multiple dynamic fragments of code as they
are needed. We would need to have a single mmap() for each fragment of code.

What do you think ?
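
A rough sketch of that idea (illustrative only; the function name is made up
and a real change would have to fit into ec's existing chunk management):

#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Sketch: map the region writable but not executable, emit the generated
 * code into it, then flip the protection to read+execute with mprotect(). */
static void *install_code_fragment(const void *code, size_t size)
{
        void *region;

        region = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (region == MAP_FAILED) {
                return NULL;
        }

        memcpy(region, code, size);     /* generate/copy the dynamic code */

        if (mprotect(region, size, PROT_READ | PROT_EXEC) != 0) {
                munmap(region, size);
                return NULL;
        }

        return region;  /* executable, no longer writable */
}

Whether selinux would also allow the later mprotect() to PROT_EXEC on
anonymous memory is something that would still need to be verified.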

--- Additional comment from Anoop C S on 2017-01-17 13:18:46 CET ---

(In reply to Xavier Hernandez from comment #22)
> If enabling only PROT_EXEC without PROT_WRITE does not trigger the selinux
> check, maybe we could change the way it's done.

I am pretty sure that this is the case. But I currently do not have an option
to test this out and confirm my analysis.

> For example we could mmap
> only with PROT_WRITE, generate the code and then change the protection to
> only PROT_EXEC.
>
> That would require considerable changes since the current implementation
> uses the same allocated memory to create multiple dynamic fragments of code
> as they are needed. We would need to have a single mmap() for each fragment
> of code.
> 
> What do you think ?

If we can make such a change safely without affecting the overall functionality
in EC, then I would say we try once. :)

I admit that it would be a time-consuming task, hence the following question:
is it possible to make a small change (probably a hack) in this particular
area so as to try it out and confirm? Then we can go for the bigger one.

So my request would be to go for it as and when you find time to do so. Till
then we will use the custom selinux policy to get rid of AVCs.

--- Additional comment from Nigel Babu on 2017-01-17 14:17:30 CET ---

If there's a patch that needs testing, push it onto review.gluster.org and I'm
happy to test it out (I need rpms).

--- Additional comment from Xavier Hernandez on 2017-01-18 13:56:56 CET ---

I'm trying to reproduce the problem to see if the issue can be avoided by
playing with the mmap() protection flags. However, I'm unable to get the
error.

I've used a CentOS 7.3.1611 with latest patches and default configuration, but
it doesn't fail (selinux is enabled by default). Have you used any custom setup
?

I use this small program to try to reproduce the issue:

#include <stdio.h>
#include <sys/mman.h>
#include <errno.h>

#define MMAP_SIZE 4096

int main(void)
{
        void *ptr;

        ptr = mmap(NULL, MMAP_SIZE, PROT_READ | PROT_WRITE | PROT_EXEC,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (ptr == MAP_FAILED) {
                printf("mmap() error: %d\n", errno);
                return 1;
        }

        printf("mmap succeeded\n");

        munmap(ptr, MMAP_SIZE);

        return 0;
}

--- Additional comment from Xavier Hernandez on 2017-01-19 13:16:48 CET ---

I don't know selinux well enough to configure it correctly, so I cannot test
it, but I've modified the patch to do the mmap the way it's recommended for
selinux. It works fine
--- Additional comment from Xavier Hernandez on 2017-01-19 13:17:54 CET ---

...It works fine on my test machine, without selinux.

--- Additional comment from Worker Ant on 2017-01-19 13:18:49 CET ---

REVIEW: http://review.gluster.org/16405 (cluster/ec: fix invalid error check
for mmap()) posted (#4) for review on master by Xavier Hernandez
(xhernandez at datalab.es)

--- Additional comment from Worker Ant on 2017-01-19 13:27:57 CET ---

REVIEW: http://review.gluster.org/16405 (cluster/ec: fix invalid error check
for mmap()) posted (#5) for review on master by Xavier Hernandez
(xhernandez at datalab.es)

--- Additional comment from Nigel Babu on 2017-01-19 13:34:47 CET ---

I'm a little pre-occupied today. But I'll run a test first thing tomorrow.

--- Additional comment from Anoop C S on 2017-01-19 13:39:00 CET ---

(In reply to Xavier Hernandez from comment #25)
> I'm trying to reproduce the problem to see if the issue can be avoided
> playing with the mmap() protection flags. However I'm unable to get the
> error.
> 
> I've used a CentOS 7.3.1611 with latest patches and default configuration,
> but it doesn't fail (selinux is enabled by default). Have you used any
> custom setup ?
> 
> I use this small program to try to reproduce the issue:
> 
> #include <stdio.h>
> #include <sys/mman.h>
> #include <errno.h>
> 
> #define MMAP_SIZE 4096
> 
> int main(void)
> {
>         void *ptr;
> 
>         ptr = mmap(NULL, MMAP_SIZE, PROT_READ | PROT_WRITE | PROT_EXEC,
>                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>         if (ptr == MAP_FAILED) {
>                 printf("mmap() error: %d\n", errno);
>                 return 1;
>         }
> 
>         printf("mmap succeeded\n");
> 
>         munmap(ptr, MMAP_SIZE);
> 
>         return 0;

To reproduce the AVC, please run the above program as below:

# > /var/log/audit/audit.log
# gcc mmap-selinux-test.c
# chcon -t glusterd_exec_t a.out
# runcon "system_u:system_r:glusterd_t:s0" ./a.out
# cat /var/log/audit/audit.log
type=AVC msg=audit(1484828797.810:996): avc:  denied  { execmem } for 
pid=26592 comm="a.out" scontext=system_u:system_r:glusterd_t:s0
tcontext=system_u:system_r:glusterd_t:s0 tclass=process
type=SYSCALL msg=audit(1484828797.810:996): arch=c000003e syscall=9 success=no
exit=-13 a0=0 a1=1000 a2=7 a3=22 items=0 ppid=26437 pid=26592 auid=0 uid=0
gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=16
comm="a.out" exe="/root/a.out" subj=system_u:system_r:glusterd_t:s0 key=(null)

Now remove either PROT_EXEC or PROT_WRITE from mmap call and repeat the above
steps. AVCs must not be present.

Why do we need to do all this?
===========================
Because gluster binaries are run with the following selinux context:
# ps auxZ | grep /usr/sbin/glusterd | grep -v grep
system_u:system_r:glusterd_t:s0 root      7216  0.0  1.8 602096 18748 ?       
Ssl  12:55   0:01 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO

So we need to test our sample programs in the same selinux context as well, so
that we can be sure about the behaviour.

Why couldn't you reproduce it?
==============================
Run `ps auxZ | grep /usr/sbin/glusterd | grep -v grep` and check under which
context it is running. The behaviour changes based on the context the gluster
daemon is running under.

--- Additional comment from Xavier Hernandez on 2017-01-19 13:41:51 CET ---

Thank you very much, Anoop :) I'll run some tests with this.

--- Additional comment from Worker Ant on 2017-01-20 11:08:01 CET ---

REVIEW: http://review.gluster.org/16405 (cluster/ec: fix invalid error check
for mmap()) posted (#6) for review on master by Xavier Hernandez
(xhernandez at datalab.es)

--- Additional comment from Xavier Hernandez on 2017-01-20 11:49:03 CET ---

I've been playing with selinux and I've been able to make it work. However, to
do two mmaps with the same contents, a file needs to be created. In this case,
mmap() fails if the file is not created in an executable directory. I've tried
/tmp, /var/tmp, /var/run and /root, but all failed. If I use /sbin, it works.

The reason seems to be that files created in all those directories do not have
the selinux bin_t type. If I manually set this type on the file, it works
wherever it's created.

Where's the best place to put that file without needing to manually set the
selinux type from code ?
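
A minimal sketch of the two-mapping approach described above (illustrative
only; the function name is made up and error handling is simplified):

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Sketch: back the dynamic code with a regular file and map it twice, once
 * writable (to emit the code) and once executable (to run it). Where the
 * file lives matters because selinux has to label it bin_t. */
static int map_code_file(const char *path, size_t size,
                         void **wr_map, void **exec_map)
{
        int fd;

        fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0700);
        if (fd < 0) {
                return -1;
        }
        if (ftruncate(fd, size) != 0) {
                close(fd);
                return -1;
        }

        *wr_map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        *exec_map = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
        close(fd);      /* the mappings remain valid after close() */

        if (*wr_map == MAP_FAILED || *exec_map == MAP_FAILED) {
                return -1;
        }

        /* Code written through *wr_map becomes visible through *exec_map. */
        return 0;
}

This is essentially the approach described in the commit message further
below.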

--- Additional comment from Anoop C S on 2017-01-21 10:10:01 CET ---

Hi Xavi,

I ran a search through the current policies as follows in order to see the
SELinux allow rules for glusterd_t and highlighted those in which execute
permission is granted for class 'file':
# sesearch --allow | grep -E 'allow glusterd_t [a-z|_]* : file { ' | grep
execute

From the output I think it is safe to create the file under
/usr/libexec/glusterfs/ based on the following allow rule:
allow glusterd_t glusterd_exec_t : file { ioctl read getattr lock execute
execute_no_trans entrypoint open } ;

By default, files under /usr/libexec/glusterfs will have
system_u:object_r:bin_t as the SELinux context. I confirmed this by modifying
your sample C program. I don't know whether we already have /usr/libexec
pre-defined in the glusterfs source, but I guess it's not a big deal.

Is this solution of creating the file under /usr/libexec/glusterfs/ for mmap()
acceptable for you?

--- Additional comment from Worker Ant on 2017-01-25 13:59:07 CET ---

REVIEW: https://review.gluster.org/16405 (cluster/ec: fix selinux issues with
mmap()) posted (#8) for review on master by Xavier Hernandez
(xhernandez at datalab.es)

--- Additional comment from Xavier Hernandez on 2017-01-25 14:02:56 CET ---

The latest change should satisfy all selinux restrictions, so it should work
without special rules.

Thanks Anoop for all the information you provided.

--- Additional comment from Worker Ant on 2017-02-02 13:02:32 CET ---

COMMIT: https://review.gluster.org/16405 committed in master by Jeff Darcy
(jdarcy at redhat.com) 
------
commit db80efc8d5cc24597de636d8df2e5a9ce81d670d
Author: Xavier Hernandez <xhernandez at datalab.es>
Date:   Fri Jan 13 13:54:35 2017 +0100

    cluster/ec: fix selinux issues with mmap()

    EC uses mmap() to create a memory area for the dynamic code. Since
    the code is created on the fly and executed when needed, this region
    of memory needs to have write and execution privileges.

    This combination is not allowed by default by selinux. To solve the
    problem a file is used as a backend storage for the dynamic code and
    it's mapped into two distinct memory regions, one with write access
    and the other one with execution access. This approach is the
    recommended way to create dynamic code by a program in a more secure
    way, and selinux allows it.

    Additionally selinux requires that the backend file be stored in a
    directory marked with type bin_t to be able to map it in an executable
    area. To satisfy this condition, GLUSTERFS_LIBEXECDIR has been used.

    This fix also changes the error check for mmap(), that was done
    incorrectly (it checked against NULL instead of MAP_FAILED), and it
    also correctly propagates the error codes and makes sure they aren't
    silently ignored.

    Change-Id: I71c2f88be4e4d795b6cfff96ab3799c362c54291
    BUG: 1402661
    Signed-off-by: Xavier Hernandez <xhernandez at datalab.es>
    Reviewed-on: https://review.gluster.org/16405
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
    Reviewed-by: Jeff Darcy <jdarcy at redhat.com>


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1402661
[Bug 1402661] Samba crash when mounting a distributed dispersed volume over
CIFS