[Gluster-users] Brick-Xlators crashes after Set-RO and Read

Thu May 16 07:53:36 UTC 2019

Hello Vijay,

I could reproduce the issue. After doing a simple directory listing from
Win10 PowerShell, all brick processes crash. It's not the same scenario as
mentioned before, but the crash report in the brick logs is the same.
Attached you will find the backtrace.
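
In case it helps anyone reading the brick logs directly: the frame lines in
the crash report quoted below have the form path(symbol+offset)[address], and
the first frame after the libc signal frame appears to be in worm.so. Below is
a minimal Python sketch for splitting such lines, assuming exactly that
format. The names parse_frame and addr2line_cmd are hypothetical helpers, not
part of Gluster, and the addr2line command only gives useful output if debug
symbols for the library are installed.

```python
import re
from typing import List, NamedTuple, Optional

# Matches frame lines such as
#   /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0xb910)[0x7f9a4ef0e910]
# Leading mail quote markers (">>>>> ") are skipped because search() is used.
FRAME_RE = re.compile(
    r"(?P<module>/\S+?)\((?P<symbol>[^+()]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]"
)


class Frame(NamedTuple):
    module: str   # shared object or binary the frame belongs to
    symbol: str   # empty when the log only gives a module-relative offset
    offset: int   # relative to the symbol if present, else to the module
    address: int  # absolute address in the crashed process


def parse_frame(line: str) -> Optional[Frame]:
    """Return the parsed frame, or None if the line is not a frame line."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return Frame(m["module"], m["symbol"],
                 int(m["offset"], 16), int(m["address"], 16))


def addr2line_cmd(frame: Frame) -> List[str]:
    """Build an addr2line command for a frame with a module-relative offset."""
    if frame.symbol:
        # The offset is relative to the symbol here, so it cannot be passed
        # to addr2line as-is.
        raise ValueError("offset is symbol-relative: " + frame.symbol)
    return ["addr2line", "-f", "-C", "-e", frame.module, hex(frame.offset)]
```

Feeding it the worm.so(+0xb910) line from the log yields a command that can
be run on a brick node to map the offset to a function name.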

Regards
David Spisla

On Tue, May 7, 2019 at 8:08 PM Vijay Bellur <vbellur at redhat.com> wrote:

> Hello David,
>
> On Tue, May 7, 2019 at 2:16 AM David Spisla <spisla80 at gmail.com> wrote:
>
>> Hello Vijay,
>>
>> how can I create such a core file? Or will it be created automatically if
>> a gluster process crashes?
>> Maybe you can give me a hint and I will try to get a backtrace.
>>
>
> Generation of a core file depends on the system configuration. `man 5
> core` contains useful information on how to get a core file generated in a
> directory. Once a core file is generated, you can use gdb to get a
> backtrace of all threads (using "thread apply all bt full").
>
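
The gdb step described above can also be run non-interactively. This is a
minimal Python sketch, assuming gdb is installed on the brick node; the
function names and the example paths are illustrative assumptions, not taken
from this thread. It uses gdb's --batch and -ex options to run the quoted
"thread apply all bt full" command and write the output to a file.

```python
import subprocess
from typing import List


def gdb_backtrace_cmd(binary: str, core: str) -> List[str]:
    """Build a gdb command that prints a full backtrace of all threads."""
    return ["gdb", "--batch", "-ex", "thread apply all bt full", binary, core]


def collect_backtrace(binary: str, core: str, out_path: str) -> None:
    """Run gdb on the core file and save its output to out_path."""
    with open(out_path, "w") as out:
        subprocess.run(gdb_backtrace_cmd(binary, core),
                       stdout=out, stderr=subprocess.STDOUT, check=True)


# Example call (both paths are assumptions; adjust to the local system):
# collect_backtrace("/usr/sbin/glusterfsd", "/path/to/core", "backtrace.log")
```

The backtrace is much more readable when the matching debuginfo packages for
glusterfs are installed.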
>
>> Unfortunately this bug is not easy to reproduce because it appears only
>> sometimes.
>>
>
> If the bug is not easy to reproduce, having a backtrace from the generated
> core would be very useful!
>
> Thanks,
> Vijay
>
>
>>
>> Regards
>> David Spisla
>>
>> On Mon, May 6, 2019 at 7:48 PM Vijay Bellur <vbellur at redhat.com> wrote:
>>
>>> Thank you for the report, David. Do you have core files available on any
>>> of the servers? If yes, would it be possible for you to provide a backtrace?
>>>
>>> Regards,
>>> Vijay
>>>
>>> On Mon, May 6, 2019 at 3:09 AM David Spisla <spisla80 at gmail.com> wrote:
>>>
>>>> Hello folks,
>>>>
>>>> we have a client application (running on Win10) which performs some FOPs
>>>> on a Gluster volume that is accessed via SMB.
>>>>
>>>> *Scenario 1* is a READ operation which reads all files successively
>>>> and checks whether each file's data was copied correctly. While doing
>>>> this, all brick processes crash, and every brick log contains this crash
>>>> report:
>>>>
>>>>> CTX_ID:a0359502-2c76-4fee-8cb9-365679dc690e-GRAPH_ID:0-PID:32934-HOST:XX-XXXXX-XX-XX-PC_NAME:shortterm-client-2-RECON_NO:-0, gfid: 00000000-0000-0000-0000-000000000001, req(uid:2000,gid:2000,perm:1,ngrps:1), ctx(uid:0,gid:0,in-groups:0,perm:700,updated-fop:LOOKUP, acl:-) [Permission denied]
>>>>> pending frames:
>>>>> frame : type(0) op(27)
>>>>> frame : type(0) op(40)
>>>>> patchset: git://git.gluster.org/glusterfs.git
>>>>> signal received: 11
>>>>> time of crash:
>>>>> 2019-04-16 08:32:21
>>>>> configuration details:
>>>>> argp 1
>>>>> backtrace 1
>>>>> dlfcn 1
>>>>> libpthread 1
>>>>> llistxattr 1
>>>>> setfsid 1
>>>>> spinlock 1
>>>>> epoll.h 1
>>>>> xattr.h 1
>>>>> st_atim.tv_nsec 1
>>>>> package-string: glusterfs 5.5
>>>>> /usr/lib64/libglusterfs.so.0(+0x2764c)[0x7f9a5bd4d64c]
>>>>> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7f9a5bd57d26]
>>>>> /lib64/libc.so.6(+0x361a0)[0x7f9a5af141a0]
>>>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0xb910)[0x7f9a4ef0e910]
>>>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x8118)[0x7f9a4ef0b118]
>>>>> /usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0x128d6)[0x7f9a4f1278d6]
>>>>> /usr/lib64/glusterfs/5.5/xlator/features/access-control.so(+0x575b)[0x7f9a4f35975b]
>>>>> /usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0xb3b3)[0x7f9a4f1203b3]
>>>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x85b2)[0x7f9a4ef0b5b2]
>>>>> /usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7f9a5bdd7b6c]
>>>>> /usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7f9a5bdd7b6c]
>>>>> /usr/lib64/glusterfs/5.5/xlator/features/upcall.so(+0xf548)[0x7f9a4e8cf548]
>>>>> /usr/lib64/libglusterfs.so.0(default_lookup_resume+0x1e2)[0x7f9a5bdefc22]
>>>>> /usr/lib64/libglusterfs.so.0(call_resume+0x75)[0x7f9a5bd733a5]
>>>>> /usr/lib64/glusterfs/5.5/xlator/performance/io-threads.so(+0x6088)[0x7f9a4e6b7088]
>>>>> /lib64/libpthread.so.0(+0x7569)[0x7f9a5b29f569]
>>>>> /lib64/libc.so.6(clone+0x3f)[0x7f9a5afd69af]
>>>>>
>>>> *Scenario 2*: The application just sets Read-Only on each file
>>>> successively. After the 70th file was set, all the bricks crashed, and
>>>> again this crash report appears in every brick log:
>>>>
>>>>> [2019-05-02 07:43:39.953591] I [MSGID: 139001] [posix-acl.c:263:posix_acl_log_permit_denied] 0-longterm-access-control: client: CTX_ID:21aa9c75-3a5f-41f9-925b-48e4c80bd24a-GRAPH_ID:0-PID:16325-HOST:XXX-X-X-XXX-PC_NAME:longterm-client-0-RECON_NO:-0, gfid: 00000000-0000-0000-0000-000000000001, req(uid:2000,gid:2000,perm:1,ngrps:1), ctx(uid:0,gid:0,in-groups:0,perm:700,updated-fop:LOOKUP, acl:-) [Permission denied]
>>>>> pending frames:
>>>>> frame : type(0) op(27)
>>>>> patchset: git://git.gluster.org/glusterfs.git
>>>>> signal received: 11
>>>>> time of crash:
>>>>> 2019-05-02 07:43:39
>>>>> configuration details:
>>>>> argp 1
>>>>> backtrace 1
>>>>> dlfcn 1
>>>>> libpthread 1
>>>>> llistxattr 1
>>>>> setfsid 1
>>>>> spinlock 1
>>>>> epoll.h 1
>>>>> xattr.h 1
>>>>> st_atim.tv_nsec 1
>>>>> package-string: glusterfs 5.5
>>>>> /usr/lib64/libglusterfs.so.0(+0x2764c)[0x7fbb3f0b364c]
>>>>> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7fbb3f0bdd26]
>>>>> /lib64/libc.so.6(+0x361e0)[0x7fbb3e27a1e0]
>>>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0xb910)[0x7fbb32257910]
>>>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x8118)[0x7fbb32254118]
>>>>> /usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0x128d6)[0x7fbb324708d6]
>>>>> /usr/lib64/glusterfs/5.5/xlator/features/access-control.so(+0x575b)[0x7fbb326a275b]
>>>>> /usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0xb3b3)[0x7fbb324693b3]
>>>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x85b2)[0x7fbb322545b2]
>>>>> /usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7fbb3f13db6c]
>>>>> /usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7fbb3f13db6c]
>>>>> /usr/lib64/glusterfs/5.5/xlator/features/upcall.so(+0xf548)[0x7fbb31c18548]
>>>>> /usr/lib64/libglusterfs.so.0(default_lookup_resume+0x1e2)[0x7fbb3f155c22]
>>>>> /usr/lib64/libglusterfs.so.0(call_resume+0x75)[0x7fbb3f0d93a5]
>>>>> /usr/lib64/glusterfs/5.5/xlator/performance/io-threads.so(+0x6088)[0x7fbb31a00088]
>>>>> /lib64/libpthread.so.0(+0x7569)[0x7fbb3e605569]
>>>>> /lib64/libc.so.6(clone+0x3f)[0x7fbb3e33c9ef]
>>>>>
>>>>
>>>> This happens on a 3-node Gluster v5.5 cluster on two different volumes.
>>>> Both volumes have the same settings:
>>>>
>>>>> Volume Name: shortterm
>>>>> Type: Replicate
>>>>> Volume ID: 5307e5c5-e8a1-493a-a846-342fb0195dee
>>>>> Status: Started
>>>>> Snapshot Count: 0
>>>>> Number of Bricks: 1 x 3 = 3
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: fs-xxxxx-c1-n1:/gluster/brick4/glusterbrick
>>>>> Brick2: fs-xxxxx-c1-n2:/gluster/brick4/glusterbrick
>>>>> Brick3: fs-xxxxx-c1-n3:/gluster/brick4/glusterbrick
>>>>> Options Reconfigured:
>>>>> storage.reserve: 1
>>>>> performance.client-io-threads: off
>>>>> nfs.disable: on
>>>>> transport.address-family: inet
>>>>> user.smb: disable
>>>>> features.read-only: off
>>>>> features.worm: off
>>>>> features.worm-file-level: on
>>>>> features.retention-mode: enterprise
>>>>> features.default-retention-period: 120
>>>>> network.ping-timeout: 10
>>>>> features.cache-invalidation: on
>>>>> features.cache-invalidation-timeout: 600
>>>>> performance.nl-cache: on
>>>>> performance.nl-cache-timeout: 600
>>>>> client.event-threads: 32
>>>>> server.event-threads: 32
>>>>> cluster.lookup-optimize: on
>>>>> performance.stat-prefetch: on
>>>>> performance.cache-invalidation: on
>>>>> performance.md-cache-timeout: 600
>>>>> performance.cache-samba-metadata: on
>>>>> performance.cache-ima-xattrs: on
>>>>> performance.io-thread-count: 64
>>>>> cluster.use-compound-fops: on
>>>>> performance.cache-size: 512MB
>>>>> performance.cache-refresh-timeout: 10
>>>>> performance.read-ahead: off
>>>>> performance.write-behind-window-size: 4MB
>>>>> performance.write-behind: on
>>>>> storage.build-pgfid: on
>>>>> features.utime: on
>>>>> storage.ctime: on
>>>>> cluster.quorum-type: fixed
>>>>> cluster.quorum-count: 2
>>>>> features.bitrot: on
>>>>> features.scrub: Active
>>>>> features.scrub-freq: daily
>>>>> cluster.enable-shared-storage: enable
>>>>>
>>>>>
>>>> Why can this happen to all brick processes? I don't understand the
>>>> crash report. The FOPs are nothing special, and after restarting the
>>>> brick processes everything works fine and our application succeeded.
>>>>
>>>> Regards
>>>> David Spisla
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: backtrace.log
Type: application/octet-stream
Size: 36515 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20190516/42a4c56d/attachment.obj>