[Gluster-users] GlusterFS 3.7.11 crash issue
Anoop C S
anoopcs at redhat.com
Wed Jun 29 06:39:28 UTC 2016
On Tue, 2016-06-28 at 10:49 +0200, Yann LEMARIE wrote:
> Hi,
>
> I found the coredump file, but it's a 15 MB file (zipped), so I can't
> post it on this mailing list.
>
Great. In order to pinpoint the exact crash location, can you please
attach gdb to the extracted coredump file and share the complete
backtrace obtained by running the `bt` command in the gdb shell? Apart
from gdb, you may be prompted to install some debug-info packages that
are needed to extract a useful backtrace while attaching gdb as follows:
# gdb /usr/sbin/glusterfsd <path-to-coredump-file>
If prompted, install the required packages and reattach the coredump
file. Once you are at the (gdb) prompt, type 'bt' and paste the
backtrace.
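For reference, a typical session could look like the sketch below (the
glusterfs-dbg package name is an assumption for Ubuntu; install
whichever debug packages gdb actually suggests):

# apt-get install gdb glusterfs-dbg   # glusterfs-dbg is assumed; use whatever debug package gdb asks for
# gdb /usr/sbin/glusterfsd /path/to/coredump
(gdb) set pagination off
(gdb) bt
(gdb) thread apply all bt full
(gdb) quit

`thread apply all bt full` dumps the stack of every thread, which is
often more helpful than the backtrace of the crashing thread alone.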
> Here are some parts of the report:
>
> > ProblemType: Crash
> > Architecture: amd64
> > Date: Sun Jun 26 11:27:44 2016
> > DistroRelease: Ubuntu 14.04
> > ExecutablePath: /usr/sbin/glusterfsd
> > ExecutableTimestamp: 1460982898
> > ProcCmdline: /usr/sbin/glusterfsd -s nfs05 --volfile-id cdn.nfs05.srv-cdn -p /var/lib/glusterd/vols/cdn/run/nfs05-srv-cdn.pid -S /var/run/gluster/d52ac3e6c0a3fa316a9e8360976f3af5.socket --brick-name /srv/cdn -l /var/log/glusterfs/bricks/srv-cdn.log --xlator-option *-posix.glusterd-uuid=6af63b78-a3da-459d-a909-c010e6c9072c --brick-port 49155 --xlator-option cdn-server.listen-port=49155
> > ProcCwd: /
> > ProcEnviron:
> > PATH=(custom, no user)
> > TERM=linux
> > ProcMaps:
> > 7f25f18d9000-7f25f18da000 ---p 00000000 00:00 0
> > 7f25f18da000-7f25f19da000 rw-p 00000000 00:00 0 [stack:849]
> > 7f25f19da000-7f25f19db000 ---p 00000000 00:00 0
> ...
> > ProcStatus:
> > Name: glusterfsd
> > State: D (disk sleep)
> > Tgid: 7879
> > Ngid: 0
> > Pid: 7879
> > PPid: 1
> > TracerPid: 0
> > Uid: 0 0 0 0
> > Gid: 0 0 0 0
> > FDSize: 64
> > Groups: 0
> > VmPeak: 878404 kB
> > VmSize: 878404 kB
> > VmLck: 0 kB
> > VmPin: 0 kB
> > VmHWM: 96104 kB
> > VmRSS: 90652 kB
> > VmData: 792012 kB
> > VmStk: 276 kB
> > VmExe: 84 kB
> > VmLib: 7716 kB
> > VmPTE: 700 kB
> > VmSwap: 20688 kB
> > Threads: 22
> > SigQ: 0/30034
> > SigPnd: 0000000000000000
> > ShdPnd: 0000000000000000
> > SigBlk: 0000000000004a01
> > SigIgn: 0000000000001000
> > SigCgt: 00000001800000fa
> > CapInh: 0000000000000000
> > CapPrm: 0000001fffffffff
> > CapEff: 0000001fffffffff
> > CapBnd: 0000001fffffffff
> > Seccomp: 0
> > Cpus_allowed: 7fff
> > Cpus_allowed_list: 0-14
> > Mems_allowed: 00000000,00000001
> > Mems_allowed_list: 0
> > voluntary_ctxt_switches: 3
> > nonvoluntary_ctxt_switches: 1
> > Signal: 11
> > Uname: Linux 3.13.0-44-generic x86_64
> > UserGroups:
> > CoreDump: base64
> ...
>
> Yann
>
> On 28/06/2016 09:31, Anoop C S wrote:
> > On Mon, 2016-06-27 at 15:05 +0200, Yann LEMARIE wrote:
> > > @Anoop,
> > >
> > > Where can I find the coredump file ?
> > >
> > You will get hints about the crash from entries inside
> > /var/log/messages (for example the pid of the process, the location
> > of the coredump, etc.).
> >
> > > The crash occurred 2 times in the last 7 days, each time on a
> > > Sunday morning for no apparent reason, with no increase in traffic
> > > or anything like that; the volume had been mounted for 15 days.
> > >
> > > The bricks are used as a CDN of sorts, distributing small images
> > > and CSS files through an nginx HTTPS service (with a load balancer
> > > and 2 EC2 instances); on a Sunday morning there is not a lot of
> > > activity ...
> > >
> > From the very minimal backtrace that we have from the brick logs, I
> > would assume that a truncate operation was being handled by the
> > trash translator when it crashed.
> >
> > > Volume infos:
> > > > root at nfs05 /var/log/glusterfs # gluster volume info cdn
> > > >
> > > > Volume Name: cdn
> > > > Type: Replicate
> > > > Volume ID: c53b9bae-5e12-4f13-8217-53d8c96c302c
> > > > Status: Started
> > > > Number of Bricks: 1 x 2 = 2
> > > > Transport-type: tcp
> > > > Bricks:
> > > > Brick1: nfs05:/srv/cdn
> > > > Brick2: nfs06:/srv/cdn
> > > > Options Reconfigured:
> > > > performance.readdir-ahead: on
> > > > features.trash: on
> > > > features.trash-max-filesize: 20MB
> > >
> > > I don't know if it is linked to this crash, but I have another
> > > problem with my 2 servers that makes GlusterFS clients disconnect
> > > (from another volume):
> > > > Jun 24 02:28:04 nfs05 kernel: [2039468.818617] xen_netfront:
> > > > xennet: skb rides the rocket: 19 slots
> > > > Jun 24 02:28:11 nfs05 kernel: [2039475.744086] net_ratelimit: 66 callbacks suppressed
> > > It seems to be a network interface problem:
> > > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1317811
> > >
> > > Yann
> > >
> > > On 27/06/2016 12:59, Anoop C S wrote:
> > > > On Mon, 2016-06-27 at 09:47 +0200, Yann LEMARIE wrote:
> > > > > Hi,
> > > > >
> > > > > I have been using GlusterFS for many years and have never
> > > > > seen this problem, but this is the second time in one week ...
> > > > >
> > > > > I have 3 volumes with 2 bricks each, and 1 volume crashed for
> > > > > no reason,
> > > > Did you observe the crash while mounting the volume? Or can you
> > > > be more specific about what you were doing just before you saw
> > > > the crash? Can you please share the output of `gluster volume
> > > > info <VOLNAME>`?
> > > >
> > > > > I just have to stop/start the volume to bring it up again.
> > > > > The only logs I can find are in syslog:
> > > > >
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: pending frames:
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: frame : type(0) op(10)
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: patchset:
> > > > > > git://git.gluster.com/glusterfs.git
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: signal received: 11
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: time of crash:
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: 2016-06-26 09:27:44
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: configuration details:
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: argp 1
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: backtrace 1
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: dlfcn 1
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: libpthread 1
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: llistxattr 1
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: setfsid 1
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: spinlock 1
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: epoll.h 1
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: xattr.h 1
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: st_atim.tv_nsec 1
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: package-string: glusterfs 3.7.11
> > > > > > Jun 26 11:27:44 nfs05 srv-cdn[7879]: ---------
> > > > > >
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: pending frames:
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: frame : type(0) op(10)
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: patchset:
> > > > > > git://git.gluster.com/glusterfs.git
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: signal received: 11
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: time of crash:
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: 2016-06-26 09:27:44
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: configuration details:
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: argp 1
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: backtrace 1
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: dlfcn 1
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: libpthread 1
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: llistxattr 1
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: setfsid 1
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: spinlock 1
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: epoll.h 1
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: xattr.h 1
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: st_atim.tv_nsec 1
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: package-string: glusterfs 3.7.11
> > > > > > Jun 26 11:27:44 nfs06 srv-cdn[1787]: ---------
> > > > > >
> > > > >
> > > > > Thanks for your help
> > > > >
> > > > >
> > > > > Regards
> > > > > --
> > > > > Yann Lemarié
> > > > > iRaiser - Support Technique
> > > > >
> > > > > ylemarie at iraiser.eu
> > >
> > > --
> > > Yann Lemarié
> > > iRaiser - Support Technique
> > >
> > > ylemarie at iraiser.eu
> > >
> > >
>
> --
> Yann Lemarié
> iRaiser - Support Technique
>
> ylemarie at iraiser.eu
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users