[Gluster-users] Lots of mount points failing with core dumps, help!

Franco Broi franco.broi at iongeo.com
Tue Aug 5 06:24:11 UTC 2014


On Mon, 2014-08-04 at 12:31 +0200, Niels de Vos wrote: 
> On Mon, Aug 04, 2014 at 05:05:10PM +0800, Franco Broi wrote:
> > 
> > A bit more background to this.
> > 
> > I was running 3.4.3 on all the clients (120+ nodes) but I also have a
> > 3.5 volume which I wanted to mount on the same nodes. The 3.4.3 client
> > mounts of the 3.5 volume would sometimes hang on mount requiring a
> > volume stop/start to clear. I raised this issue on this list but it was
> > never resolved. I also tried to downgrade the 3.5 volume to 3.4 but that
> > also didn't work.
> > 
> > I had a single client node running 3.5 and it was able to mount both
> > volumes so I decided to update everything on the client side.
> > 
> > Middle of last week I did a glusterfs update from 3.4.3 to 3.5.1 and
> > everything appeared to be ok. The existing 3.4.3 mounts continued to
> > work and I was able to mount the 3.5 volume without any of the hanging
> > problems I was seeing before. Great, I thought.
> > 
> > Today mount points started to fail, both for the 3.4 volume with the 3.4
> > client and for the 3.5 volume with the 3.5 client.
> > 
> > I've been remounting the filesystems as they break but it's a pretty
> > unstable environment.
> > 
> > BTW, is there some way to get gluster to write its core files somewhere
> > other than the root filesystem? If I could do that I might at least get
> > a complete core dump to run gdb on.
> 
> You can set a sysctl with a path, for example:
> 
>     # mkdir /var/cores
>     # mount /dev/local_vg/cores /var/cores
>     # sysctl -w kernel.core_pattern=/var/cores/core

Thanks for that.
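
For anyone else hitting this: the pattern can also take format specifiers so
repeated crashes don't overwrite each other, and it can be made persistent in
/etc/sysctl.conf. Something like this (the /var/cores path is just an example):

    # sysctl -w kernel.core_pattern=/var/cores/core.%e.%p.%t
    # echo 'kernel.core_pattern = /var/cores/core.%e.%p.%t' >> /etc/sysctl.conf

%e is the executable name, %p the pid and %t a timestamp, so each glusterfs
crash ends up in its own file.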

> 
> I am not sure if the "mismatching layouts" can cause a segmentation 
> fault. In any case, it would be good to get the extended attributes for 
> the directories in question. The xattrs contain the hash-range (layout) 
> on where the files should get located.
> 
> For all bricks (replace the "..." with the path for the brick):
> 
>    # getfattr -m. -ehex -d .../promax_data/115_endurance/31fasttrackstk
> 
> Please also include a "gluster volume info $VOLUME".

Please see attached.
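
In case it's useful to anyone else, a one-liner along these lines will grab
the xattrs for that directory from every brick on a storage node (brick mount
points as in the volume info below):

    # for b in /data*/gvol; do getfattr -m. -ehex -d $b/promax_data/115_endurance/31fasttrackstk; done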


> 
> You should also file a bug for this, core dumping should definitely not 
> happen.
> 
> Thanks,
> Niels
> 
> 
> 
> >
> > Cheers,
> > 
> > On Mon, 2014-08-04 at 12:53 +0530, Pranith Kumar Karampuri wrote: 
> > > CC dht folks
> > > 
> > > Pranith
> > > On 08/04/2014 11:52 AM, Franco Broi wrote:
> > > > I've had a sudden spate of mount points failing with Transport endpoint
> > > > not connected and core dumps. The dumps are so large and my root
> > > > partitions so small that I haven't managed to get a decent traceback.
> > > >
> > > > BFD: Warning: //core.2351 is truncated: expected core file size >=
> > > > 165773312, found: 154107904.
> > > > [New Thread 2351]
> > > > [New Thread 2355]
> > > > [New Thread 2359]
> > > > [New Thread 2356]
> > > > [New Thread 2354]
> > > > [New Thread 2360]
> > > > [New Thread 2352]
> > > > Cannot access memory at address 0x1700000006
> > > > (gdb) where
> > > > #0  glusterfs_signals_setup (ctx=0x8b17c0) at glusterfsd.c:1715
> > > > Cannot access memory at address 0x7fffaa46b2e0
> > > >
> > > >
> > > > Log file is full of messages like this:
> > > >
> > > > [2014-08-04 06:10:11.160482] I [dht-common.c:623:dht_revalidate_cbk] 0-data-dht: mismatching layouts for /promax_data/115_endurance/31fasttrackstk
> > > > [2014-08-04 06:10:11.160495] I [dht-layout.c:718:dht_layout_dir_mismatch] 0-data-dht: /promax_data/115_endurance/31fasttrackstk - disk layout missing
> > > > [2014-08-04 06:10:11.160502] I [dht-common.c:623:dht_revalidate_cbk] 0-data-dht: mismatching layouts for /promax_data/115_endurance/31fasttrackstk
> > > > [2014-08-04 06:10:11.160514] I [dht-layout.c:718:dht_layout_dir_mismatch] 0-data-dht: /promax_data/115_endurance/31fasttrackstk - disk layout missing
> > > > [2014-08-04 06:10:11.160522] I [dht-common.c:623:dht_revalidate_cbk] 0-data-dht: mismatching layouts for /promax_data/115_endurance/31fasttrackstk
> > > > [2014-08-04 06:10:11.160622] I [dht-layout.c:718:dht_layout_dir_mismatch] 0-data-dht: /promax_data/115_endurance/31fasttrackstk - disk layout missing
> > > > [2014-08-04 06:10:11.160634] I [dht-common.c:623:dht_revalidate_cbk] 0-data-dht: mismatching layouts for /promax_data/115_endurance/31fasttrackstk
> > > >
> > > >
> > > > I'm running 3.5.1 on the client side and 3.4.3 on the server.
> > > >
> > > > Any quick help much appreciated.
> > > >
> > > > Cheers,

-------------- next part --------------
# file: data1/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x00000001000000002e8ba2e845d1745b

# file: data2/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x0000000100000000d1745d14e8ba2e87

# file: data3/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x0000000100000000e8ba2e88ffffffff

# file: data4/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c

# file: data5/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c

# file: data6/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c

# file: data7/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c

# file: data8/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c

# file: data10/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x00000001000000001745d1742e8ba2e7

# file: data11/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x000000010000000045d1745c5d1745cf

# file: data12/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x00000001000000005d1745d0745d1743

# file: data9/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x0000000100000000000000001745d173

# file: data13/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x0000000100000000745d17448ba2e8b7

# file: data14/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x00000001000000008ba2e8b8a2e8ba2b

# file: data15/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x0000000100000000a2e8ba2cba2e8b9f

# file: data16/gvol/promax_data/115_endurance/31fasttrackstk
trusted.gfid=0x8cb00624a1d846ce819e27bce29f742c
trusted.glusterfs.dht=0x0000000100000000ba2e8ba0d1745d13

 
Volume Name: data
Type: Distribute
Volume ID: 11d03f34-cc91-469f-afc3-35005db0faef
Status: Started
Number of Bricks: 16
Transport-type: tcp
Bricks:
Brick1: nas1-10g:/data1/gvol
Brick2: nas2-10g:/data5/gvol
Brick3: nas1-10g:/data2/gvol
Brick4: nas2-10g:/data6/gvol
Brick5: nas1-10g:/data3/gvol
Brick6: nas2-10g:/data7/gvol
Brick7: nas1-10g:/data4/gvol
Brick8: nas2-10g:/data8/gvol
Brick9: nas3-10g:/data9/gvol
Brick10: nas3-10g:/data10/gvol
Brick11: nas3-10g:/data11/gvol
Brick12: nas3-10g:/data12/gvol
Brick13: nas4-10g:/data13/gvol
Brick14: nas4-10g:/data14/gvol
Brick15: nas4-10g:/data15/gvol
Brick16: nas4-10g:/data16/gvol
Options Reconfigured:
nfs.export-volumes: on
nfs.disable: off
cluster.min-free-disk: 5%
network.frame-timeout: 10800
cluster.readdir-optimize: off
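
PS: a note on reading the dht xattrs above, in case it helps (my reading,
happy to be corrected): the last two 32-bit words of trusted.glusterfs.dht
look like the start and end of the hash range assigned to that brick, with
the leading words being a count and a hash-type field, and the ranges from
the bricks that do have the xattr tile the full 32-bit space between them.
What stands out is that data4 through data8 have no trusted.glusterfs.dht at
all, which would match the "disk layout missing" messages in the client log.
To pull the range out of a value, e.g. data1's entry from the dump:

    # v=0x00000001000000002e8ba2e845d1745b
    # echo "range start: 0x${v:18:8}  end: 0x${v:26:8}"
    range start: 0x2e8ba2e8  end: 0x45d1745b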

