[Bugs] [Bug 1361517] New: Bricks didn't become online after reboot. [Disk Full ]
bugzilla at redhat.com
Fri Jul 29 09:20:20 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1361517
Bug ID: 1361517
Summary: Bricks didn't become online after reboot. [Disk Full ]
Product: Red Hat Gluster Storage
Version: 3.1
Component: posix
Keywords: Triaged
Severity: high
Assignee: pkarampu at redhat.com
Reporter: pkarampu at redhat.com
QA Contact: storage-qa-internal at redhat.com
CC: aspandey at redhat.com, bugs at gluster.org,
ksandha at redhat.com, pkarampu at redhat.com,
ravishankar at redhat.com, rhs-bugs at redhat.com
Depends On: 1333341
+++ This bug was initially created as a clone of Bug #1333341 +++
Description of problem:
Rebooted the node hosting brick2 and started renaming the files while brick1
was full. Brick2 did not come online after the reboot; the following error was
seen in the brick logs:
"Creation of unlink directory failed"
sosreport kept at
rhsqe-repo.lab.eng.blr.redhat.com://var/www/html/sosreports/<bugid>
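For triage, it is worth confirming on the affected node that the brick
filesystem is genuinely out of space or inodes (the brick path here is taken
from the log excerpt further down; adjust for your layout):

# Check free blocks and free inodes on the brick filesystem;
# mkdir(2) fails with ENOSPC when either runs out.
df -h /rhs/brick1
df -i /rhs/brick1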
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1. Create replica 3 volume and mount the volume on client using fuse.
2. Create files using
for (( i=1; i <= 50; i++ ))
do
    dd if=/dev/zero of=file$i count=1000 bs=5M status=progress
done
3. After the creation is done, reboot the node hosting the second brick.
4. Start renaming the files from file$i to test$i (see the consolidated
script after the log excerpt below).
5. When the second brick comes back up, it fails with the errors below:
[2016-05-05 14:37:45.826772] E [MSGID: 113096]
[posix.c:6443:posix_create_unlink_dir] 0-arbiter-posix: Creating directory
/rhs/brick1/arbiter/.glusterfs/unlink failed [No space left on device]
[2016-05-05 14:37:45.826856] E [MSGID: 113096] [posix.c:6866:init]
0-arbiter-posix: Creation of unlink directory failed
[2016-05-05 14:37:45.826880] E [MSGID: 101019] [xlator.c:433:xlator_init]
0-arbiter-posix: Initialization of volume 'arbiter-posix' failed, review your
volfile again
[2016-05-05 14:37:45.826925] E [graph.c:322:glusterfs_graph_init]
0-arbiter-posix: initializing translator failed
[2016-05-05 14:37:45.826943] E [graph.c:661:glusterfs_graph_activate] 0-graph:
init failed
[2016-05-05 14:37:45.828349] W [glusterfsd.c:1251:cleanup_and_exit]
(-->/usr/sbin/glusterfsd(mgmt_getspec_cbk+0x331) [0x7f6ba63797d1]
-->/usr/sbin/glusterfsd(glusterfs_process_volfp+0x120) [0x7f6ba6374150]
-->/usr/sbin/glusterfsd(cleanup_and_exit+0x69) [0x7f6ba6373739] ) 0-: received
signum (0), shutting down
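For reference, the steps above condensed into one script. This is a sketch,
not from the original report: the volume name, hostnames, and brick paths are
taken from the status output below, while the fuse mount point /mnt/arbiter
and the host chosen for mounting are assumptions.

# Reproduction sketch; see comments for assumptions.
gluster volume create arbiter replica 3 \
  dhcp42-58.lab.eng.blr.redhat.com:/rhs/brick1/arbiter \
  dhcp43-142.lab.eng.blr.redhat.com:/rhs/brick1/arbiter \
  dhcp43-167.lab.eng.blr.redhat.com:/rhs/brick1/arbiter
gluster volume start arbiter
mount -t glusterfs dhcp42-58.lab.eng.blr.redhat.com:/arbiter /mnt/arbiter
cd /mnt/arbiter

# Step 2: fill the bricks.
for (( i=1; i <= 50; i++ )); do
    dd if=/dev/zero of=file$i count=1000 bs=5M
done

# Step 3: reboot the node hosting the second brick. Then, while it is
# still down, step 4: rename the files.
for (( i=1; i <= 50; i++ )); do
    mv file$i test$i
done

# Step 5: when the rebooted brick comes back, its log shows the
# "Creation of unlink directory failed" errors quoted above.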
Actual results:
[root@dhcp43-167 arbiter]# gluster volume status
Status of volume: arbiter
Gluster process                                                TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dhcp42-58.lab.eng.blr.redhat.com:/rhs/brick1/arbiter     N/A       N/A        N       N/A
Brick dhcp43-142.lab.eng.blr.redhat.com:/rhs/brick1/arbiter    49157     0          Y       2120
Brick dhcp43-167.lab.eng.blr.redhat.com:/rhs/brick1/arbiter    49156     0          Y       2094
NFS Server on localhost                                        2049      0          Y       2679
Self-heal Daemon on localhost                                  N/A       N/A        Y       3172
NFS Server on dhcp42-58.lab.eng.blr.redhat.com                 2049      0          Y       2195
Self-heal Daemon on dhcp42-58.lab.eng.blr.redhat.com           N/A       N/A        Y       2816
NFS Server on dhcp43-142.lab.eng.blr.redhat.com                2049      0          Y       3072
Self-heal Daemon on dhcp43-142.lab.eng.blr.redhat.com          N/A       N/A        Y       3160

Task Status of Volume arbiter
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: arbiternfs
Gluster process                                                TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dhcp42-58.lab.eng.blr.redhat.com:/rhs/brick2/arbiternfs  N/A       N/A        N       N/A
Brick dhcp43-142.lab.eng.blr.redhat.com:/rhs/brick2/arbiternfs 49158     0          Y       2128
Brick dhcp43-167.lab.eng.blr.redhat.com:/rhs/brick2/arbiternfs 49157     0          Y       2109
NFS Server on localhost                                        2049      0          Y       2679
Self-heal Daemon on localhost                                  N/A       N/A        Y       3172
NFS Server on dhcp42-58.lab.eng.blr.redhat.com                 2049      0          Y       2195
Self-heal Daemon on dhcp42-58.lab.eng.blr.redhat.com           N/A       N/A        Y       2816
NFS Server on dhcp43-142.lab.eng.blr.redhat.com                2049      0          Y       3072
Self-heal Daemon on dhcp43-142.lab.eng.blr.redhat.com          N/A       N/A        Y       3160

Task Status of Volume arbiternfs
------------------------------------------------------------------------------
There are no active volume tasks
*************************************************************
[root@dhcp43-142 arbiter]# gluster volume heal arbiternfs info
Brick dhcp42-58.lab.eng.blr.redhat.com:/rhs/brick2/arbiternfs
Status: Transport endpoint is not connected
Number of entries: -
Brick dhcp43-142.lab.eng.blr.redhat.com:/rhs/brick2/arbiternfs
/file4
/file5
Status: Connected
Number of entries: 2
Brick dhcp43-167.lab.eng.blr.redhat.com:/rhs/brick2/arbiternfs
/file4
/file5
Status: Connected
Number of entries: 2
[root@dhcp43-142 arbiter]# gluster volume heal arbiter info
Brick dhcp42-58.lab.eng.blr.redhat.com:/rhs/brick1/arbiter
Status: Transport endpoint is not connected
Number of entries: -
Brick dhcp43-142.lab.eng.blr.redhat.com:/rhs/brick1/arbiter
/ - Possibly undergoing heal
Status: Connected
Number of entries: 1
Brick dhcp43-167.lab.eng.blr.redhat.com:/rhs/brick1/arbiter
/
Status: Connected
Number of entries: 1
[root@dhcp43-142 arbiter]#
Expected results:
The bricks should be up and running, and the files should have been renamed.
Additional info:
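A plausible recovery path, not from the report itself: the log shows init
failing because mkdir of .glusterfs/unlink returns ENOSPC, so freeing even a
small amount of space on the brick filesystem should let the brick start
again. A sketch, using the volume name from the output above:

# On the node with the failed brick: free some space on the brick
# filesystem (anything outside the volume's user data, e.g. rotated
# logs or core files living on the same filesystem), then:
gluster volume start arbiter force   # respawn the dead brick process
gluster volume status arbiter        # brick should now report Online: Y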
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1333341
[Bug 1333341] Bricks didn't become online after reboot. [Disk Full ]