[Bugs] [Bug 1324014] New: glusterd: glusted didn't come up after node reboot error" realpath () failed for brick /run/gluster/snaps/130949baac8843cda443cf8a6441157f/brick3/b3. The underlying file system may be in bad state [No such file or directory]"
bugzilla at redhat.com
Tue Apr 5 10:47:50 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1324014
Bug ID: 1324014
Summary: glusterd: glusted didn't come up after node reboot
error" realpath () failed for brick
/run/gluster/snaps/130949baac8843cda443cf8a6441157f/brick3/b3.
The underlying file system may be in bad state
[No such file or directory]"
Product: GlusterFS
Version: 3.7.10
Component: glusterd
Keywords: Triaged
Severity: urgent
Assignee: bugs at gluster.org
Reporter: amukherj at redhat.com
CC: ashah at redhat.com, bugs at gluster.org,
storage-qa-internal at redhat.com
Depends On: 1322765, 1322772
+++ This bug was initially created as a clone of Bug #1322772 +++
+++ This bug was initially created as a clone of Bug #1322765 +++
Description of problem:
After a node reboot, glusterd did not come up. It logged the error:
"realpath () failed for brick
/run/gluster/snaps/130949baac8843cda443cf8a6441157f/brick3/b3. The underlying
file system may be in bad state [No such file or directory]"
Version-Release number of selected component (if applicable):
glusterfs-3.7.9-1.el7rhgs.x86_64
How reproducible:
100%
Steps to Reproduce:
1. Create a 2x2 distributed-replicate volume
2. Enable USS (features.uss)
3. Create a snapshot and activate it
4. Reboot one of the nodes
Actual results:
glusterd is down after the node reboot
Expected results:
After the node reboot, glusterd should come up
Additional info:
[root at dhcp46-4 ~]# gluster v info
Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 60769503-f742-458d-97c0-8e090147f82a
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.46.4:/rhs/brick1/b1
Brick2: 10.70.47.46:/rhs/brick2/b2
Brick3: 10.70.46.213:/rhs/brick3/b3
Brick4: 10.70.46.148:/rhs/brick4/b4
Options Reconfigured:
performance.readdir-ahead: on
features.uss: enable
features.barrier: disable
snap-activate-on-create: enable
================================
glusterd logs from the rebooted node:
[2016-03-31 12:03:00.551394] C [MSGID: 106425]
[glusterd-store.c:2425:glusterd_store_retrieve_bricks] 0-management: realpath
() failed for brick
/run/gluster/snaps/130949baac8843cda443cf8a6441157f/brick3/b3. The underlying
file system may be in bad state [No such file or directory]
[2016-03-31 12:19:44.102994] I [rpc-clnt.c:984:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2016-03-31 12:19:44.106631] W [socket.c:870:__socket_keepalive] 0-socket:
failed to set TCP_USER_TIMEOUT -1000 on socket 15, Invalid argument
[2016-03-31 12:19:44.106676] E [socket.c:2966:socket_connect] 0-management:
Failed to set keep-alive: Invalid argument
The message "I [MSGID: 106498]
[glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo] 0-management:
connect returned 0" repeated 2 times between [2016-03-31 12:19:44.085669] and
[2016-03-31 12:19:44.086167]
[2016-03-31 12:19:44.114223] C [MSGID: 106425]
[glusterd-store.c:2425:glusterd_store_retrieve_bricks] 0-management: realpath
() failed for brick
/run/gluster/snaps/130949baac8843cda443cf8a6441157f/brick3/b3. The underlying
file system may be in bad state [No such file or directory]
[2016-03-31 12:19:44.114364] E [MSGID: 106201]
[glusterd-store.c:3082:glusterd_store_retrieve_volumes] 0-management: Unable to
restore volume: 130949baac8843cda443cf8a6441157f
[2016-03-31 12:19:44.114387] E [MSGID: 106195]
[glusterd-store.c:3475:glusterd_store_retrieve_snap] 0-management: Failed to
retrieve snap volumes for snap snap1
[2016-03-31 12:19:44.114399] E [MSGID: 106043]
[glusterd-store.c:3629:glusterd_store_retrieve_snaps] 0-management: Unable to
restore snapshot: snap1
[2016-03-31 12:19:44.114509] E [MSGID: 101019] [xlator.c:433:xlator_init]
0-management: Initialization of volume 'management' failed, review your volfile
again
[2016-03-31 12:19:44.114542] E [graph.c:322:glusterfs_graph_init] 0-management:
initializing translator failed
[2016-03-31 12:19:44.114554] E [graph.c:661:glusterfs_graph_activate] 0-graph:
init failed
[2016-03-31 12:19:44.115626] W [glusterfsd.c:1251:cleanup_and_exit]
(-->/usr/sbin/glusterd(glusterfs_volumes_init+0xfd) [0x7fc632a1b2ad]
-->/usr/sbin/glusterd(glusterfs_process_volfp+0x120) [0x7fc632a1b150]
-->/usr/sbin/glusterd(cleanup_and_exit+0x69) [0x7fc632a1a739] ) 0-: received
signum (0), shutting down
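For triage, the failure chain in the log above (one C entry followed by a run of E entries) can be pulled out mechanically. The following is a minimal Python sketch, not a Gluster tool: the regular expression is an assumption derived only from the entries shown in this report, and parse_entry and severe are hypothetical helper names. It expects one string per log entry and collapses the line wrapping added by the mail archive.

```python
import re

# Entry layout as seen in this report (an assumption, not a formal spec):
# [timestamp] LEVEL [MSGID: n] [file:line:function] message
LOG_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"(?:\[MSGID: (?P<msgid>\d+)\]\s+)?"      # MSGID is absent on some entries
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<rest>.*)"
)


def parse_entry(entry):
    """Parse one (possibly line-wrapped) log entry into a dict, or None."""
    flat = " ".join(entry.split())  # undo wrapping by collapsing whitespace
    match = LOG_RE.match(flat)
    return match.groupdict() if match else None


def severe(entries):
    """Keep only entries logged at critical (C) or error (E) level."""
    parsed = (parse_entry(e) for e in entries)
    return [p for p in parsed if p and p["level"] in ("C", "E")]
```

Applied to the entries above, this would keep the glusterd-store.c, xlator.c and graph.c failures and drop the I and W entries.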
--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-03-31
05:46:54 EDT ---
This bug is automatically being proposed for the current z-stream release of
Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.
If this bug should be proposed for a different release, please manually change
the proposed release flag.
--- Additional comment from RHEL Product and Program Management on 2016-03-31
06:02:19 EDT ---
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases,
it is also being identified as a blocker for this release.
Please resolve ASAP.
--- Additional comment from Vijay Bellur on 2016-03-31 06:18:46 EDT ---
REVIEW: http://review.gluster.org/13869 (glusterd: build realpath post recreate
of brick mount for snapshot) posted (#1) for review on master by Atin Mukherjee
(amukherj at redhat.com)
--- Additional comment from Vijay Bellur on 2016-04-01 06:16:10 EDT ---
REVIEW: http://review.gluster.org/13869 (glusterd: build realpath post recreate
of brick mount for snapshot) posted (#2) for review on master by Atin Mukherjee
(amukherj at redhat.com)
--- Additional comment from Vijay Bellur on 2016-04-04 00:53:45 EDT ---
REVIEW: http://review.gluster.org/13869 (glusterd: build realpath post recreate
of brick mount for snapshot) posted (#3) for review on master by Atin Mukherjee
(amukherj at redhat.com)
--- Additional comment from Vijay Bellur on 2016-04-04 03:07:51 EDT ---
REVIEW: http://review.gluster.org/13869 (glusterd: build realpath post recreate
of brick mount for snapshot) posted (#4) for review on master by Atin Mukherjee
(amukherj at redhat.com)
--- Additional comment from Vijay Bellur on 2016-04-04 11:19:52 EDT ---
REVIEW: http://review.gluster.org/13869 (glusterd: build realpath post recreate
of brick mount for snapshot) posted (#5) for review on master by Atin Mukherjee
(amukherj at redhat.com)
--- Additional comment from Vijay Bellur on 2016-04-05 06:46:51 EDT ---
COMMIT: http://review.gluster.org/13869 committed in master by Atin Mukherjee
(amukherj at redhat.com)
------
commit d3c77459593255ed2c88094c8477b8a0c9ff9073
Author: Atin Mukherjee <amukherj at redhat.com>
Date: Thu Mar 31 14:58:02 2016 +0530
glusterd: build realpath post recreate of brick mount for snapshot
Commit a60c39d introduced a new field called real_path in brickinfo to hold the
realpath() conversion. However at restore path for all snapshots and snapshot
restored volumes the brickpath gets recreated post restoration of bricks which
means the realpath () call will fail here for all the snapshots and cloned
volumes.
Fix is to store the realpath for snapshots and clones post recreating the brick
mounts. For normal volume it would be done during retrieving the brick details
from the store.
Change-Id: Ia34853acddb28bcb7f0f70ca85fabcf73276ef13
BUG: 1322772
Signed-off-by: Atin Mukherjee <amukherj at redhat.com>
Reviewed-on: http://review.gluster.org/13869
NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
CentOS-regression: Gluster Build System <jenkins at build.gluster.com>
Reviewed-by: Avra Sengupta <asengupt at redhat.com>
Reviewed-by: Rajesh Joseph <rjoseph at redhat.com>
Smoke: Gluster Build System <jenkins at build.gluster.com>
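The ordering problem described in the commit message can be illustrated outside of glusterd. The following is a minimal Python sketch, not the glusterd code: resolve_brick, restore_snapshot_brick and the recreate_mount callback are hypothetical names, and creating a plain directory stands in for remounting the snapshot brick. Path.resolve(strict=True) is used because, like realpath(3), it fails when a path component does not exist.

```python
import os
import tempfile
from pathlib import Path


def resolve_brick(path):
    """Return the canonical path, or None if the path does not exist yet."""
    try:
        # strict=True behaves like realpath(3): a missing component is an error.
        return str(Path(path).resolve(strict=True))
    except FileNotFoundError:
        return None


def restore_snapshot_brick(brick_path, recreate_mount):
    """Hypothetical restore flow: resolve too early, remount, resolve again."""
    early = resolve_brick(brick_path)  # before the mount exists: fails
    recreate_mount(brick_path)         # stand-in for recreating the brick mount
    late = resolve_brick(brick_path)   # after the mount exists: succeeds
    return early, late


def demo():
    """Run the flow against a temporary directory tree."""
    with tempfile.TemporaryDirectory() as root:
        brick = os.path.join(root, "snaps", "snapvol", "brick3", "b3")
        return restore_snapshot_brick(brick, os.makedirs)
```

In this sketch the early call returns None and the late call returns the resolved path, which matches the approach the commit message describes: for snapshots and clones, resolve the path only after the brick mount has been recreated.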
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1322765
[Bug 1322765] glusterd: glusted didn't come up after node reboot error"
realpath () failed for brick
/run/gluster/snaps/130949baac8843cda443cf8a6441157f/brick3/b3. The
underlying file system may be in bad state [No such file or directory]"
https://bugzilla.redhat.com/show_bug.cgi?id=1322772
[Bug 1322772] glusterd: glusted didn't come up after node reboot error"
realpath () failed for brick
/run/gluster/snaps/130949baac8843cda443cf8a6441157f/brick3/b3. The
underlying file system may be in bad state [No such file or directory]"
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.