[Bugs] [Bug 1324014] New: glusterd: glusted didn't come up after node reboot error" realpath () failed for brick /run/gluster/snaps/130949baac8843cda443cf8a6441157f/brick3/b3. The underlying file system may be in bad state [No such file or directory]"

bugzilla at redhat.com bugzilla at redhat.com
Tue Apr 5 10:47:50 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1324014

            Bug ID: 1324014
           Summary: glusterd: glusted didn't come up after node reboot
                    error" realpath () failed for brick
                    /run/gluster/snaps/130949baac8843cda443cf8a6441157f/brick3/b3.
                    The underlying file system may be in bad state
                    [No such file or directory]"
           Product: GlusterFS
           Version: 3.7.10
         Component: glusterd
          Keywords: Triaged
          Severity: urgent
          Assignee: bugs at gluster.org
          Reporter: amukherj at redhat.com
                CC: ashah at redhat.com, bugs at gluster.org,
                    storage-qa-internal at redhat.com
        Depends On: 1322765, 1322772



+++ This bug was initially created as a clone of Bug #1322772 +++

+++ This bug was initially created as a clone of Bug #1322765 +++

Description of problem:

After a node reboot, glusterd did not come up. It failed with the error:
"realpath () failed for brick
/run/gluster/snaps/130949baac8843cda443cf8a6441157f/brick3/b3. The underlying
file system may be in bad state [No such file or directory]"

Version-Release number of selected component (if applicable):

glusterfs-3.7.9-1.el7rhgs.x86_64


How reproducible:

100%

Steps to Reproduce:
1. Create a 2x2 distributed-replicate volume
2. Enable USS
3. Create a snapshot and activate it
4. Reboot one of the nodes

Actual results:

glusterd is down after node reboot

Expected results:

After node reboot, glusterd should come up

Additional info:

[root@dhcp46-4 ~]# gluster v info

Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 60769503-f742-458d-97c0-8e090147f82a
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.46.4:/rhs/brick1/b1
Brick2: 10.70.47.46:/rhs/brick2/b2
Brick3: 10.70.46.213:/rhs/brick3/b3
Brick4: 10.70.46.148:/rhs/brick4/b4
Options Reconfigured:
performance.readdir-ahead: on
features.uss: enable
features.barrier: disable
snap-activate-on-create: enable


================================
glusterd logs from the rebooted node

[2016-03-31 12:03:00.551394] C [MSGID: 106425] [glusterd-store.c:2425:glusterd_store_retrieve_bricks] 0-management: realpath () failed for brick /run/gluster/snaps/130949baac8843cda443cf8a6441157f/brick3/b3. The underlying file system may be in bad state [No such file or directory]

[2016-03-31 12:19:44.102994] I [rpc-clnt.c:984:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2016-03-31 12:19:44.106631] W [socket.c:870:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 15, Invalid argument
[2016-03-31 12:19:44.106676] E [socket.c:2966:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
The message "I [MSGID: 106498] [glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0" repeated 2 times between [2016-03-31 12:19:44.085669] and [2016-03-31 12:19:44.086167]
[2016-03-31 12:19:44.114223] C [MSGID: 106425] [glusterd-store.c:2425:glusterd_store_retrieve_bricks] 0-management: realpath () failed for brick /run/gluster/snaps/130949baac8843cda443cf8a6441157f/brick3/b3. The underlying file system may be in bad state [No such file or directory]
[2016-03-31 12:19:44.114364] E [MSGID: 106201] [glusterd-store.c:3082:glusterd_store_retrieve_volumes] 0-management: Unable to restore volume: 130949baac8843cda443cf8a6441157f
[2016-03-31 12:19:44.114387] E [MSGID: 106195] [glusterd-store.c:3475:glusterd_store_retrieve_snap] 0-management: Failed to retrieve snap volumes for snap snap1
[2016-03-31 12:19:44.114399] E [MSGID: 106043] [glusterd-store.c:3629:glusterd_store_retrieve_snaps] 0-management: Unable to restore snapshot: snap1
[2016-03-31 12:19:44.114509] E [MSGID: 101019] [xlator.c:433:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2016-03-31 12:19:44.114542] E [graph.c:322:glusterfs_graph_init] 0-management: initializing translator failed
[2016-03-31 12:19:44.114554] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
[2016-03-31 12:19:44.115626] W [glusterfsd.c:1251:cleanup_and_exit] (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xfd) [0x7fc632a1b2ad] -->/usr/sbin/glusterd(glusterfs_process_volfp+0x120) [0x7fc632a1b150] -->/usr/sbin/glusterd(cleanup_and_exit+0x69) [0x7fc632a1a739] ) 0-: received signum (0), shutting down

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-03-31
05:46:54 EDT ---

This bug is automatically being proposed for the current z-stream release of
Red Hat Gluster Storage 3 by setting the release flag 'rhgs‑3.1.z' to '?'. 

If this bug should be proposed for a different release, please manually change
the proposed release flag.

--- Additional comment from RHEL Product and Program Management on 2016-03-31
06:02:19 EDT ---

This bug report has Keywords: Regression or TestBlocker.

Since no regressions or test blockers are allowed between releases,
it is also being identified as a blocker for this release.

Please resolve ASAP.

--- Additional comment from Vijay Bellur on 2016-03-31 06:18:46 EDT ---

REVIEW: http://review.gluster.org/13869 (glusterd: build realpath post recreate
of brick mount for snapshot) posted (#1) for review on master by Atin Mukherjee
(amukherj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-04-01 06:16:10 EDT ---

REVIEW: http://review.gluster.org/13869 (glusterd: build realpath post recreate
of brick mount for snapshot) posted (#2) for review on master by Atin Mukherjee
(amukherj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-04-04 00:53:45 EDT ---

REVIEW: http://review.gluster.org/13869 (glusterd: build realpath post recreate
of brick mount for snapshot) posted (#3) for review on master by Atin Mukherjee
(amukherj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-04-04 03:07:51 EDT ---

REVIEW: http://review.gluster.org/13869 (glusterd: build realpath post recreate
of brick mount for snapshot) posted (#4) for review on master by Atin Mukherjee
(amukherj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-04-04 11:19:52 EDT ---

REVIEW: http://review.gluster.org/13869 (glusterd: build realpath post recreate
of brick mount for snapshot) posted (#5) for review on master by Atin Mukherjee
(amukherj at redhat.com)

--- Additional comment from Vijay Bellur on 2016-04-05 06:46:51 EDT ---

COMMIT: http://review.gluster.org/13869 committed in master by Atin Mukherjee
(amukherj at redhat.com) 
------
commit d3c77459593255ed2c88094c8477b8a0c9ff9073
Author: Atin Mukherjee <amukherj at redhat.com>
Date:   Thu Mar 31 14:58:02 2016 +0530

    glusterd: build realpath post recreate of brick mount for snapshot

    Commit a60c39d introduced a new field called real_path in brickinfo to hold
    the realpath() conversion. However at restore path for all snapshots and
    snapshot restored volumes the brickpath gets recreated post restoration of
    bricks which means the realpath () call will fail here for all the
    snapshots and cloned volumes.

    Fix is to store the realpath for snapshots and clones post recreating the
    brick mounts. For normal volume it would be done during retrieving the
    brick details from the store.

    Change-Id: Ia34853acddb28bcb7f0f70ca85fabcf73276ef13
    BUG: 1322772
    Signed-off-by: Atin Mukherjee <amukherj at redhat.com>
    Reviewed-on: http://review.gluster.org/13869
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Avra Sengupta <asengupt at redhat.com>
    Reviewed-by: Rajesh Joseph <rjoseph at redhat.com>
    Smoke: Gluster Build System <jenkins at build.gluster.com>
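
The ordering problem described in the commit message above can be illustrated
with a minimal C sketch. This is not the GlusterFS patch. The helper names
resolve_brick_path() and demo_restore_order() are hypothetical, and a plain
mkdir() in a temporary directory stands in for recreating the snapshot brick
mount. The only assumption carried over from the report is that realpath()
fails with ENOENT ("No such file or directory") while the brick directory
does not exist yet, and succeeds once it has been recreated.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not GlusterFS code: resolve `path` into `out`.
 * Returns 0 on success, or the errno value set by realpath() on failure. */
int resolve_brick_path(const char *path, char out[PATH_MAX])
{
    assert(path != NULL && out != NULL);
    if (realpath(path, out) == NULL)
        return errno;
    return 0;
}

/* Hypothetical demo: try to resolve a brick path before and after its
 * directory exists. Returns 0 when the first attempt fails with ENOENT
 * and the second attempt succeeds, -1 otherwise. */
int demo_restore_order(void)
{
    char base[] = "/tmp/brick-demo-XXXXXX";
    char brick[PATH_MAX];
    char resolved[PATH_MAX];
    int before, after;

    if (mkdtemp(base) == NULL)
        return -1;
    snprintf(brick, sizeof(brick), "%s/brick3", base);

    /* Brick directory is missing, as after a reboot before remount. */
    before = resolve_brick_path(brick, resolved);

    /* Stand-in for recreating the snapshot brick mount. */
    if (mkdir(brick, 0755) != 0) {
        rmdir(base);
        return -1;
    }

    /* Resolving after the directory exists is expected to succeed. */
    after = resolve_brick_path(brick, resolved);

    rmdir(brick);
    rmdir(base);
    return (before == ENOENT && after == 0) ? 0 : -1;
}
```

Calling demo_restore_order() from a small main() should return 0 on a POSIX
system with a writable /tmp. The point of the sketch is only the ordering:
the path is resolved after the directory it names has been recreated.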


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1322765
[Bug 1322765] glusterd: glusted didn't come up after node reboot error"
realpath () failed for brick
/run/gluster/snaps/130949baac8843cda443cf8a6441157f/brick3/b3. The
underlying file system may be in bad state [No such file or directory]"
https://bugzilla.redhat.com/show_bug.cgi?id=1322772
[Bug 1322772] glusterd: glusted didn't come up after node reboot error"
realpath () failed for brick
/run/gluster/snaps/130949baac8843cda443cf8a6441157f/brick3/b3. The
underlying file system may be in bad state [No such file or directory]"