[Gluster-users] Cannot list `.snaps/<snapname>` directory

Riccardo Murri riccardo.murri at gmail.com
Wed Jul 18 10:28:34 UTC 2018


Hello,

I am trying out USS (User Serviceable Snapshots) on an existing cluster
(GlusterFS 3.12.9 on Ubuntu 16.04, installed from the DEB packages on
GlusterFS.org).

I can successfully create a snapshot, and it is correctly listed under
the volume's `.snaps` directory everywhere; for example:

  $ stat .snaps/test_GMT-2018.07.18-10.02.05
  File: '.snaps/test_GMT-2018.07.18-10.02.05'
  Size: 4096            Blocks: 8          IO Block: 131072 directory
  Device: 2bh/43d Inode: 11895042009497816687  Links: 6
  Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
  Access: 2018-07-16 13:33:29.649519330 +0000
  Modify: 2018-07-16 13:33:29.548396587 +0000
  Change: 2018-07-16 13:33:29.645519332 +0000
  Birth: -

However, running:

    ls .snaps/test_GMT-2018.07.18-10.02.05/

just hangs forever.  Running the same command under `strace` shows
that it is stuck repeating this `getdents` system call:

    getdents(3, [{d_ino=15424314298293051273, d_off=9223372036854775805, d_reclen=32, d_name="modulefiles", d_type=DT_DIR},
                 {d_ino=1, d_off=7346686173273481909, d_reclen=24, d_name="..", d_type=DT_DIR},
                 {d_ino=13594788684206387598, d_off=7356868894832018317, d_reclen=24, d_name=".", d_type=DT_DIR},
                 {d_ino=9350865815378619609, d_off=9223372036854775805, d_reclen=24, d_name="bin", d_type=DT_DIR}],
             131072) = 104
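
A minimal Python sketch of a directory listing that stops instead of
looping. It assumes the hang is a directory stream that keeps returning
the same entries (as the repeated `getdents` calls suggest), not a call
that blocks inside the kernel; in the latter case it would hang as well.
The function name `list_dir_bounded` and the `max_entries` cap are
hypothetical, chosen only for illustration.

```python
import os
import sys


def list_dir_bounded(path, max_entries=100_000):
    """List `path`, stopping at the first repeated name.

    Returns (names, looped): the entry names in the order they were
    returned, and True if a name came back a second time, which a
    well-behaved directory stream should never do.
    """
    seen = set()
    names = []
    # os.scandir() skips "." and "..", so only real entries are compared.
    with os.scandir(path) as entries:
        for count, entry in enumerate(entries, start=1):
            if entry.name in seen:
                return names, True
            seen.add(entry.name)
            names.append(entry.name)
            if count >= max_entries:
                # Safety cap (an arbitrary value) for very large directories.
                break
    return names, False


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    names, looped = list_dir_bounded(target)
    for name in names:
        print(name)
    if looped:
        print("directory stream repeated an entry; stopped", file=sys.stderr)
```

Saved under any file name and run with the snapshot directory as its
only argument, it prints the entries once and reports on stderr if the
stream started over.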

The server-side logs show no issue that I can see (the last lines of
the logs and the output of some status commands are included below).

What could be the problem here?

Thanks,
Riccardo

(Note for the logs: snap was created at 10:02 UTC.)

$ tail glusterd.log bricks/run-gluster-snaps*log snaps/glusterfs/snapd.log
[2018-07-18 10:02:07.183666] I
[glusterd-utils.c:6047:glusterd_brick_start] 0-management: starting a
fresh brick process for brick
/run/gluster/snaps/2bd07d5165de4277be94d3bc2b4a6ff4/brick4
[2018-07-18 10:02:07.194801] I
[rpc-clnt.c:1044:rpc_clnt_connection_init] 0-management: setting
frame-timeout to 600
[2018-07-18 10:02:07.296262] I [socket.c:2474:socket_event_handler]
0-transport: EPOLLERR - disconnecting now
[2018-07-18 10:02:07.297582] I [MSGID: 106005]
[glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management:
Brick glusterfs-server-001:/run/gluster/snaps/2bd07d5165de4277be94d3bc2b4a6ff4/brick4
has disconnected from glusterd.
[2018-07-18 10:02:07.298327] W [MSGID: 106057]
[glusterd-snapshot-utils.c:410:glusterd_snap_volinfo_find]
0-management: Snap volume
2bd07d5165de4277be94d3bc2b4a6ff4.glusterfs-server-001.run-gluster-snaps-2bd07d5165de4277be94d3bc2b4a6ff4-brick4
not found [Invalid argument]
[2018-07-18 10:02:07.587928] I [MSGID: 106143]
[glusterd-pmap.c:295:pmap_registry_bind] 0-pmap: adding brick
/run/gluster/snaps/2bd07d5165de4277be94d3bc2b4a6ff4/brick4 on port
49154

==> bricks/run-gluster-snaps-2bd07d5165de4277be94d3bc2b4a6ff4-brick4.log <==
166:     option transport.socket.keepalive-interval 2
167:     option transport.socket.keepalive-count 9
168:     option transport.listen-backlog 10
169:     subvolumes /run/gluster/snaps/2bd07d5165de4277be94d3bc2b4a6ff4/brick4
170: end-volume
171:
+------------------------------------------------------------------------------+
[2018-07-18 10:02:50.156027] I [addr.c:55:compare_addr_and_update]
0-/run/gluster/snaps/2bd07d5165de4277be94d3bc2b4a6ff4/brick4: allowed
= "*", received addr = "172.23.62.200"
[2018-07-18 10:02:50.156366] I [login.c:111:gf_auth] 0-auth/login:
allowed user names: f75e457a-7898-4863-9754-2209ddf47573
[2018-07-18 10:02:50.156402] I [MSGID: 115029]
[server-handshake.c:793:server_setvolume]
0-2bd07d5165de4277be94d3bc2b4a6ff4-server: accepted client from
glusterfs-server-005-16843-2018/07/18-10:02:50:142015-glusterfs-client-12-0-0
(version: 3.12.9)

==> snaps/glusterfs/snapd.log <==
[2018-07-18 08:27:43.710242] I [addr.c:55:compare_addr_and_update]
0-snapd-glusterfs: allowed = "*", received addr = "172.23.62.189"
[2018-07-18 08:27:43.710291] I [MSGID: 115029]
[server-handshake.c:793:server_setvolume] 0-glusterfs-server: accepted
client from slurm-master-001-28933-2018/06/11-14:28:30:530655-glusterfs-snapd-client-45-0
(version: 3.12.9)
[2018-07-18 08:27:43.713616] E
[server-handshake.c:385:server_first_lookup] 0-snapd-glusterfs: lookup
on root failed: Success
[2018-07-18 08:27:43.729767] I [addr.c:55:compare_addr_and_update]
0-snapd-glusterfs: allowed = "*", received addr = "172.23.62.189"
[2018-07-18 08:27:43.729809] I [MSGID: 115029]
[server-handshake.c:793:server_setvolume] 0-glusterfs-server: accepted
client from slurm-master-001-9204-2018/07/10-12:11:24:974712-glusterfs-snapd-client-18-0
(version: 3.12.9)
[2018-07-18 08:27:43.730260] E
[server-handshake.c:385:server_first_lookup] 0-snapd-glusterfs: lookup
on root failed: Success
[2018-07-18 08:28:49.401381] I
[snapview-server-mgmt.c:22:mgmt_cbk_snap] 0-mgmt: list of snapshots
changed
[2018-07-18 09:57:29.838877] I
[snapview-server-mgmt.c:22:mgmt_cbk_snap] 0-mgmt: list of snapshots
changed
[2018-07-18 09:58:28.026000] I
[snapview-server-mgmt.c:22:mgmt_cbk_snap] 0-mgmt: list of snapshots
changed
[2018-07-18 10:02:11.611279] I
[snapview-server-mgmt.c:22:mgmt_cbk_snap] 0-mgmt: list of snapshots
changed

$ pdsh -g glusterfs_server 'sudo lvs' | dshbak -c
----------------
glusterfs-server-001
----------------
  LV                                 VG   Attr       LSize Pool      Origin        Data%  Meta%  Move Log Cpy%Sync Convert
  2bd07d5165de4277be94d3bc2b4a6ff4_0 data Vwi-aot--- 4.00t glusterfs srv_glusterfs 32.30
  glusterfs                          data twi-aot--- 4.88t                         26.48  13.65
  srv_glusterfs                      data Vwi-aot--- 4.00t glusterfs               32.30
----------------
glusterfs-server-002
----------------
  LV                                 VG   Attr       LSize Pool      Origin        Data%  Meta%  Move Log Cpy%Sync Convert
  2bd07d5165de4277be94d3bc2b4a6ff4_0 data Vwi-aot--- 4.00t glusterfs srv_glusterfs 32.28
  glusterfs                          data twi-aot--- 4.88t                         26.46  13.66
  srv_glusterfs                      data Vwi-aot--- 4.00t glusterfs               32.28
----------------
glusterfs-server-003
----------------
  LV                                 VG   Attr       LSize Pool      Origin        Data%  Meta%  Move Log Cpy%Sync Convert
  2bd07d5165de4277be94d3bc2b4a6ff4_0 data Vwi-aot--- 4.00t glusterfs srv_glusterfs 57.46
  glusterfs                          data twi-aot--- 4.88t                         47.10  23.88
  srv_glusterfs                      data Vwi-aot--- 4.00t glusterfs               57.46
----------------
glusterfs-server-004
----------------
  LV                                 VG   Attr       LSize Pool      Origin        Data%  Meta%  Move Log Cpy%Sync Convert
  2bd07d5165de4277be94d3bc2b4a6ff4_0 data Vwi-aot--- 4.00t glusterfs srv_glusterfs 46.72
  glusterfs                          data twi-aot--- 4.88t                         38.30  19.53
  srv_glusterfs                      data Vwi-aot--- 4.00t glusterfs               46.72
----------------
glusterfs-server-005
----------------
  LV                                 VG   Attr       LSize Pool      Origin        Data%  Meta%  Move Log Cpy%Sync Convert
  2bd07d5165de4277be94d3bc2b4a6ff4_0 data Vwi-aot--- 4.00t glusterfs srv_glusterfs 65.51
  glusterfs                          data twi---t--- 4.88t                         53.70  27.20
  srv_glusterfs                      data Vwi-aot--- 4.00t glusterfs               65.51

$ sudo gluster peer status
Number of Peers: 4

Hostname: glusterfs-server-005
Uuid: d53398f6-19d4-4633-8bc3-e493dac41789
State: Peer in Cluster (Connected)

Hostname: glusterfs-server-004
Uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318
State: Peer in Cluster (Connected)

Hostname: glusterfs-server-003
Uuid: 3c74d2b4-a4f3-42d4-9511-f6174b0a641d
State: Peer in Cluster (Connected)

Hostname: glusterfs-server-002
Uuid: 5951cff1-2155-40b0-a94f-84ff2e4aa7c6
State: Peer in Cluster (Connected)

$ sudo gluster volume status
Status of volume: glusterfs
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick glusterfs-server-005:/s
rv/glusterfs                                49152     0          Y       12007
Brick glusterfs-server-004:/s
rv/glusterfs                                49152     0          Y       14713
Brick glusterfs-server-003:/s
rv/glusterfs                                49152     0          Y       14993
Brick glusterfs-server-001:/s
rv/glusterfs                                49152     0          Y       20735
Brick glusterfs-server-002:/s
rv/glusterfs                                49152     0          Y       19967
Snapshot Daemon on localhost                49153     0          Y       28323
Snapshot Daemon on glusterfs-
server-004                                  49153     0          Y       21229
Snapshot Daemon on glusterfs-
server-005                                  49153     0          Y       16843
Snapshot Daemon on glusterfs-
server-003                                  49153     0          Y       20673
Snapshot Daemon on glusterfs-
server-002                                  49153     0          Y       21441

Task Status of Volume glusterfs
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 0eaf6ad1-df95-48f4-b941-17488010ddcc
Status               : completed

$ sudo gluster snapshot status

Snap Name : test_GMT-2018.07.18-10.02.05
Snap UUID : 1b63ce46-334d-4faf-9081-a5b68d415786

        Brick Path        :
glusterfs-server-005:/run/gluster/snaps/2bd07d5165de4277be94d3bc2b4a6ff4/brick1
        Volume Group      :   data
        Brick Running     :   Yes
        Brick PID         :   17074
        Data Percentage   :   65.51
        LV Size           :   4.00t


        Brick Path        :
glusterfs-server-004:/run/gluster/snaps/2bd07d5165de4277be94d3bc2b4a6ff4/brick2
        Volume Group      :   data
        Brick Running     :   Yes
        Brick PID         :   21452
        Data Percentage   :   46.72
        LV Size           :   4.00t


        Brick Path        :
glusterfs-server-003:/run/gluster/snaps/2bd07d5165de4277be94d3bc2b4a6ff4/brick3
        Volume Group      :   data
        Brick Running     :   Yes
        Brick PID         :   20881
        Data Percentage   :   57.46
        LV Size           :   4.00t


        Brick Path        :
glusterfs-server-001:/run/gluster/snaps/2bd07d5165de4277be94d3bc2b4a6ff4/brick4
        Volume Group      :   data
        Brick Running     :   Yes
        Brick PID         :   28925
        Data Percentage   :   32.30
        LV Size           :   4.00t


        Brick Path        :
glusterfs-server-002:/run/gluster/snaps/2bd07d5165de4277be94d3bc2b4a6ff4/brick5
        Volume Group      :   data
        Brick Running     :   Yes
        Brick PID         :   21746
        Data Percentage   :   32.28
        LV Size           :   4.00t

