[Bugs] [Bug 1492307] New: glusterfsd crash with features.lock-heal on

bugzilla at redhat.com bugzilla at redhat.com
Sat Sep 16 09:41:09 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1492307

            Bug ID: 1492307
           Summary: glusterfsd crash with features.lock-heal on
           Product: GlusterFS
           Version: 3.12
         Component: locks
          Severity: low
          Assignee: bugs at gluster.org
          Reporter: gluster at jahu.sk
                CC: bugs at gluster.org



Description of problem:
The Gluster brick daemon (glusterfsd) crashes with SIGSEGV when the
features.lock-heal option is on.

# gluster volume info
Volume Name: $volname
Type: Replicate
Volume ID: xxx
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: srv01:/srv/gfs/$volname/brick
Brick2: srv02:/srv/gfs/$volname/brick
Options Reconfigured:
performance.parallel-readdir: on
features.locks-revocation-secs: 1800
nfs.disable: on
transport.address-family: inet
cluster.min-free-disk: 3
diagnostics.brick-log-level: WARNING
diagnostics.client-log-level: WARNING
performance.cache-max-file-size: 10MB
performance.cache-refresh-timeout: 60
performance.readdir-ahead: off
performance.md-cache-timeout: 600
performance.client-io-threads: on
storage.linux-aio: on
features.lock-heal: on
cluster.readdir-optimize: on
diagnostics.client-sys-log-level: CRITICAL
diagnostics.brick-sys-log-level: CRITICAL
performance.cache-size: 256MB
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.cache-samba-metadata: on
performance.cache-invalidation: on
network.inode-lru-limit: 90000
cluster.favorite-child-policy: mtime
cluster.enable-shared-storage: enable


Version-Release number of selected component (if applicable):
3.12.0, 3.12.1

How reproducible:
gluster volume set $volname features.lock-heal on
/etc/init.d/glusterfs-server stop
/etc/init.d/glusterfs-server start



Actual results:
Final graph:
+------------------------------------------------------------------------------+
  1: volume $volname-posix
  2:     type storage/posix
  3:     option glusterd-uuid xx
  4:     option directory /srv/gfs/$volname/brick
  5:     option volume-id xx
  6:     option shared-brick-count 1
  7:     option linux-aio on
  8: end-volume
  9:
 10: volume $volname-trash
 11:     type features/trash
 12:     option trash-dir .trashcan
 13:     option brick-path /srv/gfs/$volname/brick
 14:     option trash-internal-op off
 15:     subvolumes $volname-posix
 16: end-volume
 17:
 18: volume $volname-changetimerecorder
 19:     type features/changetimerecorder
 20:     option db-type sqlite3
 21:     option hot-brick off
 22:     option db-name brick.db
 23:     option db-path /srv/gfs/$volname/brick/.glusterfs/
 24:     option record-exit off
 25:     option ctr_link_consistency off
 26:     option ctr_lookupheal_link_timeout 300
 27:     option ctr_lookupheal_inode_timeout 300
 28:     option record-entry on
 29:     option ctr-enabled off
 30:     option record-counters off
 31:     option ctr-record-metadata-heat off
 32:     option sql-db-cachesize 12500
 33:     option sql-db-wal-autocheckpoint 25000
 34:     subvolumes $volname-trash
 35: end-volume
 36:
 37: volume $volname-changelog
 38:     type features/changelog
 39:     option changelog-brick /srv/gfs/$volname/brick
 40:     option changelog-dir /srv/gfs/$volname/brick/.glusterfs/changelogs
 41:     option changelog-barrier-timeout 120
 42:     subvolumes $volname-changetimerecorder
 43: end-volume
 44:
 45: volume $volname-bitrot-stub
 46:     type features/bitrot-stub
 47:     option export /srv/gfs/$volname/brick
 48:     option bitrot disable
 49:     subvolumes $volname-changelog
 50: end-volume
 51:
 52: volume $volname-access-control
 53:     type features/access-control
 54:     subvolumes $volname-bitrot-stub
 55: end-volume
 56:
 57: volume $volname-locks
 58:     type features/locks
 59:     option revocation-secs 1800
 60:     subvolumes $volname-access-control
 61: end-volume
 62:
 63: volume $volname-worm
 64:     type features/worm
 65:     option worm off
 66:     option worm-file-level off
 67:     subvolumes $volname-locks
 68: end-volume
 69:
 70: volume $volname-read-only
 71:     type features/read-only
 72:     option read-only off
 73:     subvolumes $volname-worm
 74: end-volume
 75:
 76: volume $volname-leases
 77:     type features/leases
 78:     option leases off
 79:     subvolumes $volname-read-only
 80: end-volume
 81:
 82: volume $volname-upcall
 83:     type features/upcall
 84:     option cache-invalidation on
 85:     option cache-invalidation-timeout 600
 86:     subvolumes $volname-leases
 87: end-volume
 88:
 89: volume $volname-io-threads
 90:     type performance/io-threads
 91:     subvolumes $volname-upcall
 92: end-volume
 93:
 94: volume $volname-selinux
 95:     type features/selinux
 96:     option selinux on
 97:     subvolumes $volname-io-threads
 98: end-volume
 99:
100: volume $volname-marker
101:     type features/marker
102:     option volume-uuid xx
103:     option timestamp-file /var/lib/glusterd/vols/$volname/marker.tstamp
104:     option quota-version 0
105:     option xtime off
106:     option gsync-force-xtime off
107:     option quota off
108:     option inode-quota off
109:     subvolumes $volname-selinux
110: end-volume
111:
112: volume $volname-barrier
113:     type features/barrier
114:     option barrier disable
115:     option barrier-timeout 120
116:     subvolumes $volname-marker
117: end-volume
118:
119: volume $volname-index
120:     type features/index
121:     option index-base /srv/gfs/$volname/brick/.glusterfs/indices
122:     option xattrop-dirty-watchlist trusted.afr.dirty
123:     option xattrop-pending-watchlist trusted.afr.$volname-
124:     subvolumes $volname-barrier
125: end-volume
126:
127: volume $volname-quota
128:     type features/quota
129:     option volume-uuid $volname
130:     option server-quota off
131:     option deem-statfs off
132:     subvolumes $volname-index
133: end-volume
134:
135: volume $volname-io-stats
136:     type debug/io-stats
137:     option unique-id /srv/gfs/$volname/brick
138:     option log-level WARNING
139:     option sys-log-level CRITICAL
140:     option latency-measurement off
141:     option count-fop-hits off
142:     subvolumes $volname-quota
143: end-volume
144:
145: volume /srv/gfs/$volname/brick
146:     type performance/decompounder
147:     option auth.addr./srv/gfs/$volname/brick.allow xx
148:     option auth-path /srv/gfs/$volname/brick
149:     option auth.login.xx
150:     option auth.login./srv/gfs/$volname/brick.allow xx
151:     subvolumes $volname-io-stats
152: end-volume
153:
154: volume $volname-server
155:     type protocol/server
156:     option transport.socket.listen-port 49153
157:     option rpc-auth.auth-glusterfs on
158:     option rpc-auth.auth-unix on
159:     option rpc-auth.auth-null on
160:     option rpc-auth-allow-insecure on
161:     option transport-type tcp
162:     option transport.address-family inet
163:     option auth.login./srv/gfs/$volname/brick.allow xx
164:     option auth.login.xx
165:     option auth-path /srv/gfs/$volname/brick
166:     option auth.addr./srv/gfs/$volname/brick.allow xx
167:     option inode-lru-limit 90000
168:     option transport.socket.keepalive 1
169:     option lk-heal on
170:     option transport.tcp-user-timeout 0
171:     option transport.socket.keepalive-time 20
172:     option transport.socket.keepalive-interval 2
173:     option transport.socket.keepalive-count 9
174:     option transport.listen-backlog 10
175:     subvolumes /srv/gfs/$volname/brick
176: end-volume
177:
+------------------------------------------------------------------------------+


pending frames:
frame : type(0) op(26)
frame : type(0) op(11)
frame : type(0) op(27)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2017-09-16 08:25:01
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.1
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xaa)[0x7fe80ffc49ea]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x2e7)[0x7fe80ffce6c7]
/lib/x86_64-linux-gnu/libc.so.6(+0x354b0)[0x7fe80f3b74b0]
/usr/lib/x86_64-linux-gnu/glusterfs/3.12.1/xlator/features/upcall.so(+0xb8af)[0x7fe80825d8af]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_lk_resume+0x1c2)[0x7fe810051262]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(call_resume+0x75)[0x7fe80ffe7325]
/usr/lib/x86_64-linux-gnu/glusterfs/3.12.1/xlator/performance/io-threads.so(+0x4974)[0x7fe80804b974]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7fe80f7536ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fe80f4893dd]
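
The frames above appear to follow the glibc backtrace_symbols layout,
module(symbol+offset)[address]. Below is a minimal sketch, assuming that layout
holds for every frame, that splits a frame line into its parts so that the
unresolved frames (for example the one in upcall.so, which carries only an
offset) can be passed to a symbolizer once debug symbols are installed. The
names Frame and parse_frame are hypothetical and not part of GlusterFS.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: /path/to/module(symbol+0xOFFSET)[0xADDRESS]
# The symbol part is empty when the frame could not be resolved.
FRAME_RE = re.compile(
    r"^(?P<module>[^()\s]+)"
    r"\((?P<symbol>[^+()]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


class Frame(NamedTuple):
    module: str
    symbol: Optional[str]  # None when the log shows only "+0x..."
    offset: int
    address: int


def parse_frame(line: str) -> Optional[Frame]:
    """Parse one backtrace line; return None if it does not match the layout."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    return Frame(
        module=m.group("module"),
        symbol=m.group("symbol") or None,
        offset=int(m.group("offset"), 16),
        address=int(m.group("address"), 16),
    )


if __name__ == "__main__":
    sample = (
        "/usr/lib/x86_64-linux-gnu/glusterfs/3.12.1/xlator/features/"
        "upcall.so(+0xb8af)[0x7fe80825d8af]"
    )
    frame = parse_frame(sample)
    # Module path and hex offset are what a symbolizer typically needs.
    print(frame.module, hex(frame.offset))
```

Lines that are not frames (for example "signal received: 11") yield None, so
the whole crash section of a brick log can be fed through parse_frame line by
line.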


Expected results:
The brick daemon should not crash.

Additional info:
At first I suspected cache-invalidation (which was on), so I turned that
feature off as well, but the brick daemon still crashed.

# gluster volume set $volname features.cache-invalidation off
# gluster volume set $volname performance.cache-invalidation off
# /etc/init.d/glusterfs-server stop
# /etc/init.d/glusterfs-server start


Final graph:
+------------------------------------------------------------------------------+
  1: volume $volname-posix
  2:     type storage/posix
  3:     option glusterd-uuid xx
  4:     option directory /srv/gfs/$volname/brick
  5:     option volume-id xx
  6:     option shared-brick-count 1
  7:     option linux-aio on
  8: end-volume
  9:
 10: volume $volname-trash
 11:     type features/trash
 12:     option trash-dir .trashcan
 13:     option brick-path /srv/gfs/$volname/brick
 14:     option trash-internal-op off
 15:     subvolumes $volname-posix
 16: end-volume
 17:
 18: volume $volname-changetimerecorder
 19:     type features/changetimerecorder
 20:     option db-type sqlite3
 21:     option hot-brick off
 22:     option db-name brick.db
 23:     option db-path /srv/gfs/$volname/brick/.glusterfs/
 24:     option record-exit off
 25:     option ctr_link_consistency off
 26:     option ctr_lookupheal_link_timeout 300
 27:     option ctr_lookupheal_inode_timeout 300
 28:     option record-entry on
 29:     option ctr-enabled off
 30:     option record-counters off
 31:     option ctr-record-metadata-heat off
 32:     option sql-db-cachesize 12500
 33:     option sql-db-wal-autocheckpoint 25000
 34:     subvolumes $volname-trash
 35: end-volume
 36:
 37: volume $volname-changelog
 38:     type features/changelog
 39:     option changelog-brick /srv/gfs/$volname/brick
 40:     option changelog-dir /srv/gfs/$volname/brick/.glusterfs/changelogs
 41:     option changelog-barrier-timeout 120
 42:     subvolumes $volname-changetimerecorder
 43: end-volume
 44:
 45: volume $volname-bitrot-stub
 46:     type features/bitrot-stub
 47:     option export /srv/gfs/$volname/brick
 48:     option bitrot disable
 49:     subvolumes $volname-changelog
 50: end-volume
 51:
 52: volume $volname-access-control
 53:     type features/access-control
 54:     subvolumes $volname-bitrot-stub
 55: end-volume
 56:
 57: volume $volname-locks
 58:     type features/locks
 59:     option revocation-secs 1800
 60:     subvolumes $volname-access-control
 61: end-volume
 62:
 63: volume $volname-worm
 64:     type features/worm
 65:     option worm off
 66:     option worm-file-level off
 67:     subvolumes $volname-locks
 68: end-volume
 69:
 70: volume $volname-read-only
 71:     type features/read-only
 72:     option read-only off
 73:     subvolumes $volname-worm
 74: end-volume
 75:
 76: volume $volname-leases
 77:     type features/leases
 78:     option leases off
 79:     subvolumes $volname-read-only
 80: end-volume
 81:
 82: volume $volname-upcall
 83:     type features/upcall
 84:     option cache-invalidation off
 85:     option cache-invalidation-timeout 600
 86:     subvolumes $volname-leases
 87: end-volume
 88:
 89: volume $volname-io-threads
 90:     type performance/io-threads
 91:     subvolumes $volname-upcall
 92: end-volume
 93:
 94: volume $volname-selinux
 95:     type features/selinux
 96:     option selinux on
 97:     subvolumes $volname-io-threads
 98: end-volume
 99:
100: volume $volname-marker
101:     type features/marker
102:     option volume-uuid xx
103:     option timestamp-file /var/lib/glusterd/vols/$volname/marker.tstamp
104:     option quota-version 0
105:     option xtime off
106:     option gsync-force-xtime off
107:     option quota off
108:     option inode-quota off
109:     subvolumes $volname-selinux
110: end-volume
111:
112: volume $volname-barrier
113:     type features/barrier
114:     option barrier disable
115:     option barrier-timeout 120
116:     subvolumes $volname-marker
117: end-volume
118:
119: volume $volname-index
120:     type features/index
121:     option index-base /srv/gfs/$volname/brick/.glusterfs/indices
122:     option xattrop-dirty-watchlist trusted.afr.dirty
123:     option xattrop-pending-watchlist trusted.afr.$volname-
124:     subvolumes $volname-barrier
125: end-volume
126:
127: volume $volname-quota
128:     type features/quota
129:     option volume-uuid $volname
130:     option server-quota off
131:     option deem-statfs off
132:     subvolumes $volname-index
133: end-volume
134:
135: volume $volname-io-stats
136:     type debug/io-stats
137:     option unique-id /srv/gfs/$volname/brick
138:     option log-level WARNING
139:     option sys-log-level CRITICAL
140:     option latency-measurement off
141:     option count-fop-hits off
142:     subvolumes $volname-quota
143: end-volume
144:
145: volume /srv/gfs/$volname/brick
146:     type performance/decompounder
147:     option auth.addr./srv/gfs/$volname/brick.allow xx
148:     option auth-path /srv/gfs/$volname/brick
149:     option auth.login.xx
150:     option auth.login./srv/gfs/$volname/brick.allow xx
151:     subvolumes $volname-io-stats
152: end-volume
153:
154: volume $volname-server
155:     type protocol/server
156:     option transport.socket.listen-port 49153
157:     option rpc-auth.auth-glusterfs on
158:     option rpc-auth.auth-unix on
159:     option rpc-auth.auth-null on
160:     option rpc-auth-allow-insecure on
161:     option transport-type tcp
162:     option transport.address-family inet
163:     option auth.login./srv/gfs/$volname/brick.allow xx
164:     option auth.login.xx
165:     option auth-path /srv/gfs/$volname/brick
166:     option auth.addr./srv/gfs/$volname/brick.allow xx
167:     option inode-lru-limit 90000
168:     option transport.socket.keepalive 1
169:     option lk-heal on
170:     option transport.tcp-user-timeout 0
171:     option transport.socket.keepalive-time 20
172:     option transport.socket.keepalive-interval 2
173:     option transport.socket.keepalive-count 9
174:     option transport.listen-backlog 10
175:     subvolumes /srv/gfs/$volname/brick
176: end-volume
177:
+------------------------------------------------------------------------------+
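
The two "Final graph" dumps in this report are long and nearly identical, so
the change between runs is easy to miss. Below is a minimal sketch, assuming
the dump format shown in this report (an "N:" line-number prefix, then
volume / option / end-volume lines), that parses each dump into a dictionary
and lists the options whose values differ. The function names parse_graph and
diff_graphs are hypothetical helpers, not GlusterFS tools.

```python
import re
from typing import Dict, List, Optional, Tuple

# Strips the "  12: " line-number prefix that the brick log adds to graph lines.
PREFIX_RE = re.compile(r"^\s*\d+:\s?")

Graph = Dict[str, Dict[str, str]]


def parse_graph(text: str) -> Graph:
    """Return {volume name: {option name: value}} for one graph dump."""
    graph: Graph = {}
    current: Optional[str] = None
    for raw in text.splitlines():
        line = PREFIX_RE.sub("", raw).strip()
        if line.startswith("volume "):
            current = line[len("volume "):]
            graph[current] = {}
        elif line == "end-volume":
            current = None
        elif current is not None and line.startswith("option "):
            parts = line.split(None, 2)  # "option", name, optional value
            if len(parts) >= 2:
                graph[current][parts[1]] = parts[2] if len(parts) > 2 else ""
    return graph


def diff_graphs(old: Graph, new: Graph) -> List[Tuple[str, str, Optional[str], Optional[str]]]:
    """Return (volume, option, old value, new value) for every differing option."""
    changes = []
    for volume in sorted(set(old) | set(new)):
        old_opts = old.get(volume, {})
        new_opts = new.get(volume, {})
        for name in sorted(set(old_opts) | set(new_opts)):
            if old_opts.get(name) != new_opts.get(name):
                changes.append((volume, name, old_opts.get(name), new_opts.get(name)))
    return changes


if __name__ == "__main__":
    first = """
 82: volume $volname-upcall
 83:     type features/upcall
 84:     option cache-invalidation on
 85:     option cache-invalidation-timeout 600
 86:     subvolumes $volname-leases
 87: end-volume
"""
    second = first.replace("cache-invalidation on", "cache-invalidation off")
    for change in diff_graphs(parse_graph(first), parse_graph(second)):
        print(change)
```

By inspection, the only line that differs between the two dumps here appears
to be the upcall translator's cache-invalidation option (on in the first run,
off in the second); lk-heal is on in both.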

pending frames:
frame : type(0) op(26)
frame : type(0) op(27)
frame : type(0) op(27)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2017-09-16 09:16:52
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.1
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xaa)[0x7f3d59fe19ea]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x2e7)[0x7f3d59feb6c7]
/lib/x86_64-linux-gnu/libc.so.6(+0x354b0)[0x7f3d593d44b0]
/usr/lib/x86_64-linux-gnu/glusterfs/3.12.1/xlator/features/locks.so(+0x18bf1)[0x7f3d4ea69bf1]
/usr/lib/x86_64-linux-gnu/glusterfs/3.12.1/xlator/features/worm.so(+0x2483)[0x7f3d4e845483]
/usr/lib/x86_64-linux-gnu/glusterfs/3.12.1/xlator/features/read-only.so(+0x1f03)[0x7f3d4e63cf03]
/usr/lib/x86_64-linux-gnu/glusterfs/3.12.1/xlator/features/leases.so(+0x6377)[0x7f3d4e42c377]
/usr/lib/x86_64-linux-gnu/glusterfs/3.12.1/xlator/features/upcall.so(+0xba83)[0x7f3d4e215a83]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_lk_resume+0x1c2)[0x7f3d5a06e262]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(call_resume+0x75)[0x7f3d5a004325]
/usr/lib/x86_64-linux-gnu/glusterfs/3.12.1/xlator/performance/io-threads.so(+0x4974)[0x7f3d4e003974]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f3d597706ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f3d594a63dd]
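
The "pending frames" lines in both crash dumps give file operations only as
numbers. Below is a minimal sketch that turns those lines into names. The
number-to-name table is an assumption based on my reading of the
glusterfs_fop_t enum (glusterfs-fops.x) and covers only the three values seen
in this report; please verify it against the 3.12 source before relying on it.
The function name decode_pending_frame is hypothetical.

```python
import re
from typing import Optional, Tuple

# ASSUMPTION: values as I read them from the glusterfs_fop_t enum; only the
# op numbers that occur in this report are listed. Verify against the source.
FOP_NAMES = {11: "OPEN", 26: "LK", 27: "LOOKUP"}

FRAME_LINE_RE = re.compile(r"^frame : type\((\d+)\) op\((\d+)\)$")


def decode_pending_frame(line: str) -> Optional[Tuple[int, int, str]]:
    """Return (type, op, op name) for a pending-frame line, else None."""
    m = FRAME_LINE_RE.match(line.strip())
    if m is None:
        return None
    op = int(m.group(2))
    # Unlisted op numbers are reported as "unknown" rather than guessed.
    return int(m.group(1)), op, FOP_NAMES.get(op, "unknown")


if __name__ == "__main__":
    for line in ("frame : type(0) op(26)", "frame : type(0) op(27)"):
        print(decode_pending_frame(line))
```

If the mapping holds, a lock (LK) request was pending in both crashes, which
would agree with default_lk_resume appearing in both backtraces.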


