[Gluster-users] EBADF after add-brick/self-heal operation in Gluster 3.7.15

Rama Shenai rama at unity3d.com
Fri Oct 21 18:33:50 UTC 2016


Hi Gluster Team,

We saw a bunch of intermittent EBADF errors on our clients immediately
after an add-brick operation followed by a self-heal of that volume. From
the warnings below it looks like open file descriptors failed to migrate to
the new client graph generated by the add-brick, and we are wondering
whether this could be closing file descriptors prematurely, causing
problems on files we had open/memory-mapped.
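
For context, the affected files are held open and memory-mapped for long
periods. Roughly, the access pattern looks like this (a minimal Python
sketch; the path is illustrative):

import mmap
import os

# Long-lived descriptor and mapping on the Gluster mount
fd = os.open("/mnt/volume1/data.bin", os.O_RDONLY)
buf = mmap.mmap(fd, 0, prot=mmap.PROT_READ)

# fd and buf stay open across the add-brick; after the graph switch,
# fops on fd started failing with EBADF
header = buf[:4096]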

Below are the errors that we saw. Any thoughts on this, as well as any
input on avoiding it when we do live add-brick operations in the future,
would be much appreciated.

[2016-10-19 15:11:53.372930] W [fuse-resolve.c:556:fuse_resolve_fd]
0-fuse-resolve: migration of basefd (ptr:0x7f6d5c003e7c
inode-gfid:b1e19a5b-7867-4cb3-8bf0-545df3c5d556) did not complete, failing
fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373058] W [fuse-resolve.c:556:fuse_resolve_fd]
0-fuse-resolve: migration of basefd (ptr:0x7f6d5c003e7c
inode-gfid:b1e19a5b-7867-4cb3-8bf0-545df3c5d556) did not complete, failing
fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373105] W [fuse-resolve.c:556:fuse_resolve_fd]
0-fuse-resolve: migration of basefd (ptr:0x7f6d5c0065b8
inode-gfid:9231d39d-d88c-4a62-b25d-fad232ec9b98) did not complete, failing
fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373121] W [fuse-resolve.c:556:fuse_resolve_fd]
0-fuse-resolve: migration of basefd (ptr:0x7f6d5c0065b8
inode-gfid:9231d39d-d88c-4a62-b25d-fad232ec9b98) did not complete, failing
fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373138] W [fuse-resolve.c:556:fuse_resolve_fd]
0-fuse-resolve: migration of basefd (ptr:0x7f6d5c005ac0
inode-gfid:a0b02209-59c9-418b-bff0-fb31be01b3e8) did not complete, failing
fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373155] W [fuse-resolve.c:556:fuse_resolve_fd]
0-fuse-resolve: migration of basefd (ptr:0x7f6d5c005ac0
inode-gfid:a0b02209-59c9-418b-bff0-fb31be01b3e8) did not complete, failing
fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373172] W [fuse-resolve.c:556:fuse_resolve_fd]
0-fuse-resolve: migration of basefd (ptr:0x7f6d5c004e18
inode-gfid:c490e1fe-3ac8-4c11-9c62-fc8672a27737) did not complete, failing
fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373199] W [fuse-resolve.c:556:fuse_resolve_fd]
0-fuse-resolve: migration of basefd (ptr:0x7f6d5c004e18
inode-gfid:c490e1fe-3ac8-4c11-9c62-fc8672a27737) did not complete, failing
fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373231] W [fuse-resolve.c:556:fuse_resolve_fd]
0-fuse-resolve: migration of basefd (ptr:0x7f6d5c0037bc
inode-gfid:1aab19de-f4b1-47bf-8216-d174797ae64d) did not complete, failing
fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373245] W [fuse-resolve.c:556:fuse_resolve_fd]
0-fuse-resolve: migration of basefd (ptr:0x7f6d5c0037bc
inode-gfid:1aab19de-f4b1-47bf-8216-d174797ae64d) did not complete, failing
fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373271] W [fuse-resolve.c:556:fuse_resolve_fd]
0-fuse-resolve: migration of basefd (ptr:0x7f6d5c004b90
inode-gfid:5c3d5a39-26f0-4211-a2f0-59de33ea5ade) did not complete, failing
fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373287] W [fuse-resolve.c:556:fuse_resolve_fd]
0-fuse-resolve: migration of basefd (ptr:0x7f6d5c004b90
inode-gfid:5c3d5a39-26f0-4211-a2f0-59de33ea5ade) did not complete, failing
fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
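
From the application side these surface as EBADF on descriptors that were
opened before the graph switch. The workaround we are considering is to
reopen and retry on EBADF, along these lines (a minimal sketch, assuming
the file can safely be reopened and re-read; names are illustrative):

import errno
import os

def pread_with_reopen(path, fd, length, offset):
    # Retry a read once if the descriptor went stale after a failed
    # fd migration; returns the (possibly new) fd and the data.
    try:
        return fd, os.pread(fd, length, offset)
    except OSError as e:
        if e.errno != errno.EBADF:
            raise
        try:
            os.close(fd)  # best effort; the old fd is unusable anyway
        except OSError:
            pass
        fd = os.open(path, os.O_RDONLY)  # reopen on the new graph
        return fd, os.pread(fd, length, offset)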

Volume information
$ sudo gluster volume info
Volume Name: volume1
Type: Replicate
Volume ID: 3bcca83e-2be5-410c-9a23-b159f570ee7e
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: ip-172-25-2-91.us-west-1.compute.internal:/data/glusterfs/volume1/brick1/brick
Brick2: ip-172-25-2-206.us-west-1.compute.internal:/data/glusterfs/volume1/brick1/brick
Brick3: ip-172-25-33-75.us-west-1.compute.internal:/data/glusterfs/volume1/brick1/brick  <-- brick added
Options Reconfigured:
cluster.quorum-type: fixed
cluster.quorum-count: 2
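
For reference, the add-brick and subsequent heal were of the following form
(commands reconstructed from the volume info above; the exact options we
used may have differed):

$ sudo gluster volume add-brick volume1 replica 3 ip-172-25-33-75.us-west-1.compute.internal:/data/glusterfs/volume1/brick1/brick
$ sudo gluster volume heal volume1 full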


Client translator configuration (from the client log)
volume volume1-client-0
    type protocol/client
    option ping-timeout 42
    option remote-host ip-172-25-2-91.us-west-1.compute.internal
    option remote-subvolume /data/glusterfs/volume1/brick1/brick
    option transport-type socket
    option send-gids true
end-volume

volume volume1-client-1
    type protocol/client
    option ping-timeout 42
    option remote-host ip-172-25-2-206.us-west-1.compute.internal
    option remote-subvolume /data/glusterfs/volume1/brick1/brick
    option transport-type socket
    option send-gids true
end-volume

volume volume1-client-2
    type protocol/client
    option ping-timeout 42
    option remote-host ip-172-25-33-75.us-west-1.compute.internal
    option remote-subvolume /data/glusterfs/volume1/brick1/brick
    option transport-type socket
    option send-gids true
end-volume

volume volume1-replicate-0
    type cluster/replicate
    option quorum-type fixed
    option quorum-count 2
    subvolumes volume1-client-0 volume1-client-1 volume1-client-2
end-volume

volume volume1-dht
    type cluster/distribute
    subvolumes volume1-replicate-0
end-volume

volume volume1-write-behind
    type performance/write-behind
    subvolumes volume1-dht
end-volume

volume volume1-read-ahead
    type performance/read-ahead
    subvolumes volume1-write-behind
end-volume

volume volume1-io-cache
    type performance/io-cache
    subvolumes volume1-read-ahead
end-volume

volume volume1-quick-read
    type performance/quick-read
    subvolumes volume1-io-cache
end-volume

volume volume1-open-behind
    type performance/open-behind
    subvolumes volume1-quick-read
end-volume

volume volume1-md-cache
    type performance/md-cache
    option cache-posix-acl true
    subvolumes volume1-open-behind
end-volume

volume volume1
    type debug/io-stats
    option log-level INFO
    option latency-measurement off
    option count-fop-hits off
    subvolumes volume1-md-cache
end-volume

volume posix-acl-autoload
    type system/posix-acl
    subvolumes volume1
end-volume

volume meta-autoload
    type meta
    subvolumes posix-acl-autoload
end-volume

Thanks
Rama