[Bugs] [Bug 1343906] [Stress/Scale]: I/O errors out from gNFS mount points during high load on an erasure-coded volume; logs flooded with error messages.

bugzilla at redhat.com bugzilla at redhat.com
Wed Jun 8 10:56:19 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1343906

Pranith Kumar K <pkarampu at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(pkarampu at redhat.c |
                   |om)                         |



--- Comment #3 from Pranith Kumar K <pkarampu at redhat.com> ---
Description of problem:
------------------------

Had a 1*(4+2) volume. Added bricks and scaled it up to 3*(4+2), running
rebalance each time.
In the meantime, I/O errored out on 2 of my clients:

dd: error writing ‘stress3’: Input/output error
8399+0 records in
8398+0 records out

Untarring the tarball failed as well.

Details about sos, the exact workload, and the error messages from the logs
are in the comments.

How reproducible:
-----------------

Reporting the first occurrence.

Steps to Reproduce:
-------------------

1. Create an EC volume. Mount it on multiple clients via gNFS. Add bricks
and rebalance.

2. Run all kinds of I/O from various mount points.

3. Check for errors in the logs and on the application side.

Actual results:
---------------

I/O errors out.
Logs flooded with error messages.

Expected results:
-----------------

I/Os on the application side should not be affected.

Additional info:
----------------

[root at gqas013 glusterfs]# gluster v info

Volume Name: drogon
Type: Distributed-Disperse
Volume ID: 6d49ee45-1048-4325-96fb-c14ac5e278e8
Status: Started
Number of Bricks: 3 x (4 + 2) = 18
Transport-type: tcp
Bricks:
Brick1: gqas013.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickA
Brick2: gqas011.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickB
Brick3: gqas005.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickC
Brick4: gqas006.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickD
Brick5: gqas013.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickE
Brick6: gqas011.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickF
Brick7: gqas013.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickG
Brick8: gqas011.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickH
Brick9: gqas005.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickI
Brick10: gqas006.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickJ
Brick11: gqas013.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickK
Brick12: gqas011.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickL
Brick13: gqas013.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickM
Brick14: gqas011.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickN
Brick15: gqas005.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickO
Brick16: gqas006.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickP
Brick17: gqas013.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickQ
Brick18: gqas011.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brickR
Options Reconfigured:
performance.readdir-ahead: on
[root at gqas013 glusterfs]# 
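
The "3 x (4 + 2) = 18" layout reported above can be checked with a little
arithmetic. The following is only a sketch: the function name and the
returned keys are made up for illustration, and it assumes the usual reading
of a disperse set, where each (4 + 2) subvolume has 4 data bricks and 2
redundancy bricks.

```python
def disperse_layout(subvolumes, data, redundancy):
    """Summarise a distributed-disperse layout (hypothetical helper)."""
    bricks_per_subvol = data + redundancy
    return {
        # Total bricks across all disperse subvolumes.
        "bricks": subvolumes * bricks_per_subvol,
        # Share of raw capacity that holds user data.
        "usable_fraction": data / bricks_per_subvol,
        # Bricks that may be lost per subvolume without losing data.
        "tolerated_failures_per_subvol": redundancy,
    }


# Initial 1*(4+2) volume and the final 3*(4+2) volume from this report.
for subvols in (1, 3):
    print(subvols, disperse_layout(subvols, data=4, redundancy=2))
```

For the final layout this gives 18 bricks, matching the "Number of Bricks"
line in the volume info.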

A) From the rebalance log:

I see a lot of "Assertion failed" messages here:

[2016-06-06 15:10:53.999653] E [ec-inode-write.c:395:ec_manager_setattr]
(-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(ec_resume+0x91)
[0x7fd263b5e621]
-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(__ec_manager+0x57)
[0x7fd263b5e807]
-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(ec_manager_setattr+0x2c6)
[0x7fd263b7be76] ) 0-: Assertion failed: ec_get_inode_size(fop,
fop->locks[0].lock->loc.inode, &cbk->iatt[0].ia_size)
[2016-06-06 15:10:54.003509] E [ec-inode-write.c:395:ec_manager_setattr]
(-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(ec_resume+0x91)
[0x7fd263b5e621]
-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(__ec_manager+0x57)
[0x7fd263b5e807]
-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(ec_manager_setattr+0x2c6)
[0x7fd263b7be76] ) 0-: Assertion failed: ec_get_inode_size(fop,
fop->locks[0].lock->loc.inode, &cbk->iatt[0].ia_size)
[2016-06-06 15:10:54.012540] E [ec-inode-write.c:395:ec_manager_setattr]
(-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(ec_resume+0x91)
[0x7fd263b5e621]
-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(__ec_manager+0x57)
[0x7fd263b5e807]
-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(ec_manager_setattr+0x2c6)
[0x7fd263b7be76] ) 0-: Assertion failed: ec_get_inode_size(fop,
fop->locks[0].lock->loc.inode, &cbk->iatt[0].ia_size)
[2016-06-06 15:13:15.333800] E [MSGID: 109023]
[dht-rebalance.c:2330:gf_defrag_get_entry] 0-drogon-dht: Migrate file
failed:/d4/linux-4.6.1/Documentation/arm/Porting lookup failed
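
To put a number on "a lot", the entries in an excerpt like the one above can
be tallied by the function that logged them. This is a rough sketch, not
part of any Gluster tooling: the helper name and both regular expressions
are assumptions based only on the line format visible in this excerpt
(timestamp, level, then "[file.c:line:function]"), and entries are rejoined
first because the mail wraps them across lines.

```python
import re
from collections import Counter

# An entry starts with a timestamp such as "[2016-06-06 15:10:53.999653] ".
ENTRY_START = re.compile(r"\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\] ")
# Source location such as "[ec-inode-write.c:395:ec_manager_setattr]".
SOURCE = re.compile(r"\[([\w.-]+\.c):(\d+):(\w+)\]")


def tally_by_source(log_text):
    """Count log entries per logging function (hypothetical helper)."""
    flat = " ".join(log_text.split())       # undo the mail's line wrapping
    entries = ENTRY_START.split(flat)[1:]   # drop text before the first entry
    counts = Counter()
    for entry in entries:
        match = SOURCE.search(entry)
        counts[match.group(3) if match else "<unknown>"] += 1
    return counts


# Three entries copied from the rebalance log excerpt in this report.
SAMPLE = """
[2016-06-06 15:10:53.999653] E [ec-inode-write.c:395:ec_manager_setattr]
(-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(ec_resume+0x91)
[0x7fd263b5e621]
-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(__ec_manager+0x57)
[0x7fd263b5e807]
-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(ec_manager_setattr+0x2c6)
[0x7fd263b7be76] ) 0-: Assertion failed: ec_get_inode_size(fop,
fop->locks[0].lock->loc.inode, &cbk->iatt[0].ia_size)
[2016-06-06 15:10:54.003509] E [ec-inode-write.c:395:ec_manager_setattr]
(-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(ec_resume+0x91)
[0x7fd263b5e621]
-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(__ec_manager+0x57)
[0x7fd263b5e807]
-->/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so(ec_manager_setattr+0x2c6)
[0x7fd263b7be76] ) 0-: Assertion failed: ec_get_inode_size(fop,
fop->locks[0].lock->loc.inode, &cbk->iatt[0].ia_size)
[2016-06-06 15:13:15.333800] E [MSGID: 109023]
[dht-rebalance.c:2330:gf_defrag_get_entry] 0-drogon-dht: Migrate file
failed:/d4/linux-4.6.1/Documentation/arm/Porting lookup failed
"""

for function, count in tally_by_source(SAMPLE).most_common():
    print(function, count)
```

Run against the full rebalance log rather than this sample, the same tally
would show how heavily ec_manager_setattr dominates the errors.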


B) The brick logs are flooded with "File descriptor in bad state" errors:

bricks/bricks-testvol_brickA.log:[2016-06-06 15:09:58.275139] E [MSGID: 115073]
[server-rpc-fops.c:1907:server_fxattrop_cbk] 0-drogon-server: 16513: FXATTROP
-2 (c812a0d6-a85d-41f6-a7e6-51630285ec60) ==> (File descriptor in bad state)
[File descriptor in bad state]
bricks/bricks-testvol_brickA.log:[2016-06-06 15:09:58.275220] E [MSGID: 113040]
[posix-helpers.c:1686:__posix_fd_ctx_get] 0-drogon-posix: Failed to get
anonymous real_path:
/bricks/testvol_brickA/.glusterfs/14/c7/14c711cb-11c7-4c8e-8893-de71849e9eb9
_fd = -1 [No such file or directory]
bricks/bricks-testvol_brickA.log:[2016-06-06 15:09:58.275290] E [MSGID: 115073]
[server-rpc-fops.c:1907:server_fxattrop_cbk] 0-drogon-server: 16514: FXATTROP
-2 (14c711cb-11c7-4c8e-8893-de71849e9eb9) ==> (File descriptor in bad state)
[File descriptor in bad state]
bricks/bricks-testvol_brickA.log:[2016-06-06 15:09:58.275393] E [MSGID: 113040]
[posix-helpers.c:1686:__posix_fd_ctx_get] 0-drogon-posix: Failed to get
anonymous real_path:
/bricks/testvol_brickA/.glusterfs/80/3b/803b05cb-4e6f-457c-88b1-2a608530757c
_fd = -1 [No such file or directory]
bricks/bricks-testvol_brickA.log:[2016-06-06 15:09:58.275468] E [MSGID: 115073]
[server-rpc-fops.c:1907:server_fxattrop_cbk] 0-drogon-server: 16515: FXATTROP
-2 (803b05cb-4e6f-457c-88b1-2a608530757c) ==> (File descriptor in bad state)
[File descriptor in bad state]
bricks/bricks-testvol_brickA.log:[2016-06-06 15:09:58.275652] E [MSGID: 113040]
[posix-helpers.c:1686:__posix_fd_ctx_get] 0-drogon-posix: Failed to get
anonymous real_path:
/bricks/testvol_brickA/.glusterfs/8b/fe/8bfe4154-b137-4a84-a265-b541574adca9
_fd = -1 [No such file or directory]
bricks/bricks-testvol_brickA.log:[2016-06-06 15:09:58.275714] E [MSGID: 115073]
[server-rpc-fops.c:1907:server_fxattrop_cbk] 0-drogon-server: 16516: FXATTROP
-2 (8bfe4154-b137-4a84-a265-b541574adca9) ==> (File descriptor in bad state)
[File descriptor in bad state]
bricks/bricks-testvol_brickA.log:[2016-06-06 15:09:58.275786] E [MSGID: 113040]
[posix-helpers.c:1686:__posix_fd_ctx_get] 0-drogon-posix: Failed to get
anonymous real_path:
/bricks/testvol_brickA/.glusterfs/49/c3/49c3cbec-3b03-4ec2-968a-89e8bda401c6
_fd = -1 [No such file or directory]
bricks/bricks-testvol_brickA.log:[2016-06-06 15:09:58.275851] E [MSGID: 115073]
[server-rpc-fops.c:1907:server_fxattrop_cbk] 0-drogon-server: 16518: FXATTROP
-2 (49c3cbec-3b03-4ec2-968a-89e8bda401c6) ==> (File descriptor in bad state)
[File descriptor in bad state]
bricks/bricks-testvol_brickA.log:[2016-06-06 15:09:58.275979] E [MSGID: 113040]
[posix-helpers.c:1686:__posix_fd_ctx_get] 0-drogon-posix: Failed to get
anonymous real_path:
/bricks/testvol_brickA/.glusterfs/f8/6f/f86fe79c-7ad0-4d9c-8e5d-26ec63f14696
_fd = -1 [No such file or directory]
bricks/bricks-testvol_brickA.log:[2016-06-06 15:09:58.276039] E [MSGID: 115073]
[server-rpc-fops.c:1907:server_fxattrop_cbk] 0-drogon-server: 16519: FXATTROP
-2 (f86fe79c-7ad0-4d9c-8e5d-26ec63f14696) ==> (File descriptor in bad state)
[File descriptor in bad state]
bricks/bricks-testvol_brickA.log:[2016-06-06 15:09:58.276223] E [MSGID: 113040]
[posix-helpers.c:1686:__posix_fd_ctx_get] 0-drogon-posix: Failed to get
anonymous real_path:
/bricks/testvol_brickA/.glusterfs/c4/8a/c48aff49-de3a-46e8-9445-8064c4ca19c5
_fd = -1 [No such file or directory]
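
Each failing FXATTROP above names a GFID, and the matching posix error shows
where the brick expected to find it. The sketch below rebuilds that path so
it can be checked by hand on the brick. The helper name is hypothetical, and
the layout (first two hex characters, next two, then the full GFID under
.glusterfs) is inferred only from the real_path values in this excerpt.

```python
import posixpath
import uuid


def gfid_backend_path(brick_root, gfid):
    """Build the .glusterfs path for a GFID (hypothetical helper)."""
    # uuid.UUID validates the string; str() gives the lowercase dashed form.
    canonical = str(uuid.UUID(gfid))
    return posixpath.join(
        brick_root, ".glusterfs", canonical[:2], canonical[2:4], canonical
    )


# GFID taken from the FXATTROP entry with request id 16514 above.
print(gfid_backend_path("/bricks/testvol_brickA",
                        "14c711cb-11c7-4c8e-8893-de71849e9eb9"))
```

The printed path is the one the posix translator reported as missing for
that GFID, which makes it easy to confirm on the brick whether the entry is
really absent.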
