[Bugs] [Bug 1627620] SAS job aborts complaining about file doesn't exist

bugzilla at redhat.com bugzilla at redhat.com
Tue Sep 11 04:54:13 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1627620

Raghavendra G <rgowdapp at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|bugs at gluster.org            |rgowdapp at redhat.com



--- Comment #2 from Raghavendra G <rgowdapp at redhat.com> ---
From the logs, I can see that
/sasdata/bulked/base_forecast/partition_416_594_1_1.sas7bdat is a destination
of rename:

sasdata.log:[2018-09-09 02:40:10.772241] I [MSGID: 109066]
[dht-rename.c:2042:dht_rename] 0-sasdata-dht: renaming
/bulked/base_forecast/partition_416_594_1_1.sas7bdat.lck
(84ea03de-3032-44ff-81fc-8cd47cf14694)
(hash=sasdata-replicate-0/cache=sasdata-replicate-0) =>
/bulked/base_forecast/partition_416_594_1_1.sas7bdat ((null))
(hash=sasdata-replicate-1/cache=<nul>) 
sasdata.log:[2018-09-09 02:40:10.861406] I [MSGID: 109066]
[dht-rename.c:2042:dht_rename] 0-sasdata-dht: renaming
/bulked/base_forecast/partition_416_594_1_1.sas7bdat.lck
(cbb8c3dd-9317-4398-b92f-39146ab26bed)
(hash=sasdata-replicate-0/cache=sasdata-replicate-0) =>
/bulked/base_forecast/partition_416_594_1_1.sas7bdat
(84ea03de-3032-44ff-81fc-8cd47cf14694)
(hash=sasdata-replicate-1/cache=sasdata-replicate-0) 
sasdata.log:[2018-09-10 13:43:16.713796] I [MSGID: 109066]
[dht-rename.c:2042:dht_rename] 0-sasdata-dht: renaming
/bulked/base_forecast/partition_416_594_1_1.sas7bdat.lck
(b65e9f74-3151-40ae-b87b-2450a95796dd)
(hash=sasdata-replicate-0/cache=sasdata-replicate-0) =>
/bulked/base_forecast/partition_416_594_1_1.sas7bdat ((null))
(hash=sasdata-replicate-1/cache=<nul>) 
sasdata.log:[2018-09-10 13:43:16.961512] I [MSGID: 109066]
[dht-rename.c:2042:dht_rename] 0-sasdata-dht: renaming
/bulked/base_forecast/partition_416_594_1_1.sas7bdat.lck
(c36b2bc3-eebd-413f-91c4-894caa18e89c)
(hash=sasdata-replicate-0/cache=sasdata-replicate-0) =>
/bulked/base_forecast/partition_416_594_1_1.sas7bdat
(b65e9f74-3151-40ae-b87b-2450a95796dd)
(hash=sasdata-replicate-1/cache=sasdata-replicate-0) 

So, what's causing SAS to error out is the lack of rename atomicity on glusterfs.
Note that, per the rename atomicity rules, operations on the destination should
never fail. Here they are failing, so it's a bug in glusterfs.
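For reference, on a local POSIX filesystem an overwriting rename is atomic with
respect to the destination: a handle opened on the destination before the rename
keeps working against the old inode, and a fresh lookup sees the new one. A
minimal sketch (hypothetical file names, run against a local filesystem):

```python
import os
import tempfile

d = tempfile.mkdtemp()
src = os.path.join(d, "partition.sas7bdat.lck")
dst = os.path.join(d, "partition.sas7bdat")
with open(src, "w") as f:
    f.write("new contents")
with open(dst, "w") as f:
    f.write("old contents")

fd = os.open(dst, os.O_RDONLY)  # handle on the pre-rename destination inode
os.rename(src, dst)             # atomically replaces the destination

# A fresh path lookup resolves to the new inode...
with open(dst) as f:
    assert f.read() == "new contents"
# ...while the pre-rename handle still reads the old inode: no failure.
assert os.read(fd, 64) == b"old contents"
os.close(fd)
```

This is the behaviour SAS depends on and that glusterfs fails to provide here.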

Normally, path-based operations happen in two parts:
* The path is resolved into a unique handle (this is done through lookups).
* Actual operations, like open, happen on that handle.

Now, when a rename overwrites the destination, the inode associated with the
destination changes, so the handle/inode returned by a lookup on the destination
before the rename is no longer present in the filesystem. When an open then
happens on the old handle, it fails.
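The effect described above can be seen with plain stat on any POSIX filesystem:
a handle (inode) cached from a lookup before the rename no longer matches what
the path resolves to afterwards. A minimal sketch with hypothetical names:

```python
import os
import tempfile

d = tempfile.mkdtemp()
dst = os.path.join(d, "data.sas7bdat")
src = dst + ".lck"
for p in (src, dst):
    with open(p, "w") as f:
        f.write(p)

old_ino = os.stat(dst).st_ino  # step 1: lookup resolves the path to a handle
os.rename(src, dst)            # the destination inode is replaced
new_ino = os.stat(dst).st_ino  # a fresh lookup returns a different handle

# Any operation issued against the cached old handle now targets an inode
# that the path no longer names.
assert old_ino != new_ino
```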

[root at rhs-client22 glusterfs]# grep OPEN sasdata.log*
sasdata.log:[2018-09-09 01:34:14.671548] E
[fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 51854: OPEN
a9d26828-24fe-4da9-b3a0-d1693bf21f34 resolution failed
sasdata.log:[2018-09-09 01:34:14.680534] E
[fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 51873: OPEN
e129abc7-7762-46e5-a704-99d7ea586c1e resolution failed
sasdata.log:[2018-09-09 01:57:37.204277] E
[fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 563046: OPEN
0eb6df58-5b25-4f4e-aabc-df1a0f615b93 resolution failed
sasdata.log:[2018-09-09 02:40:10.820746] E
[fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 967246: OPEN
d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:40:10.937086] E
[fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 967646: OPEN
d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:40:16.551310] E
[fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 972175: OPEN
d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:40:16.604315] E
[fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 972201: OPEN
d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:40:16.687723] E
[fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 972240: OPEN
d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:41:27.403833] E
[fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 982942: OPEN
1c84e06c-8551-4f18-9777-2a1730919bcf resolution failed
sasdata.log:[2018-09-09 02:41:27.405453] E
[fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 982946: OPEN
1c84e06c-8551-4f18-9777-2a1730919bcf resolution failed
sasdata.log:[2018-09-09 02:41:27.407134] E
[fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 982952: OPEN
1c84e06c-8551-4f18-9777-2a1730919bcf resolution failed
sasdata.log:[2018-09-10 13:43:17.007100] E
[fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1011630: OPEN
b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
sasdata.log:[2018-09-10 13:47:50.980324] E
[fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1081799: OPEN
b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
sasdata.log:[2018-09-10 13:47:50.982690] E
[fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1081806: OPEN
b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
sasdata.log:[2018-09-10 13:49:45.617909] E
[fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1117549: OPEN
b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
sasdata.log:[2018-09-10 13:49:45.748022] E
[fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1117585: OPEN
b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed

The VFS provides a way out of this via a mechanism that retries (exactly once)
the lookup and the operation, such as open, on receiving an error with errno
ESTALE. What likely caused the issue here is that the error returned by
glusterfs is ENOENT instead of ESTALE (note the error reported by SAS: it says
"P_BF.PARTITION_416_594_1_1.DATA does not exist", not "stale file handle").
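The kernel's retry rule can be mimicked in userspace; the point is that only
ESTALE triggers the second attempt, while ENOENT (what glusterfs returned here)
is passed straight back to the application. A hedged sketch
(open_with_estale_retry is a hypothetical helper, not a real VFS API):

```python
import errno
import os

def open_with_estale_retry(path, flags=os.O_RDONLY):
    """Mimic the VFS rule: on ESTALE, redo the lookup and the open
    exactly once; any other error (including ENOENT) is final."""
    try:
        return os.open(path, flags)
    except OSError as e:
        if e.errno == errno.ESTALE:
            return os.open(path, flags)  # one retry after re-resolution
        raise  # ENOENT reaches the application, which is what SAS saw
```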

We have evidence for that from the fuse-dump of /sasdata:
2018-09-10T19:19:45.73953758+05:30 "GLUSTER\xf5" OPEN {Len:48 Opcode:14
Unique:1117585 Nodeid:140296383928016 Uid:0 Gid:0 Pid:29342 Padding:0}
{Flags:32770 Unused:0}
2018-09-10T19:19:45.748056325+05:30 "GLUSTER\xf5" {Len:16 Error:-2
Unique:1117585} ""
2018-09-10T19:19:45.617027131+05:30 "GLUSTER\xf5" OPEN {Len:48 Opcode:14
Unique:1117549 Nodeid:140296383928016 Uid:0 Gid:0 Pid:29330 Padding:0}
{Flags:32768 Unused:0} 
2018-09-10T19:19:45.617932073+05:30 "GLUSTER\xf5" {Len:16 Error:-2
Unique:1117549} "" 

Note that the error returned in the above two calls is -2 (ENOENT), not ESTALE.
This prevents the kernel from retrying.

Note that earlier, as part of readdirplus, partition_416_594_1_1.sas7bdat had
resolved to Nodeid:140296383928016:
[{Nodeid:140296383928016 Generation:0 EntryValid:1 AttrValid:1 EntryValidNsec:0
AttrValidNsec:0 Attr:{Ino:13293258653971224285 Size:196608 Blocks:264
Atime:1536586996 Mtime:1536586997 Ctime:1536586997 Atimensec:836653129
Mtimensec:125644779 Ctimensec:126644750 Mode:33188 Nlink:1 Uid:0 Gid:0 Rdev:0
Blksize:131072 Padding:0}} {Ino:13293258653971224285 Off:1441 Namelen:30 Type:8
Name:[]} partition_416_594_1_1.sas7bdat]

Earlier, I had submitted a patch to convert ENOENT errors to ESTALE,
https://review.gluster.org/#/c/glusterfs/+/18521/

But that patch missed the codepath where open fails because inode resolution
fails. As the "resolution failed" OPEN errors quoted above show, inode
resolution did in fact fail.

So, patching up those codepaths to convert ENOENT to ESTALE will fix the issue.
It is also recommended to make sure that all inode-based operations (like
GETATTR, SETATTR, etc.) never fail with ENOENT, but instead fail with ESTALE.
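The fix amounts to an error translation at the fuse-bridge boundary. A hedged
sketch of the rule (map_resolution_error is a hypothetical illustration, not
actual glusterfs code):

```python
import errno

def map_resolution_error(err):
    """For inode-based operations (OPEN, GETATTR, SETATTR, ...),
    report a failed inode resolution as ESTALE so the kernel
    retries the lookup, never as ENOENT."""
    return errno.ESTALE if err == errno.ENOENT else err

# ENOENT is rewritten; all other errors pass through unchanged.
assert map_resolution_error(errno.ENOENT) == errno.ESTALE
assert map_resolution_error(errno.EIO) == errno.EIO
```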

-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.


More information about the Bugs mailing list