Re: [Gluster-devel] RE: LOOKUP conflict => OPEN failS_

Fredrik Widlund fredrik.widlund at qbrick.com
Mon Feb 8 21:59:16 UTC 2010



Hi,

The system is an iPhone storage/streaming platform.

I had problems with glusterfs and performance a couple of months ago, for example when re-exporting over NFS, but these limitations seem to be gone. Last week I benchmarked a live distribution server at 8 Gbps throughput on a 10GbE NIC, using glusterfs as the storage backend, which is impressive, even though io-cache took most of the load. The bottleneck ended up being the glusterfs-server backend, which eventually got LOOKUP() conflicts followed by OPEN() errors and "desynced" identifiers that were only reset when the glusterfs-server backend was restarted, i.e. similar to the pure-ftpd problems below.

[2010-01-27 14:13:50] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/download/91001/live/Layer4/prog_index.m3u8) inode (ptr=0x7f2820017460, ino=5305, gen=5431421406367711815) found conflict (ptr=0x7f282000dd50, ino=5305, gen=5431421406367711815)
[2010-01-27 14:14:19] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/download/91001/live/Layer4/prog_index.m3u8) inode (ptr=0x7f28180056c0, ino=5307, gen=5431421406367711864) found conflict (ptr=0x7f2810002910, ino=5307, gen=5431421406367711864)
[2010-01-27 14:15:19] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/download/91001/live/Layer4/prog_index.m3u8) inode (ptr=0x7f2818013290, ino=5301, gen=5431421406367711959) found conflict (ptr=0x7f28180043b0, ino=5301, gen=5431421406367711959)
[2010-01-27 14:16:09] W [fuse-bridge.c:858:fuse_fd_cbk] glusterfs-fuse: 4335255: OPEN() /download/91001/live/Layer4/prog_index.m3u8 => -1 (No such file or directory)
[2010-01-27 14:16:09] W [fuse-bridge.c:858:fuse_fd_cbk] glusterfs-fuse: 4335270: OPEN() /download/91001/live/Layer4/prog_index.m3u8 => -1 (No such file or directory)
[2010-01-27 14:16:09] W [fuse-bridge.c:858:fuse_fd_cbk] glusterfs-fuse: 4335272: OPEN() /download/91001/live/Layer4/prog_index.m3u8 => -1 (No such file or directory)
[...] Multiple OPEN() errors...
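
To see how often each path hits a conflict versus a failed open, warnings like the ones above can be tallied with a short script. This is a minimal sketch in Python; the regular expressions are assumptions fitted to the log lines quoted in this message, not a documented glusterfs log format, and `tally` is a hypothetical helper name.

```python
import re
from collections import Counter

# Patterns fitted to the fuse-bridge warnings quoted above (an assumption,
# not an official glusterfs log grammar).
LOOKUP_RE = re.compile(r"LOOKUP\((?P<path>[^)]+)\) inode \(.*?\) found conflict")
OPEN_RE = re.compile(r"OPEN\(\) (?P<path>\S+) => -1 \((?P<err>[^)]+)\)")


def tally(lines):
    """Count LOOKUP conflicts and failed OPENs per path."""
    conflicts, open_errors = Counter(), Counter()
    for line in lines:
        m = LOOKUP_RE.search(line)
        if m:
            conflicts[m.group("path")] += 1
            continue
        m = OPEN_RE.search(line)
        if m:
            open_errors[(m.group("path"), m.group("err"))] += 1
    return conflicts, open_errors


# Two lines copied from the log excerpt above.
sample = [
    "[2010-01-27 14:13:50] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: "
    "LOOKUP(/download/91001/live/Layer4/prog_index.m3u8) inode "
    "(ptr=0x7f2820017460, ino=5305, gen=5431421406367711815) found conflict "
    "(ptr=0x7f282000dd50, ino=5305, gen=5431421406367711815)",
    "[2010-01-27 14:16:09] W [fuse-bridge.c:858:fuse_fd_cbk] glusterfs-fuse: "
    "4335255: OPEN() /download/91001/live/Layer4/prog_index.m3u8 => -1 "
    "(No such file or directory)",
]
conflicts, open_errors = tally(sample)
print(conflicts)
print(open_errors)
```

Feeding the whole client log through `tally` would show whether the OPEN() failures are confined to paths that previously reported a conflict.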

The raid+storage backend handles semi-random I/O with 100 concurrent streaming reads at 500MB/s, 40 at around 750MB/s, and 1 at around 1200MB/s, without any obvious performance bottlenecks in glusterfs.

Now, if you could just please add a cachefiles client-side translator similar to the NFS one? ;)

Btw, the problems below seem to be undefined behaviour in glusterfs. Pure-ftpd manages to "desync" the glusterfs server so that it fails to read files in the posix storage until the glusterfs server is restarted.

Kind regards,
Fredrik Widlund

-----Original Message-----
From: Tejas N. Bhise [mailto:tejas at gluster.com]
Sent: 8 February 2010 20:06
To: Fredrik Widlund
Cc: gluster-devel at nongnu.org
Subject: Re: [Gluster-devel] RE: LOOKUP conflict => OPEN failS_

Hi Fredrik,

Good to know it works and thanks for letting us know what caused the problem with your setup. Feel free to ask more questions. Just a word of caution - a couple of GlusterFS users recently saw XFS errors - just something to keep at the back of your mind in case you are trying to debug any problems later.

Do let us know more about what you are using the system for and how you have configured it etc - it could be a good use case for others on the user list.

Regards,
Tejas.

----- Original Message -----
From: "Fredrik Widlund" <fredrik.widlund at qbrick.com>
To: gluster-devel at nongnu.org
Sent: Monday, February 8, 2010 10:59:23 PM GMT +05:30 Chennai, Kolkata, Mumbai, New Delhi
Subject: [Gluster-devel] RE: LOOKUP conflict => OPEN failS_







Hi,



Ok, it seems to be solved for now. The writer was a pure-ftpd server, and the “-O, atomic replace” flag caused the behavior. I browsed through the code briefly and it uses among other things hard-link schemes to do atomic changes.
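
For readers unfamiliar with atomic replacement, the general idea can be sketched as follows. This is not pure-ftpd's code: it swaps in the simpler rename-based variant (`os.replace`) rather than the hard-link scheme mentioned above, and `atomic_replace` is a hypothetical helper. It does show the property that matters here: after every replace the same path refers to a new inode.

```python
import os
import tempfile


def atomic_replace(target, data):
    """Write data to a temp file in target's directory, then rename it over target."""
    directory = os.path.dirname(os.path.abspath(target))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".upload-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # Atomic on POSIX: a reader sees either the old file or the new one.
        os.replace(tmp, target)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "prog_index.m3u8")
    atomic_replace(path, b"#EXTM3U\n# version 1\n")
    ino_before = os.stat(path).st_ino
    atomic_replace(path, b"#EXTM3U\n# version 2\n")
    ino_after = os.stat(path).st_ino
    with open(path, "rb") as f:
        content = f.read()
    print("inode changed:", ino_before != ino_after)
```

A writer that behaves like this gives the path a different inode on every update, which would be consistent with the changing ino values in the LOOKUP warnings, although the thread does not confirm that this is the exact trigger.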



Kind regards,

Fredrik Widlund





From: gluster-devel-bounces+fredrik.widlund=qbrick.com at nongnu.org [mailto:gluster-devel-bounces+fredrik.widlund=qbrick.com at nongnu.org] On Behalf Of Fredrik Widlund
Sent: 8 February 2010 16:57
To: gluster-devel at nongnu.org
Subject: [Gluster-devel] RE: LOOKUP conflict => OPEN fails_






It’s getting worse and worse. Upgraded to 3.0.2 but to no avail.



The prog_index.m3u8 files are being rewritten every 10 seconds. Every other read of a newly written index file returns -1 and the file is unavailable, possibly until the next update of the file.
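
The failure rate described above could be measured from the reading client with a small probe. This is a sketch under assumptions: `probe_open` is a hypothetical helper, and the demo runs against a local temporary directory rather than a glusterfs mount.

```python
import errno
import os
import tempfile
import time


def probe_open(path, attempts=5, delay=0.0):
    """Open path repeatedly; return (successes, enoent_failures)."""
    ok = missing = 0
    for _ in range(attempts):
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError as e:
            if e.errno != errno.ENOENT:
                raise  # only ENOENT is the symptom of interest here
            missing += 1
        else:
            os.close(fd)
            ok += 1
        time.sleep(delay)
    return ok, missing


with tempfile.TemporaryDirectory() as d:
    present = os.path.join(d, "prog_index.m3u8")
    with open(present, "w") as f:
        f.write("#EXTM3U\n")
    result_present = probe_open(present, attempts=3)
    result_missing = probe_open(os.path.join(d, "absent.m3u8"), attempts=3)
    print(result_present, result_missing)
```

On the affected mount, pointing `probe_open` at an index file with a delay shorter than the 10-second rewrite interval would give a rough count of how many opens fail between updates.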



The strange thing is that until a few days ago this problem wasn’t noticeable at all, and now it is huge. The only difference is the quickly growing number of files on the filesystem, now around 190k files.



Kind regards,

Fredrik Widlund




From: gluster-devel-bounces+fredrik.widlund=qbrick.com at nongnu.org [mailto:gluster-devel-bounces+fredrik.widlund=qbrick.com at nongnu.org] On Behalf Of Fredrik Widlund
Sent: 8 February 2010 15:02
To: gluster-devel at nongnu.org
Subject: [Gluster-devel] LOOKUP conflict => OPEN fails_






Hi,



I’m running a simple AFR setup, though currently with only one backend, and 2 TCP clients. Version is 3.0.0 from Jan 20.



Basically one client is writing a large number of files, continuously, and the other client is reading.



I have a growing problem with lookup “conflicts”: files are listed in directories, but reads return “-1 (No such file…”.



Restarting the client does not solve the conflict, but restarting the server does, and the files become available again.



The filesystem is a 5TB XFS hw raid-5 with around 150k files.



Debug trace of client:

[2010-02-08 13:39:29] N [trace.c:148:trace_open_cbk] replicated: 3073: (op_ret=0, op_errno=117, *fd=0x129a430)

[2010-02-08 13:39:37] N [trace.c:1837:trace_open] replicated: 3094: (loc {path=/download/90910/live/webb1/webb1/Layer3/prog_index.m3u8, ino=5042185}, flags=32768, fd=0x1296fc0, wbflags=0)

[2010-02-08 13:39:37] N [trace.c:148:trace_open_cbk] replicated: 3094: (op_ret=-1, op_errno=2, *fd=0x1296fc0)

[2010-02-08 13:39:37] W [fuse-bridge.c:858:fuse_fd_cbk] glusterfs-fuse: 3094: OPEN() /download/90910/live/webb1/webb1/Layer3/prog_index.m3u8 => -1 (No such file or directory)

[2010-02-08 13:39:38] N [trace.c:1837:trace_open] replicated: 3100: (loc {path=/download/90910/live/webb1/webb1/Layer4/prog_index.m3u8, ino=5013773}, flags=32768, fd=0x1296fc0, wbflags=0)

[2010-02-08 13:39:38] N [trace.c:148:trace_open_cbk] replicated: 3100: (op_ret=0, op_errno=117, *fd=0x1296fc0)

[2010-02-08 13:39:38] N [trace.c:1837:trace_open] replicated: 3106: (loc {path=/download/90910/live/webb1/webb1/Layer4/Period1/segment277.ts, ino=5050371}, flags=32768, fd=0x129a430, wbflags=0)

[…]
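
The numeric fields in the trace can be decoded with Python's standard `errno` module. This is a sketch: the regular expression is fitted to the trace lines quoted above, `decode_open_cbk` and `access_mode` are hypothetical helpers, and treating op_errno as meaningful only when op_ret is -1 is an assumption based on the usual errno convention, not on the glusterfs source.

```python
import errno
import os
import re

# Fitted to the trace_open_cbk lines quoted above (an assumption).
TRACE_CBK_RE = re.compile(
    r"trace_open_cbk\] \S+ (?P<frame>\d+): "
    r"\(op_ret=(?P<ret>-?\d+), op_errno=(?P<err>\d+)"
)


def decode_open_cbk(line):
    """Return frame, op_ret, op_errno and a symbolic error name, or None."""
    m = TRACE_CBK_RE.search(line)
    if m is None:
        return None
    ret, err = int(m.group("ret")), int(m.group("err"))
    return {
        "frame": int(m.group("frame")),
        "op_ret": ret,
        "op_errno": err,
        # Assumption: op_errno only carries meaning when the call failed.
        "error": errno.errorcode.get(err, "unknown") if ret == -1 else None,
    }


def access_mode(flags):
    """Name the access-mode bits (the low two bits) of an open(2) flags value."""
    mode = flags & (os.O_RDONLY | os.O_WRONLY | os.O_RDWR)
    names = {os.O_RDONLY: "O_RDONLY", os.O_WRONLY: "O_WRONLY", os.O_RDWR: "O_RDWR"}
    return names.get(mode, "invalid")


failed = decode_open_cbk(
    "[2010-02-08 13:39:37] N [trace.c:148:trace_open_cbk] replicated: 3094: "
    "(op_ret=-1, op_errno=2, *fd=0x1296fc0)"
)
succeeded = decode_open_cbk(
    "[2010-02-08 13:39:38] N [trace.c:148:trace_open_cbk] replicated: 3100: "
    "(op_ret=0, op_errno=117, *fd=0x1296fc0)"
)
print(failed)
print(succeeded)
print(access_mode(32768), oct(32768 & ~3))
```

Frame 3094 decodes to ENOENT, matching the "No such file or directory" warning from fuse-bridge. The flags value 32768 is a read-only open with one extra bit set (octal 0100000), which is probably O_LARGEFILE on x86 Linux; that reading is an assumption and is not stated in the trace.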



And server:

[2010-02-08 13:39:09] D [dict.c:303:dict_get] dict: @this=(nil) @key=0x7fedee4e43f3

[2010-02-08 13:39:09] D [dict.c:303:dict_get] dict: @this=(nil) @key=0x7fedee4e440b

[2010-02-08 13:39:17] D [server-protocol.c:2037:server_open_cbk] server: 1719: OPEN (null) (0) ==> -1 (No such file or directory)

[2010-02-08 13:39:18] D [server-protocol.c:2037:server_open_cbk] server: 1724: OPEN (null) (0) ==> -1 (No such file or directory)

[2010-02-08 13:39:28] D [server-resolve.c:238:resolve_path_deep] store0: RESOLVE OPEN() seeking deep resolution of /download/90910/live/webb1/webb1/Layer3/prog_index.m3u8

[2010-02-08 13:39:28] D [dict.c:303:dict_get] dict: @this=(nil) @key=0x7fedee4e43db

[2010-02-08 13:39:28] D [dict.c:303:dict_get] dict: @this=(nil) @key=0x7fedee4e43f3

[2010-02-08 13:39:28] D [dict.c:303:dict_get] dict: @this=(nil) @key=0x7fedee4e440b

[2010-02-08 13:39:28] D [dict.c:303:dict_get] dict: @this=(nil) @key=0x7fedee4e43db

[…]



Kind regards,

Fredrik Widlund




_______________________________________________
Gluster-devel mailing list
Gluster-devel at nongnu.org
http://lists.nongnu.org/mailman/listinfo/gluster-devel





