[Gluster-devel] very strange in disk reads

Thu Feb 28 18:26:00 UTC 2008

Hello,

I have AFR set up on 2 nodes. The nodes are on a wireless connection, so I
can clearly see a performance lag when reading data. On each node, a
glusterfs client is mounted, pointing to the same glusterfs directory.

Here is the strange part:

node 1:
read file1.txt
read file2.txt
node 2:
read file1.txt
read file2.txt

We are seeing inconsistent performance from one of the nodes. Suddenly,
instead of reading over the network through glusterfs, it seems to be reading
the file only from the local node's copy, as if a failover had occurred.

A word count on file1.txt takes 5 seconds on both node 1 and node 2.

Then, all of a sudden, a word count on file1.txt takes 0.2 seconds on node 1
but the same 5 seconds on node 2, almost as if the client on node 1 saw its
connection to node 2 terminate and only read locally.

Now here is the stranger part.
file2.txt takes 5 seconds to obtain a word count on node 1 and node 2
consistently. Even when file1.txt on node 1 takes 0.2 seconds to obtain a
word count, file2.txt on that same node still takes 5 seconds.
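
In case anyone wants to reproduce the timings, here is a minimal sketch of
the measurement, assuming Python is available on both nodes. The function
names are my own, and the file paths are passed on the command line (the
mount point in the comment is hypothetical). It reads each file in full,
counts whitespace-separated words, and repeats the read a few times so that
the jump from about 5 seconds to 0.2 seconds is visible per file.

```python
import sys
import time
from pathlib import Path


def timed_word_count(path):
    """Read the whole file, count whitespace-separated words, and time it."""
    start = time.perf_counter()
    data = Path(path).read_bytes()
    words = len(data.split())
    elapsed = time.perf_counter() - start
    return words, elapsed


def measure(paths, repeats=5):
    """Return {path: [elapsed seconds for each repeated read]}."""
    results = {}
    for path in paths:
        results[str(path)] = [timed_word_count(path)[1] for _ in range(repeats)]
    return results


if __name__ == "__main__":
    # Example (hypothetical mount point):
    #   python timing.py /mnt/glusterfs/file1.txt /mnt/glusterfs/file2.txt
    for path, timings in measure(sys.argv[1:]).items():
        print(path, " ".join(f"{t:.2f}s" for t in timings))
```

Running the same script on node 1 and node 2 and comparing the per-file
rows should show whether only file1.txt on node 1 speeds up.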

We are using round robin for writes (probably unrelated).

There are no errors or disconnect messages in glusterfs.log or
glusterfsd.log. Everything seems healthy.

We are using posix-locks and AFR, with no other features enabled.

Any ideas? Thanks.

Billy