[Gluster-devel] error while reading from an open file
Brian Hirt
bhirt at mobygames.com
Wed Sep 2 16:06:41 UTC 2009
On Sep 2, 2009, at 7:12 AM, Vijay Bellur wrote:
> Brian Hirt wrote:
>>
>> The first part of this problem (open files not surviving gluster
>> restarts) seems like a pretty major design flaw that needs to be
>> fixed.
> Yes, we do know that this is a problem and we have our sights set on
> solving this.
That is good to know. Do you know if is this planned on being back
ported into 2.0 or is it going to be part of 2.1? Is there a bug
report id so we can follow the progress?
>
>> The second part (gluster not reporting the error to the writer when
>> gluster chokes) is a critical problem that needs to be fixed.
>
> This is a bug in the write-behind translator and bug 242 has been
> tracked to address this.
>
> A discussion from the mailing list archives which could be of
> interest to you for the tail -f problem:
>
> http://gluster.org/pipermail/gluster-users/20090113/001362.html
>
Is there any additional information I can provide in this bug
report? I have disabled the following section from my test clients
and can confirm that some errors that were not being reported are now
being sent back to the writer program. It's certainly an improvement
over no errors being reported.
volume writebehind
type performance/write-behind
option window-size 1MB
subvolumes distribute
end-volume
I've also discovered, that this problem is not isolated to the
writebehind module. While some errors are being sent back to the
writer, there is still data corruption in the gluster created file.
Gluster is still reporting success to the writer when writes have
failed. I have a simple program that writes 1, 2, 3, 4 ... N to a
file at the rate of 100 lines per second. Whenever the writer gets an
error returned from write() it waits a second, reopens the file and
continues writing. While this writer is writing, I restart the
gluster nodes one by one. Once this is done, I stop the writter and
check it for corruption.
One interesting observation I have made is that when restarting the
gluster servers, sometimes errorno EBADFD is returned and sometimes
it's ENOTCONN. When errno is ENOTCONN (107 in ubuntu 9.04) the file
is not corrupted. When errno is EBADFD (77 in ubuntu 9.04) there is
file corruption. These statements are based on a limited number of
test runs, but were always true for me.
Some sample output of some tests:
bhirt at ubuntu:~/gluster-tests$ rm -f /unify/m/test1.2009-09-02 && ./
write-numbers /unify/m/test1.2009-09-02
problems writting to fd, reopening logfile (errno = 77) in one second
^C
bhirt at ubuntu:~/gluster-tests$ ./check-numbers /unify/m/test1.2009-09-02
169 <> 480
bhirt at ubuntu:~/gluster-tests$ rm -f /unify/m/test1.2009-09-02 && ./
write-numbers /unify/m/test1.2009-09-02
problems writting to fd, reopening logfile (errno = 107) in one second
^C
bhirt at ubuntu:~/gluster-tests$ ./check-numbers /unify/m/test1.2009-09-02
OK
The programs I use to test this are:
bhirt at ubuntu:~/gluster-tests$ cat write-numbers.c check-numbers
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#define BUFSIZE 65536
/* write 100 entries per second */
#define WRITE_DELAY 1000000 / 100
int open_testfile(char *testfile)
{
int fd;
fd = open(testfile, O_WRONLY | O_CREAT | O_APPEND, 0666);
if (fd < 0) {
perror("open");
exit(2);
}
return(fd);
}
void usage(char *s)
{
fprintf(stderr, "\nusage: %s testfile\n\n",s);
}
int main (int argc, char **argv)
{
char buf[BUFSIZE];
int logfd;
int nread;
int counter = 0;
if (argc != 2) {
usage(argv[0]);
exit(1);
}
logfd = open_testfile(argv[1]);
/* loop endlessly */
for (;;) {
snprintf(buf, sizeof(buf), "%d\n",counter);
nread = strnlen(buf,sizeof(buf));
/* write data */
int nwrite = write(logfd, buf, nread);
if (nwrite == nread) {
counter++;
usleep(WRITE_DELAY);
} else {
/* restarted gluster nodes give this error in 2.0.6 */
if (errno == EBADFD || errno == ENOTCONN)
{
/* wait a second before re-opening the file */
fprintf(stderr,"problems writting to fd, reopening logfile
(errno = %d) in one second\n",errno);
sleep(1);
/* reopen log file, and set write again flag so the data tries
to get written back */
logfd = open_testfile(argv[1]);
}
else
{
perror("write");
exit(2);
}
}
}
}
#!/usr/bin/perl
use strict;
use warnings;
my $i=0;
while (<>) { die "$i <> $_" if $i++ != $_; }
print STDERR "OK\n";
The client log file during one of the tests I ran.
[2009-09-02 09:59:23] E [saved-frames.c:165:saved_frames_unwind]
remote1: forced unwinding frame type(1) op(FINODELK)
[2009-09-02 09:59:23] N [client-protocol.c:6246:notify] remote1:
disconnected
[2009-09-02 09:59:23] E [socket.c:745:socket_connect_finish] remote1:
connection to 10.0.1.31:6996 failed (Connection refused)
[2009-09-02 09:59:26] N [client-protocol.c:5559:client_setvolume_cbk]
remote1: Connected to 10.0.1.31:6996, attached to remote volume 'brick'.
[2009-09-02 09:59:30] E [saved-frames.c:165:saved_frames_unwind]
remote2: forced unwinding frame type(1) op(WRITE)
[2009-09-02 09:59:30] W [fuse-bridge.c:1534:fuse_writev_cbk] glusterfs-
fuse: 153358: WRITE => -1 (Transport endpoint is not connected)
[2009-09-02 09:59:30] N [client-protocol.c:6246:notify] remote2:
disconnected
[2009-09-02 09:59:30] E [socket.c:745:socket_connect_finish] remote2:
connection to 10.0.1.32:6996 failed (Connection refused)
[2009-09-02 09:59:33] N [client-protocol.c:5559:client_setvolume_cbk]
remote2: Connected to 10.0.1.32:6996, attached to remote volume 'brick'.
[2009-09-02 09:59:34] N [client-protocol.c:5559:client_setvolume_cbk]
remote1: Connected to 10.0.1.31:6996, attached to remote volume 'brick'.
[2009-09-02 09:59:37] E [saved-frames.c:165:saved_frames_unwind]
remote1: forced unwinding frame type(1) op(FINODELK)
[2009-09-02 09:59:37] W [fuse-bridge.c:1534:fuse_writev_cbk] glusterfs-
fuse: 153923: WRITE => -1 (File descriptor in bad state)
[2009-09-02 09:59:37] N [client-protocol.c:6246:notify] remote1:
disconnected
[2009-09-02 09:59:40] N [client-protocol.c:5559:client_setvolume_cbk]
remote1: Connected to 10.0.1.31:6996, attached to remote volume 'brick'.
[2009-09-02 09:59:41] N [client-protocol.c:5559:client_setvolume_cbk]
remote2: Connected to 10.0.1.32:6996, attached to remote volume 'brick'.
[2009-09-02 09:59:44] N [client-protocol.c:5559:client_setvolume_cbk]
remote1: Connected to 10.0.1.31:6996, attached to remote volume 'brick'.
[2009-09-02 09:59:51] W [fuse-bridge.c:882:fuse_err_cbk] glusterfs-
fuse: 155106: FLUSH() ERR => -1 (File descriptor in bad state)
[2009-09-02 09:59:51] W [fuse-bridge.c:882:fuse_err_cbk] glusterfs-
fuse: 155108: FLUSH() ERR => -1 (File descriptor in bad state)
> Regards,
> Vijay
>
>
More information about the Gluster-devel
mailing list