[Gluster-users] Few issues with AFR resync

Deian Chepishev dchepishev at nexbrod.com
Wed Jun 18 12:08:49 UTC 2008


Hello,

You will find my answers in the text below.

Krishna Srinivas wrote:
> On Tue, Jun 17, 2008 at 5:47 PM, Deian Chepishev <dchepishev at nexbrod.com> wrote:
>   
>> Hello guys,
>>
>> Sorry if this has been discussed before, but the mailing list archives
>> are not available and I could not check there first.
>>
>> I have a few strange issues with AFR and one config question.
>>
>> Here they are ;) :
>>
>> I have the following setup:
>> 6 servers and 3 clients.
>> For the test I use only 2 servers in AFR and 1 client
>> All the software is running in VMware virtual machines
>> CentOS 5.1
>> Kernel: 2.6.18-53.1.21.el5
>> GFS: glusterfs-1.3.9-1
>> FUSE: fuse-2.7.3glfs10-1
>> I export one single partition with XFS filesystem
>>
>> Here are the config files:
>> Server config:
>> ================================================
>> volume brick
>>  type storage/posix
>>  option directory /data/export
>> end-volume
>>
>> volume server
>>  type protocol/server
>>  option transport-type tcp/server
>>  option auth.ip.brick.allow *
>>  subvolumes brick
>> end-volume
>> ================================================
>> Client Config:
>> ================================================
>> volume brick1
>>  type protocol/client
>>  option transport-type tcp/client
>>  option remote-host 10.100.1.1
>>  option remote-subvolume brick
>> end-volume
>>
>> volume brick2
>>  type protocol/client
>>  option transport-type tcp/client
>>  option remote-host 10.100.1.2
>>  option remote-subvolume brick
>> end-volume
>>
>> volume afr1
>>  type cluster/afr
>>  subvolumes brick1 brick2
>> end-volume
>> ================================================
>>
>> I mount the volume with the following command:
>> glusterfs -f /etc/glusterfs/glusterfs-client.vol /mnt/glusterfs
>>
>>
>> The problems I am facing are the following
>> Problem 1
>>
>> a) I start both servers and mount on one client
>> b) I create some files and dirs on the mount point
>> c) killall glusterfs on server1
>> d) create some new dirs and files
>> e) start glusterfs again on server1
>>
>> At this point it just does not want to replicate. I tried unmounting and
>> remounting the underlying filesystem, and even recreating it, but no luck:
>> it still does not replicate. Things get better if I reboot the server1
>> machine. After the reboot, replication starts to work and replicates the
>> files.
>> However another problem pops up.
>>     
>
> Got it. When you bring back the downed server, can you "cd" out of
> glusterfs mount point and "cd" back in and see if things work fine?
>   


This fixes the problem and replicates the directory tree. However, if
Apache, for example, is serving files from this tree, the only way to
make it leave the directory and come back in is to restart it. That is
not acceptable, because until the restart the users will get 404 errors.
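
For an interactive shell or a script, the workaround amounts to something
like the sketch below. It is only an illustration: the function name
reenter_cwd and the example path are made up here, and whether leaving and
re-entering is enough in every case is an assumption based on the
suggestion above.

```shell
# Leave the current directory and come straight back by path, so the
# working directory is looked up again starting from the root.
# reenter_cwd is a hypothetical helper name, not part of GlusterFS.
reenter_cwd() {
    here="$(pwd)" || return 1
    cd / && cd "$here"
}

# Usage, from somewhere inside the mount point (example path only):
#   cd /mnt/glusterfs/some/dir && reenter_cwd
```

A long-running daemon such as Apache has no equivalent of this short of a
restart.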


>   
>> Problem 1A:
>>
>> All the synced files/dirs have creation time Jan  1  1970 and I don't see
>> a way to fix this.
>> If I create a new file on the client the date is fine, however if I start
>> from a clean partition all the synced files have date Jan 1 1970
>>
>> The time on the servers and clients is in sync with ntpd.
>>
>>     
>
> That is fine. They are not completely replicated yet; their time stamps
> will be set right when the actual content sync happens. At this point
> the files are just created, not yet synced. You have to open() the
> files to sync them.
>   

You are correct about this. head -1 * > /dev/null fixed the dates.
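
To cover a whole tree rather than a single directory, the same idea can be
applied recursively. A minimal sketch, assuming that reading the first byte
of each file is enough to cause the open(); the function name
open_all_files is only illustrative:

```shell
# Read the first byte of every regular file under a directory.
# Each read implies an open(), which is what should trigger the sync
# on an AFR mount. open_all_files is a hypothetical helper name.
open_all_files() {
    dir="$1"
    find "$dir" -type f -exec head -c 1 {} \; > /dev/null
}

# Usage (mount point as in my setup):
#   open_all_files /mnt/glusterfs
```

On a large tree this opens every file once, so it can take a while.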

>   
>> Problem 2:
>>
>>
>>    a) I create some files/dirs while both servers are up
>>    b) killall glusterfs on server1
>>    c) mkdir -p /mnt/glusterfs/1/2/3/4/5/6/7/8 - on the client
>>    d) cd /mnt/glusterfs/1/2/3/4/5/6/7/8 - on the client
>>    e) start glusterfs on server1 - glusterfs -f
>> /etc/glusterfs/glusterfs-server.vol
>>    f)  on the client I do - touch some files here and there
>>       and I get this error:
>> [root at r3cl1 8]# touch some files here and there
>> touch: cannot touch `some': No such file or directory
>> touch: cannot touch `files': No such file or directory
>> touch: cannot touch `here': No such file or directory
>> touch: cannot touch `and': No such file or directory
>> touch: cannot touch `there': No such file or directory
>>
>>     
>
> Again can you see if "cd"ing out and back in after you bring
> the downed server up fixes the problem?
>
> We are thinking about how to fix this.
>   

The ugly part of this behavior is that when I start the "crashed"
server again, entire directory trees suddenly disappear on the client
machine, which causes a lot of trouble.
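
Until this is fixed, a script that runs inside the mount can at least
detect the condition before it does real work. A rough sketch, assuming
the symptom is the same "No such file or directory" on create that touch
reported above; can_create_here and the probe file name are made up for
illustration:

```shell
# Try to create and remove a small probe file in the current directory.
# Returns 0 if the create worked, 1 if it failed (for example with
# "No such file or directory", as touch did in the test above).
# can_create_here is a hypothetical helper name.
can_create_here() {
    probe="./.probe.$$"               # hypothetical probe file name
    # The subshell keeps a failed redirection from aborting the caller.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# Usage, from inside the mount point:
#   can_create_here || echo "current directory is stale, cd out and back in"
```

This only detects the problem in the directory the script is sitting in;
it does not repair anything.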


Thank you.

Regards,
Deian






