[Gluster-devel] This bug hunt just gets weirder...
nicolas prochazka
prochazka.nicolas at gmail.com
Tue Mar 3 14:26:08 UTC 2009
It seems to be the same problem that occurs for me (cf. my previous report).
----
Hello,
I'm using the latest gluster from git.
I think there is a problem with the lock server in AFR mode:
Test:
Servers A and B in AFR.
TEST 1
1/ install A and B, then copy a file to A: synchronisation to B is perfect
2/ erase server B completely and reinstall it: synchronisation does not
happen (nothing is done)
TEST 2
1/ install A and B, then copy a file to A (gluster mount point):
synchronisation to B is perfect
2/ erase A completely and reinstall it: synchronisation from B is perfect
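(Note: AFR re-synchronisation is triggered when the files are accessed
again through the client mount point, e.g. something like the command
below; the mount path /mnt/glusterfs is just an example here:
find /mnt/glusterfs -type f -exec head -c1 '{}' \; > /dev/null
)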
Now if I redo TEST 1, but in my replicate volume (volume last) I swap
brick_10.98.98.1 and 10.98.98.2 in the subvolumes line, so that
10.98.98.1 is now the lock server for AFR, then TEST 1 works and TEST 2
does not.
I think that in one of the cases it tries to use the lock server on
which the file does not exist, and that is when the problem occurs.
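To be explicit, the only change for that re-run is the order of the
subvolumes line in the replicate volume, i.e. roughly:
subvolumes brick_10.98.98.1 brick_10.98.98.2
instead of
subvolumes brick_10.98.98.2 brick_10.98.98.1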
I tried adding 2 lock servers with
option data-lock-server-count 2
option entry-lock-server-count 2
without success;
I also tried with 0, without success.
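(Those options go in the replicate volume definition, i.e. something like:
volume last
type cluster/replicate
option data-lock-server-count 2
option entry-lock-server-count 2
subvolumes brick_10.98.98.2 brick_10.98.98.1
option read-subvolume brick_10.98.98.2
option favorite-child brick_10.98.98.2
end-volume
)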
Client config file (the same for A and B):
volume brick_10.98.98.1
type protocol/client
option transport-type tcp/client
option transport-timeout 120
option remote-host 10.98.98.1
option remote-subvolume brick
end-volume
volume brick_10.98.98.2
type protocol/client
option transport-type tcp/client
option transport-timeout 120
option remote-host 10.98.98.2
option remote-subvolume brick
end-volume
volume last
type cluster/replicate
subvolumes brick_10.98.98.2 brick_10.98.98.1
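# note: the first subvolume listed (brick_10.98.98.2) acts as the AFR lock server here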
option read-subvolume brick_10.98.98.2
option favorite-child brick_10.98.98.2
end-volume
volume iothreads
type performance/io-threads
option thread-count 4
subvolumes last
end-volume
volume io-cache
type performance/io-cache
option cache-size 2048MB # default is 32MB
option page-size 1MB #128KB is default option
option cache-timeout 2 # default is 1
subvolumes iothreads
end-volume
volume writebehind
type performance/write-behind
option block-size 256KB # default is 0bytes
option cache-size 512KB
option flush-behind on # default is 'off'
subvolumes io-cache
end-volume
Server config for A and B, the same except for the IP:
volume brickless
type storage/posix
option directory /mnt/disks/export
end-volume
volume brickthread
type features/posix-locks
option mandatory on # enables mandatory locking on all files
subvolumes brickless
end-volume
volume brickcache
type performance/io-cache
option cache-size 1024MB
option page-size 1MB
option cache-timeout 2
subvolumes brickthread
end-volume
volume brick
type performance/io-threads
option thread-count 8
option cache-size 256MB
subvolumes brickcache
end-volume
volume server
type protocol/server
subvolumes brick
option transport-type tcp
option auth.addr.brick.allow 10.98.98.*
end-volume
On Tue, Mar 3, 2009 at 2:40 PM, Gordan Bobic <gordan at bobich.net> wrote:
> On Tue, 3 Mar 2009 19:02:03 +0530, Anand Avati <avati at gluster.com> wrote:
>> On Wed, Feb 18, 2009 at 1:09 AM, Gordan Bobic <gordan at bobich.net> wrote:
>>> OK, I've managed to resolve this, but it wasn't possible to resync the
>>> primary off the secondary. What I ended up doing was backing up the files
>>> that were changed since the primary went down, blanking the secondary,
>>> resyncing the secondary off the primary, and copying the backed up files
>>> back into the file system.
>>>
>>> By primary and secondary here I am referring to the order in which they
>>> are listed in subvolumes.
>>>
>>> So to re-iterate - syncing primary off the secondary wasn't working, but
>>> syncing secondary off the primary worked.
>>>
>>> Can anyone hazard a guess as to how to debug this issue further? Since I
>>> have the backup of the old data on the secondary, I can probably have a
>>> go
>>> at re-creating the problem (I'm hoping it won't be re-creatable with the
>>> freshly synced data).
>>
>> Did you happen to change the subvolume order in the volfile, or
>> add/remove subvols? Doing so may result in such unexpected behavior.
>
> Yes. I added a server/subvolume to the AFR cluster, and subsequently removed
> one of the servers. Are there any additional procedures that have to be
> followed when adding a node to a cluster?
>
> Gordan
>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>