[Gluster-users] Replicating data files is causing issue with postgres

Jeff Lord jlord at mediosystems.com
Thu Apr 9 18:36:12 UTC 2009


Actually, we are still seeing errors when trying to restore our
database to a gluster-provided mount:

-bash-3.2$ pg_restore -U entitystore -d entitystore --no-owner -n public entitystore
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 1834; 0 147124 TABLE DATA entity_vzw-wthan-music-2 entitystore
pg_restore: [archiver (db)] COPY failed: ERROR:  unexpected data beyond EOF in block 70626 of relation "entity_vzw-wthan-music-2"
HINT:  This has been seen to occur with buggy kernels; consider updating your system.
CONTEXT:  COPY entity_vzw-wthan-music-2, line 668331: "vzw-wthan-music-2	2406931	\\340\\000\\000\\001\\0008\\317\\002ns2.http://schemas.medio.com/usearch/1 ..."
WARNING: errors ignored on restore: 1

The gluster config file is:

-bash-3.2$ cat /etc/glusterfs/replicatedb.vol
volume posix
  type storage/posix
  option directory /mnt/sdb1
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume brick
  type performance/io-threads
  subvolumes locks
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick.allow *
  subvolumes brick
end-volume

volume gfs01-hq.hq.msrch
  type protocol/client
  option transport-type tcp
  option remote-host gfs01-hq
  option remote-subvolume brick
end-volume

volume gfs02-hq.hq.msrch
  type protocol/client
  option transport-type tcp
  option remote-host gfs02-hq
  option remote-subvolume brick
end-volume

volume replicate
  type cluster/replicate
  subvolumes gfs01-hq.hq.msrch gfs02-hq.hq.msrch
end-volume

#volume writebehind
#  type performance/write-behind
#  option page-size 128KB
#  option cache-size 1MB
#  subvolumes replicate
#end-volume
#
#volume cache
#  type performance/io-cache
#  option cache-size 512MB
#  subvolumes writebehind
#end-volume

The main issue is that we are able to perform a restore with no errors from
one machine.
When we then stop the database on node2, start it on node1, and attempt a
restore, we see these errors.
These errors are not present at all when the Postgres data files are stored
on local disk.
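The failover sequence above can be sketched as follows (a hypothetical outline: the init-script name is an assumption, while the pg_restore invocation is the one from the top of this message):

```shell
# On node2 (where the restore previously succeeded): stop postgres
node2$ /etc/init.d/postgresql stop     # init-script name assumed

# On node1: start postgres against the same gluster-backed data directory
node1$ /etc/init.d/postgresql start    # init-script name assumed

# On node1: re-run the restore; this is where the EOF errors appear
node1$ pg_restore -U entitystore -d entitystore --no-owner -n public entitystore
```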

-Jeff




On Apr 1, 2009, at 3:07 PM, Jeff Lord wrote:

>
> On Apr 1, 2009, at 10:57 AM, Anand Avati wrote:
>
>>> Ok. Problem solved.
>>> We were mounting the file system with:
>>>
>>> mount -t glusterfs -o volume-name=cache /etc/glusterfs/replicatedb.vol /mnt/replicate
>>>
>>> So I dropped the db and the tablespace and remounted the gluster  
>>> share as:
>>>
>>> mount -t glusterfs -o volume-name=replicate /etc/glusterfs/replicatedb.vol /mnt/replicate
>>>
>>> After that our full database restore completed with no errors.
>>> This is a great thing!
>>> As you can see, volume-name=cache references write-behind, which
>>> seemed to be causing the problems.
>>>
>>
>> volume-name=cache references both io-cache and write-behind. Can you try
>> with volume-name=write-behind and see if things work? That way we can
>> narrow the issue down to io-cache specifically.
>>
>> Avati
>
>
> Yes, I will try referencing volume-name=write-behind.
> On a related note, restoring the database from gfs02-hq still gives
> errors, whereas restoring from gfs01-hq does not.
> Is there any reason this could be related to the favorite-child
> setting in the config?
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users




