[Gluster-devel] Mounting with file servers is failing very frequently (in every day)

Joseph Job joseph at spectrum.net.in
Wed Oct 31 14:44:07 UTC 2007


Hi Raghavendra,

I have commented out the transport-timeout option and remounted. After 
working for some time the mount failed again. See the logs from the 
client side below. Any help in this regard would be highly appreciated.



2007-10-31 10:26:20 W [client-protocol.c:210:call_bail] client_213: activating bail-out. pending frames = 1. last sent = 2007-10-31 10:22:58. last received = 2007-10-31 10:23:22 transport-timeout = 108
2007-10-31 10:26:20 C [client-protocol.c:218:call_bail] client_213: bailing transport
2007-10-31 10:26:20 D [tcp.c:131:cont_hand] tcp: forcing poll/read/write to break on blocked socket (if any)
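
To make the timing in the bail-out warning easier to read, here is a minimal Python sketch that pulls the timestamps out of that log line and computes how long the client had been waiting. The function name parse_bail_out and the regular expression are illustrative assumptions, not part of GlusterFS. The exact bail-out condition used by client-protocol.c is not shown in this thread, so the comparison against transport-timeout is only indicative.

```python
import re
from datetime import datetime

# The call_bail warning from the client log, joined onto one line.
LINE = (
    "2007-10-31 10:26:20 W [client-protocol.c:210:call_bail] client_213: "
    "activating bail-out. pending frames = 1. "
    "last sent = 2007-10-31 10:22:58. "
    "last received = 2007-10-31 10:23:22 transport-timeout = 108"
)

FMT = "%Y-%m-%d %H:%M:%S"
STAMP = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"


def parse_bail_out(line):
    """Hypothetical helper: extract timing details from a call_bail warning."""
    m = re.search(
        rf"^({STAMP}) .*last sent = ({STAMP})\. "
        rf"last received = ({STAMP}) transport-timeout = (\d+)",
        line,
    )
    if m is None:
        raise ValueError("not a bail-out line")
    logged, sent, received = (datetime.strptime(g, FMT) for g in m.groups()[:3])
    return {
        "timeout": int(m.group(4)),
        # Seconds between the last request sent and the bail-out.
        "since_sent": int((logged - sent).total_seconds()),
        # Seconds between the last reply received and the bail-out.
        "since_received": int((logged - received).total_seconds()),
    }


info = parse_bail_out(LINE)
print(info)
```

For the line above, the client had gone 202 seconds since the last request and 178 seconds since the last reply, both longer than the logged transport-timeout of 108 seconds.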


Regarding the type of files: they average 2 to 4 MB each, but there are 
around 2 to 3 lakh (200,000 to 300,000) files in every folder, across 
around 20 folders.

The servers are connected together with network bonding (a 2 Gigabit 
connection).
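
As a rough sense of scale, the figures above can be multiplied out. This is only a back-of-the-envelope sketch in Python that takes the stated numbers at face value; the helper name estimate and the decimal convention (1 TB = 1,000,000 MB) are assumptions made for illustration.

```python
LAKH = 100_000  # one lakh is 100,000 in the Indian numbering system


def estimate(files_per_folder, folders, mb_per_file):
    """Return (total file count, total size in TB), with 1 TB = 1,000,000 MB."""
    total_files = files_per_folder * folders
    total_tb = total_files * mb_per_file / 1_000_000
    return total_files, total_tb


# Lower bound: 2 lakh files per folder at 2 MB each, across 20 folders.
low = estimate(2 * LAKH, 20, 2)
# Upper bound: 3 lakh files per folder at 4 MB each, across 20 folders.
high = estimate(3 * LAKH, 20, 4)
print(low, high)
```

Under these assumptions the range works out to between 4 and 6 million files and between 8 and 24 TB in total.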

Thanks,
Joseph




At 05:07 PM 10/30/2007, you wrote:

>Hi Joseph,
>transport-timeout in clients is very low in your configuration (4 
>seconds). Use a higher timeout or just comment out the option to use 
>the default one, which should be sufficient.
>
>regards,
>
>On Oct 30, 2007 3:16 PM, Joseph Job 
><joseph at spectrum.net.in> wrote:
>My Setup details..
>
>Operating System : Trustix Secure Linux release 3.0.5 (Mirch Masala)
>Kernel Version : 2.6.19.7-1
>
>Gluster version :
>glusterfs-server-1.3.6-1
>glusterfs-client-1.3.6-1
>glusterfs-common-1.3.6-1
>glusterfs-devel-1.3.6-1
>
>Working Mode.
>
>Two web servers are there and both web servers are using Gluster file
>server in TCP network for file access.
>
>see my server side configuration...
>
>
>
>Server 1
>
>## Define the storage
>volume fs3-storage
>  type storage/posix                   # POSIX FS translator
>  option directory /storage            # Export this directory
>end-volume
>
>volume iothreads                      #iothreads can give performance a boost
>   type performance/io-threads
>   option thread-count 16
>   subvolumes fs3-storage
>end-volume
>
>## Add network serving capability to above brick.
>volume server
>  type protocol/server
>  option transport-type tcp/server     # For TCP/IP transport
>  option listen-port 6996              # Default is 6996
>  option client-volume-filename /var/log/glusterfs/client.vol
>  subvolumes iothreads
>  option auth.ip.iothreads.allow * # Allow access to "iothreads" volume
>end-volume
>
>Server 2
>
>## Define the storage
>volume fs4-storage
>  type storage/posix                   # POSIX FS translator
>  option directory /storage            # Export this directory
>end-volume
>
>volume iothreads                      #iothreads can give performance a boost
>   type performance/io-threads
>   option thread-count 16
>   subvolumes fs4-storage
>end-volume
>
>## Add network serving capability to above brick.
>volume server
>  type protocol/server
>  option transport-type tcp/server     # For TCP/IP transport
>  option listen-port 6996              # Default is 6996
>  option client-volume-filename /var/log/glusterfs/client.vol
>  subvolumes iothreads
>  option auth.ip.iothreads.allow * # Allow access to "iothreads" volume
>end-volume
>
>
>Client side configuration...
>
>Client 1
>
>### Add client feature and attach to remote subvolume
>volume client_214
>  type protocol/client
>  option transport-type tcp/client     # for TCP/IP transport
>  option remote-host 10.10.0.214       # IP address of the remote brick
>  option remote-port 6996              # default server port is 6996
>  option remote-subvolume iothreads        # name of the remote volume
>  option transport-timeout 4
>end-volume
>
>### Add client feature and attach to remote subvolume
>volume client_213
>  type protocol/client
>  option transport-type tcp/client     # for TCP/IP transport
>  option remote-host 10.10.0.213       # IP address of the remote brick
>  option remote-port 6996              # default server port is 6996
>  option remote-subvolume iothreads        # name of the remote volume
>  option transport-timeout 4
>end-volume
>
>volume afrbricks
>  type cluster/afr
>  subvolumes  client_214  client_213
>  option replicate *:2
>  option self-heal on
>end-volume
>
>volume iothreads    #iothreads can give performance a boost
>   type performance/io-threads
>   option thread-count 8
>   subvolumes afrbricks
>end-volume
>##########################
>
>Client 2
>
>### Add client feature and attach to remote subvolume
>volume client_214
>  type protocol/client
>  option transport-type tcp/client     # for TCP/IP transport
>  option remote-host 10.10.0.214       # IP address of the remote brick
>  option remote-port 6996              # default server port is 6996
>  option remote-subvolume iothreads        # name of the remote volume
>  option transport-timeout 4
>end-volume
>
>### Add client feature and attach to remote subvolume
>volume client_213
>  type protocol/client
>  option transport-type tcp/client     # for TCP/IP transport
>  option remote-host 10.10.0.213       # IP address of the remote brick
>  option remote-port 6996              # default server port is 6996
>  option remote-subvolume iothreads        # name of the remote volume
>  option transport-timeout 4
>end-volume
>
>volume afrbricks
>  type cluster/afr
>  subvolumes  client_214  client_213
>  option replicate *:2
>  option self-heal on
>end-volume
>
>volume iothreads    #iothreads can give performance a boost
>   type performance/io-threads
>   option thread-count 8
>   subvolumes afrbricks
>end-volume
>##########################
>
>
>I am mounting the server on the client with:
>glusterfs -f /etc/glusterfs/glusterfs-client.vol /storage/
>
>I am able to mount, and files are getting replicated to both file
>servers. But the problem is that the mount is breaking very frequently...
>
>I am getting these errors in glusterfs.log
>
>2007-10-30 00:46:10 C [tcp.c:81:tcp_disconnect] client_213:
>connection disconnected
>2007-10-30 00:46:18 C [client-protocol.c:218:call_bail] client_214:
>bailing transport
>2007-10-30 00:46:18 C [client-protocol.c:218:call_bail] client_213:
>bailing transport
>2007-10-30 00:46:18 C [tcp.c:81:tcp_disconnect] client_214:
>connection disconnected
>2007-10-30 00:46:18 C [tcp.c:81:tcp_disconnect] client_213:
>connection disconnected
>
>But the physical connection is still there...I can ping from client
>to servers...
>The server is using gigabit networking bonding.
>
>I am using a kernel with FUSE support
>
>root@w3-cok ~# lsmod
>Module                  Size  Used by
>fuse                   39444  2
>ipv6                  221344  24
>tg3                   105860  0
>bonding                79224  0
>jfs                   163564  2
>usbhid                 35936  0
>ohci_hcd               18564  0
>usbcore               112772  3 usbhid,ohci_hcd
>parport_pc             21956  0
>parport                20032  1 parport_pc
>shpchp                 32416  0
>serverworks             8840  0 [permanent]
>cciss                  54020  8
>dm_mod                 49432  0
>sd_mod                 17024  0
>piix                    9604  0 [permanent]
>ide_disk               14336  0
>ide_generic             2048  0 [permanent]
>ide_core              106444  4 serverworks,piix,ide_disk,ide_generic
>
>
>Also see glusterfsd.log on the server side
>
>2007-10-30 00:03:43 E [server-protocol.c:197:generic_reply] server:
>transport_writev failed
>2007-10-30 00:03:43 E [tcp.c:118:tcp_except] server: shutdown () -
>error: Transport endpoint is not connected
>2007-10-30 00:03:43 C [tcp.c:81:tcp_disconnect] server: connection 
>disconnected
>2007-10-30 00:03:43 E [protocol.c:253:gf_block_unserialize_transport]
>server: EOF from peer (10.10.0.203:1018)
>2007-10-30 00:03:43 C [tcp.c:81:tcp_disconnect] server: connection 
>disconnected
>2007-10-30 00:03:43 E [server-protocol.c:197:generic_reply] server:
>transport_writev failed
>2007-10-30 00:03:43 C [tcp.c:81:tcp_disconnect] server: connection 
>disconnected
>2007-10-30 00:03:43 E [tcp.c:118:tcp_except] server: shutdown () -
>error: Transport endpoint is not connected
>2007-10-30 00:03:43 C [tcp.c:81:tcp_disconnect] server: connection 
>disconnected
>2007-10-30 00:03:43 E [server-protocol.c:197:generic_reply] server:
>transport_writev failed
>2007-10-30 00:03:43 E [tcp.c:118:tcp_except] server: shutdown () -
>error: Transport endpoint is not connected
>2007-10-30 00:03:43 C [tcp.c:81:tcp_disconnect] server: connection 
>disconnected
>2007-10-30 00:03:43 C [tcp.c:81:tcp_disconnect] server: connection 
>disconnected
>2007-10-30 00:03:43 E [server-protocol.c:197:generic_reply] server:
>transport_writev failed
>2007-10-30 00:03:43 E [tcp.c:118:tcp_except] server: shutdown () -
>error: Transport endpoint is not connected
>2007-10-30 00:03:43 C [tcp.c:81:tcp_disconnect] server: connection 
>disconnected
>2007-10-30 00:03:43 C [tcp.c:81:tcp_disconnect] server: connection 
>disconnected
>2007-10-30 00:03:43 C [tcp.c:81:tcp_disconnect] server: connection 
>disconnected
>2007-10-30 00:03:43 E [server-protocol.c:197:generic_reply] server:
>transport_writev failed
>2007-10-30 00:03:43 C [tcp.c:81:tcp_disconnect] server: connection 
>disconnected
>2007-10-30 00:03:43 E [tcp.c:118:tcp_except] server: shutdown () -
>error: Transport endpoint is not connected
>2007-10-30 00:48:52 E [server-protocol.c:197:generic_reply] server:
>transport_writev failed
>2007-10-30 00:48:52 E [tcp.c:118:tcp_except] server: shutdown () -
>error: Transport endpoint is not connected
>2007-10-30 00:48:52 C [tcp.c:81:tcp_disconnect] server: connection 
>disconnected
>2007-10-30 00:48:52 E [server-protocol.c:197:generic_reply] server:
>transport_writev failed
>2007-10-30 00:48:52 C [tcp.c:81:tcp_disconnect] server: connection 
>disconnected
>2007-10-30 00:48:52 E [tcp.c:118:tcp_except] server: shutdown () -
>error: Transport endpoint is not connected
>2007-10-30 00:48:52 C [tcp.c:81:tcp_disconnect] server: connection 
>disconnected
>2007-10-30 00:48:52 E [server-protocol.c:197:generic_reply] server:
>transport_writev failed
>2007-10-30 00:48:52 E [tcp.c:118:tcp_except] server: shutdown () -
>error: Transport endpoint is not connected
>2007-10-30 00:48:52 C [tcp.c:81:tcp_disconnect] server: connection 
>disconnected
>2007-10-30 00:48:52 C [tcp.c:81:tcp_disconnect] server: connection 
>disconnected
>2007-10-30 00:48:52 C [tcp.c:81:tcp_disconnect] server: connection 
>disconnected
>
>
>
>
>JOSEPH JOB
>Spectrum Softtech Solutions(P)Ltd.
>MahaKavi G Road,
>Karikkamuri Cross Road
>Kochi-682011
>0484-4082000
>joseph at spectrum.net.in
>Visit at www.spectrum.net.in
>
>
>_______________________________________________
>Gluster-devel mailing list
>Gluster-devel at nongnu.org
>http://lists.nongnu.org/mailman/listinfo/gluster-devel
>
>
>
>
>--
>Raghavendra G
>
>A centipede was happy quite, until a toad in fun,
>Said, "Pray, which leg comes after which?",
>This raised his doubts to such a pitch,
>He fell flat into the ditch,
>Not knowing how to run.
>-Anonymous

JOSEPH JOB
Spectrum Softtech Solutions(P)Ltd.
MahaKavi G Road,
Karikkamuri Cross Road
Kochi-682011
0484-4082000
joseph at spectrum.net.in
Visit at www.spectrum.net.in


