[Gluster-devel] Cascading different translators does not work as expected

yaomin @ gmail yangyaomin at gmail.com
Mon Jan 5 02:48:26 UTC 2009


Hey,

    I am trying to use the following cascading configuration to improve throughput, but the results are poor.
    There are four storage nodes, and each exports two directories.

     (on client)                      unify (alu)
                                     /          \
(translator on client)        stripe1            stripe2
                              /      \           /      \
(translator on client)     afr1     afr2      afr3     afr4
                           /  \     /  \      /  \     /  \
                        #1-1 #2-1 #3-1 #4-1 #1-2 #2-2 #3-2 #4-2

    (#m-n = brick n exported by storage node m, i.e. volume client<mn>
    in the spec below: afr1 = client11+client21, afr2 = client31+client41,
    afr3 = client12+client22, afr4 = client32+client42.)
   When I use iozone to test with 10 concurrent processes, I find that only the #3 and #4 storage nodes do any work; the other two nodes stay idle. I expected all four storage nodes to be working simultaneously at any time, but that is not what happens. What is wrong?

  Another issue is that memory is exhausted on the storage nodes when writing, and on the client when reading; that is not what I want. Is there any way to limit memory usage?
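
    For reference, my understanding (please correct me if I am wrong) is that each caching translator in the client spec below holds up to page-count x page-size per open file, plus the write-behind window. Taking the numbers from my spec:

        read-ahead   : 4 pages x 1MB = 4MB per file
        io-cache     : 8 pages x 1MB = 8MB per file
        write-behind : window-size     3MB
        total        : roughly 15MB per open file

    With 10 concurrent iozone processes that is already ~150MB on the client, before counting the extra per-stripe and per-afr performance stacks defined at the end of the spec.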


Best Wishes.
Alfred

The vol file on the client follows.

### file: client-volume.spec.sample

##############################################
###  GlusterFS Client Volume Specification  ##
##############################################

#### CONFIG FILE RULES:
### - "#" is the comment character.
### - The config file is case sensitive.
### - Options within a volume block can be in any order.
### - Spaces or tabs are used as delimiters within a line.
### - Each option must be specified on a single line.
### - Missing or commented fields assume default values.
### - Blank/commented lines are allowed.
### - Sub-volumes must be defined before they are referred to.
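
### Example (illustration only, not part of this configuration): a
### minimal volume block that follows the rules above, with made-up names.
# volume example-posix
#   type storage/posix                # the translator this volume instantiates
#   option directory /data/export     # options may appear in any order
# end-volume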

### Add client feature and attach to remote subvolume of server1

volume client-ns
  type protocol/client
  option transport-type tcp/client       # for TCP/IP transport
  option remote-host 192.168.13.2        # IP address of the remote brick
# option remote-port 6996                # default server port is 6996
# option transport-timeout 30            # seconds to wait for a response
                                         # from server for each request
  option remote-subvolume name_space          # name of the remote volume
end-volume

volume client11
  type protocol/client
  option transport-type tcp/client       # for TCP/IP transport
  option remote-host 192.168.13.2        # IP address of the remote brick
# option remote-port 6996                # default server port is 6996
# option transport-timeout 30            # seconds to wait for a response
                                         # from server for each request
  option remote-subvolume brick1          # name of the remote volume
end-volume

volume client12
  type protocol/client
  option transport-type tcp/client       # for TCP/IP transport
  option remote-host 192.168.13.2        # IP address of the remote brick
# option remote-port 6996                # default server port is 6996
# option transport-timeout 30            # seconds to wait for a response
                                         # from server for each request
  option remote-subvolume brick2          # name of the remote volume
end-volume


volume client21
  type protocol/client
  option transport-type tcp/client       # for TCP/IP transport
  option remote-host 192.168.13.4        # IP address of the remote brick
# option remote-port 6996                # default server port is 6996
# option transport-timeout 30            # seconds to wait for a response
                                         # from server for each request
  option remote-subvolume brick1          # name of the remote volume
end-volume

volume client22
  type protocol/client
  option transport-type tcp/client       # for TCP/IP transport
  option remote-host 192.168.13.4        # IP address of the remote brick
# option remote-port 6996                # default server port is 6996
# option transport-timeout 30            # seconds to wait for a response
                                         # from server for each request
  option remote-subvolume brick2          # name of the remote volume
end-volume

volume client31
  type protocol/client
  option transport-type tcp/client       # for TCP/IP transport
  option remote-host 192.168.13.5        # IP address of the remote brick
# option remote-port 6996                # default server port is 6996
# option transport-timeout 30            # seconds to wait for a response
                                         # from server for each request
  option remote-subvolume brick1          # name of the remote volume
end-volume

volume client32
  type protocol/client
  option transport-type tcp/client       # for TCP/IP transport
  option remote-host 192.168.13.5        # IP address of the remote brick
# option remote-port 6996                # default server port is 6996
# option transport-timeout 30            # seconds to wait for a response
                                         # from server for each request
  option remote-subvolume brick2          # name of the remote volume
end-volume

volume client41
  type protocol/client
  option transport-type tcp/client       # for TCP/IP transport
  option remote-host 192.168.13.7        # IP address of the remote brick
# option remote-port 6996                # default server port is 6996
# option transport-timeout 30            # seconds to wait for a response
                                         # from server for each request
  option remote-subvolume brick1          # name of the remote volume
end-volume

volume client42
  type protocol/client
  option transport-type tcp/client       # for TCP/IP transport
  option remote-host 192.168.13.7        # IP address of the remote brick
# option remote-port 6996                # default server port is 6996
# option transport-timeout 30            # seconds to wait for a response
                                         # from server for each request
  option remote-subvolume brick2          # name of the remote volume
end-volume

volume afr1
  type cluster/afr
  subvolumes client11 client21
  option debug off       # detailed debug messages in the log; default is off
  option self-heal on    # self-healing; default is on
end-volume

volume afr2
  type cluster/afr
  subvolumes client31 client41
  option debug off       # detailed debug messages in the log; default is off
  option self-heal on    # self-healing; default is on
end-volume

volume afr3
  type cluster/afr
  subvolumes client12 client22
  option debug off       # detailed debug messages in the log; default is off
  option self-heal on    # self-healing; default is on
end-volume

volume afr4
  type cluster/afr
  subvolumes client32 client42
  option debug off       # detailed debug messages in the log; default is off
  option self-heal on    # self-healing; default is on
end-volume
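
### As I understand cluster/afr, every write to an afr volume is
### replicated to all of its subvolumes; e.g. a write through afr1
### lands on both client11 (node1, brick1) and client21 (node2, brick1).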

volume stripe1
   type cluster/stripe
   option block-size 1MB                 # default is 128KB
   subvolumes afr1 afr2
end-volume

volume stripe2
   type cluster/stripe
   option block-size 1MB                 # default is 128KB
   subvolumes afr3 afr4
end-volume
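
### As I understand cluster/stripe, with block-size 1MB file data is
### split round-robin across the subvolumes; e.g. a 4MB file written
### through stripe1 would be laid out as:
###   bytes 0-1MB -> afr1, 1-2MB -> afr2, 2-3MB -> afr1, 3-4MB -> afr2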



volume bricks
  type cluster/unify
  subvolumes stripe1 stripe2
  option namespace client-ns
  option scheduler alu   
#  option alu.limits.min-free-disk  5%    # Don't create files on a volume with less than 5% free disk space
#  option alu.limits.max-open-files 10000  # Don't create files on a volume with more than 10000 files open

  # When deciding where to place a file, first look at the disk-usage, then at read-usage, write-usage, open files, and finally the disk-speed-usage.
  option alu.order disk-usage:read-usage:write-usage:open-files-usage:disk-speed-usage
  option alu.disk-usage.entry-threshold 2GB   # Kick in if the discrepancy in disk-usage between volumes is more than 2GB
  option alu.disk-usage.exit-threshold  60MB   # Don't stop writing to the least-used volume until the discrepancy has dropped below 1988MB (2GB - 60MB)
  option alu.open-files-usage.entry-threshold 1024   # Kick in if the discrepancy in open files is 1024
  option alu.open-files-usage.exit-threshold 32   # Don't stop until 992 files (1024 - 32) have been written to the least-used volume
  option alu.read-usage.entry-threshold 20%   # Kick in when the read-usage discrepancy is 20%
  option alu.read-usage.exit-threshold 4%   # Don't stop until the discrepancy has been reduced to 16% (20% - 4%)
  option alu.write-usage.entry-threshold 20%   # Kick in when the write-usage discrepancy is 20%
  option alu.write-usage.exit-threshold 4%   # Don't stop until the discrepancy has been reduced to 16%
#  option alu.disk-speed-usage.entry-threshold # NEVER SET IT. SPEED IS CONSTANT!!!
#  option alu.disk-speed-usage.exit-threshold  # NEVER SET IT. SPEED IS CONSTANT!!!
  option alu.stat-refresh.interval 10sec   # Refresh the statistics used for decision-making every 10 seconds
  option alu.stat-refresh.num-file-create 10   # Refresh the statistics used for decision-making after creating 10 files
end-volume
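
### Worked example of the entry/exit thresholds above (my reading):
### with disk-usage entry-threshold 2GB (= 2048MB) and exit-threshold
### 60MB, ALU starts steering new files to the least-used volume once
### the usage gap exceeds 2048MB, and keeps doing so until the gap
### drops below 2048MB - 60MB = 1988MB.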


### Add io-threads feature
volume iot
  type performance/io-threads
  option thread-count 1  # default is 1
  option cache-size 16MB #64MB

  subvolumes bricks #stripe #afr #bricks
end-volume

### Add readahead feature
volume readahead
  type performance/read-ahead
  option page-size 1MB      # unit in bytes
  option page-count 4       # cache per file  = (page-count x page-size)
  subvolumes iot
end-volume

### Add IO-Cache feature
volume iocache
  type performance/io-cache
  option page-size 1024KB
  option page-count 8
  subvolumes readahead
end-volume

### Add writeback feature
volume writeback
  type performance/write-behind
  option aggregate-size 1MB
  option window-size 3MB     # default is 0bytes
#  option flush-behind on     # default is 'off'
  subvolumes iocache   
end-volume
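
### If client memory must be bounded, presumably the per-file caches
### above could be shrunk; a sketch with untested values:
###   read-ahead   : page-size 256KB, page-count 2  ->  512KB per file
###   io-cache     : page-size 256KB, page-count 4  ->    1MB per file
###   write-behind : window-size 1MB
### i.e. about 2.5MB per open file instead of ~15MB with the current values.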


### Add io-threads feature
volume iot_stripe1
  type performance/io-threads
  option thread-count 1  # default is 1
  option cache-size 16MB #64MB
  subvolumes stripe1
end-volume

### Add readahead feature
volume readahead_stripe1
  type performance/read-ahead
  option page-size 1MB      # unit in bytes
  option page-count 4       # cache per file  = (page-count x page-size)
  subvolumes iot_stripe1
end-volume

### Add IO-Cache feature
volume iocache_stripe1
  type performance/io-cache
  option page-size 1024KB
  option page-count 8
  subvolumes readahead_stripe1
end-volume

### Add writeback feature
volume writeback_stripe1
  type performance/write-behind
  option aggregate-size 1MB
#  option window-size 3MB     # default is 0bytes
#  option flush-behind on     # default is 'off'
  subvolumes iocache_stripe1
end-volume


### Add io-threads feature
volume iot_stripe2
  type performance/io-threads
  option thread-count 1  # default is 1
  option cache-size 16MB #64MB
  subvolumes stripe2
end-volume

### Add readahead feature
volume readahead_stripe2
  type performance/read-ahead
  option page-size 1MB      # unit in bytes
  option page-count 4       # cache per file  = (page-count x page-size)
  subvolumes iot_stripe2
end-volume

### Add IO-Cache feature
volume iocache_stripe2
  type performance/io-cache
  option page-size 1024KB
  option page-count 8
  subvolumes readahead_stripe2
end-volume

### Add writeback feature
volume writeback_stripe2
  type performance/write-behind
  option aggregate-size 1MB
#  option window-size 3MB     # default is 0bytes
#  option flush-behind on     # default is 'off'
  subvolumes iocache_stripe2
end-volume


### Add io-threads feature
volume iot_afr1
  type performance/io-threads
  option thread-count 1  # default is 1
  option cache-size 16MB #64MB
  subvolumes afr1
end-volume

### Add readahead feature
volume readahead_afr1
  type performance/read-ahead
  option page-size 1MB      # unit in bytes
  option page-count 4       # cache per file  = (page-count x page-size)
  subvolumes iot_afr1
end-volume

### Add IO-Cache feature
volume iocache_afr1
  type performance/io-cache
  option page-size 1024KB
  option page-count 8
  subvolumes readahead_afr1
end-volume

### Add writeback feature
volume writeback_afr1
  type performance/write-behind
  option aggregate-size 1MB
#  option window-size 3MB     # default is 0bytes
#  option flush-behind on     # default is 'off'
  subvolumes iocache_afr1
end-volume



### Add io-threads feature
volume iot_afr2
  type performance/io-threads
  option thread-count 1  # default is 1
  option cache-size 16MB #64MB
  subvolumes afr2
end-volume

### Add readahead feature
volume readahead_afr2
  type performance/read-ahead
  option page-size 1MB      # unit in bytes
  option page-count 4       # cache per file  = (page-count x page-size)
  subvolumes iot_afr2
end-volume

### Add IO-Cache feature
volume iocache_afr2
  type performance/io-cache
  option page-size 1024KB
  option page-count 8
  subvolumes readahead_afr2
end-volume

### Add writeback feature
volume writeback_afr2
  type performance/write-behind
  option aggregate-size 1MB
#  option window-size 3MB     # default is 0bytes
#  option flush-behind on     # default is 'off'
  subvolumes iocache_afr2
end-volume



### Add io-threads feature
volume iot_afr3
  type performance/io-threads
  option thread-count 1  # default is 1
  option cache-size 16MB #64MB
  subvolumes afr3
end-volume

### Add readahead feature
volume readahead_afr3
  type performance/read-ahead
  option page-size 1MB      # unit in bytes
  option page-count 4       # cache per file  = (page-count x page-size)
  subvolumes iot_afr3
end-volume

### Add IO-Cache feature
volume iocache_afr3
  type performance/io-cache
  option page-size 1024KB
  option page-count 8
  subvolumes readahead_afr3
end-volume

### Add writeback feature
volume writeback_afr3
  type performance/write-behind
  option aggregate-size 1MB
#  option window-size 3MB     # default is 0bytes
#  option flush-behind on     # default is 'off'
  subvolumes iocache_afr3
end-volume



### Add io-threads feature
volume iot_afr4
  type performance/io-threads
  option thread-count 1  # default is 1
  option cache-size 16MB #64MB
  subvolumes afr4
end-volume

### Add readahead feature
volume readahead_afr4
  type performance/read-ahead
  option page-size 1MB      # unit in bytes
  option page-count 4       # cache per file  = (page-count x page-size)
  subvolumes iot_afr4
end-volume

### Add IO-Cache feature
volume iocache_afr4
  type performance/io-cache
  option page-size 1024KB
  option page-count 8
  subvolumes readahead_afr4
end-volume

### Add writeback feature
volume writeback_afr4
  type performance/write-behind
  option aggregate-size 1MB
#  option window-size 3MB     # default is 0bytes
#  option flush-behind on     # default is 'off'
  subvolumes iocache_afr4
end-volume


 

