Announcing GlusterFS release 3.10
srangana at redhat.com
Mon Feb 27 22:33:52 UTC 2017
The Gluster community is pleased to announce the release of Gluster 3.10
(packages available at ).
This is a major Gluster release that includes some substantial changes.
The features revolve around, better support in container environments,
scaling to larger number of bricks per node, and a few usability and
performance improvements, among other bug fixes. This releases marks the
completion of maintenance releases for Gluster 3.7 and 3.9. Moving
forward, Gluster versions 3.10 and 3.8 are actively maintained.
The most notable features and changes are documented here as well as in
our full release notes on Github. A full list of bugs that has been
addressed is included on that page as well.
Major changes and features:
Multiplexing reduces both port and memory usage. It does not improve
performance vs. non-multiplexing except when memory is the limiting
factor, though there are other related changes that improve performance
overall (e.g. compared to 3.9).
Multiplexing is off by default. It can be enabled with
# gluster volume set all cluster.brick-multiplex on
*Support to display op-version information from clients*
To get information on what op-version are supported by the clients,
users can invoke the gluster volume status command for clients. Along
with information on hostname, port, bytes read, bytes written and number
of clients connected per brick, we now also get the op-version on which
the respective clients operate. Following is the example usage:
# gluster volume status <VOLNAME|all> clients
*Support to get maximum op-version in a heterogeneous cluster*
A heterogeneous cluster operates on a common op-version that can be
supported across all the nodes in the trusted storage pool. Upon upgrade
of the nodes in the cluster, the cluster might support a higher
op-version. Users can retrieve the maximum op-version to which the
cluster could be bumped up to by invoking the gluster volume getcommand
on the newly introduced global option, cluster.max-op-version. The usage
is as follows:
# gluster volume get all cluster.max-op-version
*Support for rebalance time to completion estimation*
Users can now see approximately how much time the rebalance operation
will take to complete across all nodes.
The estimated time left for rebalance to complete is displayed as part
of the rebalance status. Use the command:
# gluster volume rebalance <VOLNAME> status
*Separation of tier as its own service*
This change is to move the management of the tier daemon into the
gluster service framework, thereby improving it stability and
manageability by the service framework.
This has no change to any of the tier commands or user facing interfaces
*Statedump support for gfapi based applications*
gfapi based applications now can dump state information for better
trouble shooting of issues. A statedump can be triggered in two ways:
a) by executing the following on one of the Gluster servers,
# gluster volume statedump <VOLNAME> client <HOST>:<PID>
<VOLNAME> should be replaced by the name of the volume
<HOST> should be replaced by the hostname of the system running the
<PID> should be replaced by the PID of the gfapi application
b) through calling glfs_sysrq(<FS>, GLFS_SYSRQ_STATEDUMP) within the
<FS> should be replaced by a pointer to a glfs_t structure
All statedumps (*.dump.* files) will be located at the usual location,
on most distributions this would be /var/run/gluster/.
*Disabled creation of trash directory by default*
From now onwards trash directory, namely .trashcan, will not be be
created by default upon creation of new volumes unless and until the
feature is turned ON and the restrictions on the same will be applicable
as long as features.trash is set for a particular volume.
*Implemented parallel readdirp with distribute xlator*
Currently the directory listing gets slower as the number of
bricks/nodes increases in a volume, though the file/directory numbers
remain unchanged. With this feature, the performance of directory
listing is made mostly independent of the number of nodes/bricks in the
volume. Thus scale doesn't exponentially reduce the directory listing
performance. (On a 2, 5, 10, 25 brick setup we saw ~5, 100, 400, 450%
To enable this feature:
# gluster volume set <VOLNAME> performance.readdir-ahead on
# gluster volume set <VOLNAME> performance.parallel-readdir on
To disable this feature:
# gluster volume set <VOLNAME> performance.parallel-readdir off
If there are more than 50 bricks in the volume it is good to increase
the cache size to be more than 10Mb (default value):
# gluster volume set <VOLNAME> performance.rda-cache-limit <CACHE SIZE>
*md-cache can optionally -ve cache security.ima xattr*
From kernel version 3.X or greater, creating of a file results in
removexattr call on security.ima xattr. This xattr is not set on the
file unless IMA feature is active. With this patch, removxattr call
returns ENODATA if it is not found in the cache.
The end benefit is faster create operations where IMA is not enabled.
To cache this xattr use,
# gluster volume set <VOLNAME> performance.cache-ima-xattrs on
The above option is on by default.
*Added support for CPU extensions in disperse computations*
To improve disperse computations, a new way of generating dynamic code
targeting specific CPU extensions like SSE and AVX on Intel processors
is implemented. The available extensions are detected on run time. This
can roughly double encoding and decoding speeds (or halve CPU usage).
This change is 100% compatible with the old method. No change is needed
if an existing volume is upgraded.
You can control which extensions to use or disable them with the
# gluster volume set <VOLNAME> disperse.cpu-extensions <type>
Valid values are:
none: Completely disable dynamic code generation
auto: Automatically detect available extensions and use the best one
x64: Use dynamic code generation using standard 64 bits instructions
sse: Use dynamic code generation using SSE extensions (128 bits)
avx: Use dynamic code generation using AVX extensions (256 bits)
The default value is 'auto'. If a value is specified that is not
detected on run-time, it will automatically fall back to the next
Bugs addressed since release-3.9 are listed in our full release notes .
 Release 3.10.0 packages:
 Full release notes for 3.10.0:
More information about the Announce