[Gluster-devel] Rolling upgrades from glusterfs 3.4 to 3.5

Ravishankar N ravishankar at redhat.com
Thu Jun 12 13:22:31 UTC 2014


Hi Vijay,

Since glusterfs 3.5, posix_lookup() sends ESTALE instead of ENOENT [1] 
when a parent gfid (entry) is not present on the brick. In a replicate 
setup this causes a problem, because AFR gives higher priority to 
ESTALE than to ENOENT, causing I/O to fail [2]. The fix is in progress 
at [3]; it is client-side specific and should make it into 3.5.2.

But we will still hit the problem when a rolling upgrade is performed 
from 3.4 to 3.5, unless the clients are also upgraded to 3.5. To 
illustrate with an example:

0) Create a 1x2 volume using 2 nodes and mount it from client. All 
machines are glusterfs 3.4
1) Run: for i in {1..30}; do mkdir $i; tar xf glusterfs-3.5git.tar.gz 
-C $i & done
2) While this is going on, kill one of the nodes in the replica pair 
and upgrade it to glusterfs 3.5 (simulating a rolling upgrade)
3) After a while, kill all tar processes
4) Create a backup directory and move all 1..30 dirs inside 'backup'
5) Start the untar processes from step 1) again
6) Bring up the upgraded node. tar fails with ESTALE errors.

Essentially the errors occur because [3] is a client-side fix, but 
rolling upgrades target the servers while the older clients still need 
to access them without issues.

A solution is to have a fix in the posix translator wherein the newer 
client passes its version (3.5) to posix_lookup(), which then sends 
ESTALE if the client is 3.5 or newer, but ENOENT if it is an older 
client. Does this seem okay?

[1] http://review.gluster.org/6318
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1106408
[3] http://review.gluster.org/#/c/8015/
