[Gluster-devel] Rolling upgrades from glusterfs 3.4 to 3.5

Thu Jun 12 17:33:53 UTC 2014

On 06/12/2014 06:52 PM, Ravishankar N wrote:
> Hi Vijay,
>
> Since glusterfs 3.5, posix_lookup() sends ESTALE instead of ENOENT [1]
> when a parent gfid (entry) is not present on the brick. In a
> replicate setup, this causes a problem because AFR gives more priority
> to ESTALE than to ENOENT, causing IO to fail [2]. The fix is in progress
> at [3]; it is client-side specific and would make it into 3.5.2.
>
> But we will still hit the problem when a rolling upgrade is performed
> from 3.4 to 3.5, unless the clients are also upgraded to 3.5. To
> illustrate with an example:
>
> 0) Create a 1x2 volume using 2 nodes and mount it from a client. All
> machines run glusterfs 3.4.
> 1) Run: for i in {1..30}; do mkdir $i; tar xf glusterfs-3.5git.tar.gz
> -C $i & done
> 2) While this is going on, kill one of the nodes in the replica pair and
> upgrade it to glusterfs 3.5 (simulating a rolling upgrade).
> 3) After a while, kill all tar processes.
> 4) Create a 'backup' directory and move all of the 1..30 dirs into it.
> 5) Start the untar processes from 1) again.
> 6) Bring up the upgraded node. Tar fails with ESTALE errors.
>
> Essentially, the errors occur because [3] is a client-side fix, whereas
> rolling upgrades are targeted at servers while the older clients still
> need to access them without issues.
>
> A solution is to have a fix in the posix translator wherein the newer
> client passes its version (3.5) to posix_lookup(), which then sends
> ESTALE if the version is 3.5 or newer but sends ENOENT if the request
> comes from an older client. Does this seem okay?
>

Cannot think of a better solution to this. Seamless rolling upgrades are 
necessary for us and the proposed fix does seem okay for that reason.
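
For illustration only, the version check proposed above could be sketched
roughly as below. This is a standalone sketch under assumptions, not actual
glusterfs code: the helper name missing_parent_errno(), the constant
CLIENT_VERSION_3_5 and the idea of the version arriving as a plain integer
(0 when an older client sends nothing) are all hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical encoding of "glusterfs 3.5"; not a real glusterfs constant. */
#define CLIENT_VERSION_3_5 305

/*
 * Choose the errno to return when the parent gfid is missing on the brick.
 * client_version is assumed to be supplied by the client with the lookup;
 * 0 stands for an older client that did not send any version.
 */
static int
missing_parent_errno(int client_version)
{
    assert(client_version >= 0);

    if (client_version >= CLIENT_VERSION_3_5)
        return ESTALE;  /* 3.5 or newer clients know how to handle ESTALE */

    return ENOENT;      /* older clients keep seeing the 3.4 behaviour */
}
```

With such a check, a 3.4 client talking to an upgraded brick would keep
receiving ENOENT, while 3.5 clients would get ESTALE.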

Thanks,
Vijay