[Gluster-devel] gluster doesn't like Oracle's FSINFO RPC call
Niels de Vos
ndevos at redhat.com
Fri Apr 12 22:50:33 UTC 2013
On Fri, Apr 12, 2013 at 03:58:04PM -0400, Michael Brown wrote:
> KERBOOM
>
> [michael at fleming1 ~]$ sudo mount -a -t nfs
> [sudo] password for michael:
> mount: fearless1:/gv0 failed, reason given by server: No such file or
> directory
> mount: fearless1:/gv0/fleming1/db0/ALTUS_config failed, reason given by
> server: unknown nfs status return value: 22
> mount: fearless1:/gv0/fleming1/db0/ALTUS_data failed, reason given by
> server: unknown nfs status return value: 22
> mount: fearless1:/gv0/fleming1/db0/ALTUS_flash failed, reason given by
> server: unknown nfs status return value: 22
> mount.nfs: mount point /db/flash_recovery_area/ALTUS/onlinelog does not
> exist
>
> nfs.log:
> [2013-04-12 15:55:16.507084] E [nfs3.c:305:__nfs3_get_volume_id]
> (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_fsinfo+0x22c)
> [0x7f45bfbb852c]
> (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_fsinfo_reply+0x29)
> [0x7f45bfbb2ce9]
> (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_request_xlator_deviceid+0x51)
> [0x7f45bfbb2481]))) 0-nfs-nfsv3: invalid argument: xl
> [2013-04-12 15:55:16.538560] E [nfs3.c:4706:nfs3_fsinfo] 0-nfs-nfsv3:
> Bad Handle
> [2013-04-12 15:55:16.538580] W [nfs3-helpers.c:3389:nfs3_log_common_res]
> 0-nfs-nfsv3: XID: 242c1550, FSINFO: NFS: 10001(Illegal NFS file handle),
> POSIX: 14(Bad address)
> [2013-04-12 15:55:16.538617] E [nfs3.c:305:__nfs3_get_volume_id]
> (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_fsinfo+0x22c)
> [0x7f45bfbb852c]
> (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_fsinfo_reply+0x29)
> [0x7f45bfbb2ce9]
> (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_request_xlator_deviceid+0x51)
> [0x7f45bfbb2481]))) 0-nfs-nfsv3: invalid argument: xl
>
> (I tried both with and without modifying your uint32_t size to a
> 'int32_t size' to correct the signedness of the argument)
>
> Get ahold of me in IRC and let's get this figured out. I've got a
> debugger attached.
23:51 < ndevos> Supermathie: ah, I've thought of the error in my
suggestion - that function is used to encode and decode
23:52 < ndevos> which means, that the size parameter must be set
correctly - the .data_len attribute contain the size when encoding,
and should be overwritten when decoding
23:53 < ndevos> KERBOOM happens when an idea is only half looked at :-/
Maybe something like the attached patch works better? It should encode/decode
both the length and the fhandle value. Compile tested only.
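For illustration only, here is a minimal sketch of the decode side in plain C, without the RPC library, assuming the standard XDR layout from RFC 4506: a 4-byte big-endian length word, the handle bytes, then zero padding up to a 4-byte boundary. The names struct cursor, get_u32 and get_fh3 are hypothetical and are not part of gluster or of the Sun RPC API. It also shows why a failed decode cannot simply be retried with a different function: the cursor has already moved past the length word.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NFS3_FHSIZE 64  /* maximum NFSv3 file handle size, per RFC 1813 */

/* Hypothetical read cursor over a received RPC argument buffer. */
struct cursor {
    const unsigned char *buf;
    size_t len;   /* total bytes in buf */
    size_t pos;   /* offset of the next unread byte, always <= len */
};

/* Read one big-endian 32-bit XDR unsigned int and advance the cursor. */
static int get_u32(struct cursor *c, uint32_t *out)
{
    const unsigned char *p;

    if (c->len - c->pos < 4)
        return 0;
    p = c->buf + c->pos;
    *out = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    c->pos += 4;
    return 1;
}

/*
 * Decode a variable-length opaque file handle: length word, data, padding.
 * fh must have room for NFS3_FHSIZE bytes. Returns 1 on success, 0 on error.
 * On error the cursor may already have consumed the length word.
 */
static int get_fh3(struct cursor *c, unsigned char *fh, uint32_t *fh_len)
{
    uint32_t n;
    size_t padded;

    if (!get_u32(c, &n) || n > NFS3_FHSIZE)
        return 0;
    padded = ((size_t)n + 3) & ~(size_t)3;   /* round up to a 4-byte block */
    if (c->len - c->pos < padded)
        return 0;
    memcpy(fh, c->buf + c->pos, n);
    c->pos += padded;                        /* skip data and pad bytes */
    *fh_len = n;
    return 1;
}
```

A caller would fill a struct cursor with the argument buffer and its length, call get_fh3() once, and then continue decoding any following fields from the same cursor. The bound check against NFS3_FHSIZE matters on decode, since the length word comes from the client.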
Niels
>
> M.
>
> On 13-04-12 11:32 AM, Niels de Vos wrote:
> > On Fri, Apr 12, 2013 at 05:23:08PM +0200, Niels de Vos wrote:
> >> On Thu, Apr 11, 2013 at 12:37:30PM -0400, Michael Brown wrote:
> >>> That actually broke everything (including Linux trying to mount NFS).
> >>>
> >>> I've modified it slightly to be:
> >>>
> >>> bool_t
> >>> xdr_nfs_fh3 (XDR *xdrs, nfs_fh3 *objp)
> >>> {
> >>> if (!xdr_bytes (xdrs, (char **)&objp->data.data_val, (u_int *)
> >>> &objp->data.data_len, NFS3_FHSIZE))
> >>> if (!xdr_opaque (xdrs, &objp, (u_int *)
> >>> &objp->data.data_len))
> >>> return FALSE;
> >>> return TRUE;
> >>> }
> >>>
> >>> (i.e. only call the xdr_opaque function if the xdr_bytes decode fails)
> >> Nah, that won't work. The xdr_* functions are modifying the position of
> >> the cursor in the XDR-stream. Subsequent reads will continue where the
> >> previous one finished.
> >>
> >> What you probably need to do is something like this:
> >>
> >> xdr_nfs_fh3 (XDR *xdrs, nfs_fh3 *objp)
> >> {
> >> uint32_t size;
> >>
> >> if (!xdr_int (xdrs, &size))
> >> if (!xdr_opaque (xdrs, (u_int *)&objp->data.data_len, size))
> > ^ that should be objp->data.data_val of course :-/
> >
> >> return FALSE
> >> return TRUE;
> >> }
> >>
> >> That will read the size of the fhandle first, to determine how long the opaque
> >> fhandle is, and use that size to read it.
> >>
> >> Cheers,
> >> Niels
> >>
> >>> But I get no change in behaviour.
> >>>
> >>> Also get these warnings:
> >>>
> >>> xdr-nfs3.c: In function 'xdr_nfs_fh3':
> >>> xdr-nfs3.c:197: warning: passing argument 2 of 'xdr_opaque' from
> >>> incompatible pointer type
> >>> /usr/include/rpc/xdr.h:313: note: expected 'caddr_t' but argument is of
> >>> type 'struct nfs_fh3 **'
> >>> xdr-nfs3.c:197: warning: passing argument 3 of 'xdr_opaque' makes
> >>> integer from pointer without a cast
> >>> /usr/include/rpc/xdr.h:313: note: expected 'u_int' but argument is of
> >>> type 'u_int *'
> >>>
> >>> M.
> >>>
> >>> On 13-04-11 07:42 AM, Niels de Vos wrote:
> >>>> My guess is that this (untested) change would fix it, can you try that?
> >>>>
> >>>> --- a/rpc/xdr/src/xdr-nfs3.c
> >>>> +++ b/rpc/xdr/src/xdr-nfs3.c
> >>>> @@ -184,7 +184,7 @@ xdr_specdata3 (XDR *xdrs, specdata3 *objp)
> >>>> bool_t
> >>>> xdr_nfs_fh3 (XDR *xdrs, nfs_fh3 *objp)
> >>>> {
> >>>> - if (!xdr_bytes (xdrs, (char **)&objp->data.data_val, (u_int *) &objp->data.data_len, NFS3_FHSIZE))
> >>>> + if (!xdr_opaque (xdrs, &objp, (u_int *) &objp->data.data_len))
> >>>> return FALSE;
> >>>> return TRUE;
> >>>> }
> >>>>
> >>>>
> >>>> HTH,
> >>>> Niels
> >>>>
> >>>>> All I get out of gluster is:
> >>>>> [2013-04-08 12:54:32.206312] E [nfs3.c:4741:nfs3svc_fsinfo] 0-nfs-nfsv3:
> >>>>> Error decoding arguments
> >>>>>
> >>>>>
> >>>>> I've attached abridged packet captures and text explanations of the
> >>>>> packets (thanks to wireshark).
> >>>>>
> >>>>> Can someone please look at this and determine if it's gluster's parsing
> >>>>> of the RPC call to blame, or if it's Oracle?
> >>>>>
> >>>>> This is the same setup on which I reported the NFS race condition bug.
> >>>>> It does have that patch applied.
> >>>>> Details:
> >>>>> http://lists.gnu.org/archive/html/gluster-devel/2013-04/msg00014.html
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Michael
> >>>>>
> >>>>> --
> >>>>> Michael Brown | `One of the main causes of the fall of
> >>>>> Systems Consultant | the Roman Empire was that, lacking zero,
> >>>>> Net Direct Inc. | they had no way to indicate successful
> >>>>> ☎: +1 519 883 1172 x5106 | termination of their C programs.' - Firth
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>>> _______________________________________________
> >>>>> Gluster-devel mailing list
> >>>>> Gluster-devel at nongnu.org
> >>>>> https://lists.nongnu.org/mailman/listinfo/gluster-devel
> >>>
> >>>
> >> --
> >> Niels de Vos
> >> Sr. Software Maintenance Engineer
> >> Support Engineering Group
> >> Red Hat Global Support Services
> >>
>
>
>
--
Niels de Vos
Sr. Software Maintenance Engineer
Support Engineering Group
Red Hat Global Support Services
-------------- next part --------------
From 2f7f6b952ed89f5cf8181db351e1965d8400f493 Mon Sep 17 00:00:00 2001
From: Niels de Vos <ndevos at redhat.com>
Date: Sat, 13 Apr 2013 00:41:43 +0200
Subject: [PATCH] nfs: encode/decode fhandles as opaque and not as bytes
At least one client (Oracle DNFS) does not pass an XDR roundup'd byte
array as a fhandle on FSINFO.
XDR (http://tools.ietf.org/html/rfc4506, the encoding used for the RPC
protocol) uses 4-byte 'blocks' for alignment. A fhandle byte array that is
34 bytes long needs to be (34 / 4 + 1)*4 = 36 bytes in size. The
'length' given in the structure tells the consumer to ignore the two
trailing bytes.
The NFSv3 specification (http://tools.ietf.org/html/rfc1813#page-21)
defines the nfs_fh3 as an opaque (not bytes) structure.
BUG: 950121
Change-Id: Id723a38ef0ec6e7f1d9f29683321ea32e00503c7
Reported-by: Michael Brown <michael at supermathie.net>
Signed-off-by: Niels de Vos <ndevos at redhat.com>
---
rpc/xdr/src/xdr-nfs3.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/rpc/xdr/src/xdr-nfs3.c b/rpc/xdr/src/xdr-nfs3.c
index a497e9f..39dbf5c 100644
--- a/rpc/xdr/src/xdr-nfs3.c
+++ b/rpc/xdr/src/xdr-nfs3.c
@@ -184,7 +184,9 @@ xdr_specdata3 (XDR *xdrs, specdata3 *objp)
bool_t
xdr_nfs_fh3 (XDR *xdrs, nfs_fh3 *objp)
{
- if (!xdr_bytes (xdrs, (char **)&objp->data.data_val, (u_int *) &objp->data.data_len, NFS3_FHSIZE))
+ if (!xdr_uint32 (xdrs, (u_int *) &objp->data.data_len))
+ return FALSE;
+ if (!xdr_opaque (xdrs, (char *) &objp->data.data_val, (u_int) objp->data.data_len))
return FALSE;
return TRUE;
}
--
1.7.1
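The roundup arithmetic from the commit message can be sketched as two small C helpers. This is an illustration under the assumption of 4-byte XDR blocks (RFC 4506); xdr_padded_len and xdr_pad_bytes are hypothetical names, not functions from gluster or the RPC library. Note that the expression (n / 4 + 1)*4 from the commit message gives 36 for n = 34, but with integer division it would also give 36 for n = 32, which needs no padding; the mask form below handles both cases.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of bytes an n-byte opaque body occupies on
 * the wire once rounded up to a 4-byte XDR block. Intended for small n
 * such as file handle lengths; n + 3 must not overflow.
 */
static uint32_t xdr_padded_len(uint32_t n)
{
    return (n + 3u) & ~3u;
}

/* Hypothetical helper: number of pad bytes that follow an n-byte body. */
static uint32_t xdr_pad_bytes(uint32_t n)
{
    return xdr_padded_len(n) - n;
}
```

For the 34-byte fhandle discussed above, xdr_padded_len(34) is 36 and xdr_pad_bytes(34) is 2, matching the two trailing bytes the consumer is told to ignore.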