<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Jun 9, 2018 at 9:38 AM, Dan Lavu <span dir="ltr"><<a href="mailto:dan@redhat.com" target="_blank">dan@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Krutika, <div><br></div><div>Is it also normal for the following messages as well? </div></div></blockquote><div><br></div><div>Yes, this should be fine. It only represents a transient state when multiple threads/clients are trying to create the same shard at the same time. These can be ignored.</div><div><br></div><div>-Krutika</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><br></div><div><div>[2018-06-07 06:36:22.008492] E [MSGID: 113020] [posix.c:1395:posix_mknod] 0-rhev_vms-posix: setting gfid on /gluster/brick/rhev_vms/.<wbr>shard/0ab3a16c-1d07-4153-8d01-<wbr>b9b0ffd9d19b.16158 failed</div><div>[2018-06-07 06:36:22.319735] E [MSGID: 113020] [posix.c:1395:posix_mknod] 0-rhev_vms-posix: setting gfid on /gluster/brick/rhev_vms/.<wbr>shard/0ab3a16c-1d07-4153-8d01-<wbr>b9b0ffd9d19b.16160 failed</div><div>[2018-06-07 06:36:24.711800] E [MSGID: 113002] [posix.c:267:posix_lookup] 0-rhev_vms-posix: buf->ia_gfid is null for /gluster/brick/rhev_vms/.<wbr>shard/0ab3a16c-1d07-4153-8d01-<wbr>b9b0ffd9d19b.16177 [No data available]</div><div>[2018-06-07 06:36:24.711839] E [MSGID: 115050] [server-rpc-fops.c:170:server_<wbr>lookup_cbk] 0-rhev_vms-server: 32334131: LOOKUP /.shard/0ab3a16c-1d07-4153-<wbr>8d01-b9b0ffd9d19b.16177 (be318638-e8a0-4c6d-977d-<wbr>7a937aa84806/0ab3a16c-1d07-<wbr>4153-8d01-b9b0ffd9d19b.16177) ==> (No data available) [No data available]</div></div><div><br></div><div>if so what does it mean? </div><span class="HOEnZb"><font color="#888888"><div><br></div><div>Dan</div></font></span></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Aug 16, 2016 at 1:21 AM, Krutika Dhananjay <span dir="ltr"><<a href="mailto:kdhananj@redhat.com" target="_blank">kdhananj@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Thanks, I just sent <a href="http://review.gluster.org/#/c/15161/1" target="_blank">http://review.gluster.org/#/c/<wbr>15161/1</a> to reduce the log-level to DEBUG. Let's see what the maintainers have to say. 
On Tue, Aug 16, 2016 at 1:21 AM, Krutika Dhananjay <kdhananj@redhat.com> wrote:

Thanks. I just sent http://review.gluster.org/#/c/15161/1 to reduce the log level to DEBUG. Let's see what the maintainers have to say. :)

-Krutika

On Tue, Aug 16, 2016 at 5:50 AM, David Gossage <dgossage@carouselchecks.com> wrote:

On Mon, Aug 15, 2016 at 6:24 PM, Krutika Dhananjay <kdhananj@redhat.com> wrote:

> No. The EEXIST errors are normal and can be ignored. They can happen when multiple threads try to create the same shard in parallel. Nothing wrong with that.
>
> -Krutika

Other than that they pop up as E-level errors, making a user worry, hehe.

Is there a known bug filed against that, or should I create one to see if we can get those messages sent to an informational level instead?

On Tue, Aug 16, 2016 at 1:02 AM, David Gossage <dgossage@carouselchecks.com> wrote:

On Sat, Aug 13, 2016 at 6:37 AM, David Gossage <dgossage@carouselchecks.com> wrote:

> Here is my reply again, just in case; I got a quarantine message, so I'm not sure whether the first one went through or will anytime soon. The brick logs weren't large, so I'll just include them as text files this time.

Did maintenance over the weekend, updating oVirt from 3.6.6 to 3.6.7, and after restarting the complaining oVirt node I was able to migrate the 2 VMs with issues.
So I'm not sure why the mount went stale, but I imagine that one node couldn't see the new image files after that had occurred?

Still getting a few sporadic errors, but they seem much fewer than before, and I never get any corresponding notices in any other log files:

[2016-08-15 13:40:31.510798] E [MSGID: 113022] [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on /gluster1/BRICK1/1/.shard/0e5ad95d-722d-4374-88fb-66fca0b14341.584 failed [File exists]
[2016-08-15 13:40:31.522067] E [MSGID: 113022] [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on /gluster1/BRICK1/1/.shard/0e5ad95d-722d-4374-88fb-66fca0b14341.584 failed [File exists]
[2016-08-15 17:47:06.375708] E [MSGID: 113022] [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on /gluster1/BRICK1/1/.shard/d5a328be-03d0-42f7-a443-248290849e7d.722 failed [File exists]
[2016-08-15 17:47:26.435198] E [MSGID: 113022] [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on /gluster1/BRICK1/1/.shard/d5a328be-03d0-42f7-a443-248290849e7d.723 failed [File exists]
[2016-08-15 17:47:06.405481] E [MSGID: 113022] [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on /gluster1/BRICK1/1/.shard/d5a328be-03d0-42f7-a443-248290849e7d.722 failed [File exists]
[2016-08-15 17:47:26.464542] E [MSGID: 113022] [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on /gluster1/BRICK1/1/.shard/d5a328be-03d0-42f7-a443-248290849e7d.723 failed [File exists]
[2016-08-15 18:46:47.187967] E [MSGID: 113022] [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on /gluster1/BRICK1/1/.shard/f9a7f3c5-4c13-4020-b560-1f4f7b1e3c42.739 failed [File exists]
[2016-08-15 18:47:41.414312] E [MSGID: 113022] [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on /gluster1/BRICK1/1/.shard/f9a7f3c5-4c13-4020-b560-1f4f7b1e3c42.779 failed [File exists]
[2016-08-15 18:47:41.450470] E [MSGID: 113022] [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on /gluster1/BRICK1/1/.shard/f9a7f3c5-4c13-4020-b560-1f4f7b1e3c42.779 failed [File exists]
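One way to double-check that these really don't line up with the VM trouble is to bucket them by hour and compare against the pause times. A rough sketch (the brick log filename is an assumption; gluster derives it from the brick path):

    # Count the benign mknod/EEXIST errors per hour on a storage node.
    # Log path is hypothetical -- adjust to the actual brick log name.
    grep 'posix_mknod.*File exists' /var/log/glusterfs/bricks/gluster1-BRICK1-1.log \
      | cut -d' ' -f1-2 | cut -c2-15 | sort | uniq -c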
> The attached file bricks.zip you sent to <kdhananj@redhat.com>;<Gluster-users@gluster.org> on 8/13/2016 7:17:35 AM was quarantined. As a safety precaution, the University of South Carolina quarantines .zip and .docm files sent via email. If this is a legitimate attachment, <kdhananj@redhat.com>;<Gluster-users@gluster.org> may contact the Service Desk at 803-777-1800 (servicedesk@sc.edu) and the attachment file will be released from quarantine and delivered.

On Sat, Aug 13, 2016 at 6:15 AM, David Gossage <dgossage@carouselchecks.com> wrote:

On Sat, Aug 13, 2016 at 12:26 AM, Krutika Dhananjay <kdhananj@redhat.com> wrote:

> 1. Could you share the output of `gluster volume heal <VOL> info`?

Results were the same moments after the issue occurred as well:

Brick ccgl1.gl.local:/gluster1/BRICK1/1
Status: Connected
Number of entries: 0

Brick ccgl2.gl.local:/gluster1/BRICK1/1
Status: Connected
Number of entries: 0

Brick ccgl4.gl.local:/gluster1/BRICK1/1
Status: Connected
Number of entries: 0
> 2. `gluster volume info`

Volume Name: GLUSTER1
Type: Replicate
Volume ID: 167b8e57-28c3-447a-95cc-8410cbdf3f7f
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: ccgl1.gl.local:/gluster1/BRICK1/1
Brick2: ccgl2.gl.local:/gluster1/BRICK1/1
Brick3: ccgl4.gl.local:/gluster1/BRICK1/1
Options Reconfigured:
cluster.locking-scheme: granular
nfs.enable-ino32: off
nfs.addr-namelookup: off
nfs.disable: on
performance.strict-write-ordering: off
cluster.background-self-heal-count: 16
cluster.self-heal-window-size: 1024
server.allow-insecure: on
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: on
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
storage.owner-gid: 36
storage.owner-uid: 36
performance.readdir-ahead: on
features.shard: on
features.shard-block-size: 64MB
diagnostics.brick-log-level: WARNING
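Worth noting from the options above: the bricks already log at WARNING, yet the mknod messages still show up because they are emitted at E (error) severity. That is why the fix is to demote them in the source rather than filter by log level. The effective setting can be confirmed with volume get (assuming the `volume get` subcommand of gluster 3.7 is available here):

    gluster volume get GLUSTER1 diagnostics.brick-log-level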
> 3. fuse mount logs of the affected volume(s)?

[2016-08-12 21:34:19.518511] W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-1: remote operation failed [No such file or directory]
[2016-08-12 21:34:19.519115] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-0: remote operation failed [No such file or directory]
[2016-08-12 21:34:19.519203] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-1: remote operation failed [No such file or directory]
[2016-08-12 21:34:19.519226] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-2: remote operation failed [No such file or directory]
[2016-08-12 21:34:19.520737] W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 0-GLUSTER1-replicate-0: Unreadable subvolume -1 found with event generation 3 for gfid e18650c4-02c0-4a5a-bd4c-bbdf5fbd9c88. (Possible split-brain)
[2016-08-12 21:34:19.521393] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-2: remote operation failed [No such file or directory]
[2016-08-12 21:34:19.522269] E [MSGID: 109040] [dht-helper.c:1190:dht_migration_complete_check_task] 0-GLUSTER1-dht: (null): failed to lookup the file on GLUSTER1-dht [Stale file handle]
[2016-08-12 21:34:19.522341] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 18479997: READ => -1 gfid=31d7c904-775e-4b9f-8ef7-888218679845 fd=0x7f00a80bde58 (Stale file handle)
[2016-08-12 21:34:19.521296] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-1: remote operation failed [No such file or directory]
[2016-08-12 21:34:19.521357] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-0: remote operation failed [No such file or directory]
[2016-08-12 22:15:08.337528] I [MSGID: 109066] [dht-rename.c:1568:dht_rename] 0-GLUSTER1-dht: renaming /7c73a8dd-a72e-4556-ac88-7f6813131e64/images/ec4f5b10-02b1-435c-a7e1-97e399532597/0e6ed1c3-ffe0-43b0-9863-439ccc3193c9.meta.new (hash=GLUSTER1-replicate-0/cache=GLUSTER1-replicate-0) => /7c73a8dd-a72e-4556-ac88-7f6813131e64/images/ec4f5b10-02b1-435c-a7e1-97e399532597/0e6ed1c3-ffe0-43b0-9863-439ccc3193c9.meta (hash=GLUSTER1-replicate-0/cache=GLUSTER1-replicate-0)
[2016-08-12 22:15:12.240026] I [MSGID: 109066] [dht-rename.c:1568:dht_rename] 0-GLUSTER1-dht: renaming /7c73a8dd-a72e-4556-ac88-7f6813131e64/images/78636a1b-86dd-4aaf-8b4f-4ab9c3509e88/4707d651-06c6-446b-b9c8-408004a55ada.meta.new (hash=GLUSTER1-replicate-0/cache=GLUSTER1-replicate-0) => /7c73a8dd-a72e-4556-ac88-7f6813131e64/images/78636a1b-86dd-4aaf-8b4f-4ab9c3509e88/4707d651-06c6-446b-b9c8-408004a55ada.meta (hash=GLUSTER1-replicate-0/cache=GLUSTER1-replicate-0)
[2016-08-12 22:15:11.105593] I [MSGID: 109066] [dht-rename.c:1568:dht_rename] 0-GLUSTER1-dht: renaming /7c73a8dd-a72e-4556-ac88-7f6813131e64/images/ec4f5b10-02b1-435c-a7e1-97e399532597/0e6ed1c3-ffe0-43b0-9863-439ccc3193c9.meta.new (hash=GLUSTER1-replicate-0/cache=GLUSTER1-replicate-0) => /7c73a8dd-a72e-4556-ac88-7f6813131e64/images/ec4f5b10-02b1-435c-a7e1-97e399532597/0e6ed1c3-ffe0-43b0-9863-439ccc3193c9.meta (hash=GLUSTER1-replicate-0/cache=GLUSTER1-replicate-0)
[2016-08-12 22:15:14.772713] I [MSGID: 109066] [dht-rename.c:1568:dht_rename] 0-GLUSTER1-dht: renaming /7c73a8dd-a72e-4556-ac88-7f6813131e64/images/78636a1b-86dd-4aaf-8b4f-4ab9c3509e88/4707d651-06c6-446b-b9c8-408004a55ada.meta.new (hash=GLUSTER1-replicate-0/cache=GLUSTER1-replicate-0) => /7c73a8dd-a72e-4556-ac88-7f6813131e64/images/78636a1b-86dd-4aaf-8b4f-4ab9c3509e88/4707d651-06c6-446b-b9c8-408004a55ada.meta (hash=GLUSTER1-replicate-0/cache=GLUSTER1-replicate-0)
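The "Possible split-brain" warnings above can be cross-checked with the dedicated heal query, which lists only entries actually in split-brain:

    # Run on any storage node.
    gluster volume heal GLUSTER1 info split-brain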
> 4. glustershd logs

Nothing recent, and the same on all 3 storage nodes:

[2016-08-07 08:48:03.593401] I [glusterfsd-mgmt.c:1600:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2016-08-11 08:14:03.683287] I [MSGID: 100011] [glusterfsd.c:1323:reincarnate] 0-glusterfsd: Fetching the volume file from server...
[2016-08-11 08:14:03.684492] I [glusterfsd-mgmt.c:1600:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing

> 5. Brick logs

There have been some errors in the brick logs that I hadn't noticed occurring. I've zipped and attached all 3 nodes' logs, but from this snippet on one node, none of them seem to coincide with the time window when the migration had issues. The f9a7f3c5-4c13-4020-b560-1f4f7b1e3c42 shard refers to an image for a different VM than the one I had issues with as well. Maybe gluster is trying to do some sort of create-shard test before writing out changes that would go to that image and that shard file?
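For context on the shard names in the excerpt below: with features.shard on, everything past the first shard-block-size of a file lives under .shard as <gfid-of-base-file>.<block-number>, so the block number says roughly where in the image a write landed. A quick sanity check using the 64MB block size from the volume info above:

    # Byte offset where shard .739 of image f9a7f3c5-... begins:
    echo $(( 739 * 64 * 1024 * 1024 ))    # about 46 GiB into the image

So the failed mknods are most likely just shard creations for writes landing in previously untouched regions of those images, not a separate test.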
message "E [MSGID: 113022] [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on /gluster1/BRICK1/1/.shard/1884<wbr>3fb4-e31c-4fc3-b519-cc6e5e9478<wbr>13.211 failed [File exists]" repeated 16 times between [2016-08-13 01:47:23.338036] and [2016-08-13 01:47:23.380980]</div><div>[2016-08-13 01:48:02.224494] E [MSGID: 113022] [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on /gluster1/BRICK1/1/.shard/ffbb<wbr>cce0-3c4a-4fdf-b79f-a96ca32156<wbr>57.211 failed [File exists]</div><div>[2016-08-13 01:48:42.266148] E [MSGID: 113022] [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on /gluster1/BRICK1/1/.shard/1884<wbr>3fb4-e31c-4fc3-b519-cc6e5e9478<wbr>13.177 failed [File exists]</div><div>[2016-08-13 01:49:09.717434] E [MSGID: 113022] [posix.c:1245:posix_mknod] 0-GLUSTER1-posix: mknod on /gluster1/BRICK1/1/.shard/1884<wbr>3fb4-e31c-4fc3-b519-cc6e5e9478<wbr>13.178 failed [File exists]</div></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><br></div>-Krutika<br><div><div><div><div><div><div><br></div></div></div></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Aug 13, 2016 at 3:10 AM, David Gossage <span dir="ltr"><<a href="mailto:dgossage@carouselchecks.com" target="_blank">dgossage@carouselchecks.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><span><div><div data-smartmail="gmail_signature"><div dir="ltr"><div>On Fri, Aug 12, 2016 at 4:25 PM, Dan Lavu <span dir="ltr"><<a href="mailto:dan@redhat.com" target="_blank">dan@redhat.com</a>></span> wrote:<br></div></div></div></div></span><div class="gmail_quote"><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div>David, <br><br></div>I'm seeing similar behavior in my lab, but it has been caused by healing files in the gluster cluster, though I attribute my problems to problems with the storage fabric. See if 'gluster volume heal $VOL info' indicates files that are being healed, and if those reduce in number, can the VM start? <br><br></div></div></blockquote><div><br></div></span><div>I haven't had any files in a state of being healed according to either of the 3 storage nodes. </div><div><br></div><div>I shut down one VM that has been around awhile a moment ago then told it to start on the one ovirt server that complained previously. It ran fine, and I was able to migrate it off and on the host no issues.</div><div><br></div><div>I told one of the new VM's to migrate to the one node and within seconds it paused from unknown storage errors no shards showing heals nothing with an error on storage node. Same stale file handle issues.</div><div><br></div><div>I'll probably put this node in maintenance later and reboot it. Other than that I may re-clone those 2 reccent VM's. 
Maybe the images just got corrupted, though why it would fail on only one node of 3 if an image were bad, I'm not sure.

On Thu, Aug 11, 2016 at 7:52 AM, David Gossage <dgossage@carouselchecks.com> wrote:

Figure I would repost here as well. One client out of 3 is complaining of stale file handles on a few new VMs I migrated over. No errors on the storage nodes, just the client. Maybe just put that one in maintenance and restart the gluster mount?

David Gossage
Carousel Checks Inc. | System Administrator
Office 708.613.2284
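If restarting just the mount is the route taken, the refresh itself is small. A sketch (the oVirt mount point is an assumption; check `mount` on the client for the real path, and do it with the node in maintenance):

    # Refresh a stale gluster FUSE mount on the affected client.
    MNT="/rhev/data-center/mnt/glusterSD/ccgl1.gl.local:_GLUSTER1"   # hypothetical path
    umount "$MNT" || umount -l "$MNT"    # fall back to a lazy unmount if busy
    mount -t glusterfs ccgl1.gl.local:/GLUSTER1 "$MNT"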
<br><div class="gmail_quote">---------- Forwarded message ----------<br>From: <b class="gmail_sendername">David Gossage</b> <span dir="ltr"><<a href="mailto:dgossage@carouselchecks.com" target="_blank">dgossage@carouselchecks.com</a>></span><br>Date: Thu, Aug 11, 2016 at 12:17 AM<br>Subject: vm paused unknown storage error one node out of 3 only<br>To: users <<a href="mailto:users@ovirt.org" target="_blank">users@ovirt.org</a>><br><br><br><div dir="ltr"><div><div>Out of a 3 node cluster running oVirt <span style="color:rgb(0,0,0);font-family:"Arial Unicode MS",Arial,sans-serif;line-height:21.6667px;text-align:-webkit-center">3.6.6.2-1.el7.centos with a 3 replicate gluster 3.7.14 starting a VM i just copied in on one node of the 3 gets the following errors. The other 2 the vm starts fine. All ovirt and gluster are centos 7 based. VM on start of the one node it tries to default to on its own accord immediately puts into paused for unknown reason. Telling it to start on different node starts ok. node with issue already has 5 VMs running fine on it same gluster storage plus the hosted engine on different volume.</span></div><div><span style="color:rgb(0,0,0);font-family:"Arial Unicode MS",Arial,sans-serif;line-height:21.6667px;text-align:-webkit-center"><br></span></div><div>gluster nodes logs did not have any errors for volume</div><div>nodes own gluster logs had this in log</div><div><br></div><div>dfb8777a-7e8c-40ff-8faa-252bea<wbr>bba5f8 couldnt find in .glusterfs .shard or images/<span style="color:rgb(0,0,0);font-family:"Arial Unicode MS",Arial,sans-serif;line-height:21.6667px;text-align:-webkit-center"><br></span></div><div><br></div><div>7919f4a0-125c-4b11-b5c9-fb50cc<wbr>195c43 is the gfid of the bootable drive of the vm<br></div><div><br></div><div>[2016-08-11 04:31:39.982952] W [MSGID: 114031] [client-rpc-fops.c:3050:client<wbr>3_3_readv_cbk] 0-GLUSTER1-client-2: remote operation failed [No such file or directory]</div><div>[2016-08-11 04:31:39.983683] W [MSGID: 114031] [client-rpc-fops.c:1572:client<wbr>3_3_fstat_cbk] 0-GLUSTER1-client-2: remote operation failed [No such file or directory]</div><div>[2016-08-11 04:31:39.984182] W [MSGID: 114031] [client-rpc-fops.c:1572:client<wbr>3_3_fstat_cbk] 0-GLUSTER1-client-0: remote operation failed [No such file or directory]</div><div>[2016-08-11 04:31:39.984221] W [MSGID: 114031] [client-rpc-fops.c:1572:client<wbr>3_3_fstat_cbk] 0-GLUSTER1-client-1: remote operation failed [No such file or directory]</div><div>[2016-08-11 04:31:39.985941] W [MSGID: 108008] [afr-read-txn.c:244:afr_read_t<wbr>xn] 0-GLUSTER1-replicate-0: Unreadable subvolume -1 found with event generation 3 for gfid dfb8777a-7e8c-40ff-8faa-252bea<wbr>bba5f8. 
[2016-08-11 04:31:39.986633] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-2: remote operation failed [No such file or directory]
[2016-08-11 04:31:39.987644] E [MSGID: 109040] [dht-helper.c:1190:dht_migration_complete_check_task] 0-GLUSTER1-dht: (null): failed to lookup the file on GLUSTER1-dht [Stale file handle]
[2016-08-11 04:31:39.987751] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 15152930: READ => -1 gfid=7919f4a0-125c-4b11-b5c9-fb50cc195c43 fd=0x7f00a80bdb64 (Stale file handle)
[2016-08-11 04:31:39.986567] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-0: remote operation failed [No such file or directory]
[2016-08-11 04:31:39.986567] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-1: remote operation failed [No such file or directory]
[2016-08-11 04:35:21.210145] W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 0-GLUSTER1-replicate-0: Unreadable subvolume -1 found with event generation 3 for gfid dfb8777a-7e8c-40ff-8faa-252beabba5f8. (Possible split-brain)
[2016-08-11 04:35:21.210873] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-1: remote operation failed [No such file or directory]
[2016-08-11 04:35:21.210888] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-2: remote operation failed [No such file or directory]
[2016-08-11 04:35:21.210947] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-0: remote operation failed [No such file or directory]
[2016-08-11 04:35:21.213270] E [MSGID: 109040] [dht-helper.c:1190:dht_migration_complete_check_task] 0-GLUSTER1-dht: (null): failed to lookup the file on GLUSTER1-dht [Stale file handle]
[2016-08-11 04:35:21.213345] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 15156910: READ => -1 gfid=7919f4a0-125c-4b11-b5c9-fb50cc195c43 fd=0x7f00a80bf6d0 (Stale file handle)
[2016-08-11 04:35:21.211516] W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 0-GLUSTER1-replicate-0: Unreadable subvolume -1 found with event generation 3 for gfid dfb8777a-7e8c-40ff-8faa-252beabba5f8. (Possible split-brain)
[2016-08-11 04:35:21.212013] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-0: remote operation failed [No such file or directory]
[2016-08-11 04:35:21.212081] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-1: remote operation failed [No such file or directory]
[2016-08-11 04:35:21.212121] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 0-GLUSTER1-client-2: remote operation failed [No such file or directory]

I attached vdsm.log starting from when I spun up the VM on the offending node.
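On the "couldn't find dfb8777a... in .glusterfs" point: bricks hardlink every file under .glusterfs by the first two byte-pairs of its gfid, so a per-node check might look like this (brick path taken from the volume info above; the rest is a sketch):

    # Resolve a gfid to its on-brick path; run on each storage node.
    BRICK=/gluster1/BRICK1/1
    GFID=dfb8777a-7e8c-40ff-8faa-252beabba5f8
    ls -l "$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID"
    # For a regular file that entry is a hardlink; locate the named copy by inode:
    find "$BRICK" -samefile "$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID" 2>/dev/null

If the gfid link exists on all three bricks, the stale handle is more likely client-side state, which a remount of the FUSE mount would clear.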
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
</blockquote></div></div></div><br></div></div>
<br>______________________________<wbr>_________________<br>
Gluster-users mailing list<br>
<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
<a href="http://www.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://www.gluster.org/mailman<wbr>/listinfo/gluster-users</a><br></blockquote></div><br></div>
</blockquote></div><br></div></div>
</blockquote></div><br></div></div>
</blockquote></div></div></div><br></div></div>
</blockquote></div><br></div>
</blockquote></div></div></div><br></div></div>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div></div>