<div dir="ltr"><div class="gmail_default" style="font-family:verdana,sans-serif">Hi Jose,</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default"><font face="verdana, sans-serif">I tried your suggestion but there is one confusion regarding point #3. Since RancherOS has everything running as container, i am running </font><span style="font-family:"trebuchet ms",sans-serif;font-size:12.8px">webcenter/rancher-glusterfs-se</span><wbr style="font-size:12.8px"><span style="font-size:12.8px"><font face="trebuchet ms, sans-serif">rver </font><font face="verdana, sans-serif">container on all three nodes. Now as far as removing the directories are concerned, i hope you meant removing them on the host and _not_ from within the container. After completing step 1 and 2, i checked the contents of all the directories that you specified in point #3. All were empty as you can see in the attached </font></span><font face="verdana, sans-serif"><span style="font-size:12.8px"><i>other_logs.txt</i>. So i got confused. I ran the deploy again but the issue persists. Two pods show Liveness error and the third one, Readiness error.</span></font></div><div class="gmail_default"><font face="verdana, sans-serif"><span style="font-size:12.8px"><br></span></font></div><div class="gmail_default"><font face="verdana, sans-serif"><span style="font-size:12.8px">I then tried removing those directories (Step #3) from within the container but getting following error:</span></font></div><div class="gmail_default"><font face="verdana, sans-serif"><span style="font-size:12.8px"><br></span></font></div><div class="gmail_default">root@c0f8ab4d92a2:/app# rm -rf /var/lib/heketi /etc/glusterfs /var/lib/glusterd /var/log/glusterfs</div><div class="gmail_default">rm: cannot remove '/var/lib/glusterd': Device or resource busy<font face="verdana, sans-serif"><span style="font-size:12.8px"> </span></font></div><div class="gmail_default"><span style="font-size:12.8px"><font face="verdana, sans-serif"><br></font></span></div><div class="gmail_default"><span style="font-size:12.8px"><font face="verdana, sans-serif"><br></font></span></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Sep 1, 2017 at 8:21 PM, Jose A. Rivera <span dir="ltr"><<a href="mailto:jarrpa@redhat.com" target="_blank">jarrpa@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">1. Add a line to the ssh-exec portion of heketi.json of the sort:<br>

"sudo": true,

2. Run

gk-deploy -g --abort

3. On the nodes that were/will be running GlusterFS pods, run:

rm -rf /var/lib/heketi /etc/glusterfs /var/lib/glusterd /var/log/glusterfs

Then try the deploy again.
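
(A minimal sketch of step 3 run from the workstation, assuming the rancher user
can SSH to each node and use passwordless sudo; the hostnames are just
placeholders for the three GlusterFS nodes:

for node in node-a node-b node-c; do
    ssh rancher@"$node" 'sudo rm -rf /var/lib/heketi /etc/glusterfs /var/lib/glusterd /var/log/glusterfs'
done

This is only a convenience for running the same cleanup on every node; running
the rm command directly on each host works just as well.)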
<div class="HOEnZb"><div class="h5"><br>
On Fri, Sep 1, 2017 at 6:05 AM, Gaurav Chhabra <<a href="mailto:varuag.chhabra@gmail.com">varuag.chhabra@gmail.com</a>> wrote:<br>
> Hi Jose,
>
>
> Thanks for the reply. It seems the three gluster pods might have been a
> copy-paste from another cluster where I was trying to set up the same
> thing using CentOS. Sorry for that. By the way, I did check for the kernel
> modules and they are already there. Also, I am attaching a fresh set of
> files because I created a new cluster and thought of giving it another try.
> The issue still persists. :(
>
> In heketi.json, there is a slight change w.r.t. the user that connects to
> the glusterfs node over SSH. I am not sure how Heketi was using the root
> user to log in, because I wasn't able to SSH manually as root. With the
> rancher user I can log in successfully, so I think this should be fine.
>
> /etc/heketi/heketi.json:
> ------------------------------------------------------------------
>     "executor": "ssh",
>
>     "_sshexec_comment": "SSH username and private key file information",
>     "sshexec": {
>       "keyfile": "/var/lib/heketi/.ssh/id_rsa",
>       "user": "rancher",
>       "port": "22",
>       "fstab": "/etc/fstab"
>     },
> ------------------------------------------------------------------
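>
> (To double-check that this account behaves the way Heketi will use it, a test
> like the following could be run from any machine holding a copy of that
> private key; node-a is just a placeholder for one of the GlusterFS nodes, and
> it assumes passwordless sudo is enabled for the rancher user:
>
> ssh -i /var/lib/heketi/.ssh/id_rsa -p 22 rancher@node-a sudo true
>
> If this returns without prompting for a password, both key-based login and
> sudo are in place.)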
>
> Before running gk-deploy:
> ------------------------------------------------------------------
> [root@workstation deploy]# kubectl get nodes,pods,daemonset,deployments,services
> NAME                                      STATUS    AGE       VERSION
> no/node-a.c.kubernetes-174104.internal    Ready     3h        v1.7.2-rancher1
> no/node-b.c.kubernetes-174104.internal    Ready     3h        v1.7.2-rancher1
> no/node-c.c.kubernetes-174104.internal    Ready     3h        v1.7.2-rancher1
>
> NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
> svc/kubernetes   10.43.0.1    <none>        443/TCP   3h
> ------------------------------------------------------------------
>
> After running gk-deploy:
> ------------------------------------------------------------------
> [root@workstation messagegc]# kubectl get nodes,pods,daemonset,deployments,services
> NAME                                      STATUS    AGE       VERSION
> no/node-a.c.kubernetes-174104.internal    Ready     3h        v1.7.2-rancher1
> no/node-b.c.kubernetes-174104.internal    Ready     3h        v1.7.2-rancher1
> no/node-c.c.kubernetes-174104.internal    Ready     3h        v1.7.2-rancher1
>
> NAME                 READY     STATUS    RESTARTS   AGE
> po/glusterfs-0j9l5   0/1       Running   0          2m
> po/glusterfs-gqz4c   0/1       Running   0          2m
> po/glusterfs-gxvcb   0/1       Running   0          2m
>
> NAME           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE-SELECTOR           AGE
> ds/glusterfs   3         3         0       3            0           storagenode=glusterfs   2m
>
> NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
> svc/kubernetes   10.43.0.1    <none>        443/TCP   3h
> ------------------------------------------------------------------
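>
> (To see why the pods never become Ready, the probe events can be pulled from
> one of the pods listed above, e.g.:
>
> kubectl describe po/glusterfs-0j9l5
> kubectl logs po/glusterfs-0j9l5
>
> The Events section at the end of the describe output shows the exact
> Liveness/Readiness failure messages.)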
>
> Kernel module check on all three nodes:
> ------------------------------------------------------------------
> [root@node-a ~]# find /lib*/modules/$(uname -r) -name *.ko | grep 'thin-pool\|snapshot\|mirror' | xargs ls -ltr
> -rw-r--r-- 1 root root 92310 Jun 26 04:13 /lib64/modules/4.9.34-rancher/kernel/drivers/md/dm-thin-pool.ko
> -rw-r--r-- 1 root root 56982 Jun 26 04:13 /lib64/modules/4.9.34-rancher/kernel/drivers/md/dm-snapshot.ko
> -rw-r--r-- 1 root root 27070 Jun 26 04:13 /lib64/modules/4.9.34-rancher/kernel/drivers/md/dm-mirror.ko
> -rw-r--r-- 1 root root 92310 Jun 26 04:13 /lib/modules/4.9.34-rancher/kernel/drivers/md/dm-thin-pool.ko
> -rw-r--r-- 1 root root 56982 Jun 26 04:13 /lib/modules/4.9.34-rancher/kernel/drivers/md/dm-snapshot.ko
> -rw-r--r-- 1 root root 27070 Jun 26 04:13 /lib/modules/4.9.34-rancher/kernel/drivers/md/dm-mirror.ko
> ------------------------------------------------------------------
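>
> (The find above only shows that the modules exist on disk. Whether they are
> actually loaded can be checked with:
>
> lsmod | grep -E 'dm_thin_pool|dm_snapshot|dm_mirror'
>
> and, if nothing shows up, they can be loaded with
> sudo modprobe -a dm_snapshot dm_mirror dm_thin_pool.)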
>
> Error snapshot attached.
>
> In my first mail, I noted that the Readiness probe check has this code in the
> kube-templates/glusterfs-daemonset.yaml file:
> ------------------------------------------------------------------
>         readinessProbe:
>           timeoutSeconds: 3
>           initialDelaySeconds: 40
>           exec:
>             command:
>             - "/bin/bash"
>             - "-c"
>             - systemctl status glusterd.service
>           periodSeconds: 25
>           successThreshold: 1
>           failureThreshold: 15
> ------------------------------------------------------------------
>
> I tried logging into the glusterfs container on one of the nodes and ran the
> above command:
>
> [root@node-a ~]# docker exec -it c0f8ab4d92a23b6df2 /bin/bash
> root@c0f8ab4d92a2:/app# systemctl status glusterd.service
> WARNING: terminal is not fully functional
> Failed to connect to bus: No such file or directory
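>
> (Since systemd does not appear to be available inside this container, a rough
> manual check of whether the daemon is running at all might be:
>
> root@c0f8ab4d92a2:/app# pgrep -l glusterd
>
> or ps -ef | grep glusterd. That is only a diagnostic, though; the readiness
> probe itself still calls systemctl.)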
>
>
> Is there any check I can run manually on the nodes to debug further? Any suggestions?
>
>
> On Thu, Aug 31, 2017 at 6:53 PM, Jose A. Rivera <jarrpa@redhat.com> wrote:
>>
>> Hey Gaurav,
>>
>> The kernel modules must be loaded on all nodes that will run heketi
>> pods. Additionally, you must have at least three nodes specified in
>> your topology file. I'm not sure how you're getting three gluster pods
>> when you only have two nodes defined... :)
>>
>> --Jose
>>
>> On Wed, Aug 30, 2017 at 5:27 AM, Gaurav Chhabra <varuag.chhabra@gmail.com> wrote:
>> > Hi,
>> >
>> >
>> > I have the following setup in place:
>> >
>> > 1 node  : RancherOS running the Rancher application for the Kubernetes setup
>> > 2 nodes : RancherOS running the Rancher agent
>> > 1 node  : CentOS 7 workstation with kubectl installed and the folder
>> > cloned/downloaded from https://github.com/gluster/gluster-kubernetes, using
>> > which I run the Heketi setup (gk-deploy -g)
>> >
>> > I also have the rancher-glusterfs-server container running with the
>> > following configuration:
>> > ------------------------------------------------------------------
>> > [root@node-1 rancher]# cat gluster-server.sh
>> > #!/bin/bash
>> >
>> > sudo docker run --name=gluster-server -d \
>> >     --env 'SERVICE_NAME=gluster' \
>> >     --restart always \
>> >     --env 'GLUSTER_DATA=/srv/docker/gitlab' \
>> >     --publish 2222:22 \
>> >     webcenter/rancher-glusterfs-server
>> > ------------------------------------------------------------------
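>> >
>> > (Whether this container is actually up on each node can be checked with,
>> > for example:
>> >
>> > sudo docker ps --filter name=gluster-server
>> >
>> > and the published SSH port with something like ssh -p 2222 <node-ip>, where
>> > the user to log in as depends on how the image is set up.)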
>> >
>> > In /etc/heketi/heketi.json, the following is the only modified portion:
>> > ------------------------------------------------------------------
>> >     "executor": "ssh",
>> >
>> >     "_sshexec_comment": "SSH username and private key file information",
>> >     "sshexec": {
>> >       "keyfile": "/var/lib/heketi/.ssh/id_rsa",
>> >       "user": "root",
>> >       "port": "22",
>> >       "fstab": "/etc/fstab"
>> >     },
>> > ------------------------------------------------------------------
>> >
>> > Status before running gk-deploy:
>> >
>> > [root@workstation deploy]# kubectl get nodes,pods,services,deployments
>> > NAME                                      STATUS    AGE       VERSION
>> > no/node-1.c.kubernetes-174104.internal    Ready     2d        v1.7.2-rancher1
>> > no/node-2.c.kubernetes-174104.internal    Ready     2d        v1.7.2-rancher1
>> > no/node-3.c.kubernetes-174104.internal    Ready     2d        v1.7.2-rancher1
>> >
>> > NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
>> > svc/kubernetes   10.43.0.1    <none>        443/TCP   2d
>> >
>> >
>> >
>> > Now when I run 'gk-deploy -g', I see the following error in the Rancher
>> > console:
>> > Readiness probe failed: Failed to get D-Bus connection: Operation not
>> > permitted
>> >
>> > From the attached gk-deploy_log I see that it failed at:
>> > Waiting for GlusterFS pods to start ... pods not found.
>> >
>> > In the kube-templates/glusterfs-daemonset.yaml file, I see this for the
>> > Readiness probe section:
>> > ------------------------------------------------------------------
>> >         readinessProbe:
>> >           timeoutSeconds: 3
>> >           initialDelaySeconds: 40
>> >           exec:
>> >             command:
>> >             - "/bin/bash"
>> >             - "-c"
>> >             - systemctl status glusterd.service
>> >           periodSeconds: 25
>> >           successThreshold: 1
>> >           failureThreshold: 15
>> > ------------------------------------------------------------------
>> >
>> >
>> > Status after running gk-deploy:
>> >
>> > [root@workstation deploy]# kubectl get nodes,pods,deployments,services
>> > NAME                                      STATUS    AGE       VERSION
>> > no/node-1.c.kubernetes-174104.internal    Ready     2d        v1.7.2-rancher1
>> > no/node-2.c.kubernetes-174104.internal    Ready     2d        v1.7.2-rancher1
>> > no/node-3.c.kubernetes-174104.internal    Ready     2d        v1.7.2-rancher1
>> >
>> > NAME                 READY     STATUS    RESTARTS   AGE
>> > po/glusterfs-0s440   0/1       Running   0          1m
>> > po/glusterfs-j7dgr   0/1       Running   0          1m
>> > po/glusterfs-p6jl3   0/1       Running   0          1m
>> >
>> > NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
>> > svc/kubernetes   10.43.0.1    <none>        443/TCP   2d
>> >
>> >
>> >
>> > Also, from a prerequisites perspective, I also saw this mentioned:
>> >
>> > The following kernel modules must be loaded:
>> >   * dm_snapshot
>> >   * dm_mirror
>> >   * dm_thin_pool
>> >
>> > Where exactly should this be checked? On all Gluster server nodes? How can I
>> > check whether they are loaded?
>> >
>> > I have attached topology.json and the gk-deploy log for reference.
>> >
>> > Does this issue have anything to do with the host OS (RancherOS) that I am
>> > using for the Gluster nodes? Any idea how I can fix this? Any help would
>> > really be appreciated.
>> >
>> >
>> > Thanks.
>> >
>> >
>> >
>> > _______________________________________________
>> > heketi-devel mailing list
>> > heketi-devel@gluster.org
>> > http://lists.gluster.org/mailman/listinfo/heketi-devel
>> >
>
>