[heketi-devel] Unable to use Heketi setup to install Gluster for Kubernetes
Jose A. Rivera
jarrpa at redhat.com
Fri Sep 1 14:51:18 UTC 2017
1. Add a line to the sshexec section of heketi.json of the sort (see the
example block after these steps):
"sudo": true,
2. Run:
gk-deploy -g --abort
3. On the nodes that were/will be running GlusterFS pods, run:
rm -rf /var/lib/heketi /etc/glusterfs /var/lib/glusterd /var/log/glusterfs
Then try the deploy again.
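For reference, a minimal sketch of what the resulting sshexec section
would look like (keyfile, user, port, and fstab taken from your
heketi.json below; "sudo": true tells Heketi to prefix its remote
commands with sudo, which is needed when connecting as a non-root user
like rancher):
------------------------------------------------------------------
"executor": "ssh",

"_sshexec_comment": "SSH username and private key file information",
"sshexec": {
    "keyfile": "/var/lib/heketi/.ssh/id_rsa",
    "user": "rancher",
    "port": "22",
    "fstab": "/etc/fstab",
    "sudo": true
},
------------------------------------------------------------------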
On Fri, Sep 1, 2017 at 6:05 AM, Gaurav Chhabra <varuag.chhabra at gmail.com> wrote:
> Hi Jose,
>
>
> Thanks for the reply. It seems the three gluster pods might have been a
> copy-paste from another cluster where I was trying to set up the same
> thing using CentOS. Sorry for that. By the way, I did check for the kernel
> modules and it seems they're already there. Also, I am attaching a fresh
> set of files because I created a new cluster and thought of giving it a
> try again. The issue still persists. :(
>
> In heketi.json, there is a slight change w.r.t. the user that connects to
> the GlusterFS nodes using SSH. I am not sure how Heketi was using the root
> user to log in, because I wasn't able to SSH in manually as root. With the
> rancher user, I can log in successfully, so I think this should be fine.
>
> /etc/heketi/heketi.json:
> ------------------------------------------------------------------
> "executor": "ssh",
>
> "_sshexec_comment": "SSH username and private key file information",
> "sshexec": {
> "keyfile": "/var/lib/heketi/.ssh/id_rsa",
> "user": "rancher",
> "port": "22",
> "fstab": "/etc/fstab"
> },
> ------------------------------------------------------------------
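>
> As a sanity check (a sketch; the node name is an example, the key path is
> the one from heketi.json above), one can log in exactly the way Heketi
> does and confirm that passwordless sudo works, since a non-root SSH user
> needs sudo for Heketi's device and LVM operations:
> ------------------------------------------------------------------
> # connect as the rancher user with Heketi's key
> ssh -i /var/lib/heketi/.ssh/id_rsa -p 22 rancher@node-a.c.kubernetes-174104.internal
> # on the node: verify sudo works without a password prompt
> sudo -n true && echo "passwordless sudo OK"
> ------------------------------------------------------------------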
>
> Before running gk-deploy:
> ------------------------------------------------------------------
> [root at workstation deploy]# kubectl get nodes,pods,daemonset,deployments,services
> NAME                                     STATUS    AGE       VERSION
> no/node-a.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1
> no/node-b.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1
> no/node-c.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1
>
> NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
> svc/kubernetes   10.43.0.1    <none>        443/TCP   3h
> ------------------------------------------------------------------
>
> After running gk-deploy:
> ------------------------------------------------------------------
> [root at workstation messagegc]# kubectl get nodes,pods,daemonset,deployments,services
> NAME                                     STATUS    AGE       VERSION
> no/node-a.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1
> no/node-b.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1
> no/node-c.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1
>
> NAME                 READY     STATUS    RESTARTS   AGE
> po/glusterfs-0j9l5   0/1       Running   0          2m
> po/glusterfs-gqz4c   0/1       Running   0          2m
> po/glusterfs-gxvcb   0/1       Running   0          2m
>
> NAME           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE-SELECTOR           AGE
> ds/glusterfs   3         3         0       3            0           storagenode=glusterfs   2m
>
> NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
> svc/kubernetes   10.43.0.1    <none>        443/TCP   3h
> ------------------------------------------------------------------
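>
> Since all three pods stay 0/1, a quick way to see which check is failing
> is to inspect the pod events and container logs (a sketch; the pod name
> is taken from the output above):
> ------------------------------------------------------------------
> # show recent events, including readiness probe failures
> kubectl describe pod glusterfs-0j9l5
> # show the container's own log output
> kubectl logs glusterfs-0j9l5
> ------------------------------------------------------------------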
>
> Kernel module check on all three nodes:
> ------------------------------------------------------------------
> [root at node-a ~]# find /lib*/modules/$(uname -r) -name '*.ko' | grep 'thin-pool\|snapshot\|mirror' | xargs ls -ltr
> -rw-r--r-- 1 root root 92310 Jun 26 04:13 /lib64/modules/4.9.34-rancher/kernel/drivers/md/dm-thin-pool.ko
> -rw-r--r-- 1 root root 56982 Jun 26 04:13 /lib64/modules/4.9.34-rancher/kernel/drivers/md/dm-snapshot.ko
> -rw-r--r-- 1 root root 27070 Jun 26 04:13 /lib64/modules/4.9.34-rancher/kernel/drivers/md/dm-mirror.ko
> -rw-r--r-- 1 root root 92310 Jun 26 04:13 /lib/modules/4.9.34-rancher/kernel/drivers/md/dm-thin-pool.ko
> -rw-r--r-- 1 root root 56982 Jun 26 04:13 /lib/modules/4.9.34-rancher/kernel/drivers/md/dm-snapshot.ko
> -rw-r--r-- 1 root root 27070 Jun 26 04:13 /lib/modules/4.9.34-rancher/kernel/drivers/md/dm-mirror.ko
> ------------------------------------------------------------------
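>
> Note that find only proves the module files exist on disk, not that they
> are loaded into the running kernel. A sketch of checking (and, if needed,
> loading) them with standard lsmod/modprobe, run on every node that will
> host GlusterFS pods:
> ------------------------------------------------------------------
> # list the device-mapper modules currently loaded
> lsmod | grep -E 'dm_(snapshot|mirror|thin_pool)'
> # load any that are missing
> modprobe -a dm_snapshot dm_mirror dm_thin_pool
> ------------------------------------------------------------------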
>
> Error snapshot attached.
>
> In my first mail, I noted that the failing readiness probe is defined by
> this code in the kube-templates/glusterfs-daemonset.yaml file:
> ------------------------------------------------------------------
> readinessProbe:
>   timeoutSeconds: 3
>   initialDelaySeconds: 40
>   exec:
>     command:
>     - "/bin/bash"
>     - "-c"
>     - systemctl status glusterd.service
>   periodSeconds: 25
>   successThreshold: 1
>   failureThreshold: 15
> ------------------------------------------------------------------
>
> I tried logging into the glusterfs container on one of the nodes and ran
> the above command:
>
> [root at node-a ~]# docker exec -it c0f8ab4d92a23b6df2 /bin/bash
> root at c0f8ab4d92a2:/app# systemctl status glusterd.service
> WARNING: terminal is not fully functional
> Failed to connect to bus: No such file or directory
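>
> That error means systemctl cannot reach systemd inside the container. As
> a systemd-independent way to see whether the daemon itself is up (a
> diagnostic sketch only, assuming pgrep exists in the image; this is not
> the check the template uses), one can look for the glusterd process
> directly:
> ------------------------------------------------------------------
> # inside the container: is glusterd running at all?
> pgrep -x glusterd && echo "glusterd is running" || echo "glusterd not running"
> ------------------------------------------------------------------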
>
>
> Any check that I can do manually on the nodes to debug further? Any suggestions?
>
>
> On Thu, Aug 31, 2017 at 6:53 PM, Jose A. Rivera <jarrpa at redhat.com> wrote:
>>
>> Hey Gaurav,
>>
>> The kernel modules must be loaded on all nodes that will run GlusterFS
>> pods. Additionally, you must have at least three nodes specified in
>> your topology file. I'm not sure how you're getting three gluster pods
>> when you only have two nodes defined... :)
>>
>> --Jose
>>
>> On Wed, Aug 30, 2017 at 5:27 AM, Gaurav Chhabra
>> <varuag.chhabra at gmail.com> wrote:
>> > Hi,
>> >
>> >
>> > I have the following setup in place:
>> >
>> > 1 node : RancherOS having Rancher application for Kubernetes setup
>> > 2 nodes : RancherOS having Rancher agent
>> > 1 node : CentOS 7 workstation having kubectl installed and folder
>> > cloned/downloaded from https://github.com/gluster/gluster-kubernetes
>> > using
>> > which i run Heketi setup (gk-deploy -g)
>> >
>> > I also have the rancher-glusterfs-server container running with the
>> > following configuration:
>> > ------------------------------------------------------------------
>> > [root at node-1 rancher]# cat gluster-server.sh
>> > #!/bin/bash
>> >
>> > sudo docker run --name=gluster-server -d \
>> >     --env 'SERVICE_NAME=gluster' \
>> >     --restart always \
>> >     --env 'GLUSTER_DATA=/srv/docker/gitlab' \
>> >     --publish 2222:22 \
>> >     webcenter/rancher-glusterfs-server
>> > ------------------------------------------------------------------
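>> >
>> > One thing worth a sanity check (a sketch; the node name is an example):
>> > the container publishes its sshd on host port 2222, while heketi.json
>> > below uses "port": "22", which reaches the host's own sshd rather than
>> > the container's. Both can be probed directly:
>> > ------------------------------------------------------------------
>> > # sshd of the gluster-server container (published as 2222)
>> > ssh -p 2222 root@node-1.c.kubernetes-174104.internal
>> > # sshd of the host itself, which Heketi will actually connect to
>> > ssh -p 22 root@node-1.c.kubernetes-174104.internal
>> > ------------------------------------------------------------------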
>> >
>> > In /etc/heketi/heketi.json, following is the only modified portion:
>> > ------------------------------------------------------------------
>> > "executor": "ssh",
>> >
>> > "_sshexec_comment": "SSH username and private key file information",
>> > "sshexec": {
>> > "keyfile": "/var/lib/heketi/.ssh/id_rsa",
>> > "user": "root",
>> > "port": "22",
>> > "fstab": "/etc/fstab"
>> > },
>> > ------------------------------------------------------------------
>> >
>> > Status before running gk-deploy:
>> >
>> > [root at workstation deploy]# kubectl get nodes,pods,services,deployments
>> > NAME                                     STATUS    AGE       VERSION
>> > no/node-1.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
>> > no/node-2.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
>> > no/node-3.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
>> >
>> > NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
>> > svc/kubernetes   10.43.0.1    <none>        443/TCP   2d
>> >
>> >
>> > Now when I run 'gk-deploy -g', in the Rancher console I see the
>> > following error:
>> > Readiness probe failed: Failed to get D-Bus connection: Operation not
>> > permitted
>> >
>> > From the attached gk-deploy_log I see that it failed at:
>> > Waiting for GlusterFS pods to start ... pods not found.
>> >
>> > In the kube-templates/glusterfs-daemonset.yaml file, I see this for the
>> > readiness probe section:
>> > ------------------------------------------------------------------
>> > readinessProbe:
>> >   timeoutSeconds: 3
>> >   initialDelaySeconds: 40
>> >   exec:
>> >     command:
>> >     - "/bin/bash"
>> >     - "-c"
>> >     - systemctl status glusterd.service
>> >   periodSeconds: 25
>> >   successThreshold: 1
>> >   failureThreshold: 15
>> > ------------------------------------------------------------------
>> >
>> >
>> > Status after running gk-deploy:
>> >
>> > [root at workstation deploy]# kubectl get nodes,pods,deployments,services
>> > NAME                                     STATUS    AGE       VERSION
>> > no/node-1.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
>> > no/node-2.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
>> > no/node-3.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
>> >
>> > NAME READY STATUS RESTARTS AGE
>> > po/glusterfs-0s440 0/1 Running 0 1m
>> > po/glusterfs-j7dgr 0/1 Running 0 1m
>> > po/glusterfs-p6jl3 0/1 Running 0 1m
>> >
>> > NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
>> > svc/kubernetes 10.43.0.1 <none> 443/TCP 2d
>> >
>> >
>> > Also, from a prerequisites perspective, I saw this mentioned:
>> >
>> > The following kernel modules must be loaded:
>> > * dm_snapshot
>> > * dm_mirror
>> > * dm_thin_pool
>> >
>> > Where exactly is this to be checked? On all Gluster server nodes? How
>> > can I check whether it's there?
>> >
>> > I have attached topology.json and gk-deploy log for reference.
>> >
>> > Does this issue have anything to do with the host OS (RancherOS) that I
>> > am using for the Gluster nodes? Any idea how I can fix this? Any help
>> > will really be appreciated.
>> >
>> >
>> > Thanks.
>> >
>> >
>> >
>> > _______________________________________________
>> > heketi-devel mailing list
>> > heketi-devel at gluster.org
>> > http://lists.gluster.org/mailman/listinfo/heketi-devel
>> >
>
>