[heketi-devel] Unable to use Heketi setup to install Gluster for Kubernetes
Jose A. Rivera
jarrpa at redhat.com
Fri Sep 1 14:51:18 UTC 2017
1. Add a line to the sshexec section of heketi.json of the sort (see the
example block after these steps):
"sudo": true,
2. Run:
gk-deploy -g --abort
3. On the nodes that were/will be running GlusterFS pods, run:
rm -rf /var/lib/heketi /etc/glusterfs /var/lib/glusterd /var/log/glusterfs
Then try the deploy again.
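For reference, a minimal sketch of what the resulting sshexec section
would look like (keyfile, user, port, and fstab taken from your
heketi.json below; "sudo": true tells Heketi to prefix its remote
commands with sudo, which is needed when connecting as a non-root user
like rancher):
------------------------------------------------------------------
"executor": "ssh",

"_sshexec_comment": "SSH username and private key file information",
"sshexec": {
    "keyfile": "/var/lib/heketi/.ssh/id_rsa",
    "user": "rancher",
    "port": "22",
    "fstab": "/etc/fstab",
    "sudo": true
},
------------------------------------------------------------------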
On Fri, Sep 1, 2017 at 6:05 AM, Gaurav Chhabra <varuag.chhabra at gmail.com> wrote:
> Hi Jose,
>
>
> Thanks for the reply. It seems the three gluster pods might have been a
> copy-paste from another cluster where I was trying to set up the same
> thing using CentOS. Sorry for that. By the way, I did check for the kernel
> modules and it seems they're already there. Also, I am attaching a fresh
> set of files because I created a new cluster and thought of giving it a
> try again. The issue still persists. :(
>
> In heketi.json, there is a slight change w.r.t. the user that connects to
> the GlusterFS nodes using SSH. I am not sure how Heketi was using the root
> user to log in, because I wasn't able to SSH in manually as root. With the
> rancher user, I can log in successfully, so I think this should be fine.
>
> /etc/heketi/heketi.json:
> ------------------------------------------------------------------
> "executor": "ssh",
>
> "_sshexec_comment": "SSH username and private key file information",
> "sshexec": {
> "keyfile": "/var/lib/heketi/.ssh/id_rsa",
> "user": "rancher",
> "port": "22",
> "fstab": "/etc/fstab"
> },
> ------------------------------------------------------------------
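>
> As a sanity check (a sketch; the node name is an example, the key path is
> the one from heketi.json above), one can log in exactly the way Heketi
> does and confirm that passwordless sudo works, since a non-root SSH user
> needs sudo for Heketi's device and LVM operations:
> ------------------------------------------------------------------
> # connect as the rancher user with Heketi's key
> ssh -i /var/lib/heketi/.ssh/id_rsa -p 22 rancher@node-a.c.kubernetes-174104.internal
> # on the node: verify sudo works without a password prompt
> sudo -n true && echo "passwordless sudo OK"
> ------------------------------------------------------------------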
>
> Before running gk-deploy:
> ------------------------------------------------------------------
> [root at workstation deploy]# kubectl get nodes,pods,daemonset,deployments,services
> NAME                                     STATUS    AGE       VERSION
> no/node-a.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1
> no/node-b.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1
> no/node-c.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1
>
> NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
> svc/kubernetes   10.43.0.1    <none>        443/TCP   3h
> ------------------------------------------------------------------
>
> After running gk-deploy:
> ------------------------------------------------------------------
> [root at workstation messagegc]# kubectl get nodes,pods,daemonset,deployments,services
> NAME                                     STATUS    AGE       VERSION
> no/node-a.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1
> no/node-b.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1
> no/node-c.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1
>
> NAME                 READY     STATUS    RESTARTS   AGE
> po/glusterfs-0j9l5   0/1       Running   0          2m
> po/glusterfs-gqz4c   0/1       Running   0          2m
> po/glusterfs-gxvcb   0/1       Running   0          2m
>
> NAME           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE-SELECTOR           AGE
> ds/glusterfs   3         3         0       3            0           storagenode=glusterfs   2m
>
> NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
> svc/kubernetes   10.43.0.1    <none>        443/TCP   3h
> ------------------------------------------------------------------
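>
> Since all three pods stay 0/1, a quick way to see which check is failing
> is to inspect the pod events and container logs (a sketch; the pod name
> is taken from the output above):
> ------------------------------------------------------------------
> # show recent events, including readiness probe failures
> kubectl describe pod glusterfs-0j9l5
> # show the container's own log output
> kubectl logs glusterfs-0j9l5
> ------------------------------------------------------------------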
>
> Kernel module check on all three nodes:
> ------------------------------------------------------------------
> [root at node-a ~]# find /lib*/modules/$(uname -r) -name '*.ko' | grep 'thin-pool\|snapshot\|mirror' | xargs ls -ltr
> -rw-r--r-- 1 root root 92310 Jun 26 04:13 /lib64/modules/4.9.34-rancher/kernel/drivers/md/dm-thin-pool.ko
> -rw-r--r-- 1 root root 56982 Jun 26 04:13 /lib64/modules/4.9.34-rancher/kernel/drivers/md/dm-snapshot.ko
> -rw-r--r-- 1 root root 27070 Jun 26 04:13 /lib64/modules/4.9.34-rancher/kernel/drivers/md/dm-mirror.ko
> -rw-r--r-- 1 root root 92310 Jun 26 04:13 /lib/modules/4.9.34-rancher/kernel/drivers/md/dm-thin-pool.ko
> -rw-r--r-- 1 root root 56982 Jun 26 04:13 /lib/modules/4.9.34-rancher/kernel/drivers/md/dm-snapshot.ko
> -rw-r--r-- 1 root root 27070 Jun 26 04:13 /lib/modules/4.9.34-rancher/kernel/drivers/md/dm-mirror.ko
> ------------------------------------------------------------------
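>
> Note that find only proves the module files exist on disk, not that they
> are loaded into the running kernel. A sketch of checking (and, if needed,
> loading) them with standard lsmod/modprobe, run on every node that will
> host GlusterFS pods:
> ------------------------------------------------------------------
> # list the device-mapper modules currently loaded
> lsmod | grep -E 'dm_(snapshot|mirror|thin_pool)'
> # load any that are missing
> modprobe -a dm_snapshot dm_mirror dm_thin_pool
> ------------------------------------------------------------------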
>
> Error snapshot attached.
>
> In my first mail, I noted that the failing readiness probe is defined by
> this code in the kube-templates/glusterfs-daemonset.yaml file:
> ------------------------------------------------------------------
> readinessProbe:
>   timeoutSeconds: 3
>   initialDelaySeconds: 40
>   exec:
>     command:
>     - "/bin/bash"
>     - "-c"
>     - systemctl status glusterd.service
>   periodSeconds: 25
>   successThreshold: 1
>   failureThreshold: 15
> ------------------------------------------------------------------
>
> I tried logging into the glusterfs container on one of the nodes and ran
> the above command:
>
> [root at node-a ~]# docker exec -it c0f8ab4d92a23b6df2 /bin/bash
> root at c0f8ab4d92a2:/app# systemctl status glusterd.service
> WARNING: terminal is not fully functional
> Failed to connect to bus: No such file or directory
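>
> That error means systemctl cannot reach systemd inside the container. As
> a systemd-independent way to see whether the daemon itself is up (a
> diagnostic sketch only, assuming pgrep exists in the image; this is not
> the check the template uses), one can look for the glusterd process
> directly:
> ------------------------------------------------------------------
> # inside the container: is glusterd running at all?
> pgrep -x glusterd && echo "glusterd is running" || echo "glusterd not running"
> ------------------------------------------------------------------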
>
>
> Any check that I can do manually on the nodes to debug further? Any suggestions?
>
>
> On Thu, Aug 31, 2017 at 6:53 PM, Jose A. Rivera <jarrpa at redhat.com> wrote:
>>
>> Hey Gaurav,
>>
>> The kernel modules must be loaded on all nodes that will run GlusterFS
>> pods. Additionally, you must have at least three nodes specified in
>> your topology file. I'm not sure how you're getting three gluster pods
>> when you only have two nodes defined... :)
>>
>> --Jose
>>
>> On Wed, Aug 30, 2017 at 5:27 AM, Gaurav Chhabra
>> <varuag.chhabra at gmail.com> wrote:
>> > Hi,
>> >
>> >
>> > I have the following setup in place:
>> >
>> > 1 node : RancherOS having Rancher application for Kubernetes setup
>> > 2 nodes : RancherOS having Rancher agent
>> > 1 node : CentOS 7 workstation having kubectl installed and folder
>> > cloned/downloaded from https://github.com/gluster/gluster-kubernetes
>> > using
>> > which i run Heketi setup (gk-deploy -g)
>> >
>> > I also have the rancher-glusterfs-server container running with the
>> > following configuration:
>> > ------------------------------------------------------------------
>> > [root at node-1 rancher]# cat gluster-server.sh
>> > #!/bin/bash
>> >
>> > sudo docker run --name=gluster-server -d \
>> >     --env 'SERVICE_NAME=gluster' \
>> >     --restart always \
>> >     --env 'GLUSTER_DATA=/srv/docker/gitlab' \
>> >     --publish 2222:22 \
>> >     webcenter/rancher-glusterfs-server
>> > ------------------------------------------------------------------
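>> >
>> > One thing worth a sanity check (a sketch; the node name is an example):
>> > the container publishes its sshd on host port 2222, while heketi.json
>> > below uses "port": "22", which reaches the host's own sshd rather than
>> > the container's. Both can be probed directly:
>> > ------------------------------------------------------------------
>> > # sshd of the gluster-server container (published as 2222)
>> > ssh -p 2222 root@node-1.c.kubernetes-174104.internal
>> > # sshd of the host itself, which Heketi will actually connect to
>> > ssh -p 22 root@node-1.c.kubernetes-174104.internal
>> > ------------------------------------------------------------------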
>> >
>> > In /etc/heketi/heketi.json, following is the only modified portion:
>> > ------------------------------------------------------------------
>> > "executor": "ssh",
>> >
>> > "_sshexec_comment": "SSH username and private key file information",
>> > "sshexec": {
>> > "keyfile": "/var/lib/heketi/.ssh/id_rsa",
>> > "user": "root",
>> > "port": "22",
>> > "fstab": "/etc/fstab"
>> > },
>> > ------------------------------------------------------------------
>> >
>> > Status before running gk-deploy:
>> >
>> > [root at workstation deploy]# kubectl get nodes,pods,services,deployments
>> > NAME                                     STATUS    AGE       VERSION
>> > no/node-1.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
>> > no/node-2.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
>> > no/node-3.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
>> >
>> > NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
>> > svc/kubernetes   10.43.0.1    <none>        443/TCP   2d
>> >
>> >
>> > Now when I run 'gk-deploy -g', in the Rancher console I see the
>> > following error:
>> > Readiness probe failed: Failed to get D-Bus connection: Operation not
>> > permitted
>> >
>> > From the attached gk-deploy_log I see that it failed at:
>> > Waiting for GlusterFS pods to start ... pods not found.
>> >
>> > In the kube-templates/glusterfs-daemonset.yaml file, I see this for the
>> > readiness probe section:
>> > ------------------------------------------------------------------
>> > readinessProbe:
>> >   timeoutSeconds: 3
>> >   initialDelaySeconds: 40
>> >   exec:
>> >     command:
>> >     - "/bin/bash"
>> >     - "-c"
>> >     - systemctl status glusterd.service
>> >   periodSeconds: 25
>> >   successThreshold: 1
>> >   failureThreshold: 15
>> > ------------------------------------------------------------------
>> >
>> >
>> > Status after running gk-deploy:
>> >
>> > [root at workstation deploy]# kubectl get nodes,pods,deployments,services
>> > NAME                                     STATUS    AGE       VERSION
>> > no/node-1.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
>> > no/node-2.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
>> > no/node-3.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
>> >
>> > NAME READY STATUS RESTARTS AGE
>> > po/glusterfs-0s440 0/1 Running 0 1m
>> > po/glusterfs-j7dgr 0/1 Running 0 1m
>> > po/glusterfs-p6jl3 0/1 Running 0 1m
>> >
>> > NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
>> > svc/kubernetes 10.43.0.1 <none> 443/TCP 2d
>> >
>> >
>> > Also, from a prerequisites perspective, I saw this mentioned:
>> >
>> > The following kernel modules must be loaded:
>> > * dm_snapshot
>> > * dm_mirror
>> > * dm_thin_pool
>> >
>> > Where exactly is this to be checked? On all Gluster server nodes? How
>> > can I check whether it's there?
>> >
>> > I have attached topology.json and gk-deploy log for reference.
>> >
>> > Does this issue have anything to do with the host OS (RancherOS) that I
>> > am using for the Gluster nodes? Any idea how I can fix this? Any help
>> > will really be appreciated.
>> >
>> >
>> > Thanks.
>> >
>> >
>> >
>> > _______________________________________________
>> > heketi-devel mailing list
>> > heketi-devel at gluster.org
>> > http://lists.gluster.org/mailman/listinfo/heketi-devel
>> >
>
>