[heketi-devel] Unable to use Heketi setup to install Gluster for Kubernetes
Gaurav Chhabra
varuag.chhabra at gmail.com
Fri Sep 1 11:05:47 UTC 2017
Hi Jose,
Thanks for the reply. It seems the three gluster pods might have been a
copy-paste from another cluster where I was trying to set up the same
thing using CentOS. Sorry for that. By the way, I did check for the kernel
modules and they seem to be present already. Also, I am attaching a *fresh
set of files* because I created a new cluster and thought of giving it
another try. The issue still persists. :(
In *heketi.json*, there is a slight change to the user that connects to
the glusterfs nodes over SSH. I am not sure how Heketi was using the root
user to log in, because I wasn't able to SSH manually as root. With the
*rancher* user, I can log in successfully, so I think this should be fine.
/etc/heketi/heketi.json:
------------------------------------------------------------------
"executor": "ssh",
"_sshexec_comment": "SSH username and private key file information",
"sshexec": {
  "keyfile": "/var/lib/heketi/.ssh/id_rsa",
  "user": "rancher",
  "port": "22",
  "fstab": "/etc/fstab"
},
------------------------------------------------------------------
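To double-check that the key and user from the sshexec section really allow a non-interactive login, something like the following could be used to build the equivalent manual ssh command. This is only a sketch: the helper name ssh_check_command and the host name "node-a" are placeholders I made up, not anything Heketi provides.

```python
import json

def ssh_check_command(sshexec, host):
    """Build an ssh argv that mirrors the sshexec settings, for a manual login test."""
    return [
        "ssh",
        "-i", sshexec["keyfile"],       # same private key the sshexec section points at
        "-p", str(sshexec["port"]),     # the port is stored as a string in heketi.json
        "-o", "BatchMode=yes",          # fail instead of prompting for a password
        f'{sshexec["user"]}@{host}',
        "true",                         # no-op remote command; exit status 0 means login worked
    ]

# The sshexec values shown above, wrapped in braces so they parse as JSON.
fragment = json.loads("""
{
  "keyfile": "/var/lib/heketi/.ssh/id_rsa",
  "user": "rancher",
  "port": "22",
  "fstab": "/etc/fstab"
}
""")

# "node-a" is a placeholder host name, not taken from topology.json.
print(" ".join(ssh_check_command(fragment, "node-a")))
```

Running the printed command from wherever the key file lives should exit with status 0 if key-based login works for that user.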
Before running gk-deploy:
------------------------------------------------------------------
[root@workstation deploy]# kubectl get nodes,pods,daemonset,deployments,services
NAME                                     STATUS    AGE       VERSION
no/node-a.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1
no/node-b.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1
no/node-c.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1

NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
svc/kubernetes   10.43.0.1    <none>        443/TCP   3h
------------------------------------------------------------------
After running gk-deploy:
------------------------------------------------------------------
[root@workstation messagegc]# kubectl get nodes,pods,daemonset,deployments,services
NAME                                     STATUS    AGE       VERSION
no/node-a.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1
no/node-b.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1
no/node-c.c.kubernetes-174104.internal   Ready     3h        v1.7.2-rancher1

NAME                 READY     STATUS    RESTARTS   AGE
po/glusterfs-0j9l5   0/1       Running   0          2m
po/glusterfs-gqz4c   0/1       Running   0          2m
po/glusterfs-gxvcb   0/1       Running   0          2m

NAME           DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE-SELECTOR           AGE
ds/glusterfs   3         3         0         3            0           storagenode=glusterfs   2m

NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
svc/kubernetes   10.43.0.1    <none>        443/TCP   3h
------------------------------------------------------------------
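To pick out the pods that are Running but not Ready from output like the above, a small parser along these lines could help. It is a sketch that assumes the plain "kubectl get" table format shown here (resource names prefixed with "po/"); the function name not_ready_pods is my own.

```python
def not_ready_pods(kubectl_output):
    """Return names of pods whose READY column shows fewer ready containers than total."""
    pending = []
    for line in kubectl_output.splitlines():
        fields = line.split()
        # Pod rows look like: po/<name> <ready>/<total> <status> <restarts> <age>
        if len(fields) < 2 or not fields[0].startswith("po/") or "/" not in fields[1]:
            continue
        ready, total = fields[1].split("/")
        if int(ready) < int(total):
            pending.append(fields[0][len("po/"):])
    return pending

sample = """\
NAME READY STATUS RESTARTS AGE
po/glusterfs-0j9l5 0/1 Running 0 2m
po/glusterfs-gqz4c 0/1 Running 0 2m
po/glusterfs-gxvcb 0/1 Running 0 2m
"""
print(not_ready_pods(sample))
```

For the listing above, all three glusterfs pods are reported, which matches the READY count of 0 on the daemonset.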
Kernel module check on all three nodes:
------------------------------------------------------------------
[root@node-a ~]# find /lib*/modules/$(uname -r) -name *.ko | grep 'thin-pool\|snapshot\|mirror' | xargs ls -ltr
-rw-r--r-- 1 root root 92310 Jun 26 04:13 /lib64/modules/4.9.34-rancher/kernel/drivers/md/dm-thin-pool.ko
-rw-r--r-- 1 root root 56982 Jun 26 04:13 /lib64/modules/4.9.34-rancher/kernel/drivers/md/dm-snapshot.ko
-rw-r--r-- 1 root root 27070 Jun 26 04:13 /lib64/modules/4.9.34-rancher/kernel/drivers/md/dm-mirror.ko
-rw-r--r-- 1 root root 92310 Jun 26 04:13 /lib/modules/4.9.34-rancher/kernel/drivers/md/dm-thin-pool.ko
-rw-r--r-- 1 root root 56982 Jun 26 04:13 /lib/modules/4.9.34-rancher/kernel/drivers/md/dm-snapshot.ko
-rw-r--r-- 1 root root 27070 Jun 26 04:13 /lib/modules/4.9.34-rancher/kernel/drivers/md/dm-mirror.ko
------------------------------------------------------------------
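Note that the find output only shows the .ko files exist on disk, not that the modules are loaded. A minimal sketch for checking the loaded state, assuming /proc/modules is readable on the node and a Python interpreter is available there (the helper name missing_modules is made up):

```python
# Module names as listed in the gk-deploy prerequisites (underscores, not dashes).
REQUIRED = ("dm_snapshot", "dm_mirror", "dm_thin_pool")

def missing_modules(proc_modules_text, required=REQUIRED):
    """Return the required modules that do not appear as loaded in /proc/modules text."""
    # The first whitespace-separated field of each /proc/modules line is the module name.
    loaded = {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}
    return [name for name in required if name not in loaded]

try:
    with open("/proc/modules") as fh:
        text = fh.read()
except OSError:
    text = ""  # not on Linux, or /proc is not mounted; everything is reported as missing

print(missing_modules(text))
```

An empty list would mean all three modules are currently loaded; any names printed would still need to be loaded on that node.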
Error snapshot attached.
In my first mail, I noted that the failing readiness probe is defined by this
section of the kube-templates/glusterfs-daemonset.yaml file:
------------------------------------------------------------------
readinessProbe:
  timeoutSeconds: 3
  initialDelaySeconds: 40
  exec:
    command:
    - "/bin/bash"
    - "-c"
    - systemctl status glusterd.service
  periodSeconds: 25
  successThreshold: 1
  failureThreshold: 15
------------------------------------------------------------------
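For a rough sense of the timing these probe settings imply, here is a small arithmetic sketch. It assumes probes start initialDelaySeconds after the container starts and then repeat every periodSeconds, and it ignores kubelet jitter and the time each probe takes. The function names are made up for illustration.

```python
def probes_started(container_age_s, initial_delay_s=40, period_s=25):
    """Approximate number of readiness probes started by a given container age."""
    if container_age_s < initial_delay_s:
        return 0
    return (container_age_s - initial_delay_s) // period_s + 1

def nth_probe_start(n, initial_delay_s=40, period_s=25):
    """Approximate container age (seconds) at which the n-th probe starts."""
    return initial_delay_s + (n - 1) * period_s

print(probes_started(120))   # container up for about 2 minutes
print(nth_probe_start(15))   # when the failureThreshold-th probe would start
```

With the values above, a container that has been up for about 2 minutes would have had roughly four probe attempts, and the 15th attempt would start around 390 seconds in.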
I tried logging into the glusterfs container on one of the nodes and ran the
above command:
[root@node-a ~]# docker exec -it c0f8ab4d92a23b6df2 /bin/bash
root@c0f8ab4d92a2:/app# systemctl status glusterd.service
WARNING: terminal is not fully functional
Failed to connect to bus: No such file or directory
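As far as I understand exec probes, Kubernetes treats the probe as passed only when the command exits with status 0, so any error from systemctl (such as the bus error above) counts as a failure. A minimal sketch that mimics this check, assuming a Python interpreter is available wherever it is run; exec_probe_passes is a hypothetical helper, not a Kubernetes API:

```python
import subprocess

def exec_probe_passes(argv, timeout_s=3):
    """Run a probe-style command; True only if it exits 0 within the timeout."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,           # mirrors timeoutSeconds: 3 in the template
        )
    except (subprocess.TimeoutExpired, OSError):
        return False                     # timed out, or the binary could not be started
    return result.returncode == 0

# The same command the readiness probe in the daemonset template runs.
probe = ["/bin/bash", "-c", "systemctl status glusterd.service"]
print(exec_probe_passes(probe))
```

If this prints False inside the container, the pod will stay at 0/1 READY regardless of whether the glusterd process itself is running.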
Is there any check that I can do manually on the nodes to debug further? Any suggestions?
On Thu, Aug 31, 2017 at 6:53 PM, Jose A. Rivera <jarrpa at redhat.com> wrote:
> Hey Gaurav,
>
> The kernel modules must be loaded on all nodes that will run heketi
> pods. Additionally, you must have at least three nodes specified in
> your topology file. I'm not sure how you're getting three gluster pods
> when you only have two nodes defined... :)
>
> --Jose
>
> On Wed, Aug 30, 2017 at 5:27 AM, Gaurav Chhabra
> <varuag.chhabra at gmail.com> wrote:
> > Hi,
> >
> >
> > I have the following setup in place:
> >
> > 1 node : RancherOS having Rancher application for Kubernetes setup
> > 2 nodes : RancherOS having Rancher agent
> > 1 node : CentOS 7 workstation having kubectl installed and folder
> > cloned/downloaded from https://github.com/gluster/gluster-kubernetes using
> > which i run Heketi setup (gk-deploy -g)
> >
> > I also have rancher-glusterfs-server container running with the following
> > configuration:
> > ------------------------------------------------------------------
> > [root@node-1 rancher]# cat gluster-server.sh
> > #!/bin/bash
> >
> > sudo docker run --name=gluster-server -d \
> > --env 'SERVICE_NAME=gluster' \
> > --restart always \
> > --env 'GLUSTER_DATA=/srv/docker/gitlab' \
> > --publish 2222:22 \
> > webcenter/rancher-glusterfs-server
> > ------------------------------------------------------------------
> >
> > In /etc/heketi/heketi.json, following is the only modified portion:
> > ------------------------------------------------------------------
> > "executor": "ssh",
> >
> > "_sshexec_comment": "SSH username and private key file information",
> > "sshexec": {
> > "keyfile": "/var/lib/heketi/.ssh/id_rsa",
> > "user": "root",
> > "port": "22",
> > "fstab": "/etc/fstab"
> > },
> > ------------------------------------------------------------------
> >
> > Status before running gk-deploy:
> >
> > [root@workstation deploy]# kubectl get nodes,pods,services,deployments
> > NAME                                     STATUS    AGE       VERSION
> > no/node-1.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
> > no/node-2.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
> > no/node-3.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
> >
> > NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
> > svc/kubernetes   10.43.0.1    <none>        443/TCP   2d
> >
> >
> > Now when i run 'gk-deploy -g', in the Rancher console, i see the following
> > error:
> > Readiness probe failed: Failed to get D-Bus connection: Operation not
> > permitted
> >
> > From the attached gk-deploy_log i see that it failed at:
> > Waiting for GlusterFS pods to start ... pods not found.
> >
> > In the kube-templates/glusterfs-daemonset.yaml file, i see this for
> > Readiness probe section:
> > ------------------------------------------------------------------
> > readinessProbe:
> >   timeoutSeconds: 3
> >   initialDelaySeconds: 40
> >   exec:
> >     command:
> >     - "/bin/bash"
> >     - "-c"
> >     - systemctl status glusterd.service
> >   periodSeconds: 25
> >   successThreshold: 1
> >   failureThreshold: 15
> > ------------------------------------------------------------------
> >
> >
> > Status after running gk-deploy:
> >
> > [root@workstation deploy]# kubectl get nodes,pods,deployments,services
> > NAME                                     STATUS    AGE       VERSION
> > no/node-1.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
> > no/node-2.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
> > no/node-3.c.kubernetes-174104.internal   Ready     2d        v1.7.2-rancher1
> >
> > NAME                 READY     STATUS    RESTARTS   AGE
> > po/glusterfs-0s440   0/1       Running   0          1m
> > po/glusterfs-j7dgr   0/1       Running   0          1m
> > po/glusterfs-p6jl3   0/1       Running   0          1m
> >
> > NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
> > svc/kubernetes   10.43.0.1    <none>        443/TCP   2d
> >
> >
> > Also, from prerequisite perspective, i was also seeing this mentioned:
> >
> > The following kernel modules must be loaded:
> > * dm_snapshot
> > * dm_mirror
> > * dm_thin_pool
> >
> > Where exactly is this to be checked? On all Gluster server nodes? How can i
> > check whether it's there?
> >
> > I have attached topology.json and gk-deploy log for reference.
> >
> > Does this issue has anything to do with the host OS (RancherOS) that i am
> > using for Gluster nodes? Any idea how i can fix this? Any help will really
> > be appreciated.
> >
> >
> > Thanks.
> >
> >
> >
> > _______________________________________________
> > heketi-devel mailing list
> > heketi-devel at gluster.org
> > http://lists.gluster.org/mailman/listinfo/heketi-devel
> >
>
-------------- next part --------------
[root@workstation deploy]# ./gk-deploy -g
Welcome to the deployment tool for GlusterFS on Kubernetes and OpenShift.
Before getting started, this script has some requirements of the execution
environment and of the container platform that you should verify.
The client machine that will run this script must have:
* Administrative access to an existing Kubernetes or OpenShift cluster
* Access to a python interpreter 'python'
Each of the nodes that will host GlusterFS must also have appropriate firewall
rules for the required GlusterFS ports:
* 2222 - sshd (if running GlusterFS in a pod)
* 24007 - GlusterFS Management
* 24008 - GlusterFS RDMA
* 49152 to 49251 - Each brick for every volume on the host requires its own
port. For every new brick, one new port will be used starting at 49152. We
recommend a default range of 49152-49251 on each host, though you can adjust
this to fit your needs.
The following kernel modules must be loaded:
* dm_snapshot
* dm_mirror
* dm_thin_pool
For systems with SELinux, the following settings need to be considered:
* virt_sandbox_use_fusefs should be enabled on each node to allow writing to
remote GlusterFS volumes
In addition, for an OpenShift deployment you must:
* Have 'cluster_admin' role on the administrative account doing the deployment
* Add the 'default' and 'router' Service Accounts to the 'privileged' SCC
* Have a router deployed that is configured to allow apps to access services
running in the cluster
Do you wish to proceed with deployment?
[Y]es, [N]o? [Default: Y]: Y
Using Kubernetes CLI.
Using namespace "default".
Checking for pre-existing resources...
GlusterFS pods ... not found.
deploy-heketi pod ... not found.
heketi pod ... not found.
Creating initial resources ... serviceaccount "heketi-service-account" created
clusterrolebinding "heketi-sa-view" created
clusterrolebinding "heketi-sa-view" labeled
OK
node "node-a.c.kubernetes-174104.internal" labeled
node "node-b.c.kubernetes-174104.internal" labeled
node "node-c.c.kubernetes-174104.internal" labeled
daemonset "glusterfs" created
Waiting for GlusterFS pods to start ... pods not found.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: topology.json
Type: application/json
Size: 1153 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/heketi-devel/attachments/20170901/c20d1cdc/attachment-0001.json>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Readiness Probe Failed Error.png
Type: image/png
Size: 64512 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/heketi-devel/attachments/20170901/c20d1cdc/attachment-0001.png>