毛宏斌 2019-08-06
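Installing and configuring a Kubernetes cluster is a fairly tedious process. This article walks through setting up a Kubernetes cluster on CentOS 7 virtual machines, using VirtualBox on a Mac.
The goals and specifications of the Kubernetes cluster we are going to build are as follows:
The 4 nodes are planned as follows: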
Hostname | IP address (NAT network) | Host-Only IP address | Role |
---|---|---|---|
k8s-node1 | 192.168.56.11 | 192.168.7.11 | master |
k8s-node2 | 192.168.56.12 | 192.168.7.12 | worker |
k8s-node3 | 192.168.56.13 | 192.168.7.13 | worker |
k8s-node4 | 192.168.56.14 | 192.168.7.14 | worker |
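The addressing scheme in the table is regular enough to generate. The sketch below is only an illustration: `node_plan` is a hypothetical helper name, and it assumes the pattern visible in the table (node N gets 192.168.56.1N on the NAT network and 192.168.7.1N on the Host-Only network, with node 1 as the master).

```shell
# Print "hostname nat_ip host_only_ip role" for nodes 1..4.
# Assumes the scheme from the table: k8s-nodeN, 192.168.56.1N, 192.168.7.1N.
node_plan() {
  for n in 1 2 3 4; do
    if [ "$n" -eq 1 ]; then role=master; else role=worker; fi
    echo "k8s-node$n 192.168.56.1$n 192.168.7.1$n $role"
  done
}

node_plan
```

Piping the result through `awk '{ print $2, $1 }'` would give lines in /etc/hosts format, which may be handy if the nodes should resolve each other by name.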
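Prepare the environment according to the following requirements:
This article uses VirtualBox 6 to create the virtual machines; please install it yourself.
Open VirtualBox and press Command + , (or choose VirtualBox -> Preferences from the menu) to open the Preferences window. Go to the Network tab and click the "Add new NAT network" button to the right of the NAT Networks list. This adds a NAT network named NatNetwork, as shown in the figure below.
Select the network NatNetwork, click the "Edit NAT network" button on the right, change the "Network CIDR" field to 192.168.56.0/24, and click OK, as shown in the figure below.
The NAT network is now configured.
The lower address bound is set to 192.168.7.11 here purely so that its last octet matches that of the NAT network address; it has no other significance.
The Host-Only network is now configured.
Download the CentOS 7.6 image from http://mirrors.163.com/centos/7.6.1810/isos/x86_64/CentOS-7-x86_64-Minimal-1810.iso
At this point the virtual machine has been created. To communicate with it from the host, use its Host-Only network IP address.
Run the following command to look up the Host-Only network IP address: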
ip addr
The output is as follows:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:14:21:b0 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.11/24 brd 192.168.56.255 scope global noprefixroute enp0s3
       valid_lft forever preferred_lft forever
    inet6 fe80::7734:1bd6:9da6:5d1f/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:1b:66:a7 brd ff:ff:ff:ff:ff:ff
    inet 192.168.7.11/24 brd 192.168.7.255 scope global noprefixroute dynamic enp0s8
       valid_lft 1153sec preferred_lft 1153sec
    inet6 fe80::5f85:8418:37a4:f428/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
The interface enp0s8 is therefore the Host-Only interface, and its IP address is 192.168.7.11.
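Reading the address off the full listing by eye works, but it can also be extracted. The following is a minimal sketch, assuming the one-line format produced by `ip -o -4 addr show`; `iface_ipv4` is a hypothetical helper name, and the sample text simply mirrors the addresses shown above so the block can run without a VM.

```shell
# Print the IPv4 address (without the prefix length) of one interface.
# Expects `ip -o -4 addr show` output on stdin, one address per line.
iface_ipv4() {
  awk -v ifname="$1" '$2 == ifname && $3 == "inet" { split($4, a, "/"); print a[1] }'
}

# Sample input mirroring the addresses shown above.
sample='2: enp0s3    inet 192.168.56.11/24 brd 192.168.56.255 scope global noprefixroute enp0s3
3: enp0s8    inet 192.168.7.11/24 brd 192.168.7.255 scope global noprefixroute dynamic enp0s8'

printf '%s\n' "$sample" | iface_ipv4 enp0s8
```

On the virtual machine itself the equivalent call would be `ip -o -4 addr show | iface_ipv4 enp0s8`.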
Some basic configuration is required for the installation steps that follow.
Update the system
yum update -y
Set a static IP
vi /etc/sysconfig/network-scripts/ifcfg-enp0s3
Edit the file so that it contains the following
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=static
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=enp0s3
DEVICE=enp0s3
ONBOOT=yes
IPADDR=192.168.56.11
GATEWAY=192.168.56.1
DNS1=192.168.56.1
Note that the line BOOTPROTO=static is what makes the IP address static.
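Stop and disable the firewall
systemctl stop firewalld
systemctl disable firewalld
Disable SELinux enforcement by switching it to permissive mode
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
Set the hostname
Use the following command to change this machine's hostname to k8s-node1 (the new name written to /etc/hostname is picked up at the next boot)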
echo k8s-node1 > /etc/hostname
Edit /etc/hosts and add the hostname k8s-node1 so that it can be resolved locally. The result looks like this:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 k8s-node1
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
Turn off swap and comment out the swap entry in /etc/fstab so that it is not enabled again at boot
swapoff -a && sysctl -w vm.swappiness=0
sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab
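To see what the sed expression does before letting it edit /etc/fstab in place, it can be tried on sample text. This is a sketch only: `comment_out_swap` is a hypothetical wrapper, the fstab content is made up for illustration, and `-E` is used instead of `-r` purely so the block also runs with non-GNU sed.

```shell
# Comment out every not-yet-commented line that mentions "swap".
# Same expression as above, but reading stdin instead of editing in place.
comment_out_swap() {
  sed -E '/^[^#]*swap/s@^@#@'
}

# Hypothetical fstab content, for illustration only.
sample_fstab='/dev/mapper/centos-root /     xfs  defaults 0 0
/dev/mapper/centos-swap swap  swap defaults 0 0
#/dev/sdb1 swap swap defaults 0 0'

printf '%s\n' "$sample_fstab" | comment_out_swap
```

The address `/^[^#]*swap/` only selects lines where "swap" appears before any `#`, so the root entry is left alone and the line that is already commented out does not receive a second `#`.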
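- Backing up the node before this operation is recommended.
- You can also skip cloning for now and clone the node after docker, kubelet, kubeadm and kubectl have been installed, which is more convenient.
In VirtualBox, clone the k8s-node1 node to create the other nodes, named k8s-node2, k8s-node3 and k8s-node4. Then change the following items on each node:
The base environment is now in place. The next step is to install docker and k8s.
Perform this step on all 4 nodes.
Install the dependencies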
yum install -y yum-utils device-mapper-persistent-data lvm2 deltarpm
Add the Aliyun yum repository
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum makecache fast
Install docker
yum install -y docker-ce docker-ce-cli containerd.io
When the installation finishes, check the docker version
docker --version
The output is
Docker version 18.09.7, build 2d0083d
docker is now installed successfully.
Change docker's cgroup driver to systemd so that it matches the one used by k8s
cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ]
}
EOF
Restart docker and enable it to start at boot:
systemctl restart docker
systemctl enable docker
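After the restart it is worth confirming that the new cgroup driver is actually in effect. A minimal sketch, assuming the usual "Cgroup Driver: ..." line in `docker info` output; `cgroup_driver` is a hypothetical helper name and the sample text is an illustrative excerpt, not captured output.

```shell
# Extract the cgroup driver name from `docker info` output on stdin.
cgroup_driver() {
  awk -F': *' '/Cgroup Driver/ { print $2 }'
}

# Hypothetical excerpt of `docker info` output, for illustration only.
sample_info=' Storage Driver: overlay2
 Logging Driver: json-file
 Cgroup Driver: systemd'

printf '%s\n' "$sample_info" | cgroup_driver
```

On a node this would be run as `docker info 2>/dev/null | cgroup_driver`; `docker info --format '{{.CgroupDriver}}'` should report the same value without any text parsing.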
Perform this step on all 4 nodes.
Add the kubernetes YUM repository, using the Aliyun mirror as the source
cat > /etc/yum.repos.d/kubernetes.repo <<EOF
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
Install kubelet, kubeadm, kubectl and ipvsadm
yum install -y kubelet kubeadm kubectl ipvsadm
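Configure routing
Install the bridge utilities package and load the br_netfilter module
yum install -y bridge-utils.x86_64
modprobe br_netfilter
Set the kernel parameters for bridged traffic and IP forwarding
cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
Reload all configuration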
sysctl --system
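Whether the three settings really took effect can be checked from the "key = value" lines that `sysctl` prints. The sketch below is an assumption-laden illustration: `sysctl_not_enabled` is a hypothetical helper, and the sample deliberately sets one value to 0 to show what a failure looks like.

```shell
# Print every key whose value is not 1, reading "key = value" lines on stdin.
# Prints nothing when all listed settings are enabled.
sysctl_not_enabled() {
  awk -F'=' '
    # skip comments and blank lines
    /^[[:space:]]*(#|$)/ { next }
    {
      key = $1; value = $2
      gsub(/[[:space:]]/, "", key)
      gsub(/[[:space:]]/, "", value)
      if (value != "1") print key
    }
  '
}

# Hypothetical values, with ip_forward left disabled on purpose.
sample_sysctl='net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 0'

printf '%s\n' "$sample_sysctl" | sysctl_not_enabled
```

On a node, the live values could be fed in with `sysctl net.bridge.bridge-nf-call-ip6tables net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward | sysctl_not_enabled`. If the bridge keys are reported as missing, the br_netfilter module is probably not loaded.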
Start kubelet and enable it to start at boot
systemctl start kubelet
systemctl enable kubelet
Run the following command to initialize the master node.
kubeadm init \
  --apiserver-advertise-address=192.168.56.11 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.15.0 \
  --service-cidr=10.1.0.0/16 \
  --pod-network-cidr=10.2.0.0/16 \
  --service-dns-domain=cluster.local \
  --ignore-preflight-errors=Swap \
  --ignore-preflight-errors=NumCPU
First, a look at a few of the key parameters
The whole process may take around 5 minutes. The complete output is as follows:
[init] Using Kubernetes version: v1.15.0
[preflight] Running pre-flight checks
    [WARNING NumCPU]: the number of available CPUs 1 is less than the required 2
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-node1 localhost] and IPs [192.168.56.11 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-node1 localhost] and IPs [192.168.56.11 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-node1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.1.0.1 192.168.56.11]
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 41.503341 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.15" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node k8s-node1 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8s-node1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 5wf7mp.v61tv0s23ewbun1l
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.56.11:6443 --token 5wf7mp.v61tv0s23ewbun1l \
    --discovery-token-ca-cert-hash sha256:ca524d88dbcc9a79c70c4cf21fba7252c0f12e5ab0fe9674e7f6998ab9fb5901
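The last part of the output above tells us two things:
- A few commands need to be run to create the configuration file under the user's home directory
- The command that other nodes use to join the cluster
Following the instructions in the output above, run these commands.
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config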
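The configuration file records the address of the API Server, so kubectl commands run afterwards can connect to the API Server directly.
Use the following command to check the status of the components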
kubectl get cs
The output is as follows
NAME                 STATUS    MESSAGE              ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true"}
A normal response here shows that the API server is running properly
Get the Node information
kubectl get node
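The output is as follows
NAME        STATUS     ROLES    AGE     VERSION
k8s-node1   NotReady   master   6m48s   v1.15.0
k8s-node1 is still in the NotReady state because no network plugin has been installed yet. The next step is to install one.
The plugin is deployed by applying yaml configuration files with kubectl. Run the following two commands in turn.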
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/canal/rbac.yaml
Output
clusterrole.rbac.authorization.k8s.io/calico created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/canal-flannel created
clusterrolebinding.rbac.authorization.k8s.io/canal-calico created
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/canal/canal.yaml
Output
configmap/canal-config created
daemonset.extensions/canal created
serviceaccount/canal created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
Run the following command to see the Pods that have been started
kubectl get pods --all-namespaces
The output is
NAMESPACE     NAME                                READY   STATUS              RESTARTS   AGE
kube-system   canal-rj2fm                         0/3     ContainerCreating   0          44s
kube-system   coredns-bccdc95cf-rgtbx             0/1     Pending             0          11m
kube-system   coredns-bccdc95cf-x6j8l             0/1     Pending             0          11m
kube-system   etcd-k8s-node1                      1/1     Running             0          11m
kube-system   kube-apiserver-k8s-node1            1/1     Running             0          10m
kube-system   kube-controller-manager-k8s-node1   1/1     Running             0          10m
kube-system   kube-proxy-zcssq                    1/1     Running             0          11m
kube-system   kube-scheduler-k8s-node1            1/1     Running             0          10m
canal is still creating its containers and coredns is in the Pending state. Downloading the canal images takes some time; once they have been pulled, coredns switches to Running.
Note that errors such as ErrImagePull probably mean that the registry hosting the canal images cannot be reached from your network; in that case a VPN is needed for the download to succeed.
Once the images have been downloaded, run kubectl get pods --all-namespaces again. All statuses are now normal, as shown below:
NAMESPACE     NAME                                READY   STATUS    RESTARTS   AGE
kube-system   canal-rj2fm                         3/3     Running   0          35m
kube-system   coredns-bccdc95cf-rgtbx             1/1     Running   0          46m
kube-system   coredns-bccdc95cf-x6j8l             1/1     Running   0          46m
kube-system   etcd-k8s-node1                      1/1     Running   1          46m
kube-system   kube-apiserver-k8s-node1            1/1     Running   1          45m
kube-system   kube-controller-manager-k8s-node1   1/1     Running   1          45m
kube-system   kube-proxy-zcssq                    1/1     Running   1          46m
kube-system   kube-scheduler-k8s-node1            1/1     Running   1          45m
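Rather than scanning the STATUS column by eye on every retry, the pods that are still starting can be filtered out. A small sketch, assuming the default column layout of `kubectl get pods --all-namespaces` (STATUS in the fourth column); `pods_not_running` is a hypothetical helper name and the sample rows are taken from the first listing above.

```shell
# Print "namespace/name status" for every pod that is not Running.
# Expects `kubectl get pods --all-namespaces` output (with header) on stdin.
pods_not_running() {
  awk 'NR > 1 && $4 != "Running" { print $1 "/" $2, $4 }'
}

# Sample rows taken from the first listing above.
sample_pods='NAMESPACE     NAME                      READY   STATUS              RESTARTS   AGE
kube-system   canal-rj2fm               0/3     ContainerCreating   0          44s
kube-system   coredns-bccdc95cf-rgtbx   0/1     Pending             0          11m
kube-system   etcd-k8s-node1            1/1     Running             0          11m'

printf '%s\n' "$sample_pods" | pods_not_running
```

Against a live cluster this would be `kubectl get pods --all-namespaces | pods_not_running`; an empty result means every pod has reached Running.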
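Now run kubectl get node again to check the master node; its status is Ready, as shown below
NAME        STATUS   ROLES    AGE   VERSION
k8s-node1   Ready    master   48m   v1.15.0
First, run the following command on the master node to obtain the command for adding nodes to the cluster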
kubeadm token create --print-join-command
The output is
kubeadm join 192.168.56.11:6443 --token eb0k80.qhqbjon1mh55w803 --discovery-token-ca-cert-hash sha256:ca524d88dbcc9a79c70c4cf21fba7252c0f12e5ab0fe9674e7f6998ab9fb5901
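Then run the command above on every worker node. Kubernetes will then use DaemonSets to deploy canal and kube-proxy on all nodes.
Note that errors such as ErrImagePull probably mean that the image registry cannot be reached from your network; in that case a VPN is needed for the download to succeed.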
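The join command carries three pieces of information: the API server endpoint, a bootstrap token, and a hash of the cluster CA certificate. The sketch below pulls the two flag values out of the command text; `flag_value` is a hypothetical helper, and the command string is the one printed above.

```shell
# Print the value that follows a given flag in a command line.
# $1: flag name (for example --token), $2: the full command line.
flag_value() {
  printf '%s\n' "$2" | awk -v flag="$1" '{ for (i = 1; i < NF; i++) if ($i == flag) print $(i + 1) }'
}

join_cmd='kubeadm join 192.168.56.11:6443 --token eb0k80.qhqbjon1mh55w803 --discovery-token-ca-cert-hash sha256:ca524d88dbcc9a79c70c4cf21fba7252c0f12e5ab0fe9674e7f6998ab9fb5901'

flag_value --token "$join_cmd"
flag_value --discovery-token-ca-cert-hash "$join_cmd"
```

The token here differs from the one printed by kubeadm init while the CA hash is identical: bootstrap tokens expire (after 24 hours by default), so each call to `kubeadm token create` issues a new token for the same cluster CA.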
Wait until everything has been deployed, then run the following commands on the master node to inspect the cluster.
List all daemonsets
kubectl get daemonset --all-namespaces
NAMESPACE     NAME         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                 AGE
kube-system   canal        4         4         4       4            4           beta.kubernetes.io/os=linux   16h
kube-system   kube-proxy   4         4         4       4            4           beta.kubernetes.io/os=linux   17h
READY and AVAILABLE are both 4, which means the daemonset pods are available on all 4 nodes.
List all Pods
kubectl get pod --all-namespaces
NAMESPACE     NAME                                READY   STATUS    RESTARTS   AGE
kube-system   canal-6w2zb                         3/3     Running   12         16h
kube-system   canal-jgw4m                         3/3     Running   47         16h
kube-system   canal-klmfs                         3/3     Running   33         16h
kube-system   canal-rj2fm                         3/3     Running   12         17h
kube-system   coredns-bccdc95cf-rgtbx             1/1     Running   3          17h
kube-system   coredns-bccdc95cf-x6j8l             1/1     Running   3          17h
kube-system   etcd-k8s-node1                      1/1     Running   4          17h
kube-system   kube-apiserver-k8s-node1            1/1     Running   6          17h
kube-system   kube-controller-manager-k8s-node1   1/1     Running   4          17h
kube-system   kube-proxy-7bk98                    1/1     Running   0          16h
kube-system   kube-proxy-cd8xj                    1/1     Running   0          16h
kube-system   kube-proxy-xfzfp                    1/1     Running   0          16h
kube-system   kube-proxy-zcssq                    1/1     Running   4          17h
kube-system   kube-scheduler-k8s-node1            1/1     Running   4          17h
List all nodes
kubectl get node
NAME        STATUS   ROLES    AGE   VERSION
k8s-node1   Ready    master   17h   v1.15.0
k8s-node2   Ready    <none>   16h   v1.15.0
k8s-node3   Ready    <none>   16h   v1.15.0
k8s-node4   Ready    <none>   16h   v1.15.0
All nodes are now in the Ready state.
With the steps above, the k8s cluster (1 master node and 3 worker nodes) is set up and all nodes are working properly. We will now test the cluster by deploying an Nginx application.
Create an Nginx application with a single Pod
kubectl create deployment nginx --image=nginx:alpine
deployment.apps/nginx created
View the pod details
kubectl get pod -o wide
NAME                   READY   STATUS    RESTARTS   AGE   IP         NODE        NOMINATED NODE   READINESS GATES
nginx-8f6959bd-6pth6   1/1     Running   0          73s   10.2.1.2   k8s-node2   <none>           <none>
The Pod's IP address is allocated from the range given by the --pod-network-cidr=10.2.0.0/16 parameter used when initializing the master node.
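What "allocated from the range" means can be checked with a little integer arithmetic: an address belongs to a CIDR block when both share the same leading prefix bits. The following is a sketch in plain shell arithmetic; `ip_to_int` and `ip_in_cidr` are hypothetical helper names, and no input validation is attempted.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The function body runs in a subshell so the IFS change stays local.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed when address $1 lies inside CIDR block $2 (for example 10.2.0.0/16).
ip_in_cidr() {
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  # Network mask with the top $bits bits set (4294967295 is 2^32 - 1).
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( addr & mask )) -eq $(( net & mask )) ]
}

ip_in_cidr 10.2.1.2 10.2.0.0/16 && echo "10.2.1.2 is inside 10.2.0.0/16"
ip_in_cidr 192.168.56.11 10.2.0.0/16 || echo "192.168.56.11 is outside 10.2.0.0/16"
```

The same check can confirm that the pod range does not contain any node address, which is worth verifying before picking values for --pod-network-cidr.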
Access nginx
Access nginx through the Pod IP address obtained above, 10.2.1.2
curl -I http://10.2.1.2
HTTP/1.1 200 OK
Server: nginx/1.17.1
Date: Thu, 18 Jul 2019 07:53:22 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 25 Jun 2019 14:15:08 GMT
Connection: keep-alive
ETag: "5d122c6c-264"
Accept-Ranges: bytes
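When the same check is repeated for several addresses, only the status code matters. A minimal sketch that reads it from the first header line; `http_status` is a hypothetical helper name and the sample holds the first header lines of the response shown above.

```shell
# Print the status code from HTTP response headers on stdin.
# The first line has the form "HTTP/1.1 200 OK".
http_status() {
  awk 'NR == 1 { print $2 }'
}

# First header lines of the response shown above.
sample_headers='HTTP/1.1 200 OK
Server: nginx/1.17.1
Content-Type: text/html'

printf '%s\n' "$sample_headers" | http_status
```

In practice this would be `curl -sI http://10.2.1.2 | http_status`. curl can also report the code itself with `curl -s -o /dev/null -w '%{http_code}\n' http://10.2.1.2`.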
Scale out to 2 replicas
kubectl scale deployment nginx --replicas=2
deployment.extensions/nginx scaled
List the pods
kubectl get pod -o wide
NAME                   READY   STATUS    RESTARTS   AGE     IP         NODE        NOMINATED NODE   READINESS GATES
nginx-8f6959bd-6pth6   1/1     Running   0          6m44s   10.2.1.2   k8s-node2   <none>           <none>
nginx-8f6959bd-l56n9   1/1     Running   0          28s     10.2.3.2   k8s-node4   <none>           <none>
There are now two replicas of the Pod, each with its own IP. Accessing the newly added replica by its IP shows that it serves requests just the same.
curl -I http://10.2.3.2
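HTTP/1.1 200 OK
Server: nginx/1.17.1
Date: Thu, 18 Jul 2019 07:58:27 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 25 Jun 2019 14:15:08 GMT
Connection: keep-alive
ETag: "5d122c6c-264"
Accept-Ranges: bytes
**Expose as a service**
Multiple replicas need to be exposed as a single service so that they are reachable through one entry point. The service is given a Cluster IP, allocated from the range given by the --service-cidr=10.1.0.0/16 parameter used when initializing the master node. The service automatically load-balances across the replicas.
Run the following command to expose the nginx application as a service, with NodePort enabled so that a port is mapped on every node for external access.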
kubectl expose deployment nginx --port=80 --type=NodePort
service/nginx exposed
Run the following command to list the services
kubectl get service
NAME         TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP   10.1.0.1      <none>        443/TCP        19h
nginx        NodePort    10.1.59.105   <none>        80:32502/TCP   80s
The nginx service has the vip 10.1.59.105, and port 32502 on each node is mapped to port 80 of nginx.
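The node port is assigned by the cluster, so it will differ between installations. Here is a sketch of reading it from the PORT(S) column, assuming the default column layout of `kubectl get service`; `service_node_port` is a hypothetical helper name and the sample is the listing above.

```shell
# Print the node port of a service from `kubectl get service` output on stdin.
# The PORT(S) column has the form <port>:<nodePort>/<protocol>, e.g. 80:32502/TCP.
service_node_port() {
  awk -v svc="$1" '$1 == svc { split($5, a, "[:/]"); print a[2] }'
}

# Sample taken from the service listing above.
sample_services='NAME         TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP   10.1.0.1      <none>        443/TCP        19h
nginx        NodePort    10.1.59.105   <none>        80:32502/TCP   80s'

printf '%s\n' "$sample_services" | service_node_port nginx
```

kubectl can return the same value without text parsing: `kubectl get service nginx -o jsonpath='{.spec.ports[0].nodePort}'`.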
Run the following command to access the service through the vip
curl -I http://10.1.59.105
HTTP/1.1 200 OK
Server: nginx/1.17.1
Date: Thu, 18 Jul 2019 08:10:45 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 25 Jun 2019 14:15:08 GMT
Connection: keep-alive
ETag: "5d122c6c-264"
Accept-Ranges: bytes
On the host machine, run the following command to access the service through a node's IP
curl -I http://192.168.7.11:32502
HTTP/1.1 200 OK
Server: nginx/1.17.1
Date: Thu, 18 Jul 2019 08:14:31 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 25 Jun 2019 14:15:08 GMT
Connection: keep-alive
ETag: "5d122c6c-264"
Accept-Ranges: bytes
Because the host machine cannot reach VirtualBox's NAT network directly, the Host-Only network IP is used here.