Our department now maintains Milvus and decided to build its own Kubernetes-based deployment, starting with an offline install of K8s 1.28.0.
At the start of the year, to support the company's large-model applications, our department was tasked with maintaining the Milvus vector database, and the job landed on me. Since Milvus is an open-source vector database (the commercially licensed edition being, of course, too expensive), we decided to research and deploy the Kubernetes-based version ourselves.
This post covers the first step: an offline deployment of K8s 1.28.0 with Calico as the network plugin, adapted from mainstream approaches. If you spot mistakes, corrections are welcome.
1. Operating System
############## Kylin Linux Version #################
Release:
Kylin Linux Advanced Server release V10 (Sword)
Kernel:
4.19.90-25.46.v2101.ky10.aarch64
Build:
Kylin Linux Advanced Server
release V10 (SP2) /(Sword)-aarch64-Build09/20210524
#################################################
Deployment is validated on four 8C16G virtual machines, IPs 10.194.174.91-94. For production, a dedicated apiserver VIP is recommended for high availability.
2. Node Initialization
Configure the hostname on each node (run the matching command on its node):
# set the hostname on the corresponding node
hostnamectl set-hostname kylin-milvus-1
hostnamectl set-hostname kylin-milvus-2
hostnamectl set-hostname kylin-milvus-3
hostnamectl set-hostname kylin-milvus-4
Then configure resolution for the other nodes (run on all nodes):
# append node name resolution to the local /etc/hosts
echo "10.194.174.91 kylin-milvus-1" >> /etc/hosts
echo "10.194.174.92 kylin-milvus-2" >> /etc/hosts
echo "10.194.174.93 kylin-milvus-3" >> /etc/hosts
echo "10.194.174.94 kylin-milvus-4" >> /etc/hostsxxx
Kernel initialization on each node:
# stop and disable the firewall
systemctl stop firewalld
systemctl disable firewalld
# disable swap (it hurts performance, and kubelet refuses to start with swap on by default)
swapoff -a
sed -i '/swap/s/^/#/' /etc/fstab
Enable the kernel module that lets iptables inspect bridged traffic:
cat <<EOF | tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
# enable IP forwarding (the original sed assumes the key already exists in /etc/sysctl.conf; fall back to appending it)
grep -q '^net.ipv4.ip_forward' /etc/sysctl.conf \
  && sed -i 's/net.ipv4.ip_forward=0/net.ipv4.ip_forward=1/' /etc/sysctl.conf \
  || echo 'net.ipv4.ip_forward=1' >> /etc/sysctl.conf
Apply the kernel configuration (load the module first so the bridge sysctls exist):
modprobe br_netfilter
sysctl --system
sysctl -p
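To confirm the settings took effect, read them back (standard verification commands, not part of the original post):
lsmod | grep br_netfilter
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward
Both sysctl values should report 1.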
3. Node Disk Initialization
Create a partition for containerd's data. On the test machines, /dev/vda is used as containerd's data disk (volumes managed with LVM); adjust to your environment. The three empty lines after the p command accept the defaults, giving the whole disk to containerd:
fdisk /dev/vda <<EOF
n
p



t
8e
w
EOF
Persistently mount the containerd volume:
pvcreate /dev/vda1
pvscan
vgcreate vg_containerd /dev/vda1
vgdisplay
lvcreate -l 100%FREE -n lv_containerd vg_containerd
lvdisplay
mkfs.xfs /dev/vg_containerd/lv_containerd
mkdir /var/lib/containerd/
mount /dev/vg_containerd/lv_containerd /var/lib/containerd/
echo "/dev/vg_containerd/lv_containerd /var/lib/containerd/ xfs defaults 0 0" >> /etc/fstab
4. Offline Package Preparation
containerd:
https://k8s.huweihuang.com/project/runtime/containerd/install-containerd#id-2.-li-xian-er-jin-zhi-an-zhuang-containerd
k8s rpms:
https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-aarch64/Packages/?userCode=okjhlpr5
https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-aarch64/Packages/549fe173ace5fde6ca198ff32e7f647e6fadab7fda6bcdeaf38feedd15a4a126-kubeadm-1.28.0-0.aarch64.rpm
https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-aarch64/Packages/6e4932ec181931cc5b1962859624ab612f14b202e4cbab47c45e92400fb48f20-kubectl-1.28.0-0.aarch64.rpm
https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-aarch64/Packages/bb5555c7c997ab284aa02b1012546c52210199f7d16543cfc41de2cf87bbd3c7-kubelet-1.28.0-0.aarch64.rpm
https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-aarch64/Packages/46006252a3921115803b91decee20067c7671514b77014681ffbd50f5bcf92b7-cri-tools-1.24.0-0.aarch64.rpm
https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-aarch64/Packages/d0d8d44785388fdca100350365a2a06d46185ca4304216a94b7646d39c4f1ff9-kubernetes-cni-1.2.0-0.aarch64.rpm
https://update.cs2c.com.cn/NS/V10/V10SP2/os/adv/lic/base/aarch64/Packages/conntrack-tools-1.4.6-2.ky10.aarch64.rpm
https://update.cs2c.com.cn/NS/V10/V10SP2/os/adv/lic/base/aarch64/Packages/conntrack-tools-help-1.4.6-2.ky10.noarch.rpm
https://update.cs2c.com.cn/NS/V10/V10SP2/os/adv/lic/base/aarch64/Packages/libnetfilter_queue-1.0.5-1.ky10.aarch64.rpm
https://update.cs2c.com.cn/NS/V10/V10SP2/os/adv/lic/base/aarch64/Packages/libnetfilter_cttimeout-1.0.0-13.ky10.aarch64.rpm
https://update.cs2c.com.cn/NS/V10/V10SP2/os/adv/lic/base/aarch64/Packages/libnetfilter_cthelper-1.0.0-15.ky10.aarch64.rpm
https://update.cs2c.com.cn/NS/V10/V10SP2/os/adv/lic/base/aarch64/Packages/socat-1.7.3.2-8.ky10.aarch64.rpm
https://update.cs2c.com.cn/NS/V10/V10SP2/os/adv/lic/base/aarch64/Packages/socat-help-1.7.3.2-8.ky10.noarch.rpm
k8s certs update:
https://github.com/yuyicai/update-kube-cert/blob/master/update-kubeadm-cert.sh
helm binary package:
https://get.helm.sh/helm-v3.14.3-linux-arm64.tar.gz
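Helm itself is not needed until Milvus is deployed, but for completeness, installing the downloaded binary is a one-liner (install path assumed):
tar zxf helm-v3.14.3-linux-arm64.tar.gz
mv linux-arm64/helm /usr/local/bin/helm
helm version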
5. Install Dependencies
The bundle contains the base dependencies for k8s and containerd; prepare it in advance.
# fetch the offline bundle (replace with the actual URL of your internal file server)
wget kylin_arm_k8s1.28.tar.gz
mkdir /root/k8s
tar zxf kylin_arm_k8s1.28.tar.gz -C /root/k8s
Install the dependencies for k8s and containerd:
cd /root/k8s/rpms/basic-tools
yum install -y ./*rpm
cd /root/k8s/rpms/container-deps
yum install -y ./*rpm
6. Install Containerd
K8s 1.28 no longer supports Docker (dockershim was removed in 1.24), so containerd and its CLI tools are installed following the official containerd docs; `nerdctl -n k8s.io` can stand in for the old docker commands.
cd /root/k8s/rpms/containerd
echo "--------------install containerd--------------"
tar Cxzvf /usr/local containerd-1.6.36-linux-arm64.tar.gz
echo "--------------install containerd service--------------"
cp containerd.service /lib/systemd/system/
# pre-tuned containerd config directory; containerd ships a default config, adjust to your actual setup
cp -r containerd /etc/
echo "--------------install runc--------------"
chmod +x runc.arm64
mv runc.arm64 /usr/local/bin/runc
echo "--------------install cni plugins--------------"
rm -fr /opt/cni/bin
mkdir -p /opt/cni/bin
tar Cxzvf /opt/cni/bin cni-plugins-linux-arm64-v1.1.1.tgz
echo "--------------install nerdctl--------------"
tar Cxzvf /usr/local/bin nerdctl-0.21.0-linux-arm64.tar.gz
echo "--------------install crictl--------------"
tar Cxzvf /usr/local/bin crictl-v1.24.2-linux-arm64.tar.gz
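The pre-tuned config directory copied above is not shown in the original post. As a reference, a default config can be generated and the two settings that most often trip up kubeadm clusters adjusted; the registry path below is an assumption for this environment:
containerd config default > /etc/containerd/config.toml
# then edit /etc/containerd/config.toml:
#   [plugins."io.containerd.grpc.v1.cri"]
#     sandbox_image = "repo.xxbank.cn/pause:3.9"   # pull the pause image from the offline registry (assumed path)
#   [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
#     SystemdCgroup = true                         # match kubelet's systemd cgroup driver (the 1.28 default)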
7. Install kubelet/kubeadm/kubectl
cd /root/k8s/rpms/k8s
yum install -y ./*rpm
Enable the services at boot:
systemctl daemon-reload
systemctl daemon-reexec
systemctl restart containerd
systemctl enable containerd
systemctl enable kubelet
echo 'alias crictl="crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock"' >> /etc/profile
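The alias only applies to login shells; a more robust alternative is crictl's own config file:
cat <<EOF | tee /etc/crictl.yaml
runtime-endpoint: unix:///var/run/containerd/containerd.sock
image-endpoint: unix:///var/run/containerd/containerd.sock
EOF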
Configure the image registry and log in:
mkdir -p /etc/containerd/certs.d/repo.xxbank.cn
cat <<EOF | tee /etc/containerd/certs.d/repo.xxbank.cn/hosts.toml
server = 'http://repo.xxbank.cn'
[host.'http://repo.xxbank.cn']
capabilities=['pull', 'resolve', 'push']
# skip_verify = true
EOF
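Note that containerd only consults certs.d/ when its registry config points there; this is assumed to be included in the pre-tuned config copied during the containerd install:
# in /etc/containerd/config.toml
#   [plugins."io.containerd.grpc.v1.cri".registry]
#     config_path = "/etc/containerd/certs.d"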
# log in to the registry to verify
nerdctl login repo.xxbank.cn -u xxxx
8. Initialize the K8s Cluster
K8s cluster certificates are valid for one year by default. In a test environment you can generate long-lived CA certs before initializing the cluster (not recommended for production). The script writes the certs to /etc/kubernetes/pki/, where the subsequent kubeadm init picks them up.
cd /root/k8s/rpms/k8s/
./update-kubeadm-cert.sh --action gen-ca
After generating the certs, the script also prints hints on how to proceed:
[INFO] DONE!!! generated CA for new cluster.
# create new cluster after generating CA, you can use the following command:
kubeadm init [options]
# after running kubeadm init, update certificates for 100 years
bash update-kubeadm-cert.sh --cri containerd --days 36500
Generate the cluster initialization config file:
kubeadm config print init-defaults > kubeadm-init.yaml
Edit the node name, pod CIDR, service CIDR, API server address and port, and so on; a sketch of the fields that usually need changes follows.
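A minimal sketch of kubeadm-init.yaml (the VIP comment, CIDRs, and registry below are illustrative assumptions, not values from the original post):
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.194.174.91
  bindPort: 6443
nodeRegistration:
  name: kylin-milvus-1
  criSocket: unix:///var/run/containerd/containerd.sock
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: 1.28.0
controlPlaneEndpoint: 10.194.174.91:6443   # point at the apiserver VIP in production
imageRepository: repo.xxbank.cn/k8s        # offline registry (assumed path)
networking:
  podSubnet: 172.16.0.0/16                 # must match Calico's ipPools.cidr later
  serviceSubnet: 10.96.0.0/12
With the file adjusted, initialize the cluster: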
kubeadm init --config kubeadm-init.yaml --upload-certs --v=5
On success, output like the following explains how to join further nodes:
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join 10.194.174.91:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:583f191220c0e437ba46c843510e43e3e14b3e3e4c0e5426f2e2574466e54b29 \
--control-plane --certificate-key f29e5b21a1ac2f4a907212f63eca8e26ca39a9bf0e7272c018b539247f9ed5d3
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.194.174.91:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:583f191220c0e437ba46c843510e43e3e14b3e3e4c0e5426f2e2574466e54b29
In the test environment, to keep the long-lived certs, the remaining component certificates on the current master must also be renewed (the unit is days):
./update-kubeadm-cert.sh --cri containerd --days 36500
After the renewal, check certificate expiry with kubeadm:
[root@kylin-milvus-1 ~]# kubeadm certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 May 03, 2125 07:06 UTC   99y             ca                      no
apiserver                  May 03, 2125 07:06 UTC   99y             ca                      no
apiserver-etcd-client      May 03, 2125 07:06 UTC   99y             etcd-ca                 no
apiserver-kubelet-client   May 03, 2125 07:06 UTC   99y             ca                      no
controller-manager.conf    May 03, 2125 07:06 UTC   99y             ca                      no
etcd-healthcheck-client    May 03, 2125 07:06 UTC   99y             etcd-ca                 no
etcd-peer                  May 03, 2125 07:06 UTC   99y             etcd-ca                 no
etcd-server                May 03, 2125 07:06 UTC   99y             etcd-ca                 no
front-proxy-client         May 03, 2125 07:06 UTC   99y             front-proxy-ca          no
scheduler.conf             May 03, 2125 07:06 UTC   99y             ca                      no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      May 03, 2125 07:05 UTC   99y             no
etcd-ca                 May 03, 2125 07:05 UTC   99y             no
front-proxy-ca          May 03, 2125 07:05 UTC   99y             no
9. Join the Remaining Nodes to the Cluster
For high availability the cluster needs three master nodes, so the next nodes join with the control-plane role:
kubeadm join 10.194.174.91:6443 \
--token yk8mhc.sy8dvsegsh4wxlu7 \
--discovery-token-ca-cert-hash sha256:834aef146fc7d4421933fc00369fe982b4374831949c8febd6b0e808d442438e \
--control-plane \
--certificate-key b98d35b2e06335c6c1b71705d5a3671cd1fedcd32e0eee1937575749b0aa5ed6 \
--node-name kylin-milvus-2
After each additional master joins, renew the certificates on that node as well:
./update-kubeadm-cert.sh --cri containerd --days 36500
Worker nodes do not need the cert-renewal script after joining:
kubeadm join 10.194.174.91:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:583f191220c0e437ba46c843510e43e3e14b3e3e4c0e5426f2e2574466e54b29 \
--node-name kylin-milvus-4
Once all nodes have joined, kubectl get node shows them as NotReady because no network plugin is installed yet; this deployment uses Calico.
10. Install Calico
Calico is deployed here via the Tigera operator.
In custom-resources.yaml, ipPools.cidr must match the pod CIDR chosen at cluster initialization. On multi-NIC nodes, set nodeAddressAutodetectionV4.interface to the correct interface name, otherwise Calico may bind to the wrong NIC and leave nodes unreachable. The image registry can also be specified in the file, and the stock config files ship in the Calico source package; a sketch of the relevant fields follows.
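A minimal sketch of the Installation resource with those fields (the CIDR, registry, and interface name are assumptions for this environment):
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  registry: repo.xxbank.cn            # offline image registry (assumed)
  calicoNetwork:
    nodeAddressAutodetectionV4:
      interface: eth0                 # NIC carrying 10.194.174.0/24 (assumed name)
    ipPools:
      - cidr: 172.16.0.0/16           # must equal kubeadm's podSubnet
        encapsulation: VXLANCrossSubnet
        natOutgoing: Enabled
Then apply the operator and the custom resources: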
cd /root/k8s/calico-yaml
kubectl create -f tigera-operator.yaml
kubectl create -f custom-resources.yaml
Once Calico is deployed, all nodes report Ready:
[root@kylin-milvus-1 ~]# kubectl get node
NAME             STATUS   ROLES           AGE   VERSION
kylin-milvus-1   Ready    control-plane   55m   v1.28.0
kylin-milvus-2   Ready    control-plane   52m   v1.28.0
kylin-milvus-3   Ready    control-plane   50m   v1.28.0
kylin-milvus-4   Ready    <none>          49m   v1.28.0
List all pods:
[root@kylin-milvus-1 ~]# kubectl get pod -A
NAMESPACE          NAME                                      READY   STATUS    RESTARTS      AGE
calico-apiserver   calico-apiserver-7dd4779598-6kzp6         1/1     Running   0             43m
calico-apiserver   calico-apiserver-7dd4779598-8gd6w         1/1     Running   0             43m
calico-system      calico-kube-controllers-fdb7c55d7-274wc   1/1     Running   0             44m
calico-system      calico-node-94cvs                         1/1     Running   0             44m
calico-system      calico-node-9n7lj                         1/1     Running   0             44m
calico-system      calico-node-t729v                         1/1     Running   0             44m
calico-system      calico-node-vcds6                         1/1     Running   0             44m
calico-system      calico-typha-7b5f5699cf-7562p             1/1     Running   0             44m
calico-system      calico-typha-7b5f5699cf-cktzl             1/1     Running   0             44m
calico-system      csi-node-driver-d25ts                     2/2     Running   0             44m
calico-system      csi-node-driver-hnlhf                     2/2     Running   0             44m
calico-system      csi-node-driver-n7tg6                     2/2     Running   0             44m
calico-system      csi-node-driver-s6gsx                     2/2     Running   0             44m
kube-system        coredns-6869d96769-czc77                  1/1     Running   0             55m
kube-system        coredns-6869d96769-wmxjp                  1/1     Running   0             55m
kube-system        etcd-kylin-milvus-1                       1/1     Running   4 (55m ago)   55m
kube-system        etcd-kylin-milvus-2                       1/1     Running   2             52m
kube-system        etcd-kylin-milvus-3                       1/1     Running   2 (50m ago)   50m
kube-system        kube-apiserver-kylin-milvus-1             1/1     Running   3 (55m ago)   55m
kube-system        kube-apiserver-kylin-milvus-2             1/1     Running   2             52m
kube-system        kube-apiserver-kylin-milvus-3             1/1     Running   2 (50m ago)   50m
kube-system        kube-controller-manager-kylin-milvus-1    1/1     Running   6 (52m ago)   55m
kube-system        kube-controller-manager-kylin-milvus-2    1/1     Running   2             52m
kube-system        kube-controller-manager-kylin-milvus-3    1/1     Running   1 (50m ago)   50m
kube-system        kube-proxy-cq75q                          1/1     Running   0             55m
kube-system        kube-proxy-lr9jz                          1/1     Running   0             52m
kube-system        kube-proxy-s6pnk                          1/1     Running   0             50m
kube-system        kube-proxy-vmksw                          1/1     Running   0             49m
kube-system        kube-scheduler-kylin-milvus-1             1/1     Running   7 (52m ago)   55m
kube-system        kube-scheduler-kylin-milvus-2             1/1     Running   2 (51m ago)   52m
kube-system        kube-scheduler-kylin-milvus-3             1/1     Running   1 (50m ago)   50m
tigera-operator    tigera-operator-79dbfdbd7f-jrjhf          1/1     Running   0             45m
With Calico fully deployed, calicoctl (a standalone binary; just download and run) can also verify peer status: a STATE of up with INFO Established means connectivity to that node is healthy, while start indicates a problem and the network or Calico configuration needs checking.
[root@kylin-milvus-1 k8s]# calicoctl node status
Calico process is running.
IPv4 BGP status
+---------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+---------------+-------------------+-------+----------+-------------+
| 10.194.174.92 | node-to-node mesh | up | 07:18:07 | Established |
| 10.194.174.93 | node-to-node mesh | up | 07:18:09 | Established |
| 10.194.174.94 | node-to-node mesh | up | 07:18:09 | Established |
+---------------+-------------------+-------+----------+-------------+
11. Conclusion
With that, the base k8s environment is fully deployed. If you need the base images, config files, or dependency bundles referenced in this post, leave a comment or message me (no charge; likes and recommendations are appreciated).
Original article: https://mp.weixin.qq.com/s/qkZw4KNpYwUZhR4pP-_W1g