Preparation
KVM virtualization
On a physical host running Ubuntu:
apt install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils -y
On a VM running Ubuntu:
apt install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils virt-manager -y
On a VM running Rocky Linux:
yum install -y qemu-kvm libvirt virt-install bridge-utils
Nested virtualization
If the test machine is itself a VM, there is one more task: enabling nested virtualization.
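Before touching GRUB, it is worth checking whether nested virtualization is already enabled. A quick sketch (kvm_intel applies to Intel CPUs; kvm_amd to AMD):

```shell
# Prints Y (or 1) when nested virtualization is enabled; falls back to a
# message when no KVM module is loaded (e.g. on a machine without KVM).
nested=$(cat /sys/module/kvm_intel/parameters/nested 2>/dev/null \
  || cat /sys/module/kvm_amd/parameters/nested 2>/dev/null \
  || echo "kvm module not loaded")
echo "$nested"
```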
Modify the GRUB settings
After installing the KVM-related packages above, the GRUB settings also need to be modified.
Edit /etc/default/grub and append the last part:
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=rl/root intel_iommu=on"
Regenerate the GRUB configuration file:
# For BIOS systems:
grub2-mkconfig -o /boot/grub2/grub.cfg
# For UEFI systems:
grub2-mkconfig -o /boot/efi/EFI/rocky/grub.cfg
# Determine whether the system boots via BIOS or UEFI
ls /sys/firmware/efi # if this directory exists, the system is UEFI
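The check above can be wrapped in a small one-liner that prints which firmware the machine booted with:

```shell
# /sys/firmware/efi only exists when the kernel was booted via UEFI
if [ -d /sys/firmware/efi ]; then firmware=UEFI; else firmware=BIOS; fi
echo "boot firmware: $firmware"
```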
Check QEMU status
After installation, check that virtualization works with the virt-host-validate qemu command:
virt-host-validate qemu
QEMU: Checking for hardware virtualization : PASS
QEMU: Checking if device /dev/kvm exists : PASS
QEMU: Checking if device /dev/kvm is accessible : PASS
QEMU: Checking if device /dev/vhost-net exists : PASS
QEMU: Checking if device /dev/net/tun exists : PASS
QEMU: Checking for cgroup 'cpu' controller support : PASS
QEMU: Checking for cgroup 'cpuacct' controller support : PASS
QEMU: Checking for cgroup 'cpuset' controller support : PASS
QEMU: Checking for cgroup 'memory' controller support : PASS
QEMU: Checking for cgroup 'devices' controller support : PASS
QEMU: Checking for cgroup 'blkio' controller support : PASS
QEMU: Checking for device assignment IOMMU support : PASS
QEMU: Checking if IOMMU is enabled by kernel : PASS
QEMU: Checking for secure guest support : WARN (Unknown if this platform has Secure Guest support)
Deploying KubeVirt
The deployment follows the official guide at https://kubevirt.io/quickstart_cloud/ almost verbatim:
Deploy KubeVirt with YAML
KubeVirt can be installed using the KubeVirt operator, which manages the lifecycle of all the KubeVirt core components.
Check the latest KubeVirt release:
export VERSION=$(curl -s https://storage.googleapis.com/kubevirt-prow/release/kubevirt/kubevirt/stable.txt)
echo $VERSION
v1.6.0
# Deploy kubevirt-operator
kubectl create -f "https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/kubevirt-operator.yaml"
# Output:
namespace/kubevirt created
customresourcedefinition.apiextensions.k8s.io/kubevirts.kubevirt.io created
priorityclass.scheduling.k8s.io/kubevirt-cluster-critical created
clusterrole.rbac.authorization.k8s.io/kubevirt.io:operator created
serviceaccount/kubevirt-operator created
role.rbac.authorization.k8s.io/kubevirt-operator created
rolebinding.rbac.authorization.k8s.io/kubevirt-operator-rolebinding created
clusterrole.rbac.authorization.k8s.io/kubevirt-operator created
clusterrolebinding.rbac.authorization.k8s.io/kubevirt-operator created
deployment.apps/virt-operator created
# Deploy the custom resource
# Again use kubectl to deploy the KubeVirt custom resource definitions:
kubectl create -f "https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/kubevirt-cr.yaml"
# Verify components
#By default KubeVirt will deploy 7 pods, 3 services, 1 daemonset, 3 deployment apps, 3 replica sets.
kubectl get kubevirt.kubevirt.io/kubevirt -n kubevirt -o=jsonpath="{.status.phase}"
Deploying (still deploying; wait a bit longer)
Check the components:
kubectl get all -n kubevirt
Warning: kubevirt.io/v1 VirtualMachineInstancePresets is now deprecated and will be removed in v2.
NAME READY STATUS RESTARTS AGE
pod/virt-api-67778d48b6-7kjhm 0/1 ContainerCreating 0 19s
pod/virt-api-67778d48b6-z8lrq 0/1 ContainerCreating 0 20s
pod/virt-operator-b87fbb945-n7287 1/1 Running 0 3m35s
pod/virt-operator-b87fbb945-xl7pg 1/1 Running 0 3m34s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubevirt-operator-webhook ClusterIP 10.233.48.98
service/kubevirt-prometheus-metrics ClusterIP None
service/virt-api ClusterIP 10.233.49.2
service/virt-exportproxy ClusterIP 10.233.29.96
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/virt-api 0/2 2 0 21s
deployment.apps/virt-operator 2/2 2 2 3m36s
NAME DESIRED CURRENT READY AGE
replicaset.apps/virt-api-67778d48b6 2 2 0 21s
replicaset.apps/virt-operator-b87fbb945 2 2 2 3m36s
NAME AGE PHASE
kubevirt.kubevirt.io/kubevirt 92s Deploying
Nine minutes later the deployment finished; most of that time was spent pulling images.
kubectl get kubevirt.kubevirt.io/kubevirt -n kubevirt -o=jsonpath="{.status.phase}"
Deployed
The component check now passes as well:
kubectl get all -n kubevirt
Warning: kubevirt.io/v1 VirtualMachineInstancePresets is now deprecated and will be removed in v2.
NAME READY STATUS RESTARTS AGE
pod/virt-api-67778d48b6-7kjhm 1/1 Running 0 5m59s
pod/virt-api-67778d48b6-z8lrq 1/1 Running 0 6m
pod/virt-controller-8c6b9f8f4-4xhgd 1/1 Running 0 4m56s
pod/virt-controller-8c6b9f8f4-jcqzd 1/1 Running 0 4m56s
pod/virt-handler-642df 1/1 Running 0 4m55s
pod/virt-handler-crjvl 1/1 Running 0 4m55s
pod/virt-handler-tbmv7 1/1 Running 0 4m55s
pod/virt-operator-b87fbb945-n7287 1/1 Running 0 9m15s
pod/virt-operator-b87fbb945-xl7pg 1/1 Running 0 9m14s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubevirt-operator-webhook ClusterIP 10.233.48.98
service/kubevirt-prometheus-metrics ClusterIP None
service/virt-api ClusterIP 10.233.49.2
service/virt-exportproxy ClusterIP 10.233.29.96
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/virt-handler 3 3 3 3 3 kubernetes.io/os=linux 4m56s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/virt-api 2/2 2 2 6m
deployment.apps/virt-controller 2/2 2 2 4m56s
deployment.apps/virt-operator 2/2 2 2 9m15s
NAME DESIRED CURRENT READY AGE
replicaset.apps/virt-api-67778d48b6 2 2 2 6m
replicaset.apps/virt-controller-8c6b9f8f4 2 2 2 4m56s
replicaset.apps/virt-operator-b87fbb945 2 2 2 9m15s
NAME AGE PHASE
kubevirt.kubevirt.io/kubevirt 7m11s Deployed
Install the virtctl tool
VERSION=$(kubectl get kubevirt.kubevirt.io/kubevirt -n kubevirt -o=jsonpath="{.status.observedKubeVirtVersion}")
ARCH=$(uname -s | tr A-Z a-z)-$(uname -m | sed 's/x86_64/amd64/') || windows-amd64.exe
echo ${ARCH}
curl -L -o virtctl https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/virtctl-${VERSION}-${ARCH}
chmod +x virtctl
sudo install virtctl /usr/local/bin
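The ARCH expression above lowercases the kernel name and maps the machine architecture to the Go-style names used in the release file names. A sketch of the mapping (the `map_arch` helper is just for illustration):

```shell
# uname -s -> Linux -> linux ; uname -m -> x86_64 -> amd64, aarch64 -> arm64
map_arch() { echo "$1" | sed 's/x86_64/amd64/;s/aarch64/arm64/'; }
echo "$(uname -s | tr A-Z a-z)-$(map_arch x86_64)"   # e.g. linux-amd64
echo "$(uname -s | tr A-Z a-z)-$(map_arch aarch64)"  # e.g. linux-arm64
```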
Deploy the KubeVirt CDI
https://kubevirt.io/labs/kubernetes/lab2.html
export VERSION=$(basename $(curl -s -w %{redirect_url} https://github.com/kubevirt/containerized-data-importer/releases/latest))
kubectl create -f https://github.com/kubevirt/containerized-data-importer/releases/download/$VERSION/cdi-operator.yaml
kubectl create -f https://github.com/kubevirt/containerized-data-importer/releases/download/$VERSION/cdi-cr.yaml
Check the CDI output:
kubectl -n cdi get all
Warning: kubevirt.io/v1 VirtualMachineInstancePresets is now deprecated and will be removed in v2.
NAME READY STATUS RESTARTS AGE
pod/cdi-operator-ccb895984-w4b6n 1/1 Running 0 3m8s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/cdi-operator 1/1 1 1 3m8s
NAME DESIRED CURRENT READY AGE
replicaset.apps/cdi-operator-ccb895984 1 1 1 3m8s
[root@gm-10-29-221-9 ~]# kubectl get cdi cdi -n cdi
NAME AGE PHASE
cdi 7m15s Deployed
[root@gm-10-29-221-9 ~]# kubectl get pods -n cdi
NAME READY STATUS RESTARTS AGE
cdi-apiserver-6c76687b66-6l7d2 1/1 Running 1 (5m48s ago) 7m11s
cdi-deployment-5f6ff949d7-mlrth 1/1 Running 0 7m9s
cdi-operator-ccb895984-w4b6n 1/1 Running 0 10m
cdi-uploadproxy-b499c7956-lsh5r 1/1 Running 0 7m7s
Check that the cluster has a StorageClass:
kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
hwameistor-storage-lvm-hdd (default) lvm.hwameistor.io Retain WaitForFirstConsumer true 100d
local-path rancher.io/local-path Delete WaitForFirstConsumer false 100d
Test with a Fedora image, filling in the storageClassName and related fields:
cat <<EOF > dv_fedora.yml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: "fedora"
spec:
  storage:
    resources:
      requests:
        storage: 5Gi
    storageClassName: hwameistor-storage-lvm-hdd
    accessModes:
      - ReadWriteOnce
    volumeMode: Filesystem
  source:
    http:
      url: "https://download.fedoraproject.org/pub/fedora/linux/releases/40/Cloud/x86_64/images/Fedora-Cloud-Base-AmazonEC2.x86_64-40-1.14.raw.xz"
EOF
kubectl create -f dv_fedora.yml
Start a vm1 virtual machine
wget https://kubevirt.io/labs/manifests/vm1_pvc.yml
cat vm1_pvc.yml
Before modification:
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  creationTimestamp: 2018-07-04T15:03:08Z
  generation: 1
  labels:
    kubevirt.io/os: linux
  name: vm1
spec:
  runStrategy: Always
  template:
    metadata:
      creationTimestamp: null
      labels:
        kubevirt.io/domain: vm1
    spec:
      domain:
        cpu:
          cores: 2
        devices:
          disks:
            - disk:
                bus: virtio
              name: disk0
            - cdrom:
                bus: sata
                readonly: true
              name: cloudinitdisk
        machine:
          type: q35
        resources:
          requests:
            memory: 1024M
      volumes:
        - name: disk0
          persistentVolumeClaim:
            claimName: fedora
        - cloudInitNoCloud:
            userData: |
              #cloud-config
              hostname: vm1
              ssh_pwauth: True
              disable_root: false
              ssh_authorized_keys:
                - ssh-rsa YOUR_SSH_PUB_KEY_HERE
          name: cloudinitdisk
# Generate a password-less SSH key using the default location.
ssh-keygen
PUBKEY=$(cat ~/.ssh/id_rsa.pub)
sed -i "s%ssh-rsa.*%$PUBKEY%" vm1_pvc.yml
kubectl create -f vm1_pvc.yml
The Fedora VM came up.
Enter the VM instance with virtctl console vm1.
The user is fedora with no password; ssh [email protected] also works, since the SSH key was injected earlier.
Expose a NodePort
virtctl expose vmi vm1 --name=vm1-ssh --port=20222 --target-port=22 --type=NodePort
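Kubernetes assigns a random NodePort for this service; `kubectl get svc vm1-ssh` shows it. Extracting the node port from a sample output line might look like this (the values below are hypothetical):

```shell
# Sample line from `kubectl get svc vm1-ssh` (hypothetical values)
sample='vm1-ssh   NodePort   10.233.1.2   <none>   20222:30222/TCP   5s'
# Column 5 is PORT(S); the number between ':' and '/' is the NodePort
nodeport=$(echo "$sample" | awk '{print $5}' | cut -d: -f2 | cut -d/ -f1)
echo "$nodeport"
# then: ssh fedora@<node-ip> -p $nodeport
```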
The first round of testing is done. Next up: building custom Rocky Linux 8.10 and Windows 10 LTSC 2021 (build 6216) images.
Custom images
This time I packaged two images myself: Rocky Linux and Windows LTSC 2021.
The Rocky Linux image is simple: rockylinux.org provides an official cloud image; just download and repackage it.
wget -c https://dl.rockylinux.org/pub/rocky/8/images/x86_64/Rocky-8-GenericCloud-Base.latest.x86_64.qcow2
vi Dockerfile
FROM scratch
ADD --chown=107:107 Rocky-8-GenericCloud-Base.latest.x86_64.qcow2 /disk/
docker build -t 10.29.221.9/public/rockylinux:v810 .
docker push 10.29.221.9/public/rockylinux:v810
Building the rocky810 image
First download the Rocky-8-GenericCloud-Base.latest.x86_64.qcow2 cloud image from the Rocky Linux site, then build locally:
wget -c https://dl.rockylinux.org/pub/rocky/8/images/x86_64/Rocky-8-GenericCloud-Base.latest.x86_64.qcow2
vi Dockerfile
FROM scratch
ADD --chown=107:107 Rocky-8-GenericCloud-Base.latest.x86_64.qcow2 /disk/
nerdctl build --platform=amd64 -t 10.29.221.9/public/rockylinux:v810 .
Handle the Rocky Linux 8 DataVolume
kubectl get datavolumes.cdi.kubevirt.io rocky810-rootdisk -oyaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  labels:
    kubevirt.io/created-by: 2782666c-6914-4395-8362-0067661a02b0
  name: rocky810-rootdisk
  namespace: default
spec:
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 20Gi
    storageClassName: hwameistor-storage-lvm-hdd
  source:
    registry:
      url: docker://10.29.221.9/public/rockylinux:v810
The rocky810 VM configuration
kubectl get vm rocky810 -oyaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  annotations:
    kubevirt.io/latest-observed-api-version: v1
    kubevirt.io/storage-observed-api-version: v1
    virtnest.io/alias-name: ""
    virtnest.io/image-secret: ""
    virtnest.io/image-source: docker
    virtnest.io/os-image: 10.29.221.9/public/rockylinux:v810
  labels:
    virtnest.io/os-family: rocky
    virtnest.io/os-version: "810"
  name: rocky810
  namespace: default
spec:
  dataVolumeTemplates:
    - metadata:
        creationTimestamp: null
        name: rocky810-rootdisk
        namespace: default
      spec:
        pvc:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 20Gi
          storageClassName: hwameistor-storage-lvm-hdd
        source:
          registry:
            url: docker://10.29.221.9/public/rockylinux:v810
  runStrategy: Always
  template:
    metadata:
      creationTimestamp: null
    spec:
      architecture: amd64
      domain:
        cpu:
          cores: 1
          sockets: 1
          threads: 1
        devices:
          disks:
            - bootOrder: 1
              disk:
                bus: virtio
              name: rootdisk
            - disk:
                bus: virtio
              name: cloudinitdisk
          interfaces:
            - masquerade: {}
              name: default
        machine:
          type: q35
        memory:
          guest: 2Gi
        resources: {}
      networks:
        - name: default
          pod: {}
      volumes:
        - dataVolume:
            name: rocky810-rootdisk
          name: rootdisk
        - cloudInitNoCloud:
            userDataBase64: I2Nsb3VkLWNvbmZpZwpzc2hfcHdhdXRoOiB0cnVlCmRpc2FibGVfcm9vdDogZmFsc2UKY2hwYXNzd2Q6IHsibGlzdCI6ICJyb290OkRhb2Nsb3VkLjIwMjMiLCBleHBpcmU6IEZhbHNlfQoKd3JpdGVfZmlsZXM6CiAgLSBwYXRoOiAvZXRjL3N5c3RlbWQvc3lzdGVtL2RoY2xpZW50LW9uLWJvb3Quc2VydmljZQogICAgcGVybWlzc2lvbnM6ICIwNjQ0IgogICAgY29udGVudDogfAogICAgICBbVW5pdF0KICAgICAgRGVzY3JpcHRpb249UnVuIGRoY2xpZW50IG9uIGJvb3QKICAgICAgQWZ0ZXI9bmV0d29yay50YXJnZXQKCiAgICAgIFtTZXJ2aWNlXQogICAgICBUeXBlPW9uZXNob3QKICAgICAgRXhlY1N0YXJ0PS9zYmluL2RoY2xpZW50CiAgICAgIFJlbWFpbkFmdGVyRXhpdD10cnVlCgogICAgICBbSW5zdGFsbF0KICAgICAgV2FudGVkQnk9bXVsdGktdXNlci50YXJnZXQKcnVuY21kOgogIC0gc3lzdGVtY3RsIGRhZW1vbi1yZWxvYWQKICAtIHN5c3RlbWN0bCBlbmFibGUgZGhjbGllbnQtb24tYm9vdC5zZXJ2aWNlCiAgLSBzeXN0ZW1jdGwgc3RhcnQgZGhjbGllbnQtb24tYm9vdC5zZXJ2aWNlCiAgLSBzZWQgLWkgIi8jXD9QZXJtaXRSb290TG9naW4vcy9eLiokL1Blcm1pdFJvb3RMb2dpbiB5ZXMvZyIgL2V0Yy9zc2gvc3NoZF9jb25maWc=
          name: cloudinitdisk
kubectl apply -f rocky810.yaml
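The userDataBase64 field in the manifest above is nothing more than the cloud-init file run through base64. For example, encoding the first line of such a file reproduces the start of that value, and the whole field can be decoded the same way for inspection:

```shell
# Encode: this is how a userDataBase64 value is produced from a cloud-init file
printf '%s' '#cloud-config' | base64
# Decode the value from the manifest to inspect it, e.g.:
# kubectl get vm rocky810 -o jsonpath='{..userDataBase64}' | base64 -d
```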
Check the VM status:
kubectl get vm -A
NAMESPACE NAME AGE STATUS READY
default rocky810 50d Running True
default win10 46d Running True
Windows 10 LTSC 2021 image
Now for the main event of this exercise: the Windows 10 LTSC 2021 image.
The image most people use by default is Cloudbase's Windows 10 Pro image, which felt a bit sluggish in use, so I tried the Windows 10 LTSC 2021 build I had used before.
First check the KubeVirt CDI service installed earlier:
kubectl -n cdi get svc -l cdi.kubevirt.io=cdi-uploadproxy
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cdi-uploadproxy ClusterIP 10.233.7.11
Then upload the original ISO with virtctl. The actual file name is
19044.6216.250813-0800.vb_refresh_enterprise_ltsc_2021_x64freo_zh-cn_1b2d3b40.iso, renamed here to win10_ltsc.iso.
virtctl image-upload --image-path=win10_ltsc.iso --storage-class hwameistor-storage-lvm-hdd pvc iso-win10 --size=7Gi --insecure --uploadproxy-url=https://10.233.7.11 --force-bind
The output log:
PVC default/iso-win10 not found
PersistentVolumeClaim default/iso-win10 created
Waiting for PVC iso-win10 upload pod to be ready…
Pod now ready
Uploading data to https://10.233.7.11
4.64 GiB / 4.64 GiB [————————————————————————————-] 100.00% 42.76 MiB p/s 1m51s
Uploading data completed successfully, waiting for processing to complete, you can hit ctrl-c without interrupting the progress
Processing completed successfully
Uploading win10_ltsc.iso completed successfully
Then deploy the VM with this YAML:
cat win10.vm.yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: win10
spec:
  runStrategy: Always
  template:
    metadata:
      labels:
        kubevirt.io/domain: win10
    spec:
      domain:
        cpu:
          cores: 4
        devices:
          disks:
            - bootOrder: 1
              cdrom:
                bus: sata
              name: cdromiso
            - disk:
                bus: virtio
              name: harddrive
            - cdrom:
                bus: sata
              name: virtiocontainerdisk
          interfaces:
            - masquerade: {}
              model: e1000
              name: default
        machine:
          type: q35
        resources:
          requests:
            memory: 16G
      networks:
        - name: default
          pod: {}
      volumes:
        - name: cdromiso
          persistentVolumeClaim:
            claimName: iso-win10
        - name: harddrive
          hostDisk:
            capacity: 50Gi
            path: /data/disk.img
            type: DiskOrCreate
        - containerDisk:
            image: dce-boot.io/docker.io/kubevirt/virtio-container-disk
          name: virtiocontainerdisk
kubectl apply -f win10.vm.yaml
Then proxy the console out and start the installation with a VNC client:
virtctl vnc --proxy-only win10
It has been years since my last LTSC install; there is now an additional IoT edition of LTSC.
By default no usable disk is visible.
Just click OK and choose the driver location.
Pick the second entry; compared with years ago there is now an extra group of passthrough drivers.
The screenshots here only had a 25G disk; in another environment it was later expanded to 50G.
kubectl get pvc disk-windows -oyaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"disk-windows","namespace":"default"},"spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"25Gi"}},"storageClassName":"local-path"}}
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: rancher.io/local-path
    volume.kubernetes.io/selected-node: gm-10-29-221-9
    volume.kubernetes.io/storage-provisioner: rancher.io/local-path
  creationTimestamp: "2025-08-16T04:04:44Z"
  finalizers:
    - kubernetes.io/pvc-protection
  name: disk-windows
  namespace: default
  resourceVersion: "32202616"
  uid: d06116f4-606e-4567-884f-cb11bb44e8c0
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 25Gi
  storageClassName: local-path
  volumeMode: Filesystem
  volumeName: pvc-d06116f4-606e-4567-884f-cb11bb44e8c0
status:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 25Gi
  phase: Bound
Then the installation proceeds normally.
Hit one error during the login interaction.
After a forced reboot, the familiar desktop appeared.
Privacy, network, and other settings
On the first screen the globe icon showed a cross: no network.
Device Manager showed no network adapter, so drivers needed installing.
The PCI devices showed question marks.
Install the drivers
Just confirm through each step.
The network adapter now shows up; handle the remaining question-marked PCI devices the same way.
Enable Remote Desktop
Shut the VM down, then convert its disk.img into a standard image:
cd /opt/local-path-provisioner/pvc-d06116f4-606e-4567-884f-cb11bb44e8c0_default_disk-windows/
qemu-img convert -O qcow2 disk.img win10.ltsc.qcow2
This produces a 12G qcow2 image.
# Dockerfile build
FROM scratch
COPY --chown=107:107 win10.ltsc.qcow2 /disk/win10.ltsc.qcow2
Build and push:
nerdctl build --platform=amd64 -t dce-boot.io/public/win10.ltsc:v1 --insecure-registry .
Open issues
The dce-boot.io/public/win10.ltsc:v1 image has not gone through Cloudbase-style optimization; as I recall it cannot simply be handed to other environments and booted directly. I'll fill in that documentation when I get time.
Update 2025-10-13
Today I used this image in another environment. It can be used directly in the VM module without the user initializing it again, but there are two small problems. First, it keeps downloading update patches, and the machine is very slow while it does (and reportedly Windows 10 stops receiving updates after 2025-10-14). Second, disk usage is heavy: 20G of a 30G disk is gone, so the image needs slimming. Next time I may start from a stripped-down edition instead.
After updating, the system information screen looked unfamiliar to me, so I archived a screenshot.
References
https://kubevirt.io/quickstart_cloud/
https://icloudnative.io/posts/use-kubevirt-to-manage-windows-on-kubernetes/
https://www.ctyun.cn/document/10027726/10747218
https://www.geminiopencloud.com/zh-tw/blog/kubevirt-2/
FROM: https://blog.wanjie.info/2025/09/kubevirt-uses-custom-virtual-machines/
Before we can use yq, we first need to install it. Searching for yq on Google turns up two projects/repositories.
The first, at https://github.com/kislyuk/yq, is a wrapper around jq (the JSON processor). If you are already familiar with jq, you may want to grab this one and use the syntax you already know.
In this article, however, we will use https://github.com/mikefarah/yq. This version does not match jq syntax 100%, but its advantage is that it has no dependencies (it does not depend on jq). For more context on the differences, see the GitHub issue below.
https://github.com/mikefarah/yq/issues/193
For detailed usage of the mikefarah version of yq, see the docs: https://mikefarah.gitbook.io/yq/
Because our product must be deployed in air-gapped environments, we build offline installation sources for every component: OS packages (rpm/apt and the like) as well as standalone binaries and tarballs. We used to consolidate all offline sources into a single yml file; now each component's repository carries a packages.yaml, which is fetched via git and aggregated into the final full list.
Before aggregating, a config.yaml is generated to select which components to include. The full configuration looks like this:
---
downMode: git
project:
  - name: offline-packages
    url: https://git.xxx.com/PaaS/offline-packages.git
    branch: autoupdate-git
    item:
  - name: yks
    url: https://git.xxx.com/PaaS/paas-installer.git
    branch: develop
    item:
  - name: middleware
    url: https://git.xxx.com/PaaS/middleware.git
    branch: develop
    item:
      - redis
      - nginx
      - elasticsearch
      - kafka
      - kibana
      - license-server
      - minio
Pipeline parameters, joined with semicolons, drive the generation of a customized config_custom.yaml:
# Generate the config.yml used for aggregation
if [ "$middleware_names" != "all" ]; then
  middleware_pattern="middleware|$(echo ${middleware_names} |sed 's/;/\|/g')"
  # Keep only the selected top-level groups, e.g. 'yks, middleware, offline-packages'
  yq 'del(.project[] | select(.name | test("'$middleware_pattern'")|not))' config.yml > ./config_custom_tmp.yml
  # Filter again: keep only the selected sub-items under the middleware group, e.g. 'redis, nginx'
  yq 'del(.project[].item.[] | select(. |test("'$middleware_pattern'")|not) )' config_custom_tmp.yml > ./config_custom.yml
  config_file=config_custom.yml
  rm -rf config_custom_tmp.yml
else
  config_file=config.yml
fi
Notes:
Because yq's contains does not support multiple matches, test is used here to check the string against several alternatives separated by '|'.
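The alternation pattern fed to test() is built from the pipeline parameter by turning semicolons into '|'. A quick sketch of that transformation (the input value is hypothetical):

```shell
# Hypothetical pipeline input: two middleware names separated by ';'
middleware_names='redis;nginx'
middleware_pattern="middleware|$(echo ${middleware_names} | sed 's/;/|/g')"
echo "$middleware_pattern"
```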
Usage of del:
Before deleting with del, first make sure you can query and print the part to be deleted, and only then apply del. For example:
If only middleware and paas-installer are selected, first query the parts that do not contain middleware or yks:
[root@jenkins1-iuap-hb2-ali offline-packages]# cat config.yml |yq '.project[]| select(.name | test("middleware|yks")|not)'
name: offline-packages
url: https://git.xxx.com/paas/offline-packages.git
branch: develop
item:
Then call del to delete them:
[root@jenkins1 offline-packages]# cat config.yml |yq 'del(.project[]| select(.name | test("middleware|yks")|not))'
---
downMode: git
project:
  - name: paas
    url: https://git.xxx.com/PaaS/paas-installer.git
    branch: develop
    item:
  - name: middleware
    url: https://git.xxx.com/PaaS/middleware.git
    branch: develop
    item:
      - redis
      - nginx
      - elasticsearch
      - kafka
      - kibana
      - license-server
      - minio
If only paas, nginx, and redis are selected, filter again so that only the chosen sub-items under the middleware group remain, e.g. 'redis, nginx':
[root@jenkins1 offline-packages]# yq 'del(.project[].item.[] | select(. |test("redis|nginx")|not) )' config_custom_tmp.yml
---
downMode: git
project:
  - name: yks
    url: https://git.xxx.com/PaaS/paas-installer.git
    branch: develop
    item: []
  - name: middleware
    url: https://git.xxx.com/PaaS/middleware.git
    branch: develop
    item:
      - redis
      - nginx
The final aggregated format looks like this:
os: kylin
version: v10-sp3
package:
  - name:
      - telnet
      - curl
      - wget
  - name:
      - docker-ce-20.10.17
      - docker-ce-cli-20.10.17
      - containerd.io-1.6.28
    arch: amd64
    repo:
      - https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
      - http://mirrors.aliyun.com/repo/Centos-7.repo
    replace:
      - item: repo
        from: $releasever
        to: "7"
  - name:
      - docker-ce-20.10.17
      - docker-ce-cli-20.10.17
      - containerd.io-1.6.28
    arch: arm64
    repo:
      - https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
      - http://mirrors.aliyun.com/repo/Centos-altarch-7.repo
    replace:
      - item: repo
        from: $releasever
        to: "7"
file:
  - name: cni-plugins
    arch: amd64
    src: http://bucket.oss-cn-beijing.aliyuncs.com/download/nexus2/raw-apis/containernetworking/plugins/releases/download/v1.1.1/cni-plugins-linux-amd64-v1.1.1.tgz
    dest: raw-apis/containernetworking/plugins/releases/download/v1.1.1/cni-plugins-linux-amd64-v1.1.1.tgz
This then needs to be extracted and filtered with yq again:
yq -o=j -I=0 '.package[] | select(. | has("repo")|not) | select(.arch == "amd64" or .arch == "x86_64" or has("arch")|not)' packages.yaml
Here -o=j produces JSON output and -I=0 sets the output indent level to 0 (the default is 2). The result is:
{
"name": [
"telnet",
"curl",
"wget",
...
]
}
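The reason for -I=0 is that each matching package entry then becomes a single compact line of JSON, so a shell for-loop can iterate over entries one at a time; with the default indentation the objects would be split across many lines and the loop would see fragments. A minimal sketch of that idea:

```shell
# Two compact JSON objects, one per line, as `yq -o=j -I=0` would emit them
packages='{"name":["telnet"]}
{"name":["docker-ce"]}'
count=0
for pkg in ${packages}; do
  count=$((count + 1))
  echo "entry $count: $pkg"
done
```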
yq -o=j -I=0 '.package[] | select(. | contains({"repo":""})) | select(.arch == "amd64" or .arch == "x86_64" or has("arch")|not) ' packages.yaml
Because RHEL 8 uses a different docker repo, aggregation produces several docker install entries that carry repo fields:
os: rhel
version: "8.8"
package:
  - name:
      - telnet
  - name:
      - docker-ce-20.10.17
      - docker-ce-cli-20.10.17
      - containerd.io-1.6.28
    arch: amd64
    repo:
      - https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
      - http://mirrors.aliyun.com/repo/Centos-7.repo
    replace:
      - item: repo
        from: $releasever
        to: "7"
  - name:
      - docker-ce-20.10.17
      - docker-ce-cli-20.10.17
      - containerd.io-1.6.28
    arch: arm64
    repo:
      - https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
      - http://mirrors.aliyun.com/repo/Centos-altarch-7.repo
    replace:
      - item: repo
        from: $releasever
        to: "7"
  - name:
      - ntp
      - ntpdate
      - libselinux-python
    repo:
      - https://mirrors.aliyun.com/repo/Centos-7.repo
    replace:
      - item: repo
        from: $releasever
        to: "7"
  - name:
      - docker-ce-25.0.3
      - docker-ce-cli-25.0.3
      - containerd.io-1.6.28
    arch: amd64
    repo:
      - https://mirrors.aliyun.com/docker-ce/linux/rhel/docker-ce.repo
      - http://mirrors.aliyun.com/repo/Centos-8.repo
    replace:
      - item: repo
        from: $releasever
        to: "8"
We then need to extract only the entries used for the RHEL install:
yq 'del( .package[] | select(. | contains({"repo":""})) | select(.name[] =="*docker*") | select(.repo[] == "*/linux/centos/*") )' all_packages.yaml
ARG FROM=""
FROM $FROM as packages
ARG FROM=""
WORKDIR /packages
# download packages
RUN set -x \
&& ARCH=$(uname -m); echo "ARCH=$ARCH" > .info \
&& ARCH_ALIAS=$(echo $ARCH | sed 's/x86_64/amd64/;s/aarch64/arm64/'); echo "ARCH_ALIAS=$ARCH_ALIAS" >> .info \
&& BASE_URL=http://bucket.oss.aliyuncs.com/download \
&& image=$(echo $FROM|awk -F'/' '{print $NF}') \
&& image_name=$(echo $image|awk -F':' '{print $1}') \
&& image_tag=$(echo $image|awk -F':' '{print $2}') \
&& os_name=${image_name} \
&& os_version=$(echo $image_tag|sed 's/-arm64//;s/-amd64//') \
&& echo "os_name=$os_name" >> .info \
&& echo "image_name=$image_name" >> .info \
&& echo "image_tag=$image_tag" >> .info \
&& echo "os_version=$os_version" >> .info
# && curl -o /usr/bin/yq $BASE_URL/binary/yq_linux_$ARCH && chmod +x /usr/bin/yq
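The first RUN derives the OS name and version from the base-image reference passed in as FROM; with a hypothetical image reference the parsing works like this:

```shell
# Hypothetical image reference of the form registry/repo/name:version-arch
FROM_REF='10.29.221.9/public/rockylinux:v810-amd64'
image=$(echo $FROM_REF | awk -F'/' '{print $NF}')            # rockylinux:v810-amd64
image_name=$(echo $image | awk -F':' '{print $1}')           # rockylinux
image_tag=$(echo $image | awk -F':' '{print $2}')            # v810-amd64
os_version=$(echo $image_tag | sed 's/-arm64//;s/-amd64//')  # v810
echo "$image_name $os_version"
```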
COPY all_packages.yaml .
RUN set -x; echo "download package with no custom repo firstly" \
&& source ./.info \
&& declare -A packages \
&& packages=$(yq -o=j -I=0 '.package[] | select(. | has("repo")|not) | select(.arch == "'$ARCH_ALIAS'" or .arch == "'$ARCH'" or has("arch")|not)' all_packages.yaml) \
&& for pkg in ${packages}; do \
if (which rpm &> /dev/null); then \
echo "$pkg"|yq '.name[]' | sort -u | xargs repotrack -d 9 --destdir "$os_name/$os_version/os/$ARCH/Packages"; \
elif (which apt &> /dev/null); then \
echo $(echo "$pkg"|yq '.name[]') $(dpkg --get-selections | grep -v deinstall | cut -f1 | cut -d ':' -f1) | sed -e 's/ntp\|ntpdate/ /g' | \
sort -u | xargs apt-get install --reinstall --print-uris | awk -F "'" '{print $2}' | grep -v '^$' > packages_1.urls; \
apt-get install --reinstall --print-uris ntp ntpdate | awk -F "'" '{print $2}' | grep -v '^$' >> packages_1.urls; \
wget -q -x -P $os_name/$os_version -i packages_1.urls; \
fi; \
done
RUN set -x; echo "download package with custom repo sencondly" \
&& source ./.info \
&& declare -A packagesWithRepo \
&& packagesWithRepo=$(yq -o=j -I=0 '.package[] | select(. | contains({"repo":""})) | select(.arch == "'$ARCH_ALIAS'" or .arch == "'$ARCH'" or has("arch")|not) ' all_packages.yaml) \
&& for pkg in ${packagesWithRepo}; do \
repo=$(echo "$pkg"|yq '.repo[]'); \
replace=$(echo "$pkg"|yq -o=j -I=0 '.replace[]'); \
echo "$repo"|while read ro; do \
if (which rpm &> /dev/null); then \
repo_name=${ro##*/}; \
curl -k -o /etc/yum.repos.d/${repo_name} ${ro}; \
for r in ${replace}; do \
from=$(echo "$r"|yq '.from'); \
to=$(echo "$r"|yq '.to'); \
sed -i "s@$from@$to@g" /etc/yum.repos.d/${repo_name}; \
done; \
elif (which apt &> /dev/null); then \
eval echo "$ro" > /etc/apt/sources.list.d/custom.list; \
fi; \
done; \
if (which rpm &> /dev/null); then \
yum makecache; \
echo "$pkg"|yq '.name[]' | sort -u | xargs repotrack -d 9 --destdir "$os_name/$os_version/os/$ARCH/Packages"; \
elif (which apt &> /dev/null); then \
echo "$pkg"|yq '.name[]' | xargs apt-get --allow-unauthenticated install --reinstall --print-uris | awk -F "'" '{print $2}' | grep -v '^$' | sort -u > packages_2.urls; \
wget -q -x -P $os_name/$os_version -i packages_2.urls; \
fi; \
done
# create repo
RUN set -x \
&& source ./.info \
&& if (which rpm &> /dev/null); then \
createrepo -d "$os_name/$os_version/os/$ARCH"; \
if [ "$image_name" == "rhel" -o "$image_name" == "nfs" ]; then \
# Fix: "No available modular metadata for modular package"
repo2module -s stable $os_name/$os_version/os/$ARCH modules.yaml \
&& modifyrepo_c --mdtype=modules modules.yaml $os_name/$os_version/os/$ARCH/repodata/ \
&& rm -rf modules.yaml; \
fi; \
printf '\
['$os_name''$os_version'] \n\
name='$os_name''$os_version' \n\
baseurl=http://{{ nexus_access_address|default(127.0.0.1) }}/nexus/content/repositories/'$os_name'/'$os_version'/os/$basearch/ \n\
enabled=1 \n\
gpgcheck=0 \n\
sslverify=0 \n\
' > ${os_name}.repo.j2; \
elif (which apt &> /dev/null); then \
dpkg-scanpackages $os_name/$os_version | gzip -9c > $os_name/$os_version/Packages.gz; \
printf '\
deb [trusted=yes] http://{{ nexus_access_address|default(127.0.0.1) }}/nexus/content/repositories/'$os_name'/'$ARCH_ALIAS' '$os_version'/ \n\
' > ${os_name}.list.j2; \
fi \
\
# clean finaly
&& rm -rf packages_1.urls packages_2.urls
# download files
RUN set -x;source ./.info \
&& item=$(yq -o=j -I=0 '.file[] | select(.arch == "'$ARCH_ALIAS'" or .arch == "'$ARCH'" or has("arch")|not)' all_packages.yaml) \
&& if [ "X$item" != "X" ]; then \
for i in $item; do \
src=$(echo "$i"|yq '.src'); \
dest=$(echo "$i"|yq '.dest'); \
dest_path=${dest%/*}; \
# dest_file=${dest##*/}; \
mkdir -p files/$dest_path; \
wget -c -q -O files/$dest "$src"; \
done; \
fi
FROM scratch
COPY --from=packages /packages ./packages
Option 1
Use the ip command to extract the default route, which identifies the NIC used for default traffic.
Command to get the IP:
if [[ "$useIPV6" == "true" ]]; then
ip_r=$(ip -6 r g 2000::1)
LOCAL_IP=$(grep -oP "src \S+ " < <(echo $ip_r) | sed 's/src //' | awk '{gsub(/^\s+|\s+$/, "");print}')
if [ "X$LOCAL_IP" == "X" ]; then
LOCAL_IP=$(ip add | grep -w inet6 | grep -v ::1 | awk NR==1'{print $2}' | cut -d "/" -f 1)
fi
else
ip_r=$(ip r g 1.0.0.0)
LOCAL_IP=$(grep -oP "src \S+ " < <(echo $ip_r) | sed 's/src //' | awk '{gsub(/^\s+|\s+$/, "");print}')
if [ "X$LOCAL_IP" == "X" ] || (! echo $LOCAL_IP | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'); then
LOCAL_IP=$(ip add | grep -w inet | grep -v 127.0.0.1 | awk NR==1'{print $2}' | cut -d "/" -f 1)
fi
fi
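The `src` field extraction from the `ip r g` output can be illustrated on a sample route line (the addresses here are hypothetical):

```shell
# Sample output of `ip r g 1.0.0.0` (hypothetical addresses)
ip_r='1.0.0.0 via 10.29.221.1 dev eth0 src 10.29.221.9 uid 0'
# \K drops everything matched so far, leaving just the address after "src "
LOCAL_IP=$(echo "$ip_r" | grep -oP 'src \K\S+')
echo "$LOCAL_IP"
```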
Get the NIC name
function get_cfgname(){
ipa_info=$(ip a)
line=$(echo "${ipa_info}" | sed -n -e "/\<${LOCAL_IP}\>/=")
detect_cfgname=$(echo "${ipa_info}" | sed -n "1,${line}p" | grep '^[0-9]' | sed -n '$p' | awk -F ':| ' '{print $3}')
echo $detect_cfgname
}
function interface_check() {
cfgname=$(get_cfgname)
# Determine the NIC name. The variable comes from the environment; if unset, derive the NIC name from the IP
if [ "X$ifCfgName" == "X" ]; then
ifCfgName=$cfgname
if [ "X$ifCfgName" != "X" ]; then
echo_color info "Auto-detected server NIC name: ${ifCfgName}"
else
echo_color error "Server NIC name ${ifCfgName} does not exist; please double-check!"
exit 1
fi
elif [ "$ifCfgName" != "$cfgname" ]; then
detect_ipaddr=$(ip addr | awk '/inet/ && ! /\/32/ {ip[$NF] = $2; sub(/\/.*$/,"",ip[$NF])} END {for(i in ip){if(i ~ "'$ifCfgName'") print ip[i]}}')
if [ "$LOCAL_IP" != "$detect_ipaddr" ]; then
echo_color warning "The given NIC name $ifCfgName does not match the detected NIC name $cfgname; please double-check!"
exit 1
fi
else
echo_color info "Server NIC name provided: ${ifCfgName}"
fi
write_restore_end
}
Option 2
Use the ansible_default_ipv4 fact.
Get the IP:
if [[ "$useIPV6" == "true" ]]; then
filter=ansible_default_ipv6
else
filter=ansible_default_ipv4
fi
# define parameters of address and interface
eval $(ansible localhost -m setup -a 'gather_subset=!all,network filter='$filter''|egrep "\"interface\"|\"address\""|awk -F': ' '{gsub(/"|,| /,"",$0);gsub(/:/,"=",$0);print $0}')
LOCAL_IP=$address
Get the NIC name:
# Determine the NIC name. The variable comes from the environment; if unset, fall back to the ansible_default_ipv4 fact
if [ "X$ifCfgName" == "X" ]; then
ifCfgName=$interface
echo_color info "Auto-detected server NIC name: ${ifCfgName}"
elif [ "$ifCfgName" != "$interface" ]; then
echo_color warning "The given NIC name $ifCfgName does not match the detected NIC name $interface; please double-check!"
exit 1
else
echo_color info "Server NIC name provided: ${ifCfgName}"
fi
safe-rm is an open-source replacement for the unsafe rm. A protection list can be configured in /etc/safe-rm.conf to define which files rm must never delete, guarding against accidental deletions from commands like rm -rf.
The 0.x versions are implemented as a shell script, while 1.x is implemented in Rust and has to be compiled first.
# Download the file (-c enables resuming)
# wget -c https://launchpadlibrarian.net/188958703/safe-rm-0.12.tar.gz
# Extract
# tar -xvf safe-rm-0.12.tar.gz
# Copy the executable
# cd safe-rm
# cp safe-rm /bin/
# Create a symlink
# mv /bin/rm /bin/rm.bak
# ln -s /bin/safe-rm /bin/rm
# safe-rm has two default config files, which must be created manually:
#   global config: /etc/safe-rm.conf
#   user config:   ~/.safe-rm
# Create the global config file
# touch /etc/safe-rm.conf
# Add the protection list
# vim /etc/safe-rm.conf
/
/*
/etc
/etc/*
/usr
/usr/*
/usr/local
/usr/local/*
/usr/local/bin
/usr/local/bin/*
/bin/*
/boot/*
/dev/*
/etc/*
/home/*
/lib/*
/lib64/*
/media/*
/mnt/*
/opt/*
/proc/*
/root/*
/run/*
/sbin/*
/srv/*
/sys/*
/var/*
# Create a test file
# touch /home/test.txt
# Append the path to protect to the config file
# vim /etc/safe-rm.conf
/home/test.txt
# Try to delete the protected path; a "skipping" line means safe-rm is working
# rm /home/test.txt
# rm -rf /home/test.txt
safe-rm: skipping /home/test.txt
# Notes:
# Listing /etc only blocks "rm -rf /etc" itself; "rm -rf /etc/app" can still delete app
# To protect everything under a directory, write /etc/*; but wildcards cannot protect symlinks inside it
# Directories such as /lib or /lib64 hold many library symlinks, which safe-rm cannot protect
# Add /etc/safe-rm.conf itself to the list so the config cannot be deleted and disabled
# Using the system's original rm binary bypasses safe-rm entirely:
# /usr/bin/rm -rf /etc/app
Version 1.x fixes the symlink problem: symlinks under protected directories can no longer be deleted.
The project page is https://launchpad.net/safe-rm/trunk/1.1.0
# Download the safe-rm source
wget -c https://launchpad.net/safe-rm/trunk/1.1.0/+download/safe-rm-1.1.0.tar.gz
# Extract
tar -zxvf safe-rm-1.1.0.tar.gz
# Install the build dependency
yum install cargo
cd safe-rm-1.1.0/
# Compile; the resulting safe-rm binary is about 11M
make
# Move the binary into a system directory
mv target/release/safe-rm /usr/local/bin/
# Replace the rm command
echo 'alias rm="safe-rm -i"' >> /etc/bashrc
# Reload the modified file
source /etc/bashrc
# Create a link so safe-rm shadows rm
ln -s /usr/local/bin/safe-rm /usr/local/bin/rm
echo 'PATH=/usr/local/bin:$PATH' >> /etc/profile
source /etc/profile
While the Certbot team tries to keep the Certbot packages offered by various operating systems working in the most basic sense, due to distribution policies and/or the limited resources of distribution maintainers, Certbot OS packages often have problems that other distribution mechanisms do not. The packages are often old resulting in a lack of bug fixes and features and a worse TLS configuration than is generated by newer versions of Certbot. They also may not configure certificate renewal for you or have all of Certbot’s plugins available. For reasons like these, we recommend most users follow the instructions at https://certbot.eff.org/instructions and OS packages are only documented here as an alternative.
The official doc is https://snapcraft.io/docs/installing-snapd; the CentOS steps are as follows.
Snap is available on CentOS 7.6 and later.
sudo yum install epel-release
After adding the EPEL repository, install snapd:
sudo yum install snapd
After installation, enable the systemd unit that manages the snap communication socket:
sudo systemctl enable --now snapd.socket
To enable classic snap support, create the following symlink:
sudo ln -s /var/lib/snapd/snap /snap
Log out and back in, or reboot, to make sure snap's paths are updated correctly. Snap is now installed.
Install Certbot
Update snap
Run the following to make sure snap is at the latest version:
sudo snap install core
sudo snap refresh core
卸载Certbot和其他Certbot OS包
如果使用操作系统包管理器(如apt、dnf或yum)安装了任何Certbot包,则应在安装Certbot snap之前将其删除,以确保在运行certbot命令时使用的是snap版本,而不是操作系统包管理器安装的版本。执行此操作的确切命令取决于你的操作系统,常见的示例有sudo apt-get remove certbot、sudo dnf remove certbot或sudo yum remove certbot。
如果以前通过Certbot auto脚本使用Certbot,还应该按照此处的说明删除其安装。
安装Certbot
依次执行下列命令
# 删除以前的安装
yum -y remove certbot
rm -rf /opt/scripts/certbot
snap install --classic certbot
当执行 snap install --classic certbot 命令时,可能会有如下报错
[root@localhost ~]# snap install --classic certbot
error: cannot install "certbot": classic confinement requires snaps under /snap
       or symlink from /snap to /var/lib/snapd/snap

解决:
创建一个软链:
ln -s /var/lib/snapd/snap /snap
执行如下命令以确保Certbot命令行可用
ln -s /snap/bin/certbot /usr/bin/certbot
下面是生成两个域名的SSL证书
certbot certonly --standalone --email [email protected] -d open010.com -d www.open010.com
当然,我们用 Let's Encrypt 生成泛域名 ssl 证书最方便了,这样所有子域名都可以直接使用一个证书
官网文档:https://certbot.eff.org/docs/using.html
直接使用官网首页的安装方法是无法使用最新的 Let's Encrypt v2 API 的,这里要加参数 --server https://acme-v02.api.letsencrypt.org/directory;泛域名需要 DNS 验证,需要添加参数 --preferred-challenges dns;而我用的 DNS 服务商没有 API 来自动验证,故还需要添加参数 --manual。还需要注意的是:域名不能只添加 *.xxx.com,还需要加上 xxx.com,不然只有子域名能用,xxx.com 这个域名本身就用不了了。所以最后的命令为:
certbot certonly --preferred-challenges dns --manual -d *.open010.com -d open010.com --server https://acme-v02.api.letsencrypt.org/directory
签发证书时提示添加TXT记录,需要打开域名注册网站,按要求添加一下TXT记录,等记录生效后, 按回车,这样等待签发完成即可
[root@summer sean]# certbot certonly --preferred-challenges dns --manual -d *.open010.com --server https://acme-v02.api.letsencrypt.org/directory
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Requesting a certificate for *.open010.com

Successfully received certificate.
Certificate is saved at: /etc/letsencrypt/live/open010.com-0002/fullchain.pem
Key is saved at:         /etc/letsencrypt/live/open010.com-0002/privkey.pem
This certificate expires on 2024-05-04.
These files will be updated when the certificate renews.

NEXT STEPS:
- This certificate will not be renewed automatically. Autorenewal of --manual certificates requires the use of an authentication hook script (--manual-auth-hook) but one was not provided. To renew this certificate, repeat this same certbot command before the certificate's expiry date.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
If you like Certbot, please consider supporting our work by:
 * Donating to ISRG / Let's Encrypt:   https://letsencrypt.org/donate
 * Donating to EFF:                    https://eff.org/donate-le
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
生成证书时, 会在/etc/letsencrypt/renewal目录下生成配置文件open010.com.conf,生成前可以将旧的配置文件先删除
申请 成功后,证书会保存在/etc/letsencrypt/live/open010.com下面
配置nginx证书,配置中用到一个 /etc/ssl/private/dhparam.pem 文件,该文件是一个 PEM 格式的 Diffie-Hellman 参数文件,用于 TLS 会话中,用来加强 ssl 的安全性。生成该文件的方法:
mkdir -p /etc/ssl/private/
cd /etc/ssl/private/
openssl dhparam -out dhparam.pem 2048
配置到nginx:
ssl_dhparam /etc/ssl/private/dhparam.pem;
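一个完整一点的 nginx server 配置片段示例(域名沿用上文的 open010.com,证书路径请以 certbot 实际生成的 live 目录为准,此处仅为示意):

```nginx
server {
    listen 443 ssl;
    server_name open010.com www.open010.com;

    # certbot 签发的证书与私钥
    ssl_certificate     /etc/letsencrypt/live/open010.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/open010.com/privkey.pem;

    # 前面生成的 DH 参数文件,用于加强 TLS 安全性
    ssl_dhparam /etc/ssl/private/dhparam.pem;
}
```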
自动续期证书
Certbot包带有cron作业或systemd计时器,它将在证书过期之前自动续订证书。除非更改配置,否则不需要再次运行Certbot。通过运行以下命令,可以测试证书的自动续订
sudo certbot renew --dry-run
Let’s Encrypt证书有效期为90天,设置一个定时任务来运行certbot命令来自动续期
0 2 1 * * certbot renew --quiet && nginx -s reload
这样,系统就会每月1日凌晨两点续期证书了
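也可以改用 certbot 的 --deploy-hook 参数,hook 只在证书真正续期成功后才执行,避免每次定时任务运行都无条件 reload nginx(crontab 示例,写法同上):

```
0 2 1 * * certbot renew --quiet --deploy-hook "nginx -s reload"
```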
linux服务器挂载了一个含有海量文件的 nfs 目录,当使用 df 命令时卡住了?
先确认下是不是这个 nfs 目录的原因,使用strace df -h跟踪一下是哪个系统调用有问题。遇到卡住的地方就会停住
[root@node1 ~]# strace df -h
execve("/usr/bin/df", ["df", "-h"], 0x7ffefb6cbde8 /* 23 vars */) = 0
brk(NULL) = 0x230b000
...
...
statfs("/data/nfs",    # 卡住的地方,确实是挂载的 nfs 目录

再用 nfsstat -m 命令定位一下挂载的目录
结果发现, 服务器是挂载了一台不存在的nfs server导致的
处理:
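针对这种挂载点失联导致 df 卡住的情况,一个可能的处理思路是先用 timeout 包住 stat 探测挂载点是否还可访问,失联则做 lazy umount。下面是示意脚本(/data/nfs 为假设的挂载点,umount 行默认注释,需 root 执行):

```shell
#!/bin/bash
# is_stale: 用 timeout 包住 stat,超过 3 秒无响应(或路径不可访问)即认为挂载点失联
is_stale() {
    timeout 3 stat -t "$1" > /dev/null 2>&1 && return 1 || return 0
}

MNT=/data/nfs    # 假设的 nfs 挂载点
if is_stale "$MNT"; then
    echo "$MNT 已失联,建议执行 lazy umount"
    # umount -l "$MNT"    # -l 立即从目录树摘除,不等待 nfs server 响应
fi
```

之后记得把 /etc/fstab 里对应的条目注释掉,防止重启后再次挂载到已不存在的 nfs server。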
#!/bin/bash
DB_USER='ro_all_db'
DATE=`date -d"today" +%Y%m%d`
TIME=`date "+%Y-%m-%d %H:%M:%S"`
MYSQL_SOURCE='172.20.x.x'
# DB_PASSWORD='xxxx'
echo '--------------开始备份:开始时间为 '$TIME
BEGIN=`date "+%Y-%m-%d %H:%M:%S"`
BEGIN_T=`date -d "$BEGIN" +%s`
echo '===>备份'$MYSQL_SOURCE'的mysql实例,开始时间为 '$BEGIN
BACKUP_DIR=/data/backup/$DATE/$MYSQL_SOURCE;
mkdir -p $BACKUP_DIR;
for i in `mysql --ssl-mode=DISABLED -h$MYSQL_SOURCE -u$DB_USER -p'mypasswd' -B iuap_apcom_supportcenter -e "show tables;" | grep 'multilang_'`; do
echo "-->开始备份表$i"
mysqldump --ssl-mode=DISABLED --skip-opt -h$MYSQL_SOURCE -u$DB_USER -p'mypasswd' iuap_apcom_supportcenter $i > $BACKUP_DIR/$i.sql
echo "-->表${i}已经备份完成"
done
cd $BACKUP_DIR
mkdir $BACKUP_DIR/replace_sql
for i in `ls *.sql`; do
cat $i | sed -n '/^INSERT INTO/p'|sed 's/^INSERT INTO/REPLACE INTO/g'> $BACKUP_DIR/replace_sql/$i
done
cd $BACKUP_DIR/replace_sql
let num=1
echo "--------------开始修改:开始时间为 `date "+%Y-%m-%d %H:%M:%S"`"
for i in `ls *.sql`; do
echo "$num-->开始修改表:$i,开始时间为 `date "+%Y-%m-%d %H:%M:%S"`"
mysql -h172.20.1.1 -upremises -pmypasswd iuap_apcom_supportcenter < $i
echo "$num-->完成修改表:$i,完成时间为 `date "+%Y-%m-%d %H:%M:%S"`"
let num+=1
done
TIME=`date "+%Y-%m-%d %H:%M:%S"`
echo "--------------结束修改:时间为`date "+%Y-%m-%d %H:%M:%S"`"
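上面脚本的关键一步是把 mysqldump 导出的 INSERT 语句改写为 REPLACE 语句,这样导入目标库时可以覆盖已有主键的记录。单独演示一下这步转换(示例 SQL 为虚构):

```shell
# 第一个 sed 只保留以 INSERT INTO 开头的行,第二个 sed 把行首改写为 REPLACE INTO
out=$(printf '%s\n' \
    "-- 注释行,会被过滤掉" \
    "INSERT INTO multilang_demo VALUES (1,'a');" \
    | sed -n '/^INSERT INTO/p' | sed 's/^INSERT INTO/REPLACE INTO/')
echo "$out"
# 输出: REPLACE INTO multilang_demo VALUES (1,'a');
```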
将 kubernetes 官方源码 fork 到自己的 repo 中
$ git clone https://github.com/k8sli/kubernetes.git
$ cd kubernetes
$ git remote add upstream https://github.com/kubernetes/kubernetes.git
$ git fetch --all
$ git checkout upstream/release-1.32
$ git checkout -B kubeadm-1.32
注:如果 GitHub 上没有相关 branch,也需要基于 upstream 先创建一个分支(比如 release-1.32)
.github/workflows/kubeadm.yaml:

---
name: Build kubeadm binary
on:
  push:
    tags:
      - 'v*'
  # 手动触发事件
  workflow_dispatch:
jobs:
  build:
    runs-on: ubuntu-20.04
    # 这里我们选择以 tag 的方式触发 job 的运行
    if: startsWith(github.ref, 'refs/tags/')
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Build kubeadm binary
        shell: bash
        run: |
          # 运行 build/run.sh 构建脚本来编译相应平台上的二进制文件
          build/run.sh make kubeadm KUBE_BUILD_PLATFORMS=linux/amd64
          build/run.sh make kubeadm KUBE_BUILD_PLATFORMS=linux/arm64
      # 构建好的二进制文件存放在 _output/dockerized/bin/ 中
      # 我们根据二进制目标文件的系统名称+CPU体系架构名称进行命名
      - name: Prepare for upload
        shell: bash
        run: |
          mv _output/dockerized/bin/linux/amd64/kubeadm kubeadm-linux-amd64
          mv _output/dockerized/bin/linux/arm64/kubeadm kubeadm-linux-arm64
          sha256sum kubeadm-linux-{amd64,arm64} > sha256sum.txt
      # 使用 softprops/action-gh-release 来将构建产物上传到 GitHub release 当中
      - name: Release and upload packages
        uses: softprops/action-gh-release@v1
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          files: |
            sha256sum.txt
            kubeadm-linux-amd64
            kubeadm-linux-arm64
build/run.sh: Run a command in a build docker container. Common invocations:
build/run.sh make: Build just linux binaries in the container. Pass options and packages as necessary.
build/run.sh make cross: Build all binaries for all platforms. To build only a specific platform, add KUBE_BUILD_PLATFORMS=<os>/<arch>
build/run.sh make kubectl KUBE_BUILD_PLATFORMS=darwin/amd64: Build the specific binary for the specific platform (kubectl and darwin/amd64 respectively in this example)
build/run.sh make test: Run all unit tests
build/run.sh make test-integration: Run integration test
build/run.sh make test-cmd: Run CLI tests
在 cmd/kubeadm/app/constants/constants.go 中找到 CertificateValidity 变量,在 365 后面加两个 0,就把证书有效期延长到 100 年了。
// CertificateValidity defines the validity for all the signed certificates generated by kubeadm
- CertificateValidity = time.Hour * 24 * 365
+ CertificateValidity = time.Hour * 24 * 36500
// CACertAndKeyBaseName defines certificate authority base name
CACertAndKeyBaseName = "ca"
找到 NotAfter(默认生成的 ca 证书有效期是 10 年),改为 100 年
func NewSelfSignedCACert(cfg Config, key crypto.Signer) (*x509.Certificate, error) {
now := time.Now()
tmpl := x509.Certificate{
SerialNumber: new(big.Int).SetInt64(0),
Subject: pkix.Name{
CommonName: cfg.CommonName,
Organization: cfg.Organization,
},
DNSNames: []string{cfg.CommonName},
NotBefore: now.UTC(),
// NotAfter: now.Add(duration365d * 10).UTC(),
// extend ca cert to 100 years
NotAfter: now.Add(duration365d * 100).UTC(),
KeyUsage: x509.KeyUsageKeyEncipherment | x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
BasicConstraintsValid: true,
IsCA: true,
}
const (
	// MaximumAllowedMinorVersionUpgradeSkew describes how many minor versions kubeadm can upgrade the control plane version in one go
	// 改成 15
	// MaximumAllowedMinorVersionUpgradeSkew = 1
	MaximumAllowedMinorVersionUpgradeSkew = 15

	// MaximumAllowedMinorVersionDowngradeSkew describes how many minor versions kubeadm can downgrade the control plane version in one go
	// 改成 15
	// MaximumAllowedMinorVersionDowngradeSkew = 1
	MaximumAllowedMinorVersionDowngradeSkew = 15

	// MaximumAllowedMinorVersionKubeletSkew describes how many minor versions the control plane version and the kubelet can skew in a kubeadm cluster
	// 改成 15
	// MaximumAllowedMinorVersionKubeletSkew = 3
	MaximumAllowedMinorVersionKubeletSkew = 15
)
提交:
在分支上完成修改之后,我们将这个修改 cherry-pick 到其他的 tag 上面去,下面以 v1.21.4 为例子:在 v1.21.4 tag 的基础之上将上述的修改 cherry-pick 过来,重新打上新的 tag。
$ COMMIT_ID=$(git rev-parse HEAD)
$ git checkout v1.21.4
Note: checking out 'v1.21.4'.

You are in 'detached HEAD' state. You can look around, make experimental changes
and commit them, and you can discard any commits you make in this state without
impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may do so
(now or later) by using -b with the checkout command again. Example:

HEAD is now at 3cce4a82b44 Release commit for Kubernetes v1.21.4
$ git cherry-pick $COMMIT_ID
[detached HEAD baadbe03458] Update kubeadm CertificateValidity time to ten years
 Date: Tue Aug 24 16:32:49 2021 +0800
 2 files changed, 38 insertions(+), 1 deletion(-)
 create mode 100644 .github/workflows/kubeadm.yaml
$ git tag v1.21.4-patch-1.0 -f
Updated tag 'v1.21.4-patch-1.0' (was 70bcbd6de6c)
$ git push origin --tags -f
Enumerating objects: 17, done.
Counting objects: 100% (17/17), done.
Delta compression using up to 4 threads
Compressing objects: 100% (9/9), done.
Writing objects: 100% (10/10), 1.13 KiB | 192.00 KiB/s, done.
Total 10 (delta 7), reused 0 (delta 0)
remote: Resolving deltas: 100% (7/7), completed with 7 local objects.
To github.com:k8sli/kubernetes.git
 + c2a633e07ec...baadbe03458 v1.21.4-patch-1.0 -> v1.21.4-patch-1.0 (forced update)
上面只展示了以一个 tag 为单位进行构建的流程,想要构建其他版本的 kubeadm ,可以按照同样的流程和方法来完成。其实写一个 shell 脚本来处理也是十分简单,如下:
#!/bin/bash
set -o errexit
set -o nounset
# 定义 commit ID
: ${COMMIT:="48e4b4c7c62a84ab4ec363588721011b73ee77e6"}
# 定义需要重新编译的版本号
: ${TAGS:="v1.22.1 v1.22.0 v1.21.4 v1.21.3 v1.20.10 v1.19.14 v1.18.10"}
for tag in ${TAGS}; do
git reset --hard ${tag}
git cherry-pick ${COMMIT}
git tag ${tag}-patch-1.0
git push origin ${tag}-patch-1.0
done

使用 GitHub Actions 的好处是能够为我们解决代码管理和产物管理,构建好的二进制文件存放在 GitHub release 当中,下载和使用起来十分方便,不用再自己搞一台单独的机器或者存储服务器,节省很多人力维护成本。
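脚本开头的 : ${VAR:="default"} 是 bash 的默认值写法:仅当变量未定义或为空时才赋默认值,便于在外部用环境变量覆盖。演示一下:

```shell
# ":" 是空命令,这里只为触发 ${VAR:=default} 这个参数展开
unset TAG
: ${TAG:="v1.22.1"}
echo "$TAG"
# 输出: v1.22.1

# 变量已有非空值时不会被覆盖
TAG="v1.30.0"
: ${TAG:="v1.22.1"}
echo "$TAG"
# 输出: v1.30.0
```

因此运行时可以像 COMMIT=xxx TAGS="v1.22.1" ./build.sh 这样在外部覆盖默认值(脚本名为假设)。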
更新:
如果下次又要修改的时候
先 checkout 到上次提交的 tag:git checkout v1.32.2-patch-1.0
然后基于此tag创建一个新分支:git checkout -B kubeadm-1.32
修改完后,git add/commit, 然后
COMMIT_ID=$(git rev-parse HEAD)
git checkout v1.32.2-patch-1.0
git cherry-pick $COMMIT_ID
git tag v1.32.2-patch-1.0 -f
git push origin --tags -f
这样就可以不改变远程的tag名称,进行覆盖推送了,当然如果重新命名一个新tag推送也可以
remote: Support for password authentication was removed on August 13, 2021. Please use a personal access token instead.
原因是github不再使用密码方式验证身份,现在使用个人token。
本文记录一下解决办法。
github的官方有给出如何生成个人token的文档。参考github官网生成token文档
之前,github使用用户名和密码作为身份验证,现在使用用户名和token作为验证。
比如,github官网给出的示例。克隆一个仓库,提示输入用户名和密码,此处就可以使用上面生成的token作为密码使用。
$ git clone https://github.com/username/repo.git
Username: your_username
Password: your_token
但是有一个问题,我们总不能记住那么长的一串token吧
为了解决这个问题,github提供了gh工具,通过gh登录验证身份后,之后再不需要验证身份。
此处只演示ubuntu安装gh工具。
$ sudo apt update
$ sudo apt install snapd
$ sudo apt install gh
然后使用gh进行认证
$ gh auth login # 输入你的用户名和token
使用键盘上下键选择对应项,回车键确认。
依次选择 Github.com、HTTPS(如果使用的是 https 协议),然后选择使用网页浏览器认证或者粘贴 token 认证,二者选其一即可。如果是 ssh 远程登录,命令行中无法打开远程的浏览器,那么只能选择 token 验证。选择网页认证时:先复制命令行中生成的一次性验证码(比如我这里本次是 5C38-D954),然后回车,会自动打开网页浏览器,输入一次性验证码并授权即可完成认证。如果选择 token 认证,输入你的 token 即可。

如果换了一台机器,那么重新生成一个新的token,然后gh auth login即可。
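另外,如果不想安装 gh,也可以用 git 自带的 credential helper 把 token 缓存起来(store 模式会把 token 明文存到 ~/.git-credentials,注意安全):

```shell
# 配置后首次输入用户名和 token,之后 git 不再询问
git config --global credential.helper store
```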
使用ansible mitogen 0.3.4插件进行kubespray安装时,报错:
经过debug分析,kubespray-default默认定义了如下变量模板:
ansible_ssh_common_args: |
"{% if 'bastion' in groups['all'] %} -o ProxyCommand='ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -W %h:%p
-p {{ hostvars['bastion']['ansible_port'] | default(22) }} {{ hostvars['bastion']['ansible_user'] }}@{{ hostvars['bastion']['ansible_host'] }} {% if ansible_ssh_private_key_file is defined %}
-i {{ ansible_ssh_private_key_file }}{% endif %} ' {% endif %}"
mitogen.parent: command line for Connection(None): ssh -o "LogLevel ERROR" -l root -p 22 -o "Compression yes" -o "ServerAliveInterval 30" -o "ServerAliveCountMax 10"
-o "StrictHostKeyChecking no" -o "UserKnownHostsFile /dev/null" -o "GlobalKnownHostsFile /dev/null" -C -o ControlMaster=auto
-o ControlPersist=60s {% if bastion in groups[all] %} -o "ProxyCommand=ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -W %h:%p
-p {{ hostvars[bastion][ansible_port] | default(22) }} {{ hostvars[bastion][ansible_user] }}@{{ hostvars[bastion][ansible_host] }} {% if ansible_ssh_private_key_file is defined %}
-i {{ ansible_ssh_private_key_file }}{% endif %} " {% endif %} 172.20.59.209 /usr/bin/python -c
def ssh_args(self):
    print('------------>')
    print('ssh_args: ', C.config.get_config_value("ssh_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {})))
    print('ssh_common_args: ', C.config.get_config_value("ssh_common_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {})))
    print('ssh_extra_args: ', C.config.get_config_value("ssh_extra_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {})))
    print('----------->')
    return [
        mitogen.core.to_text(term)
        for s in (
            C.config.get_config_value("ssh_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {})),
            C.config.get_config_value("ssh_common_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {})),
            C.config.get_config_value("ssh_extra_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {})),
            # C.config.get_config_value("ssh_args", plugin_type="connection", plugin_name="ssh", variables=local_vars),
            # C.config.get_config_value("ssh_common_args", plugin_type="connection", plugin_name="ssh", variables=local_vars),
            # C.config.get_config_value("ssh_extra_args", plugin_type="connection", plugin_name="ssh", variables=local_vars),
        )
        for term in ansible.utils.shlex.shlex_split(s or '')
    ]
下面是输出:
------------>
ssh_args: -o ControlMaster=auto -o ControlPersist=1d -o UserKnownHostsFile=/dev/null -o ConnectTimeout=30 -o Compression=yes -o TCPKeepAlive=yes -o ServerAliveInterval=60 -o StrictHostKeyChecking=no -o AddKeysToAgent=yes -o ControlPath=~/.ssh/%r@%h-%p
ssh_common_args: {% if 'bastion' in groups['all'] %} -o ProxyCommand='ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -W %h:%p -p {{ hostvars['bastion']['ansible_port'] | default(22) }} {{ hostvars['bastion']['ansible_user'] }}@{{ hostvars['bastion']['ansible_host'] }} {% if ansible_ssh_private_key_file is defined %}-i {{ ansible_ssh_private_key_file }}{% endif %} ' {% endif %}
ssh_extra_args:
----------->
可以看到ssh_common_args变量没有渲染
解决办法:将 _task_vars.get("vars", {}) 改为 _task_vars.get("hostvars", {}),从 hostvars 取值
