Alibaba Cloud CSI

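  • Alibaba Cloud offers several storage types that can back a k8s cluster:
    • OSS:
      • Supports static volumes only; dynamic volumes are not supported
      • OSSFS has poor network performance, but it can handle read workloads on small files
    • NAS:
      • Shared storage that provides high performance and high throughput
      • Can be accessed across availability zones, but not across VPCs
    • Alibaba Cloud disk:
      • Non-shared storage: each disk can be mounted on only one node at a time
      • Highest performance and lowest latency
      • Cannot be accessed across availability zones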

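Configuring Alibaba Cloud NAS

Activating the NAS service

  • Activate the NAS service in the Alibaba Cloud console
  • Add a mount point
  • Purchase resource packages as needed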

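Configuring network access to NAS

Comparison of approaches

  • If the k8s cluster runs on Alibaba Cloud, it can reach NAS directly over the internal network
  • If the cluster runs in an on-premises data center (IDC), NAS can be reached through VPN or NAT
    • VPN:
      • Pros: VPN provides secure access (traffic is encrypted with IPsec)
      • Cons:
        • I/O performance when accessing the file system over VPN is limited by the public bandwidth and latency between the IDC and the VPC, or between VPCs
        • Higher cost
    • NAT:
      • Pros: simple to configure, low cost
      • Cons:
        • Security: because the EIP is connected to the VPC network, anyone who can reach the EIP can mount the mount point behind it
        • Each EIP + port pair maps to only one mount point, so accessing several mount points at once requires several EIPs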

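Configuring NAT

  • This is a test environment, so NAT is used as the example
  • Get the Alibaba Cloud internal access address (domain name) of the newly created mount point
  • Resolve the internal domain name to its internal IP
    • Alibaba Cloud does not display the mount point's IP, so create an ECS instance in the same region as the NAS and run ping against the domain name to obtain the IP
  • Create a public NAT gateway
  • Bind a public elastic IP (EIP)

    • Choose to bind immediately

    • If an unbound EIP already exists, bind it directly; otherwise create a new EIP

  • Configure DNAT to map the public IP to the internal NAS IP

    • Create DNAT

    • Create a DNAT entry

  • Test the connection
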
$ sudo mount -t nfs  118.190.XXX.XX:/ /mnt/
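
The ping step above prints the resolved address on its first line. As a minimal sketch, the helper below pulls that IP out of the ping output so it can be copied into the DNAT entry. The function name `extract_ping_ip` is hypothetical, and the pattern assumes the Linux iputils output format (`PING host (ip) ...`); other ping implementations may print something different.

```shell
# Hypothetical helper: read `ping` output on stdin and print the IPv4 address
# found on its first line, e.g.
#   "PING host.example (10.0.0.8) 56(84) bytes of data."  ->  10.0.0.8
extract_ping_ip() {
  sed -n '1s/^PING [^ ]* (\([0-9.]*\)).*/\1/p'
}
```

On the ECS instance this would be used as `ping -c 1 <mount-point-domain> | extract_ping_ip`, where `<mount-point-domain>` stands for the internal domain name of the mount point.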

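Installing the Alibaba Cloud CSI driver

Installing the Alibaba Cloud NAS CSI plugin

  • Download the YAML manifests
$ git clone https://github.com/kubernetes-sigs/alibaba-cloud-csi-driver.git
$ cd alibaba-cloud-csi-driver
  • Deploy RBAC (k is assumed to be a shell alias for kubectl)
$ k apply -f deploy/rbac.yaml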
  • Deploy nas-plugin

    • Using the YAML from the repository as-is causes errors on a self-managed k8s cluster, so a few fields need to be changed

    # vim deploy/nas/nas-plugin.yaml
    .......
              image: registry.cn-hangzhou.aliyuncs.com/acs/csi-plugin:v1.18.8.47-906bd535-aliyun
              imagePullPolicy: "Always"
              args:
                - "--endpoint=$(CSI_ENDPOINT)"
                - "--v=2"
                - "--driver=nas"
                - "--nodeid=$(KUBE_NODE_NAME)" # add this line
              env:
                - name: KUBE_NODE_NAME
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: spec.nodeName
                - name: CSI_ENDPOINT
    .......
    
    • Create nas-plugin

    $ k apply -f deploy/nas/nas-plugin.yaml
    Warning: spec.template.spec.nodeSelector[beta.kubernetes.io/os]: deprecated since v1.14; use "kubernetes.io/os" instead
    daemonset.apps/csi-plugin created
    
  • Deploy nas-provisioner

    • Edit the YAML

    # vi deploy/nas/nas-provisioner.yaml
    ......
              image: registry.cn-hangzhou.aliyuncs.com/acs/csi-plugin:v1.18.8.47-906bd535-aliyun
              imagePullPolicy: "Always"
              args:
                - "--endpoint=$(CSI_ENDPOINT)"
                - "--v=2"
                - "--driver=nas"
                - "--nodeid=$(KUBE_NODE_NAME)" # add this line
              env:
                # this file does not define KUBE_NODE_NAME, so add the variable below as well
                - name: KUBE_NODE_NAME
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: spec.nodeName
                # end of the added block
    ......
    
    • Create nas-provisioner

    $ k apply -f deploy/nas/nas-provisioner.yaml
    
    • Check the deployment status

    $ k get pods -n kube-system
    NAME                                READY   STATUS    RESTARTS        AGE
    coredns-7f6cbbb7b8-72jr5            1/1     Running   26 (28h ago)    50d
    coredns-7f6cbbb7b8-cz5cg            1/1     Running   25 (28h ago)    50d
    csi-plugin-qk8f4                    2/2     Running   0               17m
    csi-provisioner-59f6f488f9-mql7r    2/2     Running   4 (3m31s ago)   15m
    etcd-xiang-nuc                      1/1     Running   24 (28h ago)    50d
    kube-apiserver-xiang-nuc            1/1     Running   25 (28h ago)    50d
    kube-controller-manager-xiang-nuc   1/1     Running   34 (28h ago)    50d
    kube-proxy-nclpw                    1/1     Running   25 (28h ago)    50d
    kube-scheduler-xiang-nuc            1/1     Running   40 (28h ago)    50d
    weave-net-k2t65                     2/2     Running   50 (28h ago)    50d
    
  • Error details

    • If the Alibaba Cloud nas-plugin and nas-provisioner YAML files are not adjusted, the missing --nodeid argument prevents the pods from starting
    • Check the pod startup status:

    $ k describe pods csi-provisioner-5fdbc6d848-6znwp -n kube-system
    ......
    Events:
    Type     Reason       Age                   From               Message
    ----     ------       ----                  ----               -------
    ......
    Normal   Pulled       21m                   kubelet            Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/acs/csi-plugin:v1.18.8.47-906bd535-aliyun" in 504.655489ms
    Warning  Unhealthy    17m (x14 over 26m)    kubelet            Liveness probe failed: Get "http://192.168.1.200:11270/healthz": dial tcp 192.168.1.200:11270: connect: connection refused
    Warning  BackOff      2m40s (x30 over 22m)  kubelet            Back-off restarting failed container
    
    • Check the detailed pod logs

    $ k logs csi-provisioner-5fdbc6d848-6znwp -n kube-system
    error: a container name must be specified for pod csi-provisioner-5fdbc6d848-6znwp, choose one of: [external-nas-provisioner csi-provisioner]
    $ k logs csi-provisioner-5fdbc6d848-6znwp -c csi-provisioner -n kube-system
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
      0     0    0     0    0     0      0      0 --:--:--  0:02:09 --:--:--     0curl: (7) Failed connect to 100.100.100.200:80; Connection timed out
    Running nas plugin....
    time="2021-12-25T23:03:54+08:00" level=info msg="Multi CSI Driver Name: nas, nodeID: , endPoints: unix://var/lib/kubelet/csi-provisioner/driverplugin.csi.alibabacloud.com-replace/csi.sock"
    time="2021-12-25T23:03:54+08:00" level=info msg="CSI Driver Branch: 'master', Version: 'v1.18.8.47-906bd535-aliyun', Build time: '2021-05-13-20:56:55'\n"
    time="2021-12-25T23:03:54+08:00" level=info msg="Create Stroage Path: /var/lib/kubelet/csi-plugins/nasplugin.csi.alibabacloud.com/controller"
    time="2021-12-25T23:03:54+08:00" level=info msg="Create Stroage Path: /var/lib/kubelet/csi-plugins/nasplugin.csi.alibabacloud.com/node"
    time="2021-12-25T23:03:54+08:00" level=info msg="CSI is running status."
    time="2021-12-25T23:03:54+08:00" level=info msg="Metric listening on address: /healthz"
    time="2021-12-25T23:03:54+08:00" level=info msg="Driver: nasplugin.csi.alibabacloud.com version: 1.0.0"
    time="2021-12-25T23:03:54+08:00" level=info msg="Metric listening on address: /metrics"
    time="2021-12-25T23:04:24+08:00" level=info msg="Use node id : "
    E1225 23:04:24.621878      12 driver.go:46] NodeID missing
    I1225 23:04:24.622025      12 driver.go:93] Enabling volume access mode: MULTI_NODE_MULTI_WRITER
    panic: runtime error: invalid memory address or nil pointer dereference
    [signal SIGSEGV: segmentation violation code=0x1 addr=0x50 pc=0x17cac64]
      
    goroutine 28 [running]:
    github.com/kubernetes-csi/drivers/pkg/csi-common.(*CSIDriver).AddVolumeCapabilityAccessModes(0x0, 0xc0002f1eec, 0x1, 0x1, 0x0, 0x0, 0x0)
      /home/regressionTest/go/pkg/mod/github.com/kubernetes-csi/drivers@v1.0.2/pkg/csi-common/driver.go:96 +0x1c4
    github.com/kubernetes-sigs/alibaba-cloud-csi-driver/pkg/nas.NewDriver(0x0, 0x0, 0xc0003404b0, 0x4e, 0x0)
      /home/regressionTest/go/src/github.com/kubernetes-sigs/alibaba-cloud-csi-driver/pkg/nas/nas.go:86 +0x1b5
    main.main.func1(0xc00059d1c0, 0xc0003404b0, 0x4e)
      /home/regressionTest/go/src/github.com/kubernetes-sigs/alibaba-cloud-csi-driver/main.go:180 +0x79
    created by main.main
      /home/regressionTest/go/src/github.com/kubernetes-sigs/alibaba-cloud-csi-driver/main.go:178 +0xfe8
    
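    • The error occurs because csi-plugin writes the nodeid into an annotation on each node to tell the CSI nodes apart. The log shows the request to 100.100.100.200 timing out and the nodeID staying empty, so on machines outside Alibaba Cloud the hostname has to be passed to csi-plugin manually as the nodeid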
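
Since the crash comes down to a missing `--nodeid` argument, a quick check before and after applying the manifests can catch it early. The sketch below is only an illustration: `has_nodeid_arg` and `csi_not_running` are hypothetical helper names, and the file paths assume the commands run from the root of the cloned repository.

```shell
# Hypothetical helper: succeed if the manifest passes --nodeid to the driver.
has_nodeid_arg() {
  grep -q -- '--nodeid=' "$1" 2>/dev/null
}

# Hypothetical helper: read `kubectl get pods` output on stdin and print
# every csi-* pod whose STATUS column is not "Running".
csi_not_running() {
  awk 'NR > 1 && $1 ~ /^csi-/ && $3 != "Running" { print $1, $3 }'
}

# Check both manifests edited above (paths relative to the repository root).
for f in deploy/nas/nas-plugin.yaml deploy/nas/nas-provisioner.yaml; do
  if has_nodeid_arg "$f"; then
    echo "ok: $f"
  else
    echo "no --nodeid argument found: $f"
  fi
done
```

After applying the manifests, `k get pods -n kube-system | csi_not_running` prints nothing when every csi-* pod is in the Running state.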

Configuring the StorageClass (sc) and testing

Configuring the sc

  • Write the sc manifest; it is set as the default sc further below
# alicloud-nas-subpath.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: alicloud-nas-subpath
mountOptions:
- nolock,tcp,noresvport
- vers=4
parameters:
  volumeAs: subpath
  server: "47.104.XXX.X:/test/" # change to the public IP mapped to the NAS
provisioner: nasplugin.csi.alibabacloud.com
reclaimPolicy: Retain
  • Create the sc
$ k apply -f alicloud-nas-subpath.yaml
  • Set it as the default sc
$ kubectl patch storageclass alicloud-nas-subpath -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
  • Check the status
$ k get sc
NAME                   PROVISIONER                      RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
alicloud-nas-subpath   nasplugin.csi.alibabacloud.com   Retain          Immediate           false                  89s

Testing

  • Create the pvc
# ali-pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata: 
  name: nas-csi-pvc
spec:
  accessModes:
  - ReadWriteMany 
  storageClassName: alicloud-nas-subpath
  resources: 
    requests:
      storage: 20Gi
$ k apply -f ali-pvc.yaml
$ k get pvc
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS           AGE
nas-csi-pvc   Bound    nas-2ac6f08a-7463-4843-b728-6c71d4baa629   20Gi       RWX            alicloud-nas-subpath   3m17s
  • Create the applications nginx-1 and nginx-2, which share the same subdirectory of the NAS volume
# nginx-1.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-nas-1
  labels:
    app: nginx-1
spec:
  selector:
    matchLabels:
      app: nginx-1
  template:
    metadata:
      labels:
        app: nginx-1
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
        volumeMounts:
          - name: nas-pvc
            mountPath: "/data"
      volumes:
        - name: nas-pvc
          persistentVolumeClaim:
            claimName: nas-csi-pvc
# nginx-2.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-nas-2
  labels:
    app: nginx-2
spec:
  selector:
    matchLabels:
      app: nginx-2
  template:
    metadata:
      labels:
        app: nginx-2
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
        volumeMounts:
          - name: nas-pvc
            mountPath: "/data"
      volumes:
        - name: nas-pvc
          persistentVolumeClaim:
            claimName: nas-csi-pvc
$ k apply -f nginx-1.yaml
$ k apply -f nginx-2.yaml
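
To confirm that both Deployments really see the same NAS subdirectory, one option is to create a file through one pod and look for it through the other. The following is a sketch under assumptions rather than a definitive check: `kc`, `check_shared_volume`, the `KUBECTL` override variable, and the marker file name `share-test` are all hypothetical names introduced here, and the Deployments are assumed to be in the current namespace.

```shell
# Hypothetical wrapper: runs kubectl, or the command named in $KUBECTL if set
# (KUBECTL is an assumption added here so the binary can be swapped).
kc() {
  "${KUBECTL:-kubectl}" "$@"
}

# Create a marker file through one Deployment and look for it through the other.
# Succeeds only if both commands succeed, i.e. the file is visible in both pods.
check_shared_volume() {
  kc exec deploy/deployment-nas-1 -- touch /data/share-test &&
    kc exec deploy/deployment-nas-2 -- ls /data/share-test
}
```

Once both Deployments are Running, `check_shared_volume && echo shared` reports whether the file written under /data in nginx-1 is visible in nginx-2.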

References

Alibaba Cloud NAS

  • Accessing Alibaba Cloud NAS through VPN: https://help.aliyun.com/document_detail/54998.html
  • Accessing Alibaba Cloud NAS through NAT: https://help.aliyun.com/document_detail/57628.html

Alibaba Cloud CSI