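Overview

  • The official Kubernetes documentation recommends running a node local DNS cache as a DaemonSet. It caches CoreDNS answers on each node, which reduces how often CoreDNS is queried and improves service stability.
    • Reference: https://kubernetes.io/zh/docs/tasks/administer-cluster/nodelocaldns/
  • Architecture:
  • Configuration
    • coredns:
      • Workload object: Deployment
      • Configuration object: ConfigMap
    • node local dns:
      • Workload object: DaemonSet
      • Configuration object: ConfigMap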

Problem

  • The node local dns pod is in CrashLoopBackOff
  • Events of the failing pod:
$ ks describe po nodelocaldns-6k72k
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  5m43s                  default-scheduler  Successfully assigned kube-system/nodelocaldns-6k72k to node03
  Normal   Pulled     4m47s (x4 over 5m42s)  kubelet            Container image "registry.cn-hangzhou.aliyuncs.com/kevin-k8s/k8s-dns-node-cache:1.21.1" already present on machine
  Normal   Created    4m47s (x4 over 5m42s)  kubelet            Created container node-cache
  Normal   Started    4m47s (x4 over 5m42s)  kubelet            Started container node-cache
  Warning  Unhealthy  4m43s                  kubelet            Liveness probe failed: Get "http://169.254.25.10:9254/health": dial tcp 169.254.25.10:9254: connect: connection refused
  Warning  Unhealthy  4m43s                  kubelet            Readiness probe failed: Get "http://169.254.25.10:9254/health": dial tcp 169.254.25.10:9254: connect: connection refused
  Warning  BackOff    37s (x31 over 5m34s)   kubelet            Back-off restarting failed container
  • Logs of the failing pod:
$ ks logs -f nodelocaldns-6k72k
...
[INFO] plugin/reload: Running configuration MD5 = 2514b5c46ed7aac066e61a06ae9f744e
CoreDNS-1.7.0
linux/amd64, go1.16.8,
[FATAL] plugin/loop: Loop (169.254.25.10:40910 -> 169.254.25.10:53) detected for zone "in-addr.arpa.", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 8242809723071053454.7869479726261281517.in-addr.arpa."
  • coredns logs:
$ ks logs -f coredns-bcd64976d-5rqsx
...
[ERROR] plugin/errors: 2 3167181245102042778.6454700926367002340.ip6.arpa. HINFO: read udp 10.100.40.29:45000->114.114.114.114:53: i/o timeout
[ERROR] plugin/errors: 2 5644203932264654170.2772417498849148561.in-addr.arpa. HINFO: read udp 10.100.40.29:44809->114.114.114.114:53: i/o timeout

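Root Cause and Fix

Root cause

  • On startup, localdns sends a name-resolution request to coredns to check that it can communicate with coredns.

  • The configuration file (ConfigMap) contains IPv6-related settings, so this health check is also performed for IPv6 addresses.

  • The host DNS server on every master/node is 114.114.114.114, and 114.114.114.114 does not resolve IPv6 addresses reliably.

    • Resolution via 114.114.114.114 times out: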
    
    $ dig 4071248015005853885.5531442340350897257.ip6.arpa. @114.114.114.114
      
    ; <<>> DiG 9.16.1-Ubuntu <<>> 4071248015005853885.5531442340350897257.ip6.arpa. @114.114.114.114
    ;; global options: +cmd
    ;; connection timed out; no servers could be reached
      
    
    • 1.2.4.8 responds reliably:
    
    $ dig 4071248015005853885.5531442340350897257.ip6.arpa. @1.2.4.8
      
    ; <<>> DiG 9.16.1-Ubuntu <<>> 4071248015005853885.5531442340350897257.ip6.arpa. @1.2.4.8
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 47146
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
      
    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 4096
    ;; QUESTION SECTION:
    ;4071248015005853885.5531442340350897257.ip6.arpa. IN A
      
    ;; AUTHORITY SECTION:
    ip6.arpa.               3600    IN      SOA     b.ip6-servers.arpa. nstld.iana.org. 2021111926 1800 900 604800 3600
      
    ;; Query time: 376 msec
    ;; SERVER: 1.2.4.8#53(1.2.4.8)
    ;; WHEN: Wed Mar 16 15:25:15 CST 2022
    ;; MSG SIZE  rcvd: 141
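
The comparison above can be repeated for any list of candidate resolvers. The following is a minimal sketch, assuming `dig` is installed on the host; `probe_resolver` is a hypothetical helper name, not part of any tool. It treats any reply (including NXDOMAIN) as a responsive resolver, because the probe name is not expected to exist.

```shell
# Hypothetical helper: report whether a resolver answers a reverse-zone probe.
# Arguments: $1 = resolver IP, $2 = name to query (defaults to the probe name used above).
probe_resolver() {
  local server="$1"
  local name="${2:-4071248015005853885.5531442340350897257.ip6.arpa.}"
  local out
  # +time=2 +tries=1 keeps the check short; "|| true" tolerates dig's non-zero exit on timeout.
  out="$(dig +time=2 +tries=1 "$name" "@$server" 2>&1)" || true
  if printf '%s\n' "$out" | grep -q 'status: '; then
    # dig prints "status: <RCODE>" in the header line only when a reply arrived.
    printf '%s answered (%s)\n' "$server" \
      "$(printf '%s\n' "$out" | sed -n 's/.*status: \([A-Z]*\).*/\1/p' | head -n1)"
    return 0
  fi
  printf '%s no answer\n' "$server"
  return 1
}
```

For example, `for s in 114.114.114.114 1.2.4.8 119.29.29.29 223.6.6.6; do probe_resolver "$s"; done` prints one line per resolver, which makes it easy to spot the one that times out.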
    

Fix

  • Disable IPv6 on the hosts
$ sudo vim /etc/sysctl.conf
######## added ##########
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1

# apply
$ sudo sysctl -p
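
To confirm that the kernel picked up the setting, one option is to read the flags back from /proc. The sketch below assumes a Linux host; `ipv6_disabled` is a hypothetical helper name. A value of 1 in a `disable_ipv6` file means IPv6 is disabled for that scope, and 0 means it is still enabled.

```shell
# Hypothetical helper: succeed only if every given disable_ipv6 flag file contains 1.
# With no arguments it checks the all/default/lo scopes set in /etc/sysctl.conf.
ipv6_disabled() {
  if [ "$#" -eq 0 ]; then
    set -- /proc/sys/net/ipv6/conf/all/disable_ipv6 \
           /proc/sys/net/ipv6/conf/default/disable_ipv6 \
           /proc/sys/net/ipv6/conf/lo/disable_ipv6
  fi
  for f in "$@"; do
    # A missing or unreadable file is treated the same as "not disabled".
    [ "$(cat "$f" 2>/dev/null)" = "1" ] || { echo "IPv6 still enabled: $f"; return 1; }
  done
  echo "IPv6 disabled"
}
```

Calling `ipv6_disabled` with no arguments after `sudo sysctl -p` checks the live kernel values on that host.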
  • Change the DNS servers on every host
$ sudo vim /etc/netplan/00-installer-config.yaml
      ...
      nameservers:
        addresses:
        - 1.2.4.8         # CNNIC (China Internet Network Information Center)
        - 119.29.29.29    # Tencent Cloud (DNSPod)
        - 223.6.6.6       # Alibaba Cloud
        #- 114.114.114.114
      ...

$ sudo netplan apply
  • Recreate all coredns pods
  • Recreate the node local dns pods