Calico networking on Kubernetes

Environment

Three VMs are installed on a single physical server. Each VM runs the following operating system:
root@master:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 19.10
Release: 19.10
Codename: eoan

One VM acts as the master and the other two as workers:

root@master:~# kubectl get nodes -o wide
NAME     STATUS   ROLES    AGE    VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE       KERNEL-VERSION     CONTAINER-RUNTIME
master   Ready    master   112d   v1.17.3   192.168.122.20   <none>        Ubuntu 19.10   5.3.0-55-generic   docker://19.3.2
node1    Ready    <none>   112d   v1.17.3   192.168.122.21   <none>        Ubuntu 19.10   5.3.0-55-generic   docker://19.3.2
node2    Ready    <none>   112d   v1.17.3   192.168.122.22   <none>        Ubuntu 19.10   5.3.0-55-generic   docker://19.3.2



Installing Calico
wget https://docs.projectcalico.org/manifests/calico.yaml
kubectl apply -f calico.yaml

root@master:~/calico# kubectl get pod -n kube-system -o wide
NAME                                       READY   STATUS    RESTARTS   AGE    IP               NODE     NOMINATED NODE   READINESS GATES
calico-kube-controllers-5b644bc49c-94g6h   1/1     Running   0          82s    10.24.104.2      node2    <none>           <none>
calico-node-75kns                          1/1     Running   0          82s    192.168.122.20   master   <none>           <none>
calico-node-fh969                          1/1     Running   0          82s    192.168.122.22   node2    <none>           <none>
calico-node-lbbd9                          1/1     Running   0          82s    192.168.122.21   node1    <none>           <none>
coredns-9d85f5447-5s8k9                    0/1     Running   3          112d   10.24.219.65     master   <none>           <none>
coredns-9d85f5447-zbc8m                    1/1     Running   2          112d   10.24.219.66     master   <none>           <none>
etcd-master                                1/1     Running   2          112d   192.168.122.20   master   <none>           <none>
kube-apiserver-master                      1/1     Running   2          112d   192.168.122.20   master   <none>           <none>
kube-controller-manager-master             1/1     Running   2          112d   192.168.122.20   master   <none>           <none>
kube-proxy-l4wn7                           1/1     Running   2          112d   192.168.122.22   node2    <none>           <none>
kube-proxy-prhcm                           1/1     Running   2          112d   192.168.122.21   node1    <none>           <none>
kube-proxy-psxqt                           1/1     Running   2          112d   192.168.122.20   master   <none>           <none>
kube-scheduler-master                      1/1     Running   2          112d   192.168.122.20   master   <none>           <none>

calicoctl is Calico's command-line client. It can be used to view and modify the Calico configuration.

wget https://github.com/projectcalico/calicoctl/releases/download/v3.5.4/calicoctl -O /usr/bin/calicoctl
chmod +x /usr/bin/calicoctl


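Network modes

Calico supports three network modes, which can be selected by editing calico.yaml:

  • overlay -- ipip
  • overlay -- vxlan
  • underlay -- BGP

The sections below configure and verify each mode in turn and then trace the traffic flow.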

overlay -- ipip

configure

After installation, Calico runs in IPIP mode by default.
The nodes peer with each other over BGP in a full mesh.

root@master:~/calico# calicoctl node status
Calico process is running.

IPv4 BGP status
+----------------+-------------------+-------+----------+-------------+
|  PEER ADDRESS  |     PEER TYPE     | STATE |  SINCE   |    INFO     |
+----------------+-------------------+-------+----------+-------------+
| 192.168.122.21 | node-to-node mesh | up    | 17:37:27 | Established |
| 192.168.122.22 | node-to-node mesh | up    | 17:37:28 | Established |
+----------------+-------------------+-------+----------+-------------+

IPv6 BGP status
No IPv6 peers found.

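Enter a calico-node pod and look at the running processes.

  • felix programs the direct routes to the local pods and manages the interfaces
  • bird picks up those direct routes and advertises them to the other nodes over BGP
  • confd dynamically updates bird's configuration file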
root@master:~/calico# kubectl exec -it calico-node-lbbd9 -n kube-system bash
[root@node1 /]# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 17:37 ?        00:00:00 /usr/local/bin/runsvdir -P /etc/service/enabled
root        44     1  0 17:37 ?        00:00:00 runsv felix
root        45     1  0 17:37 ?        00:00:00 runsv bird6
root        46     1  0 17:37 ?        00:00:00 runsv bird
root        47     1  0 17:37 ?        00:00:00 runsv confd
root        51    47  0 17:37 ?        00:00:00 calico-node -confd
root       148    45  0 17:37 ?        00:00:00 bird6 -R -s /var/run/calico/bird6.ctl -d -c /etc/calico/confd/config/bird6.cfg
root       149    46  0 17:37 ?        00:00:00 bird -R -s /var/run/calico/bird.ctl -d -c /etc/calico/confd/config/bird.cfg
root       163    44  2 17:37 ?        00:00:06 calico-node -felix
root       866     0  0 17:40 pts/0    00:00:00 bash
root      1263   866  0 17:42 pts/0    00:00:00 ps -ef

Each node also gains an extra network interface, tunl0, which encapsulates and decapsulates IPIP packets:

11: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1440 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
    inet 10.24.166.129/32 brd 10.24.166.129 scope global tunl0
       valid_lft forever preferred_lft forever
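
The MTU on tunl0 is smaller than a typical Ethernet MTU because IPIP prepends one extra IPv4 header to every packet. The following is a minimal Python sketch of that arithmetic, not part of the original setup. It uses the standard 20-byte IPv4 header (no options) and 14-byte Ethernet header, and it assumes a 1500-byte MTU on ens3, which the original does not show. The 76-byte inner IP length is the value reported by the tcpdump captures later in this section.

```python
# Standard header sizes in bytes; these are protocol constants, not Calico settings.
ETHERNET_HEADER = 14
IPV4_HEADER = 20  # IPv4 header without options


def ipip_outer_ip_length(inner_ip_length: int) -> int:
    """IPIP wraps the inner IP packet in one additional IPv4 header."""
    return inner_ip_length + IPV4_HEADER


def frame_length(ip_length: int) -> int:
    """Length of the Ethernet frame that carries an IP packet of this size."""
    return ip_length + ETHERNET_HEADER


def max_inner_packet(underlay_mtu: int) -> int:
    """Largest inner packet that still fits in the underlay after encapsulation."""
    return underlay_mtu - IPV4_HEADER


inner = 76  # IP length of the ICMP echo request in the captures
print(ipip_outer_ip_length(inner))                # 96
print(frame_length(ipip_outer_ip_length(inner)))  # 110
print(max_inner_packet(1500))                     # 1480 (assumed 1500-byte underlay)
```

The first two results match the "length 96" and "length 110" fields in the ens3 capture. Under the 1500-byte assumption, the 1440 configured on tunl0 stays below the 1480-byte bound.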

verify

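Deploy two pods from the YAML files below to verify network connectivity.
nginx.yaml
1nginx.yaml -- a copy of nginx.yaml with the name changed (to nginx1, as the apply output shows)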

root@master:~# cat nginx.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      name: nginx
  template:
    metadata:
      labels:
        name: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        imagePullPolicy: Always

---
kind: Service
apiVersion: v1
metadata:
  name: nginx
spec:
  type: ClusterIP
  ports:
  - name: nginx
    port: 3306
    targetPort: 80
    protocol: TCP
  selector:
    name: nginx
root@master:~# kubectl apply -f nginx.yaml
deployment.apps/nginx unchanged
service/nginx unchanged
root@master:~# kubectl apply -f 1nginx.yaml
deployment.apps/nginx1 unchanged
service/nginx1 unchanged
root@master:~# kubectl get pod -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP              NODE    NOMINATED NODE   READINESS GATES
nginx-677dc4d96-vrbp5    1/1     Running   0          18s   10.24.104.3     node2   <none>           <none>
nginx1-677dc4d96-8bjvv   1/1     Running   0          21s   10.24.166.130   node1   <none>           <none>

The two pods have been scheduled onto different workers.
From inside one pod, the other pod can be pinged:

root@master:~# kubectl exec -it nginx1-677dc4d96-8bjvv bash
root@nginx1-677dc4d96-8bjvv:/# ping 10.24.104.3 -c1
PING 10.24.104.3 (10.24.104.3): 48 data bytes
56 bytes from 10.24.104.3: icmp_seq=0 ttl=62 time=2.369 ms
--- 10.24.104.3 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 2.369/2.369/2.369/0.000 ms

traffic flow

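Take 10.24.166.130 pinging 10.24.104.3 as an example:

  1. The pod's routing table sends the packet to the default gateway 169.254.1.1.
    The pod sends an ARP request for the MAC of 169.254.1.1. The request arrives at caliadb5d6cab6f, the host end of the pod's veth pair. Proxy ARP is enabled on this device, so it answers with its own MAC. (Both the ARP request and the reply can be captured on caliadb5d6cab6f.)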
root@node1:~# cat /proc/sys/net/ipv4/conf/caliadb5d6cab6f/proxy_arp
1
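  2. Once the MAC address is learned, the pod sends the ICMP request.
  3. In veth_xmit, the transmit function of the eth0 driver, skb->dev is pointed at eth0's peer device caliadb5d6cab6f, and netif_rx is then called so that the packet enters the host network stack for a route lookup.
    The packet can be captured on caliadb5d6cab6f.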
18:17:50.003013 0a:65:aa:2b:ef:d1 > ee:ee:ee:ee:ee:ee, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 64, id 47525, offset 0, flags [DF], proto ICMP (1), length 76)
    10.24.166.130 > 10.24.104.3: ICMP echo request, id 7168, seq 0, length 56
  4. The ICMP request arrives at caliadb5d6cab6f. The host routing table shows that the next hop is 192.168.122.22 (node2's IP) and that the packet must be tunnelled through tunl0.
root@node1:~# ip r
default via 192.168.122.1 dev ens3 proto static
10.24.104.0/26 via 192.168.122.22 dev tunl0 proto bird onlink
blackhole 10.24.166.128/26 proto bird
10.24.166.130 dev caliadb5d6cab6f scope link
10.24.219.64/26 via 192.168.122.20 dev tunl0 proto bird onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.122.0/24 dev ens3 proto kernel scope link src 192.168.122.21

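When the packet reaches tunl0 it looks as follows. The source and destination IPs are unchanged, and because tunl0 is an IPIP (raw IP) device the packet no longer carries a MAC header.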

root@node1:~# tcpdump -vne -i tunl0
tcpdump: listening on tunl0, link-type RAW (Raw IP), capture size 262144 bytes
18:20:45.293856 ip: (tos 0x0, ttl 63, id 52265, offset 0, flags [DF], proto ICMP (1), length 76)
    10.24.166.130 > 10.24.104.3: ICMP echo request, id 7424, seq 0, length 56
18:20:45.294975 ip: (tos 0x0, ttl 63, id 57896, offset 0, flags [none], proto ICMP (1), length 76)
    10.24.104.3 > 10.24.166.130: ICMP echo reply, id 7424, seq 0, length 56

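After IPIP encapsulation, the host routing table is consulted again for the outer destination IP, and the packet is sent out through ens3: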

192.168.122.0/24 dev ens3 proto kernel scope link src 192.168.122.21
  5. The encapsulated packet leaves through ens3.
    The fully encapsulated ICMP request can be captured on ens3 as an IPIP packet:
root@node1:~# tcpdump -vne -i ens3 host 192.168.122.22
tcpdump: listening on ens3, link-type EN10MB (Ethernet), capture size 262144 bytes
18:31:17.809729 52:54:00:74:ac:0d > 52:54:00:f3:3a:90, ethertype IPv4 (0x0800), length 110: (tos 0x0, ttl 63, id 2590, offset 0, flags [DF], proto IPIP (4), length 96)
    192.168.122.21 > 192.168.122.22: (tos 0x0, ttl 63, id 61416, offset 0, flags [DF], proto ICMP (1), length 76)
    10.24.166.130 > 10.24.104.3: ICMP echo request, id 7680, seq 0, length 56
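  6. When the encapsulated packet reaches node2, the outer destination IP is local, so the host accepts the packet and passes it up the stack.
    After decapsulation the inner packet is delivered to tunl0, where the ICMP request can be captured: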
root@node2:~# tcpdump -vne -i tunl0
tcpdump: listening on tunl0, link-type RAW (Raw IP), capture size 262144 bytes
18:38:56.717329 ip: (tos 0x0, ttl 63, id 19824, offset 0, flags [DF], proto ICMP (1), length 76)
    10.24.166.130 > 10.24.104.3: ICMP echo request, id 7936, seq 0, length 56
  7. Another lookup in the host routing table shows that the destination 10.24.104.3 is reached through calie935ef337bb:
10.24.104.3 dev calie935ef337bb scope link
  8. The packet crosses the veth pair into the pod.
  9. The ICMP reply is handled the same way in the reverse direction.
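
The two routing decisions on node1 in the steps above (first for the inner destination, then for the outer one) are both longest-prefix-match lookups. The following is a minimal Python sketch of that lookup over the `ip r` output shown earlier. The `lookup` helper and the `ROUTES` list are illustrative names, and the sketch deliberately ignores everything the kernel does beyond prefix matching (metrics, policy routing, route flags).

```python
import ipaddress

# Routes copied from `ip r` on node1 in IPIP mode, as (prefix, action) pairs.
ROUTES = [
    ("0.0.0.0/0", "via 192.168.122.1 dev ens3"),
    ("10.24.104.0/26", "via 192.168.122.22 dev tunl0"),
    ("10.24.166.128/26", "blackhole"),
    ("10.24.166.130/32", "dev caliadb5d6cab6f"),
    ("10.24.219.64/26", "via 192.168.122.20 dev tunl0"),
    ("172.17.0.0/16", "dev docker0"),
    ("192.168.122.0/24", "dev ens3"),
]


def lookup(dst: str) -> str:
    """Return the action of the most specific route that contains dst."""
    addr = ipaddress.ip_address(dst)
    matches = [
        (ipaddress.ip_network(prefix), action)
        for prefix, action in ROUTES
        if addr in ipaddress.ip_network(prefix)
    ]
    # The default route always matches, so matches is never empty.
    _, action = max(matches, key=lambda m: m[0].prefixlen)
    return action


print(lookup("10.24.104.3"))     # via 192.168.122.22 dev tunl0  (inner destination)
print(lookup("192.168.122.22"))  # dev ens3                      (outer destination)
print(lookup("10.24.166.130"))   # dev caliadb5d6cab6f           (local pod, /32 wins)
print(lookup("10.24.166.131"))   # blackhole                     (unused IP in local block)
```

The last lookup shows what the blackhole route does: an address in node1's own /26 block that has no pod behind it is dropped locally instead of following the default route.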

overlay -- vxlan

configure

Reference: https://docs.projectcalico.org/getting-started/kubernetes/installation/config-options (section "Switching from IP-in-IP to VXLAN")

Edit calico.yaml:

  1. Replace environment variable name CALICO_IPV4POOL_IPIP with CALICO_IPV4POOL_VXLAN. Leave the value of the new variable as “Always”.
  2. Optionally, (to save some resources if you’re running a VXLAN-only cluster) completely disable Calico’s BGP-based networking:
    Replace calico_backend: "bird" with calico_backend: "vxlan". This disables BIRD.
    Comment out the line - -bird-ready and - -bird-live from the calico/node readiness/liveness check (otherwise disabling BIRD will cause the readiness/liveness check to fail on every node):
          livenessProbe:
            exec:
              command:
              - /bin/calico-node
              - -felix-live
              # - -bird-live
          readinessProbe:
            exec:
              command:
              - /bin/calico-node
              # - -bird-ready
              - -felix-ready

Re-apply calico.yaml:

kubectl apply -f ./calico.yaml



Looking at the processes in the calico-node pod, bird and the other BGP-related processes are gone.

root@master:~/calico# kubectl exec -it calico-node-9lh84 -n kube-system bash
[root@node1 /]# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 10:37 ?        00:00:00 /usr/local/bin/runsvdir -P /etc/service/enabled
root        37     1  0 10:38 ?        00:00:00 runsv felix
root        38    37  1 10:38 ?        00:02:08 calico-node -felix
root      2128     0  1 12:45 pts/0    00:00:00 bash
root      2148  2128  0 12:45 pts/0    00:00:00 ps -ef

calicoctl node status no longer reports any BGP information either:

root@master:~# calicoctl node status
Calico process is running.

None of the BGP backend processes (BIRD or GoBGP) are running.

Each node also gains a new network interface:

7: vxlan.calico: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue state UNKNOWN group default
    link/ether 66:f9:37:c3:7e:94 brd ff:ff:ff:ff:ff:ff
    inet 10.24.166.128/32 brd 10.24.166.128 scope global vxlan.calico
       valid_lft forever preferred_lft forever
    inet6 fe80::64f9:37ff:fec3:7e94/64 scope link
       valid_lft forever preferred_lft forever
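
To make the encapsulation done by vxlan.calico more concrete, the following is a minimal Python sketch of the 8-byte VXLAN header and of the per-packet overhead. It is not taken from Calico. The header layout is the standard one (8 flag bits, 24 reserved bits, 24-bit VNI, 8 reserved bits), and the flag value 0x08 and VNI 4096 are the ones reported by the tcpdump capture in the traffic flow section. `build_vxlan_header` and `parse_vxlan_header` are illustrative names.

```python
import struct

# Standard header sizes in bytes (protocol constants, not Calico settings).
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14
VXLAN_OVERHEAD = OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET

FLAG_VNI_VALID = 0x08  # the "I" flag shown by tcpdump


def build_vxlan_header(vni: int) -> bytes:
    """Pack flags and VNI into the 8-byte VXLAN header (network byte order)."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", FLAG_VNI_VALID << 24, vni << 8)


def parse_vxlan_header(header: bytes) -> tuple:
    """Return (flags, vni) from the first 8 bytes of a VXLAN payload."""
    word0, word1 = struct.unpack("!II", header[:8])
    return word0 >> 24, word1 >> 8


header = build_vxlan_header(4096)
print(header.hex())                # 0800000000100000
print(parse_vxlan_header(header))  # (8, 4096)
print(VXLAN_OVERHEAD)              # 50
```

The 50-byte overhead is why the VXLAN device needs a smaller MTU than the IPIP tunnel, which only adds 20 bytes.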


verify

As in the IPIP verification, create two pods.
From inside one pod, the other pod can be pinged:

root@master:~# kubectl get pod -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP              NODE    NOMINATED NODE   READINESS GATES
nginx-677dc4d96-xysui    1/1     Running   0          18s   10.24.104.2     node2   <none>           <none>
nginx-677dc4d96-wkkcn    1/1     Running   0          21s   10.24.166.130   node1   <none>           <none>
root@master:~# kubectl exec -it nginx-677dc4d96-wkkcn bash
root@nginx-677dc4d96-wkkcn:/# ping 10.24.104.2 -c1
PING 10.24.104.2 (10.24.104.2): 48 data bytes
56 bytes from 10.24.104.2: icmp_seq=0 ttl=62 time=2.519 ms
--- 10.24.104.2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 2.519/2.519/2.519/0.000 ms

traffic flow

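Take 10.24.166.130 pinging 10.24.104.2 as an example:

  1. The pod's routing table sends the packet to the default gateway 169.254.1.1.
    The pod's neighbor table already holds the MAC address for 169.254.1.1. The original guess was that Calico configures it statically, but the entry is in state STALE rather than PERMANENT, which suggests it was learned through an earlier proxy ARP exchange.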
root@nginx-677dc4d96-wkkcn:/# ip neigh
169.254.1.1 dev eth0 lladdr ee:ee:ee:ee:ee:ee STALE
192.168.122.21 dev eth0 lladdr ee:ee:ee:ee:ee:ee STALE

The pod therefore sends the ICMP request straight away; it can be captured on eth0.

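  2. In veth_xmit, the transmit function of the eth0 driver, skb->dev is pointed at eth0's peer device caliea5b03f12b8, and netif_rx is then called so that the packet enters the host network stack for a route lookup.
    The packet can be captured on caliea5b03f12b8.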
16:39:36.406630 4e:78:56:5f:78:5d > ee:ee:ee:ee:ee:ee, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 64, id 59734, offset 0, flags [DF], proto ICMP (1), length 76)
    10.24.166.130 > 10.24.104.2: ICMP echo request, id 21248, seq 0, length 56
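  3. The ICMP request arrives at caliea5b03f12b8. The host routing table shows that the next hop is 10.24.104.0 (the IP of node2's VXLAN device) and that the packet must be tunnelled through vxlan.calico.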
root@node1:~# ip r
default via 192.168.122.1 dev ens3 proto static
10.24.104.0/26 via 10.24.104.0 dev vxlan.calico onlink
10.24.166.130 dev caliea5b03f12b8 scope link
10.24.219.64/26 via 10.24.219.64 dev vxlan.calico onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.122.0/24 dev ens3 proto kernel scope link src 192.168.122.21

The neighbor table shows that 10.24.104.0 maps to the MAC address 66:2d:bf:44:a6:8b:

root@node1:~# ip neigh
192.168.122.22 dev ens3 lladdr 52:54:00:f3:3a:90 STALE
10.24.219.64 dev vxlan.calico lladdr 66:4f:26:ae:af:db PERMANENT
10.24.166.130 dev caliea5b03f12b8 lladdr 4e:78:56:5f:78:5d STALE
192.168.122.1 dev ens3 lladdr 52:54:00:32:63:2e REACHABLE
10.24.104.0 dev vxlan.calico lladdr 66:2d:bf:44:a6:8b PERMANENT
192.168.122.20 dev ens3 lladdr 52:54:00:d9:d7:07 REACHABLE

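When the packet reaches vxlan.calico it looks as follows. The source and destination IPs are unchanged, the destination MAC is now the MAC that belongs to 10.24.104.0, and the source MAC is that of the local vxlan.calico device.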

13:44:39.560217 66:f9:37:c3:7e:94 > 66:2d:bf:44:a6:8b, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 63, id 48899, offset 0, flags [DF], proto ICMP (1), length 76)
    10.24.166.130 > 10.24.104.2: ICMP echo request, id 16128, seq 0, length 56

vxlan_xmit calls vxlan_find_mac to look up the FDB entry for the destination MAC.
The FDB shows that MAC 66:2d:bf:44:a6:8b maps to IP 192.168.122.22.
This IP becomes the outer destination IP of the VXLAN packet.

root@node1:~# bridge fdb show dev vxlan.calico
66:2d:bf:44:a6:8b dst 192.168.122.22 self permanent
66:4f:26:ae:af:db dst 192.168.122.20 self permanent

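After VXLAN encapsulation, the host routing table is consulted again for the outer destination IP, and the packet is sent out through ens3. The neighbor entry below supplies the outer destination MAC: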

192.168.122.22 dev ens3 lladdr 52:54:00:f3:3a:90 STALE
  4. The encapsulated packet leaves through ens3.
    The fully encapsulated ICMP request can be captured on ens3:
    192.168.122.21.44936 > 192.168.122.22.4789: VXLAN, flags [I] (0x08), vni 4096
66:f9:37:c3:7e:94 > 66:2d:bf:44:a6:8b, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 63, id 1065, offset 0, flags [DF], proto ICMP (1), length 76)
    10.24.166.130 > 10.24.104.2: ICMP echo request, id 15616, seq 0, length 56
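  5. When the encapsulated packet reaches node2, the outer destination IP is local, so the host accepts the packet and passes it up the stack.
    node2 listens on UDP port 4789 (the socket is added by vxlan_sock_add when vxlan.calico is created). Packets arriving on that port are handed to vxlan_rcv, which processes the VXLAN packet.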
root@node2:~# netstat -nap | grep 4789
udp        0      0 0.0.0.0:4789            0.0.0.0:*                           -

After decapsulation the inner frame is delivered to vxlan.calico, where it can be captured:

13:44:25.320094 66:f9:37:c3:7e:94 > 66:2d:bf:44:a6:8b, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 63, id 47307, offset 0, flags [DF], proto ICMP (1), length 76)
    10.24.166.130 > 10.24.104.2: ICMP echo request, id 15872, seq 0, length 56
  6. Another lookup in the host routing table shows that the destination 10.24.104.2 is reached through cali82cc91000b8:
10.24.104.2 dev cali82cc91000b8 scope link
  7. The packet crosses the veth pair into the pod.
  8. The ICMP reply is handled the same way in the reverse direction.
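
The sender-side part of the flow above chains three tables: the route gives a gateway IP on vxlan.calico, the neighbor table turns that gateway into a MAC, and the FDB turns the MAC into the remote node's IP. The following is a minimal Python sketch of that chain, populated with the entries shown on node1. `resolve_vtep` and the table names are illustrative and do not correspond to any kernel or Calico API.

```python
import ipaddress

# Entries copied from node1: `ip r`, `ip neigh`, and `bridge fdb show dev vxlan.calico`.
ROUTES = {
    "10.24.104.0/26": "10.24.104.0",
    "10.24.219.64/26": "10.24.219.64",
}
NEIGH = {
    "10.24.104.0": "66:2d:bf:44:a6:8b",
    "10.24.219.64": "66:4f:26:ae:af:db",
}
FDB = {
    "66:2d:bf:44:a6:8b": "192.168.122.22",
    "66:4f:26:ae:af:db": "192.168.122.20",
}


def resolve_vtep(pod_ip: str):
    """Return (gateway, inner destination MAC, outer destination IP) for a remote pod.

    Returns None when no vxlan.calico route covers the address.
    """
    addr = ipaddress.ip_address(pod_ip)
    for prefix, gateway in ROUTES.items():
        if addr in ipaddress.ip_network(prefix):
            mac = NEIGH[gateway]          # route -> neighbor entry
            return gateway, mac, FDB[mac]  # neighbor MAC -> FDB entry
    return None


print(resolve_vtep("10.24.104.2"))
# ('10.24.104.0', '66:2d:bf:44:a6:8b', '192.168.122.22')
```

With BIRD disabled, nothing learns these entries dynamically, which is consistent with the neighbor and FDB entries being marked PERMANENT in the output above.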

underlay -- BGP

configure

Edit calico.yaml and change the value of CALICO_IPV4POOL_IPIP to Never:

            # Enable IPIP
            - name: CALICO_IPV4POOL_IPIP
              value: "Never"

Re-apply calico.yaml:

kubectl apply -f calico.yaml

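The output of calicoctl node status and the processes in the calico-node pod look the same as in IPIP mode. The difference is in the workers' routing tables: cross-node traffic no longer goes through tunl0.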

root@master:~/calico# calicoctl node status
Calico process is running.

IPv4 BGP status
+----------------+-------------------+-------+----------+-------------+
|  PEER ADDRESS  |     PEER TYPE     | STATE |  SINCE   |    INFO     |
+----------------+-------------------+-------+----------+-------------+
| 192.168.122.21 | node-to-node mesh | up    | 19:56:08 | Established |
| 192.168.122.22 | node-to-node mesh | up    | 19:56:09 | Established |
+----------------+-------------------+-------+----------+-------------+

IPv6 BGP status
No IPv6 peers found.
root@master:~/calico# kubectl exec -it -n kube-system calico-node-czhnn bash
[root@node1 /]# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 19:56 ?        00:00:00 /usr/local/bin/runsvdir -P /etc/service/enabled
root        42     1  0 19:56 ?        00:00:00 runsv felix
root        43     1  0 19:56 ?        00:00:00 runsv bird6
root        44     1  0 19:56 ?        00:00:00 runsv bird
root        45     1  0 19:56 ?        00:00:00 runsv confd
root        47    42  2 19:56 ?        00:00:02 calico-node -felix
root        48    45  0 19:56 ?        00:00:00 calico-node -confd
root       144    44  0 19:56 ?        00:00:00 bird -R -s /var/run/calico/bird.ctl -d -c /etc/calico/confd/config/bird.cfg
root       145    43  0 19:56 ?        00:00:00 bird6 -R -s /var/run/calico/bird6.ctl -d -c /etc/calico/confd/config/bird6.cfg
root       493     0  1 19:57 pts/0    00:00:00 bash
root       518   493  0 19:57 pts/0    00:00:00 ps -ef

Alternatively, switch from IPIP to pure BGP mode at runtime as follows:

root@master:~# calicoctl get ipPool --export -o yaml > pool.yaml
Change ipipMode to Never:
root@master:~# cat pool.yaml
apiVersion: projectcalico.org/v3
items:
- apiVersion: projectcalico.org/v3
  kind: IPPool
  metadata:
    creationTimestamp: 2020-05-30T18:27:41Z
    name: default-ipv4-ippool
    resourceVersion: "4950731"
    uid: 79dac11f-309c-423a-ad5c-8235aafd08ea
  spec:
    cidr: 10.24.0.0/16
    ipipMode: Never
    natOutgoing: true
kind: IPPoolList
metadata:
  resourceVersion: "4950758"
Apply the change:
root@master:~# calicoctl replace -f pool.yaml
Successfully replaced 1 'IPPool' resource(s)
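
The manual edit of pool.yaml can also be scripted. The following is a minimal Python sketch, under the assumption that the exported file contains a single `ipipMode: Always` line (the original only shows the file after editing). It rewrites that one key by text substitution and is not a YAML parser; `set_ipip_mode` is an illustrative helper name.

```python
def set_ipip_mode(pool_yaml: str, mode: str) -> str:
    """Return pool_yaml with the value of every ipipMode key replaced by mode."""
    out = []
    for line in pool_yaml.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("ipipMode:"):
            indent = line[: len(line) - len(stripped)]  # keep the YAML indentation
            line = f"{indent}ipipMode: {mode}"
        out.append(line)
    return "\n".join(out) + "\n"


# Assumed shape of the exported spec before editing.
exported = (
    "  spec:\n"
    "    cidr: 10.24.0.0/16\n"
    "    ipipMode: Always\n"
    "    natOutgoing: true\n"
)
print(set_ipip_mode(exported, "Never"), end="")
```

The result would then be written back to pool.yaml and applied with calicoctl replace as shown above.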

verify

As in the IPIP verification, create two pods.
From inside one pod, the other pod can be pinged:

root@master:~# kubectl get pod -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP              NODE    NOMINATED NODE   READINESS GATES
nginx-677dc4d96-c6mxz    1/1     Running   0          14s   10.24.104.1     node2   <none>           <none>
nginx1-677dc4d96-bjnw9   1/1     Running   0          17s   10.24.166.128   node1   <none>           <none>
root@master:~# kubectl exec -it nginx1-677dc4d96-bjnw9 bash
root@nginx1-677dc4d96-bjnw9:/# ping 10.24.104.1 -c1
PING 10.24.104.1 (10.24.104.1): 48 data bytes
56 bytes from 10.24.104.1: icmp_seq=0 ttl=63 time=4.949 ms
--- 10.24.104.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 4.949/4.949/4.949/0.000 ms

traffic flow

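Take 10.24.166.128 pinging 10.24.104.1 as an example:

  1. The pod's routing table sends the packet to the default gateway 169.254.1.1.
    The pod sends an ARP request for the MAC of 169.254.1.1. The request arrives at cali5a1d2678510, the host end of the pod's veth pair. Proxy ARP is enabled on this device, so it answers with its own MAC. (Both the ARP request and the reply can be captured on cali5a1d2678510.)
  2. Once the MAC address is learned, the pod sends the ICMP request.
  3. In veth_xmit, the transmit function of the eth0 driver, skb->dev is pointed at eth0's peer device cali5a1d2678510, and netif_rx is then called so that the packet enters the host network stack for a route lookup.
    The packet can be captured on cali5a1d2678510.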
20:11:15.035450 7a:17:c4:cf:73:81 > ee:ee:ee:ee:ee:ee, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 64, id 57736, offset 0, flags [DF], proto ICMP (1), length 76)
    10.24.166.128 > 10.24.104.1: ICMP echo request, id 6400, seq 0, length 56
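  4. The ICMP request arrives at cali5a1d2678510. The host routing table shows that the next hop is 192.168.122.22 (node2's IP) and the outgoing interface is ens3, so no encapsulation is needed.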
root@node1:~# ip r
default via 192.168.122.1 dev ens3 proto static
10.24.104.0/26 via 192.168.122.22 dev ens3 proto bird
10.24.166.128 dev cali5a1d2678510 scope link
blackhole 10.24.166.128/26 proto bird
10.24.219.65 via 192.168.122.20 dev ens3 proto bird
10.24.219.66 via 192.168.122.20 dev ens3 proto bird
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.122.0/24 dev ens3 proto kernel scope link src 192.168.122.21
  5. The ICMP request leaves through ens3. The source and destination IPs are the pod IPs themselves:
    20:13:48.448931 52:54:00:74:ac:0d > 52:54:00:f3:3a:90, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 63, id 2546, offset 0, flags [DF], proto ICMP (1), length 76)
    10.24.166.128 > 10.24.104.1: ICMP echo request, id 6912, seq 0, length 56
  6. When the request reaches node2, the routing table shows that the destination 10.24.104.1 is reached through cali06f028cd84e:
root@node2:~# ip r
default via 192.168.122.1 dev ens3 proto static
10.24.104.0 dev cali1cd7c4c9ed9 scope link
blackhole 10.24.104.0/26 proto bird
10.24.104.1 dev cali06f028cd84e scope link
10.24.166.128/26 via 192.168.122.21 dev ens3 proto bird
10.24.219.65 via 192.168.122.20 dev ens3 proto bird
10.24.219.66 via 192.168.122.20 dev ens3 proto bird
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.122.0/24 dev ens3 proto kernel scope link src 192.168.122.22
  7. The packet crosses the veth pair into the pod.
  8. The ICMP reply is handled the same way in the reverse direction.
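
In this mode the bird routes on node1 point straight at the peer node's address on ens3, without the onlink flag that the tunl0 routes carried in IPIP mode. A plain `via` route like that is only usable when the next hop sits on a directly connected subnet, which holds here because all three VMs share 192.168.122.0/24. The following is a minimal Python sketch of that check over node1's routes; `next_hop_is_on_link` is an illustrative helper, not a Calico or kernel function.

```python
import ipaddress

# Connected subnet of ens3 on node1, from `ip r`.
CONNECTED = ipaddress.ip_network("192.168.122.0/24")

# bird-installed routes on node1 in BGP mode: prefix -> next hop.
BIRD_ROUTES = {
    "10.24.104.0/26": "192.168.122.22",
    "10.24.219.65/32": "192.168.122.20",
    "10.24.219.66/32": "192.168.122.20",
}


def next_hop_is_on_link(next_hop: str, connected=CONNECTED) -> bool:
    """True when the next hop can be reached directly on the connected subnet."""
    return ipaddress.ip_address(next_hop) in connected


for prefix, next_hop in BIRD_ROUTES.items():
    print(prefix, next_hop, next_hop_is_on_link(next_hop))
# Every line ends in True for this cluster.

print(next_hop_is_on_link("10.0.5.7"))  # False: a hypothetical node on another subnet
```

For a node on a different subnet, as in the hypothetical last line, the routers in between would also have to know the pod routes, or one of the overlay modes would be needed.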

Q&A

Quoted from https://docs.projectcalico.org/reference/faq

  1. Why does my container have a route to 169.254.1.1?
    In a Calico network, each host acts as a gateway router for the workloads that it hosts. In container deployments, Calico uses 169.254.1.1 as the address for the Calico router. By using a link-local address, Calico saves precious IP addresses and avoids burdening the user with configuring a suitable address.
    While the routing table may look a little odd to someone who is used to configuring LAN networking, using explicit routes rather than subnet-local gateways is fairly common in WAN networking.

  2. Why can’t I see the 169.254.1.1 address mentioned above on my host?
    Calico tries hard to avoid interfering with any other configuration on the host. Rather than adding the gateway address to the host side of each workload interface, Calico sets the proxy_arp flag on the interface. This makes the host behave like a gateway, responding to ARPs for 169.254.1.1 without having to actually allocate the IP address to the interface.

  3. Why do all cali* interfaces have the MAC address ee:ee:ee:ee:ee:ee?
    In some setups the kernel is unable to generate a persistent MAC address and so Calico assigns a MAC address itself. Since Calico uses point-to-point routed interfaces, traffic does not reach the data link layer so the MAC Address is never used and can therefore be the same for all the cali* interfaces.
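
The two special values in these answers can be checked with a few lines of Python. This is a small sketch using only the standard library: 169.254.1.1 falls in the IPv4 link-local range, and ee:ee:ee:ee:ee:ee has the locally administered bit set and the multicast bit clear, so it is a valid unicast address that does not come from a vendor range. The helper names are illustrative.

```python
import ipaddress


def first_octet(mac: str) -> int:
    """Return the first byte of a colon-separated MAC address."""
    return int(mac.split(":")[0], 16)


def is_locally_administered(mac: str) -> bool:
    """Bit 0x02 of the first octet marks a locally administered address."""
    return bool(first_octet(mac) & 0x02)


def is_unicast(mac: str) -> bool:
    """Bit 0x01 of the first octet is clear for unicast addresses."""
    return not first_octet(mac) & 0x01


print(ipaddress.ip_address("169.254.1.1").is_link_local)  # True
print(is_locally_administered("ee:ee:ee:ee:ee:ee"))       # True
print(is_unicast("ee:ee:ee:ee:ee:ee"))                    # True
```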
