In this article we will try to understand Linux Network Namespace, the related Linux kernel network devices, and part of the implementation behind Docker's container network model.
When working with Docker we learn that it uses linux network namespaces to isolate container networks. On a host or VM running Docker there is a linux bridge named docker0, and brctl show reveals the veth devices attached to it:
[root@localhost ~]# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
5: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether 02:42:fa:3f:86:7e brd ff:ff:ff:ff:ff:ff
138: veth5843bf2@if137: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
link/ether f2:31:9d:e6:b7:16 brd ff:ff:ff:ff:ff:ff link-netnsid 2
[root@localhost ~]# brctl show
bridge name bridge id STP enabled interfaces
docker0 8000.0242fa3f867e no veth5843bf2
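The veth5843bf2@if137 notation means the peer of this host-side veth has interface index 137 in another namespace, and link-netnsid 2 identifies which one. Assuming a running container named web whose image ships iproute2 (both are assumptions for illustration), you could confirm the pairing from inside the container:
# the container's eth0 should show the matching indexes, e.g. "137: eth0@if138"
docker exec web ip addr show eth0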
A linux network namespace is a key building block of network virtualization: it lets you create multiple mutually isolated network stacks, each with its own network configuration, such as network devices and routing tables.
Network configuration is managed with the ip command, which comes from the iproute2 package and is installed by default on most systems. The ip command covers a lot of ground; everything related to network namespaces lives under the ip netns subcommand, and ip netns help lists the available operations:
[root@localhost ~]# ip netns help
Usage: ip netns list
ip netns add NAME
ip netns set NAME NETNSID
ip [-all] netns delete [NAME]
ip netns identify [PID]
ip netns pids NAME
ip [-all] netns exec [NAME] cmd ...
ip netns monitor
ip netns list-id
[root@localhost ~]# ip netns add net01
[root@localhost ~]# ip netns add net02
[root@localhost ~]#
[root@localhost ~]#
[root@localhost ~]# ip netns list
net02
net01
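Each namespace created this way is backed by a file under /var/run/netns, which is how ip netns tracks it; listing that directory should show the two namespaces just created:
# expected to list: net01 net02
ls /var/run/netns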
[root@localhost ~]# ip netns exec net01 ip addr
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
[root@localhost ~]# ip netns exec net02 ip addr
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
[root@localhost ~]# ip netns exec net01 ip link set lo up
[root@localhost ~]# ip netns exec net02 ip link set lo up
[root@localhost ~]# ip netns exec net01 ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
[root@localhost ~]#
[root@localhost ~]# ip netns exec net02 ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
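The isolation extends to routing as well. A fresh namespace has an empty main routing table (lo's addresses live in the local table), so querying it prints nothing at this point:
# expected to print nothing: no routes exist in net01 yet
ip netns exec net01 ip route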
To connect two namespaces, linux provides the veth pair. A veth pair can be thought of as a bidirectional pipe: data sent in at one end comes straight out the other. Equivalently, picture two namespaces joined by a special virtual NIC, able to talk to each other directly.
Create a veth pair:
[root@localhost ~]# ip link add veth01 type veth peer name veth02
[root@localhost ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
5: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:fa:3f:86:7e brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
inet6 fe80::42:faff:fe3f:867e/64 scope link
valid_lft forever preferred_lft forever
143: veth02@veth01: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether ee:71:28:fa:dc:ad brd ff:ff:ff:ff:ff:ff
144: veth01@veth02: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether fa:18:55:74:50:32 brd ff:ff:ff:ff:ff:ff
Move the two ends of the veth pair into the two namespaces:
[root@localhost ~]# ip link set veth01 netns net01
[root@localhost ~]# ip link set veth02 netns net02
[root@localhost ~]#
[root@localhost ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
5: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:fa:3f:86:7e brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
inet6 fe80::42:faff:fe3f:867e/64 scope link
valid_lft forever preferred_lft forever
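As an aside, newer iproute2 releases can create the pair and place both ends into their namespaces in a single step (a sketch, assuming your iproute2 supports a netns attribute for the peer):
# equivalent to the three commands above on recent iproute2
ip link add veth01 netns net01 type veth peer name veth02 netns net02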
Bring up both ends and assign them IP addresses:
[root@localhost ~]# ip netns exec net01 ip link set veth01 up
[root@localhost ~]# ip netns exec net01 ip addr add 10.0.0.1/24 dev veth01
[root@localhost ~]#
[root@localhost ~]# ip netns exec net02 ip link set veth02 up
[root@localhost ~]# ip netns exec net02 ip addr add 10.0.0.2/24 dev veth02
[root@localhost ~]# ip netns exec net01 ping -c 3 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.142 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.098 ms
64 bytes from 10.0.0.2: icmp_seq=3 ttl=64 time=0.087 ms
--- 10.0.0.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.087/0.109/0.142/0.023 ms
[root@localhost ~]# ip netns exec net02 ping -c 3 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.078 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.091 ms
64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=0.095 ms
--- 10.0.0.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.078/0.088/0.095/0.007 ms
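To tear this experiment down, deleting the namespaces is enough: removing a namespace destroys the devices inside it, and destroying one end of a veth pair removes its peer as well:
# removes net01/net02 along with veth01/veth02
ip netns del net01
ip netns del net02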
When more than two network devices need to communicate, the first things that come to mind are a switch and a router. A Linux bridge acts as a layer-2 virtual switch, and once it is given an IP address the host can also route for it at layer 3. The ip command again handles all the operations:
root@ubuntu:~# ip netns add ns1
root@ubuntu:~# ip netns add ns2
root@ubuntu:~# ip link add veth1 type veth peer name eth1
root@ubuntu:~# ip link set eth1 netns ns1
root@ubuntu:~# ip link add veth2 type veth peer name eth1
root@ubuntu:~# ip link set eth1 netns ns2
root@ubuntu:~# ip netns exec ns1 ip addr add 172.18.1.10/24 dev eth1
root@ubuntu:~# ip netns exec ns2 ip addr add 172.18.1.20/24 dev eth1
root@ubuntu:~# ip netns exec ns1 ip link set dev eth1 up
root@ubuntu:~# ip netns exec ns2 ip link set dev eth1 up
root@ubuntu:~# ip link add br0 type bridge
root@ubuntu:~# ip link set dev veth1 master br0
root@ubuntu:~# ip link set dev veth2 master br0
root@ubuntu:~# ip link set dev br0 up
root@ubuntu:~# ip link set dev veth1 up
root@ubuntu:~# ip link set dev veth2 up
root@ubuntu:~# ip netns exec ns1 ping -c 3 172.18.1.20
PING 172.18.1.20 (172.18.1.20) 56(84) bytes of data.
64 bytes from 172.18.1.20: icmp_seq=1 ttl=64 time=0.038 ms
64 bytes from 172.18.1.20: icmp_seq=2 ttl=64 time=0.107 ms
64 bytes from 172.18.1.20: icmp_seq=3 ttl=64 time=0.067 ms
--- 172.18.1.20 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 0.038/0.070/0.107/0.029 ms
root@ubuntu:~# ip netns exec ns2 ping -c 3 172.18.1.10
PING 172.18.1.10 (172.18.1.10) 56(84) bytes of data.
64 bytes from 172.18.1.10: icmp_seq=1 ttl=64 time=0.107 ms
64 bytes from 172.18.1.10: icmp_seq=2 ttl=64 time=0.030 ms
64 bytes from 172.18.1.10: icmp_seq=3 ttl=64 time=0.030 ms
--- 172.18.1.10 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.030/0.055/0.107/0.037 ms
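The bridge utility, also part of iproute2, can inspect the setup: bridge link lists the ports attached to the bridge, and bridge fdb shows the MAC addresses learned on each port:
bridge link show
bridge fdb show br br0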
For the host to ping the namespaces, br0 needs an IP address of its own; for the namespaces to reach the host, each needs a default route whose next hop is the br0 address:
root@ubuntu:~# ip addr add 172.18.1.1/24 dev br0
root@ubuntu:~#
root@ubuntu:~#
root@ubuntu:~# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.2.1 0.0.0.0 UG 0 0 0 ens33
172.18.1.0 0.0.0.0 255.255.255.0 U 0 0 0 br0
root@ubuntu:~# ping -c 3 172.18.1.10
PING 172.18.1.10 (172.18.1.10) 56(84) bytes of data.
64 bytes from 172.18.1.10: icmp_seq=1 ttl=64 time=0.017 ms
64 bytes from 172.18.1.10: icmp_seq=2 ttl=64 time=0.041 ms
64 bytes from 172.18.1.10: icmp_seq=3 ttl=64 time=0.025 ms
--- 172.18.1.10 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.017/0.027/0.041/0.011 ms
root@ubuntu:~# ping -c 3 172.18.1.20
PING 172.18.1.20 (172.18.1.20) 56(84) bytes of data.
64 bytes from 172.18.1.20: icmp_seq=1 ttl=64 time=0.058 ms
64 bytes from 172.18.1.20: icmp_seq=2 ttl=64 time=0.028 ms
64 bytes from 172.18.1.20: icmp_seq=3 ttl=64 time=0.062 ms
--- 172.18.1.20 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.028/0.049/0.062/0.016 ms
root@ubuntu:~# ip netns exec ns1 ip route add default via 172.18.1.1 dev eth1
root@ubuntu:~#
root@ubuntu:~# ip netns exec ns2 ip route add default via 172.18.1.1 dev eth1
root@ubuntu:~#
# 192.168.2.118 is the address of the host NIC ens33
root@ubuntu:~# ip netns exec ns1 ping -c 3 192.168.2.118
PING 192.168.2.118 (192.168.2.118) 56(84) bytes of data.
64 bytes from 192.168.2.118: icmp_seq=1 ttl=64 time=0.018 ms
64 bytes from 192.168.2.118: icmp_seq=2 ttl=64 time=0.058 ms
64 bytes from 192.168.2.118: icmp_seq=3 ttl=64 time=0.055 ms
--- 192.168.2.118 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.018/0.043/0.058/0.019 ms
root@ubuntu:~#
root@ubuntu:~# ip netns exec ns2 ping -c 3 192.168.2.118
PING 192.168.2.118 (192.168.2.118) 56(84) bytes of data.
64 bytes from 192.168.2.118: icmp_seq=1 ttl=64 time=0.020 ms
64 bytes from 192.168.2.118: icmp_seq=2 ttl=64 time=0.027 ms
64 bytes from 192.168.2.118: icmp_seq=3 ttl=64 time=0.026 ms
--- 192.168.2.118 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.020/0.024/0.027/0.005 ms
ens33 192.168.2.118/24
br0   172.18.1.1/24
 |
 |          +-------------+
 |-veth1 <--|--> eth1 ns1 | 172.18.1.10/24
 |          +-------------+
 |-veth2 <--|--> eth1 ns2 | 172.18.1.20/24
 |          +-------------+
The setup above is essentially Docker's bridge network model. Of course, giving each namespace access to external networks still requires extra configuration: a default gateway (added above), ip_forward enabled on the host, a NAT rule for outbound traffic, and so on.
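A minimal sketch of those last steps, assuming iptables and the 172.18.1.0/24 subnet used above (this mirrors what Docker configures for docker0, though Docker programs the rules itself):
# let the kernel forward packets between interfaces
sysctl -w net.ipv4.ip_forward=1
# masquerade traffic leaving the namespaces for the outside world
iptables -t nat -A POSTROUTING -s 172.18.1.0/24 ! -o br0 -j MASQUERADE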