OVS reports "Too many open files"

Symptoms

ovs-vswitchd was using a lot of CPU. Its stack showed that it kept entering kernel code but was repeatedly scheduled out.

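  • Check the process's file descriptor limit (note: ulimit -n is the flag that reports open files; the -u recorded here reports max user processes)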
    [root@node-1 zjp]# ulimit -u
    102400
  • Count the NETLINK_GENERIC sockets held open by the ovs-vswitchd process
    [root@node-1 zjp]# lsof -p $(pidof ovs-vswitchd) |grep -c GENERIC
    102409

    The number of open file handles has indeed reached the process limit.
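
As a cross-check of the two commands above, here is a minimal Python sketch that reads the same information from /proc. It assumes a Linux host and enough privileges to read the target process's /proc entries; the function names are hypothetical helpers, not part of OVS. Reading /proc/<pid>/limits is useful because ulimit in a shell reports the shell's own limits, which may differ from those of the running daemon.

```python
import os

def parse_max_open_files(limits_text):
    """Return the soft 'Max open files' value from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: name (three words), soft limit, hard limit, units
            return int(line.split()[3])
    raise ValueError("'Max open files' not found")

def count_open_fds(pid):
    """Count the entries under /proc/<pid>/fd (Linux only)."""
    return len(os.listdir("/proc/%d/fd" % pid))

def fd_usage(pid):
    """Return (open_fds, soft_limit) for a running process."""
    with open("/proc/%d/limits" % pid) as f:
        limit = parse_max_open_files(f.read())
    return count_open_fds(pid), limit
```

Calling fd_usage() with the pid of ovs-vswitchd and comparing the two numbers gives the same picture as the lsof count, without depending on the shell's limits.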

Initial analysis

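  • We hit a similar problem before: internally, OVS creates one netlink socket per handler thread for every port it creates.

    sock_num = ports * n-handler-threads;
    n-handler-threads = online_cpu * 3/4 - 1;

    Based on the information available so far, the computed total does not exceed the ovs-vswitchd max open files limit of 102400.
    That earlier issue is https://bugzilla.redhat.com/show_bug.cgi?id=1526306, and our version already includes the optimization for it.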
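
The estimate above can be written as a small Python sketch. It only restates the formula quoted in this post; integer division is an assumption, and the port and CPU counts in the test below are hypothetical, since the post does not give the real ones.

```python
def estimate_netlink_socks(ports, online_cpus):
    """Estimate per-port netlink sockets with the formula quoted above.

    Assumption: the 3/4 factor is applied with integer division.
    """
    n_handler_threads = online_cpus * 3 // 4 - 1
    return ports * n_handler_threads
```

Comparing the result with the process's max open files limit shows whether the port count alone could explain the exhaustion.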

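  • strace -p shows that the process is busy the whole time but, having already hit the limit, cannot run normally. Each time the
    following line is printed, one more fd is left behind in OVS: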

    socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC) = 1415
  • Check the OVS log

    [root@node-1 ~]# tailf /var/log/openvswitch/ovs-vswitchd.log
    2021-04-15T12:36:29.723Z|00595|bridge|INFO|bridge br-int: added interface ha-d489f511-9f on port 17744
    2021-04-15T12:36:30.062Z|00596|bridge|INFO|bridge br-int: added interface ha-24939610-b4 on port 17745
    2021-04-15T12:36:32.790Z|00597|bridge|INFO|bridge br-int: deleted interface ha-d489f511-9f on port 17744
    2021-04-15T12:36:33.229Z|00598|bridge|INFO|bridge br-int: added interface ha-d489f511-9f on port 17746
    2021-04-15T12:36:41.807Z|00599|bridge|INFO|bridge br-int: deleted interface ha-24939610-b4 on port 17745
    2021-04-15T12:36:41.809Z|00600|bridge|INFO|bridge br-int: deleted interface ha-d489f511-9f on port 17746
    2021-04-15T12:36:41.821Z|00601|bridge|INFO|bridge br-int: added interface ha-d489f511-9f on port 17746
    2021-04-15T12:36:42.175Z|00602|bridge|INFO|bridge br-int: added interface ha-24939610-b4 on port 17747
    2021-04-15T12:36:44.846Z|00603|bridge|INFO|bridge br-int: deleted interface ha-d489f511-9f on port 17746
    2021-04-15T12:36:45.324Z|00604|bridge|INFO|bridge br-int: added interface ha-d489f511-9f on port 17748
    2021-04-15T12:36:54.105Z|00605|bridge|INFO|bridge br-int: deleted interface ha-24939610-b4 on port 17747
    2021-04-15T12:36:54.107Z|00606|bridge|INFO|bridge br-int: deleted interface ha-d489f511-9f on port 17748
    2021-04-15T12:36:54.119Z|00607|bridge|INFO|bridge br-int: added interface ha-d489f511-9f on port 17748
    2021-04-15T12:36:54.454Z|00608|bridge|INFO|bridge br-int: added interface ha-24939610-b4 on port 17749
    2021-04-15T12:36:56.506Z|00609|bridge|INFO|bridge br-int: deleted interface ha-d489f511-9f on port 17748
    2021-04-15T12:36:57.013Z|00610|bridge|INFO|bridge br-int: added interface ha-d489f511-9f on port 17750

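    OVS is adding and deleting ha-d489f511-9f over and over. More precisely, each time "deleted interface ha-d489f511-9f" is logged, one fd is left behind. Combined with the strace output above, the leaked fd should be the port's netlink socket.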

  • Check the neutron-l3-agent log

2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: c863cb21-1b5a-427a-aecc-bc6b1bb22571
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 533, in _process_router_update
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 468, in _process_router_if_compatible
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 473, in _process_added_router
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 376, in _router_added
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent self.force_reraise()
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb)
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 365, in _router_added
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 118, in initialize
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 360, in spawn_state_change_monitor
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/external_process.py", line 94, in enable
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 913, in execute
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 148, in execute
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent ProcessExecutionError: Exit code: 1; Stdin: ; Stdout: ; Stderr: Guru meditation now registers SIGUSR1 and SIGUSR2 by default for backward compatibility.
SIGUSR1 will no longer be registered in a future release, so please use SIGUSR2 to generate reports.
2021-04-16 16:36:50.925 24849 ERROR neutron.agent.l3.agent Traceback (most recent call last):

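The neutron.agent.l3.agent log shows that this pod hits an error while adding a router and then enters the delete path, which fails as well. After suspending the neutron.agent.l3.agent service so that it stops deleting and re-adding the tap device, the fd count of ovs-vswitchd stopped growing. This confirms that the neutron.agent.l3.agent service triggers the problem. Based on this, we tried running the following command after each failed delete in neutron: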

ovs-vsctl del-port br-int ha-d489f511-9f

This correctly releases the newly added fd.
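
To quantify the add/delete churn seen in the ovs-vswitchd log, the "added interface" and "deleted interface" lines can be counted per interface. The following is a minimal Python sketch; the regular expression is written against the log lines quoted in this post and is an assumption about the format, not a general parser for OVS logs.

```python
import re
from collections import Counter

# Matches lines such as:
#   ...|bridge|INFO|bridge br-int: added interface ha-d489f511-9f on port 17744
LINE_RE = re.compile(
    r"bridge (\S+): (added|deleted) interface (\S+) on port (\d+)")

def count_port_churn(lines):
    """Return a Counter keyed by (interface, action)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match.group(3), match.group(2))] += 1
    return counts
```

An interface whose add and delete counts keep climbing together, as ha-d489f511-9f does here, is a good candidate to correlate with the fd growth.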

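Summary:
The neutron.agent.l3.agent log suggests this is the same bug as https://bugzilla.redhat.com/show_bug.cgi?id=1508091: an error while cleaning up a router namespace causes OVS to misbehave as well.
Changing the neutron code does fix the problem, but why the symptom appears and how OVS behaves in this situation still need a deeper investigation.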

Deeper investigation

1. Looking at the neutron code:

except Exception:
    with excutils.save_and_reraise_exception():
        del self.router_info[router_id]
        LOG.exception(_LE('Error while initializing router %s'),
                      router_id)
        self.namespaces_manager.ensure_router_cleanup(router_id)
        try:
            ri.delete()
        except Exception:
            LOG.exception(_LE('Error while deleting router %s'),
                          router_id)

This calls ensure_router_cleanup, which cleans up the network namespace:

def delete(self):
    ns_ip = ip_lib.IPWrapper(namespace=self.name)
    for d in ns_ip.get_devices(exclude_loopback=True,
                               exclude_gre_devices=True):
        if d.name.startswith(INTERNAL_DEV_PREFIX):
            # device is on default bridge
            self.driver.unplug(d.name, namespace=self.name,
                               prefix=INTERNAL_DEV_PREFIX)  # (1)
        elif d.name.startswith(ROUTER_2_FIP_DEV_PREFIX):
            ns_ip.del_veth(d.name)  # (2)
        elif d.name.startswith(EXTERNAL_DEV_PREFIX):  # (3)
            self.driver.unplug(
                d.name,
                bridge=self.agent_conf.external_network_bridge,
                namespace=self.name,
                prefix=EXTERNAL_DEV_PREFIX)
    super(RouterNamespace, self).delete()  # (4)

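Before the RouterNamespace itself is deleted, the related netdev devices are removed, as in (1) and (3), and (2) removes the veth device. However, the only prefixes handled are: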

NS_PREFIX = 'qrouter-'
INTERNAL_DEV_PREFIX = 'qr-'
EXTERNAL_DEV_PREFIX = 'qg-'
# TODO(Carl) It is odd that this file needs this. It is a dvr detail.
ROUTER_2_FIP_DEV_PREFIX = 'rfp-'

Internal devices whose names start with ha-, such as ha-d489f511-9f, are not covered.
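
The gap can be illustrated with a small Python sketch that applies the same prefix checks as delete() to a device name. The constants are copied from the excerpt above; is_cleaned_up is a hypothetical helper for illustration, not a neutron function, and qr-1234 is a made-up device name.

```python
INTERNAL_DEV_PREFIX = 'qr-'
EXTERNAL_DEV_PREFIX = 'qg-'
ROUTER_2_FIP_DEV_PREFIX = 'rfp-'

# Prefixes that the delete() excerpt above acts on.
HANDLED_PREFIXES = (INTERNAL_DEV_PREFIX,
                    ROUTER_2_FIP_DEV_PREFIX,
                    EXTERNAL_DEV_PREFIX)

def is_cleaned_up(dev_name):
    """Return True if delete() would unplug or delete this device."""
    return dev_name.startswith(HANDLED_PREFIXES)

print(is_cleaned_up('qr-1234'))         # handled by branch (1)
print(is_cleaned_up('ha-d489f511-9f'))  # matches no branch
```

A device that matches none of the prefixes is still inside the namespace when the namespace itself is removed.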

2. Trying to reproduce with a script:

ip netns add ovs-test
ovs-vsctl --timeout=10 --oneline --format=json -- add-port br-int ha-d489f511-9f -- set Interface ha-d489f511-9f type=internal external_ids:iface-id=d489f511-9f9f-49e9-b772-f43bf3bfa04b external_ids:iface-status=active external_ids:attached-mac=fa:16:3e:4a:52:36
ip link set netns ovs-test ha-d489f511-9f
ip netns del ovs-test
sleep 1
ovs-vsctl --timeout=10 --oneline --format=json -- --if-exists del-port ha-d489f511-9f

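This reproduces the problem seen in the environment. The key is deleting the namespace before del-port: OVS then does not close the corresponding socket when the port is deleted, so the fd is leaked.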

3. Using systemtap (see the appendix) to capture the call paths in the normal and abnormal cases:
Normal:

-----the delete strace-------
0xffff845a4b6c : dpif_netlink_port_del+0x0/0x134 [/usr/lib64/libopenvswitch-2.12.so.0.0.0]
0xffff844d05cc : dpif_port_del+0x80/0xe4 [/usr/lib64/libopenvswitch-2.12.so.0.0.0]
0xffff8474fee4 : port_del+0x9c/0xb4 [/usr/lib64/libofproto-2.12.so.0.0.0]
0xffff8473fd4c : ofproto_port_del+0x68/0xd0 [/usr/lib64/libofproto-2.12.so.0.0.0]
0xaaaac2e5e53c : bridge_delete_or_reconfigure_ports+0x120/0x370 [/usr/sbin/ovs-vswitchd]
0xaaaac2e5f704 : bridge_reconfigure+0x40c/0x2f0c [/usr/sbin/ovs-vswitchd]
0xaaaac2e629cc : bridge_run+0x220/0x19dc [/usr/sbin/ovs-vswitchd]
0xaaaac2e5a3a0 : main+0x3d4/0x534 [/usr/sbin/ovs-vswitchd]
0xffff83da1714 : __libc_start_main+0xf0/0x1cc [/usr/lib64/libc-2.17.so]
0xaaaac2e5a554 : _start+0x38/0x3c [/usr/sbin/ovs-vswitchd]
-----the del_cached_port strace-------
the name = ha-d489f511-9f
0xffff8457e1e0 : shash_find_and_delete+0x0/0x38 [/usr/lib64/libopenvswitch-2.12.so.0.0.0]
0xffff8473ae6c : ofport_destroy__+0xb0/0xdc [/usr/lib64/libofproto-2.12.so.0.0.0]
0xffff8473debc : ofport_remove+0x50/0x78 [/usr/lib64/libofproto-2.12.so.0.0.0]
0xffff8473f308 : update_port+0x88/0x2e0 [/usr/lib64/libofproto-2.12.so.0.0.0]
0xffff8473fd88 : ofproto_port_del+0xa4/0xd0 [/usr/lib64/libofproto-2.12.so.0.0.0]
0xaaaac2e5e53c : bridge_delete_or_reconfigure_ports+0x120/0x370 [/usr/sbin/ovs-vswitchd]
0xaaaac2e5f704 : bridge_reconfigure+0x40c/0x2f0c [/usr/sbin/ovs-vswitchd]
0xaaaac2e629cc : bridge_run+0x220/0x19dc [/usr/sbin/ovs-vswitchd]
0xaaaac2e5a3a0 : main+0x3d4/0x534 [/usr/sbin/ovs-vswitchd]
0xffff83da1714 : __libc_start_main+0xf0/0x1cc [/usr/lib64/libc-2.17.so]
0xaaaac2e5a554 : _start+0x38/0x3c [/usr/sbin/ovs-vswitchd]

Abnormal:

-----the del_cached_port strace-------
the name = ha-d489f511-9f
0xffff8457e1e0 : shash_find_and_delete+0x0/0x38 [/usr/lib64/libopenvswitch-2.12.so.0.0.0]
0xffff8473ae6c : ofport_destroy__+0xb0/0xdc [/usr/lib64/libofproto-2.12.so.0.0.0]
0xffff8473debc : ofport_remove+0x50/0x78 [/usr/lib64/libofproto-2.12.so.0.0.0]
0xffff8473f308 : update_port+0x88/0x2e0 [/usr/lib64/libofproto-2.12.so.0.0.0]
0xffff8473fa9c : ofproto_run+0x53c/0x680 [/usr/lib64/libofproto-2.12.so.0.0.0]
0xaaaac2e5d450 : bridge_run__+0x17c/0x1d8 [/usr/sbin/ovs-vswitchd]
0xaaaac2e60824 : bridge_reconfigure+0x152c/0x2f0c [/usr/sbin/ovs-vswitchd]
0xaaaac2e629cc : bridge_run+0x220/0x19dc [/usr/sbin/ovs-vswitchd]
0xaaaac2e5a3a0 : main+0x3d4/0x534 [/usr/sbin/ovs-vswitchd]
0xffff83da1714 : __libc_start_main+0xf0/0x1cc [/usr/lib64/libc-2.17.so]
0xaaaac2e5a554 : _start+0x38/0x3c [/usr/sbin/ovs-vswitchd]

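In the normal flow, bridge_delete_or_reconfigure_ports calls ofproto_port_del, which then takes two branches:

bridge_delete_or_reconfigure_ports
  --> ofproto_port_del
    --> ofport_remove (1)
    --> port_del (2)

In the abnormal case ofproto_port_del is never reached at all; ofport_remove runs only because bridge_run detects that the ofport is in an abnormal state. Here is bridge_delete_or_reconfigure_ports: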

    OFPROTO_PORT_FOR_EACH (&ofproto_port, &dump, br->ofproto) {
        ofp_port_t requested_ofp_port;
        struct iface *iface;

        sset_add(&ofproto_ports, ofproto_port.name);

        iface = iface_lookup(br, ofproto_port.name);
        if (!iface) {
            /* No such iface is configured, so we should delete this
             * ofproto_port.
             *
             * As a corner case exception, keep the port if it's a bond fake
             * interface. */
            if (bridge_has_bond_fake_iface(br, ofproto_port.name)
                && !strcmp(ofproto_port.type, "internal")) {
                continue;
            }
            goto delete;
        }

Tracing confirms this: in the fd-leak case, the ofproto_port.name passed to iface_lookup(br, ofproto_port.name) never takes the value ha-d489f511-9f, so the goto delete path that leads to ofproto_port_del is never reached.

    delete:
        iface_destroy(iface);
        del = add_ofp_port(ofproto_port.ofp_port, del, &n, &allocated);
    }
    for (i = 0; i < n; i++) {
        ofproto_port_del(br->ofproto, del[i]);
    }
    free(del);

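So the problem is that the network namespace is deleted before OVS deletes the port. dpif_netlink_port_del contains the step that closes the socket; because the abnormal case never reaches that branch, the socket is left behind in the process.
The following code is what OVS runs through on a delete, and it is where the problem arises: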

    OFPROTO_PORT_FOR_EACH (&ofproto_port, &dump, br->ofproto) {  /* <== (1) */
        ofp_port_t requested_ofp_port;
        struct iface *iface;

        sset_add(&ofproto_ports, ofproto_port.name);
        iface = iface_lookup(br, ofproto_port.name);             /* <== (2) */
        if (!iface) {
            /* No such iface is configured, so we should delete this
             * ofproto_port.
             *
             * As a corner case exception, keep the port if it's a bond fake
             * interface. */
            if (bridge_has_bond_fake_iface(br, ofproto_port.name)
                && !strcmp(ofproto_port.type, "internal")) {
                continue;
            }
            goto delete;
        }

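To verify that del-port really fails because the lookup never sees the relevant name, we debugged with gdb: manually setting ofproto_port.name to ha-d489f511-9f when execution reaches (2) does correctly reclaim the netlink fd. The name at (2) comes from (1), which dumps all ports in the dpif, so this shows that the port has already been removed from the dpif. Enabling dpif debug logging confirms it: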

2021-04-26T08:13:16.076Z|00938|dpif|DBG|system@ovs-system: failed to query port ha-d489f511-9f: No such device
2021-04-26T08:13:16.078Z|00939|dpif|DBG|system@ovs-system: failed to query port ha-d489f511-9f: No such device

The relevant code:
dpif_port_query_by_name

dpif_port_query_by_name(const struct dpif *dpif, const char *devname,
                        struct dpif_port *port)
{
    ....
        VLOG_RL(&error_rl, error == ENODEV ? VLL_DBG : VLL_WARN,
                "%s: failed to query port %s: %s",
                dpif_name(dpif), devname, ovs_strerror(error));
    }
    return error;
}

At this point it is fairly clear that the virtual device is removed when the namespace is deleted, which is why OVS can no longer find it.
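
The effect on the port reconcile loop can be shown with a toy Python model. This is a simplified sketch of the logic discussed above, not OVS code: ports_to_delete is a hypothetical function, the bond fake interface exception is ignored, and port names stand in for the real structures.

```python
def ports_to_delete(datapath_ports, configured_ifaces):
    """Return the ports the reconcile loop would hand to ofproto_port_del.

    Only names returned by the datapath dump are examined. A name that is
    missing from the dump is never considered, whatever the configuration
    says.
    """
    configured = set(configured_ifaces)
    return [name for name in datapath_ports if name not in configured]

# Normal case: del-port removed the iface from the configuration while the
# datapath still has the port, so it is selected for deletion.
normal = ports_to_delete(["br-int", "ha-d489f511-9f"], ["br-int"])

# Abnormal case: deleting the namespace already removed the port from the
# datapath, so the dump never yields it and nothing is selected.
abnormal = ports_to_delete(["br-int"], ["br-int"])
```

In the abnormal case nothing is selected, so the path that closes the per-port netlink socket never runs for that port.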

  • Look at the kernel code that runs when a namespace is deleted
    ops_exit_list

    static void ops_exit_list(const struct pernet_operations *ops,
                              struct list_head *net_exit_list)
    {
        struct net *net;
        if (ops->exit) {                    /* <== (1) */
            list_for_each_entry(net, net_exit_list, exit_list)
                ops->exit(net);
        }
        if (ops->exit_batch)                /* <== (2) */
            ops->exit_batch(net_exit_list);
    }

    The concrete function called at (1) is default_device_exit:

    static void __net_exit default_device_exit(struct net *net)
    {
        struct net_device *dev, *aux;
        ......
            /* Leave virtual devices for the generic cleanup */
            if (dev->rtnl_link_ops)
                continue;
        .....
        rtnl_unlock();
    }

    This function leaves virtual devices to the generic cleanup, which means they are handled at (2), in
    default_device_exit_batch:

    static void __net_exit default_device_exit_batch(struct list_head *net_list)
    {
        .....
        rtnl_lock_unregistering(net_list);
        list_for_each_entry(net, net_list, exit_list) {
            for_each_netdev_reverse(net, dev) {
                if (dev->rtnl_link_ops && dev->rtnl_link_ops->dellink)
                    dev->rtnl_link_ops->dellink(dev, &dev_kill_list);
                else
                    unregister_netdevice_queue(dev, &dev_kill_list);  /* <== (a) */
            }
        }
        unregister_netdevice_many(&dev_kill_list);
        rtnl_unlock();
    }

    Execution eventually reaches (a), which cleans up and deletes the netdev.

Here is the code for the OVS internal device:

static void do_setup(struct net_device *netdev)
{
    ether_setup(netdev);

    netdev->max_mtu = ETH_MAX_MTU;

    netdev->netdev_ops = &internal_dev_netdev_ops;
    ....
    netdev->rtnl_link_ops = &internal_dev_link_ops;
    ......
}

netdev->rtnl_link_ops is set; the structure it points to is:

static struct rtnl_link_ops internal_dev_link_ops __read_mostly = {
    .kind = "openvswitch",
};

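dev->rtnl_link_ops->dellink is not set, so, as analyzed above, the device ends up being deleted for real in default_device_exit_batch.
Summary: when a namespace is deleted, the kernel removes the internal-type netdevices inside it, so it is logical that OVS can no longer find the device. The real fault is in neutron: on the error path it deletes only the other device types and not the ha-xxxx devices, which leaves the OVS fds behind.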
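
The kernel-side decision described above can be restated as a toy Python model. This is only a sketch of the two branches quoted from default_device_exit and default_device_exit_batch, with booleans standing in for the kernel structures; the function names are hypothetical.

```python
def left_for_generic_cleanup(has_rtnl_link_ops):
    """default_device_exit skips devices that have rtnl_link_ops set."""
    return has_rtnl_link_ops

def batch_cleanup_action(has_rtnl_link_ops, has_dellink):
    """Branch taken in default_device_exit_batch for one device."""
    if has_rtnl_link_ops and has_dellink:
        return "dellink"
    return "unregister_netdevice_queue"

# OVS internal device: rtnl_link_ops is set (.kind = "openvswitch"),
# but no dellink callback is provided.
ovs_internal_action = batch_cleanup_action(has_rtnl_link_ops=True,
                                           has_dellink=False)
```

For the OVS internal device this selects unregister_netdevice_queue, which matches the conclusion that the device disappears together with the namespace.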

Appendix:

#!/usr/bin/env stap
probe process("/usr/lib64/libopenvswitch-2.12.so.0").function("shash_find_and_delete") {
    if (user_string($name) == "ha-d489f511-9f") {
        printf("\n-----the del_cached_port strace-------\n");
        printf("the name = %s\n", user_string($name));
        print_ubacktrace();
    }
}
probe process("/usr/lib64/libopenvswitch-2.12.so.0").function("dpif_netlink_port_del") {
    printf("\n-----the delete strace-------\n");
    print_ubacktrace();
}