Description
Important notices
Before you add a new report, we ask you kindly to acknowledge the following:
- I have read the contributing guidelines at https://github.com/opnsense/src/blob/master/CONTRIBUTING.md
- I am convinced that my issue is new after having checked both open and closed issues at https://github.com/opnsense/src/issues?q=is%3Aissue
Describe the bug
Boxes sharing addresses using CARP cause routing loops/packet amplification as soon as NetFlow is activated on CARP-enabled interfaces and both boxes receive the same packet.
A CARP backup box routes packets destined for addresses of the current CARP master (both its physical and its virtual address), and afterwards the same happens in the opposite direction.
To Reproduce
Steps to reproduce the behavior:
- Set up two boxes
- Delete NAT rules and create a floating any/any/allow rule, allow local nets, etc. for debugging
- Set up the following network layout
                        TEST-PC1
                           |
         --+---------------+----------------+--
           |                                |
          em0                              em0
    +-------------+                  +-------------+
    | OPNsense 01 |em2 <-pfsync-> em2| OPNsense 02 |
    +-------------+                  +-------------+
          em1                              em1
           |                                |
         --+---------------+----------------+--
                           |
                        TEST-PC2
LAN (em0)
CARP VIP 192.168.50.10/24
OPNsense 01 192.168.50.11/24
OPNsense 02 192.168.50.12/24
TEST-PC1 192.168.50.50/24
WAN (em1)
CARP VIP 192.168.222.10/24
OPNsense 01 192.168.222.11/24
OPNsense 02 192.168.222.12/24
TEST-PC2 192.168.222.50/24
pfsync (em2)
OPNsense 01 192.168.99.1/24
OPNsense 02 192.168.99.2/24
- Set OPNsense CARP VIPs as gateway on Test-PCs
- Verify that packets route correctly and do failover tests (should succeed)
- Enable NetFlow for em0 and em1 with local collection
- Reboot
To test, you can ping Test-PC2 from Test-PC1, ping a box's physical address, or ping the CARP VIP.
You will see that both boxes process the packet: the box the packet is not destined for routes it again with the TTL decremented by 1, which results in a loop until the TTL expires.
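To give a feeling for the amplification, here is a minimal sketch of the loop as a toy model. It is not the real forwarding path: it simply assumes that every copy on the flooded segment reaches both boxes and that the box which did not send the previous copy routes it again with the TTL reduced by one. The function name `simulate_loop` and the order of the box names are made up for illustration.

```python
from typing import List, Tuple


def simulate_loop(
    initial_ttl: int,
    boxes: Tuple[str, str] = ("OPNsense 02", "OPNsense 01"),
) -> List[Tuple[str, int]]:
    """Return (forwarding box, TTL of the forwarded copy) per re-routed copy.

    Assumption: boxes[0] is the non-destined box that forwards first,
    after which the two boxes alternate until the TTL is used up.
    """
    trace: List[Tuple[str, int]] = []
    ttl = initial_ttl
    hop = 0
    # A router only forwards while the received TTL is greater than 1;
    # otherwise the TTL would reach 0 and the packet is dropped.
    while ttl > 1:
        ttl -= 1
        trace.append((boxes[hop % 2], ttl))
        hop += 1
    return trace


trace = simulate_loop(64)
print(len(trace))  # 63 extra copies for a single packet sent with TTL 64
print(trace[:3])   # first three re-routed copies with TTL 63, 62, 61
```

Under these assumptions a single echo request sent with TTL 64 shows up 63 additional times on the wire, which matches the descending TTL values visible in a packet capture.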
Important:
If you are using a switched network, you will see the behaviour only while the switch is flooding packets (e.g. at DLF, after STP-TC, etc.). This limits the effect to a very short time. For a persistent lab environment, use hubs or a non-switched virtual network.
To stop or mitigate the problem, you can stop the samplicate service, manually shut down the netflow_emX interfaces using ngctl, or disable CARP.
Expected behavior
CARP Failover Cluster with NetFlow functionality without re-routed/looped packets.
Describe alternatives you considered
Disable NetFlow or use NetFlow without CARP.
Screenshots
/
Relevant log files
/
Additional context
I tried to reproduce a real issue observed with hardware boxes. This finding is not limited to my tests; it is the result of debugging a real network.
Environment
Software version used:
OPNsense 25.7.11_2-amd64