-
Notifications
You must be signed in to change notification settings - Fork 12
improve ARP cache logic #33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
66556b3 to
afc434f
Compare
|
Here is how the now fast recovery from a bogus gratuitous ARP request looks like. The ping that is running in parallel only lost 9 packets: |
e59f514 to
1c8f6cb
Compare
|
Thanks for your contribution. Why is it a draft PR (is it ready for review or not)? Could you please keep unrelated changes out of the PR (here, for sure the "let trace = ..." should be removed to keep the diff small). |
|
Is this "improved cach logic" taken from some book/paper/implementation? |
As outlined on https://mirageos.org/security, please send security issues to the mail address provided there. Thanks a lot. From what I understand, please tell me if this is correct or wrong: your worry is that gratuitous ARP packets are accepted and overwrite existing entries? So, if a (local, layer 2) attacker sends an gratuitous ARP packet, a unikernel will use this advertised MAC address to communicate with the IP address provided.
So, if I understand you correctly, there's now a not-so-obvious state machine in the code, and there's some "Probing" intervals going on (with unicast packets). Now, if someone sends a gratuitous ARP, the unikernel will probe 5 times (unicast), and the remove the entry? Doesn't this lead to another denial of service attack (amplification - as an attacker I only have to send a single ARP packet, and suddenly the layer 2 network is filled with ARP requests (with an amplification factor of 5))? -- Please note that I'm sure there are better strategies for how to handle gratuitous ARP, and this ARP implementation is not perfect. But I'm curious about how other implementations do it, and what exactly are the concerns and improvements in this PR. |
That was on purpose. I wanted to see first if the CI is green and also experiment a bit more with the behavior. For me "Draft" means: reviews are welcome, but possibly larger parts of the code will still change or the whole PR might get redacted. But of course that can be different from project to project
In a separate commit, or completely out of the PR? I mean, in this specific case it's not important anyway, can remove it completely. But in general I always think it's a good opportunity to clean up if you touch it anyway. (But in separate commits then, true.)
As I wrote above it is inspired by NUD in IPv6 (see RFC4861 Sec 7.3) mainly by observing the behaviour of the Linux kernel, which apparently uses the NUD state machine also for IPv4. However NUD assumes access to "confirmations" from upper-level protocols, which we don't have here, so I removed the DELAY state, for example.
Yeah, maybe "DoS" is not the right term, I don't really see it as a security issue, more as an interoperability issue. But right, I should probably be more aware of this.
My issue is less of a worry, but an observation while I was investigating network issues with the unikernel in a busy network. But yes, your description is correct. If I send a single gratuitous ARP packet with a "wrong" mac address to the unikernel for my local IP address, I can't reach it anymore for 20 minutes. But I can recover from this state by sending another gratuitous ARP packet with the correct MAC address. The issue is that gratuitous ARP packets for known IPs are trusted without confirmation. I compared that with how the Linux kernel behaved, which recovered within 10s by itself, and that's how I learned about the NUD, which it is using.
Almost. Unsolicited updates (gratuitous ARP) are accepted as before, but now it changes the entry immediately to STALE state. (That is a MUST in the NUD spec). This means that next time this entry is queried it will go directly into PROBE (I skip the DELAY state, because we don't have the option go get "confirmation" by upper-layers.) In PROBE state it tries for a few times and rate limited to confirm that MAC address with unicast requests. Either they are answered, which refreshes the cache, or not, then the entry is removed after 5 times.
I can't imagine how that would work. You only can cause that for already existing ARP cache entries and only once per entry, and it is rate limited. Also it is what the Linux kernel is doing and what is described in RFC4861, so I don't think that can be a real issue.
We experience real-world connectivity issues in certain busy networks (which might contain many ill-behaving nodes), and while investigating, this was the first issue that I could find. (There might be also other issues, I'm still waiting for confirmation, because unfortunately I don't have direct access to that network.) So the goal is to make the ARP cache similarly resilient in such environments as the Linux kernel. In any case I think it makes not so much sense, that ARP cache entries are only probed once every 20 minutes and stay there forever, even if they are never queried again. |
1c8f6cb to
8eec8d7
Compare
|
So, I'm not very convinced by this PR. I understand the desire, I'm not sure it brings us much. For reference, I studied various ARP implementations:
Putting trust or not into layer 2 packets is hard -- esp. with the different scenarios you may have: some computer with random mac addresses restarting, some local attacker trying to poison your ARP cache. From your implementation, I still don't understand why there's need for separate Stale and Probing states? And what the It'd as well be helpful if someone else shares their opinion on this PR. |
Ok, so maybe we go step by step and we first check on which points we are on one page before we go into implementation specifics: Do you also see a problem in the current behaviour, that a gratuitous ARP request for an already present IP in the cache makes the original node unreachable for 20 minutes, if the node doesn't constantly send gratuitous ARP requests itself? Even if this is not considered as a security issue, it is at least not a very robust behaviour.
Thanks for the research, that's interesting. Linux seems to use NUD for IPv4 since ages (2.2/2.4) and Windows since Vista. I wonder why *BSD didn't follow that.
Right, but isn't that a reason to not blindly trust a gratuitous MAC address update for flat 20 minutes? Even the very old original RFC 1122 (1989!) says regarding "ARP Cache Validation":
And this is from 35 years ago when the low bandwidth in large collision domains (10BASE2/10BASE5) made too many ARP packets relatively expensive. Today these numbers should be more than safe. And proxy ARP is omnipresent with VM infrastructures etc.
Let's go into detail here after we agreed on a problem statement, but you can basically use the definitions from RFC 4861: The comments of the values should explain that, but here rephrased:
I also definitely would appreciate that. Since this is a quite delicate low-level interop component, I hesitate to use my implementation as a downstream fork for now, even if the community is not interested in this improvement. It would be great to have some input. |
|
@haesbaert maybe you are a good candidate to have a look at this, if we can nerd-snipe you? I remember you helped us out already before with TCP/IP stuff. 😄 |
|
@ansiwen we (@hannesm, @reynir, me) discussed this PR at the Mirage meeting. There is not a clear consensus for now, but we all agree that the PR would benefit from:
Could you kindly provide that? |
I'm not sure I follow this, you say it keeps forever, but it expires after 20minutes? I haven't read any arp RFC, so I don't have a lot to add, but here goes some speculation.
While this seem to make sense, I wonder, why not just rely on the refresh interval of 1minute? Anything that is older than 1minute needs to get probed, if it gets an answer it continues to exist, if not, it's zapped. If, during this 1 minute of staleness, you get a packet saying the IP changed, you mark the existing stale entry as "contested" or whatever, contested means, once it expires, you do a broadcast request asking for whoever the new owner is. With this you don't need two new states, only one. I might be misunderstanding something so take it with a grain of salt. Concerning why OpenBSD and FreeBSD don't do what linux does, I'd wager it's likely because linux tries to solve everything in the most complex way since it doesn't know how it is supposed to be used, while Unixes compromise so they can have something simpler. Also other things come to mind, I assume the number of entries is bound as well right? Otherwise one can just populate it with crap until OOM, you know, ocaml style. |
|
Thanks a lot for having a look, @haesbaert!
After 20 minutes it probes an entry exactly one time (but with a normal broadcast, so not exactly a probe), and removes it only, if it is not reachable anymore. It does that not matter if the entry has been in use or not. So, if you contacted another node one time, it will stay in the ARP cache as long as that node is reachable.
If you add an internal
Well, in this case I would argue, Linux (and Windows) have something simpler, because they have only one implementation, since they use NUD for both IPv4 and IPv6, while the BSDs have two separate implementations. (They still need NUD for IPv6).
Yes, that was indeed the case until recently, but since #35 this is mitigated. |
Thanks for discussing this PR! Sure, I can do that! |
2a1dd61 to
96ea56d
Compare
6b10cec to
6cff3cf
Compare
|
I also reimplemented the racy |
Yes, a separate PR with the changes for that would be great. Bonus points if the commit targets the main branch, and does not depend on your state machine work. |
|
Thanks for your state transition diagram and your table. For the events "reply received", I understand it. Please correct me if I'm wrong in respect to timings: You propose to have (
What is the reasoning for reducing the retries? And please also help me to understand - in Probing you send out ARP requests to the unicast address, how would there a reply with a different mac is received? Is this a gratuitous ARP? I think I understand now much better what you try to achieve here. It is still the case that we can't do much about ARP spoofing (even with this PR, an attacker can still continue to send lots of ARP replies from e.g. the IP address of the router -- and the ARP implementation will take them as valid) --> but that's how it is, if you're in an adversarial context, you're better off to configure the mac address of the router statically (something that I suspect is not (yet?) exposed in MirageOS -- could be a command-line argument). |
| let handle_reply t source mac = | ||
| let extcache = | ||
| let cache = M.add source (Dynamic (mac, t.epoch + t.timeout)) t.cache in | ||
| let update_cache () = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this renamed, and moved to be a function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because extcache is not used in all branches, so I think it is better to do it on demand. For me at least update_cache is easier to understand.
e1dce01 to
41fd60f
Compare
This change stabilizes the `entries_expire` mirage test by removing timing dependencies, fixes it for CI environments with slow ticks (they seem to have up to ~20ms ticks) and re-enables it for all platforms again.
41fd60f to
52c6278
Compare
The old ARP cache implementation is very simple, and keeps entries of reachable but unused neighbors forever in the cache. Also it never probes if entries are still valid before the timeout is reached, which is by default a very long 20 minutes. Combined with the fact that gratuitous ARP requests for existing entries are accepted, this makes a simple DoS attack possible, that only recovers after these 20 minutes. This is an improved implementation of the ARP cache logic, which handles dynamic network changes a lot better. It is inspired by the NUD of IPv6, but a lot simpler. It only introduces two more cache entry states: Stale and Probing. After the `refresh` interval (default: 1 minute) a Dynamic entry goes into Stale state. If an entry changes its MAC address because of a gratuitous ARP request, it goes into Stale immediately. If a Stale entry is queried, it goes into Probing state, where it sends ARP requests similar to the Pending state, but as unicasts to the last known MAC address. If the probing is not successful, the entry is removed and further queries cause a normal broadcast request, otherwise it is replaced with a fresh Dynamic entry. If a Stale entry reaches `timeout` age (default 20 minutes), because it never has been queried, it is removed without further probing. Compared to the old implentation the default values create on the happy path only one more unicast ARP request per minute and ARP cache entry.
52c6278 to
21c6f07
Compare
That seems correct.
It was 6 requests before (1 + 5 retries), which felt a bit excessive to me. So I just went with what IPv6 NUD is using, which explicitly uses 3 (1 + 2 retries). See
Yeah, that is not likely, but the code would still handle it, no matter if it is a real reply packet or a gratuitous ARP announcement (which is technically a request packet) which arrives coincidentally at that moment. In both cases the situation would be unclear and Stale state is therefore appropriate.
Right, it's not perfect. But I would argue it is still a lot more resilient than before, because in such situations it would constantly try to verify the entries and therefore also recover a lot quicker. So it makes an attack more expensive and less disruptive. |
BTW: if that would be new code I would have called |
The old ARP cache implementation is very simple, and keeps entries of reachable but unused neighbors forever in the cache. Also it never probes if entries are still valid before the timeout is reached, which is by default a very long 20 minutes. Combined with the fact that gratuitous ARP requests for existing entries are accepted, this makes a simple DoS attack possible, that only recovers after these 20 minutes.
This is an improved implementation of the ARP cache logic, which handles dynamic network changes a lot better. It is inspired by the NUD of IPv6, but a lot simpler. It only introduces two more cache entry states: Stale and Probing. After the
refreshinterval (default: 1 minute) a Dynamic entry goes into Stale state. If an entry changes its MAC address because of a gratuitous ARP request, it goes into Stale immendiately. If a Stale entry is queried, it goes into Probing state, where it sends ARP requests similar to the Pending state, but as unicasts to the last known MAC address. If the probing is not successful, the entry is removed and further queries cause a normal broadcast request, otherwise it is replaced with a fresh Dynamic entry. If a Stale entry reachestimeoutage (default 20 minutes), because it never has been queried, it is removed without further probing.Compared to the old implentation the default values create on the happy path only one more unicast ARP request per minute and ARP cache entry.
Depends on #36