TCP keepalive support - Traefik unusable in Azure #1046
Description
Possibly related to #727, but I am not sure whether that issue is only about HTTP keep-alive (which is different from TCP keepalive).
I haven't found any documentation (or other issues) about TCP keepalive in Traefik.
Why? [1]
The other useful goal of keepalive is to prevent inactivity from disconnecting the channel. It's a very common issue, when you are behind a NAT proxy or a firewall, to be disconnected without a reason.
I am experiencing this exact issue on Azure. Even though I use a public IP that is attached directly to my VM, Azure forces a NAT gateway in front of it for no apparent reason (it's a public IP with a 1:1 relationship to the VM!), and I cannot do anything about it. The issue only manifests when I put Cloudflare in front of Traefik, because Cloudflare keeps connections open to optimize latency. Azure's "SNAT" implementation [1] kills the connection at the 4-minute idle mark. I verified this with netcat and tcpdump.
Microsoft's cloud is actually pretty disgusting compared to any other sane alternatives like AWS or DigitalOcean in many respects (I'll probably write a blog post about it), but I have to use them for business reasons.
Would it be possible to add TCP keepalive support to Traefik? Any pointers on which file the patch should go into? It probably belongs at the point where the server accepts a new connection on the socket, but I couldn't find that code (perhaps it's somewhere in mailgun/manners).
For reference: Nginx keepalive switches: http://nginx.org/en/docs/http/ngx_http_core_module.html (search for so_keepalive)
The Linux kernel also has keepalive settings, but keepalive is opt-in per socket, so the application has to do the equivalent of setsockopt(so, SOL_SOCKET, SO_KEEPALIVE, 1) before the kernel starts sending keepalive probes. More on this in link [2].