Bug fixes for some resource leaks and access to freed resources, etc#122
ston3lu wants to merge 4 commits into vitalif:master
Hi. Hope this is helpful.
Hi, thank you very much for your fixes. I'm only doubtful about the "handle missing peer clients" change. Did you actually see crashes at that .at()? The problem is that just checking by peer_fd isn't really correct, because if a peer_fd is freed and closed then the FD number may be reused. Checking the peer by osd_client_t* pointer isn't correct either, because the pointer can also be reused: allocators like to reuse chunks of the same size. So msgr_receive.cpp tries to guarantee that it never calls exec_op() on destroyed clients. Now it even clears postponed operations from the "immediate callback queue" when the client is destroyed... And if you've seen crashes where the client was missing when looked up by peer_fd, then it probably means this protection didn't work, and the same code path could just as well end up using an invalid, reused peer_fd. So we'd better find and fix the real bug instead of replacing .at() with .find() != .end(). From looking at the code, it seems that this protection may only break when a recovery operation is postponed by recovery_target_sleep_us...
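To make the FD reuse concern concrete, here is a minimal sketch. It is not Vitastor code: `demo_client_t`, `demo_registry_t`, `run_postponed_op()` and `demo_fd_reuse()` are hypothetical names, and the registry only imitates the POSIX rule that a newly opened descriptor gets the lowest free number. Under those assumptions it shows that a `.find() != .end()` check catches a missing client but not a replaced one.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for osd_client_t: only what the demo needs.
struct demo_client_t
{
    int peer_fd;
    std::string name;
};

// Hypothetical registry keyed by FD number. add_client() imitates the
// kernel handing out the lowest free descriptor number.
struct demo_registry_t
{
    std::map<int, demo_client_t> clients;

    int add_client(const std::string & name)
    {
        int fd = 3; // assume 0..2 are taken by stdin/stdout/stderr
        while (clients.count(fd))
            fd++;
        clients[fd] = demo_client_t{ fd, name };
        return fd;
    }

    void stop_client(int fd)
    {
        clients.erase(fd);
    }
};

// A postponed operation that remembered only the FD number.
// Returns the name of the client it would act on, or "" if none is found.
std::string run_postponed_op(const demo_registry_t & reg, int saved_fd)
{
    auto it = reg.clients.find(saved_fd);
    if (it == reg.clients.end())
        return ""; // the find() check handles a missing entry...
    return it->second.name; // ...but cannot tell a reused FD from the original
}

void demo_fd_reuse()
{
    demo_registry_t reg;
    int old_fd = reg.add_client("osd.1");
    reg.stop_client(old_fd);

    // Entry is gone: the find() check skips the operation.
    assert(run_postponed_op(reg, old_fd).empty());

    // A new peer connects and receives the same FD number.
    int new_fd = reg.add_client("osd.2");
    assert(new_fd == old_fd);

    // The stale operation now passes the check and hits the wrong client.
    assert(run_postponed_op(reg, old_fd) == "osd.2");
}
```

Calling `demo_fd_reuse()` from a test or from `main()` exercises both cases. The second one is the reason the lookup check alone does not fix the underlying bug: the stale operation has to be cancelled when its client is destroyed.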
"Fix peer connection handling by checking for existing entries" is similar: did you really observe crashes in on_connect_peer? It shouldn't be possible. on_connect_peer is invoked either from connect_timeout or from handle_connect_epoll, and neither of them should run more than once for the same FD, nor should both run for the same FD.
Okay... this one occurred in the check_peer_config() callback. wanted_peers.erase() is only called in on_connect_peer(). It means that on_connect_peer() was already called for that OSD with a successful connection! I suspect it's also related to "handle missing peer_fd", i.e. maybe it was a recreated client or something like that...

I accept Vitastor CLA agreement: https://git.yourcmc.ru/vitalif/vitastor/src/branch/master/CLA-en.md