-
Notifications
You must be signed in to change notification settings - Fork 0
7983: rt: (io-uring) ensure fds are closed when invoking cancel op #80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
96188ed
970f3c7
166374f
1d04d3f
a3f25f3
08aa7bc
555c9ec
2b92733
99a243d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2,12 +2,14 @@ use io_uring::{squeue::Entry, IoUring, Probe}; | |
| use mio::unix::SourceFd; | ||
| use slab::Slab; | ||
|
|
||
| use crate::runtime::driver::op::CancelData; | ||
| use crate::runtime::driver::op::CqeResult; | ||
| use crate::runtime::driver::op::{Cancellable, Lifecycle}; | ||
| use crate::{io::Interest, loom::sync::Mutex}; | ||
|
|
||
| use super::{Handle, TOKEN_WAKEUP}; | ||
|
|
||
| use std::os::fd::{AsRawFd, RawFd}; | ||
| use std::os::fd::{AsRawFd, FromRawFd, OwnedFd, RawFd}; | ||
| use std::{io, mem, task::Waker}; | ||
|
|
||
| const DEFAULT_RING_SIZE: u32 = 256; | ||
|
|
@@ -77,9 +79,14 @@ impl UringContext { | |
| waker.wake_by_ref(); | ||
| *ops.get_mut(idx).unwrap() = Lifecycle::Completed(cqe); | ||
| } | ||
| Some(Lifecycle::Cancelled(_)) => { | ||
| Some(Lifecycle::Cancelled(CancelData::Open(_))) => { | ||
| if let Ok(fd) = CqeResult::from(cqe).result { | ||
| // SAFETY: the successful CQE result provides | ||
| // a non-negative integer, and the event is | ||
| // related to an open operation. | ||
| unsafe { OwnedFd::from_raw_fd(fd as i32) }; | ||
| } | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Cancelled Read/Write operations cause runtime panicHigh Severity The previous code matched all Additional Locations (1)
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. value:useful; category:bug; feedback: The Bugbot AI reviewer is correct! Before the proposed changes any cancelation will lead to removing the index from the list of operations. With the proposed changes only canceled Open operations are handled. Prevents leaking index operations for all but the Open operations.
Comment on lines
+82
to
+88
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This still misses the post-completion cancel path. This only handles the case where the slot is already 🤖 Prompt for AI Agents |
||
| // Op future was cancelled, so we discard the result. | ||
| // We just remove the entry from the slab. | ||
| ops.remove(idx); | ||
| } | ||
|
Comment on lines
+82
to
91
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This change correctly handles closing file descriptors for cancelled To fix this, you should handle all Additionally, using Some(Lifecycle::Cancelled(cancel_data)) => {
if matches!(cancel_data, CancelData::Open(_)) {
if let Ok(fd) = CqeResult::from(cqe).result {
// SAFETY: the successful CQE result provides
// a non-negative integer, and the event is
// related to an open operation.
let _ = unsafe { OwnedFd::from_raw_fd(fd as i32) };
}
}
// Op future was cancelled, so we discard the result.
ops.remove(idx);
}
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. value:useful; category:bug; feedback: The Gemini AI reviewer is correct! Before the proposed changes any cancelation will lead to removing the index from the list of operations. With the proposed changes only canceled Open operations are handled. Prevents leaking index operations for all but the Open operations.
Comment on lines
+82
to
91
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Don't regress cancelled read/write ops into a panic.
Suggested fix- Some(Lifecycle::Cancelled(CancelData::Open(_))) => {
- if let Ok(fd) = CqeResult::from(cqe).result {
- // SAFETY: the successful CQE result provides
- // a non-negative integer, and the event is
- // related to an open operation.
- unsafe { OwnedFd::from_raw_fd(fd as i32) };
- }
+ Some(Lifecycle::Cancelled(cancel_data)) => {
+ let is_open = matches!(cancel_data, CancelData::Open(_));
+ if is_open {
+ if let Ok(fd) = CqeResult::from(cqe).result {
+ // SAFETY: the successful CQE result provides
+ // a non-negative integer, and the event is
+ // related to an open operation.
+ drop(unsafe { OwnedFd::from_raw_fd(fd as i32) });
+ }
+ }
// Op future was cancelled, so we discard the result.
ops.remove(idx);
}🤖 Prompt for AI Agents
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. value:useful; category:bug; feedback: The CodeRabbit AI reviewer is correct! Before the proposed changes any cancelation will lead to removing the index from the list of operations. With the proposed changes only canceled Open operations are handled. Prevents leaking index operations for all but the Open operations. |
||
| Some(other) => { | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -9,6 +9,7 @@ | |
| ))] | ||
|
|
||
| use futures::future::FutureExt; | ||
| use std::fs; | ||
| use std::sync::mpsc; | ||
| use std::task::Poll; | ||
| use std::time::Duration; | ||
|
|
@@ -145,6 +146,53 @@ async fn cancel_op_future() { | |
| assert!(res.is_cancelled()); | ||
| } | ||
|
|
||
| // see: https://github.com/tokio-rs/tokio/issues/7979 | ||
| #[tokio::test] | ||
| async fn file_descriptors_are_closed_when_cancelling_open_op() { | ||
| let (_tmp_file, path): (Vec<NamedTempFile>, Vec<PathBuf>) = create_tmp_files(1); | ||
|
|
||
| let fd_count_before_access = fs::read_dir("/proc/self/fd").unwrap().count(); | ||
|
|
||
| for _ in 0..128 { | ||
| let (tx, mut rx) = tokio::sync::mpsc::unbounded_channel(); | ||
|
|
||
| let path = path.clone(); | ||
| let handle = tokio::spawn(async move { | ||
| poll_fn(|cx| { | ||
| let opt = { | ||
| let mut opt = tokio::fs::OpenOptions::new(); | ||
| opt.read(true); | ||
| opt | ||
| }; | ||
|
|
||
| let fut = opt.open(&path[0]); | ||
|
|
||
| // If io_uring is enabled (and not falling back to the thread pool), | ||
| // the first poll should return Pending. | ||
| let _pending = Box::pin(fut).poll_unpin(cx); | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This stores the first poll result in Severity: medium 🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. value:useful; category:bug; feedback: The Augment AI reviewer is correct! The result of poll_unpin() is just ignored assuming that it is Pending but it is not asserted. Prevents silently passing the test even when the result is not Pending |
||
|
|
||
| tx.send(()).unwrap(); | ||
|
|
||
| Poll::<()>::Pending | ||
| }) | ||
| .await; | ||
| }); | ||
|
|
||
| // Wait for the first poll | ||
| rx.recv().await.unwrap(); | ||
|
|
||
| handle.abort(); | ||
|
|
||
| let res = handle.await.unwrap_err(); | ||
| assert!(res.is_cancelled()); | ||
| } | ||
|
|
||
| let fd_count_after_cancel = fs::read_dir("/proc/self/fd").unwrap().count(); | ||
| let leaked = fd_count_after_cancel.saturating_sub(fd_count_before_access); | ||
|
|
||
| assert!(leaked <= 64); | ||
|
Comment on lines
+154
to
+193
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This regression can pass without ever exercising the bug.
🤖 Prompt for AI Agents
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. value:useful; category:bug; feedback: The CodeRabbit AI reviewer is correct! The result of poll_unpin() is just ignored assuming that it is Pending but it is not asserted. Prevents silently passing the test even when the result is not Pending |
||
| } | ||
|
|
||
| fn create_tmp_files(num_files: usize) -> (Vec<NamedTempFile>, Vec<PathBuf>) { | ||
| let mut files = Vec::with_capacity(num_files); | ||
| for _ in 0..num_files { | ||
|
|
||


There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CancelDataalso hasRead/Writevariants; with this match only handlingCancelled(Open(_)), any cancelled read/write that later completes will now hit theSome(other)arm and panic the runtime. That seems like a functional regression from the previousCancelled(_)handling (remove-and-discard).Severity: high
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value:useful; category:bug; feedback: The Augment AI reviewer is correct! Before the proposed changes any cancelation will lead to removing the index from the list of operations. With the proposed changes only canceled Open operations are handled. Prevents leaking index operations for all but the Open operations.