Open
Description
Currently, the create_infer_request method of CompiledModel is defined with &mut self as the receiver:
pub fn create_infer_request(&mut self) -> Result<InferRequest> {
    // implementation
}
However, code analysis and confirmation from the OpenVINO team show that the &mut self requirement is unnecessary:
The method does not modify any internal state of CompiledModel (it only uses self.ptr as a read-only handle to create an InferRequest).
The underlying C API ov_compiled_model_create_infer_request only requires the ov_compiled_model_t* pointer as a read-only parameter, with no mutable operations on the model itself.
The OpenVINO team has confirmed that the underlying C++ methods that would be exposed through &self (including use of the model handle) are thread-safe.
Requiring &mut self needlessly obstructs multi-threaded use (e.g., gRPC services sharing a CompiledModel across threads). Developers must wrap the model in a synchronization primitive such as Mutex, which adds overhead under high concurrency.
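To illustrate the obstacle described above, here is a minimal sketch that uses hypothetical stand-in types (MockCompiledModel and MockInferRequest are not part of the openvino crate). It only mirrors the current &mut self receiver: the method reads a handle and mutates nothing, yet every thread still has to take the lock.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for `CompiledModel`; `handle` plays the role of `self.ptr`.
pub struct MockCompiledModel {
    handle: u64,
}

/// Hypothetical stand-in for `InferRequest`.
pub struct MockInferRequest {
    pub model_handle: u64,
}

impl MockCompiledModel {
    /// Mirrors the current receiver: `&mut self`, although nothing is mutated.
    pub fn create_infer_request(&mut self) -> MockInferRequest {
        MockInferRequest { model_handle: self.handle }
    }
}

/// Creates one request per worker thread. Because the method takes `&mut self`,
/// the shared model must be wrapped in a `Mutex` and locked for every call.
pub fn create_requests_with_mutex(workers: usize) -> Vec<u64> {
    let model = Arc::new(Mutex::new(MockCompiledModel { handle: 42 }));

    let threads: Vec<_> = (0..workers)
        .map(|_| {
            let model = Arc::clone(&model);
            thread::spawn(move || {
                // The lock serializes all callers, even though the call is read-only.
                let mut guard = model.lock().unwrap();
                let request = guard.create_infer_request();
                request.model_handle
            })
        })
        .collect();

    threads.into_iter().map(|t| t.join().unwrap()).collect()
}
```

In this sketch the Mutex exists only to satisfy the borrow checker; it protects no state that actually changes.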
Proposed Enhancement
Modify the receiver of CompiledModel::create_infer_request from &mut self to &self, while keeping the rest of the implementation logic unchanged. This change will:
Remove the spurious mutability requirement and align the Rust API with the actual thread-safety characteristics of the underlying implementation.
Enable lock-free sharing of CompiledModel across threads (once a Sync implementation is added, which the OpenVINO team has also confirmed is sound).
Improve developer experience without breaking existing callers: code that holds a mutable binding still compiles, and the only follow-up is removing mut bindings that are no longer needed.
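The proposed change could look roughly like the sketch below. It again uses hypothetical stand-in types (MockHandle, MockCompiledModel, MockInferRequest) rather than the real openvino crate, and a heap-allocated counter in place of the C handle. The unsafe impl lines assume, as the issue states, that the underlying handle is safe to use from several threads at once.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical stand-in for the opaque C handle `ov_compiled_model_t`.
struct MockHandle {
    requests_created: AtomicUsize,
}

/// Hypothetical stand-in for `CompiledModel`. It stores a raw pointer, as an
/// FFI wrapper would, so it is neither `Send` nor `Sync` by default.
pub struct MockCompiledModel {
    ptr: *mut MockHandle,
}

/// Hypothetical stand-in for `InferRequest`.
pub struct MockInferRequest {
    pub id: usize,
}

// SAFETY (assumption of this sketch): the handle behind `ptr` tolerates
// concurrent use, matching the thread-safety confirmation cited in the issue.
unsafe impl Send for MockCompiledModel {}
unsafe impl Sync for MockCompiledModel {}

impl MockCompiledModel {
    pub fn new() -> Self {
        let handle = Box::new(MockHandle { requests_created: AtomicUsize::new(0) });
        Self { ptr: Box::into_raw(handle) }
    }

    /// Proposed receiver: `&self`. The pointer is only read.
    pub fn create_infer_request(&self) -> MockInferRequest {
        // SAFETY: `ptr` came from `Box::into_raw` in `new` and is freed only in `Drop`.
        let handle = unsafe { &*self.ptr };
        let id = handle.requests_created.fetch_add(1, Ordering::Relaxed);
        MockInferRequest { id }
    }
}

impl Drop for MockCompiledModel {
    fn drop(&mut self) {
        // SAFETY: `ptr` is still the unique owner of the allocation made in `new`.
        unsafe { drop(Box::from_raw(self.ptr)) };
    }
}

/// Shares one model by plain reference across scoped threads, with no lock.
/// Returns the sorted ids of the requests that were created.
pub fn create_requests_concurrently(model: &MockCompiledModel, workers: usize) -> Vec<usize> {
    thread::scope(|scope| {
        let mut handles = Vec::with_capacity(workers);
        for _ in 0..workers {
            handles.push(scope.spawn(move || model.create_infer_request().id));
        }
        let mut ids: Vec<usize> = handles.into_iter().map(|h| h.join().unwrap()).collect();
        ids.sort_unstable();
        ids
    })
}
```

With a &self receiver and Sync in place, callers can share the model through &CompiledModel or Arc<CompiledModel> directly, and the Mutex from the current workaround is no longer needed.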
Additional Context
This change is fully compliant with the thread safety guarantees of OpenVINO's underlying C++ API.
It will better support high-concurrency scenarios such as multi-threaded inference services.