fix(security): replace pickle with JSON serialization in ModelRunner shared-memory IPC#860
fix(security): replace pickle with JSON serialization in ModelRunner shared-memory IPC#860
Conversation
…model_runner Co-authored-by: ChuxiJ <30956809+ChuxiJ@users.noreply.github.com>
ChuxiJ
left a comment
There was a problem hiding this comment.
Code Review
Overall: Solid security improvement. The PR successfully eliminates pickle.loads() from the IPC deserialization path.
Key Items
-
Medium - Buffer size risk: JSON is typically larger than pickle for the same data. The shared memory buffer is fixed at 1 MB (
2**20). For large batches with many sequences containing longtoken_idslists, JSON-encoded payloads could exceed this. Consider adding a size check inwrite_shmto raise a clear error if encoded data exceeds the buffer, rather than silently corrupting memory. -
Info - Remaining attack surfaces (pre-existing, out of scope): The hardcoded shared memory name
"nanovllm"andgetattr(self, method_name)dispatch remain local attack surfaces. Any process on the same machine can write to the shared memory and invoke arbitrary methods onModelRunner. Not a regression from this PR, but worth tracking separately. -
Good practices observed:
_encexplicitly raisesTypeErroron unsupported types (fail-fast)Sequence.__getstate__/__setstate__round-trip is correctly implemented- Thorough test coverage with 8 tests covering round-trips, type preservation, and negative cases
json.loads()correctly prevents arbitrary code execution
Style
- Code follows existing file conventions (module-level underscore-prefixed helpers)
- Docstrings use Google-style format consistently
- The test module-level loading pattern is unconventional but justified by torch import avoidance
No blocking issues.
pickle.loads()was used to deserialize data from a POSIX shared memory segment in the tensor-parallelModelRunnerIPC channel. Any process with access to the shared memory segment could inject a malicious payload and achieve arbitrary code execution.Changes
model_runner.pyimport pickle; addimport json_encode_for_shm(items) -> bytes: serializes[method_name, *args]to UTF-8 JSON.Sequenceobjects are encoded via their existing__getstate__()tuple (ints + lists of ints — all JSON-native). Passing an unsupported type raisesTypeErrorimmediately rather than silently pickling it._decode_from_shm(data) -> list: deserializes JSON and reconstructsSequencevia__setstate__()— nopickle.loadspath.write_shm/read_shmupdated to call the new helpers.model_runner_shm_test.py(new)Eight unit tests covering: prefill/decode-phase
Sequenceround-trips, multi-sequence batches, primitive type preservation, JSON output validation (asserts the first byte is not the pickle magic0x80), unsupported-type rejection, and absence ofpicklein the module namespace.Original prompt
📍 Connect Copilot coding agent with Jira, Azure Boards or Linear to delegate work to Copilot in one click without leaving your project management tool.