-
Notifications
You must be signed in to change notification settings - Fork 108
Ported object-file IO onto a safe codec to mitigate gadget chain attack #638
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
There are individual operator tests for sources and sinks in each platform. See eg: Which setup would you suggest for testing besides these tests? |
|
|
||
| protected final Class<T> tClass; | ||
|
|
||
| private ObjectFileSerializationMode serializationMode = ObjectFileSerializationMode.LEGACY_JAVA_SERIALIZATION; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If deprecated, shouldn't this utilize the JSON variant by default?
|
|
||
| private final Class<T> tClass; | ||
|
|
||
| private ObjectFileSerializationMode serializationMode = ObjectFileSerializationMode.LEGACY_JAVA_SERIALIZATION; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If deprecated, shouldn't this utilize the JSON variant by default?
|
|
||
| private transient DataOutputViewStreamWrapper outView; | ||
|
|
||
| private ObjectFileSerializationMode serializationMode = ObjectFileSerializationMode.LEGACY_JAVA_SERIALIZATION; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If deprecated, shouldn't this utilize the JSON variant by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should not switch off functions directly, instead give users a certain amount of time to use the new feature for backwards compatibility. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it depends on the consequences for the user. If they have to implement further things to make sure their data types implement Serializable, I agree. If this is a change without any overhead for the user, fixing a potential security threat, then it should be the new default.
| //TODO: remove the set parallelism 1 | ||
| DataSetChannel.Instance input = (DataSetChannel.Instance) inputs[0]; | ||
| ObjectFileSerializationMode serializationMode = this.getSerializationMode(); | ||
| if (serializationMode == ObjectFileSerializationMode.LEGACY_JAVA_SERIALIZATION) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is repeated in every implementation extending ObjectFileSink, so it should just be moved there and called here for checking the serializationMode
|
|
||
| private static final Logger LOGGER = LogManager.getLogger(FlinkObjectFileSink.class); | ||
|
|
||
| private static final AtomicBoolean LEGACY_WARNING_EMITTED = new AtomicBoolean(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This field could also be moved to the ObjectFileSink to make it less redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but (I know ...) from past experiences it's better to tell every time a soon deprecated function is called. We can refactor it, main goal is to close the already known security bug.
|
created #640 - my fork gots messy with merges. |
Port object-file IO onto a safe codec
ObjectFileSerializationMode+ helper to switch between legacy Java serialization and a JSON-based format, with unit tests covering both pathsObjectInputStreamMotivation: the legacy path deserializes untrusted SequenceFile payloads via
ObjectInputStream.readObject, letting an attacker ship a gadget chain that runs arbitrary bytecode inside the Wayang JVM. That RCE can be used to execute system commands, exfiltrate data, or tamper with jobs, so we move to JSON by default and require explicit opt-in for the old codec.========================== IMPORTANT =========================
Before we merge test the patch, ideally in a setup with source and sink platforms. Compilation and tests when through in my setup.