Implement the serialization of tagged values #1643
pyfisch wants to merge 1 commit into serde-rs:master from pyfisch:tagged-value-serialization
Add a new method called serialize_tagged_value to the Serializer trait. The method has a default implementation serializing just the value for all formats without tags.
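To illustrate the idea, here is a minimal, self-contained sketch. It does not use serde: the `Serializer` trait, the `serialize_tagged_str` method, and both serializer types are hypothetical stand-ins, and the `serialize_tagged_value` method proposed in this PR may well have a different signature. The sketch only shows the mechanism: a default method that drops the tag, and a CBOR-like serializer that overrides it to emit the tag (here tag 32) before the string.

```rust
// Toy stand-in for a serializer trait; this is NOT serde's real `Serializer`.
trait Serializer {
    fn serialize_str(&mut self, value: &str);

    // Hypothetical tagged-value method. The default implementation ignores
    // the tag and serializes just the value, so formats without tags need
    // no changes.
    fn serialize_tagged_str(&mut self, _tag: u64, value: &str) {
        self.serialize_str(value);
    }
}

// A format without tags: writes quoted text (no escaping, for brevity).
struct TextSerializer {
    out: String,
}

impl Serializer for TextSerializer {
    fn serialize_str(&mut self, value: &str) {
        self.out.push('"');
        self.out.push_str(value);
        self.out.push('"');
    }
}

// A CBOR-like format that understands tags.
struct CborSerializer {
    out: Vec<u8>,
}

impl CborSerializer {
    // Writes a CBOR head: major type in the top three bits, then the argument.
    fn write_head(&mut self, major: u8, arg: u64) {
        match arg {
            0..=23 => self.out.push((major << 5) | arg as u8),
            24..=0xff => {
                self.out.push((major << 5) | 24);
                self.out.push(arg as u8);
            }
            _ => panic!("this sketch only handles arguments below 256"),
        }
    }
}

impl Serializer for CborSerializer {
    fn serialize_str(&mut self, value: &str) {
        self.write_head(3, value.len() as u64); // major type 3: text string
        self.out.extend_from_slice(value.as_bytes());
    }

    // Override the default: emit the tag (major type 6), then the value.
    fn serialize_tagged_str(&mut self, tag: u64, value: &str) {
        self.write_head(6, tag);
        self.serialize_str(value);
    }
}

// Serializes a URL with tag 32 using the tag-unaware text format.
fn tagged_url_as_text(url: &str) -> String {
    let mut s = TextSerializer { out: String::new() };
    s.serialize_tagged_str(32, url);
    s.out
}

// Serializes a URL with tag 32 using the CBOR-like format.
fn tagged_url_as_cbor(url: &str) -> Vec<u8> {
    let mut s = CborSerializer { out: Vec::new() };
    s.serialize_tagged_str(32, url);
    s.out
}
```

With these definitions, `tagged_url_as_text("http://a.io")` produces only the quoted string with no trace of the tag, while `tagged_url_as_cbor("http://a.io")` starts with the bytes `0xd8 0x20` (tag 32) followed by the encoded text string.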
|
@dtolnay Did you have a look at this? Do you see any problems with this approach? |
|
This would be highly beneficial for us (Actyx), how can we help? |
|
I've implemented the deserialization part based on #408: https://github.com/vmx/serde/tree/tagged-value-deserialization I'm cross-posting my comment from pyfisch/cbor#157 (comment) in the hope that more people will see it here: I'm not happy with the result. One missing piece is deserializing the value (without the tag) by default. I just couldn't get it working; I always got trait bound errors. I hope someone can help with that. I also don't really like the |
|
@dtolnay Is there any chance that tagged values become part of the Serde Data Model given the design constraints outlined at pyfisch/cbor#157 (comment) (unrelated to the implementation proposed in later comments)? I'm asking because it only makes sense for @pyfisch and me to spend more time on this if there is a chance. I really need tagged values; otherwise I would have to look into other ways of solving this, and I fear that I would eventually end up with an inferior re-implementation of Serde. |
dtolnay left a comment
Thanks for the PR! I think unfortunately introducing support for tagged values wouldn't be the right course for Serde. It isn't a goal for the data model to be a union of all the features of data formats that people care about. Rather it's a fairly selective subset that translates well across a range of formats (and maybe would be smaller if it were being redesigned today). Formats that equate naturally with the data model are a good fit for Serde. Formats where a subset can be made to fit the data model (like serde_cbor today) are still often useful for working within that subset. Beyond that, I would recommend building dedicated libraries to get full coverage of individual formats -- such as I've encouraged in the past for formats that care about element types of empty lists and maps, formats that care about preserving comments or whitespace, formats that distinguish between fields and attributes, etc.
|
Thanks for your comment. I am disappointed by your response.
|
|
From my perspective the trouble is that many other data formats also need a different tiny bit of help, and even if each one is only a 5% extension, we really need to draw a line to keep the overall complexity of the core under control. There should be ways for a library with its own traits to integrate with the Serde ecosystem, such as providing blanket impls for its traits for any type that implements Serde's traits. |
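As a rough illustration of the blanket-impl suggestion, the following self-contained sketch uses a simplified stand-in for Serde's `Serialize` trait. The names `Encode` and `Tagged`, and the token-list output, are hypothetical and not taken from any real library. It shows a format library's own trait being implemented for every `Serialize` type, while a format-specific tagged wrapper implements only the library's trait.

```rust
// Simplified stand-in for serde's `Serialize`; hypothetical, for illustration.
trait Serialize {
    fn to_text(&self) -> String;
}

impl Serialize for u32 {
    fn to_text(&self) -> String {
        self.to_string()
    }
}

// The format library's own trait (hypothetical name). For simplicity the
// output is a list of tokens rather than real bytes.
trait Encode {
    fn encode(&self, out: &mut Vec<String>);
}

// Blanket impl: any type that implements `Serialize` is `Encode` for free.
impl<T: Serialize> Encode for T {
    fn encode(&self, out: &mut Vec<String>) {
        out.push(self.to_text());
    }
}

// A format-specific concept that lives outside the shared data model.
// It implements only `Encode`, not `Serialize`.
struct Tagged<T> {
    tag: u64,
    value: T,
}

impl<T: Encode> Encode for Tagged<T> {
    fn encode(&self, out: &mut Vec<String>) {
        out.push(format!("tag({})", self.tag));
        self.value.encode(out);
    }
}
```

Encoding `7u32` goes through the blanket impl, while `Tagged { tag: 42, value: 7u32 }` first emits its tag token and then delegates to the same blanket impl for the inner value.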
|
I agree that projects shouldn't blindly implement whatever users request. It's important to limit scope so you don't end up in a hard-to-maintain state where lots of additions are only used by a small subset of people. Though I think things are different with Serde and this request to add tags:
|
|
Is there possibly a way to keep the serde data model as it is, but just add a mechanism for different data formats that have a bit that is outside the serde data model to work? Just a lightweight optional communication channel between values and serializers/deserializers? This could be beneficial not just for tagged values, but also for other things that are currently done in a slightly hackish way, such as RawValue support in serde_json, where a "magic value" is used as a communication channel between a special value and a certain serializer. The tag hack that @vmx came up with in serde_cbor (pyfisch/cbor#151) and the RawValue hack (https://github.com/serde-rs/json/blob/master/src/raw.rs#L220) kinda have the same basic approach. I have not yet spent time thinking about this, but would you be willing to add something to make the life of formats that go beyond the serde data model easier if it does not include extending the serde data model? |
Yes this could be used for the TOML date type or to differentiate between undefined and null in data formats which usually map to unit in the serde data model.
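Tags are kind of a lightweight communication channel. I made sure that they are entirely optional and specific to a given data format so messages are not confused between different formats. I have not checked if the proposed tags are sufficient for deserializing a RawValue but serialization would work and I am confident that a design for deserialization would be found too that enables both tags and raw values. |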
If many of these ugly hacks like RawValue etc. can be made nicer with an additional communication channel, that probably increases the chances of getting the additional communication channel merged. Would having an additional &str parameter for newtype structs be sufficient (as described in #1556)? |
|
I don't think an additional &'static str parameter would be sufficient, since for tags I need to pass two pieces of information: the format (CBOR) and the tag (e.g. 42).
But I have not thought about this much, so it may also be possible.
On 15 November 2019 at 10:04:09 CET, "Rüdiger Klaehn" <notifications@github.com> wrote:
> Tags are kind of a lightweight communication channel. I made sure that they are entirely optional and specific to a given data format so messages are not confused between different formats. I have not checked if the proposed tags are sufficient for deserializing a RawValue but serialization would work and I am confident that a design for deserialization would be found too that enables both tags and raw values.
> If many of these ugly hacks like RawValue etc. can be made nicer with an additional communication channel, that probably increases the chances of getting the additional communication channel merged. Would having an additional &str parameter for newtype structs be sufficient (as described in #1556)
|
I am just trying to think of something really minimal that has a chance to get accepted. You could of course encode CBOR:42 in a string, but that would be super ugly. Maybe make the additional newtype parameter a |
Tagged values are the most requested feature of the serde_cbor crate and they are an important part of working with CBOR documents. But tagged values do not fit neatly into serde's data model therefore this small change to serde is required to enable serialization of tagged values in CBOR and possibly other data formats.
Simple CBOR serialization with a string tagged as an URL (tag 32):
To support tagged values in a serializer for a data format it is necessary to implement serialize_tagged_value and write a custom serializer as I have done for CBOR: https://github.com/pyfisch/cbor/blob/6ad5ea3ade62ceadc20c5bcc832ac85f6c1d3901/src/ser.rs#L511-L695
This PR deliberately only includes serialization and not deserialization as both parts can be discussed and used independently from each other.
What do you think about this extension to serde?