-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JSON serialization emitDefaults option #592
base: master
Are you sure you want to change the base?
JSON serialization emitDefaults option #592
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reporting #585 and the PR, @jmartin127.
I will make another pass tomorrow but added a few inline comments for now.
Would it be possible to test enum
fields too? I think those are not handled currently.
@osa1 Thank you for reviewing this PR. Your feedback was very helpful. I believe I have addressed all of the comments. I also added additional tests that make it very clear what the expected behavior is. Let me know if you have any additional questions or concerns when going through. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing the comments @jmartin127. I added more inline comments. I think this is getting better, but we need to improve the mock class a little bit to properly implement the serialization code. See my inline comments. Please let me know if you disagree, it's possible that I misunderstanding something.
Thank you @osa1 for your analysis of this. I agree this would be more readable. Was trying not to change existing functionality too much, but it definitely is more readable, and will ensure all use cases are covered as you mentioned. I augmented the tests to help ensure uses cases are covered as well. I have a question for you about this part:
You are correct. This is referring to the fact that the JSON object field should be omitted, which I agree is a bug in the current version. What are your thoughts about how we address this? My only concern if we fix this in the current version is it may break clients who are expecting it to behave as it does now, which is not necessarily according to the spec. If we feel good about the overall approach you have outlined here, let me know the answer to the above and I can take a pass at implementing it in this way. Thanks for your assistance on this! |
We discussed this with @sigurdm. Since this is a bug we should fix it. The question is whether we bump major version number or minor, but we don't have to decide this right now, before merging this PR. So we should fix it and merge this PR. |
Excellent. Glad there is a consensus on fixing the existing bug. I'll work on implementing the suggested changes and get it ready for review (hopefully today). |
@osa1 This is now ready for your review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jmartin127 for your persistence on this. protobuf tests pass, but there are some failures in protoc_plugin. To run the tests locally you can run dart pub get; make run-tests
in protoc_plugin
directory. (I think you may also need dart pub get
in protobuf directory before running this)
Could you take a look at those failures? I think for some of them we just need to update the expected outputs. I didn't check all though.
Currently in protobuf.dart there's no easy way to exhaustively check all possible field types, we have to implement a switch
on an int
value (as we do in _mergeFromCodedBufferReader
, CodedBufferWriter._writeValueAs
, and probably other places) or do as in this PR and implement a long if-then-else chain, and manually check (and add tests) that all possible field types are covered. This is error prone and verbose.
In a branch of mine I have this code in PbFieldType
:
enum BaseFieldType {
bool,
bytes,
string,
double,
float,
enum_,
int32,
int64,
sint32,
sint64,
uint32,
uint64,
fixed32,
fixed64,
sfixed32,
sfixed64,
message,
map,
}
...
class PbFieldType {
static BaseFieldType toBaseType(int fieldType) {
final baseType = _baseType(fieldType);
assert(fieldType & (fieldType - 1) == 0,
'base type has more than one bit set: $baseType');
switch (_baseType(fieldType)) {
case PbFieldType._BOOL_BIT:
return BaseFieldType.bool;
case PbFieldType._BYTES_BIT:
return BaseFieldType.bytes;
case PbFieldType._STRING_BIT:
return BaseFieldType.string;
case PbFieldType._DOUBLE_BIT:
return BaseFieldType.double;
case PbFieldType._FLOAT_BIT:
return BaseFieldType.float;
case PbFieldType._ENUM_BIT:
return BaseFieldType.enum_;
case PbFieldType._GROUP_BIT:
throw ArgumentError('TODO: How to handle groups?');
case PbFieldType._INT32_BIT:
return BaseFieldType.int32;
case PbFieldType._INT64_BIT:
return BaseFieldType.int64;
case PbFieldType._SINT32_BIT:
return BaseFieldType.sint32;
case PbFieldType._SINT64_BIT:
return BaseFieldType.sint64;
case PbFieldType._UINT32_BIT:
return BaseFieldType.uint32;
case PbFieldType._UINT64_BIT:
return BaseFieldType.uint64;
case PbFieldType._FIXED32_BIT:
return BaseFieldType.fixed32;
case PbFieldType._FIXED64_BIT:
return BaseFieldType.fixed64;
case PbFieldType._SFIXED32_BIT:
return BaseFieldType.sfixed32;
case PbFieldType._SFIXED64_BIT:
return BaseFieldType.sfixed64;
case PbFieldType._MESSAGE_BIT:
return BaseFieldType.message;
case PbFieldType._MAP_BIT:
return BaseFieldType.map;
default:
throw ArgumentError('TODO: write an error message');
}
}
}
With this type and static method, I can do
switch (PbFieldType.toBaseType(fieldType)) {
...
}
and the compiler makes sure I cover all cases.
I wonder if we should incorporate this code in this PR and simplify the large if-then-else chain in this PR. We could add it in another PR and update the code in binary encoders and decoders, and then use it in this PR. Any thoughts on this @sigurdm?
Secondly, I think there should be simple ways to
- Check if a field has the default value
- Generate the default value for a field
Given (2), (1) can be implemented with value == defaultFieldValue
. (==
could be implemented as identity equality or structural, does not matter, as for complex fields we represent the default or unset field values as null
)
As far as I can see we don't have (2) right now. We have methods like FieldInfo.readonlyDefault
, but it returns an empty message rather than null for message fields, which is not right according to spec (see #309). I suspect it also returns empty map instead of null
for map fields which is also not right. The requirement here is that this method should returning the value that represents a field that is not set, or set to the default value. For messages and maps it's null
.
Given that this is an existing issue and it may be a while until we fix them, I'm OK with merging this once my inline comments are addressed and the tests are fixed, and tests for map fields are added.
@sigurdm could you also take a look at this PR?
if (context.emitDefaults || value != 0.0) { | ||
skipField = false; | ||
jsonValue = valueToProto3Json(value, fieldInfo.type); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jmartin127 , this is not much easier to follow now.
I still have one suggestion though. In the outer if
, in the first few cases you check the field type, which makes sense. But after the bool
case you start checking the value type rather than field type. Ideally we should be able to check just the field type, and the value type should be as expected for the field type (i.e. value is int
if field type is int32
). I have a suggestion in my top-level comment on how to make this easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this part was a bit wonky. The reason I switched from value types to field types was because there are just so many numeric types to handle, so it became more readable to check the value types. I think if we follow you top-level comment suggestion, then this is no longer an issue. I'll wait for your and @sigurdm to reply on the best path forward there (if we go with the switch, from your other PR).
.toList(); | ||
} | ||
} else if (fieldInfo.isMapField) { | ||
if (context.emitDefaults || (value as Map).isNotEmpty) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to self: shouldn't default value for maps be null
? I think maps are considered the same as messages in the spec, at least when talking about default values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question. I know that the emitDefaults
option within the Go library serializes maps that are not set to an empty map value: {}
. The spec is not very clear on this point.
Let me know if we are thinking this should be changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be related to #309. I say not a blocker for this PR.
List<int> get byts => $_get(8, []); | ||
set byts(x) => setField(10, x); | ||
|
||
Map<String, String> get mp => $_getMap(9); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like you don't try to set the map field in the tests. Is there a reason for this? To make sure we handle maps correctly we should set the map field too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The map value is getting set, just in a different way... For consistency, I followed the existing example of how the int32s are set using the addAll
function.
See here for an example of setting/testing the map field.
https://github.com/google/protobuf.dart/pull/592/files#diff-f9dfd1b28cb8ed5241edf3d57212698aeb5c6b3faa64d982213236cba5bd89f8R44
With this in mind, it is currently being tested, but if it would be better to set the map differently, just let me know.
The unit tests within |
@osa1 I like this idea. I was also not liking the if/else, but wasn't sure how to incorporate a switch with the existing code base. That would be great to use the code you mentioned above in this PR. |
@osa1 I believe I have addressed all of your comments. The major outstanding issue is the if/else vs. switch. I personally like the switch much better, but we'd just need your other code merged in first I believe. Lemme know. Thanks for all of the great feedback and insightful review. |
Re: updated tests, I found it a bit strange that we test proto2 files with proto3 serialization. Is this expected @sigurdm? JSON serialization implemented in |
I've always assumed that proto2-fields serialize more or less like their proto3 counterparts. Especially since you can have a proto2 message type as a field of a proto3 message. It would be problematic to deny serializing it. (That is, I look at the proto3-json serialization format as independent of the proto3-protobuf language definition)... I don't know of any cases where this causes trouble. But I might be overlooking something |
Hm, I didn't know this was possible. Is the other way also possible? I.e. proto2 message having proto3 message field? Since proto3 can have proto2 field, and we can't say something like "this field is encoded as proto2", serialization of proto2 messages should follow the proto3 spec. So changes in the tests should be OK. |
I submitted a PR for this: #605. Currently protoc_plugin fails when building test protos so it needs a bit more debugging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay @jmartin127.
It will take time to improve the field type API so I think we should merge this before #605. We can refactor the code in #605.
I found an issue with this PR. Described it in my inline comment above and provided a fix.
When merging master just ignore the code in HEAD and overwrite with your code.
continue; // It's missing, repeated, or an empty byte array. | ||
} | ||
dynamic jsonValue; | ||
if (fieldInfo.isMapField) { | ||
jsonValue = (value as PbMap).map((key, entryValue) { | ||
var mapEntryInfo = fieldInfo as MapFieldInfo; | ||
return MapEntry(convertToMapKey(key, mapEntryInfo.keyFieldType!), | ||
valueToProto3Json(entryValue, mapEntryInfo.valueFieldType)); | ||
}); | ||
} else if (fieldInfo.isRepeated) { | ||
jsonValue = (value as PbListBase) | ||
.map((element) => valueToProto3Json(element, fieldInfo.type)) | ||
.toList(); | ||
// if the value for this field is null, this function will return the | ||
// default value, which simplifies the need to handle null for most cases | ||
// below | ||
var value = fs._getField(fieldInfo.tagNumber); | ||
|
||
bool skipField = true; | ||
Object? jsonValue; | ||
if (fieldInfo.isRepeated) { | ||
if (context.emitDefaults || (value as List).isNotEmpty) { | ||
skipField = false; | ||
jsonValue = (value as PbListBase) | ||
.map((element) => valueToProto3Json(element, fieldInfo.type)) | ||
.toList(); | ||
} | ||
} else if (fieldInfo.isMapField) { | ||
if (context.emitDefaults || (value as Map).isNotEmpty) { | ||
skipField = false; | ||
jsonValue = (value as PbMap).map((key, entryValue) { | ||
var mapEntryInfo = fieldInfo as MapFieldInfo; | ||
return MapEntry(convertToMapKey(key, mapEntryInfo.keyFieldType!), | ||
valueToProto3Json(entryValue, mapEntryInfo.valueFieldType)); | ||
}); | ||
} | ||
} else if (_isBytes(fieldInfo.type)) { | ||
if (context.emitDefaults || (value as List).isNotEmpty) { | ||
skipField = false; | ||
if ((value as List).isEmpty) { | ||
jsonValue = null; | ||
} else { | ||
jsonValue = valueToProto3Json(value, fieldInfo.type); | ||
} | ||
} | ||
} else if (fieldInfo.isEnum) { | ||
// For enums, the default value is the first value listed in the enum's type definition | ||
final defaultEnum = fieldInfo.enumValues!.first; | ||
if (context.emitDefaults || | ||
(value as ProtobufEnum).name != defaultEnum.name) { | ||
skipField = false; | ||
jsonValue = valueToProto3Json(value, fieldInfo.type); | ||
} | ||
} else if (fieldInfo.isGroupOrMessage) { | ||
final originalValue = fs._values[fieldInfo.index!]; | ||
if (context.emitDefaults || originalValue != null) { | ||
skipField = false; | ||
if (originalValue == null) { | ||
jsonValue = null; | ||
} else { | ||
jsonValue = valueToProto3Json(value, fieldInfo.type); | ||
} | ||
} | ||
} else if (PbFieldType._baseType(fieldInfo.type) == | ||
PbFieldType._STRING_BIT) { | ||
if (context.emitDefaults || (value as String).isNotEmpty) { | ||
skipField = false; | ||
jsonValue = valueToProto3Json(value, fieldInfo.type); | ||
} | ||
} else if (PbFieldType._baseType(fieldInfo.type) == PbFieldType._BOOL_BIT) { | ||
if (context.emitDefaults || value != false) { | ||
skipField = false; | ||
jsonValue = valueToProto3Json(value, fieldInfo.type); | ||
} | ||
} else if (value is Int64) { | ||
if (context.emitDefaults || !value.isZero) { | ||
skipField = false; | ||
jsonValue = valueToProto3Json(value, fieldInfo.type); | ||
} | ||
} else if (value is int) { | ||
if (context.emitDefaults || value != 0) { | ||
skipField = false; | ||
jsonValue = valueToProto3Json(value, fieldInfo.type); | ||
} | ||
} else if (value is double) { | ||
if (context.emitDefaults || value != 0.0) { | ||
skipField = false; | ||
jsonValue = valueToProto3Json(value, fieldInfo.type); | ||
} | ||
} else { | ||
jsonValue = valueToProto3Json(value, fieldInfo.type); | ||
throw JsonSerializationException( | ||
'Unexpected field type for proto3 JSON serialization ${fieldInfo.type}'); | ||
} | ||
|
||
if (!skipField) { | ||
result[fieldInfo.name] = jsonValue; | ||
} | ||
result[fieldInfo.name] = jsonValue; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I discovered an interesting case regarding JSON serialization and proto2. proto2 spec doesn't specify JSON encoding, only proto3 does. However toProto3Json
can be called on both proto2 and proto3 messages. So if I have a proto2 message like
syntax = "proto2";
message Test {
optional int32 a = 1 [default = 123];
}
I can convert this to proto3 JSON using toProto3Json
, but in this PR we assume proto3 defaults -- i.e. for numeric fields the default to be 0. This doesn't hold when the message syntax is proto2. As a result output of this code is somewhat surprising:
Test test1 = Test.create()..a=0;
print(test1.toProto3Json()); // omits `a`
Test test2 = Test.create()..a=123;
print(test2.toProto3Json()); // generates `a`
It's possible to fix this issue and simplify the code at the same time. Here's my suggestion:
for (var fieldInfo in fs._infosSortedByTag) { | |
var value = fs._values[fieldInfo.index!]; | |
if (value == null || (value is List && value.isEmpty)) { | |
continue; // It's missing, repeated, or an empty byte array. | |
} | |
dynamic jsonValue; | |
if (fieldInfo.isMapField) { | |
jsonValue = (value as PbMap).map((key, entryValue) { | |
var mapEntryInfo = fieldInfo as MapFieldInfo; | |
return MapEntry(convertToMapKey(key, mapEntryInfo.keyFieldType!), | |
valueToProto3Json(entryValue, mapEntryInfo.valueFieldType)); | |
}); | |
} else if (fieldInfo.isRepeated) { | |
jsonValue = (value as PbListBase) | |
.map((element) => valueToProto3Json(element, fieldInfo.type)) | |
.toList(); | |
// if the value for this field is null, this function will return the | |
// default value, which simplifies the need to handle null for most cases | |
// below | |
var value = fs._getField(fieldInfo.tagNumber); | |
bool skipField = true; | |
Object? jsonValue; | |
if (fieldInfo.isRepeated) { | |
if (context.emitDefaults || (value as List).isNotEmpty) { | |
skipField = false; | |
jsonValue = (value as PbListBase) | |
.map((element) => valueToProto3Json(element, fieldInfo.type)) | |
.toList(); | |
} | |
} else if (fieldInfo.isMapField) { | |
if (context.emitDefaults || (value as Map).isNotEmpty) { | |
skipField = false; | |
jsonValue = (value as PbMap).map((key, entryValue) { | |
var mapEntryInfo = fieldInfo as MapFieldInfo; | |
return MapEntry(convertToMapKey(key, mapEntryInfo.keyFieldType!), | |
valueToProto3Json(entryValue, mapEntryInfo.valueFieldType)); | |
}); | |
} | |
} else if (_isBytes(fieldInfo.type)) { | |
if (context.emitDefaults || (value as List).isNotEmpty) { | |
skipField = false; | |
if ((value as List).isEmpty) { | |
jsonValue = null; | |
} else { | |
jsonValue = valueToProto3Json(value, fieldInfo.type); | |
} | |
} | |
} else if (fieldInfo.isEnum) { | |
// For enums, the default value is the first value listed in the enum's type definition | |
final defaultEnum = fieldInfo.enumValues!.first; | |
if (context.emitDefaults || | |
(value as ProtobufEnum).name != defaultEnum.name) { | |
skipField = false; | |
jsonValue = valueToProto3Json(value, fieldInfo.type); | |
} | |
} else if (fieldInfo.isGroupOrMessage) { | |
final originalValue = fs._values[fieldInfo.index!]; | |
if (context.emitDefaults || originalValue != null) { | |
skipField = false; | |
if (originalValue == null) { | |
jsonValue = null; | |
} else { | |
jsonValue = valueToProto3Json(value, fieldInfo.type); | |
} | |
} | |
} else if (PbFieldType._baseType(fieldInfo.type) == | |
PbFieldType._STRING_BIT) { | |
if (context.emitDefaults || (value as String).isNotEmpty) { | |
skipField = false; | |
jsonValue = valueToProto3Json(value, fieldInfo.type); | |
} | |
} else if (PbFieldType._baseType(fieldInfo.type) == PbFieldType._BOOL_BIT) { | |
if (context.emitDefaults || value != false) { | |
skipField = false; | |
jsonValue = valueToProto3Json(value, fieldInfo.type); | |
} | |
} else if (value is Int64) { | |
if (context.emitDefaults || !value.isZero) { | |
skipField = false; | |
jsonValue = valueToProto3Json(value, fieldInfo.type); | |
} | |
} else if (value is int) { | |
if (context.emitDefaults || value != 0) { | |
skipField = false; | |
jsonValue = valueToProto3Json(value, fieldInfo.type); | |
} | |
} else if (value is double) { | |
if (context.emitDefaults || value != 0.0) { | |
skipField = false; | |
jsonValue = valueToProto3Json(value, fieldInfo.type); | |
} | |
} else { | |
jsonValue = valueToProto3Json(value, fieldInfo.type); | |
throw JsonSerializationException( | |
'Unexpected field type for proto3 JSON serialization ${fieldInfo.type}'); | |
} | |
if (!skipField) { | |
result[fieldInfo.name] = jsonValue; | |
} | |
result[fieldInfo.name] = jsonValue; | |
} | |
for (var fieldInfo in fs._infosSortedByTag) { | |
// Value of the field, or the default for the field if the field is not set | |
final dynamic value = fs._getField(fieldInfo.tagNumber); | |
// Whether the field should be skipped. In proto3 JSON encoding default | |
// values are omitted by default. | |
// (https://developers.google.com/protocol-buffers/docs/proto3#json) | |
bool skipField = true; | |
// Serialized JSON value for the field. Only set and used when `skipField` | |
// is false. | |
Object? jsonValue; | |
if (fieldInfo.isRepeated) { | |
if (context.emitDefaults || value != fieldInfo.readonlyDefault) { | |
skipField = false; | |
jsonValue = (value as PbListBase) | |
.map((element) => valueToProto3Json(element, fieldInfo.type)) | |
.toList(); | |
} | |
} else if (fieldInfo.isMapField) { | |
if (context.emitDefaults || value != fieldInfo.readonlyDefault) { | |
skipField = false; | |
jsonValue = (value as PbMap).map((key, entryValue) { | |
var mapEntryInfo = fieldInfo as MapFieldInfo; | |
return MapEntry(convertToMapKey(key, mapEntryInfo.keyFieldType), | |
valueToProto3Json(entryValue, mapEntryInfo.valueFieldType)); | |
}); | |
} | |
} else if (_isBytes(fieldInfo.type)) { | |
if (context.emitDefaults || !_areListsEqual(value, fieldInfo.readonlyDefault)) { | |
skipField = false; | |
if ((value as List).isEmpty) { | |
jsonValue = null; | |
} else { | |
jsonValue = valueToProto3Json(value, fieldInfo.type); | |
} | |
} | |
} else if (fieldInfo.isGroupOrMessage) { | |
// `_FieldInfo._getField` and `_FieldInfo.readonlyDefault` return empty | |
// message instead of `null` when a field with message type is not set, | |
// see #309. | |
final originalValue = fs._values[fieldInfo.index!]; | |
if (context.emitDefaults || originalValue != null) { | |
skipField = false; | |
if (originalValue == null) { | |
jsonValue = null; | |
} else { | |
jsonValue = valueToProto3Json(value, fieldInfo.type); | |
} | |
} | |
} else { | |
// enum, string, bytes, numerics, map | |
if (context.emitDefaults || value != fieldInfo.readonlyDefault) { | |
skipField = false; | |
jsonValue = valueToProto3Json(value, fieldInfo.type); | |
} | |
} | |
if (!skipField) { | |
result[fieldInfo.name] = jsonValue; | |
} | |
} |
I've also added some comments.
You'll notice that two tests will break with this change, but if you look at the test proto you'll see that we actually set the default value as 42
in those tests, so actually we fix the test. We should update the expect
s.
#659 shows an actual problem (found in downstream code, caused breakage) with |
@osa1 I want to get your thoughts on this PR. Is it valuable, should I bring it up to date with master and get this across the finish line? For my purposes, it may not be needed anymore, but it may be valuable to others in the future? |
@jmartin127 this fixes a bug in the library so we definitely want to merge it, whether it's a blocker for someone today or not. You did most of the work already (thanks!), we just need to simplify it a little bit and remove the
I don't want you to feel obliged of course (this is voluntary work). I can finish this in a separate PR (created from your branch) and merge it, as I've been doing recently with some of the other, older PRs. |
Similar to the existing JsonParsingContext for customizing deserialization of JSON via mergeFromProto3Json, this PR adds a new JsonSerializationContext for customizing the serialization of JSON via toProto3Json.
This addresses #585. Tests have been added for common uses cases to ensure that the proper default values are serialized for each use case.