Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect and report all duplicated keys #170

Closed
wants to merge 3 commits into from
Closed

Detect and report all duplicated keys #170

wants to merge 3 commits into from

Conversation

MiSikora
Copy link

@MiSikora MiSikora commented Aug 14, 2021

This is binary and source incompatible change as exceptions hierarchy and structure is changed. If library users rely on path properties of YamlException they will not find them in that type anymore. I decided to change it this way to make sure that everyone still catches all exceptions as DuplicateKeysException does not extend previous YamlException (now SinglePathYamlException).

I created a separate test suite due to #27.

Closes #169

?.let { throw DuplicateKeysException(it) }
}

private tailrec fun accumulateDuplicateKeys(
Copy link
Author

@MiSikora MiSikora Aug 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function and iteration over all nodes could be avoided if something like YamlNodeFactory was added and make it its responsibility to calculate duplicates (and possibly do other stuff for different node types). However, I didn't want to change too much in the code structure.

Copy link
Owner

@charleskorn charleskorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. I have a couple of questions and suggestions, please see my comments.

@@ -267,7 +267,7 @@ private sealed class YamlMapLikeInputBase(map: YamlMap, context: SerializersModu
protected fun <T> fromCurrentValue(action: YamlInput.() -> T): T {
try {
return action(currentValueDecoder)
} catch (e: YamlException) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this stay as YamlException?

Copy link
Author

@MiSikora MiSikora Aug 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the current code only SinglePathYamlException can be thrown here. It probably would be more future–proof to catch 'em all and do more checks inside. But if you'd go with not traversing the whole tree the root exception could stay the same so this change would be reverted.

init {
val keys = entries.keys.sortedWith { a, b ->
val lineComparison = a.location.line.compareTo(b.location.line)
private val encounteredKeys = mutableMapOf<String, YamlScalar>()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why we wouldn't move these into the init block and just immediately throw the exception in there?

Copy link
Author

@MiSikora MiSikora Aug 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If exception is thrown immediately there, there is no way to traverse the whole tree and accumulate duplicates.

However, after reading your comment I believe it is something you'd prefer.

expect({ parser.decodeFromString<MapStructure>(yaml) }).toThrow<DuplicateKeysException> {
message.toBe(
"""
Found duplicates of key 'key1' defined at line 1, column 1:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the error message would be clearer if it read as something like this:

Key 'key1' is only allowed once, but is defined multiple times:
- at line 1, column 1
- at line 2, column 1

What do you think?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this looks better. I didn't have a good idea for the message.

@charleskorn
Copy link
Owner

Just had an idea that might make this change less invasive - we could continue to use the "old" YamlException, including its path and location information as the base for DuplicateKeysException. The path and location information would refer to the parent map that contains the duplicate keys, and then DuplicateKeysException could contain information about the duplicate keys, similar to what's in this PR.

So if we had this input YAML:

some:
  thing:
    key1: blah
    key2: foo
    key1: bar
    key2: baz

We'd end up with a DuplicateKeysException with the path of some.thing, line 2, column 8, and then duplicates for key1 on lines 3 and 5, and duplicates for key2 on lines 4 and 6.

I don't think we need to report all duplicates in all locations throughout the tree. Doing this would be quite different behaviour to all the other exceptions kaml currently throws, as kaml currently just throws an exception on the first issue, rather than trying to aggregate all of them.

How does that sound?

@MiSikora
Copy link
Author

MiSikora commented Aug 15, 2021

I don't think we need to report all duplicates in all locations throughout the tree. Doing this would be quite different behaviour to all the other exceptions kaml currently throws, as kaml currently just throws an exception on the first issue, rather than trying to aggregate all of them.

How does that sound?

In that case it is not something I was hoping for so it would be better to close this PR (or adjust it to your needs) as I'll need to fork it or make a local copy in the project I'm currently writing.

@MiSikora MiSikora closed this Oct 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Accumulate duplicated keys
2 participants