Add data-classification.md extension #1317

Open
wants to merge 3 commits into
base: main
from


rob-sessink

Provides an extension where an event source can annotate an event with
information about the data classification of the event and its payload.
CloudEvents may carry payloads that are subject to data protection regulations
such as GDPR or HIPAA. Knowing how an event payload is classified enables
intermediaries and consumers to process the event compliantly.

Adds an extension with attributes:

  • dataclassification (Required). Data classification level of an event and
    payload within the context of a data protection regulation.
  • dataregulation (Optional). Applicable data protection regulation.
  • datacategory (Optional). Data category of the event payload within the
    context of data classification and data protection regulation.
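
As an illustration of the three attributes listed above, here is a minimal sketch of an event carrying them. Only the attribute names and the required/optional split come from this PR; the helper `missing_classification_attrs`, the event type, and all attribute values are made up for the example.

```python
# Illustrative sketch only: attribute names come from this PR,
# every value and helper name below is an assumption.
REQUIRED_EXTENSION_ATTRS = ("dataclassification",)
OPTIONAL_EXTENSION_ATTRS = ("dataregulation", "datacategory")


def missing_classification_attrs(event: dict) -> list[str]:
    """Return the required extension attributes that are absent or empty."""
    return [name for name in REQUIRED_EXTENSION_ATTRS if not event.get(name)]


event = {
    # Core CloudEvents v1.0 context attributes.
    "specversion": "1.0",
    "id": "a1b2c3",
    "source": "/example/patients",
    "type": "com.example.patient.updated",
    # Extension attributes proposed in this PR (example values are made up).
    "dataclassification": "restricted",
    "dataregulation": "HIPAA",
    "datacategory": "health",
    "data": {"patientId": "12345"},
}

# An empty list means the event carries the one required attribute.
print(missing_classification_attrs(event))
```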

Signed-off-by: Rob Sessink <rob.sessink@gmail.com>
`confidential`, `restricted`.
- Constraints:
- REQUIRED
- SHOULD be applicable to data protection regulation.
Collaborator

Can you elaborate on what this "SHOULD" means? What does someone need to do (from a coding perspective) to adhere to this "SHOULD"?

Author

The SHOULD statement is merely meant as an indication to event producers that the data classification label should originate from the applicable data regulation. But maybe this is stating the obvious, and from a coding perspective it is not relevant. Since it is already stated in the description, it does not add value. I will remove it.

`datacategory` attributes MAY be set to provide additional details on the
classification context.

Intermediaries and consumers SHOULD take these attributes into account and act
Collaborator

I wonder if this "SHOULD" should be a "MUST" instead? Should a consumer reject a request if it can't meet the data regulation requirements? Are clients expecting some kind of guarantee? Meaning, a non-error means "yup, got it and it'll be protected appropriately". Although, extensions can be ignored... maybe it would need to be worded like: "If an implementation supports this extension, then it MUST reject the event if it cannot adhere to the requirements of the specified data classification attributes" ??

Contributor

This raises an interesting possibility, which is too late for v1 but could be interesting in a future version: if an event could say "consumers/intermediaries must understand extensions x, y and z, and must otherwise reject/ignore the event" then we could be stricter. (So that would be an attribute that's part of the main spec, but the values of which would be names of extension attributes.)
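
To make the idea in the comment above concrete, here is a rough sketch of such a check. No such attribute exists in CloudEvents v1; the attribute name `requiredextensions`, its comma-separated format, and the helper `unsupported_required` are all hypothetical.

```python
# Hypothetical sketch: CloudEvents v1 has no "must understand" attribute.
# "requiredextensions" and its comma-separated format are assumptions.
SUPPORTED_EXTENSIONS = {"dataclassification", "dataregulation"}


def unsupported_required(event: dict) -> set[str]:
    """Return the required extension names this receiver does not understand."""
    raw = event.get("requiredextensions", "")
    required = {name.strip() for name in raw.split(",") if name.strip()}
    return required - SUPPORTED_EXTENSIONS


event = {"requiredextensions": "dataclassification, datacategory"}

# A non-empty result would be the receiver's cue to reject or ignore the event.
print(unsupported_required(event))
```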

Author

Yes, changing this section to be more prescriptive towards consumers is warranted. When an implementation supports this extension, an event MUST be handled in a compliant manner or otherwise MUST be rejected/ignored.

I will adjust the phrasing.
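
One way to picture the "handle compliantly or reject" rule from a consumer's side is the sketch below. It assumes a consumer that supports the extension and is only cleared for certain labels; `HANDLED_LABELS`, `UnsupportedClassification`, and `accept` are hypothetical names, not part of the extension text.

```python
class UnsupportedClassification(Exception):
    """Raised when this consumer cannot handle an event compliantly."""


# Hypothetical policy: the labels this consumer is cleared to process.
HANDLED_LABELS = {"public", "internal"}


def accept(event: dict) -> dict:
    """Return the event if it can be processed compliantly, otherwise raise."""
    label = event.get("dataclassification")
    if label is None:
        # The extension is not used on this event, so there is nothing to enforce.
        return event
    if label not in HANDLED_LABELS:
        raise UnsupportedClassification(
            f"cannot handle dataclassification={label!r}"
        )
    return event
```

Raising an exception here stands in for whatever rejection mechanism the transport provides, such as a non-success response to the sender.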

Collaborator

duglin commented Nov 13, 2024

Can you update the README in the "extensions" dir too?

- Type: `String`
- Description: Data classification level for the event payload within the
context of a `dataregulation`. Typical labels are: `public`, `internal`,
`confidential`, `restricted`.
Collaborator

I suspect these values are probably defined by the data regulations being adhered to, but since dataregulation is optional, should this spec define some recommended values for cases where it's missing to provide some consistency?

Author

Yes, I feel that is a good approach. I did not want to make the dataregulation attribute required, as I feel it is supporting information and not strictly necessary for processing. My intent is that usage of this extension should be as lightweight as possible, meaning as few required attributes as possible.

What do you think of:

Description: Data classification level for the event payload within the context of a dataregulation. In a situation where dataregulation is undefined, recommended labels are: public, internal, confidential, or restricted.
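
A small sketch of how a receiver might apply that wording: when `dataregulation` is absent, compare the label against the four recommended values; when it is present, leave the label to the regulation. The function name `uses_recommended_label` is hypothetical, and treating any label as acceptable once a regulation is named is an assumption of this sketch.

```python
# The four labels recommended in the proposed description.
RECOMMENDED_LABELS = ("public", "internal", "confidential", "restricted")


def uses_recommended_label(event: dict) -> bool:
    """Return True when the classification label needs no warning."""
    if event.get("dataregulation"):
        # Labels are defined by the named regulation; this sketch does not judge them.
        return True
    return event.get("dataclassification") in RECOMMENDED_LABELS
```

Because the labels are only recommended, a `False` result is better suited to a log warning than to rejecting the event.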

…README.md and usage of MUST keyword in example use case

-

Signed-off-by: Rob Sessink <rob.sessink@gmail.com>
cloudevents/extensions/data-classification.md (outdated)
accordingly to data regulations and/or internal policies when processing the
event and payload.

Intermediaries SHOULD NOT modify the `dataclassification`, `dataregulation`, and
Contributor

Is a redacting intermediary a valid use case? (I'm guessing not - that they'd end up effectively being a new event producer, as it's no longer the same event really if a bunch of information has been removed. But I thought I'd mention it as a possibility.)

Author

rob-sessink Nov 16, 2024

I agree that when an intermediary changes an event or its payload, it becomes a new event. The role of the intermediary then also shifts to that of event producer, and it has the responsibility/freedom to define the classification attributes.

Reading the CloudEvents specification, intermediaries forward/route messages and don't redact them, so I would want to leave it this way.
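
The distinction drawn in this thread can be sketched as two separate functions: forwarding leaves the event and its classification attributes untouched, while redaction produces a new event under the redactor's own `source` and `id`. The function names and parameters are hypothetical, and the choice of new classification is left to the caller acting as producer.

```python
import uuid

CLASSIFICATION_ATTRS = ("dataclassification", "dataregulation", "datacategory")


def forward(event: dict) -> dict:
    """Route an event unchanged: same id, same classification attributes."""
    return dict(event)


def redact(event: dict, drop_fields: set, source: str, classification: str) -> dict:
    """Build a NEW event from a redacted payload; the caller acts as producer."""
    # Drop the original classification attributes; the new producer sets its own.
    new_event = {k: v for k, v in event.items() if k not in CLASSIFICATION_ATTRS}
    new_event["id"] = str(uuid.uuid4())  # new identity for the new event
    new_event["source"] = source
    new_event["data"] = {
        k: v for k, v in event.get("data", {}).items() if k not in drop_fields
    }
    new_event["dataclassification"] = classification
    return new_event
```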


Signed-off-by: Rob Sessink <rob.sessink@gmail.com>
3 participants