Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clarify Content-Type / Content-Encoding / Content-Language handling #160

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

nigoroll
Copy link
Contributor

@nigoroll nigoroll commented Sep 9, 2020

see #158 for discussion

@Acconut
Copy link
Member

Acconut commented Sep 16, 2020

I am not sure if adding recommendations about Content-Encoding and Content-Language headers is a good idea. If I understand correctly, these headers are mostly relevant for GET requests, which tus does neither mention nor use. Of course, if your server supports GET requests nevertheless, the Content-Encoding and Content-Language headers may also be included in HEAD responses but that in itself is not relevant for tus. Is it understandable what I mean?

The changes regarding metadata are good, but let's clarify the other point first.

@nigoroll
Copy link
Contributor Author

(force-pushed only to resolve the merge conflict, no content changed)

re @Acconut regarding your interpretation that the HTTP metadata headers were specific to GET please see #158 (comment) - they are specific to the resource and in the case of tus that is the uploaded object.

Regarding Content-Encoding, if a client chose to use it and the tus server simply ignored it, things would get terribly wrong. For example, if a client uploaded text/html with Content-Encoding: gzip and the server neither decoded the upload not stored the information about the encoding, the upload would be corrupt in the sense that the encoded form simply is not text/html.

Please note that RFC7231 explicitly considers uploads in this paragraph:

An origin server MAY respond with a status code of 415 (Unsupported Media Type) if a representation in the request message has a content coding that is not acceptable.

@smatsson
Copy link

The specific metadata keys documented herein are reserved for the
respective use and MUST NOT be used for other purposes.

Note really a fan of this part at all. There are multiple implementations out there already and if they use any of the keys defined for anything else than this update says then they would be in violation with the spec.

In other parts I agree that this would be a good addition to the spec.

@nigoroll
Copy link
Contributor Author

nigoroll commented Sep 17, 2020

@smatsson would you have a better suggestion to reserving the metadata keys? My original suggestion was to make a breaking change and use Content-Type with the next tus major version.

@smatsson
Copy link

@nigoroll To be the devil's advocate: Why would we need to reserve metadata keys to begin with? The 1.0 version of protocol has been around since 2015 and haven't included any reserved keys. I think that the openness of not reserving metadata keys is a good thing as words have different meanings in different contexts. For example I would not translate "filetype" to Content-Type but rather to file extension ("It's a pdf"). I don't necessarily disagree with having reserved keys, it's just that it's 5 years to late. Breaking changes to a protocol that is out in the wild is not a good thing as issues will arise.

@nigoroll
Copy link
Contributor Author

nigoroll commented Sep 17, 2020

@smatsson we cannot have agreement on semantics without agreement on how to represent that semantics. My impression from the discussion in #158 was that the filename and filetype metadata keys were a de-facto standard because uppy/tusd use them.
But you are right, no matter if we reserved metadata keys or brought back the original Content-Type semantics, both would be breaking changes and thus formally require a protocol major version bump.
This PR was intended as a compromise trying to avoid that, but in fact I think that a new version with changed semantics would be the better option.

@smatsson
Copy link

@nigoroll I'm only concerned with the breaking changes and that the protocol will be ambiguous between implementations and when they were implemented (as the spec changed). As I said earlier I don't really object to the change per se, just the breaking change part. If you and @Acconut feel strongly that this should go in the spec it's fine by me.


##### [Content-Language](https://httpwg.org/specs/rfc7231.html#header.content-language)

Clients and Servers SHOULD support the ``Content-Language`` header.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate more what "supporting the Content-Language header" means? To be concrete: Can you say what behavior a typical tus server (e.g. tusd) must have to support this header?

accept the ``Content-Encoding`` chosen by the client.

Servers MUST either store the ``Content-Encoding`` and deliver it with
subsequent requests, or properly decode the content before storing it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deliver it with subsequent requests

Does this mean that the Content-Encoding header must be present in the response for every PATCH, DELETE and HEAD request?

protocol.md Outdated
* ``filename`` for a common file name

The specific metadata keys documented herein are reserved for the
respective use and MUST NOT be used for other purposes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @smatsson that this sentence is too restrictive. It would be a breaking change and there are no plans to make a major release for tus. I don't think this would serve the ecosystem well.

That being said, I am happy to recommend (not force) the usage of the filename and filetype metadata keys for the purposes as laid out here.

@Acconut
Copy link
Member

Acconut commented Sep 18, 2020

@nigoroll Would it be OK if I add some commits to this PR to show what I mean with some of my comments?

@nigoroll
Copy link
Contributor Author

@Acconut sure

@smatsson
Copy link

Do we need to consider the Content-Encoding changing between requests? E.g. the file is created with Content-Encoding: gzip but the client then decides to not include it (and not encoding the content either) for subsequent requests?

@Acconut
Copy link
Member

Acconut commented Sep 29, 2020

I adjusted the phrasing of the filename and filetype metadata keys to match the rest of the document and to also ensure that this is not a breaking change. Let me know what you think

@nigoroll There are also some questions from me above. Could you have a look at them when you have the time?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants