Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async File uploads #4572

Closed
exalate-issue-sync bot opened this issue Sep 13, 2022 · 7 comments
Closed

Async File uploads #4572

exalate-issue-sync bot opened this issue Sep 13, 2022 · 7 comments
Labels
p3-medium Status:Stale Type:Epic Epic is the parent of user stories

Comments

@exalate-issue-sync
Copy link

After a client finished the upload of a file (ie. transfered the last byte), it should get the upload success code immediately if all bytes were transfered successfully.

From the clients perspective, upload means transfering bytes through the network to the server. As soon as that has happened, the client should get a positive answer immediately.

Things that technically might take long time (assembling of huge files after upload) should be transparent to the client. It does not have to wait for it and should not have to poll for success.

Operation steps like virus scan which also might take long time are actually a workflow step and should be handled as that.

It needs to be defined how this can be done properly from the technical and the user experience POV.

@exalate-issue-sync exalate-issue-sync bot added p3-medium Type:Epic Epic is the parent of user stories labels Sep 13, 2022
@exalate-issue-sync
Copy link
Author

exalate-issue-sync bot commented Sep 13, 2022

Klaas Freitag commented: # Specification:

The upload workflow in oCIS should - as a first step - work like this:

Client defined UUIDs

Clients are allowed to send a UUID in a header of the upload with the first request. That UUID is used as the file ID in oCIS. That way the client does not have to wait for the server side processing to be completed to get the file ID. The file ID is simply set by the client. oCIS needs to check if there is a file ID header and use that for the file.

Questions:

  • The server maintains a specific format of a UUID. Is it important that the clients use the same format? Capability?

Async Client Upload

The following Async Upload will make uploading faster from the clients point of view, because it does not have to wait for any postprocessing or updating of the meta data (which can take some time in oC10).

Also, in situations where storage of Metadata and binary data is split (eg. S3) we need this.

The processing is as follows:

  • Once the data of the file is uploaded completely, which means that the last byte arrived on the server, it immediately returns with an status code 202 (Accepted). That indicates to the uploading client that postprocessing is ongoing.
  • The client uploads the data to a fast, temporar POSIX based storage temporarily which is for example in case of TUS needed anyway.
  • The reply to the upload contains an arbitrary, changed ETag as usual. The ETag of the touched file is changed already.
  • In subsequent listings of the directory (via PROPFIND or GraphAPI) the server will list the file from the moment that the upload completed. If a status is included in the reply, the status 425 (TOO EARLY) will be returned.
  • If a client tries to download a file while postprocessing is ongoing, the server replies with 425
  • If a client tries to change or delete a file while postprocessing is ongoing, the server replies 425
  • After postprocessing has completed, the server changes the ETag of the parent dir of the file.

The most asked question:

If a file is removed, because it contains a virus that makes the virus check (which is running in post-processing) deleting the file, how does the client get to know that?

Answer: Currently, there is no special info for the client. The file just disappears. It is up to the virus app (or any other that does similar things) to inform the user, ie. through the activity app.

Checksums

  • To do a proper checksum handling (the checksum of the complete file is not validated after the last upload) the TUS extension of partial checksums should be implemented. See Partial Checksum Extension tus/tus-resumable-upload-protocol#172
  • As long as the extension is not there, the server has to implement a rolling checksum while the file is uploaded and do the check prior the reply on the last chunk upload.

@exalate-issue-sync
Copy link
Author

Klaas Freitag commented: Since older clients do not handle the 403 forbidden status for a single file correctly, we will return "423 Locked" instead. That is handled as a Softerror (at least in Desktop), which means: The client does not bother the user with that error, but silently tries to get the file later, which is the correct behaviour.

@exalate-issue-sync
Copy link
Author

Jörn Friedrich Dreyer commented: In effect, this behavior means we queue uploads on the server side. Even if we directly upload all chunks to the final destination, eg as multipart upload for an S3 storage, the final assembly may fail. How do we handle that case? retry indefinitely? ping admin?

I actually think this would be ok, because when we shard the uploads per space, the bytes are still available in the space. We can notify admins so he can fix the problem. Or we give a metric on how many uploads are still in progress, so it can be monitored properly.

How do we deal with the case the where the client sucessfully transfers all bytes, but a server side workflow like anti virus scanning or file classification makes the finalize request fail?

I think we should treat file upload, as in transferring bytes, different than postprocessing.
When the last byte is transferred we return and the client can continue. But then we need to make transparent that a postprocessing workflow is in progress. For S3 that includes file assembly. It might include antivirus scanning or classification. Some workflow steps might cause the new bytes to be rejected. What is going to happen then?

The TUS upload ID can remain accessible until postprocessing is done ... so clients could check the status of the upload ... we could expose workflow steps using headers similar to the Server-Timing header? https://w3c.github.io/server-timing/#the-server-timing-header-field

But how often would they need to do that? how will clients get notified when a necessary postprocessing step fails. And ... should clients then delete the file locally? How can they even retry if all they got was a 202 Accepted and later a new etag for the same file ... AFAICT they will redownload and overwrite local changes ...

@exalate-issue-sync
Copy link
Author

Klaas Freitag commented: Reply to [~jfd]:

  • We do not have to queue uploads. While the upload is running, the server has to check if the if-match condition is still correct. If another upload is faster, the if-match will break, which is replied to the client. (Attention: An empty if-match header indicates a new file upload).
  • If postprocessing is running, other clients will get a reply accordingly and try again later.
  • From the clients POV, if all parts of a file are uploaded with a verified checksum (!), the file is on the server. If the final assembly fails, that is not the business of any client. Rather call the admin, yes. Nothing the client should bother. Or in other words: The final assembly CAN NEVER fail from the clients POV.
  • If the file does not pass the anti virus, it is just removed on the server, and as a result of that, on all clients. Nothing the client need to understand.

@exalate-issue-sync
Copy link
Author

Michael Barz commented: Was discussed in cross-platform meeting:

  • Outcome regarding the clients: New clients should work seamless with the new protocol, old clients should not explode.
  • Further usability improvements like showing the postprocessing, process name and the progress should only be visible in newly developed clients

@micbar
Copy link
Contributor

micbar commented Sep 13, 2022

This was implemented on the experimental branches of reva and ocis.

@stale
Copy link

stale bot commented Mar 18, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status:Stale label Mar 18, 2023
@stale stale bot closed this as completed Apr 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
p3-medium Status:Stale Type:Epic Epic is the parent of user stories
Projects
None yet
Development

No branches or pull requests

1 participant