Full compatibility with S3 multipart uploads #204

Closed
opened 2022-01-25 11:22:25 +00:00 by lx · 2 comments
Owner

Task 1 of the NLnet Garage project started in April 2023 consists of fixing this issue. Progress on this task is tracked in #553.

See #197 and #198.

We are missing the following features:

  • Ability to re-upload a part that has been uploaded, by re-calling UploadPart. It should be silently overwritten:

    If you upload a new part using the same part number that was used with a previous part, the previously uploaded part is overwritten.

    https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html

  • Ability to drop some of the uploaded parts by not including them in the CompleteMultipartUpload call

  • On CompleteMultipartUpload, renumbering of parts with consecutive numbers starting at 1 (necessary so that the partNumber parameter in GetObject and HeadObject works correctly).

These features can only be implemented by fundamentally reworking the data model of Garage, meaning it's kind of a big change we aren't going to do just now.
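
To make the three missing behaviours concrete, here is a toy in-memory sketch of the S3 semantics described above. It is only an illustration: the `MultipartUpload` class and its methods are hypothetical names, not Garage's data model or API, and the ascending-order check reflects S3's documented requirement for the part list in CompleteMultipartUpload.

```python
from dataclasses import dataclass, field


@dataclass
class MultipartUpload:
    """Toy model of one multipart upload (hypothetical, not Garage's data model)."""

    parts: dict[int, bytes] = field(default_factory=dict)

    def upload_part(self, part_number: int, data: bytes) -> None:
        # Re-uploading with an existing part number silently overwrites it.
        self.parts[part_number] = data

    def complete(self, part_numbers: list[int]) -> list[tuple[int, bytes]]:
        # The caller lists the parts to keep, in strictly ascending order.
        if part_numbers != sorted(set(part_numbers)):
            raise ValueError("part numbers must be strictly ascending")
        missing = [n for n in part_numbers if n not in self.parts]
        if missing:
            raise KeyError(f"parts never uploaded: {missing}")
        # Parts not listed are dropped; kept parts are renumbered 1, 2, 3, ...
        return [(i, self.parts[n]) for i, n in enumerate(part_numbers, start=1)]


mpu = MultipartUpload()
mpu.upload_part(2, b"aaa")
mpu.upload_part(5, b"bbb")
mpu.upload_part(5, b"BBB")  # same part number: overwrites the previous data
mpu.upload_part(9, b"ccc")  # uploaded, but left out of complete() below
final_parts = mpu.complete([2, 5])
print(final_parts)
```

After completion, the object in this sketch has two parts numbered 1 and 2, which is what makes a `partNumber` lookup in GetObject or HeadObject well defined.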

For now, since renumbering is not supported, and to ensure that partNumber works correctly, the following restriction is implemented:

  • The uploaded parts must have consecutive numbers starting at 1.

Moreover, the following restrictions apply:

  • All uploaded parts must be used in CompleteMultipartUpload.

  • If one of the part uploads goes wrong, you have to abort the multipart upload and start over.
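
The first two restrictions can be pictured as a check performed when CompleteMultipartUpload is called. The sketch below is a hypothetical illustration in Python (Garage's actual code is not shown in this issue); `check_complete_request` and its error messages are made-up names.

```python
def check_complete_request(uploaded: set[int], requested: list[int]) -> None:
    """Raise ValueError if the request violates the interim restrictions.

    uploaded:  part numbers received through UploadPart
    requested: part numbers listed in CompleteMultipartUpload, in order
    """
    # Restriction: parts must be numbered 1, 2, ..., n with no gaps.
    expected = list(range(1, len(requested) + 1))
    if requested != expected:
        raise ValueError("parts must have consecutive numbers starting at 1")
    # Restriction: every uploaded part must appear in the request.
    if set(requested) != uploaded:
        raise ValueError("all uploaded parts must be used")


# Accepted: parts 1..3 uploaded and all three listed.
check_complete_request({1, 2, 3}, [1, 2, 3])

# Rejected: part 3 was uploaded but is not listed.
try:
    check_complete_request({1, 2, 3}, [1, 2])
except ValueError as err:
    rejection = str(err)
    print(rejection)
```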

lx added the scope s3-api label 2022-01-25 11:25:59 +00:00
  • If one of the part uploads goes wrong, you have to abort the multipart upload and start over.

I think this is the most annoying part of this ticket. Resuming after a failed part is one of the main selling points of multipart upload, along with higher bandwidth usage. From the Amazon documentation:

If you're uploading over a spotty network, use multipart upload to increase resiliency to network errors by avoiding upload restarts. When using multipart upload, you need to retry uploading only parts that are interrupted during the upload. You don't need to restart uploading your object from the beginning.

https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html
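
The retry pattern described in that quote could look roughly like the following on the client side. This is a minimal sketch under stated assumptions: `upload_with_retries` and `flaky_upload_part` are hypothetical helpers, the network failure is simulated, and the approach only works against a server that accepts a repeated UploadPart for the same part number, which is exactly what this issue asks for.

```python
from typing import Callable


def upload_with_retries(
    chunks: list[bytes],
    upload_part: Callable[[int, bytes], str],
    max_attempts: int = 3,
) -> list[tuple[int, str]]:
    """Upload each chunk as one part, retrying only the part that failed."""
    etags: list[tuple[int, str]] = []
    for part_number, chunk in enumerate(chunks, start=1):
        for attempt in range(1, max_attempts + 1):
            try:
                etag = upload_part(part_number, chunk)
                break  # this part succeeded, move on to the next one
            except ConnectionError:
                if attempt == max_attempts:
                    raise  # give up; the caller may abort the whole upload
        etags.append((part_number, etag))
    return etags


# Simulated transport (assumption): part 2 fails once, then succeeds.
calls: list[int] = []


def flaky_upload_part(part_number: int, data: bytes) -> str:
    calls.append(part_number)
    if part_number == 2 and calls.count(2) == 1:
        raise ConnectionError("simulated network drop")
    return f"etag-{part_number}"


result = upload_with_retries([b"a", b"b", b"c"], flaky_upload_part)
print(calls)  # [1, 2, 2, 3]: only part 2 was sent twice
```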
Owner

We observed one of these undesired behaviours in our CI during a nightly build.
Some artefacts failed to upload because a part upload was interrupted and Garage could not resume it.

$ nix-shell --attr release --run "to_s3"

upload failed: result/bin/garage to s3://garagehq.deuxfleurs.fr/_releases/8cd02639dc688dcb736b5c36dae822706862fac1/x86_64-unknown-linux-musl/garage An error occurred (InvalidRequest) when calling the UploadPart operation: Bad request: Part number 4 has already been uploaded
lx added this to the v1.0 milestone 2022-10-16 19:09:57 +00:00
lx modified the milestone from v1.0 to v0.9 2022-11-10 10:45:14 +00:00
lx self-assigned this 2022-12-11 17:45:27 +00:00
lx closed this issue 2023-06-09 15:34:10 +00:00
Reference: Deuxfleurs/garage#204