Interrupted PUTs leave temporary files and stale metadata #469

Closed
opened 2023-01-08 19:11:41 +00:00 by kaiyou · 1 comment
Contributor

When a PUT (either PutObject or UploadPart) fails before all related blocks are written to disk, the Object is never created, but some stale data and metadata remain:

  • stale .tmp files that were never renamed to .zst, even though they were properly uploaded and written, because the request was interrupted while data was being flushed to disk and before the file was renamed;
  • stale Block entries, not related to any ObjectVersion;
  • stale ObjectVersion in uploading state (some are normal unfinished multipart uploads, some are failed PUT requests).
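
To make the first failure mode concrete, here is a minimal Python sketch of the write-to-temporary-then-rename pattern and the window in which an interruption leaves a .tmp file behind. It is not Garage's code (Garage is written in Rust); `write_block`, `demo`, and the `interrupt_before_rename` switch are hypothetical names used only for illustration, and the payloads are plain bytes rather than real zstd data.

```python
import os
import tempfile


def write_block(directory: str, name: str, data: bytes,
                interrupt_before_rename: bool = False) -> str:
    """Write `data` to `<name>.tmp`, flush it, then rename it to `<name>.zst`.

    `interrupt_before_rename` is a hypothetical switch that simulates a request
    being cancelled after the data reached disk but before the rename.
    """
    tmp_path = os.path.join(directory, name + ".tmp")
    final_path = os.path.join(directory, name + ".zst")

    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # data is on disk, but still under the .tmp name

    if interrupt_before_rename:
        # Simulated interruption: nothing on this path cleans up tmp_path.
        raise InterruptedError("request cancelled before rename")

    os.replace(tmp_path, final_path)  # atomic rename within one filesystem
    return final_path


def demo() -> list[str]:
    """Write one complete block and one interrupted block, then list the directory."""
    with tempfile.TemporaryDirectory() as d:
        write_block(d, "aaaa", b"complete block")
        try:
            write_block(d, "bbbb", b"interrupted block", interrupt_before_rename=True)
        except InterruptedError:
            pass
        return sorted(os.listdir(d))
```

Calling `demo()` lists one finished `.zst` file and one leftover `.tmp` file whose contents are complete but which was never renamed.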

The following two commits were pushed to fix most of these issues:

  • https://git.deuxfleurs.fr/Deuxfleurs/garage/commit/0650a43cf14e7e52121a553130a9ea6c92b7bd4a
  • https://git.deuxfleurs.fr/Deuxfleurs/garage/commit/936b6cb563b9dc8bb5c879f8bd6b89574f016f03

After running the dev build for 3 days, no new stale metadata or temporary files remain, so maybe this issue can be closed immediately; I mostly wrote it to keep track of the failure case.

In case you have remaining errored blocks:

  • drop all unfinished multipart uploads,
  • check zst integrity of stale tmp files and rename them to .zst,
  • drop and reupload remaining objects.

(Or just drop and reupload them all if you can afford it.)
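
The second recovery step (check zst integrity of stale tmp files, then rename them) could be scripted roughly as follows. This is a sketch under assumptions, not an official tool: it assumes stale files are named `<something>.tmp` somewhere under the data directory, that the `zstd` command-line tool is installed for the default integrity check, and that the node is stopped while it runs. `zstd_ok` and `recover_tmp_blocks` are hypothetical names. It defaults to a dry run and never overwrites an existing .zst file.

```python
import subprocess
from pathlib import Path
from typing import Callable


def zstd_ok(path: Path) -> bool:
    """Return True if `zstd -t` accepts the file (assumes the zstd CLI is installed)."""
    result = subprocess.run(["zstd", "-t", "-q", "--", str(path)],
                            capture_output=True)
    return result.returncode == 0


def recover_tmp_blocks(data_dir: Path,
                       is_valid: Callable[[Path], bool] = zstd_ok,
                       dry_run: bool = True) -> tuple[list[Path], list[Path]]:
    """Rename valid `*.tmp` files to `*.zst` and report the ones left alone.

    Returns (targets renamed or to be renamed, tmp files skipped).
    """
    renamed, skipped = [], []
    # sorted() materialises the listing before any file is renamed.
    for tmp in sorted(data_dir.rglob("*.tmp")):
        if not tmp.is_file():
            continue
        target = tmp.with_suffix(".zst")
        # Skip files that fail the check or would overwrite an existing block.
        if target.exists() or not is_valid(tmp):
            skipped.append(tmp)
            continue
        if not dry_run:
            tmp.rename(target)
        renamed.append(target)
    return renamed, skipped
```

Running it first with `dry_run=True` shows what would be renamed; the skipped files are the candidates for the third step (drop and reupload).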
quentin added the kind wrong-behavior label 2023-01-15 09:22:35 +00:00
Owner

I think this has been fixed by the two commits you mentioned, so I'm closing this issue. Feel free to re-open if the bug still appears.

lx closed this issue 2023-04-25 09:24:30 +00:00
Reference: Deuxfleurs/garage#469