Fragmentation with repetitive UploadPartCopy #203
When using UploadPartCopy, we can upload part number i in a multipart upload from a previously existing object, or from a sub-range of such an object. UploadPartCopy tries to preserve the blocks of the source object as much as possible. For instance, if we are copying an object made of blocks b1, b2 and b3, then the created part will be composed of those same blocks b1, b2 and b3.

Now, if we specify a source range in the object to copy, for example 500kb-2100kb, the uploaded part will be constituted of the following blocks: a suffix of b1 (call it b1'), the whole of b2, and a prefix of b3 (call it b3'). In other words, the blocks b1 and b3 will be split in the middle, and block b2 will be copied as-is.
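For reference, such a ranged copy looks like this from the client side with boto3 (the endpoint, bucket, keys and part number below are placeholders for the example):

```python
import boto3

# Garage exposes the standard S3 API; endpoint and names are placeholders.
s3 = boto3.client("s3", endpoint_url="http://localhost:3900")

# Start a multipart upload for the destination object.
mpu = s3.create_multipart_upload(Bucket="my-bucket", Key="destination-object")

# Upload part 1 by copying bytes 500kb-2100kb of an existing object.
# Because neither offset falls on a block boundary of the source, the first
# and last copied blocks have to be split (the b1'/b3' fragments above).
s3.upload_part_copy(
    Bucket="my-bucket",
    Key="destination-object",
    UploadId=mpu["UploadId"],
    PartNumber=1,
    CopySource={"Bucket": "my-bucket", "Key": "source-object"},
    CopySourceRange="bytes=500000-2099999",  # inclusive byte range
)
```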
Now, if we use the resulting object as the source of yet another UploadPartCopy, we make fragmentation even worse: b1' could itself be split in the middle, and so could b3', making for smaller and smaller blocks each time an existing block is split. To avoid this, the following recommendation should be applied:
Unfortunately, the S3 API does not give direct access to the boundaries of individual blocks (an implementation detail that is abstracted away). However, with ListParts, as well as with the partNumber parameter of GetObject and HeadObject, we can know the boundaries of the parts in a multipart upload. These boundaries are always block boundaries, and they can safely be used as bounds in an UploadPartCopy source range without causing further fragmentation.
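As a sketch of that recommendation from the client side (placeholder names, no error handling): for a completed multipart object, the part sizes can be read back with HeadObject and its PartNumber parameter, and a copy range can then be snapped to those boundaries before calling UploadPartCopy:

```python
import boto3

s3 = boto3.client("s3", endpoint_url="http://localhost:3900")  # placeholder endpoint

def part_boundaries(bucket: str, key: str) -> list[int]:
    """Return the byte offsets at which each part of the object starts (plus total size).

    HeadObject with PartNumber reports the size of each individual part and the
    total number of parts (PartsCount) for objects created by multipart upload.
    """
    head = s3.head_object(Bucket=bucket, Key=key, PartNumber=1)
    parts_count = head.get("PartsCount", 1)
    offsets = [0]
    for n in range(1, parts_count + 1):
        part = s3.head_object(Bucket=bucket, Key=key, PartNumber=n)
        offsets.append(offsets[-1] + part["ContentLength"])
    return offsets

def snap_range_to_parts(start: int, end: int, boundaries: list[int]) -> tuple[int, int]:
    """Widen [start, end) outward so that both ends fall on part boundaries."""
    snapped_start = max(b for b in boundaries if b <= start)
    snapped_end = min(b for b in boundaries if b >= end)
    return snapped_start, snapped_end

boundaries = part_boundaries("my-bucket", "source-object")
lo, hi = snap_range_to_parts(500_000, 2_100_000, boundaries)
# Copying only whole parts means no block needs to be split on the server side.
# CopySourceRange uses an inclusive end offset, hence the "hi - 1".
copy_range = f"bytes={lo}-{hi - 1}"
```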
Part fragmentation happens mostly when repeatedly making new multipart uploads that copy previous objects with UploadPartCopy: each time an existing object is used as the source of an UploadPartCopy with a sub-range, some blocks (one at the beginning, one at the end) might need to be split. The problem is that we do not control where these blocks are split, so the resulting sub-chunks may be excessively small. After a single UploadPartCopy this is not much of an issue, since there are at most two small sub-chunks: a constant overhead that can still be managed. But after repeatedly rewriting an object with UploadPartCopy, many chunks will have been split in many places, and the object might end up consisting mostly of very small chunks, instead of mostly 1MB chunks with a few very small chunks in between.
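To make the accumulation concrete, here is a small model of such a workload; the 1MB block size, the splice-style edits and the random offsets are assumptions chosen only to illustrate the effect:

```python
import random

BLOCK = 1_000_000  # assumed 1MB block size

def copy_range(blocks: list[int], start: int, end: int) -> list[int]:
    """Block sizes produced by an UploadPartCopy of [start, end) over `blocks`.

    Blocks fully inside the range keep their size (they are reused as-is); the
    blocks containing `start` and `end` are split, leaving a small fragment.
    """
    out, offset = [], 0
    for size in blocks:
        lo, hi = max(start, offset), min(end, offset + size)
        if lo < hi:
            out.append(hi - lo)
        offset += size
    return out

random.seed(0)
obj = [BLOCK] * 100  # a 100MB object made of full-size blocks

# Model an editing workload: each rewrite splices one fresh 1MB part into the
# object at a random offset, copying the two halves with UploadPartCopy.
for _ in range(50):
    total = sum(obj)
    cut = random.randrange(1, total)
    obj = copy_range(obj, 0, cut) + [BLOCK] + copy_range(obj, cut, total)

small = sum(1 for size in obj if size < BLOCK)
print(f"{len(obj)} blocks total, {small} of them smaller than 1MB")
```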
Proposed solution: add a defragmentation procedure to the code of UploadPartCopy. A first idea was to read the copied part and re-split it at 1MB boundaries, but that is costly, not required in most cases, and breaks deduplication in many cases. What we can do instead is keep all blocks that are 1MB, and whenever there are several consecutive blocks that are smaller (and whose total size is also less than 1MB), re-read them and concatenate them into a single block. In the majority of cases we will not need to re-read or re-write any block, and in the few pathological cases where consecutive blocks have been split into very small pieces, this will repair them opportunistically.
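A rough sketch of that coalescing pass (in Python for readability, not Garage's actual Rust internals; representing blocks directly as byte strings and the 1MB threshold are simplifying assumptions):

```python
BLOCK = 1_000_000  # assumed full block size (1MB)

def defragment(blocks: list[bytes]) -> list[bytes]:
    """Coalesce runs of consecutive undersized blocks in a copied part.

    Full-size blocks are kept untouched (preserving deduplication). Consecutive
    blocks that are each smaller than BLOCK, and whose combined size stays
    within BLOCK, are re-read and concatenated into a single new block.
    """
    out: list[bytes] = []
    run: list[bytes] = []  # current run of consecutive small blocks

    def flush_run():
        if len(run) > 1:
            out.append(b"".join(run))  # re-write the small blocks as one block
        else:
            out.extend(run)            # a lone small block is left as-is
        run.clear()

    for block in blocks:
        if len(block) >= BLOCK:
            flush_run()
            out.append(block)          # full block: keep, no re-read needed
        elif run and sum(len(b) for b in run) + len(block) > BLOCK:
            flush_run()                # merging would exceed BLOCK: start a new run
            run.append(block)
        else:
            run.append(block)
    flush_run()
    return out
```

In this sketch the re-read/re-write only happens when a run actually contains more than one small block, so the common case (a few isolated small fragments) costs nothing extra.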