Large Files fail to upload #662
I have encountered a reproducible problem uploading large files, typically close to or greater than 1 TB in size. I can upload a lot of data without issue; I have uploaded about 12 TB to this setup. However, I am unable to upload large individual files of roughly 1 TB or more. I verified that this also fails on older versions, including v0.8.2.
Current version:
garage v0.9.0 [features: k2v, sled, lmdb, sqlite, consul-discovery, kubernetes-discovery, metrics, telemetry-otlp, bundled-libs]
Upload command:
aws s3 cp test3.tar s3://pizzatest/
upload failed: ./test3.tar to s3://pizzatest/test3.tar Read timeout on endpoint URL: "http://xxx.xxx.xxx.xxx:3900/pizzatest/test3.tar?uploadId=e06b810a5221c5aa16d1131d5c04bf8ce736e8ef5dd8db3f180b9018ddce6517"
File:
ls -l test3.tar
-rw-r--r-- 1 root users 1146850877441 Jul 4 17:18 test3.tar
Config setup:
cat garage.toml
metadata_dir = "/home/meta"
data_dir = "/home/data"
db_engine = "lmdb"
#replication_mode = "none"
replication_mode = "3"
rpc_bind_addr = "[::]:3901"
rpc_public_addr = "xxx.xxx.xxx.xxx:3901"
rpc_secret = "xxxx"
[s3_api]
s3_region = "garage"
api_bind_addr = "[::]:3900"
root_domain = ".s3.garage.localhost"
[s3_web]
bind_addr = "[::]:3902"
root_domain = ".web.garage.localhost"
index = "index.html"
[k2v_api]
api_bind_addr = "[::]:3904"
[admin]
api_bind_addr = "0.0.0.0:3903"
admin_token = "xxx="
The issue is probably linked to the fact that your 1 TB+ files get split into more than a million blocks if you keep the default block size of 1 MB. This means that Garage has to generate (when uploading the object) and read (when reading the object) a list of over a million block IDs. In theory this should work, but the behavior you are seeing suggests that you are stretching the limits of the system here.
The first thing to try is increasing your block size to something like 10 MB or 100 MB, depending on your network conditions, and possibly also increasing the size of the parts in your multipart upload to something like 1 GB. If that doesn't fix the issue, we will have to look into optimizing this code path.
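For reference, a minimal sketch of the two adjustments, assuming Garage's top-level block_size option (specified in bytes) and the aws CLI's s3.multipart_chunksize setting; the exact values are illustrative and should be tuned to your network:

# in garage.toml, next to data_dir (requires a restart; applies to newly written objects)
block_size = 10485760  # 10 MiB instead of the default 1 MiB

# on the client side, raise the part size the aws CLI uses for multipart uploads
aws configure set default.s3.multipart_chunksize 1GB

With a 10 MiB block size, a ~1.1 TB file maps to roughly 110,000 blocks instead of over a million, which should take significant pressure off the object metadata path.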