Large Files fail to upload #662
Reference: Deuxfleurs/garage#662
I have encountered a reproducible problem uploading large files, typically close to or greater than 1TB in size. I can upload lots of data without issue; I have uploaded ~12TB to this setup. But I am unable to upload large individual files of ~1TB+. I have verified that this also fails on older versions; I tested v0.8.2 as well.
Current version:
garage v0.9.0 [features: k2v, sled, lmdb, sqlite, consul-discovery, kubernetes-discovery, metrics, telemetry-otlp, bundled-libs]
Upload command:
aws s3 cp test3.tar s3://pizzatest/
upload failed: ./test3.tar to s3://pizzatest/test3.tar Read timeout on endpoint URL: "http://xxx.xxx.xxx.xxx:3900/pizzatest/test3.tar?uploadId=e06b810a5221c5aa16d1131d5c04bf8ce736e8ef5dd8db3f180b9018ddce6517"
File:
ls -l test3.tar
-rw-r--r-- 1 root users 1146850877441 Jul 4 17:18 test3.tar
Config setup:
cat garage.toml
metadata_dir = "/home/meta"
data_dir = "/home/data"
db_engine = "lmdb"
#replication_mode = "none"
replication_mode = "3"
rpc_bind_addr = "[::]:3901"
rpc_public_addr = "xxx.xxx.xxx.xxx:3901"
rpc_secret = "xxxx"
[s3_api]
s3_region = "garage"
api_bind_addr = "[::]:3900"
root_domain = ".s3.garage.localhost"
[s3_web]
bind_addr = "[::]:3902"
root_domain = ".web.garage.localhost"
index = "index.html"
[k2v_api]
api_bind_addr = "[::]:3904"
[admin]
api_bind_addr = "0.0.0.0:3903"
admin_token = "xxx="
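For reference, the block size is controlled by the top-level block_size option in garage.toml (it defaults to 1 MiB and does not appear in the config above). A hedged sketch of what raising it would look like; the 10 MiB value is purely illustrative:

```toml
# Hypothetical addition to the garage.toml above: raise the block size
# from the 1 MiB default. block_size is a top-level option, so it goes
# alongside metadata_dir / data_dir, not inside an [s3_api]-style section.
block_size = 10485760   # 10 MiB
```

As far as I know this only affects newly written objects; existing data keeps the block size it was uploaded with.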
The issue is probably linked to the fact that your 1TB+ files get split into 1M+ blocks if you keep the default block size of 1MB. This means that Garage has to generate (when uploading the object) and read (when reading the object) a list of over a million block IDs. In theory this is relatively doable, but the issue you are having seems to indicate that you are stretching the limits of the system here.

The first thing to do is to increase your block size to something like 10MB or 100MB, depending on your network conditions, and perhaps also increase the size of the parts in your multipart upload, to something like 1GB.

If that doesn't fix the issue, we will have to look into optimizing the code path.
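For scale, here is a quick back-of-the-envelope sketch of the block counts involved. The file size is taken from the `ls -l` output above; the block sizes are the 1 MiB default and the illustrative 10 MiB alternative:

```python
import math

# File size in bytes, from the `ls -l test3.tar` output in the report.
file_size = 1_146_850_877_441

# Garage's default block size is 1 MiB.
default_block = 1 * 1024 * 1024

# Number of block IDs Garage must track for this one object.
blocks_default = math.ceil(file_size / default_block)
print(f"1 MiB blocks:  {blocks_default:,}")   # roughly 1.09 million

# A 10 MiB block size shrinks the list by an order of magnitude.
blocks_10m = math.ceil(file_size / (10 * default_block))
print(f"10 MiB blocks: {blocks_10m:,}")       # roughly 110 thousand
```

So the object's block list goes from over a million entries to around a hundred thousand, which is the motivation for raising the block size before looking at code-path optimizations.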