S3-compatible object store for small self-hosted geo-distributed deployments https://garagehq.deuxfleurs.fr/
Quentin b4592a00fe
Implement ListMultipartUploads (#171)
Implement ListMultipartUploads, also refactor ListObjects and ListObjectsV2.

It took me some time, as I wanted to propose the following things:
  - Using an iterator instead of the loop+goto pattern. I find it easier to read, and it should enable some optimizations. For example, when consuming the keys of a common prefix, we currently perform many [redundant checks](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/main/src/api/s3_list.rs#L125-L156), whereas the only thing we need to do is [check whether the following key is still part of the common prefix](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/feature/s3-multipart-compat/src/api/s3_list.rs#L476).
  - Naming things (see the ExtractionResult and RangeBegin enums) and separating concerns (see ListQuery and Accumulator).
  - An IO closure to make unit tests possible.
  - Unit tests, to track regressions and document how to interact with the code
  - Integration tests with `s3api`. In the future, I would like to port them to Rust using the AWS Rust SDK.
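
To illustrate the iterator idea from the first point, here is a minimal sketch, not the code from this PR. It assumes the keys arrive sorted and the delimiter is non-empty; the `Entry` enum and the `list_with_delimiter` function are hypothetical names introduced only for this example. Once a common prefix is found, the only check performed on the following keys is whether they still start with that prefix.

```rust
/// One listing entry: a plain key, or a common prefix folding several keys.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, PartialEq)]
enum Entry {
    Key(String),
    CommonPrefix(String),
}

/// Lists sorted `keys` under `prefix`, folding keys that share the same
/// delimiter-terminated prefix into a single `CommonPrefix` entry.
/// Assumes `delimiter` is non-empty.
fn list_with_delimiter<'a, I>(keys: I, prefix: &str, delimiter: &str) -> Vec<Entry>
where
    I: Iterator<Item = &'a str>,
{
    let mut iter = keys.peekable();
    let mut out = Vec::new();

    while let Some(key) = iter.next() {
        // Ignore keys outside of the requested prefix.
        if !key.starts_with(prefix) {
            continue;
        }
        let rest = &key[prefix.len()..];
        match rest.find(delimiter) {
            Some(pos) => {
                // The common prefix ends right after the first delimiter.
                let common = &key[..prefix.len() + pos + delimiter.len()];
                // Consume the following keys as long as they are still part
                // of the same common prefix; no other check is needed here.
                while iter.next_if(|k| k.starts_with(common)).is_some() {}
                out.push(Entry::CommonPrefix(common.to_string()));
            }
            None => out.push(Entry::Key(key.to_string())),
        }
    }
    out
}
```

For example, listing the keys `a/1`, `a/2`, `b`, `c/1` with an empty prefix and `/` as delimiter yields the common prefix `a/`, the key `b`, and the common prefix `c/`.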

Merging the logic of ListMultipartUploads and ListObjects was not a goal but a consequence of the previous modifications.
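
The IO closure mentioned in the list above can be sketched as follows. This is a simplified, synchronous illustration under my own assumptions, not the actual code (which is presumably asynchronous and works on Garage's table types); `collect_keys` and the signature of `fetch` are hypothetical. The point is that the listing logic only reaches storage through the closure, so a unit test can pass an in-memory closure instead.

```rust
/// Collects up to `limit` keys by repeatedly asking `fetch` for a page of at
/// most `page_size` keys located strictly after the given cursor.
/// `fetch` is the only place where I/O would happen.
/// (Hypothetical helper, introduced only for this sketch.)
fn collect_keys<F>(mut fetch: F, limit: usize, page_size: usize) -> Vec<String>
where
    F: FnMut(Option<&str>, usize) -> Vec<String>,
{
    let mut out: Vec<String> = Vec::new();
    while out.len() < limit {
        // Never request more keys than we still need.
        let want = page_size.min(limit - out.len());
        // Resume after the last key collected so far (None on the first call).
        let page = fetch(out.last().map(|s| s.as_str()), want);
        if page.is_empty() {
            break; // the store is exhausted
        }
        out.extend(page);
    }
    out
}
```

In a unit test, `fetch` can simply filter a sorted `Vec` of keys held in memory, which exercises the pagination logic without starting a cluster.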

Some points that we might want to discuss:
  - ListObjectsV1, when using pagination and delimiters, behaves oddly with `aws s3api` (it lists the same prefix multiple times), because it cannot use our optimization to skip the whole prefix. This is independent of my refactor and can be tested with the commented-out `s3api` tests in `test-smoke.sh`. The official AWS S3 implementation probably behaves the same way.
  - Regarding ListMultipartUploads, I had to "abuse" the upload ID marker to support prefix skipping. I send an `upload-id-marker` with the hardcoded value `include` to emulate your "including" token.
  - Finding ways to test ListMultipartUploads with existing software (my tests are limited to `s3api` for now).
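
To make the `upload-id-marker` point concrete, here is a minimal sketch of how the two markers could be mapped to a resume position. The `Resume` enum and `resume_point` function are hypothetical (loosely inspired by the RangeBegin name above), and the actual implementation may differ. It follows the S3 rules that `upload-id-marker` is ignored without `key-marker` and that a `key-marker` alone resumes strictly after that key, and adds the special `include` value described above.

```rust
/// Where a paginated ListMultipartUploads call should resume.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, PartialEq)]
enum Resume {
    /// No marker: start from the beginning.
    FromStart,
    /// Resume at this key, the key itself included (the "including" token).
    IncludingKey(String),
    /// Resume strictly after this key.
    AfterKey(String),
    /// Resume at this key, after the given upload ID.
    AfterUpload { key: String, upload_id: String },
}

/// Maps the `key-marker` and `upload-id-marker` query parameters to a
/// resume position.
fn resume_point(key_marker: Option<&str>, upload_id_marker: Option<&str>) -> Resume {
    match (key_marker, upload_id_marker) {
        // Without a key marker, the upload ID marker is ignored.
        (None, _) => Resume::FromStart,
        (Some(k), None) => Resume::AfterKey(k.to_string()),
        // Hardcoded value emulating the "including" token.
        (Some(k), Some("include")) => Resume::IncludingKey(k.to_string()),
        (Some(k), Some(u)) => Resume::AfterUpload {
            key: k.to_string(),
            upload_id: u.to_string(),
        },
    }
}
```

One consequence of this design is that a client can never resume after an upload whose ID is literally `include`, which should be harmless as long as the server never generates such an ID.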

Co-authored-by: Quentin Dufour <quentin@deuxfleurs.fr>
Reviewed-on: #171
Co-authored-by: Quentin <quentin@dufour.io>
Co-committed-by: Quentin <quentin@dufour.io>
2022-01-12 19:04:55 +01:00
doc Implement ListMultipartUploads (#171) 2022-01-12 19:04:55 +01:00
nix Extract toolchain build from the CI 2021-10-29 11:34:01 +02:00
script Implement ListMultipartUploads (#171) 2022-01-12 19:04:55 +01:00
src Implement ListMultipartUploads (#171) 2022-01-12 19:04:55 +01:00
.dockerignore Build Docker image 2020-06-30 17:18:42 +02:00
.drone.yml Extract toolchain build from the CI 2021-10-29 11:34:01 +02:00
.gitignore Work on API 2020-04-28 10:18:14 +00:00
Cargo.lock Some movement of helper code and refactoring of error handling 2022-01-04 12:52:46 +01:00
Cargo.nix Hopefully fix Nix build 2022-01-04 12:52:46 +01:00
Cargo.toml Skeleton to the new web API 2020-11-02 15:48:39 +01:00
Dockerfile Extract toolchain build from the CI 2021-10-29 11:34:01 +02:00
LICENSE Switch to AGPL 2021-03-16 16:35:46 +01:00
Makefile Build Garage with Nix 2021-10-19 16:56:07 +02:00
README.md Improve how node roles are assigned in Garage 2021-11-16 16:05:53 +01:00
default.nix Implement ListMultipartUploads (#171) 2022-01-12 19:04:55 +01:00
rustfmt.toml Fix the Sync issue. Details: 2020-04-10 22:01:48 +02:00
shell.nix Implement ListMultipartUploads (#171) 2022-01-12 19:04:55 +01:00

README.md

Garage

[ Website and documentation | Binary releases | Git repository | Matrix channel ]

Garage is a lightweight S3-compatible distributed object store, with the following goals:

  • As self-contained as possible
  • Easy to set up
  • Highly resilient to network failures, network latency, disk failures, sysadmin failures
  • Relatively simple
  • Made for multi-datacenter deployments

Non-goals include:

  • Extremely high performance
  • Complete implementation of the S3 API
  • Erasure coding (our replication model is simply to copy the data as is on several nodes, in different datacenters if possible)

Our main use case is to provide a distributed storage layer for small-scale self-hosted services such as Deuxfleurs.