Nodes would stabilize on different encoding formats for the values,
some having the pre-migration format and some having the post-migration
format. This would show up as Merkle trees that never converge,
causing an infinite resync loop.
Implement ListMultipartUploads, also refactor ListObjects and ListObjectsV2.
It took me some time, as I wanted to propose the following things:
- Using an iterator instead of the loop+goto pattern. I find it easier to read and it should enable some optimizations. For example, when consuming keys of a common prefix, we do many [redundant checks](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/main/src/api/s3_list.rs#L125-L156) while the only thing to do is to [check if the following key is still part of the common prefix](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/feature/s3-multipart-compat/src/api/s3_list.rs#L476).
- Try to name things (see ExtractionResult and RangeBegin enums) and to separate concerns (see ListQuery and Accumulator)
- An IO closure to make unit tests possible.
- Unit tests, to track regressions and document how to interact with the code
- Integration tests with `s3api`. In the future, I would like to move them to Rust with the AWS Rust SDK.
Merging of the logic of ListMultipartUploads and ListObjects was not a goal but a consequence of the previous modifications.
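To illustrate the iterator idea above, here is a minimal sketch, not the code from this PR. The names `skip_common_prefix` and `list_with_delimiter` are hypothetical, and the keys come from an in-memory sorted slice rather than the real table. Once a common prefix is found, the only check made on each following key is whether it still starts with that prefix.

```rust
use std::iter::Peekable;

/// Hypothetical helper: consume every upcoming key that still belongs to
/// `common_prefix`. Returns how many keys were skipped.
fn skip_common_prefix<'a, I>(keys: &mut Peekable<I>, common_prefix: &str) -> usize
where
    I: Iterator<Item = &'a str>,
{
    let mut skipped = 0;
    // `next_if` only advances the iterator when the predicate holds.
    while keys.next_if(|k| k.starts_with(common_prefix)).is_some() {
        skipped += 1;
    }
    skipped
}

/// Hypothetical listing over an already sorted slice of keys.
/// Returns (keys, common prefixes).
fn list_with_delimiter(
    sorted_keys: &[&str],
    prefix: &str,
    delimiter: char,
) -> (Vec<String>, Vec<String>) {
    let mut keys = Vec::new();
    let mut common_prefixes = Vec::new();
    let mut iter = sorted_keys
        .iter()
        .copied()
        .filter(|k| k.starts_with(prefix))
        .peekable();

    while let Some(key) = iter.next() {
        // Look for the delimiter in the part of the key after the prefix.
        match key[prefix.len()..].find(delimiter) {
            Some(pos) => {
                let end = prefix.len() + pos + delimiter.len_utf8();
                let common_prefix = &key[..end];
                skip_common_prefix(&mut iter, common_prefix);
                common_prefixes.push(common_prefix.to_string());
            }
            None => keys.push(key.to_string()),
        }
    }
    (keys, common_prefixes)
}
```

Calling `list_with_delimiter(&["a/1", "a/2", "b"], "", '/')` groups the first two keys under the common prefix `a/` and returns `b` as a plain key.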
Some points that we might want to discuss:
- ListObjectsV1, when using pagination and delimiters, has a weird behavior (it lists the same prefix multiple times) with `aws s3api` because it cannot use our optimization to skip the whole prefix. It is independent of my refactor and can be tested with the commented `s3api` tests in `test-smoke.sh`. It probably has the same weird behavior on the official AWS S3 implementation.
- Considering ListMultipartUploads, I had to "abuse" upload id marker to support prefix skipping. I send an `upload-id-marker` with the hardcoded value `include` to emulate your "including" token.
- Some ways to test ListMultipartUploads with existing software (my tests are limited to s3api for now).
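The `upload-id-marker` trick can be sketched as follows. This is a simplified illustration, not the PR's code: the enum borrows the `RangeBegin` name mentioned earlier, but its variants and the `range_begin` function are assumptions made for the example.

```rust
/// Hypothetical, simplified version of a "where does the listing start" enum.
#[derive(Debug, PartialEq)]
enum RangeBegin {
    /// No marker: start from the first key.
    Start,
    /// Start at `key`, including it (used for prefix skipping).
    IncludingKey(String),
    /// Start strictly after `key`.
    AfterKey(String),
    /// Start after a given upload of `key`.
    AfterUpload { key: String, upload_id: String },
}

/// Map the two S3 markers to a range start.
/// Assumption: the upload id marker is ignored when no key marker is given.
fn range_begin(key_marker: Option<&str>, upload_id_marker: Option<&str>) -> RangeBegin {
    match (key_marker, upload_id_marker) {
        (None, _) => RangeBegin::Start,
        (Some(key), None) => RangeBegin::AfterKey(key.to_string()),
        // The hardcoded value `include` emulates an "including" token.
        (Some(key), Some("include")) => RangeBegin::IncludingKey(key.to_string()),
        (Some(key), Some(id)) => RangeBegin::AfterUpload {
            key: key.to_string(),
            upload_id: id.to_string(),
        },
    }
}
```

One consequence of this design is that an upload whose id is literally `include` could not be used as a marker, which is part of why it is called an abuse above.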
Co-authored-by: Quentin Dufour <quentin@deuxfleurs.fr>
Reviewed-on: Deuxfleurs/garage#171
Co-authored-by: Quentin <quentin@dufour.io>
Co-committed-by: Quentin <quentin@dufour.io>
- Fix bucket delete
- fix merge of bucket creation date
- Replace deletable with option in aliases
Rationale: if two aliases point to conflicting buckets, resolving
by making an arbitrary choice risks making data accessible when it
shouldn't be. We'd rather resolve to deleting the alias until
someone puts it back.
- ensure bucket names are correct aws s3 names
- when making aliases, ensure timestamps of links in both ways are the
same
- fix small remarks by trinity
- don't have a separate website_access field
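The alias rationale above can be sketched as a merge function. This is an illustration under assumptions, not Garage's actual CRDT: the `Alias` struct, its integer bucket id and its timestamp field are all hypothetical. The point is that two concurrent writes with different targets resolve to `None` (alias deleted) rather than to an arbitrary winner.

```rust
/// Hypothetical alias entry: a timestamped optional pointer to a bucket.
#[derive(Debug, Clone, PartialEq)]
struct Alias {
    timestamp: u64,
    /// `None` means the alias is deleted.
    target: Option<u64>,
}

impl Alias {
    /// Merge another replica's view into this one.
    fn merge(&mut self, other: &Alias) {
        if other.timestamp > self.timestamp {
            // The newer write wins.
            *self = other.clone();
        } else if other.timestamp == self.timestamp && other.target != self.target {
            // Conflict: same timestamp, different targets.
            // Delete the alias instead of picking one arbitrarily.
            self.target = None;
        }
    }
}
```

The merge gives the same result whichever replica applies it first, so both sides end up with the alias deleted until someone puts it back.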
fix #77
this does not store anything but an on/off switch for website, and does not implement GetBucketWebsite as it would require storing more. GetBucketWebsite should be pretty easy to implement once data is stored though.
Co-authored-by: Trinity Pointard <trinity.pointard@gmail.com>
Reviewed-on: Deuxfleurs/garage#174
Co-authored-by: trinity-1686a <trinity.pointard@gmail.com>
Co-committed-by: trinity-1686a <trinity.pointard@gmail.com>
fix #161
Current request router was organically grown, and is getting messier and messier with each addition.
This router exhaustively covers existing API endpoints (with exceptions listed in [#161(comment)](Deuxfleurs/garage#161 (comment)) either because new and old API endpoints can't feasibly be differentiated, or it's more lambda than s3).
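As a rough illustration of the approach, a router of this kind turns each request into one variant of an enum. The sketch below is not the router from this PR: the `Endpoint` enum, the `route` function and the path-style-only parsing are assumptions, and only a handful of endpoints are shown.

```rust
/// Hypothetical subset of the endpoints a router could recognize.
#[derive(Debug, PartialEq)]
enum Endpoint {
    ListBuckets,
    ListObjects { bucket: String },
    ListObjectsV2 { bucket: String },
    ListMultipartUploads { bucket: String },
    GetBucketLocation { bucket: String },
    GetObject { bucket: String, key: String },
    PutObject { bucket: String, key: String },
}

/// Classify a path-style request from its method, path and query parameters.
/// Returns `None` for anything this sketch does not cover.
fn route(method: &str, path: &str, query: &[(&str, &str)]) -> Option<Endpoint> {
    let has = |name: &str| query.iter().any(|(k, _)| *k == name);
    let path = path.trim_start_matches('/');
    let (bucket, key) = match path.split_once('/') {
        Some((b, k)) => (b, k),
        None => (path, ""),
    };
    let b = bucket.to_string();
    let k = key.to_string();

    match (method, bucket.is_empty(), key.is_empty()) {
        ("GET", true, _) => Some(Endpoint::ListBuckets),
        ("GET", false, true) if has("uploads") => {
            Some(Endpoint::ListMultipartUploads { bucket: b })
        }
        ("GET", false, true) if has("location") => {
            Some(Endpoint::GetBucketLocation { bucket: b })
        }
        ("GET", false, true) if query.contains(&("list-type", "2")) => {
            Some(Endpoint::ListObjectsV2 { bucket: b })
        }
        ("GET", false, true) => Some(Endpoint::ListObjects { bucket: b }),
        ("GET", false, false) => Some(Endpoint::GetObject { bucket: b, key: k }),
        ("PUT", false, false) => Some(Endpoint::PutObject { bucket: b, key: k }),
        _ => None,
    }
}
```

Keeping all the matching in one place makes it easier to see which method and query-parameter combinations are handled and which fall through.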
Co-authored-by: Trinity Pointard <trinity.pointard@gmail.com>
Reviewed-on: Deuxfleurs/garage#163
Reviewed-by: Alex <alex@adnab.me>
Co-authored-by: trinity-1686a <trinity.pointard@gmail.com>
Co-committed-by: trinity-1686a <trinity.pointard@gmail.com>
- change the terminology: the network configuration becomes the role
table, the configuration of a node becomes a node's role
- the modification of the role table takes place in two steps: first,
changes are staged in a CRDT data structure. Then, once the user is
happy with the changes, they can commit them all at once (or revert
them).
- update documentation
- fix tests
- implement smarter partition assignment algorithm
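The two-step workflow described above (stage, then commit or revert) can be sketched like this. It is a simplified illustration, not the patch's data structure: a plain `BTreeMap` stands in for the CRDT, and `RoleTable`, `Role`, `NodeId` and the version counter are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Hypothetical node identifier.
type NodeId = u32;

/// Hypothetical role of a node.
#[derive(Debug, Clone, PartialEq)]
struct Role {
    zone: String,
    capacity: u32,
}

/// Role table with a staging area. A plain map replaces the CRDT here.
#[derive(Default)]
struct RoleTable {
    version: u64,
    roles: BTreeMap<NodeId, Role>,
    /// `Some(role)` assigns a role, `None` removes the node's role.
    staging: BTreeMap<NodeId, Option<Role>>,
}

impl RoleTable {
    /// Step 1: record a change without applying it.
    fn stage(&mut self, node: NodeId, role: Option<Role>) {
        self.staging.insert(node, role);
    }

    /// Step 2a: apply all staged changes at once.
    fn commit(&mut self) {
        for (node, change) in std::mem::take(&mut self.staging) {
            match change {
                Some(role) => {
                    self.roles.insert(node, role);
                }
                None => {
                    self.roles.remove(&node);
                }
            }
        }
        self.version += 1;
    }

    /// Step 2b: drop all staged changes.
    fn revert(&mut self) {
        self.staging.clear();
    }
}
```

Nothing in `roles` changes until `commit` is called, so several nodes can be reconfigured and then applied together.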
This patch breaks the format of the network configuration: when
migrating, the cluster will be in a state where no roles are assigned.
All roles must be re-assigned and committed at once. This migration
should not pose an issue.
- Explicit "replication_mode" configuration parameter that takes
either "none", "2" or "3" as values, instead of letting users configure
replication factor themselves. These are presets whose corresponding
replication/quorum values can be found in replication/mode.rs
- Explicit support for single-node and two-node deployments
(number of nodes must be at least "replication_mode", with "none"
we can have only one node)
- Ring is now stored much more compactly with 256*8 + n*32 bytes,
instead of 256*32 bytes
- Support for gateway-only nodes that do not store data
(these nodes still need a metadata_directory to store the list
of buckets and keys since those are stored on all nodes; they also
technically need a data_directory to start but it will stay
empty unless we have bugs)
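A minimal sketch of the `replication_mode` presets and the ring size arithmetic from the list above. The enum, its variant names and the function names are hypothetical; the actual presets, including quorum values, live in replication/mode.rs and are not reproduced here. The ring size function assumes that `n` in the formula is the number of nodes.

```rust
/// Hypothetical representation of the "replication_mode" presets.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ReplicationMode {
    NoReplication,
    TwoWay,
    ThreeWay,
}

impl ReplicationMode {
    /// Parse the configuration value: "none", "2" or "3".
    fn parse(value: &str) -> Option<Self> {
        match value {
            "none" => Some(Self::NoReplication),
            "2" => Some(Self::TwoWay),
            "3" => Some(Self::ThreeWay),
            _ => None,
        }
    }

    /// Number of copies of each piece of data.
    fn replication_factor(self) -> usize {
        match self {
            Self::NoReplication => 1,
            Self::TwoWay => 2,
            Self::ThreeWay => 3,
        }
    }

    /// The cluster needs at least as many nodes as the replication factor.
    fn check_cluster_size(self, n_nodes: usize) -> Result<(), String> {
        let needed = self.replication_factor();
        if n_nodes >= needed {
            Ok(())
        } else {
            Err(format!("need at least {} node(s), got {}", needed, n_nodes))
        }
    }
}

/// Compact ring size in bytes: 256*8 + n*32 (assuming n = number of nodes).
fn compact_ring_bytes(n_nodes: usize) -> usize {
    256 * 8 + n_nodes * 32
}

/// Previous ring size in bytes: 256*32.
fn old_ring_bytes() -> usize {
    256 * 32
}
```

Under that assumption, a three-node cluster stores the ring in 2144 bytes instead of 8192.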
- Use quick_xml and serde for all XML responses returned by the S3 API.
- Include tests for all structs used to generate XML
- Remove old manual XML escaping function which was unsafe
- return XML errors
- implement AuthorizationHeaderMalformed error to redirect clients to
correct location (used by minio client)
- implement GetBucketLocation
- fix DeleteObjects XML parsing and response