Compare commits

..

212 commits

Author SHA1 Message Date
89b8087ba8 Merge pull request 'Properly return HTTP 204 when deleting non-existent object (fix #227)' (#384) from deleteobject-204 into main
Reviewed-on: Deuxfleurs/garage#384
2022-09-14 17:16:39 +02:00
76f42a1a2b
Properly return HTTP 204 when deleting non-existent object (fix #227) 2022-09-14 17:07:55 +02:00
82600acf77 Merge pull request 'Allow for hostnames in bootstrap_peers and rpc_public_addr (fix #353)' (#383) from resolve-peer-names into main
Reviewed-on: Deuxfleurs/garage#383
2022-09-14 16:37:18 +02:00
e46dc2a8ef
Allow for hostnames in bootstrap_peers and rpc_public_addr (fix #353) 2022-09-14 16:09:38 +02:00
80fdbfb0aa Merge pull request 'various fixes for v0.8.0' (#380) from various-fixes-for-0.8 into main
Reviewed-on: Deuxfleurs/garage#380
2022-09-13 16:49:05 +02:00
ab722cb40f
Add checks on replication_factor of layouts we use (fix #363, fix #364) 2022-09-13 16:22:23 +02:00
38be811b1c
Fix clippy lint that says we should implement Eq 2022-09-13 16:08:00 +02:00
44733474bb
Remove/change println! in server code (fix #358) 2022-09-13 16:01:55 +02:00
07febd3ecd
Ensure data dir is created immediately when Garage starts (fix #349) 2022-09-13 15:57:27 +02:00
11bdc971e2 Merge pull request 'use netapp streaming body' (#343) from netapp-stream-body into main
Reviewed-on: Deuxfleurs/garage#343
2022-09-13 15:26:08 +02:00
ff30891999
Use streaming block API for get with Range requests 2022-09-13 15:13:07 +02:00
28a4af73ca
Use netapp 0.5 published from crates.io 2022-09-13 13:11:44 +02:00
b823151a0b
improvements in block manager 2022-09-12 16:57:38 +02:00
309d7aef3f Merge pull request 'performance improvements' (#342) from lx-perf-improvements into main
Performance improvements included in this PR:

- [x] Use `Bytes` at a few places where appropriate, instead of `Vec<u8>`, to reduce the number of copies
  - [x] StreamChunker now accumulates incoming slices in a `Vec<Bytes>` instead of a `VecDeque<u8>`. Replaces calls to `.extend()` and `.drain()` that were quite costly by a simple `concat()` on a vec of slices which is much more optimized
- [x] Hashing (b2, sha256, md5) is now done on a Tokio thread dedicated to cpu-intensive tasks, using `spawn_blocking`
- [x] Block manager now uses 256 independant locks instead of one big lock for writing, reduces contention when writing several/many objects in parallel
- [x] Better LMDB defaults: we now put flags `NoSync` and `NoMetaSync` to avoid `fsync` at each transaction (extremely slow). Also increased number of LMDB readers to accomodate more intensive workloads

Other changes included in this PR:

- [x] Update to hashing and MAC crates: md5 and sha2 from 0.9 to 0.10, hmac from 0.10 to 0.12
- [x] switch to `tracing_subscriber` for logs, which allows to have timing of each event

Reviewed-on: Deuxfleurs/garage#342
2022-09-12 16:38:43 +02:00
f91fab8582
Simplify+improve async hasher by using bounded channel 2022-09-12 16:23:43 +02:00
7f54706b95
Merge branch 'lx-perf-improvements' into netapp-stream-body 2022-09-08 15:50:56 +02:00
d9d199a6c9
Merge branch 'main' into lx-perf-improvements 2022-09-08 15:49:17 +02:00
03c40a0b24 Merge pull request 'Reorganize dependencies' (#373) from improve-deps into main
This PR includes work from @jirutka :

- [x] Allow linking against system-provided libraries (libsodium, libsqlite, libzstd) #370
- [x] Make OTLP exporter optional and allow building without Prometheus exporter (/metrics) #372

And also:

- [x] Update `.nix` files
- [x] Remove heed default-features
- [x] Bump versions of all Garage crates to 0.8.0
- [x] Make db engines (lmdb, sled, sqlite) optionnal
- [x] Add documentation for available features
- [x] Directly include code of previous versions used for migration in order to reduce dependencies
- [x] Read variable `GIT_VERSION` from garage main instead of in crate garage_util to make builds faster
- [x] Report features used in the build somewhere? (in `garage --version` or something)
- [x] Check we `warn!` correctly if we try to use deactivated feature
- [x] Allow not to launch S3 endpoint if not in config

Reviewed-on: Deuxfleurs/garage#373
2022-09-08 15:45:09 +02:00
ceb1f0229a
Move version back into util 2022-09-07 18:36:46 +02:00
f310fce34b
Inject GIT_VERSION even later 2022-09-07 18:30:15 +02:00
06df301de5
Fix merge 2022-09-07 18:16:01 +02:00
8adc654713
Merge branch 'main' into improve-deps 2022-09-07 18:13:27 +02:00
107853334b
Fix build error 2022-09-07 18:10:19 +02:00
1449204439
Add warnings when features are not included in build 2022-09-07 18:02:13 +02:00
2e00809af5
Error messages when system-libs XOR bundled-libs != 1 2022-09-07 17:57:12 +02:00
2559f63e9b
Make all HTTP services optionnal 2022-09-07 17:54:16 +02:00
28d86e7602
Report build features in garage --help 2022-09-07 17:05:21 +02:00
db61f41030
Move GIT_VERSION injection later in build chain to reduce build times 2022-09-07 11:59:56 +02:00
907054775d
Faster copy, better get error message 2022-09-06 22:25:23 +02:00
6b958979bd
Merge branch 'lx-perf-improvements' into netapp-stream-body 2022-09-06 22:13:01 +02:00
d23b3a14fc
Merge branch 'main' into lx-perf-improvements 2022-09-06 21:53:37 +02:00
4024822585
Update netapp to lastest git version with LAS scheduling 2022-09-06 19:45:00 +02:00
c2cc08852b
Reenable node ordering 2022-09-06 19:31:42 +02:00
6f02c36a89
cargo fmt 2022-09-06 17:59:41 +02:00
0f5689c169
Include code from v0.5.1 directly to remove dependencies 2022-09-06 17:52:50 +02:00
1e92e9f782
Disable k2v tests when feature is disabled 2022-09-06 17:29:46 +02:00
431dee050f
Remove opentelemetry-otlp dep in api/ 2022-09-06 17:25:44 +02:00
2c2b93acdf
Update Nix files with optional db engines 2022-09-06 17:20:10 +02:00
bbb970965c
Document available build features 2022-09-06 17:16:45 +02:00
b886c75450
Make all DB engines optional build features 2022-09-06 17:09:43 +02:00
48ffaaadfc
Bump versions to 0.8.0 (compatibility is broken already) 2022-09-06 16:47:56 +02:00
7de53a4d66
Force disable pkg-config for libsodum-sys and libzstd-sys 2022-09-06 16:41:58 +02:00
8d77a76df1
Update .nix files 2022-09-06 15:49:41 +02:00
454d8474ef
Fix clippy 2022-09-06 15:43:50 +02:00
ed7796924b Merge pull request 'Make OTLP exporter optional and allow building without Prometheus exporter (/metrics)' (#372) from jirutka/garage:telemetry-and-metrics into improve-deps
Reviewed-on: Deuxfleurs/garage#372
Reviewed-by: Alex <alex@adnab.me>
2022-09-06 15:11:30 +02:00
ea36b9ff90 Allow building without Prometheus exporter (/metrics endpoint)
prometheus and opentelemetry-prometheus add 7 extra dependencies in
total and increases the size of the garage binary by ~7 % (with
fat LTO).
2022-09-06 01:15:09 +02:00
e7af006c1c Make OTLP exporter optional via feature "telemetry-otlp"
opentelemetry-otlp add 48 (!) extra dependencies and increases the
size of the garage binary by ~11 % (with fat LTO).
2022-09-06 01:14:47 +02:00
db72812f01 Use the new cargo feature resolver "2"
Garage currently uses the legacy resolver "1". The new one is used
by default if the root package specifies 'edition = 2021', which
Garage does not (yet).

The problem with the legacy resolver is, among others, that features
enabled by dev-dependencies are propagated to normal dependencies.
This affects e.g. hyper - one of the dev-dependencies enables "http2"
feature that adds many extra dependencies. If we build garage without
opentelemetry-otlp (this is enabled in the following commit), there's
no normal dependency enabling "http2" feature.

See https://doc.rust-lang.org/cargo/reference/resolver.html#feature-resolver-version-2
2022-09-06 01:14:19 +02:00
729a910e14
Remove Heed default features 2022-09-05 16:40:13 +02:00
9f5433db82 Merge pull request 'Update .drone.yml signature' (#374) from fix-drone-signature into main
Reviewed-on: Deuxfleurs/garage#374
2022-09-05 16:18:15 +02:00
fd8074ad9b
Update .drone.yml signature 2022-09-05 16:09:01 +02:00
07e6bcde85
Merge branch 'main' into lx-perf-improvements 2022-09-05 12:40:17 +02:00
0009fd136c Merge pull request 'Make block resync speed dynamically configurable' (#369) from resync-ajustable-speed into main
Included in this PR:

- [x] Small refactor, resync code is moved to a separate `block/resync.rs` file
- [x] Block resync tranquility is no longer in config file, it is set dynamically using `garage worker set resync-tranquility` (this parameter is persisted over Garage restarts)
- [x] Up to 4 block resync workers can be activated to run simultaneously to speed up big resyncs, this parameter is set dynamically using `garage worker set resync-n-workers`

Reviewed-on: Deuxfleurs/garage#369
2022-09-05 12:35:08 +02:00
7511ba5530 Allow linking against system-provided libsqlite
Unfortunately, rusqlite uses the opposite logic for enabling/disabling
bundled libraries to others (libsodium-sys, zstd-sys). Cargo features
are very limited and doesn't allow to enable feature A in a dependency
iff feature B is disabled.

Note, lmdb-rkv-sys doesn't need any special treatment because it
automatically links against system liblmdb if found via pkgconf.

Linux distros should build garage with
`--no-default-features --features system-libs` to disable bundled-libs
and enable system-libs.
2022-09-03 19:15:57 +02:00
a6e40b75ea Add feature "system-libs" to enable linking against system libraries
If this feature is enabled, libsodium-sys and zstd-sys will link
dynamically against system-provided libraries instead of building
and linking statically the bundled (possibly outdated and vulnerable)
copies of them. This feature is intended mainly for linux package
maintainers.
2022-09-03 18:44:34 +02:00
e1751c8a9c
fix clippy 2022-09-02 17:24:26 +02:00
5d4b937a00
Ability to have up to 4 concurrently working resync workers 2022-09-02 17:18:13 +02:00
5e8baa433d
Make BlockManagerLocked fully private again 2022-09-02 16:52:22 +02:00
47be652a1f
block manager: refactor: split resync into separate file 2022-09-02 16:47:15 +02:00
943d76c583
Ability to dynamically set resync tranquility 2022-09-02 15:34:21 +02:00
6226f5ceca
Update to netapp 0.4.5 - fixed ping 2022-09-02 14:33:12 +02:00
13b5f28c7e
Make use of BytesBuf from new Netapp 2022-09-02 13:46:42 +02:00
1ef87ac4cb
cargo fmt 2022-09-02 13:38:29 +02:00
99b532b85b
Apply PRIO_SECONDARY to block data transfers 2022-09-01 16:35:43 +02:00
e648bf7b69
update cargo.nix 2022-09-01 16:31:04 +02:00
df094bd807
Less strict timeouts 2022-09-01 16:30:44 +02:00
f3bf34b6a1
update netapp: straming + fix-ping 2022-09-01 14:23:54 +02:00
bc977f9a7a
Update to Netapp with OrderTag support and exploit OrderTags 2022-09-01 12:58:20 +02:00
4b726b0941
netapp recv with unbounded channel removes deadlock 2022-09-01 09:47:28 +02:00
70231d68b2
Fix bytes_read counter 2022-08-31 19:44:27 +02:00
e598231ca4
update netapp git commit 2022-08-31 19:27:25 +02:00
c9bc9d89de
Merge branch 'lx-perf-improvements' into netapp-stream-body 2022-08-31 17:42:31 +02:00
eb97e13a6a
update cargo.nix 2022-08-31 17:42:00 +02:00
efbca67ce4
Add env filter to tracing subscriber 2022-08-31 14:39:12 +02:00
44cd98d2e4
Tracing-subscriber: write to stderr 2022-08-31 14:28:17 +02:00
dd5304f6fc
Replace logging crate pretty_env_logger by tracing_subscriber::fmt 2022-08-31 14:24:41 +02:00
322dafc761
Try to fix clippy 2022-08-29 17:32:45 +02:00
5d065b8a0f
cargo2nix fix to fetchCrateGit 2022-08-29 17:24:53 +02:00
52749e28f7
Merge branch 'lx-perf-improvements' into netapp-stream-body 2022-08-29 16:48:43 +02:00
4da67b0035
Update drone signature 2022-08-29 16:48:31 +02:00
1921f4f7e6
Merge branch 'lx-perf-improvements' into netapp-stream-body 2022-08-29 16:45:05 +02:00
ebc20a8798
Merge branch 'main' into lx-perf-improvements 2022-08-29 16:44:13 +02:00
532eca7ff9
Add some documentation for Caddy 2022-08-12 10:33:41 +02:00
2c7bae935a
Configure structopt to report the right version
By default, structopt reports the value provided by
the env var CARGO_PKG_VERSION, feeded by Cargo when reading
Cargo.toml. However for Garage we use a versioning based on git,
so we often report a version that is behind the real version.
In this commit, we create garage_util::version::garage() that
reports the right version and configure all structopt subcommands
to call this function instead of using the env var.
2022-08-11 10:21:45 +02:00
8cd02639dc
drone: set TARGET env as needed by "to_s3" func 2022-08-03 11:19:26 +02:00
e935861854
Factor out node request order selection logic & use in manager 2022-07-29 12:25:03 +02:00
f0ee3056d3
Update cargo.nix 2022-07-29 12:25:03 +02:00
126b037307
update netapp 2022-07-29 12:25:03 +02:00
33750c04ed
Update cargo.nix 2022-07-29 12:25:03 +02:00
68087ee13d
Fix clippy 2022-07-29 12:25:03 +02:00
605a630333
Use streaming in block manager 2022-07-29 12:25:02 +02:00
a35d4da721
update netapp to 0.5 2022-07-29 12:25:02 +02:00
8e7e680afe
First adaptation to WIP netapp with streaming body 2022-07-29 12:25:02 +02:00
16f6a1a65d
fix clippy 2022-07-29 12:24:49 +02:00
ad35b18bb1
Faster chunker 2022-07-29 12:24:49 +02:00
49154a78d8
Update cargo.nix 2022-07-29 12:24:48 +02:00
ff4771c36a
cargo fmt 2022-07-29 12:24:48 +02:00
381eb9a5a1
Fix tests 2022-07-29 12:24:48 +02:00
2cad656a03
More make clippy happy 2022-07-29 12:24:48 +02:00
0176da3ad2
Make clippy happy 2022-07-29 12:24:48 +02:00
40150527b8
Update cargo.nix 2022-07-29 12:24:48 +02:00
2f111e6b3d
Performance improvements:
- reduce contention on mutation_lock by having 256 of them
- better lmdb defaults
2022-07-29 12:24:48 +02:00
1b2e1296eb
Compute hashes on dedicated threads 2022-07-29 12:24:44 +02:00
a184f0d0b5
Migrate to nix-daemon builders 2022-07-29 08:37:33 +02:00
fcb04843f7
Run clippy in nix, leveraging nix caching ability 2022-07-26 18:27:52 +02:00
5fb8584247
Refactor default.nix to follow Nix Flakes patterns 2022-07-26 18:27:52 +02:00
96561c48a1
Bump Nix image to 22.05 2022-07-26 18:27:52 +02:00
a49d0ea19f
Fix: compile aarch64+armv6 as static binaries 2022-07-26 18:27:51 +02:00
9c9e483375
Put log-lines in nix.conf 2022-07-26 18:27:51 +02:00
76cb34a0ae
Fail if compiled binary is dynamic 2022-07-26 18:27:46 +02:00
ac03fa7937
Uniformize tracing::* imports (hopefully fixes 32-bit build) 2022-07-15 18:31:19 +02:00
4f38cadf6e Background task manager (#332)
- [x] New background worker trait
- [x] Adapt all current workers to use new API
- [x] Command to list currently running workers, and whether they are active, idle, or dead
- [x] Error reporting
- Optimizations
  - [x] Merkle updater: several items per iteration
  - [ ] Use `tokio::task::spawn_blocking` where appropriate so that CPU-intensive tasks don't block other things going on
- scrub:
  - [x] have only one worker with a channel to start/pause/cancel
  - [x] automatic scrub
  - [x] ability to view and change tranquility from CLI
  - [x] persistence of a few info
- [ ] Testing

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: Deuxfleurs/garage#332
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-07-08 13:30:26 +02:00
aab34bfe54
add delays in k2v test_items_and_indices 2022-07-08 10:41:57 +02:00
fe3fa83de7 Publish k2v-client crate to crates.io (#337)
Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: Deuxfleurs/garage#337
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-07-04 18:27:25 +02:00
b6d59ec19a
Fix poll item when item didn't change 2022-07-04 14:00:02 +02:00
0850bac874 Add poll command to k2v-cli (#335)
Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: Deuxfleurs/garage#335
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-07-04 12:45:32 +02:00
b74b533b7b Fix typo 2022-06-29 11:50:51 +02:00
996f2a6d58 Slides for talk at IMT Atlantique / STACK on 2022-06-23 (#333)
Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: Deuxfleurs/garage#333
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-06-23 14:28:40 +02:00
77e3fd6db2 improve internal item counter mechanisms and implement bucket quotas (#326)
- [x] Refactoring of internal counting API
- [x] Repair procedure for counters (it's an offline procedure!!!)
- [x] New counter for objects in buckets
- [x] Add quotas to buckets struct
- [x] Add CLI to manage bucket quotas
- [x] Add admin API to manage bucket quotas
- [x] Apply quotas by adding checks on put operations
- [x] Proof-read

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: Deuxfleurs/garage#326
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-06-15 20:20:28 +02:00
d544a0e0e0
Send CORS headers for all requests 2022-06-13 10:19:52 +02:00
138e13071b
Fix garage_db build on 32-bit systems 2022-06-09 14:55:20 +02:00
b44d3fc796 Abstract database behind generic interface and implement alternative drivers (#322)
- [x] Design interface
- [x] Implement Sled backend
  - [x] Re-implement the SledCountedTree hack ~~on Sled backend~~ on all backends (i.e. over the abstraction)
- [x] Convert Garage code to use generic interface
- [x] Proof-read converted Garage code
- [ ] Test everything well
- [x] Implement sqlite backend
- [x] Implement LMDB backend
- [ ] (Implement Persy backend?)
- [ ] (Implement other backends? (like RocksDB, ...))
- [x] Implement backend choice in config file and garage server module
- [x] Add CLI for converting between DB formats
- Exploit the new interface to put more things in transactions
  - [x] `.updated()` trigger on Garage tables

Fix #284

**Bugs**

- [x] When exporting sqlite, trees iterate empty??
- [x] LMDB doesn't work

**Known issues for various back-ends**

- Sled:
  - Eats all my RAM and also all my disk space
  - `.len()` has to traverse the whole table
  - Is actually quite slow on some operations
  - And is actually pretty bad code...
- Sqlite:
  - Requires a lock to be taken on all operations. The lock is also taken when iterating on a table with `.iter()`, and the lock isn't released until the iterator is dropped. This means that we must be VERY carefull to not do anything else inside a `.iter()` loop or else we will have a deadlock! Most such cases have been eliminated from the Garage codebase, but there might still be some that remain. If your Garage-over-Sqlite seems to hang/freeze, this is the reason.
  - (adapter uses a bunch of unsafe code)
- Heed (LMDB):
  - Not suited for 32-bit machines as it has to map the whole DB in memory.
  - (adpater uses a tiny bit of unsafe code)

**My recommendation:** avoid 32-bit machines and use LMDB as much as possible.

**Converting databases** is actually quite easy. For example from Sled to LMDB:

```bash
cd src/db
cargo run --features cli --bin convert -- -i path/to/garage/meta/db -a sled -o path/to/garage/meta/db.lmdb -b lmdb
```

Then, just add this to your `config.toml`:

```toml
db_engine = "lmdb"
```

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: Deuxfleurs/garage#322
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-06-08 10:01:44 +02:00
Simon C
7eed3ceda9 docs: Add Trafik reverse proxy documentation 2022-06-07 16:16:52 +02:00
Simon C
4b8f48f3c5 docs: Fix title level 2022-06-07 13:32:52 +02:00
Simon C
7d3b5585f1 docs: Add link to facilitate navigation in the documentation 2022-06-07 09:38:59 +02:00
a1abed0378
Remove useless MC_REGION env variable 2022-06-02 12:50:11 +02:00
b54a938724 Fix garage_version() now that GIT_VERSION is read in crate garage_rpc 2022-06-02 12:00:10 +02:00
ff06d3f082
Fix Content-Type headers for {admin,k2v} errors and admin responses
Fix #315
2022-05-25 17:09:33 +02:00
93eab8eaa3 Fixes to S3 compatibility page (#314)
Mention PostObject is implemented, fix english mistakes

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: Deuxfleurs/garage#314
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-05-25 16:54:44 +02:00
43ddc933f9
Update Ceph S3 endpoints compatibility 2022-05-25 15:20:08 +02:00
9f303f6308
Shorter page title 2022-05-24 15:47:42 +02:00
3be43f3372
Add lost content for Restic with Garage
Suggested-by: Quentin <quentin@deuxfleurs.fr>
2022-05-24 15:32:42 +02:00
2da448b43f
Add documentation for new Admin API and a few infos on K2V 2022-05-24 15:28:37 +02:00
b2a2d3859f K2V client improvements (#307)
- [x] Better distinguish error types
- [x] Parse error messages received from server
- [x] Remove `src/` folder layer, we don't have that for other crates

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: Deuxfleurs/garage#307
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-05-24 12:48:05 +02:00
382e74c798 First version of admin API (#298)
**Spec:**

- [x] Start writing
- [x] Specify all layout endpoints
- [x] Specify all endpoints for operations on keys
- [x] Specify all endpoints for operations on key/bucket permissions
- [x] Specify all endpoints for operations on buckets
- [x] Specify all endpoints for operations on bucket aliases

View rendered spec at <https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/admin-api/doc/drafts/admin-api.md>

**Code:**

- [x] Refactor code for admin api to use common api code that was created for K2V

**General endpoints:**

- [x] Metrics
- [x] GetClusterStatus
- [x] ConnectClusterNodes
- [x] GetClusterLayout
- [x] UpdateClusterLayout
- [x] ApplyClusterLayout
- [x] RevertClusterLayout

**Key-related endpoints:**

- [x] ListKeys
- [x] CreateKey
- [x] ImportKey
- [x] GetKeyInfo
- [x] UpdateKey
- [x] DeleteKey

**Bucket-related endpoints:**

- [x] ListBuckets
- [x] CreateBucket
- [x] GetBucketInfo
- [x] DeleteBucket
- [x] PutBucketWebsite
- [x] DeleteBucketWebsite

**Operations on key/bucket permissions:**

- [x] BucketAllowKey
- [x] BucketDenyKey

**Operations on bucket aliases:**

- [x] GlobalAliasBucket
- [x] GlobalUnaliasBucket
- [x] LocalAliasBucket
- [x] LocalUnaliasBucket

**And also:**

- [x] Separate error type for the admin API (this PR includes a quite big refactoring of error handling)
- [x] Add management of website access
- [ ] Check that nothing is missing wrt what can be done using the CLI
- [ ] Improve formatting of the spec
- [x] Make sure everyone is cool with the API design

Fix #231
Fix #295

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: Deuxfleurs/garage#298
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-05-24 12:16:39 +02:00
64c193e3db Add a K2V client library and CLI (#303)
lib.rs could use getting split in modules, but I'm not sure how exactly

Co-authored-by: trinity-1686a <trinity@deuxfleurs.fr>
Reviewed-on: Deuxfleurs/garage#303
Co-authored-by: trinity-1686a <trinity.pointard@gmail.com>
Co-committed-by: trinity-1686a <trinity.pointard@gmail.com>
2022-05-18 22:24:09 +02:00
c692f55d5c
K2V: Fix end parameter and add tests (fix #305) 2022-05-17 11:50:23 +02:00
7b474855e3
Make background runner terminate correctly 2022-05-17 11:38:31 +02:00
176715c5b2
Fix ReadIndex spec and add JSON5 remark to doc 2022-05-16 11:54:37 +02:00
5768bf3622 First implementation of K2V (#293)
**Specification:**

View spec at [this URL](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/k2v/doc/drafts/k2v-spec.md)

- [x] Specify the structure of K2V triples
- [x] Specify the DVVS format used for causality detection
- [x] Specify the K2V index (just a counter of number of values per partition key)
- [x] Specify single-item endpoints: ReadItem, InsertItem, DeleteItem
- [x] Specify index endpoint: ReadIndex
- [x] Specify multi-item endpoints: InsertBatch, ReadBatch, DeleteBatch
- [x] Move to JSON objects instead of tuples
- [x] Specify endpoints for polling for updates on single values (PollItem)

**Implementation:**

- [x] Table for K2V items, causal contexts
- [x] Indexing mechanism and table for K2V index
- [x] Make API handlers a bit more generic
- [x] K2V API endpoint
- [x] K2V API router
- [x] ReadItem
- [x] InsertItem
- [x] DeleteItem
- [x] PollItem
- [x] ReadIndex
- [x] InsertBatch
- [x] ReadBatch
- [x] DeleteBatch

**Testing:**

- [x] Just a simple Python script that does some requests to check visually that things are going right (does not contain parsing of results or assertions on returned values)
- [x] Actual tests:
  - [x] Adapt testing framework
  - [x] Simple test with InsertItem + ReadItem
  - [x] Test with several Insert/Read/DeleteItem + ReadIndex
  - [x] Test all combinations of return formats for ReadItem
  - [x] Test with ReadBatch, InsertBatch, DeleteBatch
  - [x] Test with PollItem
  - [x] Test error codes
- [ ] Fix most broken stuff
  - [x] test PollItem broken randomly
  - [x] when invalid causality tokens are given, errors should be 4xx not 5xx

**Improvements:**

- [x] Descending range queries
  - [x] Specify
  - [x] Implement
  - [x] Add test
- [x] Batch updates to index counter
- [x] Put K2V behind `k2v` feature flag

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: Deuxfleurs/garage#293
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-05-10 13:16:57 +02:00
def78c5e6f
Update netapp to 0.4.4, fix #300 2022-05-09 12:08:47 +02:00
277a20ec44 Fix layout show to not show changes when there are no changes (#297)
fixes #295, partially

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: Deuxfleurs/garage#297
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-05-09 11:14:55 +02:00
c9ef3e461b
fix clippy 2022-04-19 12:50:40 +02:00
c93008d333
Prettier code for defragmentation 2022-04-19 12:50:40 +02:00
e5341ca47b
Defragmentation in UploadPartCopy: first pass (not pretty but it compiles) 2022-04-19 12:50:36 +02:00
a4f9f19ac3 remove size limitation in UploadPartCopy (#280)
This removes the >1mb s3_copy restriction.

This restriction doesn't seem to be documented anywhere (I could be wrong). It also causes some software to fail (such as #248).

Co-authored-by: Rob Landers <landers.robert@gmail.com>
Reviewed-on: Deuxfleurs/garage#280
Co-authored-by: withinboredom <landers.robert@gmail.com>
Co-committed-by: withinboredom <landers.robert@gmail.com>
2022-04-19 12:49:43 +02:00
Baptiste Jonglez
47e57518ec Add documentation on running Kopia with Garage 2022-04-10 13:04:07 +02:00
dffcd9f4b1
update Cargo.nix 2022-04-08 14:35:09 +02:00
5d404dcd54
Add missing opentelemetry features 2022-04-08 14:21:04 +02:00
62f0715abe Add/Fix OpenTelemetry 2022-04-07 16:12:35 +02:00
7e1ac51b58 Add files to quickly test k8s 2022-04-07 16:12:35 +02:00
94f1e48fff Update to netapp 0.4.2 (a tiny fix) 2022-04-07 11:50:03 +02:00
cb5836d53c Bring maximum exponential backoff time down from 16h to 1h 2022-04-07 11:49:29 +02:00
8e3ee82c3e Be clearer on what upgrades are (not) supported 2022-04-06 21:45:59 +02:00
a122a8cb46 Add an "upgrading" section, add a guide for 0.7 2022-04-05 10:08:31 +02:00
9fd8ec1dee Add documentation for winscp+sftpgo 2022-03-31 10:25:56 +02:00
0091002ef2
New replication modes and their documentation 2022-03-28 16:26:04 +02:00
8f9cf3a5d1
fix a clippy lint 2022-03-28 15:48:55 +02:00
913f7754bb
Add blocks in errored state to garage stats 2022-03-28 15:47:23 +02:00
42dde54126
Log admin GET requests at debug level instead of info
to reduce noise in logs
2022-03-28 15:46:52 +02:00
dca2ffdf91
document administrative options 2022-03-28 12:26:08 +02:00
0cf4efac89 Compile kuberetes-discovery only when release=true 2022-03-24 16:57:43 +01:00
9d0ed78887 Add feature flag for Kubernetes discovery 2022-03-24 16:57:43 +01:00
509d256c58
Make layout optimization work in relative terms 2022-03-24 15:27:14 +01:00
2814d41842
Allow garage layout assign to assign to several nodes at once 2022-03-24 15:27:13 +01:00
7e0e2ffda2
Slight change and add comment to layout assignation algo 2022-03-24 15:27:13 +01:00
413ab0eaed
Small change to partition assignation algorithm
This change helps ensure that nodes for each partition are spread
over all datacenters, a property that wasn't ensured previously
when going from a 2 DC deployment to a 3 DC deployment
2022-03-24 15:27:10 +01:00
43945234ae
Add missing src/block to toplevel cargo.toml 2022-03-23 10:26:10 +01:00
3dc9214172
Add lots of comments on how the resync queue works
(I don't really want to change/refactor that code though)
2022-03-23 10:25:39 +01:00
077dd1cde9
Clippy 2022-03-23 10:25:39 +01:00
2d13f0aa13
run cargo2nix 2022-03-23 10:25:37 +01:00
e480aaf338
Make background tranquility a configurable parameter 2022-03-23 10:25:19 +01:00
8fd6745745
Move block RC code to separate rc.rs 2022-03-23 10:25:19 +01:00
c3982a90b6
Move DataBlock out of manager.rs 2022-03-23 10:25:19 +01:00
c1d9854d2c
Move block manager to separate module 2022-03-23 10:25:15 +01:00
8565f7dc31 cleanup 2022-03-23 10:22:37 +01:00
8db6b84559 add test for create bucket and put website with streaming signature 2022-03-23 10:22:37 +01:00
1eb7fdb08f add test framework for arbitraty S3 requests
and implement some basic test with it
2022-03-23 10:22:36 +01:00
e934934f14 garage_api: Update streaming payload stream unit tests 2022-03-23 10:22:36 +01:00
98545a16dd garage_api: Handle streaming payload early in request handling 2022-03-23 10:22:36 +01:00
822128e3c8 Talk a bit about capacity balancing between regions 2022-03-22 12:07:13 +01:00
Rune Henriksen
aea8b41728 document request routing logic 2022-03-21 12:03:57 +01:00
Rune Henriksen
71e6645e09 add short tutorial for duplicati usage with garage 2022-03-21 11:58:19 +01:00
15da2156f6 Change position of the node-id argument 2022-03-19 18:03:23 +01:00
0529f3c34d Patch cargo2nix openssl override 2022-03-17 12:17:38 +01:00
db46cdef79
Update netapp to v0.4.1 2022-03-15 17:09:57 +01:00
ba6b56ae68
Fix some new clippy lints 2022-03-14 12:27:49 +01:00
0af314b295
Add comment for fsync 2022-03-14 11:54:00 +01:00
d78bf379fb
Fix resync queue to not drop items 2022-03-14 11:51:37 +01:00
f7e6f4616f
Spawn a single resync worker 2022-03-14 11:51:37 +01:00
dc5ec4ecf9
Add appropriate fsync() calls in write_block
to ensure that data is persisted properly
2022-03-14 11:51:32 +01:00
fe62d01b7e
Implement exponential backoff for resync retries 2022-03-14 11:41:20 +01:00
bfb4353df5
Update Grafana dashboard 2022-03-14 10:55:30 +01:00
9b2b531f4d
Make admin server optional 2022-03-14 10:54:25 +01:00
a19341b188
Add Grafana dashboard for Garage 2022-03-14 10:54:25 +01:00
2377a92f6b
Add wrapper over sled tree to count items (used for big queues) 2022-03-14 10:54:25 +01:00
203e8d2c34
Bump version to 0.7 because of incompatible Netapp 2022-03-14 10:54:24 +01:00
f869ca625d
Add spans to table calls, change span names in RPC 2022-03-14 10:54:12 +01:00
0cc31ee169
add missing netapp telemetry feature 2022-03-14 10:54:11 +01:00
dc8d0496cc
Refactoring: rename config files, make modifications less invasive 2022-03-14 10:53:51 +01:00
d9a35359bf
Add metrics to web endpoint 2022-03-14 10:53:50 +01:00
2a5609b292
Add metrics to API endpoint 2022-03-14 10:53:36 +01:00
818daa5c78
Refactor how durations are measured 2022-03-14 10:53:35 +01:00
f0d0cd9a20
Remove strum crate dependency; add protobuf nix dependency 2022-03-14 10:53:00 +01:00
55d4471599
Remove ... at end of hex IDs 2022-03-14 10:52:31 +01:00
bb04d94fa9
Update to Netapp 0.4 which supports distributed tracing 2022-03-14 10:52:30 +01:00
8c2fb0c066
Add tracing integration with opentelemetry 2022-03-14 10:52:13 +01:00
b6561f6e1b
Add docker-compose for traces & metrics 2022-03-14 10:51:52 +01:00
2cab84b1fe
Add many metrics in table/ and rpc/ 2022-03-14 10:51:50 +01:00
1e2cf26373
Implement basic metrics in table 2022-03-14 10:51:17 +01:00
mricher
e349af13a7
Update dependencies and add admin module with metrics
- Global dependencies updated in Cargo.lock
- New module created in src/admin to host:
  - the (future) admin REST API
  - the metric collection
- add configuration block

No metrics implemented yet
2022-03-14 10:51:12 +01:00
9d44127245
add support for kubernetes service discovery
This commit adds support to discover garage instances running in
kubernetes.

Once enabled by setting `kubernetes_namespace` and
`kubernetes_service_name` garage will create a Custom Resources
`garagenodes.deuxfleurs.fr` with nodes public key as the resource name.
and IP and Port information as spec in the namespace configured by
`kubernetes_namespace`.

For discovering nodes the resources are filtered with the optionally set
`kubernetes_service_name` which sets a label
`garage.deuxfleurs.fr/service` on the resources.

This allows to separate multiple garage deployments in a single
namespace.

the `kubernetes_skip_crd` variable allows to disable the creation of the
CRD by garage itself. The user must deploy this manually.
2022-03-12 13:05:52 +01:00
251 changed files with 37709 additions and 6121 deletions

View file

@ -2,68 +2,26 @@
kind: pipeline
name: default
workspace:
base: /drone/garage
volumes:
- name: nix_store
host:
path: /var/lib/drone/nix
- name: nix_config
temp: {}
environment:
HOME: /drone/garage
node:
nix-daemon: 1
steps:
- name: setup nix
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
- name: check formatting
image: nixpkgs/nix:nixos-22.05
commands:
- cp nix/nix.conf /etc/nix/nix.conf
- nix-build --no-build-output --no-out-link shell.nix --arg release false -A inputDerivation
- name: code quality
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
commands:
- nix-shell --arg release false --run "cargo fmt -- --check"
- nix-shell --arg release false --run "cargo clippy -- --deny warnings"
- nix-shell --attr rust --run "cargo fmt -- --check"
- name: build
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
image: nixpkgs/nix:nixos-22.05
commands:
- nix-build --no-build-output --option log-lines 100 --argstr target x86_64-unknown-linux-musl --arg release false --argstr git_version $DRONE_COMMIT
- nix-build --no-build-output --attr clippy.amd64 --argstr git_version ${DRONE_TAG:-$DRONE_COMMIT}
- name: unit + func tests
image: nixpkgs/nix:nixos-21.05
image: nixpkgs/nix:nixos-22.05
environment:
GARAGE_TEST_INTEGRATION_EXE: result/bin/garage
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
commands:
- |
nix-build \
--no-build-output \
--option log-lines 100 \
--argstr target x86_64-unknown-linux-musl \
--argstr compileMode test
- nix-build --no-build-output --attr test.amd64
- ./result/bin/garage_api-*
- ./result/bin/garage_model-*
- ./result/bin/garage_rpc-*
@ -73,16 +31,11 @@ steps:
- ./result/bin/garage-*
- ./result/bin/integration-*
- name: smoke-test
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
- name: integration tests
image: nixpkgs/nix:nixos-22.05
commands:
- nix-build --no-build-output --argstr target x86_64-unknown-linux-musl --arg release false --argstr git_version $DRONE_COMMIT
- nix-shell --arg release false --run ./script/test-smoke.sh || (cat /tmp/garage.log; false)
- nix-build --no-build-output --attr clippy.amd64 --argstr git_version ${DRONE_TAG:-$DRONE_COMMIT}
- nix-shell --attr integration --run ./script/test-smoke.sh || (cat /tmp/garage.log; false)
trigger:
event:
@ -92,78 +45,39 @@ trigger:
- tag
- cron
node:
nix: 1
---
kind: pipeline
type: docker
name: release-linux-x86_64
name: release-linux-amd64
volumes:
- name: nix_store
host:
path: /var/lib/drone/nix
- name: nix_config
temp: {}
environment:
TARGET: x86_64-unknown-linux-musl
node:
nix-daemon: 1
steps:
- name: setup nix
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
commands:
- cp nix/nix.conf /etc/nix/nix.conf
- nix-build --no-build-output --no-out-link shell.nix -A inputDerivation
- name: build
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
image: nixpkgs/nix:nixos-22.05
commands:
- nix-build --no-build-output --argstr target $TARGET --arg release true --argstr git_version $DRONE_COMMIT
- nix-build --no-build-output --attr pkgs.amd64.release --argstr git_version ${DRONE_TAG:-$DRONE_COMMIT}
- nix-shell --attr rust --run "./script/not-dynamic.sh result/bin/garage"
- name: integration
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
image: nixpkgs/nix:nixos-22.05
commands:
- nix-shell --run ./script/test-smoke.sh || (cat /tmp/garage.log; false)
- nix-shell --attr integration --run ./script/test-smoke.sh || (cat /tmp/garage.log; false)
- name: push static binary
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
image: nixpkgs/nix:nixos-22.05
environment:
AWS_ACCESS_KEY_ID:
from_secret: garagehq_aws_access_key_id
AWS_SECRET_ACCESS_KEY:
from_secret: garagehq_aws_secret_access_key
TARGET: "x86_64-unknown-linux-musl"
commands:
- nix-shell --arg rust false --arg integration false --run "to_s3"
- nix-shell --attr release --run "to_s3"
- name: docker build and publish
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
image: nixpkgs/nix:nixos-22.05
environment:
DOCKER_AUTH:
from_secret: docker_auth
@ -174,7 +88,7 @@ steps:
- mkdir -p /kaniko/.docker
- echo $DOCKER_AUTH > /kaniko/.docker/config.json
- export CONTAINER_TAG=${DRONE_TAG:-$DRONE_COMMIT}
- nix-shell --arg rust false --arg integration false --run "to_docker"
- nix-shell --attr release --run "to_docker"
trigger:
@ -182,78 +96,39 @@ trigger:
- promote
- cron
node:
nix: 1
---
kind: pipeline
type: docker
name: release-linux-i686
name: release-linux-i386
volumes:
- name: nix_store
host:
path: /var/lib/drone/nix
- name: nix_config
temp: {}
environment:
TARGET: i686-unknown-linux-musl
node:
nix-daemon: 1
steps:
- name: setup nix
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
commands:
- cp nix/nix.conf /etc/nix/nix.conf
- nix-build --no-build-output --no-out-link shell.nix -A inputDerivation
- name: build
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
image: nixpkgs/nix:nixos-22.05
commands:
- nix-build --no-build-output --argstr target $TARGET --arg release true --argstr git_version $DRONE_COMMIT
- nix-build --no-build-output --attr pkgs.i386.release --argstr git_version ${DRONE_TAG:-$DRONE_COMMIT}
- nix-shell --attr rust --run "./script/not-dynamic.sh result/bin/garage"
- name: integration
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
image: nixpkgs/nix:nixos-22.05
commands:
- nix-shell --run ./script/test-smoke.sh || (cat /tmp/garage.log; false)
- nix-shell --attr integration --run ./script/test-smoke.sh || (cat /tmp/garage.log; false)
- name: push static binary
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
image: nixpkgs/nix:nixos-22.05
environment:
AWS_ACCESS_KEY_ID:
from_secret: garagehq_aws_access_key_id
AWS_SECRET_ACCESS_KEY:
from_secret: garagehq_aws_secret_access_key
TARGET: "i686-unknown-linux-musl"
commands:
- nix-shell --arg rust false --arg integration false --run "to_s3"
- nix-shell --attr release --run "to_s3"
- name: docker build and publish
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
image: nixpkgs/nix:nixos-22.05
environment:
DOCKER_AUTH:
from_secret: docker_auth
@ -264,75 +139,41 @@ steps:
- mkdir -p /kaniko/.docker
- echo $DOCKER_AUTH > /kaniko/.docker/config.json
- export CONTAINER_TAG=${DRONE_TAG:-$DRONE_COMMIT}
- nix-shell --arg rust false --arg integration false --run "to_docker"
- nix-shell --attr release --run "to_docker"
trigger:
event:
- promote
- cron
node:
nix: 1
---
kind: pipeline
type: docker
name: release-linux-aarch64
name: release-linux-arm64
volumes:
- name: nix_store
host:
path: /var/lib/drone/nix
- name: nix_config
temp: {}
environment:
TARGET: aarch64-unknown-linux-musl
node:
nix-daemon: 1
steps:
- name: setup nix
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
commands:
- cp nix/nix.conf /etc/nix/nix.conf
- nix-build --no-build-output --no-out-link ./shell.nix --arg rust false --arg integration false -A inputDerivation
- name: build
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
image: nixpkgs/nix:nixos-22.05
commands:
- nix-build --no-build-output --argstr target $TARGET --arg release true --argstr git_version $DRONE_COMMIT
- nix-build --no-build-output --attr pkgs.arm64.release --argstr git_version ${DRONE_TAG:-$DRONE_COMMIT}
- nix-shell --attr rust --run "./script/not-dynamic.sh result/bin/garage"
- name: push static binary
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
image: nixpkgs/nix:nixos-22.05
environment:
AWS_ACCESS_KEY_ID:
from_secret: garagehq_aws_access_key_id
AWS_SECRET_ACCESS_KEY:
from_secret: garagehq_aws_secret_access_key
TARGET: "aarch64-unknown-linux-musl"
commands:
- nix-shell --arg rust false --arg integration false --run "to_s3"
- nix-shell --attr release --run "to_s3"
- name: docker build and publish
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
image: nixpkgs/nix:nixos-22.05
environment:
DOCKER_AUTH:
from_secret: docker_auth
@ -343,75 +184,41 @@ steps:
- mkdir -p /kaniko/.docker
- echo $DOCKER_AUTH > /kaniko/.docker/config.json
- export CONTAINER_TAG=${DRONE_TAG:-$DRONE_COMMIT}
- nix-shell --arg rust false --arg integration false --run "to_docker"
- nix-shell --attr release --run "to_docker"
trigger:
event:
- promote
- cron
node:
nix: 1
---
kind: pipeline
type: docker
name: release-linux-armv6l
name: release-linux-arm
volumes:
- name: nix_store
host:
path: /var/lib/drone/nix
- name: nix_config
temp: {}
environment:
TARGET: armv6l-unknown-linux-musleabihf
node:
nix-daemon: 1
steps:
- name: setup nix
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
commands:
- cp nix/nix.conf /etc/nix/nix.conf
- nix-build --no-build-output --no-out-link --arg rust false --arg integration false -A inputDerivation
- name: build
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
image: nixpkgs/nix:nixos-22.05
commands:
- nix-build --no-build-output --argstr target $TARGET --arg release true --argstr git_version $DRONE_COMMIT
- nix-build --no-build-output --attr pkgs.arm.release --argstr git_version ${DRONE_TAG:-$DRONE_COMMIT}
- nix-shell --attr rust --run "./script/not-dynamic.sh result/bin/garage"
- name: push static binary
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
image: nixpkgs/nix:nixos-22.05
environment:
AWS_ACCESS_KEY_ID:
from_secret: garagehq_aws_access_key_id
AWS_SECRET_ACCESS_KEY:
from_secret: garagehq_aws_secret_access_key
TARGET: "armv6l-unknown-linux-musleabihf"
commands:
- nix-shell --arg integration false --arg rust false --run "to_s3"
- nix-shell --attr release --run "to_s3"
- name: docker build and publish
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
- name: nix_config
path: /etc/nix
image: nixpkgs/nix:nixos-22.05
environment:
DOCKER_AUTH:
from_secret: docker_auth
@ -422,32 +229,24 @@ steps:
- mkdir -p /kaniko/.docker
- echo $DOCKER_AUTH > /kaniko/.docker/config.json
- export CONTAINER_TAG=${DRONE_TAG:-$DRONE_COMMIT}
- nix-shell --arg rust false --arg integration false --run "to_docker"
- nix-shell --attr release --run "to_docker"
trigger:
event:
- promote
- cron
node:
nix: 1
---
kind: pipeline
type: docker
name: refresh-release-page
volumes:
- name: nix_store
host:
path: /var/lib/drone/nix
node:
nix-daemon: 1
steps:
- name: refresh-index
image: nixpkgs/nix:nixos-21.05
volumes:
- name: nix_store
path: /nix
image: nixpkgs/nix:nixos-22.05
environment:
AWS_ACCESS_KEY_ID:
from_secret: garagehq_aws_access_key_id
@ -455,24 +254,21 @@ steps:
from_secret: garagehq_aws_secret_access_key
commands:
- mkdir -p /etc/nix && cp nix/nix.conf /etc/nix/nix.conf
- nix-shell --arg integration false --arg rust false --run "refresh_index"
- nix-shell --attr release --run "refresh_index"
depends_on:
- release-linux-x86_64
- release-linux-i686
- release-linux-aarch64
- release-linux-armv6l
- release-linux-amd64
- release-linux-i386
- release-linux-arm64
- release-linux-arm
trigger:
event:
- promote
- cron
node:
nix: 1
---
kind: signature
hmac: 3fc19d6f9a3555519c8405e3281b2e74289bb802f644740d5481d53df3a01fa4
hmac: 362639b4c9541ad9bd06ff7f72b5235b2b0216bcb16eececd25285b6fe94ba6f
...

2554
Cargo.lock generated

File diff suppressed because it is too large Load diff

4376
Cargo.nix

File diff suppressed because it is too large Load diff

View file

@ -1,14 +1,20 @@
[workspace]
resolver = "2"
members = [
"src/db",
"src/util",
"src/rpc",
"src/table",
"src/block",
"src/model",
"src/api",
"src/web",
"src/garage",
"src/k2v-client",
]
default-members = ["src/garage"]
[profile.dev]
lto = "off"

View file

@ -1,13 +1,27 @@
.PHONY: doc all release shell
.PHONY: doc all release shell run1 run2 run3
all:
clear; cargo build
doc:
cd doc/book; mdbook build
release:
nix-build --arg release true
shell:
nix-shell
# ----
run1:
RUST_LOG=garage=debug ./target/debug/garage -c tmp/config1.toml server
run1rel:
RUST_LOG=garage=debug ./target/release/garage -c tmp/config1.toml server
run2:
RUST_LOG=garage=debug ./target/debug/garage -c tmp/config2.toml server
run2rel:
RUST_LOG=garage=debug ./target/release/garage -c tmp/config2.toml server
run3:
RUST_LOG=garage=debug ./target/debug/garage -c tmp/config3.toml server
run3rel:
RUST_LOG=garage=debug ./target/release/garage -c tmp/config3.toml server

View file

@ -1,125 +1,33 @@
{
system ? builtins.currentSystem,
release ? false,
target ? "x86_64-unknown-linux-musl",
compileMode ? null,
git_version ? null,
}:
with import ./nix/common.nix;
let
crossSystem = { config = target; };
in let
log = v: builtins.trace v v;
pkgs = import pkgsSrc {
inherit system crossSystem;
overlays = [ cargo2nixOverlay ];
};
/*
Rust and Nix triples are not the same. Cargo2nix has a dedicated library
to convert Nix triples to Rust ones. We need this conversion as we want to
set later options linked to our (rust) target in a generic way. Not only
the triple terminology is different, but also the "roles" are named differently.
Nix uses a build/host/target terminology where Nix's "host" maps to Cargo's "target".
*/
rustTarget = log (pkgs.rustBuilder.rustLib.rustTriple pkgs.stdenv.hostPlatform);
/*
Cargo2nix is built for rustOverlay which installs Rust from Mozilla releases.
We want our own Rust to avoid incompatibilities, like we had with musl 1.2.0.
rustc was built with musl < 1.2.0 and nix shipped musl >= 1.2.0 which lead to compilation breakage.
So we want a Rust release that is bound to our Nix repository to avoid these problems.
See here for more info: https://musl.libc.org/time64.html
Because Cargo2nix does not support the Rust environment shipped by NixOS,
we emulate the structure of the Rust object created by rustOverlay.
In practise, rustOverlay ships rustc+cargo in a single derivation while
NixOS ships them in separate ones. We reunite them with symlinkJoin.
*/
rustChannel = pkgs.symlinkJoin {
name ="rust-channel";
paths = [
pkgs.rustPlatform.rust.rustc
pkgs.rustPlatform.rust.cargo
];
};
overrides = pkgs.rustBuilder.overrides.all ++ [
/*
[1] We need to alter Nix hardening to be able to statically compile: PIE,
Position Independent Executables seems to be supported only on amd64. Having
this flags set either make our executables crash or compile as dynamic on many platforms.
In the following section codegenOpts, we reactive it for the supported targets
(only amd64 curently) through the `-static-pie` flag. PIE is a feature used
by ASLR, which helps mitigate security issues.
Learn more about Nix Hardening: https://github.com/NixOS/nixpkgs/blob/master/pkgs/build-support/cc-wrapper/add-hardening.sh
[2] We want to inject the git version while keeping the build deterministic.
As we do not want to consider the .git folder as part of the input source,
we ask the user (the CI often) to pass the value to Nix.
*/
(pkgs.rustBuilder.rustLib.makeOverride {
name = "garage";
overrideAttrs = drv:
/* [1] */ { hardeningDisable = [ "pie" ]; }
//
/* [2] */ (if git_version != null then {
preConfigure = ''
${drv.preConfigure or ""}
export GIT_VERSION="${git_version}"
'';
} else {});
})
];
packageFun = import ./Cargo.nix;
/*
We compile fully static binaries with musl to simplify deployment on most systems.
When possible, we reactivate PIE hardening (see above).
Also, if you set the RUSTFLAGS environment variable, the following parameters will
be ignored.
For more information on static builds, please refer to Rust's RFC 1721.
https://rust-lang.github.io/rfcs/1721-crt-static.html#specifying-dynamicstatic-c-runtime-linkage
*/
codegenOpts = {
"armv6l-unknown-linux-musleabihf" = [ "target-feature=+crt-static" "link-arg=-static" ]; /* compile as dynamic with static-pie */
"aarch64-unknown-linux-musl" = [ "target-feature=+crt-static" "link-arg=-static" ]; /* segfault with static-pie */
"i686-unknown-linux-musl" = [ "target-feature=+crt-static" "link-arg=-static" ]; /* segfault with static-pie */
"x86_64-unknown-linux-musl" = [ "target-feature=+crt-static" "link-arg=-static-pie" ];
};
/*
The following definition is not elegant as we use a low level function of Cargo2nix
that enables us to pass our custom rustChannel object. We need this low level definition
to pass Nix's Rust toolchains instead of Mozilla's one.
target is mandatory but must be kept to null to allow cargo2nix to set it to the appropriate value
for each crate.
*/
rustPkgs = pkgs.rustBuilder.makePackageSet {
inherit packageFun rustChannel release codegenOpts;
packageOverrides = overrides;
target = null;
buildRustPackages = pkgs.buildPackages.rustBuilder.makePackageSet {
inherit rustChannel packageFun codegenOpts;
packageOverrides = overrides;
target = null;
};
};
in
if compileMode == "test"
then pkgs.symlinkJoin {
pkgs = import pkgsSrc { };
compile = import ./nix/compile.nix;
build_debug_and_release = (target: {
debug = (compile { inherit target git_version; release = false; }).workspace.garage { compileMode = "build"; };
release = (compile { inherit target git_version; release = true; }).workspace.garage { compileMode = "build"; };
});
test = (rustPkgs: pkgs.symlinkJoin {
name ="garage-tests";
paths = builtins.map (key: rustPkgs.workspace.${key} { inherit compileMode; }) (builtins.attrNames rustPkgs.workspace);
paths = builtins.map (key: rustPkgs.workspace.${key} { compileMode = "test"; }) (builtins.attrNames rustPkgs.workspace);
});
in {
pkgs = {
amd64 = build_debug_and_release "x86_64-unknown-linux-musl";
i386 = build_debug_and_release "i686-unknown-linux-musl";
arm64 = build_debug_and_release "aarch64-unknown-linux-musl";
arm = build_debug_and_release "armv6l-unknown-linux-musleabihf";
};
test = {
amd64 = test (compile { inherit git_version; target = "x86_64-unknown-linux-musl"; });
};
clippy = {
amd64 = (compile { inherit git_version; compiler = "clippy"; }).workspace.garage { compileMode = "build"; } ;
};
}
else rustPkgs.workspace.garage { inherit compileMode; }

View file

@ -17,6 +17,61 @@ If you still want to use Borg, you can use it with `rclone mount`.
## Restic
Create your key and bucket:
```bash
garage key new my-key
garage bucket create backup
garage bucket allow backup --read --write --key my-key
```
Then register your Key ID and Secret key in your environment:
```bash
export AWS_ACCESS_KEY_ID=GKxxx
export AWS_SECRET_ACCESS_KEY=xxxx
```
Configure restic from environment too:
```bash
export RESTIC_REPOSITORY="s3:http://localhost:3900/backups"
echo "Generated password (save it safely): $(openssl rand -base64 32)"
export RESTIC_PASSWORD=xxx # copy paste your generated password here
```
Do not forget to save your password safely (in your password manager or print it). It will be needed to decrypt your backups.
Now you can use restic:
```bash
# Initialize the bucket, must be run once
restic init
# Backup your PostgreSQL database
# (We suppose your PostgreSQL daemon is stopped for all commands)
restic backup /var/lib/postgresql
# Show backup history
restic snapshots
# Backup again your PostgreSQL database, it will be faster as only changes will be uploaded
restic backup /var/lib/postgresql
# Show backup history (again)
restic snapshots
# Restore a backup
# (79766175 is the ID of the snapshot you want to restore)
mv /var/lib/postgresql /var/lib/postgresql.broken
restic restore 79766175 --target /var/lib/postgresql
```
Restic has way more features than the ones presented here.
You can discover all of them by accessing its documentation from the link below.
*External links:* [Restic Documentation > Amazon S3](https://restic.readthedocs.io/en/stable/030_preparing_a_new_repo.html#amazon-s3)
## Duplicity
@ -25,7 +80,22 @@ If you still want to use Borg, you can use it with `rclone mount`.
## Duplicati
*External links:* [Duplicati Documentation > Storage Providers](https://github.com/kees-z/DuplicatiDocs/blob/master/docs/05-storage-providers.md#user-content-s3-compatible)
*External links:* [Duplicati Documentation > Storage Providers](https://duplicati.readthedocs.io/en/latest/05-storage-providers/#s3-compatible)
The following fields need to be specified:
```
Storage Type: S3 Compatible
Use SSL: [ ] # Only if you have SSL
Server: Custom server url (s3.garage.localhost:3900)
Bucket name: bucket-name
Bucket create region: Custom region value (garage) # Or as you've specified in garage.toml
AWS Access ID: Key ID from "garage key info key-name"
AWS Access Key: Secret key from "garage key info key-name"
Client Library to use: Minio SDK
```
Click `Test connection` and then no when asked `The bucket name should start with your username, prepend automatically?`. Then it should say `Connection worked!`.
## knoxite
@ -35,3 +105,24 @@ If you still want to use Borg, you can use it with `rclone mount`.
*External links:* [Kopia Documentation > Repositories](https://kopia.io/docs/repositories/#amazon-s3)
To create the Kopia repository, you need to specify the region, the HTTP(S) endpoint, the bucket name and the access keys.
For instance, if you have an instance of garage running on `https://garage.example.com`:
```
kopia repository create s3 --region=garage --bucket=mybackups --access-key=KEY_ID --secret-access-key=SECRET_KEY --endpoint=garage.example.com
```
Or if you have an instance running on localhost, without TLS:
```
kopia repository create s3 --region=garage --bucket=mybackups --access-key=KEY_ID --secret-access-key=SECRET_KEY --endpoint=localhost:3900 --disable-tls
```
After the repository has been created, check that everything works as expected:
```
kopia repository validate-provider
```
You can then run all the standard kopia commands: `kopia snapshot create`, `kopia mount`...
Everything should work out-of-the-box.

View file

@ -13,7 +13,8 @@ These tools are particularly suitable for debug, backups, website deployments or
| [rclone](#rclone) | ✅ | |
| [s3cmd](#s3cmd) | ✅ | |
| [(Cyber)duck](#cyberduck) | ✅ | |
| [WinSCP (libs3)](#winscp) | ✅ | No instructions yet |
| [WinSCP (libs3)](#winscp) | ✅ | CLI instructions only |
| [sftpgo](#sftpgo) | ✅ | |
## Minio client
@ -281,5 +282,59 @@ duck --delete garage:/my-files/an-object.txt
## WinSCP (libs3) {#winscp}
*No instruction yet. You can find ones in french [in our wiki](https://wiki.deuxfleurs.fr/fr/Guide/Garage/WinSCP).*
*You can find instructions on how to use the GUI in french [in our wiki](https://wiki.deuxfleurs.fr/fr/Guide/Garage/WinSCP).*
How to use `winscp.com`, the CLI interface of WinSCP:
```
open s3://GKxxxxx:yyyyyyy@127.0.0.1:4443 -certificate=* -rawsettings S3DefaultRegion=garage S3UrlStyle=1
ls
ls my-files/
get my-files/an-object.txt Z:\tmp\object.txt
put Z:\tmp\object.txt my-files/another-object.txt
rm my-files/an-object
exit
```
Notes:
- It seems WinSCP supports only TLS connections for S3
- `-certificate=*` allows self-signed certificates, remove it if you have valid certificates
## sftpgo {#sftpgo}
sftpgo needs a database to work, by default it uses sqlite and does not require additional configuration.
You can then directly init it:
```
sftpgo initprovider
```
Then you can directly launch the daemon that will listen by default on `:8080 (http)` and `:2022 (ssh)`:
```
sftpgo serve
```
Go to the admin web interface (http://[::1]:8080/web/admin/), create the required admin account, then create a user account.
Choose a username (eg: `ada`) and a password.
In the filesystem section, choose:
- Storage: AWS S3 (Compatible)
- Bucket: *your bucket name*
- Region: `garage` (or the one you defined in `config.toml`)
- Access key: *your access key*
- Access secret: *your secret key*
- Endpoint: *your endpoint*, eg. `https://garage.example.tld`, note that the protocol (`https` here) must be specified. Non standard ports and `http` have not been tested yet.
- Keep the default values for other fields
- Tick "Use path-style addressing". It should work without ticking it if you have correctly configured your instance to use URL vhost-style.
Now you can access your bucket through SFTP:
```
sftp -P2022 ada@[::1]
ls
```
And through the web interface at http://[::1]:8080/web/client

View file

@ -3,7 +3,7 @@ title = "Websites (Hugo, Jekyll, Publii...)"
weight = 10
+++
Garage is also suitable to host static websites.
Garage is also suitable [to host static websites](@/documentation/cookbook/exposing-websites.md).
While they can be deployed with traditional CLI tools, some static website generators have integrated options to ease your workflow.
| Name | Status | Note |

View file

@ -20,6 +20,24 @@ sudo apt-get update
sudo apt-get install build-essential
```
## Using source from the Gitea repository (recommended)
The primary location for Garage's source code is the
[Gitea repository](https://git.deuxfleurs.fr/Deuxfleurs/garage).
Clone the repository and build Garage with the following commands:
```bash
git clone https://git.deuxfleurs.fr/Deuxfleurs/garage.git
cd garage
cargo build
```
Be careful, as this will make a debug build of Garage, which will be extremely slow!
To make a release build, invoke `cargo build --release` (this takes much longer).
The binaries built this way are found in `target/{debug,release}/garage`.
## Using source from `crates.io`
Garage's source code is published on `crates.io`, Rust's official package repository.
@ -39,21 +57,20 @@ sudo cp $HOME/.cargo/bin/garage /usr/local/bin/garage
```
## Using source from the Gitea repository
## Selecting features to activate in your build
The primary location for Garage's source code is the
[Gitea repository](https://git.deuxfleurs.fr/Deuxfleurs/garage).
Clone the repository and build Garage with the following commands:
```bash
git clone https://git.deuxfleurs.fr/Deuxfleurs/garage.git
cd garage
cargo build
```
Be careful, as this will make a debug build of Garage, which will be extremely slow!
To make a release build, invoke `cargo build --release` (this takes much longer).
The binaries built this way are found in `target/{debug,release}/garage`.
Garage supports a number of compilation options in the form of Cargo features,
which can be used to provide builds adapted to your system and your use case.
The following features are available:
| Feature | Enabled | Description |
| ------- | ------- | ----------- |
| `bundled-libs` | BY DEFAULT | Use bundled version of sqlite3, zstd, lmdb and libsodium |
| `system-libs` | optional | Use system version of sqlite3, zstd, lmdb and libsodium if available (exclusive with `bundled-libs`, build using `cargo build --no-default-features --features system-libs`) |
| `k2v` | optional | Enable the experimental K2V API (if used, all nodes on your Garage cluster must have it enabled as well) |
| `kubernetes-discovery` | optional | Enable automatic registration and discovery of cluster nodes through the Kubernetes API |
| `metrics` | BY DEFAULT | Enable collection of metrics in Prometheus format on the admin API |
| `telemetry-otlp` | optional | Enable collection of execution traces using OpenTelemetry |
| `sled` | BY DEFAULT | Enable using Sled to store Garage's metadata |
| `lmdb` | optional | Enable using LMDB to store Garage's metadata |
| `sqlite` | optional | Enable using Sqlite3 to store Garage's metadata |

View file

@ -23,7 +23,7 @@ To run a real-world deployment, make sure the following conditions are met:
- Ideally, each machine should have a SSD available in addition to the HDD you are dedicating
to Garage. This will allow for faster access to metadata and has the potential
to drastically reduce Garage's response times.
to significantly reduce Garage's response times.
- This guide will assume you are using Docker containers to deploy Garage on each node.
Garage can also be run independently, for instance as a [Systemd service](@/documentation/cookbook/systemd.md).
@ -35,12 +35,19 @@ For our example, we will suppose the following infrastructure with IPv6 connecti
| Location | Name | IP Address | Disk Space |
|----------|---------|------------|------------|
| Paris | Mercury | fc00:1::1 | 1 To |
| Paris | Venus | fc00:1::2 | 2 To |
| London | Earth | fc00:B::1 | 2 To |
| Brussels | Mars | fc00:F::1 | 1.5 To |
| Paris | Mercury | fc00:1::1 | 1 TB |
| Paris | Venus | fc00:1::2 | 2 TB |
| London | Earth | fc00:B::1 | 2 TB |
| Brussels | Mars | fc00:F::1 | 1.5 TB |
Note that Garage will **always** store the three copies of your data on nodes at different
locations. This means that in the case of this small example, the available capacity
of the cluster is in fact only 1.5 TB, because nodes in Brussels can't store more than that.
This also means that nodes in Paris and London will be under-utilized.
To make better use of the available hardware, you should ensure that the capacity
available in the different locations of your cluster is roughly the same.
For instance, here, the Mercury node could be moved to Brussels; this would allow the cluster
to store 2 TB of data in total.
## Get a Docker image
@ -208,10 +215,10 @@ For our example, we will suppose we have the following infrastructure
| Location | Name | Disk Space | `Capacity` | `Identifier` | `Zone` |
|----------|---------|------------|------------|--------------|--------------|
| Paris | Mercury | 1 To | `10` | `563e` | `par1` |
| Paris | Venus | 2 To | `20` | `86f0` | `par1` |
| London | Earth | 2 To | `20` | `6814` | `lon1` |
| Brussels | Mars | 1.5 To | `15` | `212f` | `bru1` |
| Paris | Mercury | 1 TB | `10` | `563e` | `par1` |
| Paris | Venus | 2 TB | `20` | `86f0` | `par1` |
| London | Earth | 2 TB | `20` | `6814` | `lon1` |
| Brussels | Mars | 1.5 TB | `15` | `212f` | `bru1` |
#### Node identifiers
@ -261,10 +268,10 @@ have 66% chance of being stored by Venus and 33% chance of being stored by Mercu
Given the information above, we will configure our cluster as follow:
```bash
garage layout assign -z par1 -c 10 -t mercury 563e
garage layout assign -z par1 -c 20 -t venus 86f0
garage layout assign -z lon1 -c 20 -t earth 6814
garage layout assign -z bru1 -c 15 -t mars 212f
garage layout assign 563e -z par1 -c 10 -t mercury
garage layout assign 86f0 -z par1 -c 20 -t venus
garage layout assign 6814 -z lon1 -c 20 -t earth
garage layout assign 212f -z bru1 -c 15 -t mars
```
At this point, the changes in the cluster layout have not yet been applied.

View file

@ -100,7 +100,7 @@ server {
}
```
## Exposing the web endpoint
### Exposing the web endpoint
To better understand the logic involved, you can refer to the [Exposing buckets as websites](/cookbook/exposing_websites.html) section.
Otherwise, the configuration is very similar to the S3 endpoint.
@ -140,6 +140,165 @@ server {
@TODO
## Traefik
## Traefik v2
We will see in this part how to set up a reverse proxy with [Traefik](https://docs.traefik.io/).
Here is [a basic configuration file](https://doc.traefik.io/traefik/https/acme/#configuration-examples):
```toml
[entryPoints]
[entryPoints.web]
address = ":80"
[entryPoints.websecure]
address = ":443"
[certificatesResolvers.myresolver.acme]
email = "your-email@example.com"
storage = "acme.json"
[certificatesResolvers.myresolver.acme.httpChallenge]
# used during the challenge
entryPoint = "web"
```
### Add Garage service
To add Garage on Traefik you should declare a new service using its IP address (or hostname) and port:
```toml
[http.services]
[http.services.my_garage_service.loadBalancer]
[[http.services.my_garage_service.loadBalancer.servers]]
url = "http://xxx.xxx.xxx.xxx"
port = 3900
```
It's possible to declare multiple Garage servers as back-ends:
```toml
[http.services]
[[http.services.my_garage_service.loadBalancer.servers]]
url = "http://xxx.xxx.xxx.xxx"
port = 3900
[[http.services.my_garage_service.loadBalancer.servers]]
url = "http://yyy.yyy.yyy.yyy"
port = 3900
[[http.services.my_garage_service.loadBalancer.servers]]
url = "http://zzz.zzz.zzz.zzz"
port = 3900
```
Traefik can remove unhealthy servers automatically with [a health check configuration](https://doc.traefik.io/traefik/routing/services/#health-check):
```
[http.services]
[http.services.my_garage_service.loadBalancer]
[http.services.my_garage_service.loadBalancer.healthCheck]
path = "/"
interval = "60s"
timeout = "5s"
```
### Adding a website
To add a new website, add the following declaration to your Traefik configuration file:
```toml
[http.routers]
[http.routers.my_website]
rule = "Host(`yoururl.example.org`)"
service = "my_garage_service"
entryPoints = ["web"]
```
Enable HTTPS access to your website with the following configuration section ([documentation](https://doc.traefik.io/traefik/https/overview/)):
```toml
...
entryPoints = ["websecure"]
[http.routers.my_website.tls]
certResolver = "myresolver"
...
```
### Adding gzip compression
Add the following configuration section [to compress response](https://doc.traefik.io/traefik/middlewares/http/compress/) using [gzip](https://developer.mozilla.org/en-US/docs/Glossary/GZip_compression) before sending them to the client:
```toml
[http.routers]
[http.routers.my_website]
...
middlewares = ["gzip_compress"]
...
[http.middlewares]
[http.middlewares.gzip_compress.compress]
```
### Add caching response
Traefik's caching middleware is only available on [entreprise version](https://doc.traefik.io/traefik-enterprise/middlewares/http-cache/), however the freely-available [Souin plugin](https://github.com/darkweak/souin#tr%C3%A6fik-container) can also do the job. (section to be completed)
### Complete example
```toml
[entryPoints]
[entryPoints.web]
address = ":80"
[entryPoints.websecure]
address = ":443"
[certificatesResolvers.myresolver.acme]
email = "your-email@example.com"
storage = "acme.json"
[certificatesResolvers.myresolver.acme.httpChallenge]
# used during the challenge
entryPoint = "web"
[http.routers]
[http.routers.my_website]
rule = "Host(`yoururl.example.org`)"
service = "my_garage_service"
middlewares = ["gzip_compress"]
entryPoints = ["websecure"]
[http.services]
[http.services.my_garage_service.loadBalancer]
[http.services.my_garage_service.loadBalancer.healthCheck]
path = "/"
interval = "60s"
timeout = "5s"
[[http.services.my_garage_service.loadBalancer.servers]]
url = "http://xxx.xxx.xxx.xxx"
[[http.services.my_garage_service.loadBalancer.servers]]
url = "http://yyy.yyy.yyy.yyy"
[[http.services.my_garage_service.loadBalancer.servers]]
url = "http://zzz.zzz.zzz.zzz"
[http.middlewares]
[http.middlewares.gzip_compress.compress]
```
## Caddy
Your Caddy configuration can be as simple as:
```caddy
s3.garage.tld, *.s3.garage.tld {
reverse_proxy localhost:3900 192.168.1.2:3900 example.tld:3900
}
*.web.garage.tld {
reverse_proxy localhost:3902 192.168.1.2:3900 example.tld:3900
}
admin.garage.tld {
reverse_proxy localhost:3903
}
```
But at the same time, the `reverse_proxy` is very flexible.
For a production deployment, you should [read its documentation](https://caddyserver.com/docs/caddyfile/directives/reverse_proxy) as it supports features like DNS discovery of upstreams, load balancing with checks, streaming parameters, etc.
@TODO

View file

@ -0,0 +1,50 @@
+++
title = "Upgrading Garage"
weight = 40
+++
Garage is a stateful clustered application, where all nodes are communicating together and share data structures.
It makes upgrade more difficult than stateless applications so you must be more careful when upgrading.
On a new version release, there is 2 possibilities:
- protocols and data structures remained the same ➡️ this is a **straightforward upgrade**
- protocols or data structures changed ➡️ this is an **advanced upgrade**
You can quickly now what type of update you will have to operate by looking at the version identifier.
Following the [SemVer ](https://semver.org/) terminology, if only the *patch* number changed, it will only need a straightforward upgrade.
Example: an upgrade from v0.6.0 from v0.6.1 is a straightforward upgrade.
If the *minor* or *major* number changed however, you will have to do an advanced upgrade. Example: from v0.6.1 to v0.7.0.
Migrations are designed to be run only between contiguous versions (from a *major*.*minor* perspective, *patches* can be skipped).
Example: migrations from v0.6.1 to v0.7.0 and from v0.6.0 to v0.7.0 are supported but migrations from v0.5.0 to v0.7.0 are not supported.
## Straightforward upgrades
Straightforward upgrades do not imply cluster downtime.
Before upgrading, you should still read [the changelog](https://git.deuxfleurs.fr/Deuxfleurs/garage/releases) and ideally test your deployment on a staging cluster before.
When you are ready, start by checking the health of your cluster.
You can force some checks with `garage repair`, we recommend at least running `garage repair --all-nodes --yes` that is very quick to run (less than a minute).
You will see that the command correctly terminated in the logs of your daemon.
Finally, you can simply upgrades nodes one by one.
For each node: stop it, install the new binary, edit the configuration if needed, restart it.
## Advanced upgrades
Advanced upgrades will imply cluster downtime.
Before upgrading, you must read [the changelog](https://git.deuxfleurs.fr/Deuxfleurs/garage/releases) and you must test your deployment on a staging cluster before.
From a high level perspective, an advanced upgrade looks like this:
1. Make sure the health of your cluster is good (see `garage repair`)
2. Disable API access (comment the configuration in your reverse proxy)
3. Check that your cluster is idle
4. Stop the whole cluster
5. Backup the metadata folder of all your nodes, so that you will be able to restore it quickly if the upgrade fails (blocks being immutable, they should not be impacted)
6. Install the new binary, update the configuration
7. Start the whole cluster
8. If needed, run the corresponding migration from `garage migrate`
9. Make sure the health of your cluster is good
10. Enable API access (uncomment the configuration in your reverse proxy)
11. Monitor your cluster while load comes back, check that all your applications are happy with this new version
We write guides for each advanced upgrade, they are stored under the "Working Documents" section of this documentation.

View file

@ -249,16 +249,6 @@ mc alias set \
--api S3v4
```
You must also add an environment variable to your configuration to
inform MinIO of our region (`garage` by default, corresponding to the `s3_region` parameter
in the configuration file).
The best way is to add the following snippet to your `$HOME/.bash_profile`
or `$HOME/.bashrc` file:
```bash
export MC_REGION=garage
```
### Use `mc`
You can not list buckets from `mc` currently.

View file

@ -0,0 +1,644 @@
+++
title = "Administration API"
weight = 16
+++
The Garage administration API is accessible through a dedicated server whose
listen address is specified in the `[admin]` section of the configuration
file (see [configuration file
reference](@/documentation/reference-manual/configuration.md))
**WARNING.** At this point, there is no comittement to stability of the APIs described in this document.
We will bump the version numbers prefixed to each API endpoint at each time the syntax
or semantics change, meaning that code that relies on these endpoint will break
when changes are introduced.
The Garage administration API was introduced in version 0.7.2, this document
does not apply to older versions of Garage.
## Access control
The admin API uses two different tokens for acces control, that are specified in the config file's `[admin]` section:
- `metrics_token`: the token for accessing the Metrics endpoint (if this token
is not set in the config file, the Metrics endpoint can be accessed without
access control);
- `admin_token`: the token for accessing all of the other administration
endpoints (if this token is not set in the config file, access to these
endpoints is disabled entirely).
These tokens are used as simple HTTP bearer tokens. In other words, to
authenticate access to an admin API endpoint, add the following HTTP header
to your request:
```
Authorization: Bearer <token>
```
## Administration API endpoints
### Metrics-related endpoints
#### Metrics `GET /metrics`
Returns internal Garage metrics in Prometheus format.
### Cluster operations
#### GetClusterStatus `GET /v0/status`
Returns the cluster's current status in JSON, including:
- ID of the node being queried and its version of the Garage daemon
- Live nodes
- Currently configured cluster layout
- Staged changes to the cluster layout
Example response body:
```json
{
"node": "ec79480e0ce52ae26fd00c9da684e4fa56658d9c64cdcecb094e936de0bfe71f",
"garage_version": "git:v0.8.0",
"knownNodes": {
"ec79480e0ce52ae26fd00c9da684e4fa56658d9c64cdcecb094e936de0bfe71f": {
"addr": "10.0.0.11:3901",
"is_up": true,
"last_seen_secs_ago": 9,
"hostname": "node1"
},
"4a6ae5a1d0d33bf895f5bb4f0a418b7dc94c47c0dd2eb108d1158f3c8f60b0ff": {
"addr": "10.0.0.12:3901",
"is_up": true,
"last_seen_secs_ago": 1,
"hostname": "node2"
},
"23ffd0cdd375ebff573b20cc5cef38996b51c1a7d6dbcf2c6e619876e507cf27": {
"addr": "10.0.0.21:3901",
"is_up": true,
"last_seen_secs_ago": 7,
"hostname": "node3"
},
"e2ee7984ee65b260682086ec70026165903c86e601a4a5a501c1900afe28d84b": {
"addr": "10.0.0.22:3901",
"is_up": true,
"last_seen_secs_ago": 1,
"hostname": "node4"
}
},
"layout": {
"version": 12,
"roles": {
"ec79480e0ce52ae26fd00c9da684e4fa56658d9c64cdcecb094e936de0bfe71f": {
"zone": "dc1",
"capacity": 4,
"tags": [
"node1"
]
},
"4a6ae5a1d0d33bf895f5bb4f0a418b7dc94c47c0dd2eb108d1158f3c8f60b0ff": {
"zone": "dc1",
"capacity": 6,
"tags": [
"node2"
]
},
"23ffd0cdd375ebff573b20cc5cef38996b51c1a7d6dbcf2c6e619876e507cf27": {
"zone": "dc2",
"capacity": 10,
"tags": [
"node3"
]
}
},
"stagedRoleChanges": {
"e2ee7984ee65b260682086ec70026165903c86e601a4a5a501c1900afe28d84b": {
"zone": "dc2",
"capacity": 5,
"tags": [
"node4"
]
}
}
}
}
```
#### ConnectClusterNodes `POST /v0/connect`
Instructs this Garage node to connect to other Garage nodes at specified addresses.
Example request body:
```json
[
"ec79480e0ce52ae26fd00c9da684e4fa56658d9c64cdcecb094e936de0bfe71f@10.0.0.11:3901",
"4a6ae5a1d0d33bf895f5bb4f0a418b7dc94c47c0dd2eb108d1158f3c8f60b0ff@10.0.0.12:3901"
]
```
The format of the string for a node to connect to is: `<node ID>@<ip address>:<port>`, same as in the `garage node connect` CLI call.
Example response:
```json
[
{
"success": true,
"error": null
},
{
"success": false,
"error": "Handshake error"
}
]
```
#### GetClusterLayout `GET /v0/layout`
Returns the cluster's current layout in JSON, including:
- Currently configured cluster layout
- Staged changes to the cluster layout
(the info returned by this endpoint is a subset of the info returned by GetClusterStatus)
Example response body:
```json
{
"version": 12,
"roles": {
"ec79480e0ce52ae26fd00c9da684e4fa56658d9c64cdcecb094e936de0bfe71f": {
"zone": "dc1",
"capacity": 4,
"tags": [
"node1"
]
},
"4a6ae5a1d0d33bf895f5bb4f0a418b7dc94c47c0dd2eb108d1158f3c8f60b0ff": {
"zone": "dc1",
"capacity": 6,
"tags": [
"node2"
]
},
"23ffd0cdd375ebff573b20cc5cef38996b51c1a7d6dbcf2c6e619876e507cf27": {
"zone": "dc2",
"capacity": 10,
"tags": [
"node3"
]
}
},
"stagedRoleChanges": {
"e2ee7984ee65b260682086ec70026165903c86e601a4a5a501c1900afe28d84b": {
"zone": "dc2",
"capacity": 5,
"tags": [
"node4"
]
}
}
}
```
#### UpdateClusterLayout `POST /v0/layout`
Send modifications to the cluster layout. These modifications will
be included in the staged role changes, visible in subsequent calls
of `GetClusterLayout`. Once the set of staged changes is satisfactory,
the user may call `ApplyClusterLayout` to apply the changed changes,
or `Revert ClusterLayout` to clear all of the staged changes in
the layout.
Request body format:
```json
{
<node_id>: {
"capacity": <new_capacity>,
"zone": <new_zone>,
"tags": [
<new_tag>,
...
]
},
<node_id_to_remove>: null,
...
}
```
Contrary to the CLI that may update only a subset of the fields
`capacity`, `zone` and `tags`, when calling this API all of these
values must be specified.
#### ApplyClusterLayout `POST /v0/layout/apply`
Applies to the cluster the layout changes currently registered as
staged layout changes.
Request body format:
```json
{
"version": 13
}
```
Similarly to the CLI, the body must include the version of the new layout
that will be created, which MUST be 1 + the value of the currently
existing layout in the cluster.
#### RevertClusterLayout `POST /v0/layout/revert`
Clears all of the staged layout changes.
Request body format:
```json
{
"version": 13
}
```
Reverting the staged changes is done by incrementing the version number
and clearing the contents of the staged change list.
Similarly to the CLI, the body must include the incremented
version number, which MUST be 1 + the value of the currently
existing layout in the cluster.
### Access key operations
#### ListKeys `GET /v0/key`
Returns all API access keys in the cluster.
Example response:
```json
[
{
"id": "GK31c2f218a2e44f485b94239e",
"name": "test"
},
{
"id": "GKe10061ac9c2921f09e4c5540",
"name": "test2"
}
]
```
#### CreateKey `POST /v0/key`
Creates a new API access key.
Request body format:
```json
{
"name": "NameOfMyKey"
}
```
#### ImportKey `POST /v0/key/import`
Imports an existing API key.
Request body format:
```json
{
"accessKeyId": "GK31c2f218a2e44f485b94239e",
"secretAccessKey": "b892c0665f0ada8a4755dae98baa3b133590e11dae3bcc1f9d769d67f16c3835",
"name": "NameOfMyKey"
}
```
#### GetKeyInfo `GET /v0/key?id=<acces key id>`
#### GetKeyInfo `GET /v0/key?search=<pattern>`
Returns information about the requested API access key.
If `id` is set, the key is looked up using its exact identifier (faster).
If `search` is set, the key is looked up using its name or prefix
of identifier (slower, all keys are enumerated to do this).
Example response:
```json
{
"name": "test",
"accessKeyId": "GK31c2f218a2e44f485b94239e",
"secretAccessKey": "b892c0665f0ada8a4755dae98baa3b133590e11dae3bcc1f9d769d67f16c3835",
"permissions": {
"createBucket": false
},
"buckets": [
{
"id": "70dc3bed7fe83a75e46b66e7ddef7d56e65f3c02f9f80b6749fb97eccb5e1033",
"globalAliases": [
"test2"
],
"localAliases": [],
"permissions": {
"read": true,
"write": true,
"owner": false
}
},
{
"id": "d7452a935e663fc1914f3a5515163a6d3724010ce8dfd9e4743ca8be5974f995",
"globalAliases": [
"test3"
],
"localAliases": [],
"permissions": {
"read": true,
"write": true,
"owner": false
}
},
{
"id": "e6a14cd6a27f48684579ec6b381c078ab11697e6bc8513b72b2f5307e25fff9b",
"globalAliases": [],
"localAliases": [
"test"
],
"permissions": {
"read": true,
"write": true,
"owner": true
}
},
{
"id": "96470e0df00ec28807138daf01915cfda2bee8eccc91dea9558c0b4855b5bf95",
"globalAliases": [
"alex"
],
"localAliases": [],
"permissions": {
"read": true,
"write": true,
"owner": true
}
}
]
}
```
#### DeleteKey `DELETE /v0/key?id=<acces key id>`
Deletes an API access key.
#### UpdateKey `POST /v0/key?id=<acces key id>`
Updates information about the specified API access key.
Request body format:
```json
{
"name": "NameOfMyKey",
"allow": {
"createBucket": true,
},
"deny": {}
}
```
All fields (`name`, `allow` and `deny`) are optionnal.
If they are present, the corresponding modifications are applied to the key, otherwise nothing is changed.
The possible flags in `allow` and `deny` are: `createBucket`.
### Bucket operations
#### ListBuckets `GET /v0/bucket`
Returns all storage buckets in the cluster.
Example response:
```json
[
{
"id": "70dc3bed7fe83a75e46b66e7ddef7d56e65f3c02f9f80b6749fb97eccb5e1033",
"globalAliases": [
"test2"
],
"localAliases": []
},
{
"id": "96470e0df00ec28807138daf01915cfda2bee8eccc91dea9558c0b4855b5bf95",
"globalAliases": [
"alex"
],
"localAliases": []
},
{
"id": "d7452a935e663fc1914f3a5515163a6d3724010ce8dfd9e4743ca8be5974f995",
"globalAliases": [
"test3"
],
"localAliases": []
},
{
"id": "e6a14cd6a27f48684579ec6b381c078ab11697e6bc8513b72b2f5307e25fff9b",
"globalAliases": [],
"localAliases": [
{
"accessKeyId": "GK31c2f218a2e44f485b94239e",
"alias": "test"
}
]
}
]
```
#### GetBucketInfo `GET /v0/bucket?id=<bucket id>`
#### GetBucketInfo `GET /v0/bucket?globalAlias=<alias>`
Returns information about the requested storage bucket.
If `id` is set, the bucket is looked up using its exact identifier.
If `globalAlias` is set, the bucket is looked up using its global alias.
(both are fast)
Example response:
```json
{
"id": "afa8f0a22b40b1247ccd0affb869b0af5cff980924a20e4b5e0720a44deb8d39",
"globalAliases": [],
"websiteAccess": false,
"websiteConfig": null,
"keys": [
{
"accessKeyId": "GK31c2f218a2e44f485b94239e",
"name": "Imported key",
"permissions": {
"read": true,
"write": true,
"owner": true
},
"bucketLocalAliases": [
"debug"
]
}
],
"objects": 14827,
"bytes": 13189855625,
"unfinshedUploads": 0,
"quotas": {
"maxSize": null,
"maxObjects": null
}
}
```
#### CreateBucket `POST /v0/bucket`
Creates a new storage bucket.
Request body format:
```json
{
"globalAlias": "NameOfMyBucket"
}
```
OR
```json
{
"localAlias": {
"accessKeyId": "GK31c2f218a2e44f485b94239e",
"alias": "NameOfMyBucket",
"allow": {
"read": true,
"write": true,
"owner": false
}
}
}
```
OR
```json
{}
```
Creates a new bucket, either with a global alias, a local one,
or no alias at all.
Technically, you can also specify both `globalAlias` and `localAlias` and that would create
two aliases, but I don't see why you would want to do that.
#### DeleteBucket `DELETE /v0/bucket?id=<bucket id>`
Deletes a storage bucket. A bucket cannot be deleted if it is not empty.
Warning: this will delete all aliases associated with the bucket!
#### UpdateBucket `PUT /v0/bucket?id=<bucket id>`
Updates configuration of the given bucket.
Request body format:
```json
{
"websiteAccess": {
"enabled": true,
"indexDocument": "index.html",
"errorDocument": "404.html"
},
"quotas": {
"maxSize": 19029801,
"maxObjects": null,
}
}
```
All fields (`websiteAccess` and `quotas`) are optionnal.
If they are present, the corresponding modifications are applied to the bucket, otherwise nothing is changed.
In `websiteAccess`: if `enabled` is `true`, `indexDocument` must be specified.
The field `errorDocument` is optional, if no error document is set a generic
error message is displayed when errors happen. Conversely, if `enabled` is
`false`, neither `indexDocument` nor `errorDocument` must be specified.
In `quotas`: new values of `maxSize` and `maxObjects` must both be specified, or set to `null`
to remove the quotas. An absent value will be considered the same as a `null`. It is not possible
to change only one of the two quotas.
### Operations on permissions for keys on buckets
#### BucketAllowKey `POST /v0/bucket/allow`
Allows a key to do read/write/owner operations on a bucket.
Request body format:
```json
{
"bucketId": "e6a14cd6a27f48684579ec6b381c078ab11697e6bc8513b72b2f5307e25fff9b",
"accessKeyId": "GK31c2f218a2e44f485b94239e",
"permissions": {
"read": true,
"write": true,
"owner": true
},
}
```
Flags in `permissions` which have the value `true` will be activated.
Other flags will remain unchanged.
#### BucketDenyKey `POST /v0/bucket/deny`
Denies a key from doing read/write/owner operations on a bucket.
Request body format:
```json
{
"bucketId": "e6a14cd6a27f48684579ec6b381c078ab11697e6bc8513b72b2f5307e25fff9b",
"accessKeyId": "GK31c2f218a2e44f485b94239e",
"permissions": {
"read": false,
"write": false,
"owner": true
},
}
```
Flags in `permissions` which have the value `true` will be deactivated.
Other flags will remain unchanged.
### Operations on bucket aliases
#### GlobalAliasBucket `PUT /v0/bucket/alias/global?id=<bucket id>&alias=<global alias>`
Empty body. Creates a global alias for a bucket.
#### GlobalUnaliasBucket `DELETE /v0/bucket/alias/global?id=<bucket id>&alias=<global alias>`
Removes a global alias for a bucket.
#### LocalAliasBucket `PUT /v0/bucket/alias/local?id=<bucket id>&accessKeyId=<access key ID>&alias=<local alias>`
Empty body. Creates a local alias for a bucket in the namespace of a specific access key.
#### LocalUnaliasBucket `DELETE /v0/bucket/alias/local?id=<bucket id>&accessKeyId<access key ID>&alias=<local alias>`
Removes a local alias for a bucket in the namespace of a specific access key.

View file

@ -10,6 +10,7 @@ metadata_dir = "/var/lib/garage/meta"
data_dir = "/var/lib/garage/data"
block_size = 1048576
block_manager_background_tranquility = 2
replication_mode = "3"
@ -29,6 +30,10 @@ bootstrap_peers = [
consul_host = "consul.service"
consul_service_name = "garage-daemon"
kubernetes_namespace = "garage"
kubernetes_service_name = "garage-daemon"
kubernetes_skip_crd = false
sled_cache_capacity = 134217728
sled_flush_every_ms = 2000
@ -40,6 +45,12 @@ root_domain = ".s3.garage"
[s3_web]
bind_addr = "[::]:3902"
root_domain = ".web.garage"
[admin]
api_bind_addr = "0.0.0.0:3903"
metrics_token = "cacce0b2de4bc2d9f5b5fdff551e01ac1496055aed248202d415398987e35f81"
admin_token = "ae8cb40ea7368bbdbb6430af11cca7da833d3458a5f52086f4e805a570fb5c2a"
trace_sink = "http://localhost:4317"
```
The following gives details about each available configuration option.
@ -76,24 +87,62 @@ files will remain available. This however means that chunks from existing files
will not be deduplicated with chunks from newly uploaded files, meaning you
might use more storage space that is optimally possible.
### `block_manager_background_tranquility`
This parameter tunes the activity of the background worker responsible for
resyncing data blocks between nodes. The higher the tranquility value is set,
the more the background worker will wait between iterations, meaning the load
on the system (including network usage between nodes) will be reduced. The
minimal value for this parameter is `0`, where the background worker will
allways work at maximal throughput to resynchronize blocks. The default value
is `2`, where the background worker will try to spend at most 1/3 of its time
working, and 2/3 sleeping in order to reduce system load.
### `replication_mode`
Garage supports the following replication modes:
- `none` or `1`: data stored on Garage is stored on a single node. There is no redundancy,
and data will be unavailable as soon as one node fails or its network is disconnected.
Do not use this for anything else than test deployments.
- `none` or `1`: data stored on Garage is stored on a single node. There is no
redundancy, and data will be unavailable as soon as one node fails or its
network is disconnected. Do not use this for anything else than test
deployments.
- `2`: data stored on Garage will be stored on two different nodes, if possible in different
zones. Garage tolerates one node failure before losing data. Data should be available
read-only when one node is down, but write operations will fail.
Use this only if you really have to.
- `2`: data stored on Garage will be stored on two different nodes, if possible
in different zones. Garage tolerates one node failure, or several nodes
failing but all in a single zone (in a deployment with at least two zones),
before losing data. Data remains available in read-only mode when one node is
down, but write operations will fail.
- `3`: data stored on Garage will be stored on three different nodes, if possible each in
a different zones.
Garage tolerates two node failure before losing data. Data should be available
read-only when two nodes are down, and writes should be possible if only a single node
is down.
- `2-dangerous`: a variant of mode `2`, where written objects are written to
the second replica asynchronously. This means that Garage will return `200
OK` to a PutObject request before the second copy is fully written (or even
before it even starts being written). This means that data can more easily
be lost if the node crashes before a second copy can be completed. This
also means that written objects might not be visible immediately in read
operations. In other words, this mode severely breaks the consistency and
durability guarantees of standard Garage cluster operation. Benefits of
this mode: you can still write to your cluster when one node is
unavailable.
- `3`: data stored on Garage will be stored on three different nodes, if
possible each in a different zones. Garage tolerates two node failure, or
several node failures but in no more than two zones (in a deployment with at
least three zones), before losing data. As long as only a single node fails,
or node failures are only in a single zone, reading and writing data to
Garage can continue normally.
- `3-degraded`: a variant of replication mode `3`, that lowers the read
quorum to `1`, to allow you to read data from your cluster when several
nodes (or nodes in several zones) are unavailable. In this mode, Garage
does not provide read-after-write consistency anymore. The write quorum is
still 2, ensuring that data successfully written to Garage is stored on at
least two nodes.
- `3-dangerous`: a variant of replication mode `3` that lowers both the read
and write quorums to `1`, to allow you to both read and write to your
cluster when several nodes (or nodes in several zones) are unavailable. It
is the least consistent mode of operation proposed by Garage, and also one
that should probably never be used.
Note that in modes `2` and `3`,
if at least the same number of zones are available, an arbitrary number of failures in
@ -102,8 +151,35 @@ any given zone is tolerated as copies of data will be spread over several zones.
**Make sure `replication_mode` is the same in the configuration files of all nodes.
Never run a Garage cluster where that is not the case.**
Changing the `replication_mode` of a cluster might work (make sure to shut down all nodes
and changing it everywhere at the time), but is not officially supported.
The quorums associated with each replication mode are described below:
| `replication_mode` | Number of replicas | Write quorum | Read quorum | Read-after-write consistency? |
| ------------------ | ------------------ | ------------ | ----------- | ----------------------------- |
| `none` or `1` | 1 | 1 | 1 | yes |
| `2` | 2 | 2 | 1 | yes |
| `2-dangerous` | 2 | 1 | 1 | NO |
| `3` | 3 | 2 | 2 | yes |
| `3-degraded` | 3 | 2 | 1 | NO |
| `3-dangerous` | 3 | 1 | 1 | NO |
Changing the `replication_mode` between modes with the same number of replicas
(e.g. from `3` to `3-degraded`, or from `2-dangerous` to `2`), can be done easily by
just changing the `replication_mode` parameter in your config files and restarting all your
Garage nodes.
It is also technically possible to change the replication mode to a mode with a
different numbers of replicas, although it's a dangerous operation that is not
officially supported. This requires you to delete the existing cluster layout
and create a new layout from scratch, meaning that a full rebalancing of your
cluster's data will be needed. To do it, shut down your cluster entirely,
delete the `custer_layout` files in the meta directories of all your nodes,
update all your configuration files with the new `replication_mode` parameter,
restart your cluster, and then create a new layout with all the nodes you want
to keep. Rebalancing data will take some time, and data might temporarily
appear unavailable to your users. It is recommended to shut down public access
to the cluster while rebalancing is in progress. In theory, no data should be
lost as rebalancing is a routine operation for Garage, although we cannot
guarantee you that everything will go right in such an extreme scenario.
### `compression_level`
@ -181,6 +257,20 @@ RPC ports are announced.
Garage does not yet support talking to Consul over TLS.
### `kubernetes_namespace`, `kubernetes_service_name` and `kubernetes_skip_crd`
Garage supports discovering other nodes of the cluster using kubernetes custom
resources. For this to work `kubernetes_namespace` and `kubernetes_service_name`
need to be configured.
`kubernetes_namespace` sets the namespace in which the custom resources are
configured. `kubernetes_service_name` is added as a label to these resources to
filter them, to allow for multiple deployments in a single namespace.
`kubernetes_skip_crd` can be set to true to disable the automatic creation and
patching of the `garagenodes.deuxfleurs.fr` CRD. You will need to create the CRD
manually.
### `sled_cache_capacity`
This parameter can be used to tune the capacity of the cache used by
@ -242,3 +332,35 @@ For instance, if `root_domain` is `web.garage.eu`, a bucket called `deuxfleurs.f
will be accessible either with hostname `deuxfleurs.fr.web.garage.eu`
or with hostname `deuxfleurs.fr`.
## The `[admin]` section
Garage has a few administration capabilities, in particular to allow remote monitoring. These features are detailed below.
### `api_bind_addr`
If specified, Garage will bind an HTTP server to this port and address, on
which it will listen to requests for administration features.
See [administration API reference](@/documentation/reference-manual/admin-api.md) to learn more about these features.
### `metrics_token` (since version 0.7.2)
The token for accessing the Metrics endpoint. If this token is not set in
the config file, the Metrics endpoint can be accessed without access
control.
You can use any random string for this value. We recommend generating a random token with `openssl rand -hex 32`.
### `admin_token` (since version 0.7.2)
The token for accessing all of the other administration endpoints. If this
token is not set in the config file, access to these endpoints is disabled
entirely.
You can use any random string for this value. We recommend generating a random token with `openssl rand -hex 32`.
### `trace_sink`
Optionnally, the address of an Opentelemetry collector. If specified,
Garage will send traces in the Opentelemetry format to this endpoint. These
trace allow to inspect Garage's operation when it handles S3 API requests.

View file

@ -0,0 +1,58 @@
+++
title = "K2V"
weight = 30
+++
Starting with version 0.7.2, Garage introduces an optionnal feature, K2V,
which is an alternative storage API designed to help efficiently store
many small values in buckets (in opposition to S3 which is more designed
to store large blobs).
K2V is currently disabled at compile time in all builds, as the
specification is still subject to changes. To build a Garage version with
K2V, the Cargo feature flag `k2v` must be activated. Special builds with
the `k2v` feature flag enabled can be obtained from our download page under
"Extra builds": such builds can be identified easily as their tag name ends
with `-k2v` (example: `v0.7.2-k2v`).
The specification of the K2V API can be found
[here](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/k2v/doc/drafts/k2v-spec.md).
This document also includes a high-level overview of K2V's design.
The K2V API uses AWSv4 signatures for authentification, same as the S3 API.
The AWS region used for signature calculation is always the same as the one
defined for the S3 API in the config file.
## Enabling and using K2V
To enable K2V, download and run a build that has the `k2v` feature flag
enabled, or produce one yourself. Then, add the following section to your
configuration file:
```toml
[k2v_api]
api_bind_addr = "<ip>:<port>"
```
Please select a port number that is not already in use by another API
endpoint (S3 api, admin API) or by the RPC server.
We provide an early-stage K2V client library for Rust which can be imported by adding the following to your `Cargo.toml` file:
```toml
k2v-client = { git = "https://git.deuxfleurs.fr/Deuxfleurs/garage.git" }
```
There is also a simple CLI utility which can be built from source in the
following way:
```sh
git clone https://git.deuxfleurs.fr/Deuxfleurs/garage.git
cd garage/src/k2v-client
cargo build --features cli --bin k2v-cli
```
The CLI utility is self-documented, run `k2v-cli --help` to learn how to use
it. There is also a short README.md in the `src/k2v-client` folder with some
instructions.

View file

@ -0,0 +1,45 @@
+++
title = "Request routing logic"
weight = 10
+++
Data retrieval requests to Garage endpoints (S3 API and websites) are resolved
to an individual object in a bucket. Since objects are replicated to multiple nodes
Garage must ensure consistency before answering the request.
## Using quorum to ensure consistency
Garage ensures consistency by attempting to establish a quorum with the
data nodes responsible for the object. When a majority of the data nodes
have provided metadata on a object Garage can then answer the request.
When a request arrives Garage will, assuming the recommended 3 replicas, perform the following actions:
- Make a request to the two preferred nodes for object metadata
- Try the third node if one of the two initial requests fail
- Check that the metadata from at least 2 nodes match
- Check that the object hasn't been marked deleted
- Answer the request with inline data from metadata if object is small enough
- Or get data blocks from the preferred nodes and answer using the assembled object
Garage dynamically determines which nodes to query based on health, preference, and
which nodes actually host a given data. Garage has no concept of "primary" so any
healthy node with the data can be used as long as a quorum is reached for the metadata.
## Node health
Garage keeps a TCP session open to each node in the cluster and periodically pings them. If a connection
cannot be established, or a node fails to answer a number of pings, the target node is marked as failed.
Failed nodes are not used for quorum or other internal requests.
## Node preference
Garage prioritizes which nodes to query according to a few criteria:
- A node always prefers itself if it can answer the request
- Then the node prioritizes nodes in the same zone
- Finally the nodes with the lowest latency are prioritized
For further reading on the cluster structure look at the [gateway](@/documentation/cookbook/gateways.md)
and [cluster layout management](@/documentation/reference-manual/layout.md) pages.

View file

@ -3,24 +3,46 @@ title = "S3 Compatibility status"
weight = 20
+++
## Endpoint implementation
## DISCLAIMER
All APIs that are missing on Garage will return a 501 Not Implemented.
Some `x-amz-` headers are not implemented.
**The compatibility list for other platforms is given only for informational
purposes and based on available documentation.** They are sometimes completed,
in a best effort approach, with the source code and inputs from maintainers
when documentation is lacking. We are not proactively monitoring new versions
of each software: check the modification history to know when the page has been
updated for the last time. Some entries will be inexact or outdated. For any
serious decision, you must make your own tests.
**The official documentation of each project can be accessed by clicking on the
project name in the column header.**
*The compatibility list for other platforms is given only for information purposes and based on available documentation. Some entries might be inexact. Feel free to open a PR to fix this table. Minio is missing because they do not provide a public S3 compatibility list.*
Feel free to open a PR to suggest fixes this table. Minio is missing because they do not provide a public S3 compatibility list.
### Features
## Update history
- 2022-02-07 - First version of this page
- 2022-05-25 - Many Ceph S3 endpoints are not documented but implemented. Following a notification from the Ceph community, we added them.
## High-level features
| Feature | Garage | [Openstack Swift](https://docs.openstack.org/swift/latest/s3_compat.html) | [Ceph Object Gateway](https://docs.ceph.com/en/latest/radosgw/s3/) | [Riak CS](https://docs.riak.com/riak/cs/2.1.1/references/apis/storage/s3/index.html) | [OpenIO](https://docs.openio.io/latest/source/arch-design/s3_compliancy.html) |
|------------------------------|----------------------------------|-----------------|---------------|---------|-----|
| [signature v2](https://docs.aws.amazon.com/general/latest/gr/signature-version-2.html) (deprecated) | ❌ Missing | ✅ | ❌ | ✅ | ✅ |
| [signature v2](https://docs.aws.amazon.com/general/latest/gr/signature-version-2.html) (deprecated) | ❌ Missing | ✅ | | ✅ | ✅ |
| [signature v4](https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html) | ✅ Implemented | ✅ | ✅ | ❌ | ✅ |
| [URL path-style](https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html#path-style-access) (eg. `host.tld/bucket/key`) | ✅ Implemented | ✅ | ✅ | ❓| ✅ |
| [URL vhost-style](https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html#virtual-hosted-style-access) URL (eg. `bucket.host.tld/key`) | ✅ Implemented | ❌| ✅| ✅ | ✅ |
| [Presigned URLs](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ShareObjectPreSignedURL.html) | ✅ Implemented | ❌| ✅ | ✅ | ✅(❓) |
*Note:* OpenIO does not says if it supports presigned URLs. Because it is part of signature v4 and they claim they support it without additional precisions, we suppose that OpenIO supports presigned URLs.
*Note:* OpenIO does not says if it supports presigned URLs. Because it is part
of signature v4 and they claim they support it without additional precisions,
we suppose that OpenIO supports presigned URLs.
## Endpoint implementation
All endpoints that are missing on Garage will return a 501 Not Implemented.
Some `x-amz-` headers are not implemented.
### Core endoints
@ -37,13 +59,17 @@ Some `x-amz-` headers are not implemented.
| [DeleteObjects](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html) | ✅ Implemented | ✅ | ✅ | ✅ | ✅ |
| [GetObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html) | ✅ Implemented | ✅ | ✅ | ✅ | ✅ |
| [ListObjects](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html) | ✅ Implemented (see details below) | ✅ | ✅ | ✅ | ❌|
| [ListObjectsV2](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html) | ✅ Implemented | ❌| | ❌| ✅ |
| [PostObject](https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectPOST.html) (compatibility API) | ❌ Missing | ❌| ✅ | ❌| ❌|
| [ListObjectsV2](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html) | ✅ Implemented | ❌| | ❌| ✅ |
| [PostObject](https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectPOST.html) | ✅ Implemented | ❌| ✅ | ❌| ❌|
| [PutObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html) | ✅ Implemented | ✅ | ✅ | ✅ | ✅ |
**ListObjects:** Implemented, but there isn't a very good specification of what `encoding-type=url` covers so there might be some encoding bugs. In our implementation the url-encoded fields are in the same in ListObjects as they are in ListObjectsV2.
**ListObjects:** Implemented, but there isn't a very good specification of what
`encoding-type=url` covers so there might be some encoding bugs. In our
implementation the url-encoded fields are in the same in ListObjects as they
are in ListObjectsV2.
*Note: Ceph API documentation is incomplete and miss at least HeadBucket and UploadPartCopy, but these endpoints are documented in [Red Hat Ceph Storage - Chapter 2. Ceph Object Gateway and the S3 API](https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/4/html/developer_guide/ceph-object-gateway-and-the-s3-api)*
*Note: Ceph API documentation is incomplete and lacks at least HeadBucket and UploadPartCopy,
but these endpoints are documented in [Red Hat Ceph Storage - Chapter 2. Ceph Object Gateway and the S3 API](https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/4/html/developer_guide/ceph-object-gateway-and-the-s3-api)*
### Multipart Upload endpoints
@ -67,13 +93,13 @@ For more information, please refer to our [issue tracker](https://git.deuxfleurs
| [DeleteBucketWebsite](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteBucketWebsite.html) | ✅ Implemented | ❌| ❌| ❌| ❌|
| [GetBucketWebsite](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketWebsite.html) | ✅ Implemented | ❌ | ❌| ❌| ❌|
| [PutBucketWebsite](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketWebsite.html) | ⚠ Partially implemented (see below)| ❌| ❌| ❌| ❌|
| [DeleteBucketCors](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteBucketCors.html) | ✅ Implemented | ❌| | ❌| ✅ |
| [GetBucketCors](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketCors.html) | ✅ Implemented | ❌ | | ❌| ✅ |
| [PutBucketCors](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketCors.html) | ✅ Implemented | ❌| | ❌| ✅ |
| [DeleteBucketCors](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteBucketCors.html) | ✅ Implemented | ❌| | ❌| ✅ |
| [GetBucketCors](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketCors.html) | ✅ Implemented | ❌ | | ❌| ✅ |
| [PutBucketCors](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketCors.html) | ✅ Implemented | ❌| | ❌| ✅ |
**PutBucketWebsite:** Implemented, but only stores the index document suffix and the error document path. Redirects are not supported.
*Note: Ceph radosgw has some support for static websites but it is different from Amazon one plus it does not implement its configuration endpoints.*
*Note: Ceph radosgw has some support for static websites but it is different from the Amazon one. It also does not implement its configuration endpoints.*
### ACL, Policies endpoints
@ -83,27 +109,27 @@ See Garage CLI reference manual to learn how to use Garage's permission system.
| Endpoint | Garage | [Openstack Swift](https://docs.openstack.org/swift/latest/s3_compat.html) | [Ceph Object Gateway](https://docs.ceph.com/en/latest/radosgw/s3/) | [Riak CS](https://docs.riak.com/riak/cs/2.1.1/references/apis/storage/s3/index.html) | [OpenIO](https://docs.openio.io/latest/source/arch-design/s3_compliancy.html) |
|------------------------------|----------------------------------|-----------------|---------------|---------|-----|
| [DeleteBucketPolicy](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteBucketPolicy.html) | ❌ Missing | ❌| | ✅ | ❌|
| [GetBucketPolicy](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketPolicy.html) | ❌ Missing | ❌| | ⚠ | ❌|
| [GetBucketPolicyStatus](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketPolicyStatus.html) | ❌ Missing | ❌| | ❌| ❌|
| [PutBucketPolicy](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketPolicy.html) | ❌ Missing | ❌| | ⚠ | ❌|
| [DeleteBucketPolicy](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteBucketPolicy.html) | ❌ Missing | ❌| | ✅ | ❌|
| [GetBucketPolicy](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketPolicy.html) | ❌ Missing | ❌| | ⚠ | ❌|
| [GetBucketPolicyStatus](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketPolicyStatus.html) | ❌ Missing | ❌| | ❌| ❌|
| [PutBucketPolicy](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketPolicy.html) | ❌ Missing | ❌| | ⚠ | ❌|
| [GetBucketAcl](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketAcl.html) | ❌ Missing | ✅ | ✅ | ✅ | ✅ |
| [PutBucketAcl](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketAcl.html) | ❌ Missing | ✅ | ✅ | ✅ | ✅ |
| [GetObjectAcl](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObjectAcl.html) | ❌ Missing | ✅ | ✅ | ✅ | ✅ |
| [PutObjectAcl](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObjectAcl.html) | ❌ Missing | ✅ | ✅ | ✅ | ✅ |
*Notes:* Ceph claims that it supports bucket policies but does not implement any Policy endpoints. They probably refer to their own permission system. Riak CS only supports a subset of the policy configuration.
*Notes:* Riak CS only supports a subset of the policy configuration.
### Versioning, Lifecycle endpoints
Garage does not support (yet) object versioning.
Garage does not (yet) support object versioning.
If you need this feature, please [share your use case in our dedicated issue](https://git.deuxfleurs.fr/Deuxfleurs/garage/issues/166).
| Endpoint | Garage | [Openstack Swift](https://docs.openstack.org/swift/latest/s3_compat.html) | [Ceph Object Gateway](https://docs.ceph.com/en/latest/radosgw/s3/) | [Riak CS](https://docs.riak.com/riak/cs/2.1.1/references/apis/storage/s3/index.html) | [OpenIO](https://docs.openio.io/latest/source/arch-design/s3_compliancy.html) |
|------------------------------|----------------------------------|-----------------|---------------|---------|-----|
| [DeleteBucketLifecycle](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteBucketLifecycle.html) | ❌ Missing | ❌| ✅| ❌| ✅|
| [GetBucketLifecycleConfiguration](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketLifecycleConfiguration.html) | ❌ Missing | ❌| | ❌| ✅|
| [PutBucketLifecycleConfiguration](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketLifecycleConfiguration.html) | ❌ Missing | ❌| | ❌| ✅|
| [GetBucketLifecycleConfiguration](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketLifecycleConfiguration.html) | ❌ Missing | ❌| | ❌| ✅|
| [PutBucketLifecycleConfiguration](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketLifecycleConfiguration.html) | ❌ Missing | ❌| | ❌| ✅|
| [GetBucketVersioning](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketVersioning.html) | ❌ Stub (see below) | ✅| ✅ | ❌| ✅|
| [ListObjectVersions](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectVersions.html) | ❌ Missing | ❌| ✅ | ❌| ✅|
| [PutBucketVersioning](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketVersioning.html) | ❌ Missing | ❌| ✅| ❌| ✅|
@ -111,8 +137,6 @@ If you need this feature, please [share your use case in our dedicated issue](ht
**GetBucketVersioning:** Stub implementation (Garage does not yet support versionning so this always returns "versionning not enabled").
*Note: Ceph only supports `Expiration`, `NoncurrentVersionExpiration` and `AbortIncompleteMultipartUpload` on its Lifecycle endpoints.*
### Replication endpoints
Please open an issue if you have a use case for replication.
@ -123,7 +147,10 @@ Please open an issue if you have a use case for replication.
| [GetBucketReplication](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketReplication.html) | ❌ Missing | ❌| ✅ | ❌| ❌|
| [PutBucketReplication](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketReplication.html) | ❌ Missing | ❌| ⚠ | ❌| ❌|
*Note: Ceph documentation briefly says that Ceph supports [replication though the S3 API](https://docs.ceph.com/en/latest/radosgw/multisite-sync-policy/#s3-replication-api) but with some limitations. Additionaly, replication endpoints are not documented in the S3 compatibility page so I don't know what kind of support we can expect.*
*Note: Ceph documentation briefly says that Ceph supports
[replication through the S3 API](https://docs.ceph.com/en/latest/radosgw/multisite-sync-policy/#s3-replication-api)
but with some limitations.
Additionaly, replication endpoints are not documented in the S3 compatibility page so I don't know what kind of support we can expect.*
### Locking objects
@ -135,8 +162,8 @@ Amazon defines a concept of [object locking](https://docs.aws.amazon.com/AmazonS
| [PutObjectLegalHold](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObjectLegalHold.html) | ❌ Missing | ❌| ✅ | ❌| ❌|
| [GetObjectRetention](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObjectRetention.html) | ❌ Missing | ❌| ✅ | ❌| ❌|
| [PutObjectRetention](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObjectRetention.html) | ❌ Missing | ❌| ✅ | ❌| ❌|
| [GetObjectLockConfiguration](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObjectLockConfiguration.html) | ❌ Missing | ❌| | ❌| ❌|
| [PutObjectLockConfiguration](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObjectLockConfiguration.html) | ❌ Missing | ❌| | ❌| ❌|
| [GetObjectLockConfiguration](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObjectLockConfiguration.html) | ❌ Missing | ❌| | ❌| ❌|
| [PutObjectLockConfiguration](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObjectLockConfiguration.html) | ❌ Missing | ❌| | ❌| ❌|
### (Server-side) encryption
@ -145,9 +172,9 @@ Please open an issue if you have a use case.
| Endpoint | Garage | [Openstack Swift](https://docs.openstack.org/swift/latest/s3_compat.html) | [Ceph Object Gateway](https://docs.ceph.com/en/latest/radosgw/s3/) | [Riak CS](https://docs.riak.com/riak/cs/2.1.1/references/apis/storage/s3/index.html) | [OpenIO](https://docs.openio.io/latest/source/arch-design/s3_compliancy.html) |
|------------------------------|----------------------------------|-----------------|---------------|---------|-----|
| [DeleteBucketEncryption](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteBucketEncryption.html) | ❌ Missing | ❌| | ❌| ❌|
| [GetBucketEncryption](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketEncryption.html) | ❌ Missing | ❌| | ❌| ❌|
| [PutBucketEncryption](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketEncryption.html) | ❌ Missing | ❌| | ❌| ❌|
| [DeleteBucketEncryption](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteBucketEncryption.html) | ❌ Missing | ❌| | ❌| ❌|
| [GetBucketEncryption](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketEncryption.html) | ❌ Missing | ❌| | ❌| ❌|
| [PutBucketEncryption](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketEncryption.html) | ❌ Missing | ❌| | ❌| ❌|
### Misc endpoints
@ -155,13 +182,13 @@ Please open an issue if you have a use case.
|------------------------------|----------------------------------|-----------------|---------------|---------|-----|
| [GetBucketNotificationConfiguration](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketNotificationConfiguration.html) | ❌ Missing | ❌| ✅ | ❌| ❌|
| [PutBucketNotificationConfiguration](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketNotificationConfiguration.html) | ❌ Missing | ❌| ✅ | ❌| ❌|
| [DeleteBucketTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteBucketTagging.html) | ❌ Missing | ❌| | ❌| ✅ |
| [GetBucketTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketTagging.html) | ❌ Missing | ❌| | ❌| ✅ |
| [PutBucketTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketTagging.html) | ❌ Missing | ❌| | ❌| ✅ |
| [DeleteObjectTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjectTagging.html) | ❌ Missing | ❌| | ❌| ✅ |
| [GetObjectTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObjectTagging.html) | ❌ Missing | ❌| | ❌| ✅ |
| [PutObjectTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObjectTagging.html) | ❌ Missing | ❌| | ❌| ✅ |
| [GetObjectTorrent](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObjectTorrent.html) | ❌ Missing | ❌| | ❌| ❌|
| [DeleteBucketTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteBucketTagging.html) | ❌ Missing | ❌| | ❌| ✅ |
| [GetBucketTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketTagging.html) | ❌ Missing | ❌| | ❌| ✅ |
| [PutBucketTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketTagging.html) | ❌ Missing | ❌| | ❌| ✅ |
| [DeleteObjectTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjectTagging.html) | ❌ Missing | ❌| | ❌| ✅ |
| [GetObjectTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObjectTagging.html) | ❌ Missing | ❌| | ❌| ✅ |
| [PutObjectTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObjectTagging.html) | ❌ Missing | ❌| | ❌| ✅ |
| [GetObjectTorrent](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObjectTorrent.html) | ❌ Missing | ❌| | ❌| ❌|
### Vendor specific endpoints

View file

@ -4,12 +4,12 @@ weight = 15
+++
**This guide explains how to migrate to 0.6 if you have an existing 0.5 cluster.
We don't recommend trying to migrate directly from 0.4 or older to 0.6.**
We don't recommend trying to migrate to 0.6 directly from 0.4 or older.**
**We make no guarantee that this migration will work perfectly:
back up all your data before attempting it!**
Garage v0.6 (not yet released) introduces a new data model for buckets,
Garage v0.6 introduces a new data model for buckets,
that allows buckets to have many names (aliases).
Buckets can also have "private" aliases (called local aliases),
which are only visible when using a certain access key.

View file

@ -0,0 +1,31 @@
+++
title = "Migrating from 0.6 to 0.7"
weight = 14
+++
**This guide explains how to migrate to 0.7 if you have an existing 0.6 cluster.
We don't recommend trying to migrate to 0.7 directly from 0.5 or older.**
**We make no guarantee that this migration will work perfectly:
back up all your data before attempting it!**
Garage v0.7 introduces a cluster protocol change to support request tracing through OpenTelemetry.
No data structure is changed, so no data migration is required.
The migration steps are as follows:
1. Do `garage repair --all-nodes --yes tables` and `garage repair --all-nodes --yes blocks`,
check the logs and check that all data seems to be synced correctly between
nodes. If you have time, do additional checks (`scrub`, `block_refs`, etc.)
2. Disable api and web access. Garage does not support disabling
these endpoints but you can change the port number or stop your reverse
proxy for instance.
3. Check once again that your cluster is healty. Run again `garage repair --all-nodes --yes tables` which is quick.
Also check your queues are empty, run `garage stats` to query them.
4. Turn off Garage v0.6
5. Backup the metadata folder of all your nodes: `cd /var/lib/garage ; tar -acf meta-v0.6.tar.zst meta/`
6. Install Garage v0.7, edit the configuration if you plan to use OpenTelemetry or the Kubernetes integration
7. Turn on Garage v0.7
8. Do `garage repair --all-nodes --yes tables` and `garage repair --all-nodes --yes blocks`
9. Your upgraded cluster should be in a working state. Re-enable API and Web
access and check that everything went well.
10. Monitor your cluster in the next hours to see if it works well under your production load, report any issue.

717
doc/drafts/k2v-spec.md Normal file
View file

@ -0,0 +1,717 @@
# Specification of the Garage K2V API (K2V = Key/Key/Value)
- We are storing triplets of the form `(partition key, sort key, value)` -> no
user-defined fields, the client is responsible of writing whatever he wants
in the value (typically an encrypted blob). Values are binary blobs, which
are always represented as their base64 encoding in the JSON API. Partition
keys and sort keys are utf8 strings.
- Triplets are stored in buckets; each bucket stores a separate set of triplets
- Bucket names and access keys are the same as for accessing the S3 API
- K2V triplets exist separately from S3 objects. K2V triplets don't exist for
the S3 API, and S3 objects don't exist for the K2V API.
- Values stored for triplets have associated causality information, that enables
Garage to detect concurrent writes. In case of concurrent writes, Garage
keeps the concurrent values until a further write supersedes the concurrent
values. This is the same method as Riak KV implements. The method used is
based on DVVS (dotted version vector sets), described in the paper "Scalable
and Accurate Causality Tracking for Eventually Consistent Data Stores", as
well as [here](https://github.com/ricardobcl/Dotted-Version-Vectors)
## Data format
### Triple format
Triples in K2V are constituted of three fields:
- a partition key (`pk`), an utf8 string that defines in what partition the
triplet is stored; triplets in different partitions cannot be listed together
in a ReadBatch command, or deleted together in a DeleteBatch command: a
separate command must be included in the ReadBatch/DeleteBatch call for each
partition key in which the client wants to read/delete lists of items
- a sort key (`sk`), an utf8 string that defines the index of the triplet inside its
partition; triplets are uniquely idendified by their partition key + sort key
- a value (`v`), an opaque binary blob associated to the partition key + sort key;
they are transmitted as binary when possible but in most case in the JSON API
they will be represented as strings using base64 encoding; a value can also
be `null` to indicate a deleted triplet (a `null` value is called a tombstone)
### Causality information
K2V supports storing several concurrent values associated to a pk+sk, in the
case where insertion or deletion operations are detected to be concurrent (i.e.
there is not one that was aware of the other, they are not causally dependant
one on the other). In practice, it even looks more like the opposite: to
overwrite a previously existing value, the client must give a "causality token"
that "proves" (not in a cryptographic sense) that it had seen a previous value.
Otherwise, the value written will not overwrite an existing value, it will just
create a new concurrent value.
The causality token is a binary/b64-encoded representation of a context,
specified below.
A set of concurrent values looks like this:
```
(node1, tdiscard1, (v1, t1), (v2, t2)) ; tdiscard1 < t1 < t2
(node2, tdiscard2, (v3, t3) ; tdiscard2 < t3
```
`tdiscard` for a node `i` means that all values inserted by node `i` with times
`<= tdiscard` are obsoleted, i.e. have been read by a client that overwrote it
afterwards.
The associated context would be the following: `[(node1, t2), (node2, t3)]`,
i.e. if a node reads this set of values and inserts a new values, we will now
have `tdiscard1 = t2` and `tdiscard2 = t3`, to indicate that values v1, v2 and v3
are obsoleted by the new write.
**Basic insertion.** To insert a new value `v4` with context `[(node1, t2), (node2, t3)]`, in a
simple case where there was no insertion in-between reading the value
mentionned above and writing `v4`, and supposing that node2 receives the
InsertItem query:
- `node2` generates a timestamp `t4` such that `t4 > t3`.
- the new state is as follows:
```
(node1, tdiscard1', ()) ; tdiscard1' = t2
(node2, tdiscard2', (v4, t4)) ; tdiscard2' = t3
```
**A more complex insertion example.** In the general case, other intermediate values could have
been written before `v4` with context `[(node1, t2), (node2, t3)]` is sent to the system.
For instance, here is a possible sequence of events:
1. First we have the set of values v1, v2 and v3 described above.
A node reads it, it obtains values v1, v2 and v3 with context `[(node1, t2), (node2, t3)]`.
2. A node writes a value `v5` with context `[(node1, t1)]`, i.e. `v5` is only a
successor of v1 but not of v2 or v3. Suppose node1 receives the write, it
will generate a new timestamp `t5` larger than all of the timestamps it
knows of, i.e. `t5 > t2`. We will now have:
```
(node1, tdiscard1'', (v2, t2), (v5, t5)) ; tdiscard1'' = t1 < t2 < t5
(node2, tdiscard2, (v3, t3) ; tdiscard2 < t3
```
3. Now `v4` is written with context `[(node1, t2), (node2, t3)]`, and node2
processes the query. It will generate `t4 > t3` and the state will become:
```
(node1, tdiscard1', (v5, t5)) ; tdiscard1' = t2 < t5
(node2, tdiscard2', (v4, t4)) ; tdiscard2' = t3
```
**Generic algorithm for handling insertions:** A certain node n handles the
InsertItem and is responsible for the correctness of this procedure.
1. Lock the key (or the whole table?) at this node to prevent concurrent updates of the value that would mess things up
2. Read current set of values
3. Generate a new timestamp that is larger than the largest timestamp for node n
4. Add the inserted value in the list of values of node n
5. Update the discard times to be the times set in the context, and accordingly discard overwritten values
6. Release lock
7. Propagate updated value to other nodes
8. Return to user when propagation achieved the write quorum (propagation to other nodes continues asynchronously)
**Encoding of contexts:**
Contexts consist in a list of (node id, timestamp) pairs.
They are encoded in binary as follows:
```
checksum: u64, [ node: u64, timestamp: u64 ]*
```
The checksum is just the XOR of all of the node IDs and timestamps.
Once encoded in binary, contexts are written and transmitted in base64.
### Indexing
K2V keeps an index, a secondary data structure that is updated asynchronously,
that keeps tracks of the number of triplets stored for each partition key.
This allows easy listing of all of the partition keys for which triplets exist
in a bucket, as the partition key becomes the sort key in the index.
How indexing works:
- Each node keeps a local count of how many items it stores for each partition,
in a local Sled tree that is updated atomically when an item is modified.
- These local counters are asynchronously stored in the index table which is
a regular Garage table spread in the network. Counters are stored as LWW values,
so basically the final table will have the following structure:
```
- pk: bucket
- sk: partition key for which we are counting
- v: lwwmap (node id -> number of items)
```
The final number of items present in the partition can be estimated by taking
the maximum of the values (i.e. the value for the node that announces having
the most items for that partition). In most cases the values for different node
IDs should all be the same; more precisely, three node IDs should map to the
same non-zero value, and all other node IDs that are present are tombstones
that map to zeroes. Note that we need to filter out values from nodes that are
no longer part of the cluster layout, as when nodes are removed they won't
necessarily have had the time to set their counters to zero.
## Important details
**THIS SECTION CONTAINS A FEW WARNINGS ON THE K2V API WHICH ARE IMPORTANT
TO UNDERSTAND IN ORDER TO USE IT CORRECTLY.**
- **Internal server errors on updates do not mean that the update isn't stored.**
K2V will return an internal server error when it cannot reach a quorum of nodes on
which to save an updated value. However the value may still be stored on just one
node, which will then propagate it to other nodes asynchronously via anti-entropy.
- **Batch operations are not transactions.** When calling InsertBatch or DeleteBatch,
items may appear partially inserted/deleted while the operation is being processed.
More importantly, if InsertBatch or DeleteBatch returns an internal server error,
some of the items to be inserted/deleted might end up inserted/deleted on the server,
while others may still have their old value.
- **Concurrent values are deduplicated.** When inserting a value for a key,
Garage might internally end up
storing the value several times if there are network errors. These values will end up as
concurrent values for a key, with the same byte string (or `null` for a deletion).
Garage fixes this by deduplicating concurrent values when they are returned to the
user on read operations. Importantly, *Garage does not differentiate between duplicate
concurrent values due to the user making the same call twice, or Garage having to
do an internal retry*. This means that all duplicate concurrent values are deduplicated
when an item is read: if the user inserts twice concurrently the same value, they will
only read it once.
## API Endpoints
**Remark.** Example queries and responses here are given in JSON5 format
for clarity. However the actual K2V API uses basic JSON so all examples
and responses need to be translated.
### Operations on single items
**ReadItem: `GET /<bucket>/<partition key>?sort_key=<sort key>`**
Query parameters:
| name | default value | meaning |
| - | - | - |
| `sort_key` | **mandatory** | The sort key of the item to read |
Returns the item with specified partition key and sort key. Values can be
returned in either of two ways:
1. a JSON array of base64-encoded values, or `null`'s for tombstones, with
header `Content-Type: application/json`
2. in the case where there are no concurrent values, the single present value
can be returned directly as the response body (or an HTTP 204 NO CONTENT for
a tombstone), with header `Content-Type: application/octet-stream`
The choice between return formats 1 and 2 is directed by the `Accept` HTTP header:
- if the `Accept` header is not present, format 1 is always used
- if `Accept` contains `application/json` but not `application/octet-stream`,
format 1 is always used
- if `Accept` contains `application/octet-stream` but not `application/json`,
format 2 is used when there is a single value, and an HTTP error 409 (HTTP
409 CONFLICT) is returned in the case of multiple concurrent values
(including concurrent tombstones)
- if `Accept` contains both, format 2 is used when there is a single value, and
format 1 is used as a fallback in case of concurrent values
- if `Accept` contains none, HTTP 406 NOT ACCEPTABLE is raised
Example query:
```
GET /my_bucket/mailboxes?sort_key=INBOX HTTP/1.1
```
Example response:
```json
HTTP/1.1 200 OK
X-Garage-Causality-Token: opaquetoken123
Content-Type: application/json
[
"b64cryptoblob123",
"b64cryptoblob'123"
]
```
Example response in case the item is a tombstone:
```
HTTP/1.1 200 OK
X-Garage-Causality-Token: opaquetoken999
Content-Type: application/json
[
null
]
```
Example query 2:
```
GET /my_bucket/mailboxes?sort_key=INBOX HTTP/1.1
Accept: application/octet-stream
```
Example response if multiple concurrent versions exist:
```
HTTP/1.1 409 CONFLICT
X-Garage-Causality-Token: opaquetoken123
Content-Type: application/octet-stream
```
Example response in case of single value:
```
HTTP/1.1 200 OK
X-Garage-Causality-Token: opaquetoken123
Content-Type: application/octet-stream
cryptoblob123
```
Example response in case of a single value that is a tombstone:
```
HTTP/1.1 204 NO CONTENT
X-Garage-Causality-Token: opaquetoken123
Content-Type: application/octet-stream
```
**PollItem: `GET /<bucket>/<partition key>?sort_key=<sort key>&causality_token=<causality token>`**
This endpoint will block until a new value is written to a key.
The GET parameter `causality_token` should be set to the causality
token returned with the last read of the key, so that K2V knows
what values are concurrent or newer than the ones that the
client previously knew.
This endpoint returns the new value in the same format as ReadItem.
If no new value is written and the timeout elapses,
an HTTP 304 NOT MODIFIED is returned.
Query parameters:
| name | default value | meaning |
| - | - | - |
| `sort_key` | **mandatory** | The sort key of the item to read |
| `causality_token` | **mandatory** | The causality token of the last known value or set of values |
| `timeout` | 300 | The timeout before 304 NOT MODIFIED is returned if the value isn't updated |
The timeout can be set to any number of seconds, with a maximum of 600 seconds (10 minutes).
**InsertItem: `PUT /<bucket>/<partition key>?sort_key=<sort_key>`**
Inserts a single item. This request does not use JSON, the body is sent directly as a binary blob.
To supersede previous values, the HTTP header `X-Garage-Causality-Token` should
be set to the causality token returned by a previous read on this key. This
header can be ommitted for the first writes to the key.
Example query:
```
PUT /my_bucket/mailboxes?sort_key=INBOX HTTP/1.1
X-Garage-Causality-Token: opaquetoken123
myblobblahblahblah
```
Example response:
```
HTTP/1.1 200 OK
```
**DeleteItem: `DELETE /<bucket>/<partition key>?sort_key=<sort_key>`**
Deletes a single item. The HTTP header `X-Garage-Causality-Token` must be set
to the causality token returned by a previous read on this key, to indicate
which versions of the value should be deleted. The request will not process if
`X-Garage-Causality-Token` is not set.
Example query:
```
DELETE /my_bucket/mailboxes?sort_key=INBOX HTTP/1.1
X-Garage-Causality-Token: opaquetoken123
```
Example response:
```
HTTP/1.1 204 NO CONTENT
```
### Operations on index
**ReadIndex: `GET /<bucket>?start=<start>&end=<end>&limit=<limit>`**
Lists all partition keys in the bucket for which some triplets exist, and gives
for each the number of triplets, total number of values (which might be bigger
than the number of triplets in case of conflicts), total number of bytes of
these values, and number of triplets that are in a state of conflict.
The values returned are an approximation of the true counts in the bucket,
as these values are asynchronously updated, and thus eventually consistent.
Query parameters:
| name | default value | meaning |
| - | - | - |
| `prefix` | `null` | Restrict listing to partition keys that start with this prefix |
| `start` | `null` | First partition key to list, in lexicographical order |
| `end` | `null` | Last partition key to list (excluded) |
| `limit` | `null` | Maximum number of partition keys to list |
| `reverse` | `false` | Iterate in reverse lexicographical order |
The response consists in a JSON object that repeats the parameters of the query and gives the result (see below).
The listing starts at partition key `start`, or if not specified at the
smallest partition key that exists. It returns partition keys in increasing
order, or decreasing order if `reverse` is set to `true`,
and stops when either of the following conditions is met:
1. if `end` is specfied, the partition key `end` is reached or surpassed (if it
is reached exactly, it is not included in the result)
2. if `limit` is specified, `limit` partition keys have been listed
3. no more partition keys are available to list
In case 2, and if there are more partition keys to list before condition 1
triggers, then in the result `more` is set to `true` and `nextStart` is set to
the first partition key that couldn't be listed due to the limit. In the first
case (if the listing stopped because of the `end` parameter), `more` is not set
and the `nextStart` key is not specified.
Note that if `reverse` is set to `true`, `start` is the highest key
(in lexicographical order) for which values are returned.
This means that if an `end` is specified, it must be smaller than `start`,
otherwise no values will be returned.
Example query:
```
GET /my_bucket HTTP/1.1
```
Example response:
```json
HTTP/1.1 200 OK
{
prefix: null,
start: null,
end: null,
limit: null,
reverse: false,
partitionKeys: [
{
pk: "keys",
entries: 3043,
conflicts: 0,
values: 3043,
bytes: 121720,
},
{
pk: "mailbox:INBOX",
entries: 42,
conflicts: 1,
values: 43,
bytes: 142029,
},
{
pk: "mailbox:Junk",
entries: 2991
conflicts: 0,
values: 2991,
bytes: 12019322,
},
{
pk: "mailbox:Trash",
entries: 10,
conflicts: 0,
values: 10,
bytes: 32401,
},
{
pk: "mailboxes",
entries: 3,
conflicts: 0,
values: 3,
bytes: 3019,
},
],
more: false,
nextStart: null,
}
```
### Operations on batches of items
**InsertBatch: `POST /<bucket>`**
Simple insertion and deletion of triplets. The body is just a list of items to
insert in the following format:
`{ pk: "<partition key>", sk: "<sort key>", ct: "<causality token>"|null, v: "<value>"|null }`.
The causality token should be the one returned in a previous read request (e.g.
by ReadItem or ReadBatch), to indicate that this write takes into account the
values that were returned from these reads, and supersedes them causally. If
the triplet is inserted for the first time, the causality token should be set to
`null`.
The value is expected to be a base64-encoded binary blob. The value `null` can
also be used to delete the triplet while preserving causality information: this
allows to know if a delete has happenned concurrently with an insert, in which
case both are preserved and returned on reads (see below).
Partition keys and sort keys are utf8 strings which are stored sorted by
lexicographical ordering of their binary representation.
Example query:
```json
POST /my_bucket HTTP/1.1
[
{ pk: "mailbox:INBOX", sk: "001892831", ct: "opaquetoken321", v: "b64cryptoblob321updated" },
{ pk: "mailbox:INBOX", sk: "001892912", ct: null, v: "b64cryptoblob444" },
{ pk: "mailbox:INBOX", sk: "001892932", ct: "opaquetoken654", v: null },
]
```
Example response:
```
HTTP/1.1 200 OK
```
**ReadBatch: `POST /<bucket>?search`**, or alternatively<br/>
**ReadBatch: `SEARCH /<bucket>`**
Batch read of triplets in a bucket.
The request body is a JSON list of searches, that each specify a range of
items to get (to get single items, set `singleItem` to `true`). A search is a
JSON struct with the following fields:
| name | default value | meaning |
| - | - | - |
| `partitionKey` | **mandatory** | The partition key in which to search |
| `prefix` | `null` | Restrict items to list to those whose sort keys start with this prefix |
| `start` | `null` | The sort key of the first item to read |
| `end` | `null` | The sort key of the last item to read (excluded) |
| `limit` | `null` | The maximum number of items to return |
| `reverse` | `false` | Iterate in reverse lexicographical order on sort keys |
| `singleItem` | `false` | Whether to return only the item with sort key `start` |
| `conflictsOnly` | `false` | Whether to return only items that have several concurrent values |
| `tombstones` | `false` | Whether or not to return tombstone lines to indicate the presence of old deleted items |
For each of the searches, triplets are listed and returned separately. The
semantics of `prefix`, `start`, `end`, `limit` and `reverse` are the same as for ReadIndex. The
additionnal parameter `singleItem` allows to get a single item, whose sort key
is the one given in `start`. Parameters `conflictsOnly` and `tombstones`
control additional filters on the items that are returned.
The result is a list of length the number of searches, that consists in for
each search a JSON object specified similarly to the result of ReadIndex, but
that lists triplets within a partition key.
The format of returned tuples is as follows: `{ sk: "<sort key>", ct: "<causality
token>", v: ["<value1>", ...] }`, with the following fields:
- `sk` (sort key): any unicode string used as a sort key
- `ct` (causality token): an opaque token served by the server (generally
base64-encoded) to be used in subsequent writes to this key
- `v` (list of values): each value is a binary blob, always base64-encoded;
contains multiple items when concurrent values exists
- in case of concurrent update and deletion, a `null` is added to the list of concurrent values
- if the `tombstones` query parameter is set to `true`, tombstones are returned
for items that have been deleted (this can be usefull for inserting after an
item that has been deleted, so that the insert is not considered
concurrent with the delete). Tombstones are returned as tuples in the
same format with only `null` values
Example query:
```json
POST /my_bucket?search HTTP/1.1
[
{
partitionKey: "mailboxes",
},
{
partitionKey: "mailbox:INBOX",
start: "001892831",
limit: 3,
},
{
partitionKey: "keys",
start: "0",
singleItem: true,
},
]
```
Example associated response body:
```json
HTTP/1.1 200 OK
[
{
partitionKey: "mailboxes",
prefix: null,
start: null,
end: null,
limit: null,
reverse: false,
conflictsOnly: false,
tombstones: false,
singleItem: false,
items: [
{ sk: "INBOX", ct: "opaquetoken123", v: ["b64cryptoblob123", "b64cryptoblob'123"] },
{ sk: "Trash", ct: "opaquetoken456", v: ["b64cryptoblob456"] },
{ sk: "Junk", ct: "opaquetoken789", v: ["b64cryptoblob789"] },
],
more: false,
nextStart: null,
},
{
partitionKey: "mailbox::INBOX",
prefix: null,
start: "001892831",
end: null,
limit: 3,
reverse: false,
conflictsOnly: false,
tombstones: false,
singleItem: false,
items: [
{ sk: "001892831", ct: "opaquetoken321", v: ["b64cryptoblob321"] },
{ sk: "001892832", ct: "opaquetoken654", v: ["b64cryptoblob654"] },
{ sk: "001892874", ct: "opaquetoken987", v: ["b64cryptoblob987"] },
],
more: true,
nextStart: "001892898",
},
{
partitionKey: "keys",
prefix: null,
start: "0",
end: null,
conflictsOnly: false,
tombstones: false,
limit: null,
reverse: false,
singleItem: true,
items: [
{ sk: "0", ct: "opaquetoken999", v: ["b64binarystuff999"] },
],
more: false,
nextStart: null,
},
]
```
**DeleteBatch: `POST /<bucket>?delete`**
Batch deletion of triplets. The request format is the same for `POST
/<bucket>?search` to indicate items or range of items, except that here they
are deleted instead of returned, but only the fields `partitionKey`, `prefix`, `start`,
`end`, and `singleItem` are supported. Causality information is not given by
the user: this request will internally list all triplets and write deletion
markers that supersede all of the versions that have been read.
This request returns for each series of items to be deleted, the number of
matching items that have been found and deleted.
Example query:
```json
POST /my_bucket?delete HTTP/1.1
[
{
partitionKey: "mailbox:OldMailbox",
},
{
partitionKey: "mailbox:INBOX",
start: "0018928321",
singleItem: true,
},
]
```
Example response:
```
HTTP/1.1 200 OK
[
{
partitionKey: "mailbox:OldMailbox",
prefix: null,
start: null,
end: null,
singleItem: false,
deletedItems: 35,
},
{
partitionKey: "mailbox:INBOX",
prefix: null,
start: "0018928321",
end: null,
singleItem: true,
deletedItems: 1,
},
]
```
## Internals: causality tokens
The method used is based on DVVS (dotted version vector sets). See:
- the paper "Scalable and Accurate Causality Tracking for Eventually Consistent Data Stores"
- <https://github.com/ricardobcl/Dotted-Version-Vectors>
For DVVS to work, write operations (at each node) must take a lock on the data table.

14
doc/talks/2022-06-23-stack/.gitignore vendored Normal file
View file

@ -0,0 +1,14 @@
*
!assets
!.gitignore
!*.svg
!*.png
!*.jpg
!*.tex
!Makefile
!.gitignore
!assets/*.drawio.pdf
!talk.pdf

View file

@ -0,0 +1,5 @@
talk.pdf: talk.tex assets/consistent_hashing_1.pdf assets/consistent_hashing_2.pdf assets/consistent_hashing_3.pdf assets/consistent_hashing_4.pdf assets/garage_tables.pdf assets/deuxfleurs.pdf
pdflatex talk.tex
assets/%.pdf: assets/%.svg
inkscape -D -z --file=$^ --export-pdf=$@

Binary file not shown.

After

Width:  |  Height:  |  Size: 32 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 115 KiB

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 184 KiB

Binary file not shown.

Binary file not shown.

After

Width:  |  Height:  |  Size: 26 KiB

Binary file not shown.

Binary file not shown.

After

Width:  |  Height:  |  Size: 27 KiB

Binary file not shown.

Binary file not shown.

After

Width:  |  Height:  |  Size: 8.9 KiB

BIN
doc/talks/2022-06-23-stack/assets/aerogramme_keys.drawio.pdf (Stored with Git LFS) Normal file

Binary file not shown.

Binary file not shown.

After

Width:  |  Height:  |  Size: 18 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 4.8 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 263 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 82 KiB

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 53 KiB

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 54 KiB

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 56 KiB

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 57 KiB

View file

@ -0,0 +1,91 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
viewBox="0 0 70.424515 70.300102"
version="1.1"
id="svg8"
sodipodi:docname="logo.svg"
inkscape:version="1.1 (c68e22c387, 2021-05-23)"
inkscape:export-filename="/home/quentin/Documents/dev/deuxfleurs/site/src/img/logo.png"
inkscape:export-xdpi="699.30194"
inkscape:export-ydpi="699.30194"
width="70.424515"
height="70.300102"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns="http://www.w3.org/2000/svg"
xmlns:svg="http://www.w3.org/2000/svg">
<defs
id="defs12" />
<sodipodi:namedview
id="namedview10"
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1.0"
inkscape:pageshadow="2"
inkscape:pageopacity="0.0"
inkscape:pagecheckerboard="0"
showgrid="false"
inkscape:zoom="12.125"
inkscape:cx="43.092783"
inkscape:cy="48.082474"
inkscape:window-width="3072"
inkscape:window-height="1659"
inkscape:window-x="0"
inkscape:window-y="0"
inkscape:window-maximized="1"
inkscape:current-layer="svg8" />
<g
id="g79969"
transform="translate(-0.827,34.992103)">
<path
fill="#ffffff"
d="m 15.632,34.661 c -0.799,-0.597 -1.498,-1.484 -2.035,-2.592 l -0.228,-0.47 -0.46,0.249 c -0.975,0.528 -1.913,0.858 -2.744,0.969 L 9.963,29.061 6.327,30.029 C 6.17,29.175 6.202,28.142 6.423,27.007 L 6.526,26.482 5.994,26.416 C 4.752,26.262 3.688,25.891 2.89,25.336 L 4.411,22.419 1.423,20.896 C 1.742,19.952 2.371,19.014 3.257,18.161 L 3.634,17.798 3.255,17.438 C 2.452,16.674 1.847,15.884 1.485,15.127 L 4.995,13.774 2.95,10.615 C 3.69,10.213 4.643,9.929 5.739,9.783 L 6.258,9.715 6.167,9.201 C 5.952,7.99 5.995,6.863 6.291,5.913 l 3.308,0.523 0.524,-3.308 c 0.988,0.013 2.08,0.326 3.164,0.907 L 13.749,4.283 13.975,3.81 C 14.454,2.807 15.019,1.986 15.628,1.406 L 18,4.326 20.372,1.406 c 0.609,0.58 1.175,1.401 1.653,2.404 l 0.226,0.473 0.462,-0.247 C 23.798,3.455 24.891,3.142 25.877,3.13 L 26.4,6.438 29.71,5.913 c 0.296,0.951 0.34,2.078 0.124,3.288 l -0.092,0.515 0.518,0.069 c 1.095,0.145 2.048,0.43 2.788,0.832 l -2.046,3.156 3.511,1.355 c -0.361,0.757 -0.966,1.547 -1.77,2.311 l -0.379,0.36 0.377,0.363 c 0.888,0.854 1.516,1.793 1.835,2.736 l -2.984,1.52 1.521,2.984 c -0.812,0.574 -1.871,0.964 -3.094,1.134 l -0.518,0.072 0.096,0.514 c 0.201,1.089 0.226,2.083 0.073,2.909 l -3.634,-0.97 -0.204,3.757 c -0.83,-0.11 -1.768,-0.44 -2.742,-0.968 l -0.459,-0.249 -0.228,0.47 c -0.539,1.107 -1.237,1.994 -2.036,2.591 L 18,32.293 Z"
id="path2" />
<path
d="M 7.092,10.678 C 6.562,9.189 6.394,7.708 6.66,6.478 l 2.368,0.375 0.987,0.156 0.157,-0.988 0.375,-2.368 C 11.808,3.78 13.16,4.396 14.409,5.359 14.527,5.022 14.653,4.696 14.791,4.392 13.24,3.257 11.568,2.629 10.061,2.629 9.938,2.629 9.816,2.633 9.695,2.642 L 9.184,5.865 5.96,5.354 C 5.36,6.841 5.395,8.769 6.045,10.747 6.38,10.71 6.729,10.686 7.092,10.678 Z M 21.593,5.359 c 1.248,-0.962 2.6,-1.578 3.86,-1.705 l 0.376,2.368 0.156,0.988 0.987,-0.157 2.369,-0.376 c 0.266,1.23 0.098,2.71 -0.432,4.2 0.361,0.009 0.711,0.032 1.046,0.07 C 30.606,8.769 30.64,6.841 30.04,5.353 L 26.815,5.865 26.304,2.641 c -0.12,-0.008 -0.242,-0.012 -0.365,-0.012 -1.507,0 -3.179,0.628 -4.73,1.762 0.14,0.306 0.266,0.631 0.384,0.968 z M 7.368,27 h 0.035 c 0.067,0 0.157,-0.604 0.26,-0.947 -0.098,0.004 -0.197,0.046 -0.294,0.046 -1.496,0 -2.826,-0.303 -3.83,-0.89 L 4.628,23.081 5.082,22.194 4.191,21.742 2.055,20.654 C 2.563,19.503 3.57,18.404 4.873,17.511 4.586,17.292 4.312,17.07 4.063,16.842 2.376,18.059 1.217,19.597 0.828,21.152 l 2.908,1.483 -1.482,2.843 C 3.475,26.501 5.303,27 7.368,27 Z m 27.806,-5.846 c -0.39,-1.555 -1.548,-3.093 -3.234,-4.311 -0.25,0.228 -0.523,0.451 -0.81,0.669 1.304,0.893 2.31,1.992 2.817,3.145 l -2.136,1.088 -0.891,0.453 0.454,0.892 1.089,2.137 c -1.004,0.587 -2.332,0.904 -3.828,0.904 -0.099,0 -0.199,-0.01 -0.299,-0.013 0.103,0.344 0.192,0.683 0.26,1.011 l 0.039,0.002 c 2.066,0 3.892,-0.563 5.112,-1.587 l -1.482,-2.908 z m -12.653,9.182 c -0.447,1.517 -1.181,2.812 -2.119,3.651 L 18.707,32.293 18,31.586 l -0.707,0.707 -1.695,1.694 c -0.938,-0.839 -1.673,-2.136 -2.12,-3.652 -0.296,0.206 -0.593,0.397 -0.886,0.563 0.636,1.98 1.741,3.559 3.1,4.409 L 18,33 l 2.308,2.308 c 1.358,-0.851 2.464,-2.428 3.101,-4.408 -0.295,-0.168 -0.591,-0.359 -0.888,-0.564 z"
fill="#ea596e"
id="path4" />
<path
fill="#ea596e"
d="m 20.118,5.683 c 0.426,1.146 0.748,2.596 0.841,4.284 l 0.2,3.683 3.564,-0.946 c 1.32,-0.351 2.655,-0.536 3.86,-0.536 0.16,0 0.318,0.003 0.474,0.01 l -1.827,2.819 3.139,1.211 c -0.958,0.759 -2.237,1.514 -3.814,2.123 l -3.441,1.328 2.001,3.099 c 0.918,1.42 1.509,2.782 1.838,3.96 L 23.709,25.853 23.527,29.21 C 22.508,28.533 21.395,27.55 20.329,26.237 L 18,23.374 15.672,26.236 c -1.066,1.312 -2.179,2.295 -3.198,2.972 l -0.18,-3.354 -3.248,0.864 c 0.329,-1.178 0.921,-2.54 1.839,-3.961 L 12.889,19.658 9.447,18.33 C 7.87,17.721 6.591,16.967 5.633,16.208 L 8.768,15 6.941,12.177 c 0.155,-0.006 0.313,-0.01 0.473,-0.01 1.206,0 2.541,0.185 3.861,0.536 l 3.564,0.947 0.202,-3.683 c 0.092,-1.688 0.415,-3.138 0.84,-4.284 L 18,8.292 20.118,5.683 M 20.308,0.692 18,3.533 15.692,0.692 C 13.703,2.224 12.271,5.684 12.046,9.804 10.429,9.374 8.854,9.167 7.414,9.167 c -2.11,0 -3.929,0.445 -5.161,1.289 l 1.989,3.073 -3.415,1.316 c 0.842,2.366 3.69,4.797 7.54,6.283 -2.241,3.465 -3.116,7.106 -2.407,9.516 l 3.537,-0.941 0.196,3.654 c 2.512,-0.07 5.703,-2.027 8.307,-5.228 2.603,3.201 5.796,5.158 8.306,5.228 l 0.198,-3.655 3.535,0.943 c 0.71,-2.411 -0.165,-6.05 -2.404,-9.517 3.849,-1.485 6.696,-3.918 7.538,-6.283 l -3.415,-1.318 1.99,-3.07 c -1.233,-0.844 -3.053,-1.29 -5.164,-1.29 -1.438,0 -3.013,0.207 -4.63,0.636 C 23.729,5.684 22.297,2.224 20.308,0.692 Z"
id="path6" />
</g>
<g
id="g79964"
transform="translate(-1.043816,35.993714)">
<path
fill="#ffffff"
d="m 51.92633,-2.0247139 c -0.799,-0.597 -1.498,-1.484 -2.035,-2.592 l -0.228,-0.47 -0.46,0.249 c -0.975,0.528 -1.913,0.858 -2.744,0.969 l -0.202,-3.7560001 -3.636,0.968 c -0.157,-0.854 -0.125,-1.887 0.096,-3.022 l 0.103,-0.525 -0.532,-0.066 c -1.242,-0.154 -2.306,-0.525 -3.104,-1.08 l 1.521,-2.917 -2.988,-1.523 c 0.319,-0.944 0.948,-1.882 1.834,-2.735 l 0.377,-0.363 -0.379,-0.36 c -0.803,-0.764 -1.408,-1.554 -1.77,-2.311 l 3.51,-1.353 -2.045,-3.159 c 0.74,-0.402 1.693,-0.686 2.789,-0.832 l 0.519,-0.068 -0.091,-0.514 c -0.215,-1.211 -0.172,-2.338 0.124,-3.288 l 3.308,0.523 0.524,-3.308 c 0.988,0.013 2.08,0.326 3.164,0.907 l 0.462,0.248 0.226,-0.473 c 0.479,-1.003 1.044,-1.824 1.653,-2.404 l 2.372,2.92 2.372,-2.92 c 0.609,0.58 1.175,1.401 1.653,2.404 l 0.226,0.473 0.462,-0.247 c 1.085,-0.581 2.178,-0.894 3.164,-0.906 l 0.523,3.308 3.31,-0.525 c 0.296,0.951 0.34,2.078 0.124,3.288 l -0.092,0.515 0.518,0.069 c 1.095,0.145 2.048,0.43 2.788,0.832 l -2.046,3.156 3.511,1.355 c -0.361,0.757 -0.966,1.547 -1.77,2.311 l -0.379,0.36 0.377,0.363 c 0.888,0.854 1.516,1.793 1.835,2.736 l -2.984,1.52 1.521,2.984 c -0.812,0.574 -1.871,0.964 -3.094,1.134 l -0.518,0.072 0.096,0.514 c 0.201,1.089 0.226,2.083 0.073,2.909 l -3.634,-0.97 -0.204,3.7570001 c -0.83,-0.11 -1.768,-0.44 -2.742,-0.968 l -0.459,-0.249 -0.228,0.47 c -0.539,1.107 -1.237,1.994 -2.036,2.591 l -2.367,-2.369 z"
id="path2-9" />
<path
d="m 43.38633,-26.007714 c -0.53,-1.489 -0.698,-2.97 -0.432,-4.2 l 2.368,0.375 0.987,0.156 0.157,-0.988 0.375,-2.368 c 1.261,0.127 2.613,0.743 3.862,1.706 0.118,-0.337 0.244,-0.663 0.382,-0.967 -1.551,-1.135 -3.223,-1.763 -4.73,-1.763 -0.123,0 -0.245,0.004 -0.366,0.013 l -0.511,3.223 -3.224,-0.511 c -0.6,1.487 -0.565,3.415 0.085,5.393 0.335,-0.037 0.684,-0.061 1.047,-0.069 z m 14.501,-5.319 c 1.248,-0.962 2.6,-1.578 3.86,-1.705 l 0.376,2.368 0.156,0.988 0.987,-0.157 2.369,-0.376 c 0.266,1.23 0.098,2.71 -0.432,4.2 0.361,0.009 0.711,0.032 1.046,0.07 0.651,-1.978 0.685,-3.906 0.085,-5.394 l -3.225,0.512 -0.511,-3.224 c -0.12,-0.008 -0.242,-0.012 -0.365,-0.012 -1.507,0 -3.179,0.628 -4.73,1.762 0.14,0.306 0.266,0.631 0.384,0.968 z m -14.225,21.641 h 0.035 c 0.067,0 0.157,-0.604 0.26,-0.947 -0.098,0.004 -0.197,0.046 -0.294,0.046 -1.496,0 -2.826,-0.303 -3.83,-0.89 l 1.089,-2.128 0.454,-0.887 -0.891,-0.452 -2.136,-1.088 c 0.508,-1.151 1.515,-2.25 2.818,-3.143 -0.287,-0.219 -0.561,-0.441 -0.81,-0.669 -1.687,1.217 -2.846,2.755 -3.235,4.31 l 2.908,1.483 -1.482,2.843 c 1.221,1.023 3.049,1.522 5.114,1.522 z m 27.806,-5.846 c -0.39,-1.555 -1.548,-3.093 -3.234,-4.311 -0.25,0.228 -0.523,0.451 -0.81,0.669 1.304,0.893 2.31,1.992 2.817,3.145 l -2.136,1.088 -0.891,0.453 0.454,0.892 1.089,2.137 c -1.004,0.587 -2.332,0.904 -3.828,0.904 -0.099,0 -0.199,-0.01 -0.299,-0.013 0.103,0.344 0.192,0.683 0.26,1.011 l 0.039,0.002 c 2.066,0 3.892,-0.563 5.112,-1.587 l -1.482,-2.908 z m -12.653,9.182 c -0.447,1.5170001 -1.181,2.8120001 -2.119,3.6510001 l -1.695,-1.694 -0.707,-0.707 -0.707,0.707 -1.695,1.694 c -0.938,-0.839 -1.673,-2.136 -2.12,-3.6520001 -0.296,0.2060001 -0.593,0.3970001 -0.886,0.5630001 0.636,1.98 1.741,3.559 3.1,4.409 l 2.308,-2.307 2.308,2.308 c 1.358,-0.851 2.464,-2.428 3.101,-4.408 -0.295,-0.168 -0.591,-0.359 -0.888,-0.5640001 z"
fill="#ea596e"
id="path4-3" />
<path
fill="#ea596e"
d="m 56.41233,-31.002714 c 0.426,1.146 0.748,2.596 0.841,4.284 l 0.2,3.683 3.564,-0.946 c 1.32,-0.351 2.655,-0.536 3.86,-0.536 0.16,0 0.318,0.003 0.474,0.01 l -1.827,2.819 3.139,1.211 c -0.958,0.759 -2.237,1.514 -3.814,2.123 l -3.441,1.328 2.001,3.099 c 0.918,1.42 1.509,2.782 1.838,3.96 l -3.244,-0.865 -0.182,3.357 c -1.019,-0.677 -2.132,-1.66 -3.198,-2.973 l -2.329,-2.863 -2.328,2.862 c -1.066,1.312 -2.179,2.295 -3.198,2.972 l -0.18,-3.354 -3.248,0.864 c 0.329,-1.178 0.921,-2.54 1.839,-3.961 l 2.004,-3.099 -3.442,-1.328 c -1.577,-0.609 -2.856,-1.363 -3.814,-2.122 l 3.135,-1.208 -1.827,-2.823 c 0.155,-0.006 0.313,-0.01 0.473,-0.01 1.206,0 2.541,0.185 3.861,0.536 l 3.564,0.947 0.202,-3.683 c 0.092,-1.688 0.415,-3.138 0.84,-4.284 l 2.119,2.609 2.118,-2.609 m 0.19,-4.991 -2.308,2.841 -2.308,-2.841 c -1.989,1.532 -3.421,4.992 -3.646,9.112 -1.617,-0.43 -3.192,-0.637 -4.632,-0.637 -2.11,0 -3.929,0.445 -5.161,1.289 l 1.989,3.073 -3.415,1.316 c 0.842,2.366 3.69,4.797 7.54,6.283 -2.241,3.465 -3.116,7.106 -2.407,9.5160001 l 3.537,-0.9410001 0.196,3.6540001 c 2.512,-0.07 5.703,-2.027 8.307,-5.2280001 2.603,3.2010001 5.796,5.1580001 8.306,5.2280001 l 0.198,-3.6550001 3.535,0.9430001 c 0.71,-2.4110001 -0.165,-6.0500001 -2.404,-9.5170001 3.849,-1.485 6.696,-3.918 7.538,-6.283 l -3.415,-1.318 1.99,-3.07 c -1.233,-0.844 -3.053,-1.29 -5.164,-1.29 -1.438,0 -3.013,0.207 -4.63,0.636 -0.225,-4.119 -1.657,-7.579 -3.646,-9.111 z"
id="path6-6" />
</g>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:42.6667px;line-height:1.25;font-family:sans-serif;fill:#ea596e;fill-opacity:1;stroke:none"
x="2.2188232"
y="31.430677"
id="text46212"><tspan
sodipodi:role="line"
id="tspan46210"
x="2.2188232"
y="31.430677"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:42.6667px;font-family:'TeX Gyre Termes';-inkscape-font-specification:'TeX Gyre Termes'">D</tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:42.6667px;line-height:1.25;font-family:sans-serif;fill:#ea596e;fill-opacity:1;stroke:none"
x="41.347008"
y="67.114784"
id="text46212-1"><tspan
sodipodi:role="line"
id="tspan46210-5"
x="41.347008"
y="67.114784"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:42.6667px;font-family:'TeX Gyre Termes';-inkscape-font-specification:'TeX Gyre Termes'">F</tspan></text>
</svg>

After

Width:  |  Height:  |  Size: 12 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 129 KiB

BIN
doc/talks/2022-06-23-stack/assets/garage.drawio.pdf (Stored with Git LFS) Normal file

Binary file not shown.

Binary file not shown.

After

Width:  |  Height:  |  Size: 13 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 88 KiB

BIN
doc/talks/2022-06-23-stack/assets/garage2a.drawio.pdf (Stored with Git LFS) Normal file

Binary file not shown.

BIN
doc/talks/2022-06-23-stack/assets/garage2b.drawio.pdf (Stored with Git LFS) Normal file

Binary file not shown.

View file

@ -0,0 +1,537 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
width="850"
height="480"
viewBox="0 0 224.89584 127"
version="1.1"
id="svg8"
inkscape:version="1.2 (dc2aedaf03, 2022-05-15)"
sodipodi:docname="garage_tables.svg"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns="http://www.w3.org/2000/svg"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<defs
id="defs2">
<marker
style="overflow:visible"
id="marker1262"
refX="0"
refY="0"
orient="auto"
inkscape:stockid="Arrow1Mend"
inkscape:isstock="true">
<path
transform="matrix(-0.4,0,0,-0.4,-4,0)"
style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1"
d="M 0,0 5,-5 -12.5,0 5,5 Z"
id="path1260" />
</marker>
<marker
style="overflow:visible"
id="Arrow1Mend"
refX="0"
refY="0"
orient="auto"
inkscape:stockid="Arrow1Mend"
inkscape:isstock="true"
inkscape:collect="always">
<path
transform="matrix(-0.4,0,0,-0.4,-4,0)"
style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1"
d="M 0,0 5,-5 -12.5,0 5,5 Z"
id="path965" />
</marker>
<marker
style="overflow:visible"
id="Arrow1Lend"
refX="0"
refY="0"
orient="auto"
inkscape:stockid="Arrow1Lend"
inkscape:isstock="true">
<path
transform="matrix(-0.8,0,0,-0.8,-10,0)"
style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1"
d="M 0,0 5,-5 -12.5,0 5,5 Z"
id="path959" />
</marker>
</defs>
<sodipodi:namedview
id="base"
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1.0"
inkscape:pageopacity="0.0"
inkscape:pageshadow="2"
inkscape:zoom="0.98994949"
inkscape:cx="429.31483"
inkscape:cy="289.40871"
inkscape:document-units="mm"
inkscape:current-layer="layer1"
inkscape:document-rotation="0"
showgrid="false"
units="px"
inkscape:window-width="1678"
inkscape:window-height="993"
inkscape:window-x="0"
inkscape:window-y="0"
inkscape:window-maximized="1"
inkscape:showpageshadow="2"
inkscape:pagecheckerboard="0"
inkscape:deskcolor="#d1d1d1" />
<metadata
id="metadata5">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
<dc:title />
</cc:Work>
</rdf:RDF>
</metadata>
<g
inkscape:label="Layer 1"
inkscape:groupmode="layer"
id="layer1">
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="39.570904"
y="38.452755"
id="text2025"><tspan
sodipodi:role="line"
id="tspan2023"
x="39.570904"
y="38.452755"
style="font-size:5.64444px;stroke-width:0.264583" /></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:10.5833px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="101.95796"
y="92.835831"
id="text2139"><tspan
sodipodi:role="line"
id="tspan2137"
x="101.95796"
y="92.835831"
style="stroke-width:0.264583"> </tspan></text>
<g
id="g2316"
transform="translate(-11.455511,1.5722486)">
<g
id="g2277">
<rect
style="fill:none;stroke:#000000;stroke-width:0.8;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833"
width="47.419891"
height="95.353409"
x="18.534418"
y="24.42766" />
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-3"
width="47.419891"
height="86.973076"
x="18.534418"
y="32.807987" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="32.250839"
y="29.894743"
id="text852"><tspan
sodipodi:role="line"
id="tspan850"
x="32.250839"
y="29.894743"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">Object</tspan></text>
</g>
<g
id="g2066"
transform="translate(-2.1807817,-3.0621439)">
<g
id="g1969"
transform="matrix(0.12763631,0,0,0.12763631,0.7215051,24.717273)"
style="fill:#ff6600;fill-opacity:1;stroke:none;stroke-opacity:1">
<path
style="fill:#ff6600;fill-opacity:1;stroke:none;stroke-width:0.264583;stroke-opacity:1"
d="m 203.71837,154.80038 c -1.11451,3.75057 -2.45288,5.84095 -5.11132,7.98327 -2.2735,1.83211 -4.66721,2.65982 -8.09339,2.79857 -2.59227,0.10498 -2.92868,0.0577 -5.02863,-0.70611 -3.99215,-1.45212 -7.1627,-4.65496 -8.48408,-8.57046 -1.28374,-3.80398 -0.61478,-8.68216 1.64793,-12.01698 0.87317,-1.28689 3.15089,-3.48326 4.18771,-4.03815 l 0.53332,-28.51234 5.78454,-5.09197 6.95158,6.16704 -3.21112,3.49026 3.17616,3.45499 -3.17616,3.40822 2.98973,3.28645 -3.24843,3.3829 4.49203,4.58395 0.0516,5.69106 c 1.06874,0.64848 3.81974,3.24046 4.69548,4.56257 0.452,0.68241 1.06834,2.0197 1.36962,2.97176 0.62932,1.98864 0.88051,5.785 0.47342,7.15497 z m -10.0406,2.32604 c -0.88184,-3.17515 -4.92402,-3.78864 -6.75297,-1.02492 -0.58328,0.8814 -0.6898,1.28852 -0.58362,2.23056 0.26492,2.35041 2.45434,3.95262 4.60856,3.37255 1.19644,-0.32217 2.39435,-1.44872 2.72875,-2.56621 0.30682,-1.02529 0.30686,-0.9045 -7.9e-4,-2.01198 z"
id="path1971"
sodipodi:nodetypes="ssscsscccccccccccssscsssscc" />
</g>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="28.809687"
y="44.070885"
id="text852-9"><tspan
sodipodi:role="line"
id="tspan850-4"
x="28.809687"
y="44.070885"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">bucket </tspan></text>
</g>
<g
id="g2066-7"
transform="translate(-2.1807817,6.2627616)">
<g
id="g1969-8"
transform="matrix(0.12763631,0,0,0.12763631,0.7215051,24.717273)"
style="fill:#ff6600;fill-opacity:1;stroke:none;stroke-opacity:1">
<path
style="fill:#4040ff;fill-opacity:1;stroke:none;stroke-width:0.264583;stroke-opacity:1"
d="m 203.71837,154.80038 c -1.11451,3.75057 -2.45288,5.84095 -5.11132,7.98327 -2.2735,1.83211 -4.66721,2.65982 -8.09339,2.79857 -2.59227,0.10498 -2.92868,0.0577 -5.02863,-0.70611 -3.99215,-1.45212 -7.1627,-4.65496 -8.48408,-8.57046 -1.28374,-3.80398 -0.61478,-8.68216 1.64793,-12.01698 0.87317,-1.28689 3.15089,-3.48326 4.18771,-4.03815 l 0.53332,-28.51234 5.78454,-5.09197 6.95158,6.16704 -3.21112,3.49026 3.17616,3.45499 -3.17616,3.40822 2.98973,3.28645 -3.24843,3.3829 4.49203,4.58395 0.0516,5.69106 c 1.06874,0.64848 3.81974,3.24046 4.69548,4.56257 0.452,0.68241 1.06834,2.0197 1.36962,2.97176 0.62932,1.98864 0.88051,5.785 0.47342,7.15497 z m -10.0406,2.32604 c -0.88184,-3.17515 -4.92402,-3.78864 -6.75297,-1.02492 -0.58328,0.8814 -0.6898,1.28852 -0.58362,2.23056 0.26492,2.35041 2.45434,3.95262 4.60856,3.37255 1.19644,-0.32217 2.39435,-1.44872 2.72875,-2.56621 0.30682,-1.02529 0.30686,-0.9045 -7.9e-4,-2.01198 z"
id="path1971-4"
sodipodi:nodetypes="ssscsscccccccccccssscsssscc" />
</g>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="28.809687"
y="44.070885"
id="text852-9-5"><tspan
sodipodi:role="line"
id="tspan850-4-0"
x="28.809687"
y="44.070885"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">file path </tspan></text>
<path
style="fill:#ff6600;fill-opacity:1;stroke:none;stroke-width:0.0337704;stroke-opacity:1"
d="m 174.20027,104.45585 c -0.14225,0.47871 -0.31308,0.74552 -0.65239,1.01896 -0.29018,0.23384 -0.5957,0.33949 -1.03301,0.3572 -0.33087,0.0134 -0.37381,0.007 -0.64184,-0.0901 -0.50954,-0.18534 -0.91422,-0.59414 -1.08287,-1.0939 -0.16385,-0.48552 -0.0785,-1.10816 0.21033,-1.5338 0.11145,-0.16426 0.40217,-0.44459 0.53451,-0.51542 l 0.0681,-3.639207 0.73832,-0.64992 0.88727,0.787138 -0.40986,0.445484 0.4054,0.440982 -0.4054,0.435013 0.3816,0.41947 -0.41461,0.43178 0.57334,0.58508 0.007,0.72639 c 0.13641,0.0828 0.48753,0.4136 0.59931,0.58235 0.0577,0.0871 0.13636,0.25778 0.17481,0.3793 0.0803,0.25382 0.11239,0.73838 0.0604,0.91323 z m -1.28154,0.29689 c -0.11256,-0.40526 -0.62849,-0.48357 -0.86193,-0.13082 -0.0745,0.1125 -0.088,0.16447 -0.0745,0.2847 0.0338,0.3 0.31326,0.5045 0.58822,0.43046 0.15271,-0.0411 0.30561,-0.1849 0.34829,-0.32754 0.0392,-0.13086 0.0392,-0.11544 -1e-4,-0.2568 z"
id="path1971-3"
sodipodi:nodetypes="ssscsscccccccccccssscsssscc" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="177.8474"
y="104.05132"
id="text852-9-6"><tspan
sodipodi:role="line"
id="tspan850-4-7"
x="177.8474"
y="104.05132"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">= partition key </tspan></text>
<path
style="fill:#4040ff;fill-opacity:1;stroke:none;stroke-width:0.0337704;stroke-opacity:1"
d="m 174.20027,113.78076 c -0.14225,0.47871 -0.31308,0.74552 -0.65239,1.01895 -0.29018,0.23385 -0.5957,0.33949 -1.03301,0.3572 -0.33087,0.0134 -0.37381,0.007 -0.64184,-0.0901 -0.50954,-0.18534 -0.91422,-0.59414 -1.08287,-1.0939 -0.16385,-0.48553 -0.0785,-1.10816 0.21033,-1.53381 0.11145,-0.16425 0.40217,-0.44459 0.53451,-0.51541 l 0.0681,-3.63921 0.73832,-0.64992 0.88727,0.78714 -0.40986,0.44548 0.4054,0.44098 -0.4054,0.43502 0.3816,0.41947 -0.41461,0.43178 0.57334,0.58508 0.007,0.72638 c 0.13641,0.0828 0.48753,0.4136 0.59931,0.58235 0.0577,0.0871 0.13636,0.25779 0.17481,0.37931 0.0803,0.25382 0.11239,0.73837 0.0604,0.91323 z m -1.28154,0.29689 c -0.11256,-0.40527 -0.62849,-0.48357 -0.86193,-0.13082 -0.0745,0.1125 -0.088,0.16446 -0.0745,0.2847 0.0338,0.3 0.31326,0.5045 0.58822,0.43046 0.15271,-0.0411 0.30561,-0.18491 0.34829,-0.32754 0.0392,-0.13087 0.0392,-0.11545 -1e-4,-0.2568 z"
id="path1971-4-5"
sodipodi:nodetypes="ssscsscccccccccccssscsssscc" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="177.8474"
y="113.37622"
id="text852-9-5-3"><tspan
sodipodi:role="line"
id="tspan850-4-0-5"
x="177.8474"
y="113.37622"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">= sort key </tspan></text>
</g>
<g
id="g2161"
transform="translate(-62.264403,-59.333115)">
<g
id="g2271"
transform="translate(0,67.042823)">
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-6"
width="39.008453"
height="16.775949"
x="84.896881"
y="90.266838" />
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-3-1"
width="39.008453"
height="8.673645"
x="84.896881"
y="98.369141" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="89.826942"
y="96.212921"
id="text852-0"><tspan
sodipodi:role="line"
id="tspan850-6"
x="89.826942"
y="96.212921"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">Version 1</tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="89.826942"
y="104.71013"
id="text852-0-3"><tspan
sodipodi:role="line"
id="tspan850-6-2"
x="89.826942"
y="104.71013"
style="font-style:italic;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';fill:#4d4d4d;stroke-width:0.264583">deleted</tspan></text>
</g>
</g>
<g
id="g2263"
transform="translate(0,-22.791204)">
<g
id="g2161-1"
transform="translate(-62.264403,-10.910843)">
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-6-5"
width="39.008453"
height="36.749603"
x="84.896881"
y="90.266838" />
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-3-1-5"
width="39.008453"
height="28.647301"
x="84.896881"
y="98.369141" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="89.826942"
y="96.212921"
id="text852-0-4"><tspan
sodipodi:role="line"
id="tspan850-6-7"
x="89.826942"
y="96.212921"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">Version 2</tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="89.826942"
y="104.71013"
id="text852-0-3-6"><tspan
sodipodi:role="line"
id="tspan850-6-2-5"
x="89.826942"
y="104.71013"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';fill:#000000;stroke-width:0.264583">id</tspan></text>
</g>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="27.56254"
y="100.34132"
id="text852-0-3-6-6"><tspan
sodipodi:role="line"
id="tspan850-6-2-5-9"
x="27.56254"
y="100.34132"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';fill:#000000;stroke-width:0.264583">size</tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="27.56254"
y="106.90263"
id="text852-0-3-6-6-3"><tspan
sodipodi:role="line"
id="tspan850-6-2-5-9-7"
x="27.56254"
y="106.90263"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';fill:#000000;stroke-width:0.264583">MIME type</tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="27.56254"
y="111.92816"
id="text852-0-3-6-6-3-4"><tspan
sodipodi:role="line"
id="tspan850-6-2-5-9-7-5"
x="27.56254"
y="111.92816"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';fill:#000000;stroke-width:0.264583">...</tspan></text>
</g>
</g>
<g
id="g898"
transform="translate(-6.2484318,29.95006)">
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-7"
width="47.419891"
height="44.007515"
x="95.443573"
y="24.42766" />
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-3-4"
width="47.419891"
height="35.627186"
x="95.443573"
y="32.807987" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="107.46638"
y="29.894743"
id="text852-4"><tspan
sodipodi:role="line"
id="tspan850-3"
x="107.46638"
y="29.894743"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">Version</tspan></text>
<path
style="fill:#ff6600;fill-opacity:1;stroke:none;stroke-width:0.0337704;stroke-opacity:1"
d="m 102.90563,41.413279 c -0.14226,0.478709 -0.31308,0.745518 -0.65239,1.018956 -0.29019,0.233843 -0.59571,0.339489 -1.03301,0.357199 -0.33087,0.0134 -0.37381,0.0074 -0.64184,-0.09013 -0.50954,-0.185343 -0.914221,-0.594142 -1.082877,-1.093901 -0.163852,-0.485526 -0.07847,-1.108159 0.210335,-1.533803 0.111448,-0.164254 0.402172,-0.444591 0.534502,-0.515415 l 0.0681,-3.63921 0.73832,-0.64992 0.88727,0.787138 -0.40985,0.445484 0.40539,0.440982 -0.40539,0.435013 0.3816,0.41947 -0.41462,0.431781 0.57335,0.585078 0.007,0.726386 c 0.13641,0.08277 0.48753,0.413601 0.59931,0.58235 0.0577,0.0871 0.13636,0.257787 0.17481,0.379304 0.0803,0.253823 0.11239,0.738377 0.0604,0.913234 z m -1.28155,0.296888 c -0.11255,-0.405265 -0.62848,-0.483569 -0.86192,-0.130817 -0.0744,0.112498 -0.088,0.164461 -0.0745,0.2847 0.0338,0.299998 0.31326,0.504498 0.58822,0.43046 0.15271,-0.04112 0.3056,-0.184909 0.34828,-0.327542 0.0392,-0.130864 0.0392,-0.115447 -1e-4,-0.256801 z"
id="path1971-0"
sodipodi:nodetypes="ssscsscccccccccccssscsssscc" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="104.99195"
y="41.008743"
id="text852-9-7"><tspan
sodipodi:role="line"
id="tspan850-4-8"
x="104.99195"
y="41.008743"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">id </tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="104.99195"
y="49.168018"
id="text852-9-7-6"><tspan
sodipodi:role="line"
id="tspan850-4-8-8"
x="104.99195"
y="49.168018"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">h(block 1)</tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="104.99195"
y="56.583336"
id="text852-9-7-6-8"><tspan
sodipodi:role="line"
id="tspan850-4-8-8-4"
x="104.99195"
y="56.583336"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">h(block 2)</tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="104.99195"
y="64.265732"
id="text852-9-7-6-3"><tspan
sodipodi:role="line"
id="tspan850-4-8-8-1"
x="104.99195"
y="64.265732"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">...</tspan></text>
</g>
<g
id="g898-3"
transform="translate(75.777779,38.888663)">
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-7-6"
width="47.419891"
height="29.989157"
x="95.443573"
y="24.42766" />
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-3-4-7"
width="47.419891"
height="21.608831"
x="95.443573"
y="32.807987" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="102.11134"
y="29.894743"
id="text852-4-5"><tspan
sodipodi:role="line"
id="tspan850-3-3"
x="102.11134"
y="29.894743"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">Data block</tspan></text>
<path
style="fill:#ff6600;fill-opacity:1;stroke:none;stroke-width:0.0337704;stroke-opacity:1"
d="m 102.90563,41.413279 c -0.14226,0.478709 -0.31308,0.745518 -0.65239,1.018956 -0.29019,0.233843 -0.59571,0.339489 -1.03301,0.357199 -0.33087,0.0134 -0.37381,0.0074 -0.64184,-0.09013 -0.50954,-0.185343 -0.914221,-0.594142 -1.082877,-1.093901 -0.163852,-0.485526 -0.07847,-1.108159 0.210335,-1.533803 0.111448,-0.164254 0.402172,-0.444591 0.534502,-0.515415 l 0.0681,-3.63921 0.73832,-0.64992 0.88727,0.787138 -0.40985,0.445484 0.40539,0.440982 -0.40539,0.435013 0.3816,0.41947 -0.41462,0.431781 0.57335,0.585078 0.007,0.726386 c 0.13641,0.08277 0.48753,0.413601 0.59931,0.58235 0.0577,0.0871 0.13636,0.257787 0.17481,0.379304 0.0803,0.253823 0.11239,0.738377 0.0604,0.913234 z m -1.28155,0.296888 c -0.11255,-0.405265 -0.62848,-0.483569 -0.86192,-0.130817 -0.0744,0.112498 -0.088,0.164461 -0.0745,0.2847 0.0338,0.299998 0.31326,0.504498 0.58822,0.43046 0.15271,-0.04112 0.3056,-0.184909 0.34828,-0.327542 0.0392,-0.130864 0.0392,-0.115447 -1e-4,-0.256801 z"
id="path1971-0-5"
sodipodi:nodetypes="ssscsscccccccccccssscsssscc" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="104.99195"
y="41.008743"
id="text852-9-7-62"><tspan
sodipodi:role="line"
id="tspan850-4-8-9"
x="104.99195"
y="41.008743"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">hash </tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="104.99195"
y="49.168018"
id="text852-9-7-6-1"><tspan
sodipodi:role="line"
id="tspan850-4-8-8-2"
x="104.99195"
y="49.168018"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">data</tspan></text>
</g>
<path
style="fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#Arrow1Mend)"
d="M 42.105292,69.455903 89.563703,69.317144"
id="path954"
sodipodi:nodetypes="cc" />
<path
style="fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#marker1262)"
d="m 134.32612,77.363197 38.12618,0.260865"
id="path1258"
sodipodi:nodetypes="cc" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="8.6727352"
y="16.687063"
id="text852-3"><tspan
sodipodi:role="line"
id="tspan850-67"
x="8.6727352"
y="16.687063"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">Objects table </tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="89.190445"
y="16.687063"
id="text852-3-5"><tspan
sodipodi:role="line"
id="tspan850-67-3"
x="89.190445"
y="16.687063"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">Versions table </tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="174.55702"
y="16.687063"
id="text852-3-56"><tspan
sodipodi:role="line"
id="tspan850-67-2"
x="174.55702"
y="16.687063"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">Blocks table</tspan></text>
</g>
</svg>

After

Width:  |  Height:  |  Size: 30 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 52 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 37 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 97 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 199 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 145 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 13 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 174 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 38 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 14 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 87 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 81 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 124 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 84 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 81 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 81 KiB

File diff suppressed because it is too large Load diff

After

Width:  |  Height:  |  Size: 315 KiB

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 286 KiB

BIN
doc/talks/2022-06-23-stack/talk.pdf (Stored with Git LFS) Normal file

Binary file not shown.

View file

@ -0,0 +1,480 @@
%\nonstopmode
\documentclass[aspectratio=169]{beamer}
\usepackage[utf8]{inputenc}
% \usepackage[frenchb]{babel}
\usepackage{amsmath}
\usepackage{mathtools}
\usepackage{breqn}
\usepackage{multirow}
\usetheme{boxes}
\usepackage{graphicx}
\usepackage{adjustbox}
%\useoutertheme[footline=authortitle,subsection=false]{miniframes}
%\useoutertheme[footline=authorinstitute,subsection=false]{miniframes}
\useoutertheme{infolines}
\setbeamertemplate{headline}{}
\beamertemplatenavigationsymbolsempty
\definecolor{TitleOrange}{RGB}{255,137,0}
\setbeamercolor{title}{fg=TitleOrange}
\setbeamercolor{frametitle}{fg=TitleOrange}
\definecolor{ListOrange}{RGB}{255,145,5}
\setbeamertemplate{itemize item}{\color{ListOrange}$\blacktriangleright$}
\definecolor{verygrey}{RGB}{70,70,70}
\setbeamercolor{normal text}{fg=verygrey}
\usepackage{tabu}
\usepackage{multicol}
\usepackage{vwcol}
\usepackage{stmaryrd}
\usepackage{graphicx}
\usepackage[normalem]{ulem}
\title{Introducing Garage}
\subtitle{a new storage platform for self-hosted geo-distributed clusters}
\author{Deuxfleurs Association}
\date{IMT Atlantique, 2022-06-23}
\begin{document}
\begin{frame}
\centering
\includegraphics[width=.3\linewidth]{../../sticker/Garage.pdf}
\vspace{1em}
{\large\bf Deuxfleurs Association}
\vspace{1em}
\url{https://garagehq.deuxfleurs.fr/}
Matrix channel: \texttt{\#garage:deuxfleurs.fr}
\end{frame}
\begin{frame}
\frametitle{Who we are}
\begin{columns}[t]
\begin{column}{.2\textwidth}
\centering
\adjincludegraphics[width=.4\linewidth, valign=t]{assets/alex.jpg}
\end{column}
\begin{column}{.6\textwidth}
\textbf{Alex Auvolat}\\
PhD at Inria, team WIDE; co-founder of Deuxfleurs
\end{column}
\begin{column}{.2\textwidth}
~
\end{column}
\end{columns}
\vspace{1em}
\begin{columns}[t]
\begin{column}{.2\textwidth}
~
\end{column}
\begin{column}{.6\textwidth}
\textbf{Quentin Dufour}\\
PhD at Inria, team WIDE; co-founder of Deuxfleurs
\end{column}
\begin{column}{.2\textwidth}
\centering
\adjincludegraphics[width=.5\linewidth, valign=t]{assets/quentin.jpg}
\end{column}
\end{columns}
\vspace{2em}
\begin{columns}[t]
\begin{column}{.2\textwidth}
\centering
\adjincludegraphics[width=.5\linewidth, valign=t]{assets/deuxfleurs.pdf}
\end{column}
\begin{column}{.6\textwidth}
\textbf{Deuxfleurs}\\
A non-profit self-hosting collective,\\
member of the CHATONS network
\end{column}
\begin{column}{.2\textwidth}
\centering
\adjincludegraphics[width=.7\linewidth, valign=t]{assets/logo_chatons.png}
\end{column}
\end{columns}
\end{frame}
\begin{frame}
\frametitle{Our objective at Deuxfleurs}
\begin{center}
\textbf{Promote self-hosting and small-scale hosting\\
as an alternative to large cloud providers}
\end{center}
\vspace{2em}
\visible<2->{
Why is it hard?
}
\visible<3->{
\vspace{2em}
\begin{center}
\textbf{\underline{Resilience}}\\
{\footnotesize (we want good uptime/availability with low supervision)}
\end{center}
}
\end{frame}
\begin{frame}
\frametitle{How to make a \underline{stable} system}
Enterprise-grade systems typically employ:
\vspace{1em}
\begin{itemize}
\item RAID
\item Redundant power grid + UPS
\item Redundant Internet connections
\item Low-latency links
\item ...
\end{itemize}
\vspace{1em}
$\to$ it's costly and only worth it at DC scale
\end{frame}
\begin{frame}
\frametitle{How to make a \underline{resilient} system}
\only<1,4-5>{
Instead, we use:
\vspace{1em}
\begin{itemize}
\item \textcolor<2->{gray}{Commodity hardware (e.g. old desktop PCs)}
\vspace{.5em}
\item<4-> \textcolor<5->{gray}{Commodity Internet (e.g. FTTB, FTTH) and power grid}
\vspace{.5em}
\item<5-> \textcolor<6->{gray}{\textbf{Geographical redundancy} (multi-site replication)}
\end{itemize}
}
\only<2>{
\begin{center}
\includegraphics[width=.8\linewidth]{assets/atuin.jpg}
\end{center}
}
\only<3>{
\begin{center}
\includegraphics[width=.8\linewidth]{assets/neptune.jpg}
\end{center}
}
\only<6>{
\begin{center}
\includegraphics[width=.5\linewidth]{assets/inframap.jpg}
\end{center}
}
\end{frame}
\begin{frame}
\frametitle{How to make this happen}
\begin{center}
\only<1>{\includegraphics[width=.8\linewidth]{assets/slide1.png}}%
\only<2>{\includegraphics[width=.8\linewidth]{assets/slide2.png}}%
\only<3>{\includegraphics[width=.8\linewidth]{assets/slide3.png}}%
\end{center}
\end{frame}
\begin{frame}
\frametitle{Distributed file systems are slow}
File systems are complex, for example:
\vspace{1em}
\begin{itemize}
\item Concurrent modification by several processes
\vspace{1em}
\item Folder hierarchies
\vspace{1em}
\item Other requirements of the POSIX spec
\end{itemize}
\vspace{1em}
Coordination in a distributed system is costly
\vspace{1em}
Costs explode with commodity hardware / Internet connections\\
{\small (we experienced this!)}
\end{frame}
\begin{frame}
\frametitle{A simpler solution: object storage}
Only two operations:
\vspace{1em}
\begin{itemize}
\item Put an object at a key
\vspace{1em}
\item Retrieve an object from its key
\end{itemize}
\vspace{1em}
{\footnotesize (and a few others)}
\vspace{1em}
Sufficient for many applications!
\end{frame}
\begin{frame}
\frametitle{A simpler solution: object storage}
\begin{center}
\includegraphics[width=.2\linewidth]{../2020-12-02_wide-team/img/Amazon-S3.jpg}
\hspace{5em}
\includegraphics[width=.2\linewidth]{assets/minio.png}
\end{center}
\vspace{1em}
S3: a de-facto standard, many compatible applications
\vspace{1em}
MinIO is self-hostable but not suited for geo-distributed deployments
\end{frame}
\begin{frame}
\frametitle{But what is Garage, exactly?}
\textbf{Garage is a self-hosted drop-in replacement for the Amazon S3 object store}\\
\vspace{.5em}
that implements resilience through geographical redundancy on commodity hardware
\begin{center}
\includegraphics[width=.8\linewidth]{assets/garageuses.png}
\end{center}
\end{frame}
\begin{frame}
\frametitle{Overview}
\begin{center}
\only<1>{\includegraphics[width=.45\linewidth]{assets/garage2a.drawio.pdf}}%
\only<2>{\includegraphics[width=.45\linewidth]{assets/garage2b.drawio.pdf}}%
\end{center}
\end{frame}
\begin{frame}
\frametitle{Garage is \emph{location-aware}}
\begin{center}
\includegraphics[width=\linewidth]{assets/location-aware.png}
\end{center}
\vspace{2em}
Garage replicates data on different zones when possible
\end{frame}
\begin{frame}
\frametitle{Garage is \emph{location-aware}}
\begin{center}
\includegraphics[width=.8\linewidth]{assets/map.png}
\end{center}
\end{frame}
\begin{frame}
\frametitle{How to spread files over different cluster nodes?}
\textbf{Consistent hashing (DynamoDB):}
\vspace{1em}
\begin{center}
\only<1>{\includegraphics[width=.45\columnwidth]{assets/consistent_hashing_1.pdf}}%
\only<2>{\includegraphics[width=.45\columnwidth]{assets/consistent_hashing_2.pdf}}%
\only<3>{\includegraphics[width=.45\columnwidth]{assets/consistent_hashing_3.pdf}}%
\only<4>{\includegraphics[width=.45\columnwidth]{assets/consistent_hashing_4.pdf}}%
\end{center}
\end{frame}
\begin{frame}
\frametitle{How to spread files over different cluster nodes?}
\textbf{Issues with consistent hashing:}
\vspace{1em}
\begin{itemize}
\item Doesn't dispatch data based on geographical location of nodes
\vspace{1em}
\item<2-> Geographically aware adaptation, try 1:\\
data quantities not well balanced between nodes
\vspace{1em}
\item<3-> Geographically aware adaptation, try 2:\\
too many reshuffles when adding/removing nodes
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{How to spread files over different cluster nodes?}
\textbf{Garage's method: build an index table}
\vspace{1em}
Realization: we can actually precompute an optimal solution
\vspace{1em}
\visible<2->{
\begin{center}
\begin{tabular}{|l|l|l|l|}
\hline
\textbf{Partition} & \textbf{Node 1} & \textbf{Node 2} & \textbf{Node 3} \\
\hline
\hline
Partition 0 & Io (jupiter) & Drosera (atuin) & Courgette (neptune) \\
\hline
Partition 1 & Datura (atuin) & Courgette (neptune) & Io (jupiter) \\
\hline
Partition 2 & Io(jupiter) & Celeri (neptune) & Drosera (atuin) \\
\hline
\hspace{1em}$\vdots$ & \hspace{1em}$\vdots$ & \hspace{1em}$\vdots$ & \hspace{1em}$\vdots$ \\
\hline
Partition 255 & Concombre (neptune) & Io (jupiter) & Drosera (atuin) \\
\hline
\end{tabular}
\end{center}
}
\vspace{1em}
\visible<3->{
The index table is built centrally using an optimal* algorithm,\\
then propagated to all nodes\\
\hfill\footnotesize *not yet optimal but will be soon
}
\end{frame}
\begin{frame}
\frametitle{Garage's internal data structures}
\centering
\includegraphics[width=.75\columnwidth]{assets/garage_tables.pdf}
\end{frame}
%\begin{frame}
% \frametitle{Garage's architecture}
% \begin{center}
% \includegraphics[width=.35\linewidth]{assets/garage.drawio.pdf}
% \end{center}
%\end{frame}
\begin{frame}
\frametitle{Garage is \emph{coordination-free}:}
\begin{itemize}
\item No Raft or Paxos
\vspace{1em}
\item Internal data types are CRDTs
\vspace{1em}
\item All nodes are equivalent (no master/leader/index node)
\end{itemize}
\vspace{2em}
$\to$ less sensitive to higher latencies between nodes
\end{frame}
\begin{frame}
\frametitle{Consistency model}
\begin{itemize}
\item Not ACID (not required by S3 spec) / not linearizable
\vspace{1em}
\item \textbf{Read-after-write consistency}\\
{\footnotesize (stronger than eventual consistency)}
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Impact on performances}
\begin{center}
\includegraphics[width=.8\linewidth]{assets/endpoint-latency-dc.png}
\end{center}
\end{frame}
\begin{frame}
\frametitle{An ever-increasing compatibility list}
\begin{center}
\includegraphics[width=.7\linewidth]{assets/compatibility.png}
\end{center}
\end{frame}
\begin{frame}
\frametitle{Further plans for Garage}
\begin{center}
\only<1>{\includegraphics[width=.8\linewidth]{assets/slideB1.png}}%
\only<2>{\includegraphics[width=.8\linewidth]{assets/slideB2.png}}%
\only<3>{\includegraphics[width=.8\linewidth]{assets/slideB3.png}}%
\end{center}
\end{frame}
\begin{frame}
\frametitle{K2V Design}
\begin{itemize}
\item A new, custom, minimal API
\vspace{1em}
\item<2-> Exposes the partitoning mechanism of Garage\\
K2V = partition key / sort key / value (like Dynamo)
\vspace{1em}
\item<3-> Coordination-free, CRDT-friendly (inspired by Riak)\\
\vspace{1em}
\item<4-> Cryptography-friendly: values are binary blobs
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Application: an e-mail storage server}
\begin{center}
\only<1>{\includegraphics[width=.9\linewidth]{assets/aerogramme.png}}%
\end{center}
\end{frame}
\begin{frame}
\frametitle{Aerogramme data model}
\begin{center}
\only<1>{\includegraphics[width=.4\linewidth]{assets/aerogramme_datatype.drawio.pdf}}%
\only<2->{\includegraphics[width=.9\linewidth]{assets/aerogramme_keys.drawio.pdf}\vspace{1em}}%
\end{center}
\visible<3->{Aerogramme encrypts all stored values for privacy\\
(Garage server administrators can't read your mail)}
\end{frame}
\begin{frame}
\frametitle{Different deployment scenarios}
\begin{center}
\only<1>{\includegraphics[width=.9\linewidth]{assets/aerogramme_components1.drawio.pdf}}%
\only<2>{\includegraphics[width=.9\linewidth]{assets/aerogramme_components2.drawio.pdf}}%
\end{center}
\end{frame}
\begin{frame}
\frametitle{A new model for building resilient software}
\begin{itemize}
\item Design a data model suited to K2V\\
{\footnotesize (see Cassandra docs on porting SQL data models to Cassandra)}
\vspace{1em}
\begin{itemize}
\item Use CRDTs or other eventually consistent data types (see e.g. Bayou)
\vspace{1em}
\item Store opaque binary blobs to provide End-to-End Encryption\\
\end{itemize}
\vspace{1em}
\item Store big blobs (files) in S3
\vspace{1em}
\item Let Garage manage sharding, replication, failover, etc.
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Research perspectives}
\begin{itemize}
\item Write about Garage's global architecture \emph{(paper in progress)}
\vspace{1em}
\item Measure and improve Garage's performances
\vspace{1em}
\item Discuss the optimal layout algorithm, provide proofs
\vspace{1em}
\item Write about our proposed architecture for (E2EE) apps over K2V+S3
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Where to find us}
\begin{center}
\includegraphics[width=.25\linewidth]{../../logo/garage_hires.png}\\
\vspace{-1em}
\url{https://garagehq.deuxfleurs.fr/}\\
\url{mailto:garagehq@deuxfleurs.fr}\\
\texttt{\#garage:deuxfleurs.fr} on Matrix
\vspace{1.5em}
\includegraphics[width=.06\linewidth]{assets/rust_logo.png}
\includegraphics[width=.13\linewidth]{assets/AGPLv3_Logo.png}
\end{center}
\end{frame}
\end{document}
%% vim: set ts=4 sw=4 tw=0 noet spelllang=en :

158
k2v_test.py Executable file
View file

@ -0,0 +1,158 @@
#!/usr/bin/env python
import os
import requests
from datetime import datetime
# let's talk to our AWS Elasticsearch cluster
#from requests_aws4auth import AWS4Auth
#auth = AWS4Auth('GK31c2f218a2e44f485b94239e',
# 'b892c0665f0ada8a4755dae98baa3b133590e11dae3bcc1f9d769d67f16c3835',
# 'us-east-1',
# 's3')
from aws_requests_auth.aws_auth import AWSRequestsAuth
auth = AWSRequestsAuth(aws_access_key='GK31c2f218a2e44f485b94239e',
aws_secret_access_key='b892c0665f0ada8a4755dae98baa3b133590e11dae3bcc1f9d769d67f16c3835',
aws_host='localhost:3812',
aws_region='us-east-1',
aws_service='k2v')
print("-- ReadIndex")
response = requests.get('http://localhost:3812/alex',
auth=auth)
print(response.headers)
print(response.text)
sort_keys = ["a", "b", "c", "d"]
for sk in sort_keys:
print("-- (%s) Put initial (no CT)"%sk)
response = requests.put('http://localhost:3812/alex/root?sort_key=%s'%sk,
auth=auth,
data='{}: Hello, world!'.format(datetime.timestamp(datetime.now())))
print(response.headers)
print(response.text)
print("-- Get")
response = requests.get('http://localhost:3812/alex/root?sort_key=%s'%sk,
auth=auth)
print(response.headers)
print(response.text)
ct = response.headers["x-garage-causality-token"]
print("-- ReadIndex")
response = requests.get('http://localhost:3812/alex',
auth=auth)
print(response.headers)
print(response.text)
print("-- Put with CT")
response = requests.put('http://localhost:3812/alex/root?sort_key=%s'%sk,
auth=auth,
headers={'x-garage-causality-token': ct},
data='{}: Good bye, world!'.format(datetime.timestamp(datetime.now())))
print(response.headers)
print(response.text)
print("-- Get")
response = requests.get('http://localhost:3812/alex/root?sort_key=%s'%sk,
auth=auth)
print(response.headers)
print(response.text)
print("-- Put again with same CT (concurrent)")
response = requests.put('http://localhost:3812/alex/root?sort_key=%s'%sk,
auth=auth,
headers={'x-garage-causality-token': ct},
data='{}: Concurrent value, oops'.format(datetime.timestamp(datetime.now())))
print(response.headers)
print(response.text)
for sk in sort_keys:
print("-- (%s) Get"%sk)
response = requests.get('http://localhost:3812/alex/root?sort_key=%s'%sk,
auth=auth)
print(response.headers)
print(response.text)
ct = response.headers["x-garage-causality-token"]
print("-- Delete")
response = requests.delete('http://localhost:3812/alex/root?sort_key=%s'%sk,
headers={'x-garage-causality-token': ct},
auth=auth)
print(response.headers)
print(response.text)
print("-- ReadIndex")
response = requests.get('http://localhost:3812/alex',
auth=auth)
print(response.headers)
print(response.text)
print("-- InsertBatch")
response = requests.post('http://localhost:3812/alex',
auth=auth,
data='''
[
{"pk": "root", "sk": "a", "ct": null, "v": "aW5pdGlhbCB0ZXN0Cg=="},
{"pk": "root", "sk": "b", "ct": null, "v": "aW5pdGlhbCB0ZXN1Cg=="},
{"pk": "root", "sk": "c", "ct": null, "v": "aW5pdGlhbCB0ZXN2Cg=="}
]
''')
print(response.headers)
print(response.text)
print("-- ReadIndex")
response = requests.get('http://localhost:3812/alex',
auth=auth)
print(response.headers)
print(response.text)
for sk in sort_keys:
print("-- (%s) Get"%sk)
response = requests.get('http://localhost:3812/alex/root?sort_key=%s'%sk,
auth=auth)
print(response.headers)
print(response.text)
ct = response.headers["x-garage-causality-token"]
print("-- ReadBatch")
response = requests.post('http://localhost:3812/alex?search',
auth=auth,
data='''
[
{"partitionKey": "root"},
{"partitionKey": "root", "tombstones": true},
{"partitionKey": "root", "tombstones": true, "limit": 2},
{"partitionKey": "root", "start": "c", "singleItem": true},
{"partitionKey": "root", "start": "b", "end": "d", "tombstones": true}
]
''')
print(response.headers)
print(response.text)
print("-- DeleteBatch")
response = requests.post('http://localhost:3812/alex?delete',
auth=auth,
data='''
[
{"partitionKey": "root", "start": "b", "end": "c"}
]
''')
print(response.headers)
print(response.text)
print("-- ReadBatch")
response = requests.post('http://localhost:3812/alex?search',
auth=auth,
data='''
[
{"partitionKey": "root"}
]
''')
print(response.headers)
print(response.text)

View file

@ -8,14 +8,12 @@ rec {
sha256 = "1xy9zpypqfxs5gcq5dcla4bfkhxmh5nzn9dyqkr03lqycm9wg5cr";
};
cargo2nixSrc = fetchGit {
# As of 2022-02-03
url = "https://github.com/superboum/cargo2nix";
ref = "backward-compat";
rev = "08d963f32a774353ee8acf3f61749915875c1ec4";
# As of 2022-08-29, stacking two patches: superboum@dedup_propagate and Alexis211@fix_fetchcrategit
url = "https://github.com/Alexis211/cargo2nix";
ref = "fix_fetchcrategit";
rev = "4b31c0cc05b6394916d46e9289f51263d81973b9";
};
/*
* Shared objects
*/

244
nix/compile.nix Normal file
View file

@ -0,0 +1,244 @@
{
system ? builtins.currentSystem,
target ? null,
compiler ? "rustc",
release ? false,
git_version ? null,
}:
with import ./common.nix;
let
log = v: builtins.trace v v;
pkgs = import pkgsSrc {
inherit system;
${ if target == null then null else "crossSystem" } = { config = target; };
overlays = [ cargo2nixOverlay ];
};
/*
Rust and Nix triples are not the same. Cargo2nix has a dedicated library
to convert Nix triples to Rust ones. We need this conversion as we want to
set later options linked to our (rust) target in a generic way. Not only
the triple terminology is different, but also the "roles" are named differently.
Nix uses a build/host/target terminology where Nix's "host" maps to Cargo's "target".
*/
rustTarget = log (pkgs.rustBuilder.rustLib.rustTriple pkgs.stdenv.hostPlatform);
/*
Cargo2nix is built for rustOverlay which installs Rust from Mozilla releases.
We want our own Rust to avoid incompatibilities, like we had with musl 1.2.0.
rustc was built with musl < 1.2.0 and nix shipped musl >= 1.2.0 which lead to compilation breakage.
So we want a Rust release that is bound to our Nix repository to avoid these problems.
See here for more info: https://musl.libc.org/time64.html
Because Cargo2nix does not support the Rust environment shipped by NixOS,
we emulate the structure of the Rust object created by rustOverlay.
In practise, rustOverlay ships rustc+cargo in a single derivation while
NixOS ships them in separate ones. We reunite them with symlinkJoin.
*/
rustChannel = {
rustc = pkgs.symlinkJoin {
name = "rust-channel";
paths = [
pkgs.rustPlatform.rust.cargo
pkgs.rustPlatform.rust.rustc
];
};
clippy = pkgs.symlinkJoin {
name = "clippy-channel";
paths = [
pkgs.rustPlatform.rust.cargo
pkgs.rustPlatform.rust.rustc
pkgs.clippy
];
};
}.${compiler};
clippyBuilder = pkgs.writeScriptBin "clippy" ''
#!${pkgs.stdenv.shell}
. ${cargo2nixSrc + "/overlay/utils.sh"}
isBuildScript=
args=("$@")
for i in "''${!args[@]}"; do
if [ "xmetadata=" = "x''${args[$i]::9}" ]; then
args[$i]=metadata=$NIX_RUST_METADATA
elif [ "x--crate-name" = "x''${args[$i]}" ] && [ "xbuild_script_" = "x''${args[$i+1]::13}" ]; then
isBuildScript=1
fi
done
if [ "$isBuildScript" ]; then
args+=($NIX_RUST_BUILD_LINK_FLAGS)
else
args+=($NIX_RUST_LINK_FLAGS)
fi
touch invoke.log
echo "''${args[@]}" >>invoke.log
exec ${rustChannel}/bin/clippy-driver --deny warnings "''${args[@]}"
'';
buildEnv = (drv: {
rustc = drv.setBuildEnv;
clippy = ''
${drv.setBuildEnv or "" }
echo
echo --- BUILDING WITH CLIPPY ---
echo
export RUSTC=${clippyBuilder}/bin/clippy
'';
}.${compiler});
/*
Cargo2nix provides many overrides by default, you can take inspiration from them:
https://github.com/cargo2nix/cargo2nix/blob/master/overlay/overrides.nix
You can have a complete list of the available options by looking at the overriden object, mkcrate:
https://github.com/cargo2nix/cargo2nix/blob/master/overlay/mkcrate.nix
*/
overrides = pkgs.rustBuilder.overrides.all ++ [
/*
[1] We add some logic to compile our crates with clippy, it provides us many additional lints
[2] We need to alter Nix hardening to make static binaries: PIE,
Position Independent Executables seems to be supported only on amd64. Having
this flag set either 1. make our executables crash or 2. compile as dynamic on some platforms.
Here, we deactivate it. Later (find `codegenOpts`), we reactivate it for supported targets
(only amd64 curently) through the `-static-pie` flag.
PIE is a feature used by ASLR, which helps mitigate security issues.
Learn more about Nix Hardening at: https://github.com/NixOS/nixpkgs/blob/master/pkgs/build-support/cc-wrapper/add-hardening.sh
[3] We want to inject the git version while keeping the build deterministic.
As we do not want to consider the .git folder as part of the input source,
we ask the user (the CI often) to pass the value to Nix.
[4] We ship some parts of the code disabled by default by putting them behind a flag.
It speeds up the compilation (when the feature is not required) and released crates have less dependency by default (less attack surface, disk space, etc.).
But we want to ship these additional features when we release Garage.
In the end, we chose to exclude all features from debug builds while putting (all of) them in the release builds.
[5] We don't want libsodium-sys and zstd-sys to try to use pkgconfig to build against a system library.
However the features to do so get activated for some reason (due to a bug in cargo2nix?),
so disable them manually here.
*/
(pkgs.rustBuilder.rustLib.makeOverride {
name = "garage";
overrideAttrs = drv:
(if git_version != null then {
/* [3] */ preConfigure = ''
${drv.preConfigure or ""}
export GIT_VERSION="${git_version}"
'';
} else {})
//
{
/* [1] */ setBuildEnv = (buildEnv drv);
/* [2] */ hardeningDisable = [ "pie" ];
};
overrideArgs = old: {
/* [4] */ features = [ "bundled-libs" "sled" ]
++ (if release then [ "kubernetes-discovery" "telemetry-otlp" "metrics" "lmdb" "sqlite" ] else []);
};
})
(pkgs.rustBuilder.rustLib.makeOverride {
name = "garage_rpc";
overrideAttrs = drv: { /* [1] */ setBuildEnv = (buildEnv drv); };
})
(pkgs.rustBuilder.rustLib.makeOverride {
name = "garage_db";
overrideAttrs = drv: { /* [1] */ setBuildEnv = (buildEnv drv); };
})
(pkgs.rustBuilder.rustLib.makeOverride {
name = "garage_util";
overrideAttrs = drv: { /* [1] */ setBuildEnv = (buildEnv drv); };
})
(pkgs.rustBuilder.rustLib.makeOverride {
name = "garage_table";
overrideAttrs = drv: { /* [1] */ setBuildEnv = (buildEnv drv); };
})
(pkgs.rustBuilder.rustLib.makeOverride {
name = "garage_block";
overrideAttrs = drv: { /* [1] */ setBuildEnv = (buildEnv drv); };
})
(pkgs.rustBuilder.rustLib.makeOverride {
name = "garage_model";
overrideAttrs = drv: { /* [1] */ setBuildEnv = (buildEnv drv); };
})
(pkgs.rustBuilder.rustLib.makeOverride {
name = "garage_api";
overrideAttrs = drv: { /* [1] */ setBuildEnv = (buildEnv drv); };
})
(pkgs.rustBuilder.rustLib.makeOverride {
name = "garage_web";
overrideAttrs = drv: { /* [1] */ setBuildEnv = (buildEnv drv); };
})
(pkgs.rustBuilder.rustLib.makeOverride {
name = "k2v-client";
overrideAttrs = drv: { /* [1] */ setBuildEnv = (buildEnv drv); };
})
(pkgs.rustBuilder.rustLib.makeOverride {
name = "libsodium-sys";
overrideArgs = old: {
features = [ ]; /* [5] */
};
})
(pkgs.rustBuilder.rustLib.makeOverride {
name = "zstd-sys";
overrideArgs = old: {
features = [ ]; /* [5] */
};
})
];
packageFun = import ../Cargo.nix;
/*
We compile fully static binaries with musl to simplify deployment on most systems.
When possible, we reactivate PIE hardening (see above).
Also, if you set the RUSTFLAGS environment variable, the following parameters will
be ignored.
For more information on static builds, please refer to Rust's RFC 1721.
https://rust-lang.github.io/rfcs/1721-crt-static.html#specifying-dynamicstatic-c-runtime-linkage
*/
codegenOpts = {
"armv6l-unknown-linux-musleabihf" = [ "target-feature=+crt-static" "link-arg=-static" ]; /* compile as dynamic with static-pie */
"aarch64-unknown-linux-musl" = [ "target-feature=+crt-static" "link-arg=-static" ]; /* segfault with static-pie */
"i686-unknown-linux-musl" = [ "target-feature=+crt-static" "link-arg=-static" ]; /* segfault with static-pie */
"x86_64-unknown-linux-musl" = [ "target-feature=+crt-static" "link-arg=-static-pie" ];
};
in
/*
The following definition is not elegant as we use a low level function of Cargo2nix
that enables us to pass our custom rustChannel object. We need this low level definition
to pass Nix's Rust toolchains instead of Mozilla's one.
target is mandatory but must be kept to null to allow cargo2nix to set it to the appropriate value
for each crate.
*/
pkgs.rustBuilder.makePackageSet {
inherit packageFun rustChannel release codegenOpts;
packageOverrides = overrides;
target = null;
buildRustPackages = pkgs.buildPackages.rustBuilder.makePackageSet {
inherit rustChannel packageFun codegenOpts;
packageOverrides = overrides;
target = null;
};
}

View file

@ -1,4 +1,9 @@
substituters = https://cache.nixos.org https://nix.web.deuxfleurs.fr
trusted-public-keys = cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY= nix.web.deuxfleurs.fr:eTGL6kvaQn6cDR/F9lDYUIP9nCVR/kkshYfLDJf1yKs=
max-jobs = auto
cores = 4
cores = 0
log-lines = 200
filter-syscalls = false
sandbox = false
keep-outputs = true
keep-derivations = true

View file

@ -11,7 +11,7 @@ PATH="${GARAGE_DEBUG}:${GARAGE_RELEASE}:${NIX_RELEASE}:$PATH"
FANCYCOLORS=("41m" "42m" "44m" "45m" "100m" "104m")
export RUST_BACKTRACE=1
export RUST_LOG=garage=info,garage_api=debug
export RUST_LOG=garage=info,garage_api=debug,netapp=trace
MAIN_LABEL="\e[${FANCYCOLORS[0]}[main]\e[49m"
WHICH_GARAGE=$(which garage || exit 1)
@ -44,6 +44,9 @@ root_domain = ".s3.garage.localhost"
bind_addr = "0.0.0.0:$((3920+$count))"
root_domain = ".web.garage.localhost"
index = "index.html"
[admin]
api_bind_addr = "0.0.0.0:$((9900+$count))"
EOF
echo -en "$LABEL configuration written to $CONF_PATH\n"

13
script/k8s/README.md Normal file
View file

@ -0,0 +1,13 @@
Spawn a cluster with minikube
```bash
minikube start
minikube kubectl -- apply -f config.yaml
minikube kubectl -- apply -f daemon.yaml
minikube dashboard
minikube kubectl -- exec -it garage-0 --container garage -- /garage status
# etc.
```

12
script/k8s/admin.yaml Normal file
View file

@ -0,0 +1,12 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: garage-admin
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: User
name: system:serviceaccount:default:default

30
script/k8s/config.yaml Normal file
View file

@ -0,0 +1,30 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: garage-config
namespace: default
data:
garage.toml: |-
metadata_dir = "/tmp/meta"
data_dir = "/tmp/data"
replication_mode = "3"
rpc_bind_addr = "[::]:3901"
rpc_secret = "1799bccfd7411eddcf9ebd316bc1f5287ad12a68094e1c6ac6abde7e6feae1ec"
bootstrap_peers = []
kubernetes_namespace = "default"
kubernetes_service_name = "garage-daemon"
kubernetes_skip_crd = false
[s3_api]
s3_region = "garage"
api_bind_addr = "[::]:3900"
root_domain = ".s3.garage.tld"
[s3_web]
bind_addr = "[::]:3902"
root_domain = ".web.garage.tld"
index = "index.html"

52
script/k8s/daemon.yaml Normal file
View file

@ -0,0 +1,52 @@
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: garage
spec:
selector:
matchLabels:
app: garage
serviceName: "garage"
replicas: 3
template:
metadata:
labels:
app: garage
spec:
terminationGracePeriodSeconds: 10
containers:
- name: garage
image: dxflrs/amd64_garage:v0.7.0-rc1
ports:
- containerPort: 3900
name: s3-api
- containerPort: 3902
name: web-api
volumeMounts:
- name: fast
mountPath: /mnt/fast
- name: slow
mountPath: /mnt/slow
- name: etc
mountPath: /etc/garage.toml
subPath: garage.toml
volumes:
- name: etc
configMap:
name: garage-config
volumeClaimTemplates:
- metadata:
name: fast
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 100Mi
- metadata:
name: slow
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 100Mi

14
script/not-dynamic.sh Executable file
View file

@ -0,0 +1,14 @@
#!/usr/bin/env bash
set -e
if [ "$#" -ne 1 ]; then
echo "[fail] usage: $0 binary"
exit 2
fi
if file $1 | grep 'dynamically linked' 2>&1; then
echo "[fail] $1 is dynamic"
exit 1
fi
echo "[ok] $1 is probably static"

View file

@ -0,0 +1,21 @@
Configure your `[admin-api]` endpoint:
```
[admin]
api_bind_addr = "0.0.0.0:3903"
trace_sink = "http://localhost:4317"
```
Start the test stack:
```
cd telemetry
docker-compose up
```
Access the web interfaces:
- [Kibana](http://localhost:5601) - Click on the hamburger menu, in the Observability section, click APM
- [Grafana](http://localhost:3000) - Set a password, then on the left menu, click Dashboard -> Browse. On the new page click Import -> Choose the test dashboard we ship `grafana-garage-dashboard-elasticsearch.json`

View file

@ -0,0 +1,3 @@
COMPOSE_PROJECT_NAME=telemetry
OTEL_COLLECT_TAG=0.44.0
ELASTIC_BUNDLE_TAG=7.17.0

View file

@ -0,0 +1,10 @@
apm-server:
# Defines the host and port the server is listening on. Use "unix:/path/to.sock" to listen on a unix domain socket.
host: "0.0.0.0:8200"
#-------------------------- Elasticsearch output --------------------------
output.elasticsearch:
# Array of hosts to connect to.
# Scheme and port can be left out and will be set to the default (`http` and `9200`).
# In case you specify and additional path, the scheme is required: `http://localhost:9200/path`.
# IPv6 addresses should always be defined as: `https://[2001:db8::1]:9200`.
hosts: ["localhost:9200"]

View file

@ -0,0 +1,69 @@
version: "2"
services:
otel:
image: otel/opentelemetry-collector-contrib:${OTEL_COLLECT_TAG}
command: [ "--config=/etc/otel-config.yaml" ]
volumes:
- ./otel-config.yaml:/etc/otel-config.yaml
network_mode: "host"
elastic:
image: docker.elastic.co/elasticsearch/elasticsearch:${ELASTIC_BUNDLE_TAG}
container_name: elastic
environment:
- "node.name=elastic"
- "http.port=9200"
- "cluster.name=es-docker-cluster"
- "discovery.type=single-node"
- "bootstrap.memory_lock=true"
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
ulimits:
memlock:
soft: -1
hard: -1
nofile: 65536
volumes:
- "es_data:/usr/share/elasticsearch/data"
network_mode: "host"
# kibana instance and collectors
# see https://www.elastic.co/guide/en/elastic-stack-get-started/current/get-started-docker.html
kibana:
image: docker.elastic.co/kibana/kibana:${ELASTIC_BUNDLE_TAG}
container_name: kibana
environment:
SERVER_NAME: "kibana.local"
# ELASTICSEARCH_URL: "http://localhost:9700"
ELASTICSEARCH_HOSTS: "http://localhost:9200"
depends_on: [ 'elastic' ]
network_mode: "host"
apm:
image: docker.elastic.co/apm/apm-server:${ELASTIC_BUNDLE_TAG}
container_name: apm
volumes:
- "./apm-config.yaml:/usr/share/apm-server/apm-server.yml:ro"
depends_on: [ 'elastic' ]
network_mode: "host"
grafana:
# see https://grafana.com/docs/grafana/latest/installation/docker/
image: "grafana/grafana:8.3.5"
container_name: grafana
# restart: unless-stopped
environment:
- "GF_INSTALL_PLUGINS=grafana-clock-panel,grafana-simple-json-datasource,grafana-piechart-panel,grafana-worldmap-panel,grafana-polystat-panel"
network_mode: "host"
volumes:
# chown 472:472 if needed
- grafana:/var/lib/grafana
- ./grafana/provisioning/:/etc/grafana/provisioning/
volumes:
es_data:
driver: local
grafana:
driver: local
metricbeat:
driver: local

View file

@ -0,0 +1,19 @@
apiVersion: 1
datasources:
- name: DS_ELASTICSEARCH
type: elasticsearch
access: proxy
url: http://localhost:9200
password: ''
user: ''
database: apm-*
basicAuth: false
isDefault: true
jsonData:
esVersion: 7.10.0
logLevelField: ''
logMessageField: ''
maxConcurrentShardRequests: 5
timeField: "@timestamp"
readOnly: false

View file

@ -0,0 +1,47 @@
receivers:
# Data sources: metrics, traces
otlp:
protocols:
grpc:
endpoint: ":4317"
http:
endpoint: ":55681"
# Data sources: metrics
prometheus:
config:
scrape_configs:
- job_name: "garage"
scrape_interval: 5s
static_configs:
- targets: ["localhost:3903"]
exporters:
logging:
logLevel: info
# see https://www.elastic.co/guide/en/apm/get-started/current/open-telemetry-elastic.html#open-telemetry-collector
otlp/elastic:
endpoint: "localhost:8200"
tls:
insecure: true
processors:
batch:
extensions:
health_check:
pprof:
endpoint: :1888
zpages:
endpoint: :55679
service:
extensions: [pprof, zpages, health_check]
pipelines:
traces:
receivers: [otlp]
processors: [batch]
exporters: [logging, otlp/elastic]
metrics:
receivers: [otlp, prometheus]
processors: [batch]
exporters: [logging, otlp/elastic]

File diff suppressed because it is too large Load diff

View file

@ -1,8 +1,5 @@
{
system ? builtins.currentSystem,
rust ? true,
integration ? true,
release ? true,
}:
with import ./nix/common.nix;
@ -16,8 +13,58 @@ let
winscp = (import ./nix/winscp.nix) pkgs;
in
{
pkgs.mkShell {
/* --- Rust Shell ---
* Use it to compile Garage
*/
rust = pkgs.mkShell {
shellHook = ''
function refresh_toolchain {
nix copy \
--to 's3://nix?endpoint=garage.deuxfleurs.fr&region=garage&secret-key=/etc/nix/signing-key.sec' \
$(nix-store -qR \
$(nix-build --quiet --no-build-output --no-out-link nix/toolchain.nix))
}
'';
nativeBuildInputs = [
#pkgs.rustPlatform.rust.rustc
pkgs.rustPlatform.rust.cargo
#pkgs.clippy
pkgs.rustfmt
#pkgs.perl
#pkgs.protobuf
#pkgs.pkg-config
#pkgs.openssl
pkgs.file
#cargo2nix.packages.x86_64-linux.cargo2nix
];
};
/* --- Integration shell ---
* Use it to test Garage with common S3 clients
*/
integration = pkgs.mkShell {
nativeBuildInputs = [
winscp
pkgs.s3cmd
pkgs.awscli2
pkgs.minio-client
pkgs.rclone
pkgs.socat
pkgs.psmisc
pkgs.which
pkgs.openssl
pkgs.curl
pkgs.jq
];
};
/* --- Release shell ---
* A shell built to make releasing easier
*/
release = pkgs.mkShell {
shellHook = ''
function to_s3 {
aws \
@ -62,41 +109,12 @@ function refresh_index {
result/share/_releases.html \
s3://garagehq.deuxfleurs.fr/
}
function refresh_toolchain {
nix copy \
--to 's3://nix?endpoint=garage.deuxfleurs.fr&region=garage&secret-key=/etc/nix/signing-key.sec' \
$(nix-store -qR \
$(nix-build --quiet --no-build-output --no-out-link nix/toolchain.nix))
}
'';
nativeBuildInputs =
(if rust then [
pkgs.rustPlatform.rust.rustc
pkgs.rustPlatform.rust.cargo
pkgs.clippy
pkgs.rustfmt
cargo2nix.packages.x86_64-linux.cargo2nix
] else [])
++
(if integration then [
winscp
pkgs.s3cmd
pkgs.awscli2
pkgs.minio-client
pkgs.rclone
pkgs.socat
pkgs.psmisc
pkgs.which
pkgs.openssl
pkgs.curl
pkgs.jq
] else [])
++
(if release then [
nativeBuildInputs = [
pkgs.awscli2
kaniko
] else [])
;
];
};
}

View file

@ -1,6 +1,6 @@
[package]
name = "garage_api"
version = "0.6.0"
version = "0.8.0"
authors = ["Alex Auvolat <alex@adnab.me>"]
edition = "2018"
license = "AGPL-3.0"
@ -14,22 +14,25 @@ path = "lib.rs"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
garage_model = { version = "0.6.0", path = "../model" }
garage_table = { version = "0.6.0", path = "../table" }
garage_util = { version = "0.6.0", path = "../util" }
garage_model = { version = "0.8.0", path = "../model" }
garage_table = { version = "0.8.0", path = "../table" }
garage_block = { version = "0.8.0", path = "../block" }
garage_util = { version = "0.8.0", path = "../util" }
garage_rpc = { version = "0.8.0", path = "../rpc" }
async-trait = "0.1.7"
base64 = "0.13"
bytes = "1.0"
chrono = "0.4"
crypto-mac = "0.10"
crypto-common = "0.1"
err-derive = "0.3"
hex = "0.4"
hmac = "0.10"
hmac = "0.12"
idna = "0.2"
log = "0.4"
md-5 = "0.9"
tracing = "0.1.30"
md-5 = "0.10"
nom = "7.1"
sha2 = "0.9"
sha2 = "0.10"
futures = "0.3"
futures-util = "0.3"
@ -49,3 +52,11 @@ serde_bytes = "0.11"
serde_json = "1.0"
quick-xml = { version = "0.21", features = [ "serialize" ] }
url = "2.1"
opentelemetry = "0.17"
opentelemetry-prometheus = { version = "0.10", optional = true }
prometheus = { version = "0.13", optional = true }
[features]
k2v = [ "garage_util/k2v", "garage_model/k2v" ]
metrics = [ "opentelemetry-prometheus", "prometheus" ]

206
src/api/admin/api_server.rs Normal file
View file

@ -0,0 +1,206 @@
use std::net::SocketAddr;
use std::sync::Arc;
use async_trait::async_trait;
use futures::future::Future;
use http::header::{ACCESS_CONTROL_ALLOW_METHODS, ACCESS_CONTROL_ALLOW_ORIGIN, ALLOW};
use hyper::{Body, Request, Response};
use opentelemetry::trace::SpanRef;
#[cfg(feature = "metrics")]
use opentelemetry_prometheus::PrometheusExporter;
#[cfg(feature = "metrics")]
use prometheus::{Encoder, TextEncoder};
use garage_model::garage::Garage;
use garage_util::error::Error as GarageError;
use crate::generic_server::*;
use crate::admin::bucket::*;
use crate::admin::cluster::*;
use crate::admin::error::*;
use crate::admin::key::*;
use crate::admin::router::{Authorization, Endpoint};
pub struct AdminApiServer {
garage: Arc<Garage>,
#[cfg(feature = "metrics")]
exporter: PrometheusExporter,
metrics_token: Option<String>,
admin_token: Option<String>,
}
impl AdminApiServer {
pub fn new(garage: Arc<Garage>) -> Self {
let cfg = &garage.config.admin;
let metrics_token = cfg
.metrics_token
.as_ref()
.map(|tok| format!("Bearer {}", tok));
let admin_token = cfg
.admin_token
.as_ref()
.map(|tok| format!("Bearer {}", tok));
Self {
garage,
#[cfg(feature = "metrics")]
exporter: opentelemetry_prometheus::exporter().init(),
metrics_token,
admin_token,
}
}
pub async fn run(
self,
bind_addr: SocketAddr,
shutdown_signal: impl Future<Output = ()>,
) -> Result<(), GarageError> {
let region = self.garage.config.s3_api.s3_region.clone();
ApiServer::new(region, self)
.run_server(bind_addr, shutdown_signal)
.await
}
fn handle_options(&self, _req: &Request<Body>) -> Result<Response<Body>, Error> {
Ok(Response::builder()
.status(204)
.header(ALLOW, "OPTIONS, GET, POST")
.header(ACCESS_CONTROL_ALLOW_METHODS, "OPTIONS, GET, POST")
.header(ACCESS_CONTROL_ALLOW_ORIGIN, "*")
.body(Body::empty())?)
}
fn handle_metrics(&self) -> Result<Response<Body>, Error> {
#[cfg(feature = "metrics")]
{
use opentelemetry::trace::Tracer;
let mut buffer = vec![];
let encoder = TextEncoder::new();
let tracer = opentelemetry::global::tracer("garage");
let metric_families = tracer.in_span("admin/gather_metrics", |_| {
self.exporter.registry().gather()
});
encoder
.encode(&metric_families, &mut buffer)
.ok_or_internal_error("Could not serialize metrics")?;
Ok(Response::builder()
.status(200)
.header(http::header::CONTENT_TYPE, encoder.format_type())
.body(Body::from(buffer))?)
}
#[cfg(not(feature = "metrics"))]
Err(Error::bad_request(
"Garage was built without the metrics feature".to_string(),
))
}
}
#[async_trait]
impl ApiHandler for AdminApiServer {
const API_NAME: &'static str = "admin";
const API_NAME_DISPLAY: &'static str = "Admin";
type Endpoint = Endpoint;
type Error = Error;
fn parse_endpoint(&self, req: &Request<Body>) -> Result<Endpoint, Error> {
Endpoint::from_request(req)
}
async fn handle(
&self,
req: Request<Body>,
endpoint: Endpoint,
) -> Result<Response<Body>, Error> {
let expected_auth_header =
match endpoint.authorization_type() {
Authorization::MetricsToken => self.metrics_token.as_ref(),
Authorization::AdminToken => match &self.admin_token {
None => return Err(Error::forbidden(
"Admin token isn't configured, admin API access is disabled for security.",
)),
Some(t) => Some(t),
},
};
if let Some(h) = expected_auth_header {
match req.headers().get("Authorization") {
None => return Err(Error::forbidden("Authorization token must be provided")),
Some(v) => {
let authorized = v.to_str().map(|hv| hv.trim() == h).unwrap_or(false);
if !authorized {
return Err(Error::forbidden("Invalid authorization token provided"));
}
}
}
}
match endpoint {
Endpoint::Options => self.handle_options(&req),
Endpoint::Metrics => self.handle_metrics(),
Endpoint::GetClusterStatus => handle_get_cluster_status(&self.garage).await,
Endpoint::ConnectClusterNodes => handle_connect_cluster_nodes(&self.garage, req).await,
// Layout
Endpoint::GetClusterLayout => handle_get_cluster_layout(&self.garage).await,
Endpoint::UpdateClusterLayout => handle_update_cluster_layout(&self.garage, req).await,
Endpoint::ApplyClusterLayout => handle_apply_cluster_layout(&self.garage, req).await,
Endpoint::RevertClusterLayout => handle_revert_cluster_layout(&self.garage, req).await,
// Keys
Endpoint::ListKeys => handle_list_keys(&self.garage).await,
Endpoint::GetKeyInfo { id, search } => {
handle_get_key_info(&self.garage, id, search).await
}
Endpoint::CreateKey => handle_create_key(&self.garage, req).await,
Endpoint::ImportKey => handle_import_key(&self.garage, req).await,
Endpoint::UpdateKey { id } => handle_update_key(&self.garage, id, req).await,
Endpoint::DeleteKey { id } => handle_delete_key(&self.garage, id).await,
// Buckets
Endpoint::ListBuckets => handle_list_buckets(&self.garage).await,
Endpoint::GetBucketInfo { id, global_alias } => {
handle_get_bucket_info(&self.garage, id, global_alias).await
}
Endpoint::CreateBucket => handle_create_bucket(&self.garage, req).await,
Endpoint::DeleteBucket { id } => handle_delete_bucket(&self.garage, id).await,
Endpoint::UpdateBucket { id } => handle_update_bucket(&self.garage, id, req).await,
// Bucket-key permissions
Endpoint::BucketAllowKey => {
handle_bucket_change_key_perm(&self.garage, req, true).await
}
Endpoint::BucketDenyKey => {
handle_bucket_change_key_perm(&self.garage, req, false).await
}
// Bucket aliasing
Endpoint::GlobalAliasBucket { id, alias } => {
handle_global_alias_bucket(&self.garage, id, alias).await
}
Endpoint::GlobalUnaliasBucket { id, alias } => {
handle_global_unalias_bucket(&self.garage, id, alias).await
}
Endpoint::LocalAliasBucket {
id,
access_key_id,
alias,
} => handle_local_alias_bucket(&self.garage, id, access_key_id, alias).await,
Endpoint::LocalUnaliasBucket {
id,
access_key_id,
alias,
} => handle_local_unalias_bucket(&self.garage, id, access_key_id, alias).await,
}
}
}
impl ApiEndpoint for Endpoint {
fn name(&self) -> &'static str {
Endpoint::name(self)
}
fn add_span_attributes(&self, _span: SpanRef<'_>) {}
}

580
src/api/admin/bucket.rs Normal file
View file

@ -0,0 +1,580 @@
use std::collections::HashMap;
use std::sync::Arc;
use hyper::{Body, Request, Response, StatusCode};
use serde::{Deserialize, Serialize};
use garage_util::crdt::*;
use garage_util::data::*;
use garage_util::time::*;
use garage_table::*;
use garage_model::bucket_alias_table::*;
use garage_model::bucket_table::*;
use garage_model::garage::Garage;
use garage_model::permission::*;
use garage_model::s3::object_table::*;
use crate::admin::error::*;
use crate::admin::key::ApiBucketKeyPerm;
use crate::common_error::CommonError;
use crate::helpers::{json_ok_response, parse_json_body};
pub async fn handle_list_buckets(garage: &Arc<Garage>) -> Result<Response<Body>, Error> {
let buckets = garage
.bucket_table
.get_range(
&EmptyKey,
None,
Some(DeletedFilter::NotDeleted),
10000,
EnumerationOrder::Forward,
)
.await?;
let res = buckets
.into_iter()
.map(|b| {
let state = b.state.as_option().unwrap();
ListBucketResultItem {
id: hex::encode(b.id),
global_aliases: state
.aliases
.items()
.iter()
.filter(|(_, _, a)| *a)
.map(|(n, _, _)| n.to_string())
.collect::<Vec<_>>(),
local_aliases: state
.local_aliases
.items()
.iter()
.filter(|(_, _, a)| *a)
.map(|((k, n), _, _)| BucketLocalAlias {
access_key_id: k.to_string(),
alias: n.to_string(),
})
.collect::<Vec<_>>(),
}
})
.collect::<Vec<_>>();
Ok(json_ok_response(&res)?)
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct ListBucketResultItem {
id: String,
global_aliases: Vec<String>,
local_aliases: Vec<BucketLocalAlias>,
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct BucketLocalAlias {
access_key_id: String,
alias: String,
}
#[derive(Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
struct ApiBucketQuotas {
max_size: Option<u64>,
max_objects: Option<u64>,
}
pub async fn handle_get_bucket_info(
garage: &Arc<Garage>,
id: Option<String>,
global_alias: Option<String>,
) -> Result<Response<Body>, Error> {
let bucket_id = match (id, global_alias) {
(Some(id), None) => parse_bucket_id(&id)?,
(None, Some(ga)) => garage
.bucket_helper()
.resolve_global_bucket_name(&ga)
.await?
.ok_or_else(|| HelperError::NoSuchBucket(ga.to_string()))?,
_ => {
return Err(Error::bad_request(
"Either id or globalAlias must be provided (but not both)",
));
}
};
bucket_info_results(garage, bucket_id).await
}
async fn bucket_info_results(
garage: &Arc<Garage>,
bucket_id: Uuid,
) -> Result<Response<Body>, Error> {
let bucket = garage
.bucket_helper()
.get_existing_bucket(bucket_id)
.await?;
let counters = garage
.object_counter_table
.table
.get(&bucket_id, &EmptyKey)
.await?
.map(|x| x.filtered_values(&garage.system.ring.borrow()))
.unwrap_or_default();
let mut relevant_keys = HashMap::new();
for (k, _) in bucket
.state
.as_option()
.unwrap()
.authorized_keys
.items()
.iter()
{
if let Some(key) = garage
.key_table
.get(&EmptyKey, k)
.await?
.filter(|k| !k.is_deleted())
{
if !key.state.is_deleted() {
relevant_keys.insert(k.clone(), key);
}
}
}
for ((k, _), _, _) in bucket
.state
.as_option()
.unwrap()
.local_aliases
.items()
.iter()
{
if relevant_keys.contains_key(k) {
continue;
}
if let Some(key) = garage.key_table.get(&EmptyKey, k).await? {
if !key.state.is_deleted() {
relevant_keys.insert(k.clone(), key);
}
}
}
let state = bucket.state.as_option().unwrap();
let quotas = state.quotas.get();
let res =
GetBucketInfoResult {
id: hex::encode(&bucket.id),
global_aliases: state
.aliases
.items()
.iter()
.filter(|(_, _, a)| *a)
.map(|(n, _, _)| n.to_string())
.collect::<Vec<_>>(),
website_access: state.website_config.get().is_some(),
website_config: state.website_config.get().clone().map(|wsc| {
GetBucketInfoWebsiteResult {
index_document: wsc.index_document,
error_document: wsc.error_document,
}
}),
keys: relevant_keys
.into_iter()
.map(|(_, key)| {
let p = key.state.as_option().unwrap();
GetBucketInfoKey {
access_key_id: key.key_id,
name: p.name.get().to_string(),
permissions: p
.authorized_buckets
.get(&bucket.id)
.map(|p| ApiBucketKeyPerm {
read: p.allow_read,
write: p.allow_write,
owner: p.allow_owner,
})
.unwrap_or_default(),
bucket_local_aliases: p
.local_aliases
.items()
.iter()
.filter(|(_, _, b)| *b == Some(bucket.id))
.map(|(n, _, _)| n.to_string())
.collect::<Vec<_>>(),
}
})
.collect::<Vec<_>>(),
objects: counters.get(OBJECTS).cloned().unwrap_or_default(),
bytes: counters.get(BYTES).cloned().unwrap_or_default(),
unfinshed_uploads: counters
.get(UNFINISHED_UPLOADS)
.cloned()
.unwrap_or_default(),
quotas: ApiBucketQuotas {
max_size: quotas.max_size,
max_objects: quotas.max_objects,
},
};
Ok(json_ok_response(&res)?)
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct GetBucketInfoResult {
id: String,
global_aliases: Vec<String>,
website_access: bool,
#[serde(default)]
website_config: Option<GetBucketInfoWebsiteResult>,
keys: Vec<GetBucketInfoKey>,
objects: i64,
bytes: i64,
unfinshed_uploads: i64,
quotas: ApiBucketQuotas,
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct GetBucketInfoWebsiteResult {
index_document: String,
error_document: Option<String>,
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct GetBucketInfoKey {
access_key_id: String,
name: String,
permissions: ApiBucketKeyPerm,
bucket_local_aliases: Vec<String>,
}
pub async fn handle_create_bucket(
garage: &Arc<Garage>,
req: Request<Body>,
) -> Result<Response<Body>, Error> {
let req = parse_json_body::<CreateBucketRequest>(req).await?;
if let Some(ga) = &req.global_alias {
if !is_valid_bucket_name(ga) {
return Err(Error::bad_request(format!(
"{}: {}",
ga, INVALID_BUCKET_NAME_MESSAGE
)));
}
if let Some(alias) = garage.bucket_alias_table.get(&EmptyKey, ga).await? {
if alias.state.get().is_some() {
return Err(CommonError::BucketAlreadyExists.into());
}
}
}
if let Some(la) = &req.local_alias {
if !is_valid_bucket_name(&la.alias) {
return Err(Error::bad_request(format!(
"{}: {}",
la.alias, INVALID_BUCKET_NAME_MESSAGE
)));
}
let key = garage
.key_helper()
.get_existing_key(&la.access_key_id)
.await?;
let state = key.state.as_option().unwrap();
if matches!(state.local_aliases.get(&la.alias), Some(_)) {
return Err(Error::bad_request("Local alias already exists"));
}
}
let bucket = Bucket::new();
garage.bucket_table.insert(&bucket).await?;
if let Some(ga) = &req.global_alias {
garage
.bucket_helper()
.set_global_bucket_alias(bucket.id, ga)
.await?;
}
if let Some(la) = &req.local_alias {
garage
.bucket_helper()
.set_local_bucket_alias(bucket.id, &la.access_key_id, &la.alias)
.await?;
if la.allow.read || la.allow.write || la.allow.owner {
garage
.bucket_helper()
.set_bucket_key_permissions(
bucket.id,
&la.access_key_id,
BucketKeyPerm {
timestamp: now_msec(),
allow_read: la.allow.read,
allow_write: la.allow.write,
allow_owner: la.allow.owner,
},
)
.await?;
}
}
bucket_info_results(garage, bucket.id).await
}
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct CreateBucketRequest {
global_alias: Option<String>,
local_alias: Option<CreateBucketLocalAlias>,
}
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct CreateBucketLocalAlias {
access_key_id: String,
alias: String,
#[serde(default)]
allow: ApiBucketKeyPerm,
}
pub async fn handle_delete_bucket(
garage: &Arc<Garage>,
id: String,
) -> Result<Response<Body>, Error> {
let helper = garage.bucket_helper();
let bucket_id = parse_bucket_id(&id)?;
let mut bucket = helper.get_existing_bucket(bucket_id).await?;
let state = bucket.state.as_option().unwrap();
// Check bucket is empty
if !helper.is_bucket_empty(bucket_id).await? {
return Err(CommonError::BucketNotEmpty.into());
}
// --- done checking, now commit ---
// 1. delete authorization from keys that had access
for (key_id, perm) in bucket.authorized_keys() {
if perm.is_any() {
helper
.set_bucket_key_permissions(bucket.id, key_id, BucketKeyPerm::NO_PERMISSIONS)
.await?;
}
}
// 2. delete all local aliases
for ((key_id, alias), _, active) in state.local_aliases.items().iter() {
if *active {
helper
.unset_local_bucket_alias(bucket.id, key_id, alias)
.await?;
}
}
// 3. delete all global aliases
for (alias, _, active) in state.aliases.items().iter() {
if *active {
helper.purge_global_bucket_alias(bucket.id, alias).await?;
}
}
// 4. delete bucket
bucket.state = Deletable::delete();
garage.bucket_table.insert(&bucket).await?;
Ok(Response::builder()
.status(StatusCode::NO_CONTENT)
.body(Body::empty())?)
}
pub async fn handle_update_bucket(
garage: &Arc<Garage>,
id: String,
req: Request<Body>,
) -> Result<Response<Body>, Error> {
let req = parse_json_body::<UpdateBucketRequest>(req).await?;
let bucket_id = parse_bucket_id(&id)?;
let mut bucket = garage
.bucket_helper()
.get_existing_bucket(bucket_id)
.await?;
let state = bucket.state.as_option_mut().unwrap();
if let Some(wa) = req.website_access {
if wa.enabled {
state.website_config.update(Some(WebsiteConfig {
index_document: wa.index_document.ok_or_bad_request(
"Please specify indexDocument when enabling website access.",
)?,
error_document: wa.error_document,
}));
} else {
if wa.index_document.is_some() || wa.error_document.is_some() {
return Err(Error::bad_request(
"Cannot specify indexDocument or errorDocument when disabling website access.",
));
}
state.website_config.update(None);
}
}
if let Some(q) = req.quotas {
state.quotas.update(BucketQuotas {
max_size: q.max_size,
max_objects: q.max_objects,
});
}
garage.bucket_table.insert(&bucket).await?;
bucket_info_results(garage, bucket_id).await
}
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct UpdateBucketRequest {
website_access: Option<UpdateBucketWebsiteAccess>,
quotas: Option<ApiBucketQuotas>,
}
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct UpdateBucketWebsiteAccess {
enabled: bool,
index_document: Option<String>,
error_document: Option<String>,
}
// ---- BUCKET/KEY PERMISSIONS ----
pub async fn handle_bucket_change_key_perm(
garage: &Arc<Garage>,
req: Request<Body>,
new_perm_flag: bool,
) -> Result<Response<Body>, Error> {
let req = parse_json_body::<BucketKeyPermChangeRequest>(req).await?;
let bucket_id = parse_bucket_id(&req.bucket_id)?;
let bucket = garage
.bucket_helper()
.get_existing_bucket(bucket_id)
.await?;
let state = bucket.state.as_option().unwrap();
let key = garage
.key_helper()
.get_existing_key(&req.access_key_id)
.await?;
let mut perm = state
.authorized_keys
.get(&key.key_id)
.cloned()
.unwrap_or(BucketKeyPerm::NO_PERMISSIONS);
if req.permissions.read {
perm.allow_read = new_perm_flag;
}
if req.permissions.write {
perm.allow_write = new_perm_flag;
}
if req.permissions.owner {
perm.allow_owner = new_perm_flag;
}
garage
.bucket_helper()
.set_bucket_key_permissions(bucket.id, &key.key_id, perm)
.await?;
bucket_info_results(garage, bucket.id).await
}
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct BucketKeyPermChangeRequest {
bucket_id: String,
access_key_id: String,
permissions: ApiBucketKeyPerm,
}
// ---- BUCKET ALIASES ----
pub async fn handle_global_alias_bucket(
garage: &Arc<Garage>,
bucket_id: String,
alias: String,
) -> Result<Response<Body>, Error> {
let bucket_id = parse_bucket_id(&bucket_id)?;
garage
.bucket_helper()
.set_global_bucket_alias(bucket_id, &alias)
.await?;
bucket_info_results(garage, bucket_id).await
}
pub async fn handle_global_unalias_bucket(
garage: &Arc<Garage>,
bucket_id: String,
alias: String,
) -> Result<Response<Body>, Error> {
let bucket_id = parse_bucket_id(&bucket_id)?;
garage
.bucket_helper()
.unset_global_bucket_alias(bucket_id, &alias)
.await?;
bucket_info_results(garage, bucket_id).await
}
pub async fn handle_local_alias_bucket(
garage: &Arc<Garage>,
bucket_id: String,
access_key_id: String,
alias: String,
) -> Result<Response<Body>, Error> {
let bucket_id = parse_bucket_id(&bucket_id)?;
garage
.bucket_helper()
.set_local_bucket_alias(bucket_id, &access_key_id, &alias)
.await?;
bucket_info_results(garage, bucket_id).await
}
pub async fn handle_local_unalias_bucket(
garage: &Arc<Garage>,
bucket_id: String,
access_key_id: String,
alias: String,
) -> Result<Response<Body>, Error> {
let bucket_id = parse_bucket_id(&bucket_id)?;
garage
.bucket_helper()
.unset_local_bucket_alias(bucket_id, &access_key_id, &alias)
.await?;
bucket_info_results(garage, bucket_id).await
}
// ---- HELPER ----
fn parse_bucket_id(id: &str) -> Result<Uuid, Error> {
let id_hex = hex::decode(&id).ok_or_bad_request("Invalid bucket id")?;
Ok(Uuid::try_from(&id_hex).ok_or_bad_request("Invalid bucket id")?)
}

193
src/api/admin/cluster.rs Normal file
View file

@ -0,0 +1,193 @@
use std::collections::HashMap;
use std::net::SocketAddr;
use std::sync::Arc;
use hyper::{Body, Request, Response, StatusCode};
use serde::{Deserialize, Serialize};
use garage_util::crdt::*;
use garage_util::data::*;
use garage_rpc::layout::*;
use garage_model::garage::Garage;
use crate::admin::error::*;
use crate::helpers::{json_ok_response, parse_json_body};
pub async fn handle_get_cluster_status(garage: &Arc<Garage>) -> Result<Response<Body>, Error> {
let res = GetClusterStatusResponse {
node: hex::encode(garage.system.id),
garage_version: garage_util::version::garage_version(),
garage_features: garage_util::version::garage_features(),
db_engine: garage.db.engine(),
known_nodes: garage
.system
.get_known_nodes()
.into_iter()
.map(|i| {
(
hex::encode(i.id),
KnownNodeResp {
addr: i.addr,
is_up: i.is_up,
last_seen_secs_ago: i.last_seen_secs_ago,
hostname: i.status.hostname,
},
)
})
.collect(),
layout: get_cluster_layout(garage),
};
Ok(json_ok_response(&res)?)
}
pub async fn handle_connect_cluster_nodes(
garage: &Arc<Garage>,
req: Request<Body>,
) -> Result<Response<Body>, Error> {
let req = parse_json_body::<Vec<String>>(req).await?;
let res = futures::future::join_all(req.iter().map(|node| garage.system.connect(node)))
.await
.into_iter()
.map(|r| match r {
Ok(()) => ConnectClusterNodesResponse {
success: true,
error: None,
},
Err(e) => ConnectClusterNodesResponse {
success: false,
error: Some(format!("{}", e)),
},
})
.collect::<Vec<_>>();
Ok(json_ok_response(&res)?)
}
pub async fn handle_get_cluster_layout(garage: &Arc<Garage>) -> Result<Response<Body>, Error> {
let res = get_cluster_layout(garage);
Ok(json_ok_response(&res)?)
}
fn get_cluster_layout(garage: &Arc<Garage>) -> GetClusterLayoutResponse {
let layout = garage.system.get_cluster_layout();
GetClusterLayoutResponse {
version: layout.version,
roles: layout
.roles
.items()
.iter()
.filter(|(_, _, v)| v.0.is_some())
.map(|(k, _, v)| (hex::encode(k), v.0.clone()))
.collect(),
staged_role_changes: layout
.staging
.items()
.iter()
.filter(|(k, _, v)| layout.roles.get(k) != Some(v))
.map(|(k, _, v)| (hex::encode(k), v.0.clone()))
.collect(),
}
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct GetClusterStatusResponse {
node: String,
garage_version: &'static str,
garage_features: Option<&'static [&'static str]>,
db_engine: String,
known_nodes: HashMap<String, KnownNodeResp>,
layout: GetClusterLayoutResponse,
}
#[derive(Serialize)]
struct ConnectClusterNodesResponse {
success: bool,
error: Option<String>,
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct GetClusterLayoutResponse {
version: u64,
roles: HashMap<String, Option<NodeRole>>,
staged_role_changes: HashMap<String, Option<NodeRole>>,
}
#[derive(Serialize)]
struct KnownNodeResp {
addr: SocketAddr,
is_up: bool,
last_seen_secs_ago: Option<u64>,
hostname: String,
}
pub async fn handle_update_cluster_layout(
garage: &Arc<Garage>,
req: Request<Body>,
) -> Result<Response<Body>, Error> {
let updates = parse_json_body::<UpdateClusterLayoutRequest>(req).await?;
let mut layout = garage.system.get_cluster_layout();
let mut roles = layout.roles.clone();
roles.merge(&layout.staging);
for (node, role) in updates {
let node = hex::decode(node).ok_or_bad_request("Invalid node identifier")?;
let node = Uuid::try_from(&node).ok_or_bad_request("Invalid node identifier")?;
layout
.staging
.merge(&roles.update_mutator(node, NodeRoleV(role)));
}
garage.system.update_cluster_layout(&layout).await?;
Ok(Response::builder()
.status(StatusCode::OK)
.body(Body::empty())?)
}
pub async fn handle_apply_cluster_layout(
garage: &Arc<Garage>,
req: Request<Body>,
) -> Result<Response<Body>, Error> {
let param = parse_json_body::<ApplyRevertLayoutRequest>(req).await?;
let layout = garage.system.get_cluster_layout();
let layout = layout.apply_staged_changes(Some(param.version))?;
garage.system.update_cluster_layout(&layout).await?;
Ok(Response::builder()
.status(StatusCode::OK)
.body(Body::empty())?)
}
pub async fn handle_revert_cluster_layout(
garage: &Arc<Garage>,
req: Request<Body>,
) -> Result<Response<Body>, Error> {
let param = parse_json_body::<ApplyRevertLayoutRequest>(req).await?;
let layout = garage.system.get_cluster_layout();
let layout = layout.revert_staged_changes(Some(param.version))?;
garage.system.update_cluster_layout(&layout).await?;
Ok(Response::builder()
.status(StatusCode::OK)
.body(Body::empty())?)
}
type UpdateClusterLayoutRequest = HashMap<String, Option<NodeRole>>;
#[derive(Deserialize)]
struct ApplyRevertLayoutRequest {
version: u64,
}

97
src/api/admin/error.rs Normal file
View file

@ -0,0 +1,97 @@
use err_derive::Error;
use hyper::header::HeaderValue;
use hyper::{Body, HeaderMap, StatusCode};
pub use garage_model::helper::error::Error as HelperError;
use crate::common_error::CommonError;
pub use crate::common_error::{CommonErrorDerivative, OkOrBadRequest, OkOrInternalError};
use crate::generic_server::ApiError;
use crate::helpers::CustomApiErrorBody;
/// Errors of this crate
#[derive(Debug, Error)]
pub enum Error {
#[error(display = "{}", _0)]
/// Error from common error
Common(CommonError),
// Category: cannot process
/// The API access key does not exist
#[error(display = "Access key not found: {}", _0)]
NoSuchAccessKey(String),
/// In Import key, the key already exists
#[error(
display = "Key {} already exists in data store. Even if it is deleted, we can't let you create a new key with the same ID. Sorry.",
_0
)]
KeyAlreadyExists(String),
}
impl<T> From<T> for Error
where
CommonError: From<T>,
{
fn from(err: T) -> Self {
Error::Common(CommonError::from(err))
}
}
impl CommonErrorDerivative for Error {}
impl From<HelperError> for Error {
fn from(err: HelperError) -> Self {
match err {
HelperError::Internal(i) => Self::Common(CommonError::InternalError(i)),
HelperError::BadRequest(b) => Self::Common(CommonError::BadRequest(b)),
HelperError::InvalidBucketName(n) => Self::Common(CommonError::InvalidBucketName(n)),
HelperError::NoSuchBucket(n) => Self::Common(CommonError::NoSuchBucket(n)),
HelperError::NoSuchAccessKey(n) => Self::NoSuchAccessKey(n),
}
}
}
impl Error {
fn code(&self) -> &'static str {
match self {
Error::Common(c) => c.aws_code(),
Error::NoSuchAccessKey(_) => "NoSuchAccessKey",
Error::KeyAlreadyExists(_) => "KeyAlreadyExists",
}
}
}
impl ApiError for Error {
/// Get the HTTP status code that best represents the meaning of the error for the client
fn http_status_code(&self) -> StatusCode {
match self {
Error::Common(c) => c.http_status_code(),
Error::NoSuchAccessKey(_) => StatusCode::NOT_FOUND,
Error::KeyAlreadyExists(_) => StatusCode::CONFLICT,
}
}
fn add_http_headers(&self, header_map: &mut HeaderMap<HeaderValue>) {
use hyper::header;
header_map.append(header::CONTENT_TYPE, "application/json".parse().unwrap());
}
fn http_body(&self, garage_region: &str, path: &str) -> Body {
let error = CustomApiErrorBody {
code: self.code().to_string(),
message: format!("{}", self),
path: path.to_string(),
region: garage_region.to_string(),
};
Body::from(serde_json::to_string_pretty(&error).unwrap_or_else(|_| {
r#"
{
"code": "InternalError",
"message": "JSON encoding of error failed"
}
"#
.into()
}))
}
}

256
src/api/admin/key.rs Normal file
View file

@ -0,0 +1,256 @@
use std::collections::HashMap;
use std::sync::Arc;
use hyper::{Body, Request, Response, StatusCode};
use serde::{Deserialize, Serialize};
use garage_table::*;
use garage_model::garage::Garage;
use garage_model::key_table::*;
use crate::admin::error::*;
use crate::helpers::{json_ok_response, parse_json_body};
pub async fn handle_list_keys(garage: &Arc<Garage>) -> Result<Response<Body>, Error> {
let res = garage
.key_table
.get_range(
&EmptyKey,
None,
Some(KeyFilter::Deleted(DeletedFilter::NotDeleted)),
10000,
EnumerationOrder::Forward,
)
.await?
.iter()
.map(|k| ListKeyResultItem {
id: k.key_id.to_string(),
name: k.params().unwrap().name.get().clone(),
})
.collect::<Vec<_>>();
Ok(json_ok_response(&res)?)
}
#[derive(Serialize)]
struct ListKeyResultItem {
id: String,
name: String,
}
pub async fn handle_get_key_info(
garage: &Arc<Garage>,
id: Option<String>,
search: Option<String>,
) -> Result<Response<Body>, Error> {
let key = if let Some(id) = id {
garage.key_helper().get_existing_key(&id).await?
} else if let Some(search) = search {
garage
.key_helper()
.get_existing_matching_key(&search)
.await?
} else {
unreachable!();
};
key_info_results(garage, key).await
}
pub async fn handle_create_key(
garage: &Arc<Garage>,
req: Request<Body>,
) -> Result<Response<Body>, Error> {
let req = parse_json_body::<CreateKeyRequest>(req).await?;
let key = Key::new(&req.name);
garage.key_table.insert(&key).await?;
key_info_results(garage, key).await
}
#[derive(Deserialize)]
struct CreateKeyRequest {
name: String,
}
pub async fn handle_import_key(
garage: &Arc<Garage>,
req: Request<Body>,
) -> Result<Response<Body>, Error> {
let req = parse_json_body::<ImportKeyRequest>(req).await?;
let prev_key = garage.key_table.get(&EmptyKey, &req.access_key_id).await?;
if prev_key.is_some() {
return Err(Error::KeyAlreadyExists(req.access_key_id.to_string()));
}
let imported_key = Key::import(&req.access_key_id, &req.secret_access_key, &req.name);
garage.key_table.insert(&imported_key).await?;
key_info_results(garage, imported_key).await
}
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct ImportKeyRequest {
access_key_id: String,
secret_access_key: String,
name: String,
}
pub async fn handle_update_key(
garage: &Arc<Garage>,
id: String,
req: Request<Body>,
) -> Result<Response<Body>, Error> {
let req = parse_json_body::<UpdateKeyRequest>(req).await?;
let mut key = garage.key_helper().get_existing_key(&id).await?;
let key_state = key.state.as_option_mut().unwrap();
if let Some(new_name) = req.name {
key_state.name.update(new_name);
}
if let Some(allow) = req.allow {
if allow.create_bucket {
key_state.allow_create_bucket.update(true);
}
}
if let Some(deny) = req.deny {
if deny.create_bucket {
key_state.allow_create_bucket.update(false);
}
}
garage.key_table.insert(&key).await?;
key_info_results(garage, key).await
}
#[derive(Deserialize)]
struct UpdateKeyRequest {
name: Option<String>,
allow: Option<KeyPerm>,
deny: Option<KeyPerm>,
}
pub async fn handle_delete_key(garage: &Arc<Garage>, id: String) -> Result<Response<Body>, Error> {
let mut key = garage.key_helper().get_existing_key(&id).await?;
key.state.as_option().unwrap();
garage.key_helper().delete_key(&mut key).await?;
Ok(Response::builder()
.status(StatusCode::NO_CONTENT)
.body(Body::empty())?)
}
async fn key_info_results(garage: &Arc<Garage>, key: Key) -> Result<Response<Body>, Error> {
let mut relevant_buckets = HashMap::new();
let key_state = key.state.as_option().unwrap();
for id in key_state
.authorized_buckets
.items()
.iter()
.map(|(id, _)| id)
.chain(
key_state
.local_aliases
.items()
.iter()
.filter_map(|(_, _, v)| v.as_ref()),
) {
if !relevant_buckets.contains_key(id) {
if let Some(b) = garage.bucket_table.get(&EmptyKey, id).await? {
if b.state.as_option().is_some() {
relevant_buckets.insert(*id, b);
}
}
}
}
let res = GetKeyInfoResult {
name: key_state.name.get().clone(),
access_key_id: key.key_id.clone(),
secret_access_key: key_state.secret_key.clone(),
permissions: KeyPerm {
create_bucket: *key_state.allow_create_bucket.get(),
},
buckets: relevant_buckets
.into_iter()
.map(|(_, bucket)| {
let state = bucket.state.as_option().unwrap();
KeyInfoBucketResult {
id: hex::encode(bucket.id),
global_aliases: state
.aliases
.items()
.iter()
.filter(|(_, _, a)| *a)
.map(|(n, _, _)| n.to_string())
.collect::<Vec<_>>(),
local_aliases: state
.local_aliases
.items()
.iter()
.filter(|((k, _), _, a)| *a && *k == key.key_id)
.map(|((_, n), _, _)| n.to_string())
.collect::<Vec<_>>(),
permissions: key_state
.authorized_buckets
.get(&bucket.id)
.map(|p| ApiBucketKeyPerm {
read: p.allow_read,
write: p.allow_write,
owner: p.allow_owner,
})
.unwrap_or_default(),
}
})
.collect::<Vec<_>>(),
};
Ok(json_ok_response(&res)?)
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct GetKeyInfoResult {
name: String,
access_key_id: String,
secret_access_key: String,
permissions: KeyPerm,
buckets: Vec<KeyInfoBucketResult>,
}
#[derive(Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
struct KeyPerm {
#[serde(default)]
create_bucket: bool,
}
#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct KeyInfoBucketResult {
id: String,
global_aliases: Vec<String>,
local_aliases: Vec<String>,
permissions: ApiBucketKeyPerm,
}
#[derive(Serialize, Deserialize, Default)]
pub(crate) struct ApiBucketKeyPerm {
#[serde(default)]
pub(crate) read: bool,
#[serde(default)]
pub(crate) write: bool,
#[serde(default)]
pub(crate) owner: bool,
}

7
src/api/admin/mod.rs Normal file
View file

@ -0,0 +1,7 @@
pub mod api_server;
mod error;
mod router;
mod bucket;
mod cluster;
mod key;

145
src/api/admin/router.rs Normal file
View file

@ -0,0 +1,145 @@
use std::borrow::Cow;
use hyper::{Method, Request};
use crate::admin::error::*;
use crate::router_macros::*;
pub enum Authorization {
MetricsToken,
AdminToken,
}
router_match! {@func
/// List of all Admin API endpoints.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Endpoint {
Options,
Metrics,
GetClusterStatus,
ConnectClusterNodes,
// Layout
GetClusterLayout,
UpdateClusterLayout,
ApplyClusterLayout,
RevertClusterLayout,
// Keys
ListKeys,
CreateKey,
ImportKey,
GetKeyInfo {
id: Option<String>,
search: Option<String>,
},
DeleteKey {
id: String,
},
UpdateKey {
id: String,
},
// Buckets
ListBuckets,
CreateBucket,
GetBucketInfo {
id: Option<String>,
global_alias: Option<String>,
},
DeleteBucket {
id: String,
},
UpdateBucket {
id: String,
},
// Bucket-Key Permissions
BucketAllowKey,
BucketDenyKey,
// Bucket aliases
GlobalAliasBucket {
id: String,
alias: String,
},
GlobalUnaliasBucket {
id: String,
alias: String,
},
LocalAliasBucket {
id: String,
access_key_id: String,
alias: String,
},
LocalUnaliasBucket {
id: String,
access_key_id: String,
alias: String,
},
}}
impl Endpoint {
/// Determine which S3 endpoint a request is for using the request, and a bucket which was
/// possibly extracted from the Host header.
/// Returns Self plus bucket name, if endpoint is not Endpoint::ListBuckets
pub fn from_request<T>(req: &Request<T>) -> Result<Self, Error> {
let uri = req.uri();
let path = uri.path();
let query = uri.query();
let mut query = QueryParameters::from_query(query.unwrap_or_default())?;
let res = router_match!(@gen_path_parser (req.method(), path, query) [
OPTIONS _ => Options,
GET "/metrics" => Metrics,
GET "/v0/status" => GetClusterStatus,
POST "/v0/connect" => ConnectClusterNodes,
// Layout endpoints
GET "/v0/layout" => GetClusterLayout,
POST "/v0/layout" => UpdateClusterLayout,
POST "/v0/layout/apply" => ApplyClusterLayout,
POST "/v0/layout/revert" => RevertClusterLayout,
// API key endpoints
GET "/v0/key" if id => GetKeyInfo (query_opt::id, query_opt::search),
GET "/v0/key" if search => GetKeyInfo (query_opt::id, query_opt::search),
POST "/v0/key" if id => UpdateKey (query::id),
POST "/v0/key" => CreateKey,
POST "/v0/key/import" => ImportKey,
DELETE "/v0/key" if id => DeleteKey (query::id),
GET "/v0/key" => ListKeys,
// Bucket endpoints
GET "/v0/bucket" if id => GetBucketInfo (query_opt::id, query_opt::global_alias),
GET "/v0/bucket" if global_alias => GetBucketInfo (query_opt::id, query_opt::global_alias),
GET "/v0/bucket" => ListBuckets,
POST "/v0/bucket" => CreateBucket,
DELETE "/v0/bucket" if id => DeleteBucket (query::id),
PUT "/v0/bucket" if id => UpdateBucket (query::id),
// Bucket-key permissions
POST "/v0/bucket/allow" => BucketAllowKey,
POST "/v0/bucket/deny" => BucketDenyKey,
// Bucket aliases
PUT "/v0/bucket/alias/global" => GlobalAliasBucket (query::id, query::alias),
DELETE "/v0/bucket/alias/global" => GlobalUnaliasBucket (query::id, query::alias),
PUT "/v0/bucket/alias/local" => LocalAliasBucket (query::id, query::access_key_id, query::alias),
DELETE "/v0/bucket/alias/local" => LocalUnaliasBucket (query::id, query::access_key_id, query::alias),
]);
if let Some(message) = query.nonempty_message() {
debug!("Unused query parameter: {}", message)
}
Ok(res)
}
/// Get the kind of authorization which is required to perform the operation.
pub fn authorization_type(&self) -> Authorization {
match self {
Self::Metrics => Authorization::MetricsToken,
_ => Authorization::AdminToken,
}
}
}
generateQueryParameters! {
"id" => id,
"search" => search,
"globalAlias" => global_alias,
"alias" => alias,
"accessKeyId" => access_key_id
}

View file

@ -1,543 +0,0 @@
use std::net::SocketAddr;
use std::sync::Arc;
use chrono::{DateTime, NaiveDateTime, Utc};
use futures::future::Future;
use futures::prelude::*;
use hyper::header;
use hyper::server::conn::AddrStream;
use hyper::service::{make_service_fn, service_fn};
use hyper::{Body, Method, Request, Response, Server};
use garage_util::data::*;
use garage_util::error::Error as GarageError;
use garage_model::garage::Garage;
use garage_model::key_table::Key;
use garage_table::util::*;
use crate::error::*;
use crate::signature::compute_scope;
use crate::signature::payload::check_payload_signature;
use crate::signature::streaming::SignedPayloadStream;
use crate::signature::LONG_DATETIME;
use crate::helpers::*;
use crate::s3_bucket::*;
use crate::s3_copy::*;
use crate::s3_cors::*;
use crate::s3_delete::*;
use crate::s3_get::*;
use crate::s3_list::*;
use crate::s3_post_object::handle_post_object;
use crate::s3_put::*;
use crate::s3_router::{Authorization, Endpoint};
use crate::s3_website::*;
/// Run the S3 API server
pub async fn run_api_server(
garage: Arc<Garage>,
shutdown_signal: impl Future<Output = ()>,
) -> Result<(), GarageError> {
let addr = &garage.config.s3_api.api_bind_addr;
let service = make_service_fn(|conn: &AddrStream| {
let garage = garage.clone();
let client_addr = conn.remote_addr();
async move {
Ok::<_, GarageError>(service_fn(move |req: Request<Body>| {
let garage = garage.clone();
handler(garage, req, client_addr)
}))
}
});
let server = Server::bind(addr).serve(service);
let graceful = server.with_graceful_shutdown(shutdown_signal);
info!("API server listening on http://{}", addr);
graceful.await?;
Ok(())
}
async fn handler(
garage: Arc<Garage>,
req: Request<Body>,
addr: SocketAddr,
) -> Result<Response<Body>, GarageError> {
let uri = req.uri().clone();
info!("{} {} {}", addr, req.method(), uri);
debug!("{:?}", req);
match handler_inner(garage.clone(), req).await {
Ok(x) => {
debug!("{} {:?}", x.status(), x.headers());
Ok(x)
}
Err(e) => {
let body: Body = Body::from(e.aws_xml(&garage.config.s3_api.s3_region, uri.path()));
let mut http_error_builder = Response::builder()
.status(e.http_status_code())
.header("Content-Type", "application/xml");
if let Some(header_map) = http_error_builder.headers_mut() {
e.add_headers(header_map)
}
let http_error = http_error_builder.body(body)?;
if e.http_status_code().is_server_error() {
warn!("Response: error {}, {}", e.http_status_code(), e);
} else {
info!("Response: error {}, {}", e.http_status_code(), e);
}
Ok(http_error)
}
}
}
async fn handler_inner(garage: Arc<Garage>, req: Request<Body>) -> Result<Response<Body>, Error> {
let authority = req
.headers()
.get(header::HOST)
.ok_or_bad_request("Host header required")?
.to_str()?;
let host = authority_to_host(authority)?;
let bucket_name = garage
.config
.s3_api
.root_domain
.as_ref()
.and_then(|root_domain| host_to_bucket(&host, root_domain));
let (endpoint, bucket_name) = Endpoint::from_request(&req, bucket_name.map(ToOwned::to_owned))?;
debug!("Endpoint: {:?}", endpoint);
// Some endpoints are processed early, before we even check for an API key
if let Endpoint::PostObject = endpoint {
return handle_post_object(garage, req, bucket_name.unwrap()).await;
}
if let Endpoint::Options = endpoint {
return handle_options_s3api(garage, &req, bucket_name).await;
}
let (api_key, mut content_sha256) = check_payload_signature(&garage, &req).await?;
let api_key = api_key.ok_or_else(|| {
Error::Forbidden("Garage does not support anonymous access yet".to_string())
})?;
let req = match req.headers().get("x-amz-content-sha256") {
Some(header) if header == "STREAMING-AWS4-HMAC-SHA256-PAYLOAD" => {
let signature = content_sha256
.take()
.ok_or_bad_request("No signature provided")?;
let secret_key = &api_key
.state
.as_option()
.ok_or_internal_error("Deleted key state")?
.secret_key;
let date = req
.headers()
.get("x-amz-date")
.ok_or_bad_request("Missing X-Amz-Date field")?
.to_str()?;
let date: NaiveDateTime = NaiveDateTime::parse_from_str(date, LONG_DATETIME)
.ok_or_bad_request("Invalid date")?;
let date: DateTime<Utc> = DateTime::from_utc(date, Utc);
let scope = compute_scope(&date, &garage.config.s3_api.s3_region);
let signing_hmac = crate::signature::signing_hmac(
&date,
secret_key,
&garage.config.s3_api.s3_region,
"s3",
)
.ok_or_internal_error("Unable to build signing HMAC")?;
req.map(move |body| {
Body::wrap_stream(
SignedPayloadStream::new(
body.map_err(Error::from),
signing_hmac,
date,
&scope,
signature,
)
.map_err(Error::from),
)
})
}
_ => req,
};
let bucket_name = match bucket_name {
None => return handle_request_without_bucket(garage, req, api_key, endpoint).await,
Some(bucket) => bucket.to_string(),
};
// Special code path for CreateBucket API endpoint
if let Endpoint::CreateBucket {} = endpoint {
return handle_create_bucket(&garage, req, content_sha256, api_key, bucket_name).await;
}
let bucket_id = resolve_bucket(&garage, &bucket_name, &api_key).await?;
let bucket = garage
.bucket_table
.get(&EmptyKey, &bucket_id)
.await?
.filter(|b| !b.state.is_deleted())
.ok_or(Error::NoSuchBucket)?;
let allowed = match endpoint.authorization_type() {
Authorization::Read => api_key.allow_read(&bucket_id),
Authorization::Write => api_key.allow_write(&bucket_id),
Authorization::Owner => api_key.allow_owner(&bucket_id),
_ => unreachable!(),
};
if !allowed {
return Err(Error::Forbidden(
"Operation is not allowed for this key.".to_string(),
));
}
// Look up what CORS rule might apply to response.
// Requests for methods different than GET, HEAD or POST
// are always preflighted, i.e. the browser should make
// an OPTIONS call before to check it is allowed
let matching_cors_rule = match *req.method() {
Method::GET | Method::HEAD | Method::POST => find_matching_cors_rule(&bucket, &req)?,
_ => None,
};
let resp = match endpoint {
Endpoint::HeadObject {
key, part_number, ..
} => handle_head(garage, &req, bucket_id, &key, part_number).await,
Endpoint::GetObject {
key, part_number, ..
} => handle_get(garage, &req, bucket_id, &key, part_number).await,
Endpoint::UploadPart {
key,
part_number,
upload_id,
} => {
handle_put_part(
garage,
req,
bucket_id,
&key,
part_number,
&upload_id,
content_sha256,
)
.await
}
Endpoint::CopyObject { key } => handle_copy(garage, &api_key, &req, bucket_id, &key).await,
Endpoint::UploadPartCopy {
key,
part_number,
upload_id,
} => {
handle_upload_part_copy(
garage,
&api_key,
&req,
bucket_id,
&key,
part_number,
&upload_id,
)
.await
}
Endpoint::PutObject { key } => {
handle_put(garage, req, bucket_id, &key, content_sha256).await
}
Endpoint::AbortMultipartUpload { key, upload_id } => {
handle_abort_multipart_upload(garage, bucket_id, &key, &upload_id).await
}
Endpoint::DeleteObject { key, .. } => handle_delete(garage, bucket_id, &key).await,
Endpoint::CreateMultipartUpload { key } => {
handle_create_multipart_upload(garage, &req, &bucket_name, bucket_id, &key).await
}
Endpoint::CompleteMultipartUpload { key, upload_id } => {
handle_complete_multipart_upload(
garage,
req,
&bucket_name,
bucket_id,
&key,
&upload_id,
content_sha256,
)
.await
}
Endpoint::CreateBucket {} => unreachable!(),
Endpoint::HeadBucket {} => {
let empty_body: Body = Body::from(vec![]);
let response = Response::builder().body(empty_body).unwrap();
Ok(response)
}
Endpoint::DeleteBucket {} => {
handle_delete_bucket(&garage, bucket_id, bucket_name, api_key).await
}
Endpoint::GetBucketLocation {} => handle_get_bucket_location(garage),
Endpoint::GetBucketVersioning {} => handle_get_bucket_versioning(),
Endpoint::ListObjects {
delimiter,
encoding_type,
marker,
max_keys,
prefix,
} => {
handle_list(
garage,
&ListObjectsQuery {
common: ListQueryCommon {
bucket_name,
bucket_id,
delimiter: delimiter.map(|d| d.to_string()),
page_size: max_keys.map(|p| p.clamp(1, 1000)).unwrap_or(1000),
prefix: prefix.unwrap_or_default(),
urlencode_resp: encoding_type.map(|e| e == "url").unwrap_or(false),
},
is_v2: false,
marker,
continuation_token: None,
start_after: None,
},
)
.await
}
Endpoint::ListObjectsV2 {
delimiter,
encoding_type,
max_keys,
prefix,
continuation_token,
start_after,
list_type,
..
} => {
if list_type == "2" {
handle_list(
garage,
&ListObjectsQuery {
common: ListQueryCommon {
bucket_name,
bucket_id,
delimiter: delimiter.map(|d| d.to_string()),
page_size: max_keys.map(|p| p.clamp(1, 1000)).unwrap_or(1000),
urlencode_resp: encoding_type.map(|e| e == "url").unwrap_or(false),
prefix: prefix.unwrap_or_default(),
},
is_v2: true,
marker: None,
continuation_token,
start_after,
},
)
.await
} else {
Err(Error::BadRequest(format!(
"Invalid endpoint: list-type={}",
list_type
)))
}
}
Endpoint::ListMultipartUploads {
delimiter,
encoding_type,
key_marker,
max_uploads,
prefix,
upload_id_marker,
} => {
handle_list_multipart_upload(
garage,
&ListMultipartUploadsQuery {
common: ListQueryCommon {
bucket_name,
bucket_id,
delimiter: delimiter.map(|d| d.to_string()),
page_size: max_uploads.map(|p| p.clamp(1, 1000)).unwrap_or(1000),
prefix: prefix.unwrap_or_default(),
urlencode_resp: encoding_type.map(|e| e == "url").unwrap_or(false),
},
key_marker,
upload_id_marker,
},
)
.await
}
Endpoint::ListParts {
key,
max_parts,
part_number_marker,
upload_id,
} => {
handle_list_parts(
garage,
&ListPartsQuery {
bucket_name,
bucket_id,
key,
upload_id,
part_number_marker: part_number_marker.map(|p| p.clamp(1, 10000)),
max_parts: max_parts.map(|p| p.clamp(1, 1000)).unwrap_or(1000),
},
)
.await
}
Endpoint::DeleteObjects {} => {
handle_delete_objects(garage, bucket_id, req, content_sha256).await
}
Endpoint::GetBucketWebsite {} => handle_get_website(&bucket).await,
Endpoint::PutBucketWebsite {} => {
handle_put_website(garage, bucket_id, req, content_sha256).await
}
Endpoint::DeleteBucketWebsite {} => handle_delete_website(garage, bucket_id).await,
Endpoint::GetBucketCors {} => handle_get_cors(&bucket).await,
Endpoint::PutBucketCors {} => handle_put_cors(garage, bucket_id, req, content_sha256).await,
Endpoint::DeleteBucketCors {} => handle_delete_cors(garage, bucket_id).await,
endpoint => Err(Error::NotImplemented(endpoint.name().to_owned())),
};
// If request was a success and we have a CORS rule that applies to it,
// add the corresponding CORS headers to the response
let mut resp_ok = resp?;
if let Some(rule) = matching_cors_rule {
add_cors_headers(&mut resp_ok, rule)
.ok_or_internal_error("Invalid bucket CORS configuration")?;
}
Ok(resp_ok)
}
async fn handle_request_without_bucket(
garage: Arc<Garage>,
_req: Request<Body>,
api_key: Key,
endpoint: Endpoint,
) -> Result<Response<Body>, Error> {
match endpoint {
Endpoint::ListBuckets => handle_list_buckets(&garage, &api_key).await,
endpoint => Err(Error::NotImplemented(endpoint.name().to_owned())),
}
}
#[allow(clippy::ptr_arg)]
pub async fn resolve_bucket(
garage: &Garage,
bucket_name: &String,
api_key: &Key,
) -> Result<Uuid, Error> {
let api_key_params = api_key
.state
.as_option()
.ok_or_internal_error("Key should not be deleted at this point")?;
if let Some(Some(bucket_id)) = api_key_params.local_aliases.get(bucket_name) {
Ok(*bucket_id)
} else {
Ok(garage
.bucket_helper()
.resolve_global_bucket_name(bucket_name)
.await?
.ok_or(Error::NoSuchBucket)?)
}
}
/// Extract the bucket name and the key name from an HTTP path and possibly a bucket provided in
/// the host header of the request
///
/// S3 internally manages only buckets and keys. This function splits
/// an HTTP path to get the corresponding bucket name and key.
pub fn parse_bucket_key<'a>(
path: &'a str,
host_bucket: Option<&'a str>,
) -> Result<(&'a str, Option<&'a str>), Error> {
let path = path.trim_start_matches('/');
if let Some(bucket) = host_bucket {
if !path.is_empty() {
return Ok((bucket, Some(path)));
} else {
return Ok((bucket, None));
}
}
let (bucket, key) = match path.find('/') {
Some(i) => {
let key = &path[i + 1..];
if !key.is_empty() {
(&path[..i], Some(key))
} else {
(&path[..i], None)
}
}
None => (path, None),
};
if bucket.is_empty() {
return Err(Error::BadRequest("No bucket specified".to_string()));
}
Ok((bucket, key))
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn parse_bucket_containing_a_key() -> Result<(), Error> {
let (bucket, key) = parse_bucket_key("/my_bucket/a/super/file.jpg", None)?;
assert_eq!(bucket, "my_bucket");
assert_eq!(key.expect("key must be set"), "a/super/file.jpg");
Ok(())
}
#[test]
fn parse_bucket_containing_no_key() -> Result<(), Error> {
let (bucket, key) = parse_bucket_key("/my_bucket/", None)?;
assert_eq!(bucket, "my_bucket");
assert!(key.is_none());
let (bucket, key) = parse_bucket_key("/my_bucket", None)?;
assert_eq!(bucket, "my_bucket");
assert!(key.is_none());
Ok(())
}
#[test]
fn parse_bucket_containing_no_bucket() {
let parsed = parse_bucket_key("", None);
assert!(parsed.is_err());
let parsed = parse_bucket_key("/", None);
assert!(parsed.is_err());
let parsed = parse_bucket_key("////", None);
assert!(parsed.is_err());
}
#[test]
fn parse_bucket_with_vhost_and_key() -> Result<(), Error> {
let (bucket, key) = parse_bucket_key("/a/super/file.jpg", Some("my-bucket"))?;
assert_eq!(bucket, "my-bucket");
assert_eq!(key.expect("key must be set"), "a/super/file.jpg");
Ok(())
}
#[test]
fn parse_bucket_with_vhost_no_key() -> Result<(), Error> {
let (bucket, key) = parse_bucket_key("", Some("my-bucket"))?;
assert_eq!(bucket, "my-bucket");
assert!(key.is_none());
let (bucket, key) = parse_bucket_key("/", Some("my-bucket"))?;
assert_eq!(bucket, "my-bucket");
assert!(key.is_none());
Ok(())
}
}

177
src/api/common_error.rs Normal file
View file

@ -0,0 +1,177 @@
use err_derive::Error;
use hyper::StatusCode;
use garage_util::error::Error as GarageError;
/// Errors of this crate
#[derive(Debug, Error)]
pub enum CommonError {
// ---- INTERNAL ERRORS ----
/// Error related to deeper parts of Garage
#[error(display = "Internal error: {}", _0)]
InternalError(#[error(source)] GarageError),
/// Error related to Hyper
#[error(display = "Internal error (Hyper error): {}", _0)]
Hyper(#[error(source)] hyper::Error),
/// Error related to HTTP
#[error(display = "Internal error (HTTP error): {}", _0)]
Http(#[error(source)] http::Error),
// ---- GENERIC CLIENT ERRORS ----
/// Proper authentication was not provided
#[error(display = "Forbidden: {}", _0)]
Forbidden(String),
/// Generic bad request response with custom message
#[error(display = "Bad request: {}", _0)]
BadRequest(String),
// ---- SPECIFIC ERROR CONDITIONS ----
// These have to be error codes referenced in the S3 spec here:
// https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html#ErrorCodeList
/// The bucket requested don't exists
#[error(display = "Bucket not found: {}", _0)]
NoSuchBucket(String),
/// Tried to create a bucket that already exist
#[error(display = "Bucket already exists")]
BucketAlreadyExists,
/// Tried to delete a non-empty bucket
#[error(display = "Tried to delete a non-empty bucket")]
BucketNotEmpty,
// Category: bad request
/// Bucket name is not valid according to AWS S3 specs
#[error(display = "Invalid bucket name: {}", _0)]
InvalidBucketName(String),
}
impl CommonError {
pub fn http_status_code(&self) -> StatusCode {
match self {
CommonError::InternalError(
GarageError::Timeout
| GarageError::RemoteError(_)
| GarageError::Quorum(_, _, _, _),
) => StatusCode::SERVICE_UNAVAILABLE,
CommonError::InternalError(_) | CommonError::Hyper(_) | CommonError::Http(_) => {
StatusCode::INTERNAL_SERVER_ERROR
}
CommonError::BadRequest(_) => StatusCode::BAD_REQUEST,
CommonError::Forbidden(_) => StatusCode::FORBIDDEN,
CommonError::NoSuchBucket(_) => StatusCode::NOT_FOUND,
CommonError::BucketNotEmpty | CommonError::BucketAlreadyExists => StatusCode::CONFLICT,
CommonError::InvalidBucketName(_) => StatusCode::BAD_REQUEST,
}
}
pub fn aws_code(&self) -> &'static str {
match self {
CommonError::Forbidden(_) => "AccessDenied",
CommonError::InternalError(
GarageError::Timeout
| GarageError::RemoteError(_)
| GarageError::Quorum(_, _, _, _),
) => "ServiceUnavailable",
CommonError::InternalError(_) | CommonError::Hyper(_) | CommonError::Http(_) => {
"InternalError"
}
CommonError::BadRequest(_) => "InvalidRequest",
CommonError::NoSuchBucket(_) => "NoSuchBucket",
CommonError::BucketAlreadyExists => "BucketAlreadyExists",
CommonError::BucketNotEmpty => "BucketNotEmpty",
CommonError::InvalidBucketName(_) => "InvalidBucketName",
}
}
pub fn bad_request<M: ToString>(msg: M) -> Self {
CommonError::BadRequest(msg.to_string())
}
}
pub trait CommonErrorDerivative: From<CommonError> {
fn internal_error<M: ToString>(msg: M) -> Self {
Self::from(CommonError::InternalError(GarageError::Message(
msg.to_string(),
)))
}
fn bad_request<M: ToString>(msg: M) -> Self {
Self::from(CommonError::BadRequest(msg.to_string()))
}
fn forbidden<M: ToString>(msg: M) -> Self {
Self::from(CommonError::Forbidden(msg.to_string()))
}
}
/// Trait to map error to the Bad Request error code
pub trait OkOrBadRequest {
type S;
fn ok_or_bad_request<M: AsRef<str>>(self, reason: M) -> Result<Self::S, CommonError>;
}
impl<T, E> OkOrBadRequest for Result<T, E>
where
E: std::fmt::Display,
{
type S = T;
fn ok_or_bad_request<M: AsRef<str>>(self, reason: M) -> Result<T, CommonError> {
match self {
Ok(x) => Ok(x),
Err(e) => Err(CommonError::BadRequest(format!(
"{}: {}",
reason.as_ref(),
e
))),
}
}
}
impl<T> OkOrBadRequest for Option<T> {
type S = T;
fn ok_or_bad_request<M: AsRef<str>>(self, reason: M) -> Result<T, CommonError> {
match self {
Some(x) => Ok(x),
None => Err(CommonError::BadRequest(reason.as_ref().to_string())),
}
}
}
/// Trait to map an error to an Internal Error code
pub trait OkOrInternalError {
type S;
fn ok_or_internal_error<M: AsRef<str>>(self, reason: M) -> Result<Self::S, CommonError>;
}
impl<T, E> OkOrInternalError for Result<T, E>
where
E: std::fmt::Display,
{
type S = T;
fn ok_or_internal_error<M: AsRef<str>>(self, reason: M) -> Result<T, CommonError> {
match self {
Ok(x) => Ok(x),
Err(e) => Err(CommonError::InternalError(GarageError::Message(format!(
"{}: {}",
reason.as_ref(),
e
)))),
}
}
}
impl<T> OkOrInternalError for Option<T> {
type S = T;
fn ok_or_internal_error<M: AsRef<str>>(self, reason: M) -> Result<T, CommonError> {
match self {
Some(x) => Ok(x),
None => Err(CommonError::InternalError(GarageError::Message(
reason.as_ref().to_string(),
))),
}
}
}

207
src/api/generic_server.rs Normal file
View file

@ -0,0 +1,207 @@
use std::net::SocketAddr;
use std::sync::Arc;
use async_trait::async_trait;
use futures::future::Future;
use hyper::header::HeaderValue;
use hyper::server::conn::AddrStream;
use hyper::service::{make_service_fn, service_fn};
use hyper::{Body, Request, Response, Server};
use hyper::{HeaderMap, StatusCode};
use opentelemetry::{
global,
metrics::{Counter, ValueRecorder},
trace::{FutureExt, SpanRef, TraceContextExt, Tracer},
Context, KeyValue,
};
use garage_util::error::Error as GarageError;
use garage_util::metrics::{gen_trace_id, RecordDuration};
pub(crate) trait ApiEndpoint: Send + Sync + 'static {
fn name(&self) -> &'static str;
fn add_span_attributes(&self, span: SpanRef<'_>);
}
pub trait ApiError: std::error::Error + Send + Sync + 'static {
fn http_status_code(&self) -> StatusCode;
fn add_http_headers(&self, header_map: &mut HeaderMap<HeaderValue>);
fn http_body(&self, garage_region: &str, path: &str) -> Body;
}
#[async_trait]
pub(crate) trait ApiHandler: Send + Sync + 'static {
const API_NAME: &'static str;
const API_NAME_DISPLAY: &'static str;
type Endpoint: ApiEndpoint;
type Error: ApiError;
fn parse_endpoint(&self, r: &Request<Body>) -> Result<Self::Endpoint, Self::Error>;
async fn handle(
&self,
req: Request<Body>,
endpoint: Self::Endpoint,
) -> Result<Response<Body>, Self::Error>;
}
pub(crate) struct ApiServer<A: ApiHandler> {
region: String,
api_handler: A,
// Metrics
request_counter: Counter<u64>,
error_counter: Counter<u64>,
request_duration: ValueRecorder<f64>,
}
impl<A: ApiHandler> ApiServer<A> {
pub fn new(region: String, api_handler: A) -> Arc<Self> {
let meter = global::meter("garage/api");
Arc::new(Self {
region,
api_handler,
request_counter: meter
.u64_counter(format!("api.{}.request_counter", A::API_NAME))
.with_description(format!(
"Number of API calls to the various {} API endpoints",
A::API_NAME_DISPLAY
))
.init(),
error_counter: meter
.u64_counter(format!("api.{}.error_counter", A::API_NAME))
.with_description(format!(
"Number of API calls to the various {} API endpoints that resulted in errors",
A::API_NAME_DISPLAY
))
.init(),
request_duration: meter
.f64_value_recorder(format!("api.{}.request_duration", A::API_NAME))
.with_description(format!(
"Duration of API calls to the various {} API endpoints",
A::API_NAME_DISPLAY
))
.init(),
})
}
pub async fn run_server(
self: Arc<Self>,
bind_addr: SocketAddr,
shutdown_signal: impl Future<Output = ()>,
) -> Result<(), GarageError> {
let service = make_service_fn(|conn: &AddrStream| {
let this = self.clone();
let client_addr = conn.remote_addr();
async move {
Ok::<_, GarageError>(service_fn(move |req: Request<Body>| {
let this = this.clone();
this.handler(req, client_addr)
}))
}
});
let server = Server::bind(&bind_addr).serve(service);
let graceful = server.with_graceful_shutdown(shutdown_signal);
info!(
"{} API server listening on http://{}",
A::API_NAME_DISPLAY,
bind_addr
);
graceful.await?;
Ok(())
}
async fn handler(
self: Arc<Self>,
req: Request<Body>,
addr: SocketAddr,
) -> Result<Response<Body>, GarageError> {
let uri = req.uri().clone();
info!("{} {} {}", addr, req.method(), uri);
debug!("{:?}", req);
let tracer = opentelemetry::global::tracer("garage");
let span = tracer
.span_builder(format!("{} API call (unknown)", A::API_NAME_DISPLAY))
.with_trace_id(gen_trace_id())
.with_attributes(vec![
KeyValue::new("method", format!("{}", req.method())),
KeyValue::new("uri", req.uri().to_string()),
])
.start(&tracer);
let res = self
.handler_stage2(req)
.with_context(Context::current_with_span(span))
.await;
match res {
Ok(x) => {
debug!("{} {:?}", x.status(), x.headers());
Ok(x)
}
Err(e) => {
let body: Body = e.http_body(&self.region, uri.path());
let mut http_error_builder = Response::builder().status(e.http_status_code());
if let Some(header_map) = http_error_builder.headers_mut() {
e.add_http_headers(header_map)
}
let http_error = http_error_builder.body(body)?;
if e.http_status_code().is_server_error() {
warn!("Response: error {}, {}", e.http_status_code(), e);
} else {
info!("Response: error {}, {}", e.http_status_code(), e);
}
Ok(http_error)
}
}
}
async fn handler_stage2(&self, req: Request<Body>) -> Result<Response<Body>, A::Error> {
let endpoint = self.api_handler.parse_endpoint(&req)?;
debug!("Endpoint: {}", endpoint.name());
let current_context = Context::current();
let current_span = current_context.span();
current_span.update_name::<String>(format!("S3 API {}", endpoint.name()));
current_span.set_attribute(KeyValue::new("endpoint", endpoint.name()));
endpoint.add_span_attributes(current_span);
let metrics_tags = &[KeyValue::new("api_endpoint", endpoint.name())];
let res = self
.api_handler
.handle(req, endpoint)
.record_duration(&self.request_duration, &metrics_tags[..])
.await;
self.request_counter.add(1, &metrics_tags[..]);
let status_code = match &res {
Ok(r) => r.status(),
Err(e) => e.http_status_code(),
};
if status_code.is_client_error() || status_code.is_server_error() {
self.error_counter.add(
1,
&[
metrics_tags[0].clone(),
KeyValue::new("status_code", status_code.as_str().to_string()),
],
);
}
res
}
}

View file

@ -1,5 +1,21 @@
use crate::Error;
use hyper::{Body, Request, Response};
use idna::domain_to_unicode;
use serde::{Deserialize, Serialize};
use crate::common_error::{CommonError as Error, *};
/// What kind of authorization is required to perform a given action
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Authorization {
/// No authorization is required
None,
/// Having Read permission on bucket
Read,
/// Having Write permission on bucket
Write,
/// Having Owner permission on bucket
Owner,
}
/// Host to bucket
///
@ -31,7 +47,7 @@ pub fn authority_to_host(authority: &str) -> Result<String, Error> {
let mut iter = authority.chars().enumerate();
let (_, first_char) = iter
.next()
.ok_or_else(|| Error::BadRequest("Authority is empty".to_string()))?;
.ok_or_else(|| Error::bad_request("Authority is empty".to_string()))?;
let split = match first_char {
'[' => {
@ -39,7 +55,7 @@ pub fn authority_to_host(authority: &str) -> Result<String, Error> {
match iter.next() {
Some((_, ']')) => iter.next(),
_ => {
return Err(Error::BadRequest(format!(
return Err(Error::bad_request(format!(
"Authority {} has an illegal format",
authority
)))
@ -52,7 +68,7 @@ pub fn authority_to_host(authority: &str) -> Result<String, Error> {
let authority = match split {
Some((i, ':')) => Ok(&authority[..i]),
None => Ok(authority),
Some((_, _)) => Err(Error::BadRequest(format!(
Some((_, _)) => Err(Error::bad_request(format!(
"Authority {} has an illegal format",
authority
))),
@ -60,10 +76,134 @@ pub fn authority_to_host(authority: &str) -> Result<String, Error> {
authority.map(|h| domain_to_unicode(h).0)
}
/// Extract the bucket name and the key name from an HTTP path and possibly a bucket provided in
/// the host header of the request
///
/// S3 internally manages only buckets and keys. This function splits
/// an HTTP path to get the corresponding bucket name and key.
pub fn parse_bucket_key<'a>(
path: &'a str,
host_bucket: Option<&'a str>,
) -> Result<(&'a str, Option<&'a str>), Error> {
let path = path.trim_start_matches('/');
if let Some(bucket) = host_bucket {
if !path.is_empty() {
return Ok((bucket, Some(path)));
} else {
return Ok((bucket, None));
}
}
let (bucket, key) = match path.find('/') {
Some(i) => {
let key = &path[i + 1..];
if !key.is_empty() {
(&path[..i], Some(key))
} else {
(&path[..i], None)
}
}
None => (path, None),
};
if bucket.is_empty() {
return Err(Error::bad_request("No bucket specified"));
}
Ok((bucket, key))
}
const UTF8_BEFORE_LAST_CHAR: char = '\u{10FFFE}';
/// Compute the key after the prefix
pub fn key_after_prefix(pfx: &str) -> Option<String> {
let mut next = pfx.to_string();
while !next.is_empty() {
let tail = next.pop().unwrap();
if tail >= char::MAX {
continue;
}
// Circumvent a limitation of RangeFrom that overflow earlier than needed
// See: https://doc.rust-lang.org/core/ops/struct.RangeFrom.html
let new_tail = if tail == UTF8_BEFORE_LAST_CHAR {
char::MAX
} else {
(tail..).nth(1).unwrap()
};
next.push(new_tail);
return Some(next);
}
None
}
pub async fn parse_json_body<T: for<'de> Deserialize<'de>>(req: Request<Body>) -> Result<T, Error> {
let body = hyper::body::to_bytes(req.into_body()).await?;
let resp: T = serde_json::from_slice(&body).ok_or_bad_request("Invalid JSON")?;
Ok(resp)
}
pub fn json_ok_response<T: Serialize>(res: &T) -> Result<Response<Body>, Error> {
let resp_json = serde_json::to_string_pretty(res).map_err(garage_util::error::Error::from)?;
Ok(Response::builder()
.status(hyper::StatusCode::OK)
.header(http::header::CONTENT_TYPE, "application/json")
.body(Body::from(resp_json))?)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn parse_bucket_containing_a_key() -> Result<(), Error> {
let (bucket, key) = parse_bucket_key("/my_bucket/a/super/file.jpg", None)?;
assert_eq!(bucket, "my_bucket");
assert_eq!(key.expect("key must be set"), "a/super/file.jpg");
Ok(())
}
#[test]
fn parse_bucket_containing_no_key() -> Result<(), Error> {
let (bucket, key) = parse_bucket_key("/my_bucket/", None)?;
assert_eq!(bucket, "my_bucket");
assert!(key.is_none());
let (bucket, key) = parse_bucket_key("/my_bucket", None)?;
assert_eq!(bucket, "my_bucket");
assert!(key.is_none());
Ok(())
}
#[test]
fn parse_bucket_containing_no_bucket() {
let parsed = parse_bucket_key("", None);
assert!(parsed.is_err());
let parsed = parse_bucket_key("/", None);
assert!(parsed.is_err());
let parsed = parse_bucket_key("////", None);
assert!(parsed.is_err());
}
#[test]
fn parse_bucket_with_vhost_and_key() -> Result<(), Error> {
let (bucket, key) = parse_bucket_key("/a/super/file.jpg", Some("my-bucket"))?;
assert_eq!(bucket, "my-bucket");
assert_eq!(key.expect("key must be set"), "a/super/file.jpg");
Ok(())
}
#[test]
fn parse_bucket_with_vhost_no_key() -> Result<(), Error> {
let (bucket, key) = parse_bucket_key("", Some("my-bucket"))?;
assert_eq!(bucket, "my-bucket");
assert!(key.is_none());
let (bucket, key) = parse_bucket_key("/", Some("my-bucket"))?;
assert_eq!(bucket, "my-bucket");
assert!(key.is_none());
Ok(())
}
#[test]
fn authority_to_host_with_port() -> Result<(), Error> {
let domain = authority_to_host("[::1]:3902")?;
@ -111,4 +251,47 @@ mod tests {
assert_eq!(host_to_bucket("not-garage.tld", "garage.tld"), None);
assert_eq!(host_to_bucket("not-garage.tld", ".garage.tld"), None);
}
#[test]
fn test_key_after_prefix() {
use std::iter::FromIterator;
assert_eq!(UTF8_BEFORE_LAST_CHAR as u32, (char::MAX as u32) - 1);
assert_eq!(key_after_prefix("a/b/").unwrap().as_str(), "a/b0");
assert_eq!(key_after_prefix("").unwrap().as_str(), "");
assert_eq!(
key_after_prefix("􏿽").unwrap().as_str(),
String::from(char::from_u32(0x10FFFE).unwrap())
);
// When the last character is the biggest UTF8 char
let a = String::from_iter(['a', char::MAX].iter());
assert_eq!(key_after_prefix(a.as_str()).unwrap().as_str(), "b");
// When all characters are the biggest UTF8 char
let b = String::from_iter([char::MAX; 3].iter());
assert!(key_after_prefix(b.as_str()).is_none());
// Check utf8 surrogates
let c = String::from('\u{D7FF}');
assert_eq!(
key_after_prefix(c.as_str()).unwrap().as_str(),
String::from('\u{E000}')
);
// Check the character before the biggest one
let d = String::from('\u{10FFFE}');
assert_eq!(
key_after_prefix(d.as_str()).unwrap().as_str(),
String::from(char::MAX)
);
}
}
#[derive(Serialize)]
pub(crate) struct CustomApiErrorBody {
pub(crate) code: String,
pub(crate) message: String,
pub(crate) region: String,
pub(crate) path: String,
}

190
src/api/k2v/api_server.rs Normal file
View file

@ -0,0 +1,190 @@
use std::net::SocketAddr;
use std::sync::Arc;
use async_trait::async_trait;
use futures::future::Future;
use hyper::{Body, Method, Request, Response};
use opentelemetry::{trace::SpanRef, KeyValue};
use garage_util::error::Error as GarageError;
use garage_model::garage::Garage;
use crate::generic_server::*;
use crate::k2v::error::*;
use crate::signature::payload::check_payload_signature;
use crate::signature::streaming::*;
use crate::helpers::*;
use crate::k2v::batch::*;
use crate::k2v::index::*;
use crate::k2v::item::*;
use crate::k2v::router::Endpoint;
use crate::s3::cors::*;
pub struct K2VApiServer {
garage: Arc<Garage>,
}
pub(crate) struct K2VApiEndpoint {
bucket_name: String,
endpoint: Endpoint,
}
impl K2VApiServer {
pub async fn run(
garage: Arc<Garage>,
bind_addr: SocketAddr,
s3_region: String,
shutdown_signal: impl Future<Output = ()>,
) -> Result<(), GarageError> {
ApiServer::new(s3_region, K2VApiServer { garage })
.run_server(bind_addr, shutdown_signal)
.await
}
}
#[async_trait]
impl ApiHandler for K2VApiServer {
const API_NAME: &'static str = "k2v";
const API_NAME_DISPLAY: &'static str = "K2V";
type Endpoint = K2VApiEndpoint;
type Error = Error;
fn parse_endpoint(&self, req: &Request<Body>) -> Result<K2VApiEndpoint, Error> {
let (endpoint, bucket_name) = Endpoint::from_request(req)?;
Ok(K2VApiEndpoint {
bucket_name,
endpoint,
})
}
async fn handle(
&self,
req: Request<Body>,
endpoint: K2VApiEndpoint,
) -> Result<Response<Body>, Error> {
let K2VApiEndpoint {
bucket_name,
endpoint,
} = endpoint;
let garage = self.garage.clone();
// The OPTIONS method is procesed early, before we even check for an API key
if let Endpoint::Options = endpoint {
return Ok(handle_options_s3api(garage, &req, Some(bucket_name))
.await
.ok_or_bad_request("Error handling OPTIONS")?);
}
let (api_key, mut content_sha256) = check_payload_signature(&garage, "k2v", &req).await?;
let api_key = api_key
.ok_or_else(|| Error::forbidden("Garage does not support anonymous access yet"))?;
let req = parse_streaming_body(
&api_key,
req,
&mut content_sha256,
&garage.config.s3_api.s3_region,
"k2v",
)?;
let bucket_id = garage
.bucket_helper()
.resolve_bucket(&bucket_name, &api_key)
.await?;
let bucket = garage
.bucket_helper()
.get_existing_bucket(bucket_id)
.await?;
let allowed = match endpoint.authorization_type() {
Authorization::Read => api_key.allow_read(&bucket_id),
Authorization::Write => api_key.allow_write(&bucket_id),
Authorization::Owner => api_key.allow_owner(&bucket_id),
_ => unreachable!(),
};
if !allowed {
return Err(Error::forbidden("Operation is not allowed for this key."));
}
// Look up what CORS rule might apply to response.
// Requests for methods different than GET, HEAD or POST
// are always preflighted, i.e. the browser should make
// an OPTIONS call before to check it is allowed
let matching_cors_rule = match *req.method() {
Method::GET | Method::HEAD | Method::POST => find_matching_cors_rule(&bucket, &req)
.ok_or_internal_error("Error looking up CORS rule")?,
_ => None,
};
let resp = match endpoint {
Endpoint::DeleteItem {
partition_key,
sort_key,
} => handle_delete_item(garage, req, bucket_id, &partition_key, &sort_key).await,
Endpoint::InsertItem {
partition_key,
sort_key,
} => handle_insert_item(garage, req, bucket_id, &partition_key, &sort_key).await,
Endpoint::ReadItem {
partition_key,
sort_key,
} => handle_read_item(garage, &req, bucket_id, &partition_key, &sort_key).await,
Endpoint::PollItem {
partition_key,
sort_key,
causality_token,
timeout,
} => {
handle_poll_item(
garage,
&req,
bucket_id,
partition_key,
sort_key,
causality_token,
timeout,
)
.await
}
Endpoint::ReadIndex {
prefix,
start,
end,
limit,
reverse,
} => handle_read_index(garage, bucket_id, prefix, start, end, limit, reverse).await,
Endpoint::InsertBatch {} => handle_insert_batch(garage, bucket_id, req).await,
Endpoint::ReadBatch {} => handle_read_batch(garage, bucket_id, req).await,
Endpoint::DeleteBatch {} => handle_delete_batch(garage, bucket_id, req).await,
Endpoint::Options => unreachable!(),
};
// If request was a success and we have a CORS rule that applies to it,
// add the corresponding CORS headers to the response
let mut resp_ok = resp?;
if let Some(rule) = matching_cors_rule {
add_cors_headers(&mut resp_ok, rule)
.ok_or_internal_error("Invalid bucket CORS configuration")?;
}
Ok(resp_ok)
}
}
impl ApiEndpoint for K2VApiEndpoint {
fn name(&self) -> &'static str {
self.endpoint.name()
}
fn add_span_attributes(&self, span: SpanRef<'_>) {
span.set_attribute(KeyValue::new("bucket", self.bucket_name.clone()));
}
}

Some files were not shown because too many files have changed in this diff Show more