Compare commits

...

261 commits

Author SHA1 Message Date
Alex c3bd672d58
Reorganize code in s3_put.rs 2021-04-07 10:00:47 +02:00
Alex 93a9f96130
remove useless comment 2021-04-06 23:38:30 +02:00
Alex 7380f3855c Merge pull request 'Use content defined chunking' (#43) from trinity-1686a/garage:content-defined-chunking into main
Reviewed-on: Deuxfleurs/garage#43
2021-04-06 22:18:41 +02:00
Trinity Pointard 6cbc8d6ec9 mark branch as unreachable 2021-04-06 16:53:39 +02:00
Trinity Pointard b3b0b20d72 run fmt 2021-04-06 02:54:05 +02:00
Trinity Pointard 47d0aee9f8 change crate used for cdc
previous one seemed to output incorrect results
2021-04-06 02:50:07 +02:00
Trinity Pointard 6e0cb2dfb6 add content defined chuking 2021-04-06 02:50:07 +02:00
Alex c4c4b7dedc
Update documentation 2021-04-06 00:01:49 +02:00
Alex 8225fa2e4b Merge pull request 'Improved bootstraping procedure' (#56) from better_bootstrap into main
Reviewed-on: Deuxfleurs/garage#56
2021-04-05 23:50:33 +02:00
Alex ab67bd88de
Try to fix Drone 2021-04-05 23:41:50 +02:00
Alex 0f192a96b5
small simplify 2021-04-05 23:21:25 +02:00
Alex 7b85056942
Merge discovery loop with consul 2021-04-05 23:04:08 +02:00
Alex 7fd1f9a869
cargo fmt 2021-04-05 20:42:57 +02:00
Alex c5d8dc7d6d
Print stats 2021-04-05 20:42:46 +02:00
Alex fa11cb746a
Cargo fmt 2021-04-05 20:35:26 +02:00
Alex f11bd80d2a
Keep old data 2021-04-05 20:33:24 +02:00
Alex 595dc0ed0d
Persist directly and not in background 2021-04-05 20:26:01 +02:00
Alex 78eeaab5ed
Install rustfmt 2021-04-05 20:00:48 +02:00
Alex 22fbb3b892
Fix Drone CI signature 2021-04-05 19:58:42 +02:00
Alex 0eb5baea1a
Improve bootstraping: do it regularly; persist peer list 2021-04-05 19:55:53 +02:00
Alex 7d772737a5 Add signature to Drone CI 2021-04-05 12:10:28 +02:00
Alex 4c2d8f5a96 test2 2021-04-05 12:09:20 +02:00
Alex 0325086dac test 2021-04-05 12:08:05 +02:00
Alex 700925263f
Drone CI badge on branch main 2021-04-05 11:45:20 +02:00
Alex 6c83f66700 fix command for adding node 2021-04-02 23:23:14 +02:00
Alex 55e4a93bad Section on recovering from failures 2021-04-02 19:30:29 +02:00
LUXEY Adrien 9cb9945131 [doc] Added mention that GarageHQ is hosted with Garage 2021-03-22 16:19:58 +01:00
Alex 7f9c1d5595 publish website only in correct repo 2021-03-22 12:49:21 +01:00
Alex 991cf1b818 fix typos 2021-03-19 17:55:10 +01:00
Alex e26cb2640d Merge remote-tracking branch 'origin/feature/mdbook' 2021-03-19 17:49:57 +01:00
Alex 22041e924b Merge pull request 'intro.md: fix some typos, errors & dead links, plus some stylistic stuff' (#51) from adrien/garage:feature/mdbook into feature/mdbook
Reviewed-on: Deuxfleurs/garage#51
2021-03-19 17:14:41 +01:00
LUXEY Adrien b119e9d3c4 intro.md: fix some typos, errors & dead links, plus some stylistic stuff
modifié :         doc/book/src/intro.md
2021-03-19 16:51:05 +01:00
Alex c4cb9120fe only upload book from branch main 2021-03-19 14:29:37 +01:00
Alex c5e24de159 Center logo in book, add book CI 2021-03-19 14:25:57 +01:00
Alex 4e94b2b704 Fix checksum path in drone CI 2021-03-19 14:15:19 +01:00
Alex 44ff627e7d update makefile 2021-03-19 14:03:57 +01:00
Alex f859d15062 update to v0.2.1 2021-03-19 13:39:18 +01:00
Alex fd8f4caa81 Support old CPUs 2021-03-19 12:19:40 +01:00
Alex 4c26a0b9c1 Update Cargo.toml files with AGPL license info 2021-03-18 21:59:17 +01:00
Alex a1014224d3 garage node configure --replace <old_node_id> <new_node_id> 2021-03-18 21:49:12 +01:00
Alex e19e267b7e Merge pull request 'add support for using domain name in configuration' (#46) from trinity-1686a/garage:bind-domain into master
Reviewed-on: Deuxfleurs/garage#46
2021-03-18 21:26:34 +01:00
Trinity Pointard f17cb6c969 resolve domain to multiple addresses
And warn instead of failling when a domain can't be resolved
2021-03-18 21:04:30 +01:00
Trinity Pointard c8a7ce5cdf remove domain resolution for *_bind_addr 2021-03-18 19:47:51 +01:00
Trinity Pointard 81e9db783f simplify addresse deserialialiser and limit allocations 2021-03-18 19:47:51 +01:00
Trinity Pointard ae3b7029a9 add support for using domain name in configuration 2021-03-18 19:47:51 +01:00
Alex 6edbc65847 Add trinity's comment in the code 2021-03-18 19:46:43 +01:00
Alex bfa0ff8f82 Merge pull request 'add support for caching headers' (#49) from trinity-1686a/garage:cache-headers into master
Reviewed-on: Deuxfleurs/garage#49
2021-03-18 19:45:02 +01:00
Alex dead945c8f Prepare for release 0.2 2021-03-18 19:33:15 +01:00
Alex 4348bde180 Merge branch 'dev-0.2' 2021-03-18 19:27:02 +01:00
Alex 4eb16e8863 Allow to import keys from previous Garage instance 2021-03-18 19:24:59 +01:00
Alex 8e317e2783 Merge pull request 'try fixing CI' (#50) from trinity-1686a/garage:try-fix-ci into dev-0.2
Reviewed-on: Deuxfleurs/garage#50
2021-03-18 19:07:56 +01:00
Trinity Pointard ec26acb161 try fixing CI
maybe absolute path break caching, who knows
2021-03-18 17:45:26 +01:00
Trinity Pointard b4c903371c add support for caching headers 2021-03-18 15:46:33 +01:00
Quentin 3e6534d7a8 Fix garage_util description 2021-03-18 11:46:29 +01:00
Quentin 2dae4a25d6 Fix some typos 2021-03-18 11:35:50 +01:00
Quentin fcd566e89d Fix a table in the doc 2021-03-18 11:26:02 +01:00
Alex 5b659b28ce Merge pull request 'Add a mdbook documentation to present garage and help user on-boarding' (#45) from feature/mdbook into master
Reviewed-on: Deuxfleurs/garage#45
2021-03-18 10:39:55 +01:00
Quentin ea21c54434 Add handle files section to the doc 2021-03-17 22:44:35 +01:00
Quentin 1a5af9d1fc WIP getting started 2021-03-17 22:06:37 +01:00
Quentin b82a61fba2 Simplify our README 2021-03-17 20:58:30 +01:00
Quentin 44d0815ff9 Wrote daemon 2021-03-17 20:04:27 +01:00
Quentin 468e45ed7f WIP doc 2021-03-17 18:01:06 +01:00
Quentin 60f994a118 Working on the getting started guide 2021-03-17 17:24:11 +01:00
Quentin 002538f92c Refactor file organization 2021-03-17 16:15:18 +01:00
Quentin c50113acf3 Work on structure + Getting started is reworked 2021-03-17 15:44:29 +01:00
Quentin 0afc701a69 Doc skeleton + intro 2021-03-17 14:44:14 +01:00
Alex 797cda1c33 Add logo in repo 2021-03-17 10:26:50 +01:00
Alex 46c7226fe4 New logo!! 2021-03-17 08:41:07 +01:00
Alex 390ab02f41 Todo make a test for the Merkle updater 2021-03-16 20:13:07 +01:00
Alex 7b10245dfb Leader-based GC 2021-03-16 18:42:33 +01:00
Alex 08bcd51956 GC object table in a specific case 2021-03-16 16:51:15 +01:00
Alex 1bfe9c129e Switch to AGPL 2021-03-16 16:35:46 +01:00
Alex 3fadc5cbbd Small changes 2021-03-16 16:35:10 +01:00
Alex f4346cc5f4 Update dependencies 2021-03-16 15:58:40 +01:00
Alex 2a41b82384 Simpler Merkle & sync 2021-03-16 12:18:03 +01:00
Alex 0aad2f2e06 some reordering 2021-03-16 11:47:39 +01:00
Alex 515029d026 Refactor code 2021-03-16 11:43:58 +01:00
Alex 1d9961e411 Simplify replication logic 2021-03-16 11:14:27 +01:00
Alex 6a8439fd13 Some improvements in background worker but we terminate late 2021-03-15 23:14:12 +01:00
Alex 0cd5b2ae19 WIP migrate to tokio 1 2021-03-15 22:36:41 +01:00
Alex 4d4117f2b4 Refactor block resync loop; make workers infaillible 2021-03-15 20:09:44 +01:00
Alex 667e4e72a8 Small fixes 2021-03-15 19:51:16 +01:00
Alex 642bed601f Make it case-insensitive 2021-03-15 19:16:42 +01:00
Alex 5ee1d956b6 Allow manipulation of keys by their shorthand in the CLI 2021-03-15 19:14:26 +01:00
Alex 537f652fec Tiny things 2021-03-15 18:40:27 +01:00
Alex 0290afe1f8 Make block rc code more understandable 2021-03-15 18:27:26 +01:00
Alex 3bf2df622a Time and metadata improvements 2021-03-15 16:21:41 +01:00
Alex 097c339d98 Fix race condition 2021-03-15 15:26:29 +01:00
Alex bdcbdd1cd8 Fix list API bug 2021-03-15 14:46:37 +01:00
Alex 9b118160a8 Optim & refactor 2021-03-12 22:06:56 +01:00
Alex 831eb35763 cargo fmt 2021-03-12 21:52:19 +01:00
Alex c475471e7a Implement table gc, currently for block_ref and version only 2021-03-12 19:57:37 +01:00
Alex f4aad8fe6e cargo fmt 2021-03-12 18:16:03 +01:00
Alex 5ab33fddac Refactor CLI and prettify CLI outpu 2021-03-12 18:12:31 +01:00
Alex a1442f072a Implement garage stats to get info on node contents 2021-03-12 15:40:54 +01:00
Alex cbe7e1a66a Move table rpc client out of tableaux 2021-03-12 15:07:23 +01:00
Alex 8860aa19b8 Make syncer have its own rpc client/server 2021-03-12 15:05:26 +01:00
Alex 1fea257291 Don't sync at beginning 2021-03-12 14:51:17 +01:00
Alex 7fdaf7aef0 Fix merkle updater not being notified; improved logging 2021-03-12 14:37:46 +01:00
Alex 1ec49980ec whoops 2021-03-11 19:30:24 +01:00
Alex 3f7a496355 More security: don't delete stuff too easily 2021-03-11 19:06:27 +01:00
Alex f7c2cd1cd7 Add comment, and also whoops, this wasn't doing what we expected 2021-03-11 18:56:18 +01:00
Alex fae5104a2c Add a nice warning 2021-03-11 18:50:32 +01:00
Alex db7a9d4948 Tiny changes 2021-03-11 18:45:26 +01:00
Alex 046b649bcc (not well tested) use merkle tree for sync 2021-03-11 18:28:27 +01:00
Alex 94f3d28774 WIP big refactoring 2021-03-11 16:54:15 +01:00
Alex 8d63738cb0 Checkpoint: add merkle tree in data table 2021-03-11 13:47:21 +01:00
Alex 3214dd52dd Very minor changes 2021-03-10 21:50:09 +01:00
Alex af7600f989 Correctly implement CompleteMultipartUpload with etag check of parts 2021-03-10 17:01:05 +01:00
Alex 445912dc6a Remove migration paths from 0.1 branch 2021-03-10 16:38:31 +01:00
Alex 0fd7df8fa0 Switch to blake2 sum for identifying blocks by their data 2021-03-10 16:33:31 +01:00
Alex 2afd2c81ba Change hash function to blake2 for partition keys based on strings 2021-03-10 16:23:57 +01:00
Alex f319a7d374 Refactor model stuff, including cleaner CRDTs 2021-03-10 16:21:56 +01:00
Alex 6a3dcf3974 Rename n_tokens into capacity 2021-03-10 14:52:03 +01:00
Alex 7cda917b6b update condition 2021-03-05 17:08:03 +01:00
Alex 5e33c3cfc9 Use smaller capacities for nodes 2021-03-05 16:58:34 +01:00
Alex d7e005251d Not fully tested: new multi-dc MagLev 2021-03-05 16:22:29 +01:00
Alex 3882d5ba36 Remove epidemic propagation for fully replicated stuff: write directly to all nodes 2021-03-05 15:09:18 +01:00
Alex d7e148d302 Description of MultiDC MagLev 2021-02-25 11:37:42 +01:00
Alex 0522983aec precisions 2021-02-25 11:27:13 +01:00
Alex fdf908e845 Add precisions 2021-02-25 11:23:22 +01:00
Alex 2b4b69938f Add write-up about load-balancing 2021-02-25 11:18:44 +01:00
Alex 49c25a1509 Simulate stuff moving around 2021-02-25 10:53:33 +01:00
Alex 5fe95ebae7 fix tracing 2021-02-24 12:18:01 +01:00
Alex 13e2eda0c2 Arrange block manager 2021-02-24 11:58:03 +01:00
Alex 09fd6ea7f0 I was tired yesterday 2021-02-24 11:05:59 +01:00
Alex a52ab69640 fix misuse of sled transactions 2021-02-23 22:45:36 +01:00
Alex 20e6e9fa20 Update sled & try to debug deadlock (but its in sled...) 2021-02-23 21:27:28 +01:00
Alex bf25c95fe2 Make updated() be a sync function that doesn't fail 2021-02-23 20:25:15 +01:00
Alex 28bc967c83 Handle correctly deletion dues to offloading 2021-02-23 19:59:43 +01:00
Alex 55156cca9d Several changes in table_sync:
- separate path for case of offloading a partition we don't store
- use sync::Mutex instead of tokio::Mutex, make less fn's async
2021-02-23 19:11:02 +01:00
Alex 40763fd749 Cargo fmt 2021-02-23 18:46:25 +01:00
Alex 6e6f7e8555 Replace some checksums where it makes sense 2021-02-23 18:14:37 +01:00
Alex e8e4418ca7 Add blake2 and xxhash hash functions 2021-02-23 17:52:28 +01:00
Alex 1abbca37c4 Add adapted version of maglev for multi-dc 2021-02-21 19:14:28 +01:00
Alex 24f924afdb Maglev simulation 2021-02-21 18:32:13 +01:00
Alex b1b640ae8b rename hash() to sha256sum(), we might want to change it at some places 2021-02-21 15:24:30 +01:00
Alex e59322041a Evaluate hash functions 2021-02-21 15:11:15 +01:00
Alex 217b0dfd68 Add script to simulate different kinds of rings 2021-02-21 14:30:26 +01:00
Alex 80892df8cc Some refactoring 2021-02-21 13:11:10 +01:00
Alex 3bcbbe1e31 More precise logging (warn only when returning a 500) 2021-02-20 00:30:39 +01:00
Alex 10b983b8e7 Add verification of part numbers in CompleteMultipartUpload (WIP #30) 2021-02-20 00:13:07 +01:00
Alex 1de96248e0 add application/xml header and missing xml escapes 2021-02-19 23:40:18 +01:00
Alex 0ddfee92c5 add precision 2021-02-19 19:11:55 +01:00
Alex 5d1fa591d9 Add compatibility list 2021-02-19 19:10:23 +01:00
Alex e64ecbdccd S3 compatibility: return 404 instead of 400 on some multipart commands 2021-02-19 18:51:05 +01:00
Alex 55a2a636ca Implement ListObjectsV2 2021-02-19 16:44:06 +01:00
Alex 02d512f3fd Fix #28, extra headers being ignored (because of profound stupidity) 2021-02-19 12:38:22 +01:00
Alex 76390085ef Small improvements in the S3 put workflow 2021-02-19 12:11:02 +01:00
Alex 3b023c0c3b try to fix smoke test
dev cluster: don't ipv6 (fixes smoke test in container?)
2021-02-17 22:13:11 +01:00
Alex b5dc4d33a9 CI improvements 2021-02-17 22:13:08 +01:00
Alex a3d9d0faac yea more gzip 2021-02-10 13:11:13 +01:00
Alex bcf727f9c1 oops, missed a step 2021-02-10 12:54:11 +01:00
Alex 24766a0eb9 change dockerfile; add gzip caching 2021-02-10 12:53:00 +01:00
Alex 75d96bdc58 Add drone caching 2021-02-08 16:32:29 +01:00
Alex b3a27b9ebb Add drone badge 2021-02-08 15:04:22 +01:00
Alex 9a8b387bb0 fix drone 2021-02-08 15:02:03 +01:00
Alex 536bbb2277 add drone.yml 2021-02-08 14:58:24 +01:00
Alex f26c795c99 Add warning 2021-01-23 19:15:57 +01:00
Alex 6f6bf23bec Move doc files here from Deuxfleur's site repo 2021-01-23 19:12:10 +01:00
Alex 36814be447 Fix S3 ListObjects result and replace println!s by debug!s 2021-01-16 16:05:54 +01:00
Alex 6a5add3386 Fix build 2021-01-15 19:12:08 +01:00
Alex e818f51073 Forgot a bump 2021-01-15 18:36:51 +01:00
Alex ceeb0732a2 Use 0.1.0b instead of 0.1.0 (for compatibility with new Error type) 2021-01-15 18:27:58 +01:00
Alex 1d1d497e2b Bump everything to 0.1.1 2021-01-15 17:54:48 +01:00
Alex 97494b4a19 Merge pull request 'Support website publishing' (#7) from feature/website into master
Reviewed-on: Deuxfleurs/garage#7
2021-01-15 17:49:49 +01:00
Alex 851893a3f2 Do not accept domains such as [hello 2021-01-15 17:49:10 +01:00
Quentin f8a40e8c4f Explicitly set code path unreachable 2021-01-15 17:11:15 +01:00
Quentin fad7bc405b Behavior problem: do not panic anymore + add tests 2021-01-15 17:03:54 +01:00
Quentin 1e10c6a61c Doc tests that do not compile/work must be tagged with ignore 2021-01-15 17:03:38 +01:00
Quentin 11a79a95dd Simplify Error file 2021-01-15 16:25:44 +01:00
Quentin c441a358cd Remove unused dependencies 2021-01-15 16:16:32 +01:00
Quentin f496e41ef4 Replace an already done check by unreachable!() 2021-01-15 15:44:44 +01:00
Quentin 2f4378a9c4 Fix formatting 2020-12-17 22:51:44 +01:00
Quentin ccda9ab1ca Merge branch 'master' into feature/website 2020-12-17 21:09:50 +01:00
Quentin 086e5be290 Update testing script 2020-12-17 21:04:59 +01:00
Quentin 3132deca58 Web server access control 2020-12-17 20:43:14 +01:00
Quentin 011ff87b5f Push update 2020-12-15 13:23:22 +01:00
Quentin 3bc4d57a0f First implementation of the CLI 2020-12-15 12:48:24 +01:00
Quentin a3566e49da Start to implement Website CLI 2020-12-14 21:50:40 +01:00
Quentin d0eb6a457f Migrate RPC to new schema 2020-12-14 21:46:49 +01:00
Quentin 96388acf23 Implement migration 2020-12-12 21:35:29 +01:00
Alex 8956db2a81 Make less things public 2020-12-12 17:58:19 +01:00
Alex e72c6d0575 Merge pull request 'Table+Model doc' (#22) from doc/model into master
Reviewed-on: Deuxfleurs/garage#22
2020-12-12 17:08:28 +01:00
Alex 5c6c067b0c More documentation on CRDTs (we should probably extract this to a
standalone crate!)
2020-12-12 17:06:40 +01:00
Quentin e1ce2b228a WIP table migration 2020-12-12 17:00:31 +01:00
Alex 0b3084ca5f Merge branch 'master' into doc/model 2020-12-12 16:05:28 +01:00
Quentin 1119d466e7 Fix S3 command 2020-12-10 20:19:22 +01:00
Quentin e8c12072ce Merge branch 'master' into feature/website 2020-12-10 20:12:56 +01:00
Quentin 51d0c14e44 CLI structure 2020-12-10 18:13:32 +01:00
Alex 022b386a50 Improved compatibility on list API call 2020-12-06 15:39:03 +01:00
Alex 857440f192 Merge pull request 'Propose ETag fix' (#23) from bug/etag into master
Reviewed-on: Deuxfleurs/garage#23
2020-12-06 15:27:58 +01:00
Alex 39f45b3058 Merge pull request 'Merge the new smoke test to master' (#25) from feature/smoke-script into master
Reviewed-on: Deuxfleurs/garage#25
2020-12-06 15:27:39 +01:00
Quentin 986e15459a Merge branch 'master' into feature/website 2020-12-06 15:21:09 +01:00
Quentin e13fd09543 Reduce garage.1.rnd size to store it inline 2020-12-06 13:33:08 +01:00
Quentin a92868504f Indentation & comments 2020-12-06 10:23:14 +01:00
Quentin d2d1fc676d Test awscli/s3cmd interactions 2020-12-06 10:19:01 +01:00
Quentin a12930075d Test garage list & delete commands 2020-12-06 10:04:17 +01:00
Quentin 5c3dd9c74a Fix typo 2020-12-06 09:54:11 +01:00
Quentin 28055b708f Improve README, add more tests 2020-12-06 09:13:47 +01:00
Quentin 132c54b807 wip smoke test 2020-12-05 19:26:10 +01:00
Alex 4a5bbbb810 Propose ETag fix 2020-12-05 19:23:46 +01:00
Alex dfbc280c37 Merge pull request 'Content-range fix' (#24) from bug/content-range into master
Reviewed-on: Deuxfleurs/garage#24
2020-12-05 19:21:32 +01:00
Alex 76b489f3d3 Reformulate patch 2020-12-05 19:20:07 +01:00
Quentin bd7e3d1bd1 Fix Content-Length 2020-12-05 18:57:22 +01:00
Alex 9f46fb699a Content-range fix 2020-12-05 16:37:59 +01:00
Alex f844d4ee9b Add slide on consistency 2020-12-01 17:42:13 +01:00
Alex 7642229d54 Two new slides 2020-12-01 14:31:13 +01:00
Alex ad432eb154 Add some first technical slides 2020-12-01 13:46:23 +01:00
Quentin 9a57a0319a Talk 2020-11-30 17:32:37 +01:00
Alex 9b3aabfcbf Add talk template 2020-11-30 15:54:31 +01:00
Quentin 8df0b322ab Fix merge error 2020-11-29 17:40:36 +01:00
Quentin a512db342e Merge branch 'master' into feature/website 2020-11-29 17:40:03 +01:00
Alex 2f11191f60 Use ipv6 localhost for dev cluster and different port numbers 2020-11-29 17:31:58 +01:00
Quentin 54c3a023f0 Use aws cli version 2 2020-11-29 17:27:49 +01:00
Quentin 15f409d404 Merge branch 'master' into feature/website 2020-11-29 17:19:55 +01:00
Quentin cee6c3a821 A fix for s3cmd 2020-11-29 17:15:49 +01:00
Quentin 13d1b66ba4 Rollback logging on dev-cluster 2020-11-29 17:07:29 +01:00
Alex d54f15b2c6 Small optimisation 2020-11-29 17:07:14 +01:00
Quentin 3f18aa6f1d Add a smoke test script 2020-11-29 17:03:19 +01:00
Quentin 07e87595f8 S3 does not support accentuated buckets + add a script to clean tmp 2020-11-29 17:03:19 +01:00
Alex fed97f37e1 ETag patch 2020-11-29 16:38:01 +01:00
Alex 601ae25ad2 Small refactorings 2020-11-29 16:21:28 +01:00
Quentin cbd10c1b0a Add some doc on LWW 2020-11-23 18:17:48 +01:00
Quentin 8722e27600 CRDT doc 2020-11-23 17:49:21 +01:00
Quentin aa320aa04a Merge branch 'master' into feature/website 2020-11-22 19:54:47 +01:00
Quentin fb18f5e17a Fix wrong http status code 2020-11-21 18:14:02 +01:00
Quentin 28efe341cb Merge branch 'master' into feature/website 2020-11-21 18:01:50 +01:00
Quentin b7a377308b Handle HEAD 2020-11-21 17:58:14 +01:00
Quentin a88fd49f71 Use handle_get 2020-11-21 17:50:19 +01:00
Quentin 78be2b363f Remove s3cmd mention 2020-11-21 15:45:52 +01:00
Quentin 92ab3eedfc Use awscli instead of s3cmd 2020-11-21 15:44:09 +01:00
Quentin 0f33231ee6 We are able to serve a file 2020-11-21 15:15:25 +01:00
Quentin d4c7f4e374 Fix host to key 2020-11-21 12:01:02 +01:00
Quentin 2f6eca4ef3 Merge remote-tracking branch 'origin/master' into feature/website 2020-11-21 10:52:27 +01:00
Quentin 5b363626f4 Support punnycode 2020-11-20 21:23:32 +01:00
Quentin 2e94275e68 Merge branch 'master' into feature/website 2020-11-19 18:16:33 +01:00
Quentin ab71048176 Fix example 2020-11-19 15:10:04 +01:00
Quentin 04f455ff7f Make it compile again 2020-11-19 14:56:00 +01:00
Quentin fc427b0b66 Merge branch 'master' into feature/website 2020-11-19 14:39:30 +01:00
Quentin 6076d869b1 Build error 2020-11-11 21:17:34 +01:00
Quentin 2765291796 Build path correctly 2020-11-11 19:48:01 +01:00
Quentin d445c4ef9c WIP fetch object 2020-11-11 15:24:25 +01:00
Quentin 3cb3994cd2 Add documentation to host_to_bucket 2020-11-10 17:05:10 +01:00
Quentin cacf8ddf2d Panic when it is a logical error 2020-11-10 15:52:20 +01:00
Quentin d1b2fcc1e7 Rewrite for clarity 2020-11-10 15:48:40 +01:00
Quentin ab62c59acb Fix indent again 2020-11-10 15:40:33 +01:00
Quentin 8797eed0ab Fixes due to integration tests 2020-11-10 15:32:04 +01:00
Quentin 1e52ee9f5b Rewrite authority to host while staying on stack 2020-11-10 15:26:48 +01:00
Quentin 27795a390c Fix formatting 2020-11-10 09:59:52 +01:00
Quentin 4093833ae8 Extract bucket 2020-11-10 09:57:07 +01:00
Quentin 09137fd6b5 Log host 2020-11-08 16:06:52 +01:00
Quentin c78df603d7 Add some documentation 2020-11-08 16:02:16 +01:00
Quentin 71721f5bcf Merge branch 'master' into feature/website 2020-11-08 15:53:33 +01:00
Quentin 0791e7164e Parse host header 2020-11-08 15:47:25 +01:00
Quentin fdbf5100cd Merge branch 'master' into feature/website 2020-11-07 12:49:12 +01:00
Quentin 154f71f410 Fix README + create dev config file 2020-11-06 14:23:56 +01:00
Quentin 0d3bc169ee It compiles! 2020-11-03 12:37:16 +01:00
Quentin b3caa3628d Fix description of the crate 2020-11-02 15:57:23 +01:00
Quentin cea871d944 Skeleton to the new web API 2020-11-02 15:48:39 +01:00
Quentin 104e2ce0a2 Add "web" configuration entry 2020-10-31 17:28:56 +01:00
128 changed files with 11108 additions and 4259 deletions

123
.drone.yml Normal file
View file

@ -0,0 +1,123 @@
---
kind: pipeline
name: default
workspace:
base: /drone/garage
steps:
- name: restore-cache
image: meltwater/drone-cache:dev
environment:
AWS_ACCESS_KEY_ID:
from_secret: cache_aws_access_key_id
AWS_SECRET_ACCESS_KEY:
from_secret: cache_aws_secret_access_key
pull: true
settings:
restore: true
archive_format: "gzip"
bucket: drone-cache
cache_key: '{{ .Repo.Name }}_{{ checksum "Cargo.lock" }}_{{ arch }}_{{ os }}_gzip'
region: garage
mount:
- 'target'
- '/drone/cargo/registry/index'
- '/drone/cargo/registry/cache'
- '/drone/cargo/bin'
- '/drone/cargo/git/db'
path_style: true
endpoint: https://garage.deuxfleurs.fr
- name: build
image: rust:buster
environment:
CARGO_HOME: /drone/cargo
commands:
- apt-get update
- apt-get install --yes libsodium-dev
- rustup component add rustfmt
- pwd
- cargo fmt -- --check
- cargo build
- name: cargo-test
image: rust:buster
environment:
CARGO_HOME: /drone/cargo
commands:
- apt-get update
- apt-get install --yes libsodium-dev
- cargo test
- name: rebuild-cache
image: meltwater/drone-cache:dev
environment:
AWS_ACCESS_KEY_ID:
from_secret: cache_aws_access_key_id
AWS_SECRET_ACCESS_KEY:
from_secret: cache_aws_secret_access_key
pull: true
settings:
rebuild: true
archive_format: "gzip"
bucket: drone-cache
cache_key: '{{ .Repo.Name }}_{{ checksum "Cargo.lock" }}_{{ arch }}_{{ os }}_gzip'
region: garage
mount:
- 'target'
- '/drone/cargo/registry/index'
- '/drone/cargo/registry/cache'
- '/drone/cargo/git/db'
- '/drone/cargo/bin'
path_style: true
endpoint: https://garage.deuxfleurs.fr
- name: smoke-test
image: rust:buster
environment:
CARGO_HOME: /drone/cargo
commands:
- apt-get update
- apt-get install --yes libsodium-dev awscli python-pip
- pip install s3cmd
- ./script/test-smoke.sh || (cat /tmp/garage.log; false)
---
kind: pipeline
name: website
steps:
- name: build
image: hrektts/mdbook
commands:
- cd doc/book
- mdbook build
- name: upload
image: plugins/s3
settings:
bucket: garagehq.deuxfleurs.fr
access_key:
from_secret: garagehq_aws_access_key_id
secret_key:
from_secret: garagehq_aws_secret_access_key
source: doc/book/book/**/*
strip_prefix: doc/book/book/
target: /
path_style: true
endpoint: https://garage.deuxfleurs.fr
region: garage
when:
event:
- push
branch:
- main
repo:
- Deuxfleurs/garage
---
kind: signature
hmac: bfe75f47e5eecdd1f6dd8fd3cf1ea359b0215243d06ac767c51a4b4e363e963e
...

1158
Cargo.lock generated

File diff suppressed because it is too large Load diff

View file

@ -5,6 +5,7 @@ members = [
"src/table",
"src/model",
"src/api",
"src/web",
"src/garage",
]

View file

@ -3,7 +3,7 @@ FROM archlinux:latest
RUN mkdir -p /garage/meta
RUN mkdir -p /garage/data
ENV RUST_BACKTRACE=1
ENV RUST_LOG=garage=debug
ENV RUST_LOG=garage=info
COPY target/release/garage.stripped /garage/garage

142
LICENSE
View file

@ -1,5 +1,5 @@
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
GNU AFFERO GENERAL PUBLIC LICENSE
Version 3, 19 November 2007
Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
@ -7,17 +7,15 @@
Preamble
The GNU General Public License is a free, copyleft license for
software and other kinds of works.
The GNU Affero General Public License is a free, copyleft license for
software and other kinds of works, specifically designed to ensure
cooperation with the community in the case of network server software.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
our General Public Licenses are intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.
software for all its users.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
@ -26,44 +24,34 @@ them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.
Developers that use our General Public Licenses protect your rights
with two steps: (1) assert copyright on the software, and (2) offer
you this License which gives you legal permission to copy, distribute
and/or modify the software.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.
A secondary benefit of defending all users' freedom is that
improvements made in alternate versions of the program, if they
receive widespread use, become available for other developers to
incorporate. Many developers of free software are heartened and
encouraged by the resulting cooperation. However, in the case of
software used on network servers, this result may fail to come about.
The GNU General Public License permits making a modified version and
letting the public access it on a server without ever releasing its
source code to the public.
Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.
The GNU Affero General Public License is designed specifically to
ensure that, in such cases, the modified source code becomes available
to the community. It requires the operator of a network server to
provide the source code of the modified version running there to the
users of that server. Therefore, public use of a modified version, on
a publicly accessible server, gives the public access to the source
code of the modified version.
For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.
Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.
An older license, called the Affero General Public License and
published by Affero, was designed to accomplish similar goals. This is
a different license, not a version of the Affero GPL, but Affero has
released a new version of the Affero GPL which permits relicensing under
this license.
The precise terms and conditions for copying, distribution and
modification follow.
@ -72,7 +60,7 @@ modification follow.
0. Definitions.
"This License" refers to version 3 of the GNU General Public License.
"This License" refers to version 3 of the GNU Affero General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
@ -549,35 +537,45 @@ to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Use with the GNU Affero General Public License.
13. Remote Network Interaction; Use with the GNU General Public License.
Notwithstanding any other provision of this License, if you modify the
Program, your modified version must prominently offer all users
interacting with it remotely through a computer network (if your version
supports such interaction) an opportunity to receive the Corresponding
Source of your version by providing access to the Corresponding Source
from a network server at no charge, through some standard or customary
means of facilitating copying of software. This Corresponding Source
shall include the Corresponding Source for any work covered by version 3
of the GNU General Public License that is incorporated pursuant to the
following paragraph.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
under version 3 of the GNU General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.
but the work with which it is combined will remain governed by version
3 of the GNU General Public License.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
the GNU Affero General Public License from time to time. Such new versions
will be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU General
Program specifies that a certain numbered version of the GNU Affero General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
GNU Affero General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
versions of the GNU Affero General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
@ -635,41 +633,29 @@ the "copyright" line and a pointer to where the full notice is found.
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
GNU Affero General Public License for more details.
You should have received a copy of the GNU General Public License
You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:
<program> Copyright (C) <year> <name of author>
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".
If your software can interact with users remotely through a computer
network, you should also make sure that it provides a way for users to
get its source. For example, if your program is a web application, its
interface could display a "Source" link that leads users to an archive
of the code. There are many ways you could offer source, and different
solutions will be better for different programs; see section 13 for the
specific requirements.
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU GPL, see
For more information on this, and how to apply and follow the GNU AGPL, see
<https://www.gnu.org/licenses/>.
The GNU General Public License does not permit incorporating your program
into proprietary programs. If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. But first, please read
<https://www.gnu.org/licenses/why-not-lgpl.html>.

View file

@ -2,19 +2,17 @@ BIN=target/release/garage
DOCKER=lxpz/garage_amd64
all:
#cargo fmt || true
#RUSTFLAGS="-C link-arg=-fuse-ld=lld" cargo build
cargo build
clear; cargo build
$(BIN):
#RUSTFLAGS="-C link-arg=-fuse-ld=lld" cargo build --release
cargo build --release
RUSTFLAGS="-C link-arg=-fuse-ld=lld -C target-cpu=x86-64 -C target-feature=+sse2" cargo build --release --no-default-features
$(BIN).stripped: $(BIN)
cp $^ $@
strip $@
docker: $(BIN).stripped
docker pull archlinux:latest
docker build -t $(DOCKER):$(TAG) .
docker push $(DOCKER):$(TAG)
docker tag $(DOCKER):$(TAG) $(DOCKER):latest

117
README.md
View file

@ -1,4 +1,11 @@
# Garage
Garage [![Build Status](https://drone.deuxfleurs.fr/api/badges/Deuxfleurs/garage/status.svg?ref=refs/heads/main)](https://drone.deuxfleurs.fr/Deuxfleurs/garage)
===
<p align="center" style="text-align:center;">
<a href="https://garagehq.deuxfleurs.fr">
<img alt="Garage logo" src="doc/logo/garage.png" height="200" />
</a>
</p>
Garage is a lightweight S3-compatible distributed object store, with the following goals:
@ -12,112 +19,8 @@ Non-goals include:
- Extremely high performance
- Complete implementation of the S3 API
- Erasure coding (our replication model is simply to copy the data as is on several nodes)
- Erasure coding (our replication model is simply to copy the data as is on several nodes, in different datacenters if possible)
Our main use case is to provide a distributed storage layer for small-scale self hosted services such as [Deuxfleurs](https://deuxfleurs.fr).
## Development
We propose the following quickstart to setup a full dev. environment as quickly as possible:
1. Setup a rust/cargo environment and install s3cmd. eg. `dnf install rust cargo s3cmd`
2. Run `cargo build` to build the project
3. Run `./script/dev-cluster.sh` to launch a test cluster (feel free to read the script)
4. Run `./script/dev-configure.sh` to configure your test cluster with default values (same datacenter, 100 tokens)
5. Run `./script/dev-bucket.sh` to create a bucket named `éprouvette` and an API key that will be stored in `/tmp/garage.s3`
6. Run `source ./script/dev-env.sh` to configure your CLI environment
7. You can use `garage` to manage the cluster. Try `garage --help`.
8. You can use `s3grg` to add, remove, and delete files. Try `s3grg --help`, `s3grg put /proc/cpuinfo s3://éprouvette/cpuinfo.txt`, `s3grg ls s3://éprouvette`. `s3grg` is a wrapper on `s3cmd` configured with the previously generated API key (the one in `/tmp/garage.s3`).
Now you should be ready to start hacking on garage!
## Setting up Garage
Use the `genkeys.sh` script to generate TLS keys for encrypting communications between Garage nodes.
The script takes no arguments and will generate keys in `pki/`.
This script creates a certificate authority `garage-ca` which signs certificates for individual Garage nodes.
Garage nodes from a same cluster authenticate themselves by verifying that they have certificates signed by the same certificate authority.
Garage requires two locations to store its data: a metadata directory, and a data directory.
The metadata directory is used to store metadata such as object lists, and should ideally be located on an SSD drive.
The data directory is used to store the chunks of data of the objects stored in Garage.
In a typical deployment the data directory is stored on a standard HDD.
Garage does not handle TLS for its S3 API endpoint. This should be handled by adding a reverse proxy.
Create a configuration file with the following structure:
```
block_size = 1048576 # objects are split in blocks of maximum this number of bytes
metadata_dir = "/path/to/ssd/metadata/directory"
data_dir = "/path/to/hdd/data/directory"
rpc_bind_addr = "[::]:3901" # the port other Garage nodes will use to talk to this node
bootstrap_peers = [
# Ideally this list should contain the IP addresses of all other Garage nodes of the cluster.
# Use Ansible or any kind of configuration templating to generate this automatically.
"10.0.0.1:3901",
"10.0.0.2:3901",
"10.0.0.3:3901",
]
# optionnal: garage can find cluster nodes automatically using a Consul server
# garage only does lookup but does not register itself, registration should be handled externally by e.g. Nomad
consul_host = "localhost:8500" # optionnal: host name of a Consul server for automatic peer discovery
consul_service_name = "garage" # optionnal: service name to look up on Consul
max_concurrent_rpc_requests = 12
data_replication_factor = 3
meta_replication_factor = 3
meta_epidemic_fanout = 3
[rpc_tls]
# NOT RECOMMENDED: you can skip this section if you don't want to encrypt intra-cluster traffic
# Thanks to genkeys.sh, generating the keys and certificates is easy, so there is NO REASON NOT TO DO IT.
ca_cert = "/path/to/garage/pki/garage-ca.crt"
node_cert = "/path/to/garage/pki/garage.crt"
node_key = "/path/to/garage/pki/garage.key"
[s3_api]
api_bind_addr = "[::1]:3900" # the S3 API port, HTTP without TLS. Add a reverse proxy for the TLS part.
s3_region = "garage" # set this to anything. S3 API calls will fail if they are not made against the region set here.
[s3_web]
web_bind_addr = "[::1]:3902"
```
Build Garage using `cargo build --release`.
Then, run it using either `./target/release/garage server -c path/to/config_file.toml` or `cargo run --release -- server -c path/to/config_file.toml`.
Set the `RUST_LOG` environment to `garage=debug` to dump some debug information.
Set it to `garage=trace` to dump even more debug information.
Set it to `garage=warn` to show nothing except warnings and errors.
## Setting up cluster nodes
Once all your `garage` nodes are running, you will need to:
1. check that they are correctly talking to one another;
2. configure them with their physical location (in the case of a multi-dc deployment) and a number of "ring tokens" proportionnal to the storage space available on each node;
3. create some S3 API keys and buckets;
4. ???;
5. profit!
To run these administrative tasks, you will need to use the `garage` command line tool and it to connect to any of the cluster's nodes on the RPC port.
The `garage` CLI also needs TLS keys and certificates of its own to authenticate and be authenticated in the cluster.
A typicall invocation will be as follows:
```
./target/release/garage --ca-cert=pki/garage-ca.crt --client-cert=pki/garage-client.crt --client-key=pki/garage-client.key <...>
```
## Notes to self
### What to repair
- `tables`: to do a full sync of metadata, should not be necessary because it is done every hour by the system
- `versions` and `block_refs`: very time consuming, usefull if deletions have not been propagated, improves garbage collection
- `blocks`: very usefull to resync/rebalance blocks betweeen nodes
**[Go to the documentation](https://garagehq.deuxfleurs.fr)**

View file

@ -17,4 +17,6 @@ api_bind_addr = "[::1]:3900" # the S3 API port, HTTP without TLS. Add a reverse
s3_region = "garage" # set this to anything. S3 API calls will fail if they are not made against the region set here.
[s3_web]
web_bind_addr = "[::1]:3902"
bind_addr = "[::1]:3902"
root_domain = ".garage.tld"
index = "index.html"

12
doc/20201202_talk/.gitignore vendored Normal file
View file

@ -0,0 +1,12 @@
*
!img
!.gitignore
!*.svg
!*.png
!*.jpg
!*.tex
!Makefile
!.gitignore
!talk.pdf

View file

@ -0,0 +1,6 @@
talk.pdf: talk.tex img/garage_distributed.pdf img/consistent_hashing_1.pdf img/consistent_hashing_2.pdf img/consistent_hashing_3.pdf img/consistent_hashing_4.pdf img/garage_tables.pdf
pdflatex talk.tex
img/%.pdf: img/%.svg
inkscape -D -z --file=$^ --export-pdf=$@

Binary file not shown.

After

Width:  |  Height:  |  Size: 44 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 19 KiB

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 53 KiB

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 54 KiB

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 56 KiB

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 57 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 360 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 39 KiB

View file

@ -0,0 +1,404 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
width="480"
height="480"
viewBox="0 0 127 127"
version="1.1"
id="svg8"
inkscape:version="1.0.1 (3bc2e813f5, 2020-09-07)"
sodipodi:docname="garage_distributed.svg">
<defs
id="defs2" />
<sodipodi:namedview
id="base"
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1.0"
inkscape:pageopacity="0.0"
inkscape:pageshadow="2"
inkscape:zoom="1.979899"
inkscape:cx="171.34852"
inkscape:cy="170.69443"
inkscape:document-units="mm"
inkscape:current-layer="layer1-3"
inkscape:document-rotation="0"
showgrid="false"
units="px"
inkscape:window-width="1404"
inkscape:window-height="1016"
inkscape:window-x="103"
inkscape:window-y="27"
inkscape:window-maximized="0" />
<metadata
id="metadata5">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
<dc:title />
</cc:Work>
</rdf:RDF>
</metadata>
<g
inkscape:label="Layer 1"
inkscape:groupmode="layer"
id="layer1">
<g
inkscape:label="Layer 1"
id="layer1-3"
transform="matrix(0.42851498,0,0,0.42851498,24.079728,-24.925134)">
<path
style="fill:none;stroke:#000000;stroke-width:2.065;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="M 66.78016,80.71889 99.921832,61.598165 132.84481,80.509232 V 127.38418 H 66.701651 Z"
id="path124"
sodipodi:nodetypes="cccccc" />
<g
id="g1106-5"
transform="matrix(0,0.95201267,-0.95201267,0,194.01664,-57.627274)"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<g
id="g1061-3"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<circle
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="path956-5"
cx="168.8569"
cy="92.889587"
r="13.125794" />
<circle
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="path958-6"
cx="168.77444"
cy="92.702293"
r="3.0778286" />
<path
id="path960-2"
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 169.46072,82.84435 c 4.95795,0.336608 8.87296,4.341959 9.09638,9.306301"
sodipodi:nodetypes="cc" />
</g>
<path
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 154.67824,112.84018 11.89881,-13.038071 c 1.46407,-1.552664 3.79541,0.878511 2.81832,2.089181 l -10.57965,14.481 c -1.8851,2.02632 -6.10786,-1.06119 -4.13748,-3.53211 z"
id="path964-9"
sodipodi:nodetypes="ccccc" />
<g
id="g1071-1"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none" />
<g
id="g1065-3"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<rect
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="rect949-6"
width="35.576611"
height="48.507355"
x="150.9623"
y="74.698929"
ry="2.7302756" />
<path
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 150.76919,106.16944 6.36181,-0.0223 c 2.53845,3.46232 6.29787,4.20243 10.1055,4.40362 l 0.0176,13.09251"
id="path1033-0"
sodipodi:nodetypes="cccc" />
</g>
</g>
</g>
<g
inkscape:label="Layer 1"
id="layer1-3-5"
transform="matrix(0.42851499,0,0,0.42851499,68.181495,12.180995)">
<path
style="fill:none;stroke:#000000;stroke-width:2.065;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="M 66.78016,73.340623 99.921832,54.219898 132.84481,73.130965 V 120.00591 H 66.701651 Z"
id="path124-6"
sodipodi:nodetypes="cccccc" />
<g
id="g1106-5-2"
transform="matrix(0,0.95201267,-0.95201267,0,194.01664,-65.058377)"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<g
id="g1061-3-9"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<circle
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="path956-5-1"
cx="168.8569"
cy="92.889587"
r="13.125794" />
<circle
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="path958-6-2"
cx="168.77444"
cy="92.702293"
r="3.0778286" />
<path
id="path960-2-7"
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 169.46072,82.84435 c 4.95795,0.336608 8.87296,4.341959 9.09638,9.306301"
sodipodi:nodetypes="cc" />
</g>
<path
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 154.67824,112.84018 11.89881,-13.038071 c 1.46407,-1.552664 3.79541,0.878511 2.81832,2.089181 l -10.57965,14.481 c -1.8851,2.02632 -6.10786,-1.06119 -4.13748,-3.53211 z"
id="path964-9-0"
sodipodi:nodetypes="ccccc" />
<g
id="g1071-1-9"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none" />
<g
id="g1065-3-3"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<rect
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="rect949-6-6"
width="35.576611"
height="48.507355"
x="150.9623"
y="74.698929"
ry="2.7302756" />
<path
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 150.76919,106.16944 6.36181,-0.0223 c 2.53845,3.46232 6.29787,4.20243 10.1055,4.40362 l 0.0176,13.09251"
id="path1033-0-0"
sodipodi:nodetypes="cccc" />
</g>
</g>
</g>
<g
inkscape:label="Layer 1"
id="layer1-3-6"
transform="matrix(0.42851499,0,0,0.42851499,-20.953301,19.351613)">
<path
style="fill:none;stroke:#000000;stroke-width:2.065;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="M 66.78016,73.340623 99.921832,54.219898 132.84481,73.130965 V 120.00591 H 66.701651 Z"
id="path124-2"
sodipodi:nodetypes="cccccc" />
<g
id="g1106-5-6"
transform="matrix(0,0.95201267,-0.95201267,0,194.01664,-65.058377)"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<g
id="g1061-3-1"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<circle
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="path956-5-8"
cx="168.8569"
cy="92.889587"
r="13.125794" />
<circle
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="path958-6-7"
cx="168.77444"
cy="92.702293"
r="3.0778286" />
<path
id="path960-2-9"
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 169.46072,82.84435 c 4.95795,0.336608 8.87296,4.341959 9.09638,9.306301"
sodipodi:nodetypes="cc" />
</g>
<path
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 154.67824,112.84018 11.89881,-13.038071 c 1.46407,-1.552664 3.79541,0.878511 2.81832,2.089181 l -10.57965,14.481 c -1.8851,2.02632 -6.10786,-1.06119 -4.13748,-3.53211 z"
id="path964-9-2"
sodipodi:nodetypes="ccccc" />
<g
id="g1071-1-0"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none" />
<g
id="g1065-3-2"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<rect
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="rect949-6-3"
width="35.576611"
height="48.507355"
x="150.9623"
y="74.698929"
ry="2.7302756" />
<path
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 150.76919,106.16944 6.36181,-0.0223 c 2.53845,3.46232 6.29787,4.20243 10.1055,4.40362 l 0.0176,13.09251"
id="path1033-0-7"
sodipodi:nodetypes="cccc" />
</g>
</g>
</g>
<g
inkscape:label="Layer 1"
id="layer1-3-59"
transform="matrix(0.42851499,0,0,0.42851499,51.949789,75.218277)">
<path
style="fill:none;stroke:#000000;stroke-width:2.065;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="M 66.78016,73.340623 99.921832,54.219898 132.84481,73.130965 V 120.00591 H 66.701651 Z"
id="path124-22"
sodipodi:nodetypes="cccccc" />
<g
id="g1106-5-8"
transform="matrix(0,0.95201267,-0.95201267,0,194.01664,-65.058377)"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<g
id="g1061-3-97"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<circle
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="path956-5-3"
cx="168.8569"
cy="92.889587"
r="13.125794" />
<circle
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="path958-6-6"
cx="168.77444"
cy="92.702293"
r="3.0778286" />
<path
id="path960-2-1"
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 169.46072,82.84435 c 4.95795,0.336608 8.87296,4.341959 9.09638,9.306301"
sodipodi:nodetypes="cc" />
</g>
<path
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 154.67824,112.84018 11.89881,-13.038071 c 1.46407,-1.552664 3.79541,0.878511 2.81832,2.089181 l -10.57965,14.481 c -1.8851,2.02632 -6.10786,-1.06119 -4.13748,-3.53211 z"
id="path964-9-29"
sodipodi:nodetypes="ccccc" />
<g
id="g1071-1-3"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none" />
<g
id="g1065-3-1"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<rect
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="rect949-6-9"
width="35.576611"
height="48.507355"
x="150.9623"
y="74.698929"
ry="2.7302756" />
<path
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 150.76919,106.16944 6.36181,-0.0223 c 2.53845,3.46232 6.29787,4.20243 10.1055,4.40362 l 0.0176,13.09251"
id="path1033-0-4"
sodipodi:nodetypes="cccc" />
</g>
</g>
</g>
<g
inkscape:label="Layer 1"
id="layer1-3-7"
transform="matrix(0.42851499,0,0,0.42851499,-1.173447,75.150288)">
<path
style="fill:none;stroke:#000000;stroke-width:2.065;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="M 66.78016,73.340623 99.921832,54.219898 132.84481,73.130965 V 120.00591 H 66.701651 Z"
id="path124-8"
sodipodi:nodetypes="cccccc" />
<g
id="g1106-5-4"
transform="matrix(0,0.95201267,-0.95201267,0,194.01664,-65.058377)"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<g
id="g1061-3-5"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<circle
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="path956-5-0"
cx="168.8569"
cy="92.889587"
r="13.125794" />
<circle
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="path958-6-3"
cx="168.77444"
cy="92.702293"
r="3.0778286" />
<path
id="path960-2-6"
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 169.46072,82.84435 c 4.95795,0.336608 8.87296,4.341959 9.09638,9.306301"
sodipodi:nodetypes="cc" />
</g>
<path
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 154.67824,112.84018 11.89881,-13.038071 c 1.46407,-1.552664 3.79541,0.878511 2.81832,2.089181 l -10.57965,14.481 c -1.8851,2.02632 -6.10786,-1.06119 -4.13748,-3.53211 z"
id="path964-9-1"
sodipodi:nodetypes="ccccc" />
<g
id="g1071-1-06"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none" />
<g
id="g1065-3-32"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<rect
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="rect949-6-0"
width="35.576611"
height="48.507355"
x="150.9623"
y="74.698929"
ry="2.7302756" />
<path
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 150.76919,106.16944 6.36181,-0.0223 c 2.53845,3.46232 6.29787,4.20243 10.1055,4.40362 l 0.0176,13.09251"
id="path1033-0-6"
sodipodi:nodetypes="cccc" />
</g>
</g>
</g>
<path
style="fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="M 35.21897,43.254452 46.803736,32.872178"
id="path1045" />
<path
style="fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 85.798392,29.613721 10.944185,7.688225"
id="path1047" />
<path
style="fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 107.59813,71.879386 -6.2564,22.552649"
id="path1049" />
<path
style="fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="M 75.866769,119.14997 61.529058,118.74136"
id="path1051"
sodipodi:nodetypes="cc" />
<path
style="fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="M 29.414211,98.256475 C 29.681482,96.462435 21.07721,77.446418 21.07721,77.446418"
id="path1053" />
<path
style="fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="M 39.447822,61.341585 90.641428,57.562618"
id="path1055" />
<path
style="fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="M 90.440176,64.423751 54.180736,100.02908"
id="path1057"
sodipodi:nodetypes="cc" />
<path
style="fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="M 47.163557,96.532205 61.535331,33.078667"
id="path1059"
sodipodi:nodetypes="cc" />
<path
style="fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 71.396211,33.058731 15.77285,60.595014"
id="path1061" />
<path
style="fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="M 79.384641,100.96895 41.150775,67.902625"
id="path1063" />
</g>
</svg>

After

Width:  |  Height:  |  Size: 19 KiB

View file

@ -0,0 +1,502 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
width="850"
height="480"
viewBox="0 0 224.89584 127"
version="1.1"
id="svg8"
inkscape:version="1.0.1 (3bc2e813f5, 2020-09-07)"
sodipodi:docname="garage_tables.svg">
<defs
id="defs2">
<marker
style="overflow:visible"
id="marker1262"
refX="0"
refY="0"
orient="auto"
inkscape:stockid="Arrow1Mend"
inkscape:isstock="true">
<path
transform="matrix(-0.4,0,0,-0.4,-4,0)"
style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1"
d="M 0,0 5,-5 -12.5,0 5,5 Z"
id="path1260" />
</marker>
<marker
style="overflow:visible"
id="Arrow1Mend"
refX="0"
refY="0"
orient="auto"
inkscape:stockid="Arrow1Mend"
inkscape:isstock="true"
inkscape:collect="always">
<path
transform="matrix(-0.4,0,0,-0.4,-4,0)"
style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1"
d="M 0,0 5,-5 -12.5,0 5,5 Z"
id="path965" />
</marker>
<marker
style="overflow:visible"
id="Arrow1Lend"
refX="0"
refY="0"
orient="auto"
inkscape:stockid="Arrow1Lend"
inkscape:isstock="true">
<path
transform="matrix(-0.8,0,0,-0.8,-10,0)"
style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1"
d="M 0,0 5,-5 -12.5,0 5,5 Z"
id="path959" />
</marker>
</defs>
<sodipodi:namedview
id="base"
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1.0"
inkscape:pageopacity="0.0"
inkscape:pageshadow="2"
inkscape:zoom="0.98994949"
inkscape:cx="381.09221"
inkscape:cy="219.5592"
inkscape:document-units="mm"
inkscape:current-layer="layer1"
inkscape:document-rotation="0"
showgrid="false"
units="px"
inkscape:window-width="1867"
inkscape:window-height="1016"
inkscape:window-x="53"
inkscape:window-y="27"
inkscape:window-maximized="1" />
<metadata
id="metadata5">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
<dc:title></dc:title>
</cc:Work>
</rdf:RDF>
</metadata>
<g
inkscape:label="Layer 1"
inkscape:groupmode="layer"
id="layer1">
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="39.570904"
y="38.452755"
id="text2025"><tspan
sodipodi:role="line"
id="tspan2023"
x="39.570904"
y="38.452755"
style="font-size:5.64444px;stroke-width:0.264583" /></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:10.5833px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="101.95796"
y="92.835831"
id="text2139"><tspan
sodipodi:role="line"
id="tspan2137"
x="101.95796"
y="92.835831"
style="stroke-width:0.264583"> </tspan></text>
<g
id="g2316"
transform="translate(-11.455511,1.5722486)">
<g
id="g2277">
<rect
style="fill:none;stroke:#000000;stroke-width:0.8;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833"
width="47.419891"
height="95.353409"
x="18.534418"
y="24.42766" />
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-3"
width="47.419891"
height="86.973076"
x="18.534418"
y="32.807987" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="32.250839"
y="29.894743"
id="text852"><tspan
sodipodi:role="line"
id="tspan850"
x="32.250839"
y="29.894743"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">Object</tspan></text>
</g>
<g
id="g2066"
transform="translate(-2.1807817,-3.0621439)">
<g
id="g1969"
transform="matrix(0.12763631,0,0,0.12763631,0.7215051,24.717273)"
style="fill:#ff6600;fill-opacity:1;stroke:none;stroke-opacity:1">
<path
style="fill:#ff6600;fill-opacity:1;stroke:none;stroke-width:0.264583;stroke-opacity:1"
d="m 203.71837,154.80038 c -1.11451,3.75057 -2.45288,5.84095 -5.11132,7.98327 -2.2735,1.83211 -4.66721,2.65982 -8.09339,2.79857 -2.59227,0.10498 -2.92868,0.0577 -5.02863,-0.70611 -3.99215,-1.45212 -7.1627,-4.65496 -8.48408,-8.57046 -1.28374,-3.80398 -0.61478,-8.68216 1.64793,-12.01698 0.87317,-1.28689 3.15089,-3.48326 4.18771,-4.03815 l 0.53332,-28.51234 5.78454,-5.09197 6.95158,6.16704 -3.21112,3.49026 3.17616,3.45499 -3.17616,3.40822 2.98973,3.28645 -3.24843,3.3829 4.49203,4.58395 0.0516,5.69106 c 1.06874,0.64848 3.81974,3.24046 4.69548,4.56257 0.452,0.68241 1.06834,2.0197 1.36962,2.97176 0.62932,1.98864 0.88051,5.785 0.47342,7.15497 z m -10.0406,2.32604 c -0.88184,-3.17515 -4.92402,-3.78864 -6.75297,-1.02492 -0.58328,0.8814 -0.6898,1.28852 -0.58362,2.23056 0.26492,2.35041 2.45434,3.95262 4.60856,3.37255 1.19644,-0.32217 2.39435,-1.44872 2.72875,-2.56621 0.30682,-1.02529 0.30686,-0.9045 -7.9e-4,-2.01198 z"
id="path1971"
sodipodi:nodetypes="ssscsscccccccccccssscsssscc" />
</g>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="28.809687"
y="44.070885"
id="text852-9"><tspan
sodipodi:role="line"
id="tspan850-4"
x="28.809687"
y="44.070885"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">bucket </tspan></text>
</g>
<g
id="g2066-7"
transform="translate(-2.1807817,6.2627616)">
<g
id="g1969-8"
transform="matrix(0.12763631,0,0,0.12763631,0.7215051,24.717273)"
style="fill:#ff6600;fill-opacity:1;stroke:none;stroke-opacity:1">
<path
style="fill:#ff6600;fill-opacity:1;stroke:none;stroke-width:0.264583;stroke-opacity:1"
d="m 203.71837,154.80038 c -1.11451,3.75057 -2.45288,5.84095 -5.11132,7.98327 -2.2735,1.83211 -4.66721,2.65982 -8.09339,2.79857 -2.59227,0.10498 -2.92868,0.0577 -5.02863,-0.70611 -3.99215,-1.45212 -7.1627,-4.65496 -8.48408,-8.57046 -1.28374,-3.80398 -0.61478,-8.68216 1.64793,-12.01698 0.87317,-1.28689 3.15089,-3.48326 4.18771,-4.03815 l 0.53332,-28.51234 5.78454,-5.09197 6.95158,6.16704 -3.21112,3.49026 3.17616,3.45499 -3.17616,3.40822 2.98973,3.28645 -3.24843,3.3829 4.49203,4.58395 0.0516,5.69106 c 1.06874,0.64848 3.81974,3.24046 4.69548,4.56257 0.452,0.68241 1.06834,2.0197 1.36962,2.97176 0.62932,1.98864 0.88051,5.785 0.47342,7.15497 z m -10.0406,2.32604 c -0.88184,-3.17515 -4.92402,-3.78864 -6.75297,-1.02492 -0.58328,0.8814 -0.6898,1.28852 -0.58362,2.23056 0.26492,2.35041 2.45434,3.95262 4.60856,3.37255 1.19644,-0.32217 2.39435,-1.44872 2.72875,-2.56621 0.30682,-1.02529 0.30686,-0.9045 -7.9e-4,-2.01198 z"
id="path1971-4"
sodipodi:nodetypes="ssscsscccccccccccssscsssscc" />
</g>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="28.809687"
y="44.070885"
id="text852-9-5"><tspan
sodipodi:role="line"
id="tspan850-4-0"
x="28.809687"
y="44.070885"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">file path </tspan></text>
</g>
<g
id="g2161"
transform="translate(-62.264403,-59.333115)">
<g
id="g2271"
transform="translate(0,67.042823)">
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-6"
width="39.008453"
height="16.775949"
x="84.896881"
y="90.266838" />
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-3-1"
width="39.008453"
height="8.673645"
x="84.896881"
y="98.369141" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="89.826942"
y="96.212921"
id="text852-0"><tspan
sodipodi:role="line"
id="tspan850-6"
x="89.826942"
y="96.212921"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">Version 1</tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="89.826942"
y="104.71013"
id="text852-0-3"><tspan
sodipodi:role="line"
id="tspan850-6-2"
x="89.826942"
y="104.71013"
style="font-style:italic;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';fill:#4d4d4d;stroke-width:0.264583">deleted</tspan></text>
</g>
</g>
<g
id="g2263"
transform="translate(0,-22.791204)">
<g
id="g2161-1"
transform="translate(-62.264403,-10.910843)">
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-6-5"
width="39.008453"
height="36.749603"
x="84.896881"
y="90.266838" />
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-3-1-5"
width="39.008453"
height="28.647301"
x="84.896881"
y="98.369141" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="89.826942"
y="96.212921"
id="text852-0-4"><tspan
sodipodi:role="line"
id="tspan850-6-7"
x="89.826942"
y="96.212921"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">Version 2</tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="89.826942"
y="104.71013"
id="text852-0-3-6"><tspan
sodipodi:role="line"
id="tspan850-6-2-5"
x="89.826942"
y="104.71013"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';fill:#000000;stroke-width:0.264583">id</tspan></text>
</g>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="27.56254"
y="100.34132"
id="text852-0-3-6-6"><tspan
sodipodi:role="line"
id="tspan850-6-2-5-9"
x="27.56254"
y="100.34132"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';fill:#000000;stroke-width:0.264583">size</tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="27.56254"
y="106.90263"
id="text852-0-3-6-6-3"><tspan
sodipodi:role="line"
id="tspan850-6-2-5-9-7"
x="27.56254"
y="106.90263"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';fill:#000000;stroke-width:0.264583">MIME type</tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="27.56254"
y="111.92816"
id="text852-0-3-6-6-3-4"><tspan
sodipodi:role="line"
id="tspan850-6-2-5-9-7-5"
x="27.56254"
y="111.92816"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';fill:#000000;stroke-width:0.264583">...</tspan></text>
</g>
</g>
<g
id="g898"
transform="translate(-6.2484318,29.95006)">
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-7"
width="47.419891"
height="44.007515"
x="95.443573"
y="24.42766" />
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-3-4"
width="47.419891"
height="35.627186"
x="95.443573"
y="32.807987" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="107.46638"
y="29.894743"
id="text852-4"><tspan
sodipodi:role="line"
id="tspan850-3"
x="107.46638"
y="29.894743"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">Version</tspan></text>
<path
style="fill:#ff6600;fill-opacity:1;stroke:none;stroke-width:0.0337704;stroke-opacity:1"
d="m 102.90563,41.413279 c -0.14226,0.478709 -0.31308,0.745518 -0.65239,1.018956 -0.29019,0.233843 -0.59571,0.339489 -1.03301,0.357199 -0.33087,0.0134 -0.37381,0.0074 -0.64184,-0.09013 -0.50954,-0.185343 -0.914221,-0.594142 -1.082877,-1.093901 -0.163852,-0.485526 -0.07847,-1.108159 0.210335,-1.533803 0.111448,-0.164254 0.402172,-0.444591 0.534502,-0.515415 l 0.0681,-3.63921 0.73832,-0.64992 0.88727,0.787138 -0.40985,0.445484 0.40539,0.440982 -0.40539,0.435013 0.3816,0.41947 -0.41462,0.431781 0.57335,0.585078 0.007,0.726386 c 0.13641,0.08277 0.48753,0.413601 0.59931,0.58235 0.0577,0.0871 0.13636,0.257787 0.17481,0.379304 0.0803,0.253823 0.11239,0.738377 0.0604,0.913234 z m -1.28155,0.296888 c -0.11255,-0.405265 -0.62848,-0.483569 -0.86192,-0.130817 -0.0744,0.112498 -0.088,0.164461 -0.0745,0.2847 0.0338,0.299998 0.31326,0.504498 0.58822,0.43046 0.15271,-0.04112 0.3056,-0.184909 0.34828,-0.327542 0.0392,-0.130864 0.0392,-0.115447 -1e-4,-0.256801 z"
id="path1971-0"
sodipodi:nodetypes="ssscsscccccccccccssscsssscc" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="104.99195"
y="41.008743"
id="text852-9-7"><tspan
sodipodi:role="line"
id="tspan850-4-8"
x="104.99195"
y="41.008743"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">id </tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="104.99195"
y="49.168018"
id="text852-9-7-6"><tspan
sodipodi:role="line"
id="tspan850-4-8-8"
x="104.99195"
y="49.168018"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">h(block 1)</tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="104.99195"
y="56.583336"
id="text852-9-7-6-8"><tspan
sodipodi:role="line"
id="tspan850-4-8-8-4"
x="104.99195"
y="56.583336"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">h(block 2)</tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="104.99195"
y="64.265732"
id="text852-9-7-6-3"><tspan
sodipodi:role="line"
id="tspan850-4-8-8-1"
x="104.99195"
y="64.265732"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">...</tspan></text>
</g>
<g
id="g898-3"
transform="translate(75.777779,38.888663)">
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-7-6"
width="47.419891"
height="29.989157"
x="95.443573"
y="24.42766" />
<rect
style="fill:none;stroke:#000000;stroke-width:0.799999;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1"
id="rect833-3-4-7"
width="47.419891"
height="21.608831"
x="95.443573"
y="32.807987" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="102.11134"
y="29.894743"
id="text852-4-5"><tspan
sodipodi:role="line"
id="tspan850-3-3"
x="102.11134"
y="29.894743"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">Data block</tspan></text>
<path
style="fill:#ff6600;fill-opacity:1;stroke:none;stroke-width:0.0337704;stroke-opacity:1"
d="m 102.90563,41.413279 c -0.14226,0.478709 -0.31308,0.745518 -0.65239,1.018956 -0.29019,0.233843 -0.59571,0.339489 -1.03301,0.357199 -0.33087,0.0134 -0.37381,0.0074 -0.64184,-0.09013 -0.50954,-0.185343 -0.914221,-0.594142 -1.082877,-1.093901 -0.163852,-0.485526 -0.07847,-1.108159 0.210335,-1.533803 0.111448,-0.164254 0.402172,-0.444591 0.534502,-0.515415 l 0.0681,-3.63921 0.73832,-0.64992 0.88727,0.787138 -0.40985,0.445484 0.40539,0.440982 -0.40539,0.435013 0.3816,0.41947 -0.41462,0.431781 0.57335,0.585078 0.007,0.726386 c 0.13641,0.08277 0.48753,0.413601 0.59931,0.58235 0.0577,0.0871 0.13636,0.257787 0.17481,0.379304 0.0803,0.253823 0.11239,0.738377 0.0604,0.913234 z m -1.28155,0.296888 c -0.11255,-0.405265 -0.62848,-0.483569 -0.86192,-0.130817 -0.0744,0.112498 -0.088,0.164461 -0.0745,0.2847 0.0338,0.299998 0.31326,0.504498 0.58822,0.43046 0.15271,-0.04112 0.3056,-0.184909 0.34828,-0.327542 0.0392,-0.130864 0.0392,-0.115447 -1e-4,-0.256801 z"
id="path1971-0-5"
sodipodi:nodetypes="ssscsscccccccccccssscsssscc" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="104.99195"
y="41.008743"
id="text852-9-7-62"><tspan
sodipodi:role="line"
id="tspan850-4-8-9"
x="104.99195"
y="41.008743"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">hash </tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="104.99195"
y="49.168018"
id="text852-9-7-6-1"><tspan
sodipodi:role="line"
id="tspan850-4-8-8-2"
x="104.99195"
y="49.168018"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">data</tspan></text>
</g>
<path
style="fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#Arrow1Mend)"
d="M 42.105292,69.455903 89.563703,69.317144"
id="path954"
sodipodi:nodetypes="cc" />
<path
style="fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#marker1262)"
d="m 134.32612,77.363197 38.12618,0.260865"
id="path1258"
sodipodi:nodetypes="cc" />
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="8.6727352"
y="16.687063"
id="text852-3"><tspan
sodipodi:role="line"
id="tspan850-67"
x="8.6727352"
y="16.687063"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">Objects table </tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="89.190445"
y="16.687063"
id="text852-3-5"><tspan
sodipodi:role="line"
id="tspan850-67-3"
x="89.190445"
y="16.687063"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">Versions table </tspan></text>
<text
xml:space="preserve"
style="font-style:normal;font-weight:normal;font-size:5.64444px;line-height:1.25;font-family:sans-serif;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.264583"
x="174.55702"
y="16.687063"
id="text852-3-56"><tspan
sodipodi:role="line"
id="tspan850-67-2"
x="174.55702"
y="16.687063"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:5.64444px;font-family:'Liberation Mono';-inkscape-font-specification:'Liberation Mono Bold';stroke-width:0.264583">Blocks table</tspan></text>
</g>
</svg>

After

Width:  |  Height:  |  Size: 27 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 32 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 86 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 6.1 KiB

BIN
doc/20201202_talk/talk.pdf Normal file

Binary file not shown.

247
doc/20201202_talk/talk.tex Normal file
View file

@ -0,0 +1,247 @@
%\nonstopmode
\documentclass[aspectratio=169]{beamer}
\usepackage[utf8]{inputenc}
% \usepackage[frenchb]{babel}
\usepackage{amsmath}
\usepackage{mathtools}
\usepackage{breqn}
\usepackage{multirow}
\usetheme{Luebeck}
\usepackage{graphicx}
%\useoutertheme[footline=authortitle,subsection=false]{miniframes}
\beamertemplatenavigationsymbolsempty
\setbeamertemplate{footline}
{%
\leavevmode%
\hbox{\begin{beamercolorbox}[wd=.15\paperwidth,ht=2.5ex,dp=1.125ex,leftskip=.3cm,rightskip=.3cm plus1fill]{author in head/foot}%
\usebeamerfont{author in head/foot} \insertframenumber{} / \inserttotalframenumber
\end{beamercolorbox}%
\begin{beamercolorbox}[wd=.2\paperwidth,ht=2.5ex,dp=1.125ex,leftskip=.3cm plus1fill,rightskip=.3cm]{author in head/foot}%
\usebeamerfont{author in head/foot}\insertshortauthor
\end{beamercolorbox}%
\begin{beamercolorbox}[wd=.65\paperwidth,ht=2.5ex,dp=1.125ex,leftskip=.3cm,rightskip=.3cm plus1fil]{title in head/foot}%
\usebeamerfont{title in head/foot}\insertshorttitle~--~\insertshortdate
\end{beamercolorbox}}%
\vskip0pt%
}
\usepackage{tabu}
\usepackage{multicol}
\usepackage{vwcol}
\usepackage{stmaryrd}
\usepackage{graphicx}
\usepackage[normalem]{ulem}
\title[Garage : jouer dans la cour des grands quand on est un hébergeur associatif]{Garage : jouer dans la cour des grands \\quand on est un hébergeur associatif}
\subtitle{(ou pourquoi on a décidé de réinventer la roue)}
\author[Q. Dufour \& A. Auvolat]{Quentin Dufour \& Alex Auvolat}
\date[02/12/2020]{Mercredi 2 décembre 2020}
\begin{document}
\begin{frame}
\titlepage
\end{frame}
\begin{frame}
\frametitle{La question qui tue}
\begin{center}
\includegraphics[scale=3]{img/sync.png} \\
\Huge Pourquoi vous n'hébergez pas vos fichiers chez vous ? \\
\end{center}
\end{frame}
\begin{frame}[t]
\frametitle{La cour des grands}
\begin{columns}[t]
\begin{column}{0.5\textwidth}
{\huge Le modèle du cloud...}
\begin{center}
\includegraphics[scale=0.08]{img/cloud.png}
\end{center}
+ \underline{intégrité} : plus de perte de données
+ \underline{disponibilité} : tout le temps accessible
+ \underline{service} : rien à gérer
\vspace{0.15cm}
\textbf{changement des comportements}
\end{column}
\pause
\begin{column}{0.5\textwidth}
{\huge ...et son prix}
\begin{center}
\includegraphics[scale=0.07]{img/dc.jpg}
\end{center}
- matériel couteux et polluant
- logiciels secrets
- gestion opaque
\vspace{0.2cm}
\textbf{prisonnier de l'écosystème}
\end{column}
\end{columns}
\end{frame}
\begin{frame}[t]
\frametitle{Garage l'imposteur}
\begin{columns}[t]
\begin{column}{0.5\textwidth}
{\huge Ressemble à du cloud...}
\begin{center}
\includegraphics[scale=0.5]{img/shh.jpg}
\end{center}
+ \underline{compatible} avec les apps existantes
+ \underline{fonctionne} avec le mobile
+ \underline{s'adapte} aux habitudes prises
\end{column}
\pause
\begin{column}{0.5\textwidth}
{\huge ...fait du P2P}
\begin{center}
\includegraphics[scale=1]{img/death.jpg}
\end{center}
\vspace{0.4cm}
+ \underline{contrôle} de l'infrastructure
+ \underline{transparent} code libre
+ \underline{sobre} fonctionne avec de vieilles machines à la maison
\end{column}
\end{columns}
\end{frame}
\graphicspath{{img/}}
\begin{frame}
\frametitle{Mais donc, c'est quoi Garage ?}
\begin{columns}[t]
\begin{column}{0.5\textwidth}
\centering
\textbf{Un système de stockage distribué}
\vspace{1em}
\includegraphics[width=.7\columnwidth]{img/garage_distributed.pdf}
\end{column}
\pause
\begin{column}{0.5\textwidth}
\centering
\textbf{qui implémente l'API S3}
\vspace{2em}
\includegraphics[width=.7\columnwidth]{img/Amazon-S3.jpg}
\end{column}
\end{columns}
\end{frame}
\begin{frame}
\frametitle{Consistent Hashing (DynamoDB)}
\textbf{Comment répartir les fichiers sur les différentes machines ?}
\vspace{1em}
\centering
\only<1>{\includegraphics[width=.55\columnwidth]{img/consistent_hashing_1.pdf}}%
\only<2>{\includegraphics[width=.55\columnwidth]{img/consistent_hashing_2.pdf}}%
\only<3>{\includegraphics[width=.55\columnwidth]{img/consistent_hashing_3.pdf}}%
\only<4>{\includegraphics[width=.55\columnwidth]{img/consistent_hashing_4.pdf}}%
\end{frame}
\begin{frame}
\frametitle{Garage Internals : 3 niveaux de consistent hashing}
\centering
\includegraphics[width=.85\columnwidth]{img/garage_tables.pdf}
\end{frame}
\begin{frame}
\frametitle{Modèles de cohérence}
Garage utilise un modèle de cohérence relativement faible :
\vspace{1em}
\begin{itemize}
\item Objets répliqués 3 fois, quorum de 2 pour les lectures et les écritures\\
$\to$ cohérence \textbf{``read your writes''}
\vspace{1em}
\item<2-> Types de donnée CRDT + mécanisme d'anti-entropie\\
$\to$ cohérence \textbf{à terme} (eventual consistency)
\vspace{1em}
\item<3-> Cela s'applique pour chaque fichier individuellement :\\
pas de linéarisabilté ou de cohérence causale entre les opérations\\
sur des fichiers différents
\vspace{1em}
\item<4-> \textbf{Avantage :} convient bien à un déploiement géodistribué (multi-datacenter)
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Rust : retour d'expérience}
\begin{columns}
\begin{column}{0.55\textwidth}
Garage est entièrement écrit en Rust !
\vspace{2em}
\textbf{Points forts :}
\vspace{.5em}
\begin{itemize}
\item Langage compilé, très rapide
\vspace{.5em}
\item Typage fort, beaucoup de sécurités
\vspace{.5em}
\item Le meilleur de plusieurs paradigmes:
fonctionnel, orienté objet, impératif
\vspace{.5em}
\item Un écosytème de librairies très complet:
serialisation, async/await, http, ...
\end{itemize}
\end{column}
\begin{column}{0.45\textwidth}
\begin{centering}
\hspace{2em}\includegraphics[width=0.55\columnwidth]{img/rustacean-flat-happy.png}
\end{centering}
\vspace{2em}
\textbf{Points faibles :}
\vspace{.5em}
\begin{itemize}
\item Les temps de compilation...
\vspace{.5em}
\item Compliqué à apprendre
\end{itemize}
\vspace{2em}
\end{column}
\end{columns}
\end{frame}
\end{document}
%% vim: set ts=4 sw=4 tw=0 noet spelllang=fr :

1
doc/book/.gitignore vendored Normal file
View file

@ -0,0 +1 @@
book

6
doc/book/book.toml Normal file
View file

@ -0,0 +1,6 @@
[book]
authors = ["Quentin Dufour"]
language = "en"
multilingual = false
src = "src"
title = "Garage Documentation"

32
doc/book/src/SUMMARY.md Normal file
View file

@ -0,0 +1,32 @@
# Summary
[The Garage Data Store](./intro.md)
- [Getting Started](./getting_started/index.md)
- [Get a binary](./getting_started/binary.md)
- [Configure the daemon](./getting_started/daemon.md)
- [Control the daemon](./getting_started/control.md)
- [Configure a cluster](./getting_started/cluster.md)
- [Create buckets and keys](./getting_started/bucket.md)
- [Handle files](./getting_started/files.md)
- [Cookbook](./cookbook/index.md)
- [Host a website](./cookbook/website.md)
- [Integrate as a media backend]()
- [Operate a cluster]()
- [Recovering from failures](./cookbook/recovering.md)
- [Reference Manual](./reference_manual/index.md)
- [Garage CLI]()
- [S3 API](./reference_manual/s3_compatibility.md)
- [Design](./design/index.md)
- [Related Work](./design/related_work.md)
- [Internals](./design/internals.md)
- [Development](./development/index.md)
- [Setup your environment](./development/devenv.md)
- [Your first contribution]()
- [Working Documents](./working_documents/index.md)
- [Load Balancing Data](./working_documents/load_balancing.md)

View file

@ -0,0 +1,5 @@
# Cookbook
A cookbook, when you cook, is a collection of recipes.
Similarly, Garage's cookbook contains a collection of recipes that are known to works well!
This chapter could also be referred as "Tutorials" or "Best practices".

View file

@ -0,0 +1,99 @@
# Recovering from failures
Garage is meant to work on old, second-hand hardware.
In particular, this makes it likely that some of your drives will fail, and some manual intervention will be needed.
Fear not! For Garage is fully equipped to handle drive failures, in most common cases.
## A note on availability of Garage
With nodes dispersed in 3 datacenters or more, here are the guarantees Garage provides with the default replication strategy (3 copies of all data, which is the recommended value):
- The cluster remains fully functional as long as the machines that fail are in only one datacenter. This includes a whole datacenter going down due to power/Internet outage.
- No data is lost as long as the machines that fail are in at most two datacenters.
Of course this only works if your Garage nodes are correctly configured to be aware of the datacenter in which they are located.
Make sure this is the case using `garage status` to check on the state of your cluster's configuration.
## First option: removing a node
If you don't have spare parts (HDD, SDD) to replace the failed component, and if there are enough remaining nodes in your cluster
(at least 3), you can simply remove the failed node from Garage's configuration.
Note that if you **do** intend to replace the failed parts by new ones, using this method followed by adding back the node is **not recommended** (although it should work),
and you should instead use one of the methods detailed in the next sections.
Removing a node is done with the following command:
```
garage node remove --yes <node_id>
```
(you can get the `node_id` of the failed node by running `garage status`)
This will repartition the data and ensure that 3 copies of everything are present on the nodes that remain available.
## Replacement scenario 1: only data is lost, metadata is fine
The recommended deployment for Garage uses an SSD to store metadata, and an HDD to store blocks of data.
In the case where only a single HDD crashes, the blocks of data are lost but the metadata is still fine.
This is very easy to recover by setting up a new HDD to replace the failed one.
The node does not need to be fully replaced and the configuration doesn't need to change.
We just need to tell Garage to get back all the data blocks and store them on the new HDD.
First, set up a new HDD to store Garage's data directory on the failed node, and restart Garage using
the existing configuration. Then, run:
```
garage repair -a --yes blocks
```
This will re-synchronize blocks of data that are missing to the new HDD, reading them from copies located on other nodes.
You can check on the advancement of this process by doing the following command:
```
garage stats -a
```
Look out for the following output:
```
Block manager stats:
resync queue length: 26541
```
This indicates that one of the Garage node is in the process of retrieving missing data from other nodes.
This number decreases to zero when the node is fully synchronized.
## Replacement scenario 2: metadata (and possibly data) is lost
This scenario covers the case where a full node fails, i.e. both the metadata directory and
the data directory are lost, as well as the case where only the metadata directory is lost.
To replace the lost node, we will start from an empty metadata directory, which means
Garage will generate a new node ID for the replacement node.
We will thus need to remove the previous node ID from Garage's configuration and replace it by the ID of the new node.
If your data directory is stored on a separate drive and is still fine, you can keep it, but it is not necessary to do so.
In all cases, the data will be rebalanced and the replacement node will not store the same pieces of data
as were originally stored on the one that failed. So if you keep the data files, the rebalancing
might be faster but most of the pieces will be deleted anyway from the disk and replaced by other ones.
First, set up a new drive to store the metadata directory for the replacement node (a SSD is recommended),
and for the data directory if necessary. You can then start Garage on the new node.
The restarted node should generate a new node ID, and it should be shown as `NOT CONFIGURED` in `garage status`.
The ID of the lost node should be shown in `garage status` in the section for disconnected/unavailable nodes.
Then, replace the broken node by the new one, using:
```
garage node configure --replace <old_node_id> \
-c <capacity> -d <datacenter> -t <node_tag> <new_node_id>
```
Garage will then start synchronizing all required data on the new node.
This process can be monitored using the `garage stats -a` command.

View file

@ -0,0 +1 @@
# Host a website

View file

@ -0,0 +1,5 @@
# Design
The design section helps you to see Garage from a "big picture" perspective.
It will allow you to understand if Garage is a good fit for you,
how to better use it, how to contribute to it, what can Garage could and could not do, etc.

View file

@ -0,0 +1,158 @@
**WARNING: this documentation is more a "design draft", which was written before Garage's actual implementation. The general principle is similar but details have not yet been updated.**
#### Modules
- `membership/`: configuration, membership management (gossip of node's presence and status), ring generation --> what about Serf (used by Consul/Nomad) : https://www.serf.io/? Seems a huge library with many features so maybe overkill/hard to integrate
- `metadata/`: metadata management
- `blocks/`: block management, writing, GC and rebalancing
- `internal/`: server to server communication (HTTP server and client that reuses connections, TLS if we want, etc)
- `api/`: S3 API
- `web/`: web management interface
#### Metadata tables
**Objects:**
- *Hash key:* Bucket name (string)
- *Sort key:* Object key (string)
- *Sort key:* Version timestamp (int)
- *Sort key:* Version UUID (string)
- Complete: bool
- Inline: bool, true for objects < threshold (say 1024)
- Object size (int)
- Mime type (string)
- Data for inlined objects (blob)
- Hash of first block otherwise (string)
*Having only a hash key on the bucket name will lead to storing all file entries of this table for a specific bucket on a single node. At the same time, it is the only way I see to rapidly being able to list all bucket entries...*
**Blocks:**
- *Hash key:* Version UUID (string)
- *Sort key:* Offset of block in total file (int)
- Hash of data block (string)
A version is defined by the existence of at least one entry in the blocks table for a certain version UUID.
We must keep the following invariant: if a version exists in the blocks table, it has to be referenced in the objects table.
We explicitly manage concurrent versions of an object: the version timestamp and version UUID columns are index columns, thus we may have several concurrent versions of an object.
Important: before deleting an older version from the objects table, we must make sure that we did a successfull delete of the blocks of that version from the blocks table.
Thus, the workflow for reading an object is as follows:
1. Check permissions (LDAP)
2. Read entry in object table. If data is inline, we have its data, stop here.
-> if several versions, take newest one and launch deletion of old ones in background
3. Read first block from cluster. If size <= 1 block, stop here.
4. Simultaneously with previous step, if size > 1 block: query the Blocks table for the IDs of the next blocks
5. Read subsequent blocks from cluster
Workflow for PUT:
1. Check write permission (LDAP)
2. Select a new version UUID
3. Write a preliminary entry for the new version in the objects table with complete = false
4. Send blocks to cluster and write entries in the blocks table
5. Update the version with complete = true and all of the accurate information (size, etc)
6. Return success to the user
7. Launch a background job to check and delete older versions
Workflow for DELETE:
1. Check write permission (LDAP)
2. Get current version (or versions) in object table
3. Do the deletion of those versions NOT IN A BACKGROUND JOB THIS TIME
4. Return succes to the user if we were able to delete blocks from the blocks table and entries from the object table
To delete a version:
1. List the blocks from Cassandra
2. For each block, delete it from cluster. Don't care if some deletions fail, we can do GC.
3. Delete all of the blocks from the blocks table
4. Finally, delete the version from the objects table
Known issue: if someone is reading from a version that we want to delete and the object is big, the read might be interrupted. I think it is ok to leave it like this, we just cut the connection if data disappears during a read.
("Soit P un problème, on s'en fout est une solution à ce problème")
#### Block storage on disk
**Blocks themselves:**
- file path = /blobs/(first 3 hex digits of hash)/(rest of hash)
**Reverse index for GC & other block-level metadata:**
- file path = /meta/(first 3 hex digits of hash)/(rest of hash)
- map block hash -> set of version UUIDs where it is referenced
Usefull metadata:
- list of versions that reference this block in the Casandra table, so that we can do GC by checking in Cassandra that the lines still exist
- list of other nodes that we know have acknowledged a write of this block, usefull in the rebalancing algorithm
Write strategy: have a single thread that does all write IO so that it is serialized (or have several threads that manage independent parts of the hash space). When writing a blob, write it to a temporary file, close, then rename so that a concurrent read gets a consistent result (either not found or found with whole content).
Read strategy: the only read operation is get(hash) that returns either the data or not found (can do a corruption check as well and return corrupted state if it is the case). Can be done concurrently with writes.
**Internal API:**
- get(block hash) -> ok+data/not found/corrupted
- put(block hash & data, version uuid + offset) -> ok/error
- put with no data(block hash, version uuid + offset) -> ok/not found plz send data/error
- delete(block hash, version uuid + offset) -> ok/error
GC: when last ref is deleted, delete block.
Long GC procedure: check in Cassandra that version UUIDs still exist and references this block.
Rebalancing: takes as argument the list of newly added nodes.
- List all blocks that we have. For each block:
- If it hits a newly introduced node, send it to them.
Use put with no data first to check if it has to be sent to them already or not.
Use a random listing order to avoid race conditions (they do no harm but we might have two nodes sending the same thing at the same time thus wasting time).
- If it doesn't hit us anymore, delete it and its reference list.
Only one balancing can be running at a same time. It can be restarted at the beginning with new parameters.
#### Membership management
Two sets of nodes:
- set of nodes from which a ping was recently received, with status: number of stored blocks, request counters, error counters, GC%, rebalancing%
(eviction from this set after say 30 seconds without ping)
- set of nodes that are part of the system, explicitly modified by the operator using the web UI (persisted to disk),
is a CRDT using a version number for the value of the whole set
Thus, three states for nodes:
- healthy: in both sets
- missing: not pingable but part of desired cluster
- unused/draining: currently present but not part of the desired cluster, empty = if contains nothing, draining = if still contains some blocks
Membership messages between nodes:
- ping with current state + hash of current membership info -> reply with same info
- send&get back membership info (the ids of nodes that are in the two sets): used when no local membership change in a long time and membership info hash discrepancy detected with first message (passive membership fixing with full CRDT gossip)
- inform of newly pingable node(s) -> no result, when receive new info repeat to all (reliable broadcast)
- inform of operator membership change -> no result, when receive new info repeat to all (reliable broadcast)
Ring: generated from the desired set of nodes, however when doing read/writes on the ring, skip nodes that are known to be not pingable.
The tokens are generated in a deterministic fashion from node IDs (hash of node id + token number from 1 to K).
Number K of tokens per node: decided by the operator & stored in the operator's list of nodes CRDT. Default value proposal: with node status information also broadcast disk total size and free space, and propose a default number of tokens equal to 80%Free space / 10Gb. (this is all user interface)
#### Constants
- Block size: around 1MB ? --> Exoscale use 16MB chunks
- Number of tokens in the hash ring: one every 10Gb of allocated storage
- Threshold for storing data directly in Cassandra objects table: 1kb bytes (maybe up to 4kb?)
- Ping timeout (time after which a node is registered as unresponsive/missing): 30 seconds
- Ping interval: 10 seconds
- ??
#### Links
- CDC: <https://www.usenix.org/system/files/conference/atc16/atc16-paper-xia.pdf>
- Erasure coding: <http://web.eecs.utk.edu/~jplank/plank/papers/CS-08-627.html>
- [Openstack Storage Concepts](https://docs.openstack.org/arch-design/design-storage/design-storage-concepts.html)
- [RADOS](https://ceph.com/wp-content/uploads/2016/08/weil-rados-pdsw07.pdf)

View file

@ -0,0 +1,56 @@
# Related Work
## Context
Data storage is critical: it can lead to data loss if done badly and/or on hardware failure.
Filesystems + RAID can help on a single machine but a machine failure can put the whole storage offline.
Moreover, it put a hard limit on scalability. Often this limit can be pushed back far away by buying expensive machines.
But here we consider non specialized off the shelf machines that can be as low powered and subject to failures as a raspberry pi.
Distributed storage may help to solve both availability and scalability problems on these machines.
Many solutions were proposed, they can be categorized as block storage, file storage and object storage depending on the abstraction they provide.
## Overview
Block storage is the most low level one, it's like exposing your raw hard drive over the network.
It requires very low latencies and stable network, that are often dedicated.
However it provides disk devices that can be manipulated by the operating system with the less constraints: it can be partitioned with any filesystem, meaning that it supports even the most exotic features.
We can cite [iSCSI](https://en.wikipedia.org/wiki/ISCSI) or [Fibre Channel](https://en.wikipedia.org/wiki/Fibre_Channel).
Openstack Cinder proxy previous solution to provide an uniform API.
File storage provides a higher abstraction, they are one filesystem among others, which means they don't necessarily have all the exotic features of every filesystem.
Often, they relax some POSIX constraints while many applications will still be compatible without any modification.
As an example, we are able to run MariaDB (very slowly) over GlusterFS...
We can also mention CephFS (read [RADOS](https://ceph.com/wp-content/uploads/2016/08/weil-rados-pdsw07.pdf) whitepaper), Lustre, LizardFS, MooseFS, etc.
OpenStack Manila proxy previous solutions to provide an uniform API.
Finally object storages provide the highest level abstraction.
They are the testimony that the POSIX filesystem API is not adapted to distributed filesystems.
Especially, the strong concistency has been dropped in favor of eventual consistency which is way more convenient and powerful in presence of high latencies and unreliability.
We often read about S3 that pioneered the concept that it's a filesystem for the WAN.
Applications must be adapted to work for the desired object storage service.
Today, the S3 HTTP REST API acts as a standard in the industry.
However, Amazon S3 source code is not open but alternatives were proposed.
We identified Minio, Pithos, Swift and Ceph.
Minio/Ceph enforces a total order, so properties similar to a (relaxed) filesystem.
Swift and Pithos are probably the most similar to AWS S3 with their consistent hashing ring.
However Pithos is not maintained anymore. More precisely the company that published Pithos version 1 has developped a second version 2 but has not open sourced it.
Some tests conducted by the [ACIDES project](https://acides.org/) have shown that Openstack Swift consumes way more resources (CPU+RAM) that we can afford. Furthermore, people developing Swift have not designed their software for geo-distribution.
There were many attempts in research too. I am only thinking to [LBFS](https://pdos.csail.mit.edu/papers/lbfs:sosp01/lbfs.pdf) that was used as a basis for Seafile. But none of them have been effectively implemented yet.
## Existing software
**[Pithos](https://github.com/exoscale/pithos) :**
Pithos has been abandonned and should probably not used yet, in the following we explain why we did not pick their design.
Pithos was relying as a S3 proxy in front of Cassandra (and was working with Scylla DB too).
From its designers' mouth, storing data in Cassandra has shown its limitations justifying the project abandonment.
They built a closed-source version 2 that does not store blobs in the database (only metadata) but did not communicate further on it.
We considered there v2's design but concluded that it does not fit both our *Self-contained & lightweight* and *Simple* properties. It makes the development, the deployment and the operations more complicated while reducing the flexibility.
**[IPFS](https://ipfs.io/) :**
*Not written yet*
## Specific research papers
*Not yet written*

View file

@ -0,0 +1,17 @@
# Setup your development environment
We propose the following quickstart to setup a full dev. environment as quickly as possible:
1. Setup a rust/cargo environment. eg. `dnf install rust cargo`
2. Install awscli v2 by following the guide [here](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html).
3. Run `cargo build` to build the project
4. Run `./script/dev-cluster.sh` to launch a test cluster (feel free to read the script)
5. Run `./script/dev-configure.sh` to configure your test cluster with default values (same datacenter, 100 tokens)
6. Run `./script/dev-bucket.sh` to create a bucket named `eprouvette` and an API key that will be stored in `/tmp/garage.s3`
7. Run `source ./script/dev-env-aws.sh` to configure your CLI environment
8. You can use `garage` to manage the cluster. Try `garage --help`.
9. You can use the `awsgrg` alias to add, remove, and delete files. Try `awsgrg help`, `awsgrg cp /proc/cpuinfo s3://eprouvette/cpuinfo.txt`, or `awsgrg ls s3://eprouvette`. `awsgrg` is a wrapper on the `aws s3` command pre-configured with the previously generated API key (the one in `/tmp/garage.s3`) and localhost as the endpoint.
Now you should be ready to start hacking on garage!

View file

@ -0,0 +1,4 @@
# Development
Now that you are a Garage expert, you want to enhance it, you are in the right place!
We discuss here how to hack on Garage, how we manage its development, etc.

View file

@ -0,0 +1,44 @@
# Get a binary
Currently, only two installations procedures are supported for Garage: from Docker (x86\_64 for Linux) and from source.
In the future, we plan to add a third one, by publishing a compiled binary (x86\_64 for Linux).
We did not test other architecture/operating system but, as long as your architecture/operating system is supported by Rust, you should be able to run Garage (feel free to report your tests!).
## From Docker
Our docker image is currently named `lxpz/garage_amd64` and is stored on the [Docker Hub](https://hub.docker.com/r/lxpz/garage_amd64/tags?page=1&ordering=last_updated).
We encourage you to use a fixed tag (eg. `v0.2.1`) and not the `latest` tag.
For this example, we will use the latest published version at the time of the writing which is `v0.2.1` but it's up to you
to check [the most recent versions on the Docker Hub](https://hub.docker.com/r/lxpz/garage_amd64/tags?page=1&ordering=last_updated).
For example:
```
sudo docker pull lxpz/garage_amd64:v0.2.1
```
## From source
Garage is a standard Rust project.
First, you need `rust` and `cargo`.
On Debian:
```bash
sudo apt-get update
sudo apt-get install -y rustc cargo
```
Then, you can ask cargo to install the binary for you:
```bash
cargo install garage
```
That's all, `garage` should be in `$HOME/.cargo/bin`.
You can add this folder to your `$PATH` or copy the binary somewhere else on your system.
For the following, we will assume you copied it in `/usr/local/bin/garage`:
```bash
sudo cp $HOME/.cargo/bin/garage /usr/local/bin/garage
```

View file

@ -0,0 +1,74 @@
# Create buckets and keys
*We use a command named `garagectl` which is in fact an alias you must define as explained in the [Control the daemon](./daemon.md) section.*
In this section, we will suppose that we want to create a bucket named `nextcloud-bucket`
that will be accessed through a key named `nextcloud-app-key`.
Don't forget that `help` command and `--help` subcommands can help you anywhere, the CLI tool is self-documented! Two examples:
```
garagectl help
garagectl bucket allow --help
```
## Create a bucket
Fine, now let's create a bucket (we imagine that you want to deploy nextcloud):
```
garagectl bucket create nextcloud-bucket
```
Check that everything went well:
```
garagectl bucket list
garagectl bucket info nextcloud-bucket
```
## Create an API key
Now we will generate an API key to access this bucket.
Note that API keys are independent of buckets: one key can access multiple buckets, multiple keys can access one bucket.
Now, let's start by creating a key only for our PHP application:
```
garagectl key new --name nextcloud-app-key
```
You will have the following output (this one is fake, `key_id` and `secret_key` were generated with the openssl CLI tool):
```
Key name: nextcloud-app-key
Key ID: GK3515373e4c851ebaad366558
Secret key: 7d37d093435a41f2aab8f13c19ba067d9776c90215f56614adad6ece597dbb34
Authorized buckets:
```
Check that everything works as intended:
```
garagectl key list
garagectl key info nextcloud-app-key
```
## Allow a key to access a bucket
Now that we have a bucket and a key, we need to give permissions to the key on the bucket!
```
garagectl bucket allow \
--read \
--write
nextcloud-bucket \
--key nextcloud-app-key
```
You can check at any times allowed keys on your bucket with:
```
garagectl bucket info nextcloud-bucket
```

View file

@ -0,0 +1,73 @@
# Configure a cluster
*We use a command named `garagectl` which is in fact an alias you must define as explained in the [Control the daemon](./daemon.md) section.*
In this section, we will inform garage of the disk space available on each node of the cluster
as well as the site (think datacenter) of each machine.
## Test cluster
As this part is not relevant for a test cluster, you can use this one-liner to create a basic topology:
```bash
garagectl status | grep UNCONFIGURED | grep -Po '^[0-9a-f]+' | while read id; do
garagectl node configure -d dc1 -c 1 $id
done
```
## Real-world cluster
For our example, we will suppose we have the following infrastructure (Capacity, Identifier and Datacenter are specific values to garage described in the following):
| Location | Name | Disk Space | `Capacity` | `Identifier` | `Datacenter` |
|----------|---------|------------|------------|--------------|--------------|
| Paris | Mercury | 1 To | `2` | `8781c5` | `par1` |
| Paris | Venus | 2 To | `4` | `2a638e` | `par1` |
| London | Earth | 2 To | `4` | `68143d` | `lon1` |
| Brussels | Mars | 1.5 To | `3` | `212f75` | `bru1` |
### Identifier
After its first launch, garage generates a random and unique identifier for each nodes, such as:
```
8781c50c410a41b363167e9d49cc468b6b9e4449b6577b64f15a249a149bdcbc
```
Often a shorter form can be used, containing only the beginning of the identifier, like `8781c5`,
which identifies the server "Mercury" located in "Paris" according to our previous table.
The most simple way to match an identifier to a node is to run:
```
garagectl status
```
It will display the IP address associated with each node; from the IP address you will be able to recognize the node.
### Capacity
Garage reasons on an arbitrary metric about disk storage that is named the *capacity* of a node.
The capacity configured in Garage must be proportional to the disk space dedicated to the node.
Additionaly, the capacity values used in Garage should be as small as possible, with
1 ideally representing the size of your smallest server.
Here we chose that 1 unit of capacity = 0.5 To, so that we can express servers of size
1 To and 2 To, as wel as the intermediate size 1.5 To.
### Datacenter
Datacenter are simply a user-chosen identifier that identify a group of server that are located in the same place.
It is up to the system administrator deploying garage to identify what does "the same place" means.
Behind the scene, garage will try to store the same data on different sites to provide high availability despite a data center failure.
### Inject the topology
Given the information above, we will configure our cluster as follow:
```
garagectl node configure --datacenter par1 -c 2 -t mercury 8781c5
garagectl node configure --datacenter par1 -c 4 -t venus 2a638e
garagectl node configure --datacenter lon1 -c 4 -t earth 68143d
garagectl node configure --datacenter bru1 -c 3 -t mars 212f75
```

View file

@ -0,0 +1,77 @@
# Control the daemon
The `garage` binary has two purposes:
- it acts as a daemon when launched with `garage server ...`
- it acts as a control tool for the daemon when launched with any other command
In this section, we will see how to use the `garage` binary as a control tool for the daemon we just started.
You first need to get a shell having access to this binary, which depends of your configuration:
- with `docker-compose`, run `sudo docker-compose exec g1 bash` then `/garage/garage`
- with `docker`, run `sudo docker exec -ti garaged bash` then `/garage/garage`
- with `systemd`, simply run `/usr/local/bin/garage` if you followed previous instructions
*You can also install the binary on your machine to remotely control the cluster.*
## Talk to the daemon and create an alias
`garage` requires 4 options to talk with the daemon:
```
--ca-cert <ca-cert>
--client-cert <client-cert>
--client-key <client-key>
-h, --rpc-host <rpc-host>
```
The 3 first ones are certificates and keys needed by TLS, the last one is simply the address of garage's RPC endpoint.
Because we configure garage directly from the server, we do not need to set `--rpc-host`.
To avoid typing the 3 first options each time we want to run a command, we will create an alias.
### `docker-compose` alias
```bash
alias garagectl='/garage/garage \
--ca-cert /pki/garage-ca.crt \
--client-cert /pki/garage.crt \
--client-key /pki/garage.key'
```
### `docker` alias
```bash
alias garagectl='/garage/garage \
--ca-cert /etc/garage/pki/garage-ca.crt \
--client-cert /etc/garage/pki/garage.crt \
--client-key /etc/garage/pki/garage.key'
```
### raw binary alias
```bash
alias garagectl='/usr/local/bin/garage \
--ca-cert /etc/garage/pki/garage-ca.crt \
--client-cert /etc/garage/pki/garage.crt \
--client-key /etc/garage/pki/garage.key'
```
Of course, if your deployment does not match exactly one of this alias, feel free to adapt it to your needs!
## Test the alias
You can test your alias by running a simple command such as:
```
garagectl status
```
You should get something like that as result:
```
Healthy nodes:
2a638ed6c775b69a… 37f0ba978d27 [::ffff:172.20.0.101]:3901 UNCONFIGURED/REMOVED
68143d720f20c89d… 9795a2f7abb5 [::ffff:172.20.0.103]:3901 UNCONFIGURED/REMOVED
8781c50c410a41b3… 758338dde686 [::ffff:172.20.0.102]:3901 UNCONFIGURED/REMOVED
```
...which means that you are ready to configure your cluster!

View file

@ -0,0 +1,222 @@
# Configure the daemon
Garage is a software that can be run only in a cluster and requires at least 3 instances.
In our getting started guide, we document two deployment types:
- [Test deployment](#test-deployment) though `docker-compose`
- [Real-world deployment](#real-world-deployment) through `docker` or `systemd`
In any case, you first need to generate TLS certificates, as traffic is encrypted between Garage's nodes.
## Generating a TLS Certificate
To generate your TLS certificates, run on your machine:
```
wget https://git.deuxfleurs.fr/Deuxfleurs/garage/raw/branch/master/genkeys.sh
chmod +x genkeys.sh
./genkeys.sh
```
It will creates a folder named `pki` containing the keys that you will used for the cluster.
## Test deployment
Single machine deployment is only described through `docker-compose`.
Before starting, we recommend you create a folder for our deployment:
```bash
mkdir garage-single
cd garage-single
```
We start by creating a file named `docker-compose.yml` describing our network and our containers:
```yml
version: '3.4'
networks: { virtnet: { ipam: { config: [ subnet: 172.20.0.0/24 ]}}}
services:
g1:
image: lxpz/garage_amd64:v0.1.1d
networks: { virtnet: { ipv4_address: 172.20.0.101 }}
volumes:
- "./pki:/pki"
- "./config.toml:/garage/config.toml"
g2:
image: lxpz/garage_amd64:v0.1.1d
networks: { virtnet: { ipv4_address: 172.20.0.102 }}
volumes:
- "./pki:/pki"
- "./config.toml:/garage/config.toml"
g3:
image: lxpz/garage_amd64:v0.1.1d
networks: { virtnet: { ipv4_address: 172.20.0.103 }}
volumes:
- "./pki:/pki"
- "./config.toml:/garage/config.toml"
```
*We define a static network here which is not considered as a best practise on Docker.
The rational is that Garage only supports IP address and not domain names in its configuration, so we need to know the IP address in advance.*
and then create the `config.toml` file next to it as follow:
```toml
metadata_dir = "/garage/meta"
data_dir = "/garage/data"
rpc_bind_addr = "[::]:3901"
bootstrap_peers = [
"172.20.0.101:3901",
"172.20.0.102:3901",
"172.20.0.103:3901",
]
[rpc_tls]
ca_cert = "/pki/garage-ca.crt"
node_cert = "/pki/garage.crt"
node_key = "/pki/garage.key"
[s3_api]
s3_region = "garage"
api_bind_addr = "[::]:3900"
[s3_web]
bind_addr = "[::]:3902"
root_domain = ".web.garage"
index = "index.html"
```
*Please note that we have not mounted `/garage/meta` or `/garage/data` on the host: data will be lost when the container will be destroyed.*
And that's all, you are ready to launch your cluster!
```
sudo docker-compose up
```
While your daemons are up, your cluster is still not configured yet.
However, you can check that your services are still listening as expected by querying them from your host:
```bash
curl http://172.20.0.{101,102,103}:3902
```
which should give you:
```
Not found
Not found
Not found
```
That's all, you are ready to [configure your cluster!](./cluster.md).
## Real-world deployment
Before deploying garage on your infrastructure, you must inventory your machines.
For our example, we will suppose the following infrastructure:
| Location | Name | IP Address | Disk Space |
|----------|---------|------------|------------|
| Paris | Mercury | fc00:1::1 | 1 To |
| Paris | Venus | fc00:1::2 | 2 To |
| London | Earth | fc00:B::1 | 2 To |
| Brussels | Mars | fc00:F::1 | 1.5 To |
On each machine, we will have a similar setup, especially you must consider the following folders/files:
- `/etc/garage/pki`: Garage certificates, must be generated on your computer and copied on the servers
- `/etc/garage/config.toml`: Garage daemon's configuration (defined below)
- `/etc/systemd/system/garage.service`: Service file to start garage at boot automatically (defined below, not required if you use docker)
- `/var/lib/garage/meta`: Contains Garage's metadata, put this folder on a SSD if possible
- `/var/lib/garage/data`: Contains Garage's data, this folder will grows and must be on a large storage, possibly big HDDs.
A valid `/etc/garage/config.toml` for our cluster would be:
```toml
metadata_dir = "/var/lib/garage/meta"
data_dir = "/var/lib/garage/data"
rpc_bind_addr = "[::]:3901"
bootstrap_peers = [
"[fc00:1::1]:3901",
"[fc00:1::2]:3901",
"[fc00:B::1]:3901",
"[fc00:F::1]:3901",
]
[rpc_tls]
ca_cert = "/etc/garage/pki/garage-ca.crt"
node_cert = "/etc/garage/pki/garage.crt"
node_key = "/etc/garage/pki/garage.key"
[s3_api]
s3_region = "garage"
api_bind_addr = "[::]:3900"
[s3_web]
bind_addr = "[::]:3902"
root_domain = ".web.garage"
index = "index.html"
```
Please make sure to change `bootstrap_peers` to **your** IP addresses!
### For docker users
On each machine, you can run the daemon with:
```bash
docker run \
-d \
--name garaged \
--restart always \
--network host \
-v /etc/garage/pki:/etc/garage/pki \
-v /etc/garage/config.toml:/garage/config.toml \
-v /var/lib/garage/meta:/var/lib/garage/meta \
-v /var/lib/garage/data:/var/lib/garage/data \
lxpz/garage_amd64:v0.1.1d
```
It should be restart automatically at each reboot.
Please note that we use host networking as otherwise Docker containers can no communicate with IPv6.
To upgrade, simply stop and remove this container and start again the command with a new version of garage.
### For systemd/raw binary users
Create a file named `/etc/systemd/system/garage.service`:
```toml
[Unit]
Description=Garage Data Store
After=network-online.target
Wants=network-online.target
[Service]
Environment='RUST_LOG=garage=info' 'RUST_BACKTRACE=1'
ExecStart=/usr/local/bin/garage server -c /etc/garage/config.toml
[Install]
WantedBy=multi-user.target
```
To start the service then automatically enable it at boot:
```bash
sudo systemctl start garage
sudo systemctl enable garage
```
To see if the service is running and to browse its logs:
```bash
sudo systemctl status garage
sudo journalctl -u garage
```
If you want to modify the service file, do not forget to run `systemctl daemon-reload`
to inform `systemd` of your modifications.

View file

@ -0,0 +1,42 @@
# Handle files
We recommend the use of MinIO Client to interact with Garage files (`mc`).
Instructions to install it and use it are provided on the [MinIO website](https://docs.min.io/docs/minio-client-quickstart-guide.html).
Before reading the following, you need a working `mc` command on your path.
## Configure `mc`
You need your access key and secret key created in the [previous section](bucket.md).
You also need to set the endpoint: it must match the IP address of one of the node of the cluster and the API port (3900 by default).
For this whole configuration, you must set an alias name: we chose `my-garage`, that you will used for all commands.
Adapt the following command accordingly and run it:
```bash
mc alias set \
my-garage \
http://172.20.0.101:3900 \
<access key> \
<secret key> \
--api S3v4
```
You must also add an environment variable to your configuration to inform MinIO of our region (`garage` by default).
The best way is to add the following snippet to your `$HOME/.bash_profile` or `$HOME/.bashrc` file:
```bash
export MC_REGION=garage
```
## Use `mc`
You can not list buckets from `mc` currently.
But the following commands and many more should work:
```bash
mc cp image.png my-garage/nextcloud-bucket
mc cp my-garage/nextcloud-bucket/image.png .
mc ls my-garage/nextcloud-bucket
mc mirror localdir/ my-garage/another-bucket
```

View file

@ -0,0 +1,5 @@
# Getting Started
Let's start your Garage journey!
In this chapter, we explain how to deploy a simple garage cluster and start interacting with it.
Our goal is to introduce you to Garage's workflows.

44
doc/book/src/img/logo.svg Normal file
View file

@ -0,0 +1,44 @@
<?xml version="1.0" encoding="UTF-8"?>
<svg width="128" height="128" version="1.1" viewBox="0 0 33.867 33.867" xmlns="http://www.w3.org/2000/svg" xmlns:cc="http://creativecommons.org/ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<metadata>
<rdf:RDF>
<cc:Work rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type rdf:resource="http://purl.org/dc/dcmitype/StillImage"/>
<dc:title/>
</cc:Work>
</rdf:RDF>
</metadata>
<g stroke-width=".14689">
<path d="m20.613 10.981a2.2034 2.2034 0 0 1-0.73445-0.07638l-9.2042-2.4839a2.2342 2.2342 0 0 1-0.69332-0.32757z"/>
<g fill="#4e4e4e">
<path class="cls-1" d="m6.6028 26.612 1.3661-0.0088h0.01763q0.75796 0 0.75796 0.71389v2.3003a6.5748 6.5748 0 0 1-2.2886 0.37898q-1.2515 0-1.8861-0.8505t-0.63457-2.3179q0-1.4689 0.7888-2.2827a2.5823 2.5823 0 0 1 1.9301-0.81524 3.5371 3.5371 0 0 1 2.0667 0.64338 1.0385 1.0385 0 0 1-0.18068 0.46711 1.2603 1.2603 0 0 1-0.33932 0.35254 2.5926 2.5926 0 0 0-1.5027-0.51999 1.4175 1.4175 0 0 0-1.1854 0.54203q-0.42304 0.53909-0.42304 1.6966 0 2.1769 1.604 2.1769a4.4743 4.4743 0 0 0 0.97829-0.11457v-0.83728q0-0.3966 0.01763-0.58756h-0.64633a0.60519 0.60519 0 0 1-0.40101-0.11018 0.44067 0.44067 0 0 1-0.12779-0.35254 1.51 1.51 0 0 1 0.088134-0.47446z"/>
<path class="cls-1" d="m13.401 29.379a1.1413 1.1413 0 0 1-0.14689 0.31288 1.0664 1.0664 0 0 1-0.22474 0.25118 0.99592 0.99592 0 0 1-0.80937-0.51705 1.7847 1.7847 0 0 1-1.2603 0.56406q-0.67863 0-1.0282-0.3966a1.3573 1.3573 0 0 1-0.34372-0.9166q0-0.73445 0.48033-1.1149a1.9404 1.9404 0 0 1 1.2354-0.3687q0.40542 0 0.76677 0.03525v-0.2644q0-0.69626-0.66982-0.69626-0.47592 0-1.3485 0.31728a1.2368 1.2368 0 0 1-0.29378-0.78439 4.9164 4.9164 0 0 1 1.9096-0.3966 1.5526 1.5526 0 0 1 1.0752 0.37016q0.41423 0.37016 0.41423 1.1193v1.7979q-0.0029 0.48474 0.24384 0.68745zm-2.2122-0.22034a1.2471 1.2471 0 0 0 0.88134-0.42304v-0.77852a5.9182 5.9182 0 0 0-0.66982-0.03525 0.73445 0.73445 0 0 0-0.54643 0.18214 0.6331 0.6331 0 0 0-0.18508 0.46711 0.62282 0.62282 0 0 0 0.14689 0.44067 0.48768 0.48768 0 0 0 0.3731 0.14689z"/>
<path class="cls-1" d="m14.115 26.012a1.0547 1.0547 0 0 1 0.14689-0.32169 0.88134 0.88134 0 0 1 0.22474-0.25118 1.1017 1.1017 0 0 1 0.92982 0.78439q0.35254-0.78439 1.1369-0.78439a2.7028 2.7028 0 0 1 0.51118 0.06169 1.9786 1.9786 0 0 1-0.2644 1.0282 2.2357 2.2357 0 0 0-0.3966-0.05288q-0.53762 0-0.86372 0.57287v2.8174a3.0627 3.0627 0 0 1-0.53762 0.04407 3.3785 3.3785 0 0 1-0.55525-0.04407v-2.9525q-0.0059-0.6375-0.33197-0.90191z"/>
<path class="cls-1" d="m21.157 29.379a1.1413 1.1413 0 0 1-0.15423 0.31288 1.0664 1.0664 0 0 1-0.22474 0.25118 0.99592 0.99592 0 0 1-0.8079-0.51705 1.7847 1.7847 0 0 1-1.2603 0.56406q-0.67864 0-1.0282-0.3966a1.3573 1.3573 0 0 1-0.34372-0.9166q0-0.73445 0.48033-1.1149a1.9404 1.9404 0 0 1 1.2295-0.37457q0.40542 0 0.76677 0.03525v-0.2644q0-0.69626-0.66982-0.69626-0.47592 0-1.3485 0.31728a1.2368 1.2368 0 0 1-0.29378-0.7844 4.9164 4.9164 0 0 1 1.9096-0.3966 1.5526 1.5526 0 0 1 1.0752 0.37016q0.41423 0.37016 0.41423 1.1193v1.8038q0.0088 0.48474 0.25559 0.68745zm-2.2151-0.22034a1.2471 1.2471 0 0 0 0.88134-0.42304v-0.77852a5.9182 5.9182 0 0 0-0.66982-0.03525 0.73445 0.73445 0 0 0-0.54643 0.18508 0.6331 0.6331 0 0 0-0.18508 0.46711 0.62282 0.62282 0 0 0 0.14689 0.44067 0.48768 0.48768 0 0 0 0.3731 0.14395z"/>
<path class="cls-1" d="m22.241 29.344q-0.3966-0.60813-0.3966-1.679t0.50236-1.679a1.5188 1.5188 0 0 1 1.2074-0.60813 1.7039 1.7039 0 0 1 1.1898 0.44067 0.99739 0.99739 0 0 1 0.69626-0.37898 0.82552 0.82552 0 0 1 0.23356 0.24677 1.0282 1.0282 0 0 1 0.14689 0.30847q-0.24678 0.21152-0.24678 0.75796v2.4971q0 1.4013-0.4583 1.983-0.4583 0.58169-1.5071 0.58756a4.2598 4.2598 0 0 1-1.5776-0.29378 1.1854 1.1854 0 0 1 0.27322-0.80202 2.882 2.882 0 0 0 1.1854 0.27322q0.57728 0 0.79761-0.29378a1.322 1.322 0 0 0 0.22034-0.81084v-0.35254a1.6936 1.6936 0 0 1-1.1017 0.41423 1.3014 1.3014 0 0 1-1.1648-0.61106zm2.2651-0.71389v-2.0447a1.1355 1.1355 0 0 0-0.75796-0.36135 0.63604 0.63604 0 0 0-0.57728 0.37898 2.2988 2.2988 0 0 0-0.20712 1.0841q0 0.70508 0.18949 1.04a0.56406 0.56406 0 0 0 0.49796 0.33491 1.1193 1.1193 0 0 0 0.8549-0.43186z"/>
<path class="cls-1" d="m30.105 28.039h-2.4678a1.4924 1.4924 0 0 0 0.23356 0.80643q0.20712 0.28644 0.72711 0.28644a2.6778 2.6778 0 0 0 1.1546-0.30847 1.159 1.159 0 0 1 0.31728 0.66982 2.8467 2.8467 0 0 1-1.6966 0.50236q-0.99151 0-1.4234-0.64338-0.43186-0.64338-0.43186-1.6657 0-1.0282 0.47592-1.6657a1.5923 1.5923 0 0 1 1.3617-0.64338q0.88134 0 1.3617 0.53321a1.9434 1.9434 0 0 1 0.47593 1.344 3.4519 3.4519 0 0 1-0.08813 0.7844zm-1.701-1.8684q-0.7227 0-0.77558 1.0929h1.5335v-0.10576a1.25 1.25 0 0 0-0.18508-0.71389 0.64338 0.64338 0 0 0-0.567-0.27321z"/>
</g>
<path d="m17.034 3.0341a2.9114 2.9114 0 0 0-1.1462 0.24753l-11.697 5.1749a0.42304 0.42304 0 0 0-0.22169 0.56586 0.20418 0.20418 0 0 0 0.01757 0.04702l1.8769 3.7099h1.6288l-0.23151-1.2935c-0.0191-0.10429-0.18819-0.84337-0.3483-1.3751l5.4746 1.71c0.07196 0.34089 0.16746 0.65935 0.28112 0.9586h8.8765c0.0978-0.29932 0.17499-0.61834 0.22738-0.9586l5.4627-1.7053c-0.16011 0.53174-0.32713 1.2662-0.34623 1.3705l-0.23151 1.2935h1.6283l1.8593-3.6763 0.01757-0.03359 0.0181-0.04547a0.027909 0.027909 0 0 0 0-0.01188 0.39367 0.39367 0 0 0 0.01757-0.13643 0.41864 0.41864 0 0 0-0.26303-0.4191l-11.697-5.1749a2.9114 2.9114 0 0 0-1.2041-0.24753z" fill="#ffd952"/>
<path d="m17.034 5.4825a2.9114 2.9114 0 0 0-1.1462 0.24753l-11.697 5.1749a0.42304 0.42304 0 0 0-0.22169 0.56534 0.20418 0.20418 0 0 0 0.01757 0.04703l1.018 2.0118h2.1632c-0.068234-0.28802-0.15662-0.64282-0.25528-0.97049l3.1073 0.97048h14.121l3.0939-0.96583c-0.09841 0.32682-0.18541 0.67924-0.25321 0.96583h2.1627l1.0005-1.9782 0.01757-0.03359 0.0181-0.04547a0.027909 0.027909 0 0 0 0-0.01188 0.39367 0.39367 0 0 0 0.01757-0.13643 0.41864 0.41864 0 0 0-0.26303-0.41858l-11.697-5.1749a2.9114 2.9114 0 0 0-1.2041-0.24753z" fill="#49c8fa"/>
<path class="cls-2" d="m30.198 13.82a0.39367 0.39367 0 0 1-0.01762 0.13661 0.027909 0.027909 0 0 1 0 0.01175l-0.01762 0.04554-0.01762 0.03379-2.8306 5.5965c-0.39367 0.77705-1.1178 0.75355-0.99592-0.03232l0.56993-3.1817c0.0191-0.10429 0.18655-0.83874 0.34666-1.3705l-5.4629 1.7054c-0.85784 5.5716-8.1891 5.6641-9.3848 0l-5.4746-1.7098c0.16011 0.53174 0.32904 1.2706 0.34813 1.3749l0.56994 3.1816c0.12192 0.78586-0.60225 0.80937-0.99592 0.03232l-2.8482-5.6303a0.20418 0.20418 0 0 1-0.01763-0.04701 0.42304 0.42304 0 0 1 0.2218-0.56553l11.697-5.175a2.9114 2.9114 0 0 1 2.3502 0l11.697 5.175a0.41864 0.41864 0 0 1 0.26294 0.41864z" fill="#ffd952"/>
<path class="cls-3" d="m20.801 14.796 5.0574-2.0359a0.21446 0.21446 0 0 0 0-0.39807c-0.58756-0.24531-1.3132-0.52734-2.0242-0.82259-0.13073-0.05435-1.369 0.83434-1.4821 0.92541l-2.1799 1.7421c-0.52734 0.44214-0.07051 0.86959 0.62869 0.58903z" fill="#45c8ff"/>
<circle class="cls-3" cx="17.135" cy="16.785" r="2.6367" fill="#45c8ff"/>
<path d="m20.613 10.981a2.2034 2.2034 0 0 1-0.73445-0.07638l-9.2042-2.4839a2.2342 2.2342 0 0 1-0.69332-0.32757z"/>
<g fill="#4e4e4e">
<path class="cls-1" d="m6.6028 26.612 1.3661-0.0088h0.01763q0.75796 0 0.75796 0.71389v2.3003a6.5748 6.5748 0 0 1-2.2886 0.37898q-1.2515 0-1.8861-0.8505t-0.63457-2.3179q0-1.4689 0.7888-2.2827a2.5823 2.5823 0 0 1 1.9301-0.81524 3.5371 3.5371 0 0 1 2.0667 0.64338 1.0385 1.0385 0 0 1-0.18068 0.46711 1.2603 1.2603 0 0 1-0.33932 0.35254 2.5926 2.5926 0 0 0-1.5027-0.51999 1.4175 1.4175 0 0 0-1.1854 0.54203q-0.42304 0.53909-0.42304 1.6966 0 2.1769 1.604 2.1769a4.4743 4.4743 0 0 0 0.97829-0.11457v-0.83728q0-0.3966 0.01763-0.58756h-0.64633a0.60519 0.60519 0 0 1-0.40101-0.11018 0.44067 0.44067 0 0 1-0.12779-0.35254 1.51 1.51 0 0 1 0.088134-0.47446z"/>
<path class="cls-1" d="m13.401 29.379a1.1413 1.1413 0 0 1-0.14689 0.31288 1.0664 1.0664 0 0 1-0.22474 0.25118 0.99592 0.99592 0 0 1-0.80937-0.51705 1.7847 1.7847 0 0 1-1.2603 0.56406q-0.67863 0-1.0282-0.3966a1.3573 1.3573 0 0 1-0.34372-0.9166q0-0.73445 0.48033-1.1149a1.9404 1.9404 0 0 1 1.2354-0.3687q0.40542 0 0.76677 0.03525v-0.2644q0-0.69626-0.66982-0.69626-0.47592 0-1.3485 0.31728a1.2368 1.2368 0 0 1-0.29378-0.78439 4.9164 4.9164 0 0 1 1.9096-0.3966 1.5526 1.5526 0 0 1 1.0752 0.37016q0.41423 0.37016 0.41423 1.1193v1.7979q-0.0029 0.48474 0.24384 0.68745zm-2.2122-0.22034a1.2471 1.2471 0 0 0 0.88134-0.42304v-0.77852a5.9182 5.9182 0 0 0-0.66982-0.03525 0.73445 0.73445 0 0 0-0.54643 0.18214 0.6331 0.6331 0 0 0-0.18508 0.46711 0.62282 0.62282 0 0 0 0.14689 0.44067 0.48768 0.48768 0 0 0 0.3731 0.14689z"/>
<path class="cls-1" d="m14.115 26.012a1.0547 1.0547 0 0 1 0.14689-0.32169 0.88134 0.88134 0 0 1 0.22474-0.25118 1.1017 1.1017 0 0 1 0.92982 0.78439q0.35254-0.78439 1.1369-0.78439a2.7028 2.7028 0 0 1 0.51118 0.06169 1.9786 1.9786 0 0 1-0.2644 1.0282 2.2357 2.2357 0 0 0-0.3966-0.05288q-0.53762 0-0.86372 0.57287v2.8174a3.0627 3.0627 0 0 1-0.53762 0.04407 3.3785 3.3785 0 0 1-0.55525-0.04407v-2.9525q-0.0059-0.6375-0.33197-0.90191z"/>
<path class="cls-1" d="m21.157 29.379a1.1413 1.1413 0 0 1-0.15423 0.31288 1.0664 1.0664 0 0 1-0.22474 0.25118 0.99592 0.99592 0 0 1-0.8079-0.51705 1.7847 1.7847 0 0 1-1.2603 0.56406q-0.67864 0-1.0282-0.3966a1.3573 1.3573 0 0 1-0.34372-0.9166q0-0.73445 0.48033-1.1149a1.9404 1.9404 0 0 1 1.2295-0.37457q0.40542 0 0.76677 0.03525v-0.2644q0-0.69626-0.66982-0.69626-0.47592 0-1.3485 0.31728a1.2368 1.2368 0 0 1-0.29378-0.7844 4.9164 4.9164 0 0 1 1.9096-0.3966 1.5526 1.5526 0 0 1 1.0752 0.37016q0.41423 0.37016 0.41423 1.1193v1.8038q0.0088 0.48474 0.25559 0.68745zm-2.2151-0.22034a1.2471 1.2471 0 0 0 0.88134-0.42304v-0.77852a5.9182 5.9182 0 0 0-0.66982-0.03525 0.73445 0.73445 0 0 0-0.54643 0.18508 0.6331 0.6331 0 0 0-0.18508 0.46711 0.62282 0.62282 0 0 0 0.14689 0.44067 0.48768 0.48768 0 0 0 0.3731 0.14395z"/>
<path class="cls-1" d="m22.241 29.344q-0.3966-0.60813-0.3966-1.679t0.50236-1.679a1.5188 1.5188 0 0 1 1.2074-0.60813 1.7039 1.7039 0 0 1 1.1898 0.44067 0.99739 0.99739 0 0 1 0.69626-0.37898 0.82552 0.82552 0 0 1 0.23356 0.24677 1.0282 1.0282 0 0 1 0.14689 0.30847q-0.24678 0.21152-0.24678 0.75796v2.4971q0 1.4013-0.4583 1.983-0.4583 0.58169-1.5071 0.58756a4.2598 4.2598 0 0 1-1.5776-0.29378 1.1854 1.1854 0 0 1 0.27322-0.80202 2.882 2.882 0 0 0 1.1854 0.27322q0.57728 0 0.79761-0.29378a1.322 1.322 0 0 0 0.22034-0.81084v-0.35254a1.6936 1.6936 0 0 1-1.1017 0.41423 1.3014 1.3014 0 0 1-1.1648-0.61106zm2.2651-0.71389v-2.0447a1.1355 1.1355 0 0 0-0.75796-0.36135 0.63604 0.63604 0 0 0-0.57728 0.37898 2.2988 2.2988 0 0 0-0.20712 1.0841q0 0.70508 0.18949 1.04a0.56406 0.56406 0 0 0 0.49796 0.33491 1.1193 1.1193 0 0 0 0.8549-0.43186z"/>
<path class="cls-1" d="m30.105 28.039h-2.4678a1.4924 1.4924 0 0 0 0.23356 0.80643q0.20712 0.28644 0.72711 0.28644a2.6778 2.6778 0 0 0 1.1546-0.30847 1.159 1.159 0 0 1 0.31728 0.66982 2.8467 2.8467 0 0 1-1.6966 0.50236q-0.99151 0-1.4234-0.64338-0.43186-0.64338-0.43186-1.6657 0-1.0282 0.47592-1.6657a1.5923 1.5923 0 0 1 1.3617-0.64338q0.88134 0 1.3617 0.53321a1.9434 1.9434 0 0 1 0.47593 1.344 3.4519 3.4519 0 0 1-0.08813 0.7844zm-1.701-1.8684q-0.7227 0-0.77558 1.0929h1.5335v-0.10576a1.25 1.25 0 0 0-0.18508-0.71389 0.64338 0.64338 0 0 0-0.567-0.27321z"/>
</g>
<g>
<path d="m17.034 3.0341a2.9114 2.9114 0 0 0-1.1462 0.24753l-11.697 5.1749a0.42304 0.42304 0 0 0-0.22169 0.56586 0.20418 0.20418 0 0 0 0.01757 0.04702l1.8769 3.7099h1.6288l-0.23151-1.2935c-0.0191-0.10429-0.18819-0.84337-0.3483-1.3751l5.4746 1.71c0.07196 0.34089 0.16746 0.65935 0.28112 0.9586h8.8765c0.0978-0.29932 0.17499-0.61834 0.22738-0.9586l5.4627-1.7053c-0.16011 0.53174-0.32713 1.2662-0.34623 1.3705l-0.23151 1.2935h1.6283l1.8593-3.6763 0.01757-0.03359 0.0181-0.04547a0.027909 0.027909 0 0 0 0-0.01188 0.39367 0.39367 0 0 0 0.01757-0.13643 0.41864 0.41864 0 0 0-0.26303-0.4191l-11.697-5.1749a2.9114 2.9114 0 0 0-1.2041-0.24753z" fill="#ff9329"/>
<path d="m17.034 5.4825a2.9114 2.9114 0 0 0-1.1462 0.24753l-11.697 5.1749a0.42304 0.42304 0 0 0-0.22169 0.56534 0.20418 0.20418 0 0 0 0.01757 0.04703l1.018 2.0118h2.1632c-0.068234-0.28802-0.15662-0.64282-0.25528-0.97049l3.1073 0.97048h14.121l3.0939-0.96583c-0.09841 0.32682-0.18541 0.67924-0.25321 0.96583h2.1627l1.0005-1.9782 0.01757-0.03359 0.0181-0.04547a0.027909 0.027909 0 0 0 0-0.01188 0.39367 0.39367 0 0 0 0.01757-0.13643 0.41864 0.41864 0 0 0-0.26303-0.41858l-11.697-5.1749a2.9114 2.9114 0 0 0-1.2041-0.24753z" fill="#4e4e4e"/>
<path class="cls-2" d="m30.198 13.82a0.39367 0.39367 0 0 1-0.01762 0.13661 0.027909 0.027909 0 0 1 0 0.01175l-0.01762 0.04554-0.01762 0.03379-2.8306 5.5965c-0.39367 0.77705-1.1178 0.75355-0.99592-0.03232l0.56993-3.1817c0.0191-0.10429 0.18655-0.83874 0.34666-1.3705l-5.4629 1.7054c-0.85784 5.5716-8.1891 5.6641-9.3848 0l-5.4746-1.7098c0.16011 0.53174 0.32904 1.2706 0.34813 1.3749l0.56994 3.1816c0.12192 0.78586-0.60225 0.80937-0.99592 0.03232l-2.8482-5.6303a0.20418 0.20418 0 0 1-0.01763-0.04701 0.42304 0.42304 0 0 1 0.2218-0.56553l11.697-5.175a2.9114 2.9114 0 0 1 2.3502 0l11.697 5.175a0.41864 0.41864 0 0 1 0.26294 0.41864z" fill="#ff9329"/>
<path class="cls-3" d="m20.801 14.796 5.0574-2.0359a0.21446 0.21446 0 0 0 0-0.39807c-0.58756-0.24531-1.3132-0.52734-2.0242-0.82259-0.13073-0.05435-1.369 0.83434-1.4821 0.92541l-2.1799 1.7421c-0.52734 0.44214-0.07051 0.86959 0.62869 0.58903z" fill="#4e4e4e"/>
<circle class="cls-3" cx="17.135" cy="16.785" r="2.6367" fill="#4e4e4e"/>
</g>
</g>
</svg>

After

Width:  |  Height:  |  Size: 13 KiB

104
doc/book/src/intro.md Normal file
View file

@ -0,0 +1,104 @@
<p align="center" style="text-align:center;">
<a href="https://garagehq.deuxfleurs.fr">
<img alt="Garage's Logo" src="img/logo.svg" height="200" />
</a>
</p>
```
This very website is hosted using Garage. In other words: the doc is the PoC!
```
# The Garage Geo-Distributed Data Store
Garage is a lightweight geo-distributed data store.
It comes from the observation that despite numerous object stores
many people have broken data management policies (backup/replication on a single site or none at all).
To promote better data management policies, we focused on the following **desirable properties**:
- **Self-contained & lightweight**: works everywhere and integrates well in existing environments to target [hyperconverged infrastructures](https://en.wikipedia.org/wiki/Hyper-converged_infrastructure).
- **Highly resilient**: highly resilient to network failures, network latency, disk failures, sysadmin failures.
- **Simple**: simple to understand, simple to operate, simple to debug.
- **Internet enabled**: made for multi-sites (eg. datacenters, offices, households, etc.) interconnected through regular Internet connections.
We also noted that the pursuit of some other goals are detrimental to our initial goals.
The following has been identified as **non-goals** (if these points matter to you, you should not use Garage):
- **Extreme performances**: high performances constrain a lot the design and the infrastructure; we seek performances through minimalism only.
- **Feature extensiveness**: complete implementation of the S3 API or any other API to make garage a drop-in replacement is not targeted as it could lead to decisions impacting our desirable properties.
- **Storage optimizations**: erasure coding or any other coding technique both increase the difficulty of placing data and synchronizing; we limit ourselves to duplication.
- **POSIX/Filesystem compatibility**: we do not aim at being POSIX compatible or to emulate any kind of filesystem. Indeed, in a distributed environment, such synchronizations are translated in network messages that impose severe constraints on the deployment.
## Supported and planned protocols
Garage speaks (or will speak) the following protocols:
- [S3](https://docs.aws.amazon.com/AmazonS3/latest/API/Welcome.html) - *SUPPORTED* - Enable applications to store large blobs such as pictures, video, images, documents, etc. S3 is versatile enough to also be used to publish a static website.
- [IMAP](https://github.com/go-pluto/pluto) - *PLANNED* - email storage is quite complex to get good performances.
To keep performances optimal, most IMAP servers only support on-disk storage.
We plan to add logic to Garage to make it a viable solution for email storage.
- *More to come*
## Use Cases
**[Deuxfleurs](https://deuxfleurs.fr):** Garage is used by Deuxfleurs which is a non-profit hosting organization.
Especially, it is used to host their main website, this documentation and some of its members' blogs.
Additionally, Garage is used as a [backend for Nextcloud](https://docs.nextcloud.com/server/20/admin_manual/configuration_files/primary_storage.html).
Deuxfleurs also plans to use Garage as their [Matrix's media backend](https://github.com/matrix-org/synapse-s3-storage-provider) and as the backend of [OCIS](https://github.com/owncloud/ocis).
*Are you using Garage? [Open a pull request](https://git.deuxfleurs.fr/Deuxfleurs/garage/) to add your organization here!*
## Comparison to existing software
**[MinIO](https://min.io/):** MinIO shares our *Self-contained & lightweight* goal but selected two of our non-goals: *Storage optimizations* through erasure coding and *POSIX/Filesystem compatibility* through strong consistency.
However, by pursuing these two non-goals, MinIO do not reach our desirable properties.
Firstly, it fails on the *Simple* property: due to the erasure coding, MinIO has severe limitations on how drives can be added or deleted from a cluster.
Secondly, it fails on the *Internet enabled* property: due to its strong consistency, MinIO is latency sensitive.
Furthermore, MinIO has no knowledge of "sites" and thus can not distribute data to minimize the failure of a given site.
**[Openstack Swift](https://docs.openstack.org/swift/latest/):**
OpenStack Swift at least fails on the *Self-contained & lightweight* goal.
Starting it requires around 8GB of RAM, which is too much especially in an hyperconverged infrastructure.
We also do not classify Swift as *Simple*.
**[Ceph](https://ceph.io/ceph-storage/object-storage/):**
This review holds for the whole Ceph stack, including the RADOS paper, Ceph Object Storage module, the RADOS Gateway, etc.
At its core, Ceph has been designed to provide *POSIX/Filesystem compatibility* which requires strong consistency, which in turn
makes Ceph latency-sensitive and fails our *Internet enabled* goal.
Due to its industry oriented design, Ceph is also far from being *Simple* to operate and from being *Self-contained & lightweight* which makes it hard to integrate it in an hyperconverged infrastructure.
In a certain way, Ceph and MinIO are closer together than they are from Garage or OpenStack Swift.
*More comparisons are available in our [Related Work](design/related_work.md) chapter.*
## Other Resources
This website is not the only source of information about Garage!
We reference here other places on the Internet where you can learn more about Garage.
### Rust API (docs.rs)
If you encounter a specific bug in Garage or plan to patch it, you may jump directly to the source code's documentation!
- [garage\_api](https://docs.rs/garage_api/latest/garage_api/) - contains the S3 standard API endpoint
- [garage\_model](https://docs.rs/garage_model/latest/garage_model/) - contains Garage's model built on the table abstraction
- [garage\_rpc](https://docs.rs/garage_rpc/latest/garage_rpc/) - contains Garage's federation protocol
- [garage\_table](https://docs.rs/garage_table/latest/garage_table/) - contains core Garage's CRDT datatypes
- [garage\_util](https://docs.rs/garage_util/latest/garage_util/) - contains garage helpers
- [garage\_web](https://docs.rs/garage_web/latest/garage_web/) - contains the S3 website endpoint
### Talks
We love to talk and hear about Garage, that's why we keep a log here:
- [(fr, 2020-12-02) Garage : jouer dans la cour des grands quand on est un hébergeur associatif](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/main/doc/20201202_talk/talk.pdf)
*Did you write or talk about Garage? [Open a pull request](https://git.deuxfleurs.fr/Deuxfleurs/garage/) to add a link here!*
## Community
If you want to discuss with us, you can join our Matrix channel at [#garage:deuxfleurs.fr](https://matrix.to/#/#garage:deuxfleurs.fr).
Our code repository and issue tracker, which is the place where you should report bugs, is managed on [Deuxfleurs' Gitea](https://git.deuxfleurs.fr/Deuxfleurs/garage).
## License
Garage's source code, is released under the [AGPL v3 License](https://www.gnu.org/licenses/agpl-3.0.en.html).
Please note that if you patch Garage and then use it to provide any service over a network, you must share your code!

View file

@ -0,0 +1,5 @@
# Reference Manual
A reference manual contains some extensive descriptions about the features and the behaviour of the software.
Reading of this chapter is recommended once you have a good knowledge/understanding of Garage.
It will be useful if you want to tune it or to use it in some exotic conditions.

View file

@ -0,0 +1,84 @@
## S3 Compatibility status
### Global S3 features
Implemented:
- path-style URLs (`garage.tld/bucket/key`)
- putting and getting objects in buckets
- multipart uploads
- listing objects
- access control on a per-key-per-bucket basis
Not implemented:
- vhost-style URLs (`bucket.garage.tld/key`)
- object-level ACL
- encryption
- most `x-amz-` headers
### Endpoint implementation
All APIs that are not mentionned are not implemented and will return a 400 bad request.
#### AbortMultipartUpload
Implemented.
#### CompleteMultipartUpload
Implemented badly. Garage will not check that all the parts stored correspond to the list given by the client in the request body. This means that the multipart upload might be completed with an invalid size. This is a bug and will be fixed.
#### CopyObject
Implemented.
#### CreateBucket
Garage does not accept creating buckets or giving access using API calls, it has to be done using the CLI tools. CreateBucket will return a 200 if the bucket exists and user has write access, and a 403 Forbidden in all other cases.
#### CreateMultipartUpload
Implemented.
#### DeleteBucket
Garage does not accept deleting buckets using API calls, it has to be done using the CLI tools. This request will return a 403 Forbidden.
#### DeleteObject
Implemented.
#### DeleteObjects
Implemented.
#### GetObject
Implemented.
#### HeadBucket
Implemented.
#### HeadObject
Implemented.
#### ListObjects
Implemented, but there isn't a very good specification of what `encoding-type=url` covers so there might be some encoding bugs. In our implementation the url-encoded fields are in the same in ListObjects as they are in ListObjectsV2.
#### ListObjectsV2
Implemented.
#### PutObject
Implemented.
#### UploadPart
Implemented.

View file

@ -0,0 +1,8 @@
# Working Documents
Working documents are documents that reflect the fact that Garage is a software that evolves quickly.
They are a way to communicate our ideas, our changes, and so on before or while we are implementing them in Garage.
If you like to live on the edge, it could also serve as a documentation of our next features to be released.
Ideally, once the feature/patch has been merged, the working document should serve as a source to
update the rest of the documentation and then be removed.

View file

@ -0,0 +1,197 @@
## Load Balancing Data (planned for version 0.2)
I have conducted a quick study of different methods to load-balance data over different Garage nodes using consistent hashing.
### Requirements
- *good balancing*: two nodes that have the same announced capacity should receive close to the same number of items
- *multi-datacenter*: the replicas of a partition should be distributed over as many datacenters as possible
- *minimal disruption*: when adding or removing a node, as few partitions as possible should have to move around
- *order-agnostic*: the same set of nodes (each associated with a datacenter name
and a capacity) should always return the same distribution of partition
replicas, independently of the order in which nodes were added/removed (this
is to keep the implementation simple)
### Methods
#### Naive multi-DC ring walking strategy
This strategy can be used with any ring-like algorithm to make it aware of the *multi-datacenter* requirement:
In this method, the ring is a list of positions, each associated with a single node in the cluster.
Partitions contain all the keys between two consecutive items of the ring.
To find the nodes that store replicas of a given partition:
- select the node for the position of the partition's lower bound
- go clockwise on the ring, skipping nodes that:
- we halve already selected
- are in a datacenter of a node we have selected, except if we already have nodes from all possible datacenters
In this way the selected nodes will always be distributed over
`min(n_datacenters, n_replicas)` different datacenters, which is the best we
can do.
This method was implemented in the first version of Garage, with the basic
ring construction from Dynamo DB that consists in associating `n_token` random positions to
each node (I know it's not optimal, the Dynamo paper already studies this).
#### Better rings
The ring construction that selects `n_token` random positions for each nodes gives a ring of positions that
is not well-balanced: the space between the tokens varies a lot, and some partitions are thus bigger than others.
This problem was demonstrated in the original Dynamo DB paper.
To solve this, we want to apply a better second method for partitionning our dataset:
1. fix an initially large number of partitions (say 1024) with evenly-spaced delimiters,
2. attribute each partition randomly to a node, with a probability
proportionnal to its capacity (which `n_tokens` represented in the first
method)
For now we continue using the multi-DC ring walking described above.
I have studied two ways to do the attribution of partitions to nodes, in a way that is deterministic:
- Min-hash: for each partition, select node that minimizes `hash(node, partition_number)`
- MagLev: see [here](https://blog.acolyer.org/2016/03/21/maglev-a-fast-and-reliable-software-network-load-balancer/)
MagLev provided significantly better balancing, as it guarantees that the exact
same number of partitions is attributed to all nodes that have the same
capacity (and that this number is proportionnal to the node's capacity, except
for large values), however in both cases:
- the distribution is still bad, because we use the naive multi-DC ring walking
that behaves strangely due to interactions between consecutive positions on
the ring
- the disruption in case of adding/removing a node is not as low as it can be,
as we show with the following method.
A quick description of MagLev (backend = node, lookup table = ring):
> The basic idea of Maglev hashing is to assign a preference list of all the
> lookup table positions to each backend. Then all the backends take turns
> filling their most-preferred table positions that are still empty, until the
> lookup table is completely filled in. Hence, Maglev hashing gives an almost
> equal share of the lookup table to each of the backends. Heterogeneous
> backend weights can be achieved by altering the relative frequency of the
> backends turns…
Here are some stats (run `scripts/simulate_ring.py` to reproduce):
```
##### Custom-ring (min-hash) #####
#partitions per node (capacity in parenthesis):
- datura (8) : 227
- digitale (8) : 351
- drosera (8) : 259
- geant (16) : 476
- gipsie (16) : 410
- io (16) : 495
- isou (8) : 231
- mini (4) : 149
- mixi (4) : 188
- modi (4) : 127
- moxi (4) : 159
Variance of load distribution for load normalized to intra-class mean
(a class being the set of nodes with the same announced capacity): 2.18% <-- REALLY BAD
Disruption when removing nodes (partitions moved on 0/1/2/3 nodes):
removing atuin digitale : 63.09% 30.18% 6.64% 0.10%
removing atuin drosera : 72.36% 23.44% 4.10% 0.10%
removing atuin datura : 73.24% 21.48% 5.18% 0.10%
removing jupiter io : 48.34% 38.48% 12.30% 0.88%
removing jupiter isou : 74.12% 19.73% 6.05% 0.10%
removing grog mini : 84.47% 12.40% 2.93% 0.20%
removing grog mixi : 80.76% 16.60% 2.64% 0.00%
removing grog moxi : 83.59% 14.06% 2.34% 0.00%
removing grog modi : 87.01% 11.43% 1.46% 0.10%
removing grisou geant : 48.24% 37.40% 13.67% 0.68%
removing grisou gipsie : 53.03% 33.59% 13.09% 0.29%
on average: 69.84% 23.53% 6.40% 0.23% <-- COULD BE BETTER
--------
##### MagLev #####
#partitions per node:
- datura (8) : 273
- digitale (8) : 256
- drosera (8) : 267
- geant (16) : 452
- gipsie (16) : 427
- io (16) : 483
- isou (8) : 272
- mini (4) : 184
- mixi (4) : 160
- modi (4) : 144
- moxi (4) : 154
Variance of load distribution: 0.37% <-- Already much better, but not optimal
Disruption when removing nodes (partitions moved on 0/1/2/3 nodes):
removing atuin digitale : 62.60% 29.20% 7.91% 0.29%
removing atuin drosera : 65.92% 26.56% 7.23% 0.29%
removing atuin datura : 63.96% 27.83% 7.71% 0.49%
removing jupiter io : 44.63% 40.33% 14.06% 0.98%
removing jupiter isou : 63.38% 27.25% 8.98% 0.39%
removing grog mini : 72.46% 21.00% 6.35% 0.20%
removing grog mixi : 72.95% 22.46% 4.39% 0.20%
removing grog moxi : 74.22% 20.61% 4.98% 0.20%
removing grog modi : 75.98% 18.36% 5.27% 0.39%
removing grisou geant : 46.97% 36.62% 15.04% 1.37%
removing grisou gipsie : 49.22% 36.52% 12.79% 1.46%
on average: 62.94% 27.89% 8.61% 0.57% <-- WORSE THAN PREVIOUSLY
```
#### The magical solution: multi-DC aware MagLev
Suppose we want to select three replicas for each partition (this is what we do in our simulation and in most Garage deployments).
We apply MagLev three times consecutively, one for each replica selection.
The first time is pretty much the same as normal MagLev, but for the following times, when a node runs through its preference
list to select a partition to replicate, we skip partitions for which adding this node would not bring datacenter-diversity.
More precisely, we skip a partition in the preference list if:
- the node already replicates the partition (from one of the previous rounds of MagLev)
- the node is in a datacenter where a node already replicates the partition and there are other datacenters available
Refer to `method4` in the simulation script for a formal definition.
```
##### Multi-DC aware MagLev #####
#partitions per node:
- datura (8) : 268 <-- NODES WITH THE SAME CAPACITY
- digitale (8) : 267 HAVE THE SAME NUM OF PARTITIONS
- drosera (8) : 267 (+- 1)
- geant (16) : 470
- gipsie (16) : 472
- io (16) : 516
- isou (8) : 268
- mini (4) : 136
- mixi (4) : 136
- modi (4) : 136
- moxi (4) : 136
Variance of load distribution: 0.06% <-- CAN'T DO BETTER THAN THIS
Disruption when removing nodes (partitions moved on 0/1/2/3 nodes):
removing atuin digitale : 65.72% 33.01% 1.27% 0.00%
removing atuin drosera : 64.65% 33.89% 1.37% 0.10%
removing atuin datura : 66.11% 32.62% 1.27% 0.00%
removing jupiter io : 42.97% 53.42% 3.61% 0.00%
removing jupiter isou : 66.11% 32.32% 1.56% 0.00%
removing grog mini : 80.47% 18.85% 0.68% 0.00%
removing grog mixi : 80.27% 18.85% 0.88% 0.00%
removing grog moxi : 80.18% 19.04% 0.78% 0.00%
removing grog modi : 79.69% 19.92% 0.39% 0.00%
removing grisou geant : 44.63% 52.15% 3.22% 0.00%
removing grisou gipsie : 43.55% 52.54% 3.91% 0.00%
on average: 64.94% 33.33% 1.72% 0.01% <-- VERY GOOD (VERY LOW VALUES FOR 2 AND 3 NODES)
```

Binary file not shown.

After

Width:  |  Height:  |  Size: 8.1 KiB

View file

@ -0,0 +1,113 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
width="250"
height="250"
viewBox="0 0 66.145832 66.145831"
version="1.1"
id="svg916"
inkscape:version="1.0.2 (e86c870879, 2021-01-15)"
sodipodi:docname="garage-dark-notext.svg"
inkscape:export-filename="/home/lx/Deuxfleurs/garage/garage-dark-notext.png"
inkscape:export-xdpi="96"
inkscape:export-ydpi="96">
<defs
id="defs910" />
<sodipodi:namedview
id="base"
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1.0"
inkscape:pageopacity="0.0"
inkscape:pageshadow="2"
inkscape:zoom="2.3640695"
inkscape:cx="127.28732"
inkscape:cy="150.37984"
inkscape:document-units="mm"
inkscape:current-layer="layer1"
inkscape:document-rotation="0"
showgrid="false"
fit-margin-top="0"
fit-margin-left="0"
fit-margin-right="0"
fit-margin-bottom="0"
units="px"
inkscape:window-width="1920"
inkscape:window-height="1039"
inkscape:window-x="0"
inkscape:window-y="20"
inkscape:window-maximized="0" />
<metadata
id="metadata913">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
<dc:title></dc:title>
</cc:Work>
</rdf:RDF>
</metadata>
<g
inkscape:label="Layer 1"
inkscape:groupmode="layer"
id="layer1"
transform="translate(-141.5009,-98.254059)">
<rect
style="fill:#4e4e4e;fill-opacity:1;stroke-width:1.01574"
id="rect858"
width="66.592186"
height="66.832306"
x="141.5009"
y="98.056282" />
<g
id="g1775"
transform="matrix(1.9019239,0,0,1.9019239,-157.45231,-108.13709)">
<path
class="cls-2"
d="m 187.70646,127.72029 a 0.39366647,0.39366647 0 0 1 -0.0176,0.1366 0.02790919,0.02790919 0 0 1 0,0.0117 l -0.0176,0.0455 v 0 l -0.0176,0.0338 -2.83058,5.59653 c -0.39367,0.77705 -1.11784,0.75355 -0.99592,-0.0323 l 0.56994,-3.18164 c 0.0191,-0.1043 0.18655,-0.83875 0.34666,-1.37049 l -5.46286,1.7054 c -0.85784,5.57155 -8.18914,5.66409 -9.38483,0 l -5.47461,-1.70981 c 0.16011,0.53174 0.32904,1.2706 0.34813,1.3749 l 0.56994,3.18164 c 0.12192,0.78587 -0.60225,0.80937 -0.99592,0.0323 l -2.84822,-5.63031 a 0.20417776,0.20417776 0 0 1 -0.0176,-0.047 0.42304456,0.42304456 0 0 1 0.22181,-0.56552 l 11.69689,-5.17495 a 2.9113691,2.9113691 0 0 1 2.35024,0 l 11.69689,5.17495 a 0.41863785,0.41863785 0 0 1 0.26293,0.41864 z"
id="path24-31"
style="stroke-width:0.146891" />
<path
class="cls-3"
d="m 178.30988,128.69564 5.05744,-2.03591 a 0.21446009,0.21446009 0 0 0 0,-0.39807 c -0.58756,-0.2453 -1.3132,-0.52733 -2.02415,-0.82259 -0.13073,-0.0543 -1.36902,0.83434 -1.48213,0.92542 l -2.17985,1.74212 c -0.52734,0.44214 -0.0705,0.86959 0.62869,0.58903 z"
id="path26-9"
style="stroke-width:0.146891" />
<circle
class="cls-3"
cx="174.64349"
cy="130.68452"
r="2.6366842"
id="circle28-4"
style="stroke-width:0.146891" />
<path
id="path24-3-6-9-0"
style="fill:#ff9329;fill-opacity:1;stroke-width:0.146891"
d="m 174.54269,116.93385 a 2.9113691,2.9113691 0 0 0 -1.14618,0.24753 l -11.69696,5.17488 a 0.42304456,0.42304456 0 0 0 -0.22169,0.56586 0.20417776,0.20417776 0 0 0 0.0176,0.047 l 0.79634,1.57355 11.10475,-4.91288 a 2.9113691,2.9113691 0 0 1 1.14618,-0.24753 2.9113691,2.9113691 0 0 1 1.20406,0.24753 l 11.12387,4.92115 0.7829,-1.54823 0.0176,-0.0336 0.0181,-0.0455 a 0.02790919,0.02790919 0 0 0 0,-0.0119 0.39366647,0.39366647 0 0 0 0.0176,-0.13642 0.41863785,0.41863785 0 0 0 -0.26303,-0.4191 l -11.69697,-5.17488 a 2.9113691,2.9113691 0 0 0 -1.20406,-0.24753 z m -10.12134,9.52449 c 0.0218,0.0723 0.0408,0.14674 0.0615,0.22066 h 0.51831 l -0.008,-0.0419 z m 20.32227,0.005 -0.57103,0.17828 -0.007,0.0377 h 0.5178 c 0.0202,-0.0723 0.0386,-0.14514 0.0599,-0.216 z" />
<path
class="cls-2"
d="m 187.70647,127.72029 a 0.39366647,0.39366647 0 0 1 -0.0176,0.13661 0.02790919,0.02790919 0 0 1 0,0.0117 l -0.0176,0.0455 v 0 l -0.0176,0.0338 -2.83058,5.59652 c -0.39366,0.77705 -1.11783,0.75355 -0.99591,-0.0323 l 0.56993,-3.18165 c 0.0191,-0.10429 0.18655,-0.83874 0.34666,-1.37049 l -5.46285,1.7054 c -0.85784,5.57156 -8.18915,5.6641 -9.38484,0 l -5.4746,-1.70981 c 0.16011,0.53175 0.32903,1.27061 0.34813,1.3749 l 0.56993,3.18165 c 0.12192,0.78586 -0.60225,0.80936 -0.99592,0.0323 l -2.84822,-5.63031 a 0.20417776,0.20417776 0 0 1 -0.0176,-0.047 0.42304456,0.42304456 0 0 1 0.22181,-0.56553 l 11.69688,-5.17495 a 2.9113691,2.9113691 0 0 1 2.35025,0 l 11.69689,5.17495 a 0.41863785,0.41863785 0 0 1 0.26293,0.41864 z"
id="path24-0-3"
style="fill:#ff9329;fill-opacity:1;stroke-width:0.146891" />
<path
class="cls-3"
d="m 178.30988,128.69564 5.05744,-2.0359 a 0.21446009,0.21446009 0 0 0 0,-0.39807 c -0.58756,-0.24531 -1.3132,-0.52734 -2.02415,-0.82259 -0.13073,-0.0543 -1.36902,0.83434 -1.48212,0.92541 l -2.17986,1.74212 c -0.52734,0.44214 -0.0705,0.86959 0.62869,0.58903 z"
id="path26-2-2"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.146891" />
<circle
class="cls-3"
cx="174.64349"
cy="130.68452"
r="2.6366842"
id="circle28-3-0"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.146891" />
</g>
</g>
</svg>

After

Width:  |  Height:  |  Size: 5.8 KiB

174
doc/logo/garage-dark.svg Normal file
View file

@ -0,0 +1,174 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
width="250"
height="250"
viewBox="0 0 66.145832 66.145831"
version="1.1"
id="svg916"
inkscape:version="1.0.2 (e86c870879, 2021-01-15)"
sodipodi:docname="garage-dark.svg">
<defs
id="defs910" />
<sodipodi:namedview
id="base"
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1.0"
inkscape:pageopacity="0.0"
inkscape:pageshadow="2"
inkscape:zoom="2.3640695"
inkscape:cx="132.7426"
inkscape:cy="151.74366"
inkscape:document-units="mm"
inkscape:current-layer="layer1"
inkscape:document-rotation="0"
showgrid="false"
fit-margin-top="0"
fit-margin-left="0"
fit-margin-right="0"
fit-margin-bottom="0"
units="px"
inkscape:window-width="1920"
inkscape:window-height="1039"
inkscape:window-x="0"
inkscape:window-y="20"
inkscape:window-maximized="0" />
<metadata
id="metadata913">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
<dc:title></dc:title>
</cc:Work>
</rdf:RDF>
</metadata>
<g
inkscape:label="Layer 1"
inkscape:groupmode="layer"
id="layer1"
transform="translate(-141.5009,-98.254059)">
<rect
style="fill:#4e4e4e;fill-opacity:1;stroke-width:1.01574"
id="rect858"
width="66.592186"
height="66.832306"
x="141.5009"
y="98.056282" />
<g
id="g1637"
transform="translate(1.5164686,-0.22143797)">
<g
id="g1034-5"
transform="matrix(0.26458333,0,0,0.26458333,140.0054,98.562655)">
<path
class="cls-1"
d="m 85.377935,159.38378 5.163143,-0.0333 h 0.06662 q 2.864711,0 2.864711,2.69816 v 8.69407 a 24.849705,24.849705 0 0 1 -8.649651,1.43235 q -4.730105,0 -7.128468,-3.21447 -2.398363,-3.21447 -2.398363,-8.76068 0,-5.55177 2.981299,-8.62745 a 9.7600046,9.7600046 0 0 1 7.29502,-3.08123 13.368653,13.368653 0 0 1 7.811335,2.43167 3.9250986,3.9250986 0 0 1 -0.682867,1.76547 4.7634152,4.7634152 0 0 1 -1.282458,1.33242 9.798867,9.798867 0 0 0 -5.679457,-1.96533 5.3574542,5.3574542 0 0 0 -4.480275,2.04861 q -1.598909,2.03749 -1.598909,6.41229 0,8.22771 6.062529,8.22771 a 16.910679,16.910679 0 0 0 3.697476,-0.43303 v -3.16451 q 0,-1.49898 0.06662,-2.22071 h -2.442777 a 2.2873276,2.2873276 0 0 1 -1.515632,-0.41638 1.6655298,1.6655298 0 0 1 -0.483004,-1.33242 5.7072154,5.7072154 0 0 1 0.333106,-1.79322 z"
id="path8-2"
style="stroke-width:0.555177" />
<path
class="cls-1"
d="m 111.07151,169.8433 a 4.3137222,4.3137222 0 0 1 -0.55518,1.18253 4.0305821,4.0305821 0 0 1 -0.84942,0.94935 3.7640973,3.7640973 0 0 1 -3.05902,-1.95422 6.7453957,6.7453957 0 0 1 -4.76342,2.13188 q -2.564913,0 -3.886233,-1.49898 a 5.1298318,5.1298318 0 0 1 -1.299113,-3.4643 q 0,-2.77588 1.815427,-4.21379 a 7.3338829,7.3338829 0 0 1 4.669039,-1.3935 q 1.53228,0 2.89802,0.13325 v -0.99932 q 0,-2.63154 -2.53161,-2.63154 -1.79877,0 -5.096518,1.19918 a 4.674587,4.674587 0 0 1 -1.110353,-2.96464 18.581761,18.581761 0 0 1 7.217291,-1.49898 5.8682167,5.8682167 0 0 1 4.0639,1.39905 q 1.56559,1.39904 1.56559,4.23044 v 6.79537 q -0.0111,1.83208 0.9216,2.59822 z m -8.36096,-0.83276 a 4.7134493,4.7134493 0 0 0 3.33106,-1.59891 v -2.94244 a 22.368065,22.368065 0 0 0 -2.53161,-0.13324 2.775883,2.775883 0 0 0 -2.06525,0.68842 2.3928111,2.3928111 0 0 0 -0.69953,1.76546 2.3539488,2.3539488 0 0 0 0.55518,1.66553 1.8431863,1.8431863 0 0 0 1.41015,0.55518 z"
id="path10-28"
style="stroke-width:0.555177" />
<path
class="cls-1"
d="m 113.76966,157.11865 a 3.986168,3.986168 0 0 1 0.55518,-1.21583 3.3310596,3.3310596 0 0 1 0.84942,-0.94935 4.1638245,4.1638245 0 0 1 3.51427,2.96464 q 1.33242,-2.96464 4.29707,-2.96464 a 10.215249,10.215249 0 0 1 1.93201,0.23317 7.4782288,7.4782288 0 0 1 -0.99932,3.88624 8.4497879,8.4497879 0 0 0 -1.49897,-0.19987 q -2.03195,0 -3.26444,2.16519 v 10.64829 a 11.575432,11.575432 0 0 1 -2.03195,0.16655 12.769062,12.769062 0 0 1 -2.09857,-0.16655 v -11.15905 q -0.0222,-2.40947 -1.2547,-3.40879 z"
id="path12-9"
style="stroke-width:0.555177" />
<path
class="cls-1"
d="m 140.38483,169.8433 a 4.3137222,4.3137222 0 0 1 -0.58293,1.18253 4.0305821,4.0305821 0 0 1 -0.84942,0.94935 3.7640973,3.7640973 0 0 1 -3.05348,-1.95422 6.7453957,6.7453957 0 0 1 -4.76341,2.13188 q -2.56492,0 -3.88624,-1.49898 a 5.1298318,5.1298318 0 0 1 -1.29911,-3.4643 q 0,-2.77588 1.81543,-4.21379 a 7.3338829,7.3338829 0 0 1 4.64682,-1.4157 q 1.53229,0 2.89803,0.13324 v -0.99932 q 0,-2.63153 -2.53161,-2.63153 -1.79877,0 -5.09652,1.19918 a 4.674587,4.674587 0 0 1 -1.11035,-2.96465 18.581761,18.581761 0 0 1 7.21729,-1.49897 5.8682167,5.8682167 0 0 1 4.0639,1.39904 q 1.56559,1.39905 1.56559,4.23045 v 6.81757 q 0.0333,1.83208 0.96601,2.59822 z m -8.37206,-0.83276 a 4.7134493,4.7134493 0 0 0 3.33106,-1.59891 v -2.94244 a 22.368065,22.368065 0 0 0 -2.53161,-0.13324 2.775883,2.775883 0 0 0 -2.06526,0.69952 2.3928111,2.3928111 0 0 0 -0.69952,1.76546 2.3539488,2.3539488 0 0 0 0.55518,1.66553 1.8431863,1.8431863 0 0 0 1.41015,0.54408 z"
id="path14-7"
style="stroke-width:0.555177" />
<path
class="cls-1"
d="m 144.48203,169.71006 q -1.49897,-2.29843 -1.49897,-6.34567 0,-4.04724 1.8987,-6.34567 a 5.740526,5.740526 0 0 1 4.56355,-2.29843 6.4400486,6.4400486 0 0 1 4.49693,1.66553 3.7696491,3.7696491 0 0 1 2.63154,-1.43235 3.1200925,3.1200925 0 0 1 0.88273,0.93269 3.8862362,3.8862362 0 0 1 0.55518,1.16587 q -0.9327,0.79946 -0.9327,2.86472 v 9.438 q 0,5.29638 -1.73215,7.49488 -1.73215,2.1985 -5.69611,2.22071 a 16.100121,16.100121 0 0 1 -5.9626,-1.11036 4.4802752,4.4802752 0 0 1 1.03263,-3.03126 10.892565,10.892565 0 0 0 4.48028,1.03263 q 2.18184,0 3.0146,-1.11035 a 4.9965894,4.9965894 0 0 0 0.83277,-3.06458 V 170.454 a 6.4011862,6.4011862 0 0 1 -4.16383,1.56559 4.9188647,4.9188647 0 0 1 -4.40255,-2.30953 z m 8.56083,-2.69816 v -7.72806 a 4.2915151,4.2915151 0 0 0 -2.86471,-1.36573 2.4039147,2.4039147 0 0 0 -2.18185,1.43235 8.6885138,8.6885138 0 0 0 -0.7828,4.09721 q 0,2.66485 0.71618,3.93065 a 2.1318781,2.1318781 0 0 0 1.88205,1.2658 4.2304457,4.2304457 0 0 0 3.23113,-1.63222 z"
id="path16-3"
style="stroke-width:0.555177" />
<path
class="cls-1"
d="m 174.20619,164.78009 h -9.32697 a 5.6405943,5.6405943 0 0 0 0.88273,3.04792 q 0.7828,1.0826 2.74813,1.0826 a 10.120869,10.120869 0 0 0 4.36369,-1.16587 4.3803434,4.3803434 0 0 1 1.19918,2.5316 10.759323,10.759323 0 0 1 -6.41229,1.8987 q -3.74744,0 -5.37966,-2.43167 -1.63222,-2.43167 -1.63222,-6.2957 0,-3.88624 1.79877,-6.2957 a 6.0181143,6.0181143 0 0 1 5.14649,-2.43168 q 3.33106,0 5.14648,2.01529 a 7.3449864,7.3449864 0 0 1 1.79878,5.07987 13.04665,13.04665 0 0 1 -0.33311,2.96464 z m -6.42895,-7.06184 q -2.73146,0 -2.93133,4.13051 h 5.79605 v -0.39973 a 4.7245529,4.7245529 0 0 0 -0.69953,-2.69816 2.4316735,2.4316735 0 0 0 -2.14298,-1.03262 z"
id="path18-6"
style="stroke-width:0.555177" />
<path
class="cls-2"
d="m 174.55595,111.039 a 1.4878733,1.4878733 0 0 1 -0.0666,0.51631 0.10548355,0.10548355 0 0 1 0,0.0444 l -0.0666,0.17211 v 0 l -0.0666,0.12769 -10.69826,21.15223 c -1.48787,2.93688 -4.22489,2.84806 -3.76409,-0.12214 l 2.15408,-12.02512 c 0.0722,-0.39418 0.70508,-3.17006 1.31022,-5.1798 l -20.64702,6.4456 c -3.24223,21.05785 -30.95109,21.40761 -35.47023,0 l -20.691432,-6.46226 c 0.605143,2.00974 1.243596,4.80228 1.315769,5.19646 l 2.154085,12.02512 c 0.460796,2.9702 -2.276224,3.05902 -3.764098,0.12214 L 75.49024,111.77183 a 0.77169547,0.77169547 0 0 1 -0.06662,-0.17766 1.5989086,1.5989086 0 0 1 0.838317,-2.13743 L 120.47065,89.897871 a 11.0036,11.0036 0 0 1 8.88282,0 l 44.20871,19.558869 a 1.5822533,1.5822533 0 0 1 0.99377,1.58226 z"
id="path24-31"
style="stroke-width:0.555177" />
<path
class="cls-3"
d="m 139.0413,114.72537 19.11473,-7.69475 a 0.81055784,0.81055784 0 0 0 0,-1.50453 c -2.2207,-0.92714 -4.96328,-1.99308 -7.65033,-3.10899 -0.49411,-0.20541 -5.17425,3.15341 -5.60173,3.49762 l -8.23882,6.58439 c -1.99309,1.67108 -0.26649,3.28665 2.37615,2.22626 z"
id="path26-9"
style="stroke-width:0.555177" />
<circle
class="cls-3"
cx="125.18409"
cy="122.24245"
r="9.9654207"
id="circle28-4"
style="stroke-width:0.555177" />
</g>
<path
class="cls-1"
d="m 162.59498,140.73295 1.36608,-0.009 h 0.0176 q 0.75796,0 0.75796,0.71389 v 2.30031 a 6.5748177,6.5748177 0 0 1 -2.28855,0.37897 q -1.25151,0 -1.88608,-0.85049 -0.63456,-0.8505 -0.63456,-2.31793 0,-1.46891 0.7888,-2.28268 a 2.5823345,2.5823345 0 0 1 1.93014,-0.81524 3.5371227,3.5371227 0 0 1 2.06675,0.64338 1.0385157,1.0385157 0 0 1 -0.18068,0.46711 1.2603203,1.2603203 0 0 1 -0.33931,0.35254 2.5926169,2.5926169 0 0 0 -1.5027,-0.52 1.4174931,1.4174931 0 0 0 -1.1854,0.54203 q -0.42305,0.53909 -0.42305,1.69658 0,2.17692 1.60405,2.17692 a 4.4742838,4.4742838 0 0 0 0.97829,-0.11457 v -0.83728 q 0,-0.3966 0.0176,-0.58756 h -0.64632 a 0.60518875,0.60518875 0 0 1 -0.40101,-0.11017 0.44067142,0.44067142 0 0 1 -0.12779,-0.35254 1.5100341,1.5100341 0 0 1 0.0881,-0.47445 z"
id="path8-6-4"
style="fill:#c3c3c3;fill-opacity:1;stroke-width:0.146891" />
<path
class="cls-1"
d="m 169.39307,143.50037 a 1.141339,1.141339 0 0 1 -0.14689,0.31288 1.0664248,1.0664248 0 0 1 -0.22474,0.25118 0.9959174,0.9959174 0 0 1 -0.80937,-0.51706 1.7847193,1.7847193 0 0 1 -1.26032,0.56406 q -0.67863,0 -1.02823,-0.3966 a 1.357268,1.357268 0 0 1 -0.34373,-0.9166 q 0,-0.73445 0.48034,-1.1149 a 1.9404232,1.9404232 0 0 1 1.23535,-0.36869 q 0.40541,0 0.76676,0.0352 v -0.2644 q 0,-0.69626 -0.66982,-0.69626 -0.47592,0 -1.34845,0.31728 a 1.2368178,1.2368178 0 0 1 -0.29378,-0.78439 4.9164242,4.9164242 0 0 1 1.90957,-0.39661 1.5526323,1.5526323 0 0 1 1.07524,0.37017 q 0.41423,0.37016 0.41423,1.1193 v 1.79794 q -0.003,0.48474 0.24384,0.68745 z m -2.21217,-0.22034 a 1.2471001,1.2471001 0 0 0 0.88134,-0.42304 v -0.77852 a 5.9182171,5.9182171 0 0 0 -0.66982,-0.0353 0.73445237,0.73445237 0 0 0 -0.54643,0.18215 0.63309793,0.63309793 0 0 0 -0.18508,0.46711 0.62281561,0.62281561 0 0 0 0.14689,0.44067 0.48767637,0.48767637 0 0 0 0.3731,0.14689 z"
id="path10-2-5"
style="fill:#c3c3c3;fill-opacity:1;stroke-width:0.146891" />
<path
class="cls-1"
d="m 170.10696,140.13364 a 1.0546736,1.0546736 0 0 1 0.14689,-0.32169 0.88134284,0.88134284 0 0 1 0.22474,-0.25118 1.1016786,1.1016786 0 0 1 0.92982,0.78439 q 0.35254,-0.78439 1.13693,-0.78439 a 2.7027846,2.7027846 0 0 1 0.51118,0.0617 1.9786147,1.9786147 0 0 1 -0.2644,1.02823 2.235673,2.235673 0 0 0 -0.39661,-0.0529 q -0.53762,0 -0.86371,0.57287 v 2.81736 a 3.0626663,3.0626663 0 0 1 -0.53762,0.0441 3.3784809,3.3784809 0 0 1 -0.55525,-0.0441 v -2.95249 q -0.006,-0.63751 -0.33197,-0.90191 z"
id="path12-6-0"
style="fill:#c3c3c3;fill-opacity:1;stroke-width:0.146891" />
<path
class="cls-1"
d="m 177.14889,143.50037 a 1.141339,1.141339 0 0 1 -0.15424,0.31288 1.0664248,1.0664248 0 0 1 -0.22474,0.25118 0.9959174,0.9959174 0 0 1 -0.8079,-0.51706 1.7847193,1.7847193 0 0 1 -1.26032,0.56406 q -0.67863,0 -1.02823,-0.3966 a 1.357268,1.357268 0 0 1 -0.34372,-0.9166 q 0,-0.73445 0.48033,-1.1149 a 1.9404232,1.9404232 0 0 1 1.22947,-0.37457 q 0.40542,0 0.76677,0.0353 v -0.26441 q 0,-0.69626 -0.66982,-0.69626 -0.47593,0 -1.34846,0.31729 a 1.2368178,1.2368178 0 0 1 -0.29378,-0.7844 4.9164242,4.9164242 0 0 1 1.90958,-0.3966 1.5526323,1.5526323 0 0 1 1.07524,0.37016 q 0.41423,0.37017 0.41423,1.11931 v 1.80381 q 0.009,0.48474 0.25559,0.68745 z m -2.21511,-0.22034 a 1.2471001,1.2471001 0 0 0 0.88134,-0.42304 v -0.77852 a 5.9182171,5.9182171 0 0 0 -0.66982,-0.0353 0.73445237,0.73445237 0 0 0 -0.54643,0.18509 0.63309793,0.63309793 0 0 0 -0.18508,0.46711 0.62281561,0.62281561 0 0 0 0.14689,0.44067 0.48767637,0.48767637 0 0 0 0.3731,0.14395 z"
id="path14-1-3"
style="fill:#c3c3c3;fill-opacity:1;stroke-width:0.146891" />
<path
class="cls-1"
d="m 178.23294,143.46511 q -0.3966,-0.60812 -0.3966,-1.67895 0,-1.07084 0.50236,-1.67896 a 1.5188475,1.5188475 0 0 1 1.20744,-0.60813 1.7039295,1.7039295 0 0 1 1.18981,0.44067 0.99738631,0.99738631 0 0 1 0.69626,-0.37897 0.82552446,0.82552446 0 0 1 0.23356,0.24677 1.0282333,1.0282333 0 0 1 0.14689,0.30847 q -0.24678,0.21152 -0.24678,0.75796 v 2.49714 q 0,1.40133 -0.45829,1.98302 -0.4583,0.58168 -1.5071,0.58756 a 4.2598236,4.2598236 0 0 1 -1.5776,-0.29378 1.1854061,1.1854061 0 0 1 0.27321,-0.80203 2.8819911,2.8819911 0 0 0 1.18541,0.27322 q 0.57728,0 0.79761,-0.29378 a 1.3220143,1.3220143 0 0 0 0.22034,-0.81084 v -0.35253 a 1.6936472,1.6936472 0 0 1 -1.10168,0.41423 1.3014496,1.3014496 0 0 1 -1.16484,-0.61107 z m 2.26505,-0.71388 v -2.04472 a 1.1354634,1.1354634 0 0 0 -0.75795,-0.36135 0.63603576,0.63603576 0 0 0 -0.57728,0.37898 2.2988359,2.2988359 0 0 0 -0.20712,1.08405 q 0,0.70508 0.18949,1.03998 a 0.56405941,0.56405941 0 0 0 0.49796,0.33491 1.1193054,1.1193054 0 0 0 0.8549,-0.43185 z"
id="path16-8-6"
style="fill:#c3c3c3;fill-opacity:1;stroke-width:0.146891" />
<path
class="cls-1"
d="m 186.09746,142.16073 h -2.46776 a 1.4924072,1.4924072 0 0 0 0.23355,0.80643 q 0.20712,0.28643 0.72711,0.28643 a 2.6778132,2.6778132 0 0 0 1.15456,-0.30847 1.1589658,1.1589658 0 0 1 0.31728,0.66982 2.8467375,2.8467375 0 0 1 -1.69658,0.50237 q -0.99151,0 -1.42337,-0.64338 -0.43186,-0.64338 -0.43186,-1.66574 0,-1.02823 0.47593,-1.66574 a 1.5922927,1.5922927 0 0 1 1.36167,-0.64338 q 0.88134,0 1.36167,0.53321 a 1.943361,1.943361 0 0 1 0.47593,1.34405 3.4519261,3.4519261 0 0 1 -0.0881,0.7844 z m -1.701,-1.86845 q -0.7227,0 -0.77558,1.09287 h 1.53354 v -0.10577 a 1.2500379,1.2500379 0 0 0 -0.18508,-0.71388 0.64338027,0.64338027 0 0 0 -0.567,-0.27322 z"
id="path18-7-1"
style="fill:#c3c3c3;fill-opacity:1;stroke-width:0.146891" />
<path
id="path24-3-6-9-0"
style="fill:#ff9329;fill-opacity:1;stroke-width:0.146891"
d="m 173.02622,117.15529 a 2.9113691,2.9113691 0 0 0 -1.14618,0.24753 l -11.69696,5.17488 a 0.42304456,0.42304456 0 0 0 -0.22169,0.56586 0.20417776,0.20417776 0 0 0 0.0176,0.047 l 0.79634,1.57355 11.10475,-4.91288 a 2.9113691,2.9113691 0 0 1 1.14618,-0.24753 2.9113691,2.9113691 0 0 1 1.20406,0.24753 l 11.12387,4.92115 0.7829,-1.54823 0.0176,-0.0336 0.0181,-0.0455 a 0.02790919,0.02790919 0 0 0 0,-0.0119 0.39366647,0.39366647 0 0 0 0.0176,-0.13642 0.41863785,0.41863785 0 0 0 -0.26303,-0.4191 l -11.69697,-5.17488 a 2.9113691,2.9113691 0 0 0 -1.20406,-0.24753 z m -10.12134,9.52449 c 0.0218,0.0723 0.0408,0.14674 0.0615,0.22066 h 0.51831 l -0.008,-0.0419 z m 20.32227,0.005 -0.57103,0.17828 -0.007,0.0377 h 0.5178 c 0.0202,-0.0723 0.0386,-0.14514 0.0599,-0.216 z" />
<path
class="cls-2"
d="m 186.19,127.94173 a 0.39366647,0.39366647 0 0 1 -0.0176,0.13661 0.02790919,0.02790919 0 0 1 0,0.0117 l -0.0176,0.0455 v 0 l -0.0176,0.0338 -2.83058,5.59652 c -0.39366,0.77705 -1.11783,0.75355 -0.99591,-0.0323 l 0.56993,-3.18165 c 0.0191,-0.10429 0.18655,-0.83874 0.34666,-1.37049 l -5.46285,1.7054 c -0.85784,5.57156 -8.18915,5.6641 -9.38484,0 l -5.4746,-1.70981 c 0.16011,0.53175 0.32903,1.27061 0.34813,1.3749 l 0.56993,3.18165 c 0.12192,0.78586 -0.60225,0.80936 -0.99592,0.0323 l -2.84822,-5.63031 a 0.20417776,0.20417776 0 0 1 -0.0176,-0.047 0.42304456,0.42304456 0 0 1 0.22181,-0.56553 l 11.69688,-5.17495 a 2.9113691,2.9113691 0 0 1 2.35025,0 l 11.69689,5.17495 a 0.41863785,0.41863785 0 0 1 0.26293,0.41864 z"
id="path24-0-3"
style="fill:#ff9329;fill-opacity:1;stroke-width:0.146891" />
<path
class="cls-3"
d="m 176.79341,128.91708 5.05744,-2.0359 a 0.21446009,0.21446009 0 0 0 0,-0.39807 c -0.58756,-0.24531 -1.3132,-0.52734 -2.02415,-0.82259 -0.13073,-0.0543 -1.36902,0.83434 -1.48212,0.92541 l -2.17986,1.74212 c -0.52734,0.44214 -0.0705,0.86959 0.62869,0.58903 z"
id="path26-2-2"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.146891" />
<circle
class="cls-3"
cx="173.12703"
cy="130.90596"
r="2.6366842"
id="circle28-3-0"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.146891" />
</g>
</g>
</svg>

After

Width:  |  Height:  |  Size: 17 KiB

BIN
doc/logo/garage-notext.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 7.8 KiB

146
doc/logo/garage-notext.svg Normal file
View file

@ -0,0 +1,146 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
id="Calque_1"
data-name="Calque 1"
width="250"
height="250"
viewBox="0 0 249.99999 250"
version="1.1"
sodipodi:docname="garage-notext.svg"
inkscape:version="1.0.2 (e86c870879, 2021-01-15)"
inkscape:export-filename="/home/lx/Deuxfleurs/garage/garage-notext.png"
inkscape:export-xdpi="96"
inkscape:export-ydpi="96">
<metadata
id="metadata33">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
<dc:title></dc:title>
</cc:Work>
</rdf:RDF>
</metadata>
<sodipodi:namedview
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1"
objecttolerance="10"
gridtolerance="10"
guidetolerance="10"
inkscape:pageopacity="1"
inkscape:pageshadow="2"
inkscape:window-width="1920"
inkscape:window-height="1039"
id="namedview31"
showgrid="false"
inkscape:zoom="2.1842656"
inkscape:cx="143.86571"
inkscape:cy="118.5836"
inkscape:window-x="0"
inkscape:window-y="20"
inkscape:window-maximized="0"
inkscape:current-layer="Calque_1"
inkscape:document-rotation="0"
units="px"
showguides="false"
inkscape:guide-bbox="true"
inkscape:snap-global="false"
width="250mm">
<sodipodi:guide
position="102.90662,161.07694"
orientation="0,-1"
id="guide1016" />
<sodipodi:guide
position="122.45269,170.65683"
orientation="0,-1"
id="guide1018" />
<sodipodi:guide
position="128.86504,180.08221"
orientation="0,-1"
id="guide1020" />
</sodipodi:namedview>
<defs
id="defs4">
<style
id="style2">.cls-1{fill:#3b2100;}.cls-2{fill:#ffd952;}.cls-3{fill:#45c8ff;}</style>
</defs>
<rect
style="fill:#ffffff;stroke-width:3.60793"
id="rect3824"
width="251.68179"
height="250.98253"
x="-0.59092933"
y="-0.31321606" />
<g
id="g1719"
transform="matrix(1.9099251,0,0,1.9099251,-113.74064,-74.610597)">
<path
d="m 138.41049,100.63656 a 8.327649,8.327649 0 0 1 -2.77589,-0.28869 l -34.78736,-9.388039 a 8.4442361,8.4442361 0 0 1 -2.620438,-1.238044 z"
id="path6"
style="stroke-width:0.555177" />
<path
id="path24-3-6"
style="fill:#ffd952;fill-opacity:1;stroke-width:0.555177"
d="m 124.88254,70.600847 a 11.0036,11.0036 0 0 0 -4.33203,0.935547 L 76.341524,91.094987 a 1.5989086,1.5989086 0 0 0 -0.837891,2.138672 0.77169547,0.77169547 0 0 0 0.06641,0.177735 l 7.09375,14.021486 h 6.15625 l -0.875,-4.88867 c -0.07217,-0.39418 -0.711263,-3.187537 -1.316406,-5.197269 l 20.691403,6.462899 c 0.27198,1.28839 0.63292,2.49204 1.0625,3.62304 h 33.54883 c 0.36964,-1.13128 0.66138,-2.33705 0.85938,-3.62304 l 20.64648,-6.445321 c -0.60514,2.009734 -1.23639,4.785511 -1.30859,5.179691 l -0.875,4.88867 h 6.15429 l 7.02735,-13.894533 0.0664,-0.126953 0.0684,-0.171875 a 0.10548355,0.10548355 0 0 0 0,-0.04492 1.4878733,1.4878733 0 0 0 0.0664,-0.515625 1.5822533,1.5822533 0 0 0 -0.99414,-1.583985 L 129.43333,71.536394 a 11.0036,11.0036 0 0 0 -4.55079,-0.935547 z" />
<path
id="path24-3"
style="fill:#49c8fa;fill-opacity:1;stroke-width:0.555177"
d="m 124.88254,79.854518 a 11.0036,11.0036 0 0 0 -4.33203,0.935547 L 76.341524,100.34866 a 1.5989086,1.5989086 0 0 0 -0.837891,2.13672 0.77169547,0.77169547 0 0 0 0.06641,0.17773 l 3.847657,7.60352 h 8.175781 c -0.257897,-1.08856 -0.591943,-2.42953 -0.964844,-3.66797 l 11.744141,3.66797 h 53.371092 l 11.69336,-3.65039 c -0.37193,1.23522 -0.70076,2.56719 -0.95703,3.65039 h 8.17383 l 3.78125,-7.47656 0.0664,-0.12696 0.0684,-0.17187 a 0.10548355,0.10548355 0 0 0 0,-0.0449 1.4878733,1.4878733 0 0 0 0.0664,-0.51563 1.5822533,1.5822533 0 0 0 -0.99414,-1.58203 L 129.43333,80.790065 a 11.0036,11.0036 0 0 0 -4.55079,-0.935547 z" />
<path
class="cls-2"
d="m 174.63576,111.36813 a 1.4878733,1.4878733 0 0 1 -0.0666,0.51631 0.10548355,0.10548355 0 0 1 0,0.0444 l -0.0666,0.17211 v 0 l -0.0666,0.12769 -10.69826,21.15223 c -1.48787,2.93688 -4.22489,2.84806 -3.76409,-0.12214 l 2.15408,-12.02512 c 0.0722,-0.39418 0.70508,-3.17006 1.31022,-5.1798 l -20.64702,6.4456 c -3.24223,21.05785 -30.95109,21.40761 -35.47023,0 l -20.691437,-6.46226 c 0.605143,2.00974 1.243596,4.80228 1.315769,5.19646 l 2.154085,12.02512 c 0.460796,2.9702 -2.276224,3.05902 -3.764098,0.12214 L 75.570045,112.10096 a 0.77169547,0.77169547 0 0 1 -0.06662,-0.17766 1.5989086,1.5989086 0 0 1 0.838317,-2.13743 L 120.55046,90.226998 a 11.0036,11.0036 0 0 1 8.88282,0 l 44.20871,19.558872 a 1.5822533,1.5822533 0 0 1 0.99377,1.58226 z"
id="path24"
style="stroke-width:0.555177" />
<path
class="cls-3"
d="m 139.12111,115.0545 19.11473,-7.69475 a 0.81055784,0.81055784 0 0 0 0,-1.50453 c -2.2207,-0.92714 -4.96328,-1.99308 -7.65033,-3.10899 -0.49411,-0.20541 -5.17425,3.15341 -5.60173,3.49762 l -8.23882,6.58439 c -1.99309,1.67108 -0.26649,3.28665 2.37615,2.22626 z"
id="path26"
style="stroke-width:0.555177" />
<circle
class="cls-3"
cx="125.26389"
cy="122.57157"
r="9.9654207"
id="circle28"
style="stroke-width:0.555177" />
<path
d="m 138.41049,100.63656 a 8.327649,8.327649 0 0 1 -2.77589,-0.28869 l -34.78736,-9.388039 a 8.4442361,8.4442361 0 0 1 -2.620438,-1.238044 z"
id="path6-0"
style="stroke-width:0.555177" />
<path
id="path24-3-6-9"
style="fill:#ff9329;fill-opacity:1;stroke-width:0.555177"
d="m 124.88254,70.600847 a 11.0036,11.0036 0 0 0 -4.33203,0.935547 L 76.341524,91.094987 a 1.5989086,1.5989086 0 0 0 -0.837891,2.138672 0.77169547,0.77169547 0 0 0 0.06641,0.177735 l 7.09375,14.021486 h 6.15625 l -0.875,-4.88867 c -0.07217,-0.39418 -0.711263,-3.187537 -1.316406,-5.197269 l 20.691403,6.462899 c 0.27198,1.28839 0.63292,2.49204 1.0625,3.62304 h 33.54883 c 0.36964,-1.13128 0.66138,-2.33705 0.85938,-3.62304 l 20.64648,-6.445321 c -0.60514,2.009734 -1.23639,4.785511 -1.30859,5.179691 l -0.875,4.88867 h 6.15429 l 7.02735,-13.894533 0.0664,-0.126953 0.0684,-0.171875 a 0.10548355,0.10548355 0 0 0 0,-0.04492 1.4878733,1.4878733 0 0 0 0.0664,-0.515625 1.5822533,1.5822533 0 0 0 -0.99414,-1.583985 L 129.43333,71.536394 a 11.0036,11.0036 0 0 0 -4.55079,-0.935547 z" />
<path
id="path24-3-2"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177"
d="m 124.88254,79.854518 a 11.0036,11.0036 0 0 0 -4.33203,0.935547 L 76.341524,100.34866 a 1.5989086,1.5989086 0 0 0 -0.837891,2.13672 0.77169547,0.77169547 0 0 0 0.06641,0.17773 l 3.847657,7.60352 h 8.175781 c -0.257897,-1.08856 -0.591943,-2.42953 -0.964844,-3.66797 l 11.744141,3.66797 h 53.371092 l 11.69336,-3.65039 c -0.37193,1.23522 -0.70076,2.56719 -0.95703,3.65039 h 8.17383 l 3.78125,-7.47656 0.0664,-0.12696 0.0684,-0.17187 a 0.10548355,0.10548355 0 0 0 0,-0.0449 1.4878733,1.4878733 0 0 0 0.0664,-0.51563 1.5822533,1.5822533 0 0 0 -0.99414,-1.58203 L 129.43333,80.790065 a 11.0036,11.0036 0 0 0 -4.55079,-0.935547 z" />
<path
class="cls-2"
d="m 174.63576,111.36813 a 1.4878733,1.4878733 0 0 1 -0.0666,0.51631 0.10548355,0.10548355 0 0 1 0,0.0444 l -0.0666,0.17211 v 0 l -0.0666,0.12769 -10.69826,21.15223 c -1.48787,2.93688 -4.22489,2.84806 -3.76409,-0.12214 l 2.15408,-12.02512 c 0.0722,-0.39418 0.70508,-3.17006 1.31022,-5.1798 l -20.64702,6.4456 c -3.24223,21.05785 -30.95109,21.40761 -35.47023,0 l -20.691437,-6.46226 c 0.605143,2.00974 1.243596,4.80228 1.315769,5.19646 l 2.154085,12.02512 c 0.460796,2.9702 -2.276224,3.05902 -3.764098,0.12214 L 75.570045,112.10096 a 0.77169547,0.77169547 0 0 1 -0.06662,-0.17766 1.5989086,1.5989086 0 0 1 0.838317,-2.13743 L 120.55046,90.226998 a 11.0036,11.0036 0 0 1 8.88282,0 l 44.20871,19.558872 a 1.5822533,1.5822533 0 0 1 0.99377,1.58226 z"
id="path24-0"
style="fill:#ff9329;fill-opacity:1;stroke-width:0.555177" />
<path
class="cls-3"
d="m 139.12111,115.0545 19.11473,-7.69475 a 0.81055784,0.81055784 0 0 0 0,-1.50453 c -2.2207,-0.92714 -4.96328,-1.99308 -7.65033,-3.10899 -0.49411,-0.20541 -5.17425,3.15341 -5.60173,3.49762 l -8.23882,6.58439 c -1.99309,1.67108 -0.26649,3.28665 2.37615,2.22626 z"
id="path26-2"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177" />
<circle
class="cls-3"
cx="125.26389"
cy="122.57157"
r="9.9654207"
id="circle28-3"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177" />
</g>
</svg>

After

Width:  |  Height:  |  Size: 8.8 KiB

BIN
doc/logo/garage.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 12 KiB

206
doc/logo/garage.svg Normal file
View file

@ -0,0 +1,206 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
id="Calque_1"
data-name="Calque 1"
width="250"
height="250"
viewBox="0 0 249.99999 250"
version="1.1"
sodipodi:docname="garage.svg"
inkscape:version="1.0.2 (e86c870879, 2021-01-15)"
inkscape:export-filename="/home/lx/Deuxfleurs/garage/doc/logo/garage.png"
inkscape:export-xdpi="96"
inkscape:export-ydpi="96">
<metadata
id="metadata33">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
<dc:title />
</cc:Work>
</rdf:RDF>
</metadata>
<sodipodi:namedview
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1"
objecttolerance="10"
gridtolerance="10"
guidetolerance="10"
inkscape:pageopacity="1"
inkscape:pageshadow="2"
inkscape:window-width="1920"
inkscape:window-height="1080"
id="namedview31"
showgrid="false"
inkscape:zoom="2.1842656"
inkscape:cx="90.853672"
inkscape:cy="123.63257"
inkscape:window-x="0"
inkscape:window-y="0"
inkscape:window-maximized="0"
inkscape:current-layer="Calque_1"
inkscape:document-rotation="0"
units="px"
showguides="false"
inkscape:guide-bbox="true"
inkscape:snap-global="false"
width="250mm">
<sodipodi:guide
position="102.90662,161.07694"
orientation="0,-1"
id="guide1016" />
<sodipodi:guide
position="122.45269,170.65683"
orientation="0,-1"
id="guide1018" />
<sodipodi:guide
position="128.86504,180.08221"
orientation="0,-1"
id="guide1020" />
</sodipodi:namedview>
<defs
id="defs4">
<style
id="style2">.cls-1{fill:#3b2100;}.cls-2{fill:#ffd952;}.cls-3{fill:#45c8ff;}</style>
</defs>
<rect
style="fill:#ffffff;stroke-width:3.60793"
id="rect3824"
width="251.68179"
height="250.98253"
x="-0.59092933"
y="-0.31321606" />
<g
id="g1663"
transform="matrix(1.7099534,0,0,1.7099534,-88.607712,-87.994557)">
<path
d="m 138.33068,100.19817 a 8.327649,8.327649 0 0 1 -2.77589,-0.288688 l -34.78736,-9.388036 a 8.4442361,8.4442361 0 0 1 -2.620433,-1.238044 z"
id="path6"
style="stroke-width:0.555177" />
<path
class="cls-1"
d="m 85.377935,159.27452 5.163143,-0.0333 h 0.06662 q 2.864711,0 2.864711,2.69816 v 8.69407 a 24.849705,24.849705 0 0 1 -8.649651,1.43235 q -4.730105,0 -7.128468,-3.21447 -2.398363,-3.21447 -2.398363,-8.76068 0,-5.55177 2.981299,-8.62745 a 9.7600046,9.7600046 0 0 1 7.29502,-3.08123 13.368653,13.368653 0 0 1 7.811335,2.43167 3.9250986,3.9250986 0 0 1 -0.682867,1.76547 4.7634152,4.7634152 0 0 1 -1.282458,1.33242 9.798867,9.798867 0 0 0 -5.679457,-1.96533 5.3574542,5.3574542 0 0 0 -4.480275,2.04861 q -1.598909,2.03749 -1.598909,6.41229 0,8.22771 6.062529,8.22771 a 16.910679,16.910679 0 0 0 3.697476,-0.43303 v -3.16451 q 0,-1.49898 0.06662,-2.22071 h -2.442777 a 2.2873276,2.2873276 0 0 1 -1.515632,-0.41638 1.6655298,1.6655298 0 0 1 -0.483004,-1.33242 5.7072154,5.7072154 0 0 1 0.333106,-1.79322 z"
id="path8"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177" />
<path
class="cls-1"
d="m 111.07151,169.73404 a 4.3137222,4.3137222 0 0 1 -0.55518,1.18253 4.0305821,4.0305821 0 0 1 -0.84942,0.94935 3.7640973,3.7640973 0 0 1 -3.05902,-1.95422 6.7453957,6.7453957 0 0 1 -4.76342,2.13188 q -2.564913,0 -3.886233,-1.49898 a 5.1298318,5.1298318 0 0 1 -1.299113,-3.4643 q 0,-2.77588 1.815427,-4.21379 a 7.3338829,7.3338829 0 0 1 4.669039,-1.3935 q 1.53228,0 2.89802,0.13325 v -0.99932 q 0,-2.63154 -2.53161,-2.63154 -1.79877,0 -5.096518,1.19918 a 4.674587,4.674587 0 0 1 -1.110353,-2.96464 18.581761,18.581761 0 0 1 7.217291,-1.49898 5.8682167,5.8682167 0 0 1 4.0639,1.39905 q 1.56559,1.39904 1.56559,4.23044 v 6.79537 q -0.0111,1.83208 0.9216,2.59822 z m -8.36096,-0.83276 a 4.7134493,4.7134493 0 0 0 3.33106,-1.59891 v -2.94244 a 22.368065,22.368065 0 0 0 -2.53161,-0.13324 2.775883,2.775883 0 0 0 -2.06525,0.68842 2.3928111,2.3928111 0 0 0 -0.69953,1.76546 2.3539488,2.3539488 0 0 0 0.55518,1.66553 1.8431863,1.8431863 0 0 0 1.41015,0.55518 z"
id="path10"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177" />
<path
class="cls-1"
d="m 113.76966,157.00939 a 3.986168,3.986168 0 0 1 0.55518,-1.21583 3.3310596,3.3310596 0 0 1 0.84942,-0.94935 4.1638245,4.1638245 0 0 1 3.51427,2.96464 q 1.33242,-2.96464 4.29707,-2.96464 a 10.215249,10.215249 0 0 1 1.93201,0.23317 7.4782288,7.4782288 0 0 1 -0.99932,3.88624 8.4497879,8.4497879 0 0 0 -1.49897,-0.19987 q -2.03195,0 -3.26444,2.16519 v 10.64829 a 11.575432,11.575432 0 0 1 -2.03195,0.16655 12.769062,12.769062 0 0 1 -2.09857,-0.16655 v -11.15905 q -0.0222,-2.40947 -1.2547,-3.40879 z"
id="path12"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177" />
<path
class="cls-1"
d="m 140.38483,169.73404 a 4.3137222,4.3137222 0 0 1 -0.58293,1.18253 4.0305821,4.0305821 0 0 1 -0.84942,0.94935 3.7640973,3.7640973 0 0 1 -3.05348,-1.95422 6.7453957,6.7453957 0 0 1 -4.76341,2.13188 q -2.56492,0 -3.88624,-1.49898 a 5.1298318,5.1298318 0 0 1 -1.29911,-3.4643 q 0,-2.77588 1.81543,-4.21379 a 7.3338829,7.3338829 0 0 1 4.64682,-1.4157 q 1.53229,0 2.89803,0.13324 v -0.99932 q 0,-2.63153 -2.53161,-2.63153 -1.79877,0 -5.09652,1.19918 a 4.674587,4.674587 0 0 1 -1.11035,-2.96465 18.581761,18.581761 0 0 1 7.21729,-1.49897 5.8682167,5.8682167 0 0 1 4.0639,1.39904 q 1.56559,1.39905 1.56559,4.23045 v 6.81757 q 0.0333,1.83208 0.96601,2.59822 z m -8.37206,-0.83276 a 4.7134493,4.7134493 0 0 0 3.33106,-1.59891 v -2.94244 a 22.368065,22.368065 0 0 0 -2.53161,-0.13324 2.775883,2.775883 0 0 0 -2.06526,0.69952 2.3928111,2.3928111 0 0 0 -0.69952,1.76546 2.3539488,2.3539488 0 0 0 0.55518,1.66553 1.8431863,1.8431863 0 0 0 1.41015,0.54408 z"
id="path14"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177" />
<path
class="cls-1"
d="m 144.48203,169.6008 q -1.49897,-2.29843 -1.49897,-6.34567 0,-4.04724 1.8987,-6.34567 a 5.740526,5.740526 0 0 1 4.56355,-2.29843 6.4400486,6.4400486 0 0 1 4.49693,1.66553 3.7696491,3.7696491 0 0 1 2.63154,-1.43235 3.1200925,3.1200925 0 0 1 0.88273,0.93269 3.8862362,3.8862362 0 0 1 0.55518,1.16587 q -0.9327,0.79946 -0.9327,2.86472 v 9.438 q 0,5.29638 -1.73215,7.49488 -1.73215,2.1985 -5.69611,2.22071 a 16.100121,16.100121 0 0 1 -5.9626,-1.11036 4.4802752,4.4802752 0 0 1 1.03263,-3.03126 10.892565,10.892565 0 0 0 4.48028,1.03263 q 2.18184,0 3.0146,-1.11035 a 4.9965894,4.9965894 0 0 0 0.83277,-3.06458 v -1.33242 a 6.4011862,6.4011862 0 0 1 -4.16383,1.56559 4.9188647,4.9188647 0 0 1 -4.40255,-2.30953 z m 8.56083,-2.69816 v -7.72806 a 4.2915151,4.2915151 0 0 0 -2.86471,-1.36573 2.4039147,2.4039147 0 0 0 -2.18185,1.43235 8.6885138,8.6885138 0 0 0 -0.7828,4.09721 q 0,2.66485 0.71618,3.93065 a 2.1318781,2.1318781 0 0 0 1.88205,1.2658 4.2304457,4.2304457 0 0 0 3.23113,-1.63222 z"
id="path16"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177" />
<path
class="cls-1"
d="m 174.20619,164.67083 h -9.32697 a 5.6405943,5.6405943 0 0 0 0.88273,3.04792 q 0.7828,1.0826 2.74813,1.0826 a 10.120869,10.120869 0 0 0 4.36369,-1.16587 4.3803434,4.3803434 0 0 1 1.19918,2.5316 10.759323,10.759323 0 0 1 -6.41229,1.8987 q -3.74744,0 -5.37966,-2.43167 -1.63222,-2.43167 -1.63222,-6.2957 0,-3.88624 1.79877,-6.2957 a 6.0181143,6.0181143 0 0 1 5.14649,-2.43168 q 3.33106,0 5.14648,2.01529 a 7.3449864,7.3449864 0 0 1 1.79878,5.07987 13.04665,13.04665 0 0 1 -0.33311,2.96464 z m -6.42895,-7.06184 q -2.73146,0 -2.93133,4.13051 h 5.79605 v -0.39973 a 4.7245529,4.7245529 0 0 0 -0.69953,-2.69816 2.4316735,2.4316735 0 0 0 -2.14298,-1.03262 z"
id="path18"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177" />
<path
id="path24-3-6"
style="fill:#ffd952;fill-opacity:1;stroke-width:0.555177"
d="m 124.80273,70.162462 a 11.0036,11.0036 0 0 0 -4.33203,0.935547 L 76.261719,90.656602 a 1.5989086,1.5989086 0 0 0 -0.837891,2.138672 0.77169547,0.77169547 0 0 0 0.06641,0.177735 l 7.09375,14.021481 h 6.15625 l -0.875,-4.88867 c -0.07217,-0.39418 -0.711263,-3.187532 -1.316406,-5.197264 l 20.691398,6.462894 c 0.27198,1.28839 0.63292,2.49204 1.0625,3.62304 h 33.54883 c 0.36964,-1.13128 0.66138,-2.33705 0.85938,-3.62304 l 20.64648,-6.445316 c -0.60514,2.009734 -1.23639,4.785506 -1.30859,5.179686 l -0.875,4.88867 h 6.15429 l 7.02735,-13.894528 0.0664,-0.126953 0.0684,-0.171875 a 0.10548355,0.10548355 0 0 0 0,-0.04492 1.4878733,1.4878733 0 0 0 0.0664,-0.515625 1.5822533,1.5822533 0 0 0 -0.99414,-1.583985 L 129.35352,71.098009 a 11.0036,11.0036 0 0 0 -4.55079,-0.935547 z" />
<path
id="path24-3"
style="fill:#49c8fa;fill-opacity:1;stroke-width:0.555177"
d="M 124.80273,79.416133 A 11.0036,11.0036 0 0 0 120.4707,80.35168 L 76.261719,99.910272 a 1.5989086,1.5989086 0 0 0 -0.837891,2.136718 0.77169547,0.77169547 0 0 0 0.06641,0.17773 l 3.847657,7.60352 h 8.175781 c -0.257897,-1.08856 -0.591943,-2.42953 -0.964844,-3.66797 l 11.744141,3.66797 h 53.371087 l 11.69336,-3.65039 c -0.37193,1.23522 -0.70076,2.56719 -0.95703,3.65039 h 8.17383 l 3.78125,-7.47656 0.0664,-0.12696 0.0684,-0.17187 a 0.10548355,0.10548355 0 0 0 0,-0.0449 1.4878733,1.4878733 0 0 0 0.0664,-0.51563 1.5822533,1.5822533 0 0 0 -0.99414,-1.582028 L 129.35352,80.35168 a 11.0036,11.0036 0 0 0 -4.55079,-0.935547 z" />
<path
class="cls-2"
d="m 174.55595,110.92974 a 1.4878733,1.4878733 0 0 1 -0.0666,0.51631 0.10548355,0.10548355 0 0 1 0,0.0444 l -0.0666,0.17211 v 0 l -0.0666,0.12769 -10.69826,21.15223 c -1.48787,2.93688 -4.22489,2.84806 -3.76409,-0.12214 l 2.15408,-12.02512 c 0.0722,-0.39418 0.70508,-3.17006 1.31022,-5.1798 l -20.64702,6.4456 c -3.24223,21.05785 -30.95109,21.40761 -35.47023,0 l -20.691432,-6.46226 c 0.605143,2.00974 1.243596,4.80228 1.315769,5.19646 l 2.154085,12.02512 c 0.460796,2.9702 -2.276224,3.05902 -3.764098,0.12214 L 75.49024,111.66257 a 0.77169547,0.77169547 0 0 1 -0.06662,-0.17766 1.5989086,1.5989086 0 0 1 0.838317,-2.13743 L 120.47065,89.788613 a 11.0036,11.0036 0 0 1 8.88282,0 l 44.20871,19.558867 a 1.5822533,1.5822533 0 0 1 0.99377,1.58226 z"
id="path24"
style="stroke-width:0.555177" />
<path
class="cls-3"
d="m 139.0413,114.61611 19.11473,-7.69475 a 0.81055784,0.81055784 0 0 0 0,-1.50453 c -2.2207,-0.92714 -4.96328,-1.99308 -7.65033,-3.10899 -0.49411,-0.20541 -5.17425,3.15341 -5.60173,3.49762 l -8.23882,6.58439 c -1.99309,1.67108 -0.26649,3.28665 2.37615,2.22626 z"
id="path26"
style="stroke-width:0.555177" />
<circle
class="cls-3"
cx="125.18409"
cy="122.13319"
r="9.9654207"
id="circle28"
style="stroke-width:0.555177" />
<path
d="m 138.33068,100.19817 a 8.327649,8.327649 0 0 1 -2.77589,-0.288688 l -34.78736,-9.388036 a 8.4442361,8.4442361 0 0 1 -2.620433,-1.238044 z"
id="path6-0"
style="stroke-width:0.555177" />
<path
class="cls-1"
d="m 85.377935,159.27452 5.163143,-0.0333 h 0.06662 q 2.864711,0 2.864711,2.69816 v 8.69407 a 24.849705,24.849705 0 0 1 -8.649651,1.43235 q -4.730105,0 -7.128468,-3.21447 -2.398363,-3.21447 -2.398363,-8.76068 0,-5.55177 2.981299,-8.62745 a 9.7600046,9.7600046 0 0 1 7.29502,-3.08123 13.368653,13.368653 0 0 1 7.811335,2.43167 3.9250986,3.9250986 0 0 1 -0.682867,1.76547 4.7634152,4.7634152 0 0 1 -1.282458,1.33242 9.798867,9.798867 0 0 0 -5.679457,-1.96533 5.3574542,5.3574542 0 0 0 -4.480275,2.04861 q -1.598909,2.03749 -1.598909,6.41229 0,8.22771 6.062529,8.22771 a 16.910679,16.910679 0 0 0 3.697476,-0.43303 v -3.16451 q 0,-1.49898 0.06662,-2.22071 h -2.442777 a 2.2873276,2.2873276 0 0 1 -1.515632,-0.41638 1.6655298,1.6655298 0 0 1 -0.483004,-1.33242 5.7072154,5.7072154 0 0 1 0.333106,-1.79322 z"
id="path8-6"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177" />
<path
class="cls-1"
d="m 111.07151,169.73404 a 4.3137222,4.3137222 0 0 1 -0.55518,1.18253 4.0305821,4.0305821 0 0 1 -0.84942,0.94935 3.7640973,3.7640973 0 0 1 -3.05902,-1.95422 6.7453957,6.7453957 0 0 1 -4.76342,2.13188 q -2.564913,0 -3.886233,-1.49898 a 5.1298318,5.1298318 0 0 1 -1.299113,-3.4643 q 0,-2.77588 1.815427,-4.21379 a 7.3338829,7.3338829 0 0 1 4.669039,-1.3935 q 1.53228,0 2.89802,0.13325 v -0.99932 q 0,-2.63154 -2.53161,-2.63154 -1.79877,0 -5.096518,1.19918 a 4.674587,4.674587 0 0 1 -1.110353,-2.96464 18.581761,18.581761 0 0 1 7.217291,-1.49898 5.8682167,5.8682167 0 0 1 4.0639,1.39905 q 1.56559,1.39904 1.56559,4.23044 v 6.79537 q -0.0111,1.83208 0.9216,2.59822 z m -8.36096,-0.83276 a 4.7134493,4.7134493 0 0 0 3.33106,-1.59891 v -2.94244 a 22.368065,22.368065 0 0 0 -2.53161,-0.13324 2.775883,2.775883 0 0 0 -2.06525,0.68842 2.3928111,2.3928111 0 0 0 -0.69953,1.76546 2.3539488,2.3539488 0 0 0 0.55518,1.66553 1.8431863,1.8431863 0 0 0 1.41015,0.55518 z"
id="path10-2"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177" />
<path
class="cls-1"
d="m 113.76966,157.00939 a 3.986168,3.986168 0 0 1 0.55518,-1.21583 3.3310596,3.3310596 0 0 1 0.84942,-0.94935 4.1638245,4.1638245 0 0 1 3.51427,2.96464 q 1.33242,-2.96464 4.29707,-2.96464 a 10.215249,10.215249 0 0 1 1.93201,0.23317 7.4782288,7.4782288 0 0 1 -0.99932,3.88624 8.4497879,8.4497879 0 0 0 -1.49897,-0.19987 q -2.03195,0 -3.26444,2.16519 v 10.64829 a 11.575432,11.575432 0 0 1 -2.03195,0.16655 12.769062,12.769062 0 0 1 -2.09857,-0.16655 v -11.15905 q -0.0222,-2.40947 -1.2547,-3.40879 z"
id="path12-6"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177" />
<path
class="cls-1"
d="m 140.38483,169.73404 a 4.3137222,4.3137222 0 0 1 -0.58293,1.18253 4.0305821,4.0305821 0 0 1 -0.84942,0.94935 3.7640973,3.7640973 0 0 1 -3.05348,-1.95422 6.7453957,6.7453957 0 0 1 -4.76341,2.13188 q -2.56492,0 -3.88624,-1.49898 a 5.1298318,5.1298318 0 0 1 -1.29911,-3.4643 q 0,-2.77588 1.81543,-4.21379 a 7.3338829,7.3338829 0 0 1 4.64682,-1.4157 q 1.53229,0 2.89803,0.13324 v -0.99932 q 0,-2.63153 -2.53161,-2.63153 -1.79877,0 -5.09652,1.19918 a 4.674587,4.674587 0 0 1 -1.11035,-2.96465 18.581761,18.581761 0 0 1 7.21729,-1.49897 5.8682167,5.8682167 0 0 1 4.0639,1.39904 q 1.56559,1.39905 1.56559,4.23045 v 6.81757 q 0.0333,1.83208 0.96601,2.59822 z m -8.37206,-0.83276 a 4.7134493,4.7134493 0 0 0 3.33106,-1.59891 v -2.94244 a 22.368065,22.368065 0 0 0 -2.53161,-0.13324 2.775883,2.775883 0 0 0 -2.06526,0.69952 2.3928111,2.3928111 0 0 0 -0.69952,1.76546 2.3539488,2.3539488 0 0 0 0.55518,1.66553 1.8431863,1.8431863 0 0 0 1.41015,0.54408 z"
id="path14-1"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177" />
<path
class="cls-1"
d="m 144.48203,169.6008 q -1.49897,-2.29843 -1.49897,-6.34567 0,-4.04724 1.8987,-6.34567 a 5.740526,5.740526 0 0 1 4.56355,-2.29843 6.4400486,6.4400486 0 0 1 4.49693,1.66553 3.7696491,3.7696491 0 0 1 2.63154,-1.43235 3.1200925,3.1200925 0 0 1 0.88273,0.93269 3.8862362,3.8862362 0 0 1 0.55518,1.16587 q -0.9327,0.79946 -0.9327,2.86472 v 9.438 q 0,5.29638 -1.73215,7.49488 -1.73215,2.1985 -5.69611,2.22071 a 16.100121,16.100121 0 0 1 -5.9626,-1.11036 4.4802752,4.4802752 0 0 1 1.03263,-3.03126 10.892565,10.892565 0 0 0 4.48028,1.03263 q 2.18184,0 3.0146,-1.11035 a 4.9965894,4.9965894 0 0 0 0.83277,-3.06458 v -1.33242 a 6.4011862,6.4011862 0 0 1 -4.16383,1.56559 4.9188647,4.9188647 0 0 1 -4.40255,-2.30953 z m 8.56083,-2.69816 v -7.72806 a 4.2915151,4.2915151 0 0 0 -2.86471,-1.36573 2.4039147,2.4039147 0 0 0 -2.18185,1.43235 8.6885138,8.6885138 0 0 0 -0.7828,4.09721 q 0,2.66485 0.71618,3.93065 a 2.1318781,2.1318781 0 0 0 1.88205,1.2658 4.2304457,4.2304457 0 0 0 3.23113,-1.63222 z"
id="path16-8"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177" />
<path
class="cls-1"
d="m 174.20619,164.67083 h -9.32697 a 5.6405943,5.6405943 0 0 0 0.88273,3.04792 q 0.7828,1.0826 2.74813,1.0826 a 10.120869,10.120869 0 0 0 4.36369,-1.16587 4.3803434,4.3803434 0 0 1 1.19918,2.5316 10.759323,10.759323 0 0 1 -6.41229,1.8987 q -3.74744,0 -5.37966,-2.43167 -1.63222,-2.43167 -1.63222,-6.2957 0,-3.88624 1.79877,-6.2957 a 6.0181143,6.0181143 0 0 1 5.14649,-2.43168 q 3.33106,0 5.14648,2.01529 a 7.3449864,7.3449864 0 0 1 1.79878,5.07987 13.04665,13.04665 0 0 1 -0.33311,2.96464 z m -6.42895,-7.06184 q -2.73146,0 -2.93133,4.13051 h 5.79605 v -0.39973 a 4.7245529,4.7245529 0 0 0 -0.69953,-2.69816 2.4316735,2.4316735 0 0 0 -2.14298,-1.03262 z"
id="path18-7"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177" />
<path
id="path24-3-6-9"
style="fill:#ff9329;fill-opacity:1;stroke-width:0.555177"
d="m 124.80273,70.162462 a 11.0036,11.0036 0 0 0 -4.33203,0.935547 L 76.261719,90.656602 a 1.5989086,1.5989086 0 0 0 -0.837891,2.138672 0.77169547,0.77169547 0 0 0 0.06641,0.177735 l 7.09375,14.021481 h 6.15625 l -0.875,-4.88867 c -0.07217,-0.39418 -0.711263,-3.187532 -1.316406,-5.197264 l 20.691398,6.462894 c 0.27198,1.28839 0.63292,2.49204 1.0625,3.62304 h 33.54883 c 0.36964,-1.13128 0.66138,-2.33705 0.85938,-3.62304 l 20.64648,-6.445316 c -0.60514,2.009734 -1.23639,4.785506 -1.30859,5.179686 l -0.875,4.88867 h 6.15429 l 7.02735,-13.894528 0.0664,-0.126953 0.0684,-0.171875 a 0.10548355,0.10548355 0 0 0 0,-0.04492 1.4878733,1.4878733 0 0 0 0.0664,-0.515625 1.5822533,1.5822533 0 0 0 -0.99414,-1.583985 L 129.35352,71.098009 a 11.0036,11.0036 0 0 0 -4.55079,-0.935547 z" />
<path
id="path24-3-2"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177"
d="M 124.80273,79.416133 A 11.0036,11.0036 0 0 0 120.4707,80.35168 L 76.261719,99.910272 a 1.5989086,1.5989086 0 0 0 -0.837891,2.136718 0.77169547,0.77169547 0 0 0 0.06641,0.17773 l 3.847657,7.60352 h 8.175781 c -0.257897,-1.08856 -0.591943,-2.42953 -0.964844,-3.66797 l 11.744141,3.66797 h 53.371087 l 11.69336,-3.65039 c -0.37193,1.23522 -0.70076,2.56719 -0.95703,3.65039 h 8.17383 l 3.78125,-7.47656 0.0664,-0.12696 0.0684,-0.17187 a 0.10548355,0.10548355 0 0 0 0,-0.0449 1.4878733,1.4878733 0 0 0 0.0664,-0.51563 1.5822533,1.5822533 0 0 0 -0.99414,-1.582028 L 129.35352,80.35168 a 11.0036,11.0036 0 0 0 -4.55079,-0.935547 z" />
<path
class="cls-2"
d="m 174.55595,110.92974 a 1.4878733,1.4878733 0 0 1 -0.0666,0.51631 0.10548355,0.10548355 0 0 1 0,0.0444 l -0.0666,0.17211 v 0 l -0.0666,0.12769 -10.69826,21.15223 c -1.48787,2.93688 -4.22489,2.84806 -3.76409,-0.12214 l 2.15408,-12.02512 c 0.0722,-0.39418 0.70508,-3.17006 1.31022,-5.1798 l -20.64702,6.4456 c -3.24223,21.05785 -30.95109,21.40761 -35.47023,0 l -20.691432,-6.46226 c 0.605143,2.00974 1.243596,4.80228 1.315769,5.19646 l 2.154085,12.02512 c 0.460796,2.9702 -2.276224,3.05902 -3.764098,0.12214 L 75.49024,111.66257 a 0.77169547,0.77169547 0 0 1 -0.06662,-0.17766 1.5989086,1.5989086 0 0 1 0.838317,-2.13743 L 120.47065,89.788613 a 11.0036,11.0036 0 0 1 8.88282,0 l 44.20871,19.558867 a 1.5822533,1.5822533 0 0 1 0.99377,1.58226 z"
id="path24-0"
style="fill:#ff9329;fill-opacity:1;stroke-width:0.555177" />
<path
class="cls-3"
d="m 139.0413,114.61611 19.11473,-7.69475 a 0.81055784,0.81055784 0 0 0 0,-1.50453 c -2.2207,-0.92714 -4.96328,-1.99308 -7.65033,-3.10899 -0.49411,-0.20541 -5.17425,3.15341 -5.60173,3.49762 l -8.23882,6.58439 c -1.99309,1.67108 -0.26649,3.28665 2.37615,2.22626 z"
id="path26-2"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177" />
<circle
class="cls-3"
cx="125.18409"
cy="122.13319"
r="9.9654207"
id="circle28-3"
style="fill:#4e4e4e;fill-opacity:1;stroke-width:0.555177" />
</g>
</svg>

After

Width:  |  Height:  |  Size: 20 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 15 KiB

View file

@ -1,119 +0,0 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
width="108.2099mm"
height="108.00987mm"
viewBox="0 0 108.2099 108.00987"
version="1.1"
id="svg8"
inkscape:version="1.0.1 (3bc2e813f5, 2020-09-07)"
sodipodi:docname="garage.svg"
inkscape:export-filename="/home/lx/garage.png"
inkscape:export-xdpi="96"
inkscape:export-ydpi="96">
<defs
id="defs2" />
<sodipodi:namedview
id="base"
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1.0"
inkscape:pageopacity="1"
inkscape:pageshadow="2"
inkscape:zoom="0.5"
inkscape:cx="-212.52783"
inkscape:cy="204.9547"
inkscape:document-units="mm"
inkscape:current-layer="layer1"
inkscape:document-rotation="0"
showgrid="false"
fit-margin-top="20"
fit-margin-left="20"
fit-margin-right="20"
fit-margin-bottom="20"
inkscape:window-width="1404"
inkscape:window-height="1016"
inkscape:window-x="103"
inkscape:window-y="27"
inkscape:window-maximized="0" />
<metadata
id="metadata5">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
<dc:title></dc:title>
</cc:Work>
</rdf:RDF>
</metadata>
<g
inkscape:label="Layer 1"
inkscape:groupmode="layer"
id="layer1"
transform="translate(-45.667412,-33.028536)">
<path
style="fill:none;stroke:#000000;stroke-width:2.065;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="M 66.78016,73.340623 99.921832,54.219898 132.84481,73.130965 V 120.00591 H 66.701651 Z"
id="path124"
sodipodi:nodetypes="cccccc" />
<g
id="g1106-5"
transform="matrix(0,0.95201267,-0.95201267,0,194.01664,-65.058377)"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<g
id="g1061-3"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<circle
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="path956-5"
cx="168.8569"
cy="92.889587"
r="13.125794" />
<circle
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="path958-6"
cx="168.77444"
cy="92.702293"
r="3.0778286" />
<path
id="path960-2"
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 169.46072,82.84435 c 4.95795,0.336608 8.87296,4.341959 9.09638,9.306301"
sodipodi:nodetypes="cc" />
</g>
<path
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 154.67824,112.84018 11.89881,-13.038071 c 1.46407,-1.552664 3.79541,0.878511 2.81832,2.089181 l -10.57965,14.481 c -1.8851,2.02632 -6.10786,-1.06119 -4.13748,-3.53211 z"
id="path964-9"
sodipodi:nodetypes="ccccc" />
<g
id="g1071-1"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none" />
<g
id="g1065-3"
style="stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none">
<rect
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
id="rect949-6"
width="35.576611"
height="48.507355"
x="150.9623"
y="74.698929"
ry="2.7302756" />
<path
style="fill:none;stroke:#000000;stroke-width:2.17959;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
d="m 150.76919,106.16944 6.36181,-0.0223 c 2.53845,3.46232 6.29787,4.20243 10.1055,4.40362 l 0.0176,13.09251"
id="path1033-0"
sodipodi:nodetypes="cccc" />
</g>
</g>
</g>
</svg>

Before

Width:  |  Height:  |  Size: 4.5 KiB

View file

@ -1,16 +1,18 @@
#!/bin/bash
set -ex
SCRIPT_FOLDER="`dirname \"$0\"`"
REPO_FOLDER="${SCRIPT_FOLDER}/../"
GARAGE_DEBUG="${REPO_FOLDER}/target/debug/"
GARAGE_RELEASE="${REPO_FOLDER}/target/release/"
PATH="${GARAGE_DEBUG}:${GARAGE_RELEASE}:$PATH"
garage bucket create éprouvette
garage bucket create eprouvette
KEY_INFO=`garage key new --name opérateur`
ACCESS_KEY=`echo $KEY_INFO|grep -Po 'GK[a-f0-9]+'`
SECRET_KEY=`echo $KEY_INFO|grep -Po 'secret_key: "[a-f0-9]+'|grep -Po '[a-f0-9]+$'`
garage bucket allow éprouvette --read --write --key $ACCESS_KEY
SECRET_KEY=`echo $KEY_INFO|grep -Po 'Secret key: [a-f0-9]+'|grep -Po '[a-f0-9]+$'`
garage bucket allow eprouvette --read --write --key $ACCESS_KEY
echo "$ACCESS_KEY $SECRET_KEY" > /tmp/garage.s3
echo "Bucket s3://éprouvette created. Credentials stored in /tmp/garage.s3."
echo "Bucket s3://eprouvette created. Credentials stored in /tmp/garage.s3."

7
script/dev-clean.sh Executable file
View file

@ -0,0 +1,7 @@
#!/bin/bash
set -ex
killall -9 garage || echo "garage is not running"
rm -rf /tmp/garage*
rm -rf /tmp/config.*.toml

View file

@ -10,7 +10,7 @@ PATH="${GARAGE_DEBUG}:${GARAGE_RELEASE}:$PATH"
FANCYCOLORS=("41m" "42m" "44m" "45m" "100m" "104m")
export RUST_BACKTRACE=1
export RUST_LOG=garage=info
export RUST_LOG=garage=info,garage_api=debug
MAIN_LABEL="\e[${FANCYCOLORS[0]}[main]\e[49m"
WHICH_GARAGE=$(which garage || exit 1)
@ -24,11 +24,11 @@ cat > $CONF_PATH <<EOF
block_size = 1048576 # objects are split in blocks of maximum this number of bytes
metadata_dir = "/tmp/garage-meta-$count"
data_dir = "/tmp/garage-data-$count"
rpc_bind_addr = "127.0.0.$count:3901" # the port other Garage nodes will use to talk to this node
rpc_bind_addr = "0.0.0.0:$((3900+$count))" # the port other Garage nodes will use to talk to this node
bootstrap_peers = [
"127.0.0.1:3901",
"127.0.0.2:3901",
"127.0.0.3:3901"
"127.0.0.1:3902",
"127.0.0.1:3903"
]
max_concurrent_rpc_requests = 12
data_replication_factor = 3
@ -36,11 +36,13 @@ meta_replication_factor = 3
meta_epidemic_fanout = 3
[s3_api]
api_bind_addr = "127.0.0.$count:3900" # the S3 API port, HTTP without TLS. Add a reverse proxy for the TLS part.
api_bind_addr = "0.0.0.0:$((3910+$count))" # the S3 API port, HTTP without TLS. Add a reverse proxy for the TLS part.
s3_region = "garage" # set this to anything. S3 API calls will fail if they are not made against the region set here.
[s3_web]
bind_addr = "127.0.0.$count:3902"
bind_addr = "0.0.0.0:$((3920+$count))"
root_domain = ".garage.tld"
index = "index.html"
EOF
echo -en "$LABEL configuration written to $CONF_PATH\n"

View file

@ -1,15 +1,23 @@
#!/bin/bash
set -ex
SCRIPT_FOLDER="`dirname \"$0\"`"
REPO_FOLDER="${SCRIPT_FOLDER}/../"
GARAGE_DEBUG="${REPO_FOLDER}/target/debug/"
GARAGE_RELEASE="${REPO_FOLDER}/target/release/"
PATH="${GARAGE_DEBUG}:${GARAGE_RELEASE}:$PATH"
sleep 5
until garage status 2>&1|grep -q Healthy ; do
echo "cluster starting..."
sleep 1
done
garage status \
| grep UNCONFIGURED \
| grep -Po '^[0-9a-f]+' \
| while read id; do
garage node configure -d dc1 -n 100 $id
garage node configure -d dc1 -c 1 $id
done

14
script/dev-env-aws.sh Normal file
View file

@ -0,0 +1,14 @@
#!/bin/bash
SCRIPT_FOLDER="`dirname \"${BASH_SOURCE[0]}\"`"
REPO_FOLDER="${SCRIPT_FOLDER}/../"
GARAGE_DEBUG="${REPO_FOLDER}/target/debug/"
GARAGE_RELEASE="${REPO_FOLDER}/target/release/"
PATH="${GARAGE_DEBUG}:${GARAGE_RELEASE}:$PATH"
export AWS_ACCESS_KEY_ID=`cat /tmp/garage.s3 |cut -d' ' -f1`
export AWS_SECRET_ACCESS_KEY=`cat /tmp/garage.s3 |cut -d' ' -f2`
export AWS_DEFAULT_REGION='garage'
alias awsgrg="aws s3 \
--endpoint-url http://127.0.0.1:3911"

3
script/dev-env.sh → script/dev-env-s3cmd.sh Executable file → Normal file
View file

@ -10,7 +10,8 @@ ACCESS_KEY=`cat /tmp/garage.s3 |cut -d' ' -f1`
SECRET_KEY=`cat /tmp/garage.s3 |cut -d' ' -f2`
alias s3grg="s3cmd \
--host 127.0.0.1:3900 \
--host 127.0.0.1:3911 \
--host-bucket 127.0.0.1:3911 \
--access_key=$ACCESS_KEY \
--secret_key=$SECRET_KEY \
--region=garage \

290
script/simulate_ring.py Executable file
View file

@ -0,0 +1,290 @@
#!/usr/bin/env python3
import hashlib
import bisect
import xxhash
import numpy as np
REPLICATION_FACTOR = 3
def hash_str(s):
xxh = xxhash.xxh64()
xxh.update(s.encode('ascii'))
return xxh.hexdigest()
def sha256_str(s):
return hashlib.sha256(s.encode('ascii')).hexdigest()
def walk_ring_from_pos(tokens, dcs, start):
ret = []
ret_dcs = set()
delta = 0
while len(ret) < REPLICATION_FACTOR:
i = (start + delta) % len(tokens)
delta = delta + 1
(token_k, token_dc, token_node) = tokens[i]
if token_dc not in ret_dcs:
ret_dcs |= set([token_dc])
ret.append(token_node)
elif len(ret_dcs) == len(dcs) and token_node not in ret:
ret.append(token_node)
return ret
"""
def count_tokens_per_node(tokens):
tokens_of_node = {}
for _, _, token_node in tokens:
if token_node not in tokens_of_node:
tokens_of_node[token_node] = 0
tokens_of_node[token_node] += 1
print("#tokens per node:")
for node, ntok in sorted(list(tokens_of_node.items())):
print("-", node, ": ", ntok)
"""
def count_partitions_per_node(ring_node_list):
tokens_of_node = {}
for nodelist in ring_node_list:
for node_id in nodelist:
if node_id not in tokens_of_node:
tokens_of_node[node_id] = 0
tokens_of_node[node_id] += 1
print("#partitions per node:")
for node, ntok in sorted(list(tokens_of_node.items())):
print("-", node, ": ", ntok)
def method1(nodes):
tokens = []
dcs = set()
for (node_id, dc, n_tokens) in nodes:
dcs |= set([dc])
for i in range(n_tokens):
token = hash_str(f"{node_id} {i}")
tokens.append((token, dc, node_id))
tokens.sort(key=lambda tok: tok[0])
space_of_node = {}
def walk_ring(v):
i = bisect.bisect_left([tok for tok, _, _ in tokens], hash_str(v))
return walk_ring_from_pos(tokens, dcs, i)
ring_node_list = [walk_ring_from_pos(tokens, dcs, i) for i in range(len(tokens))]
return walk_ring, ring_node_list
def method2(nodes):
partition_bits = 10
partitions = list(range(2**partition_bits))
def partition_node(i):
h, hn, hndc = None, None, None
for (node_id, node_dc, n_tokens) in nodes:
for tok in range(n_tokens):
hnode = hash_str(f"partition {i} node {node_id} token {tok}")
if h is None or hnode < h:
h = hnode
hn = node_id
hndc = node_dc
return (i, hndc, hn)
partition_nodes = [partition_node(i) for i in partitions]
dcs = list(set(node_dc for _, node_dc, _ in nodes))
def walk_ring(v):
# xxh = xxhash.xxh32()
# xxh.update(v.encode('ascii'))
# vh = xxh.intdigest()
# i = vh % (2**partition_bits)
vh = hashlib.sha256(v.encode('ascii')).digest()
i = (vh[0]<<8 | vh[1]) % (2**partition_bits)
return walk_ring_from_pos(partition_nodes, dcs, i)
ring_node_list = [walk_ring_from_pos(partition_nodes, dcs, i) for i in range(len(partition_nodes))]
return walk_ring, ring_node_list
def method3(nodes):
partition_bits = 10
queues = []
for (node_id, node_dc, n_tokens) in nodes:
que = [(i, hash_str(f"{node_id} {i}")) for i in range(2**partition_bits)]
que.sort(key=lambda x: x[1])
que = [x[0] for x in que]
queues.append((node_id, node_dc, n_tokens, que))
partitions = [None for _ in range(2**partition_bits)]
queues.sort(key=lambda x: hash_str(x[0]))
# Maglev
remaining = 2**partition_bits
while remaining > 0:
for toktok in range(100):
for iq in range(len(queues)):
node_id, node_dc, n_tokens, node_queue = queues[iq]
if toktok >= n_tokens:
continue
for qi, qv in enumerate(node_queue):
if partitions[qv] == None:
partitions[qv] = (qv, node_dc, node_id)
remaining -= 1
queues[iq] = (node_id, node_dc, n_tokens, node_queue[qi+1:])
break
dcs = list(set(node_dc for _, node_dc, _ in nodes))
def walk_ring(v):
vh = hashlib.sha256(v.encode('ascii')).digest()
i = (vh[0]<<8 | vh[1]) % (2**partition_bits)
return walk_ring_from_pos(partitions, dcs, i)
ring_node_list = [walk_ring_from_pos(partitions, dcs, i) for i in range(len(partitions))]
return walk_ring, ring_node_list
def method4(nodes):
partition_bits = 10
partitions = [[] for _ in range(2**partition_bits)]
dcs = list(set(node_dc for _, node_dc, _ in nodes))
# Maglev, improved for several replicas on several DCs
for ri in range(REPLICATION_FACTOR):
queues = []
for (node_id, node_dc, n_tokens) in nodes:
que = [(i, hash_str(f"{node_id} {i}")) for i in range(2**partition_bits)]
que.sort(key=lambda x: x[1])
que = [x[0] for x in que]
queues.append((node_id, node_dc, n_tokens, que))
queues.sort(key=lambda x: hash_str("{} {}".format(ri, x[0])))
remaining = 2**partition_bits
while remaining > 0:
for toktok in range(100):
for iq in range(len(queues)):
node_id, node_dc, n_tokens, node_queue = queues[iq]
if toktok >= n_tokens:
continue
for qi, qv in enumerate(node_queue):
if len(partitions[qv]) != ri:
continue
p_dcs = set([x[0] for x in partitions[qv]])
p_nodes = [x[1] for x in partitions[qv]]
if node_dc not in p_dcs or (len(p_dcs) == len(dcs) and node_id not in p_nodes):
partitions[qv].append((node_dc, node_id))
remaining -= 1
queues[iq] = (node_id, node_dc, n_tokens, node_queue[qi+1:])
break
def walk_ring(v):
vh = hashlib.sha256(v.encode('ascii')).digest()
i = (vh[0]<<8 | vh[1]) % (2**partition_bits)
assert len(set([node_dc for node_dc, _ in partitions[i]])) == min(REPLICATION_FACTOR, len(dcs))
return [node_id for _, node_id in partitions[i]]
ring_node_list = [[node_id for _, node_id in p] for p in partitions]
return walk_ring, ring_node_list
def evaluate_method(method, nodes):
walk_ring, ring_node_list = method(nodes)
print("Ring length:", len(ring_node_list))
count_partitions_per_node(ring_node_list)
print("Number of data items per node (100000 simulation):")
node_data_counts = {}
for i in range(100000):
inodes = walk_ring(f"{i}")
for n in inodes:
if n not in node_data_counts:
node_data_counts[n] = 0
node_data_counts[n] += 1
for n, v in sorted(list(node_data_counts.items())):
print("-", n, ": ", v)
dclist_per_ntok = {}
for node_id, _, ntok in nodes:
if ntok not in dclist_per_ntok:
dclist_per_ntok[ntok] = []
dclist_per_ntok[ntok].append(node_data_counts[node_id])
list_normalized = []
for ntok, dclist in dclist_per_ntok.items():
avg = sum(dclist)/len(dclist)
for v in dclist:
list_normalized.append(v / avg)
print("variance wrt. same-ntok mean:", "%.2f%%"%(np.var(list_normalized)*100))
num_changes_sum = [0, 0, 0, 0]
for n in nodes:
nodes2 = [nn for nn in nodes if nn != n]
_, ring_node_list_2 = method(nodes2)
if len(ring_node_list_2) != len(ring_node_list):
continue
num_changes = [0, 0, 0, 0]
for (ns1, ns2) in zip(ring_node_list, ring_node_list_2):
changes = sum(1 for x in ns1 if x not in ns2)
num_changes[changes] += 1
for i, v in enumerate(num_changes):
num_changes_sum[i] += v / len(ring_node_list)
print("removing", n[1], n[0], ":", " ".join(["%.2f%%"%(x*100/len(ring_node_list)) for x in num_changes]))
print("1-node removal: partitions moved on 0/1/2/3 nodes: ", " ".join(["%.2f%%"%(x*100/len(nodes)) for x in num_changes_sum]))
if __name__ == "__main__":
print("------")
print("method 1 (standard ring)")
nodes = [('digitale', 'atuin', 64),
('drosera', 'atuin', 64),
('datura', 'atuin', 64),
('io', 'jupiter', 128)]
nodes2 = [('digitale', 'atuin', 64),
('drosera', 'atuin', 64),
('datura', 'atuin', 64),
('io', 'jupiter', 128),
('isou', 'jupiter', 64),
('mini', 'grog', 32),
('mixi', 'grog', 32),
('moxi', 'grog', 32),
('modi', 'grog', 32),
('geant', 'grisou', 128),
('gipsie', 'grisou', 128),
]
evaluate_method(method1, nodes2)
print("------")
print("method 2 (custom ring)")
nodes = [('digitale', 'atuin', 1),
('drosera', 'atuin', 1),
('datura', 'atuin', 1),
('io', 'jupiter', 2)]
nodes2 = [('digitale', 'atuin', 2),
('drosera', 'atuin', 2),
('datura', 'atuin', 2),
('io', 'jupiter', 4),
('isou', 'jupiter', 2),
('mini', 'grog', 1),
('mixi', 'grog', 1),
('moxi', 'grog', 1),
('modi', 'grog', 1),
('geant', 'grisou', 4),
('gipsie', 'grisou', 4),
]
evaluate_method(method2, nodes2)
print("------")
print("method 3 (maglev)")
evaluate_method(method3, nodes2)
print("------")
print("method 4 (maglev, multi-dc twist)")
evaluate_method(method4, nodes2)

76
script/test-smoke.sh Executable file
View file

@ -0,0 +1,76 @@
#!/bin/bash
set -ex
shopt -s expand_aliases
SCRIPT_FOLDER="`dirname \"$0\"`"
REPO_FOLDER="${SCRIPT_FOLDER}/../"
echo "setup"
cargo build
${SCRIPT_FOLDER}/dev-clean.sh
${SCRIPT_FOLDER}/dev-cluster.sh > /tmp/garage.log 2>&1 &
${SCRIPT_FOLDER}/dev-configure.sh
${SCRIPT_FOLDER}/dev-bucket.sh
source ${SCRIPT_FOLDER}/dev-env-aws.sh
source ${SCRIPT_FOLDER}/dev-env-s3cmd.sh
garage status
garage key list
garage bucket list
dd if=/dev/urandom of=/tmp/garage.1.rnd bs=1k count=2 # < INLINE_THRESHOLD = 3072 bytes
dd if=/dev/urandom of=/tmp/garage.2.rnd bs=1M count=5
dd if=/dev/urandom of=/tmp/garage.3.rnd bs=1M count=10
echo "s3 api testing..."
for idx in $(seq 1 3); do
# AWS sends
awsgrg cp /tmp/garage.$idx.rnd s3://eprouvette/garage.$idx.aws
awsgrg ls s3://eprouvette
awsgrg cp s3://eprouvette/garage.$idx.aws /tmp/garage.$idx.dl
diff /tmp/garage.$idx.rnd /tmp/garage.$idx.dl
rm /tmp/garage.$idx.dl
s3grg get s3://eprouvette/garage.$idx.aws /tmp/garage.$idx.dl
diff /tmp/garage.$idx.rnd /tmp/garage.$idx.dl
rm /tmp/garage.$idx.dl
awsgrg rm s3://eprouvette/garage.$idx.aws
# S3CMD sends
s3grg put /tmp/garage.$idx.rnd s3://eprouvette/garage.$idx.s3cmd
s3grg ls s3://eprouvette
s3grg get s3://eprouvette/garage.$idx.s3cmd /tmp/garage.$idx.dl
diff /tmp/garage.$idx.rnd /tmp/garage.$idx.dl
rm /tmp/garage.$idx.dl
awsgrg cp s3://eprouvette/garage.$idx.s3cmd /tmp/garage.$idx.dl
diff /tmp/garage.$idx.rnd /tmp/garage.$idx.dl
rm /tmp/garage.$idx.dl
s3grg rm s3://eprouvette/garage.$idx.s3cmd
done
rm /tmp/garage.{1,2,3}.rnd
echo "website testing"
echo "<h1>hello world</h1>" > /tmp/garage-index.html
awsgrg cp /tmp/garage-index.html s3://eprouvette/index.html
[ `curl -s -o /dev/null -w "%{http_code}" --header "Host: eprouvette.garage.tld" http://127.0.0.1:3923/ ` == 404 ]
garage bucket website --allow eprouvette
[ `curl -s -o /dev/null -w "%{http_code}" --header "Host: eprouvette.garage.tld" http://127.0.0.1:3923/ ` == 200 ]
garage bucket website --deny eprouvette
[ `curl -s -o /dev/null -w "%{http_code}" --header "Host: eprouvette.garage.tld" http://127.0.0.1:3923/ ` == 404 ]
awsgrg rm s3://eprouvette/index.html
rm /tmp/garage-index.html
echo "teardown"
garage bucket deny --read --write eprouvette --key $AWS_ACCESS_KEY_ID
garage bucket delete --yes eprouvette
garage key delete --yes $AWS_ACCESS_KEY_ID
echo "success"

View file

@ -1,9 +1,9 @@
[package]
name = "garage_api"
version = "0.1.1"
version = "0.2.1"
authors = ["Alex Auvolat <alex@adnab.me>"]
edition = "2018"
license = "GPL-3.0"
license = "AGPL-3.0"
description = "S3 API server crate for the Garage object store"
repository = "https://git.deuxfleurs.fr/Deuxfleurs/garage"
@ -13,29 +13,31 @@ path = "lib.rs"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
garage_util = { version = "0.1", path = "../util" }
garage_table = { version = "0.1.1", path = "../table" }
garage_model = { version = "0.1.1", path = "../model" }
garage_model = { version = "0.2.1", path = "../model" }
garage_table = { version = "0.2.1", path = "../table" }
garage_util = { version = "0.2.1", path = "../util" }
err-derive = "0.2.3"
bytes = "0.4"
hex = "0.3"
base64 = "0.13"
log = "0.4"
bytes = "1.0"
chrono = "0.4"
md-5 = "0.9.1"
sha2 = "0.8"
hmac = "0.7"
crypto-mac = "0.7"
crypto-mac = "0.10"
err-derive = "0.3"
fastcdc = "1.0.5"
hex = "0.4"
hmac = "0.10"
log = "0.4"
md-5 = "0.9"
rand = "0.7"
sha2 = "0.9"
futures = "0.3"
futures-util = "0.3"
tokio = { version = "0.2", default-features = false, features = ["rt-core", "rt-threaded", "io-driver", "net", "tcp", "time", "macros", "sync", "signal", "fs"] }
tokio = { version = "1.0", default-features = false, features = ["rt", "rt-multi-thread", "io-util", "net", "time", "macros", "sync", "signal", "fs"] }
http = "0.2"
hyper = "^0.13.6"
url = "2.1"
httpdate = "0.3"
percent-encoding = "2.1.0"
roxmltree = "0.11"
http-range = "0.1"
hyper = "0.14"
percent-encoding = "2.1.0"
roxmltree = "0.14"
url = "2.1"

View file

@ -62,7 +62,11 @@ async fn handler(
let body: Body = Body::from(format!("{}\n", e));
let mut http_error = Response::new(body);
*http_error.status_mut() = e.http_status_code();
warn!("Response: error {}, {}", e.http_status_code(), e);
if e.http_status_code().is_server_error() {
warn!("Response: error {}, {}", e.http_status_code(), e);
} else {
info!("Response: error {}, {}", e.http_status_code(), e);
}
Ok(http_error)
}
}
@ -97,7 +101,7 @@ async fn handler_inner(garage: Arc<Garage>, req: Request<Body>) -> Result<Respon
match req.method() {
&Method::HEAD => {
// HeadObject query
Ok(handle_head(garage, &bucket, &key).await?)
Ok(handle_head(garage, &req, &bucket, &key).await?)
}
&Method::GET => {
// GetObject query
@ -133,7 +137,10 @@ async fn handler_inner(garage: Arc<Garage>, req: Request<Body>) -> Result<Respon
)));
}
let source_key = source_key.ok_or_bad_request("No source key specified")?;
Ok(handle_copy(garage, &bucket, &key, &source_bucket, &source_key).await?)
Ok(
handle_copy(garage, &req, &bucket, &key, &source_bucket, &source_key)
.await?,
)
} else {
// PutObject query
Ok(handle_put(garage, req, &bucket, &key, content_sha256).await?)
@ -156,10 +163,15 @@ async fn handler_inner(garage: Arc<Garage>, req: Request<Body>) -> Result<Respon
} else if params.contains_key(&"uploadid".to_string()) {
// CompleteMultipartUpload call
let upload_id = params.get("uploadid").unwrap();
Ok(
handle_complete_multipart_upload(garage, req, &bucket, &key, upload_id)
.await?,
Ok(handle_complete_multipart_upload(
garage,
req,
&bucket,
&key,
upload_id,
content_sha256,
)
.await?)
} else {
Err(Error::BadRequest(format!(
"Not a CreateMultipartUpload call, what is it?"
@ -173,7 +185,7 @@ async fn handler_inner(garage: Arc<Garage>, req: Request<Body>) -> Result<Respon
&Method::PUT => {
// CreateBucket
// If we're here, the bucket already exists, so just answer ok
println!(
debug!(
"Body: {}",
std::str::from_utf8(&hyper::body::to_bytes(req.into_body()).await?)
.unwrap_or("<invalid utf8>")
@ -198,38 +210,16 @@ async fn handler_inner(garage: Arc<Garage>, req: Request<Body>) -> Result<Respon
))
}
&Method::GET => {
// ListObjects query
let delimiter = params.get("delimiter").map(|x| x.as_str()).unwrap_or(&"");
let max_keys = params
.get("max-keys")
.map(|x| {
x.parse::<usize>()
.ok_or_bad_request("Invalid value for max-keys")
})
.unwrap_or(Ok(1000))?;
let prefix = params.get("prefix").map(|x| x.as_str()).unwrap_or(&"");
let urlencode_resp = params
.get("encoding-type")
.map(|x| x == "url")
.unwrap_or(false);
let marker = params.get("marker").map(String::as_str);
Ok(handle_list(
garage,
bucket,
delimiter,
max_keys,
prefix,
marker,
urlencode_resp,
)
.await?)
// ListObjects or ListObjectsV2 query
let q = parse_list_objects_query(bucket, &params)?;
Ok(handle_list(garage, &q).await?)
}
&Method::POST => {
if params.contains_key(&"delete".to_string()) {
// DeleteObjects
Ok(handle_delete_objects(garage, bucket, req).await?)
Ok(handle_delete_objects(garage, bucket, req, content_sha256).await?)
} else {
println!(
debug!(
"Body: {}",
std::str::from_utf8(&hyper::body::to_bytes(req.into_body()).await?)
.unwrap_or("<invalid utf8>")

View file

@ -24,10 +24,16 @@ pub enum Error {
// Category: bad request
#[error(display = "Invalid UTF-8: {}", _0)]
InvalidUTF8(#[error(source)] std::str::Utf8Error),
InvalidUTF8Str(#[error(source)] std::str::Utf8Error),
#[error(display = "Invalid UTF-8: {}", _0)]
InvalidUTF8String(#[error(source)] std::string::FromUtf8Error),
#[error(display = "Invalid base64: {}", _0)]
InvalidBase64(#[error(source)] base64::DecodeError),
#[error(display = "Invalid XML: {}", _0)]
InvalidXML(#[error(source)] roxmltree::Error),
InvalidXML(String),
#[error(display = "Invalid header value: {}", _0)]
InvalidHeader(#[error(source)] hyper::header::ToStrError),
@ -39,6 +45,12 @@ pub enum Error {
BadRequest(String),
}
impl From<roxmltree::Error> for Error {
fn from(err: roxmltree::Error) -> Self {
Self::InvalidXML(format!("{}", err))
}
}
impl Error {
pub fn http_status_code(&self) -> StatusCode {
match self {

View file

@ -1,11 +1,11 @@
use std::fmt::Write;
use std::sync::Arc;
use chrono::{SecondsFormat, Utc};
use hyper::{Body, Response};
use hyper::{Body, Request, Response};
use garage_table::*;
use garage_util::data::*;
use garage_util::time::*;
use garage_model::block_ref_table::*;
use garage_model::garage::Garage;
@ -13,9 +13,11 @@ use garage_model::object_table::*;
use garage_model::version_table::*;
use crate::error::*;
use crate::s3_put::get_headers;
pub async fn handle_copy(
garage: Arc<Garage>,
req: &Request<Body>,
dest_bucket: &str,
dest_key: &str,
source_bucket: &str,
@ -41,17 +43,37 @@ pub async fn handle_copy(
};
let new_uuid = gen_uuid();
let dest_object_version = ObjectVersion {
uuid: new_uuid,
timestamp: now_msec(),
state: ObjectVersionState::Complete(source_last_state.clone()),
};
let new_timestamp = now_msec();
match &source_last_state {
// Implement x-amz-metadata-directive: REPLACE
let old_meta = match source_last_state {
ObjectVersionData::DeleteMarker => {
return Err(Error::NotFound);
}
ObjectVersionData::Inline(_meta, _bytes) => {
ObjectVersionData::Inline(meta, _bytes) => meta,
ObjectVersionData::FirstBlock(meta, _fbh) => meta,
};
let new_meta = match req.headers().get("x-amz-metadata-directive") {
Some(v) if v == hyper::header::HeaderValue::from_static("REPLACE") => ObjectVersionMeta {
headers: get_headers(req)?,
size: old_meta.size,
etag: old_meta.etag.clone(),
},
_ => old_meta.clone(),
};
// Save object copy
match source_last_state {
ObjectVersionData::DeleteMarker => unreachable!(),
ObjectVersionData::Inline(_meta, bytes) => {
let dest_object_version = ObjectVersion {
uuid: new_uuid,
timestamp: new_timestamp,
state: ObjectVersionState::Complete(ObjectVersionData::Inline(
new_meta,
bytes.clone(),
)),
};
let dest_object = Object::new(
dest_bucket.to_string(),
dest_key.to_string(),
@ -59,49 +81,91 @@ pub async fn handle_copy(
);
garage.object_table.insert(&dest_object).await?;
}
ObjectVersionData::FirstBlock(_meta, _first_block_hash) => {
ObjectVersionData::FirstBlock(_meta, first_block_hash) => {
// Get block list from source version
let source_version = garage
.version_table
.get(&source_last_v.uuid, &EmptyKey)
.await?;
let source_version = source_version.ok_or(Error::NotFound)?;
let dest_version = Version::new(
// Write an "uploading" marker in Object table
// This holds a reference to the object in the Version table
// so that it won't be deleted, e.g. by repair_versions.
let tmp_dest_object_version = ObjectVersion {
uuid: new_uuid,
timestamp: new_timestamp,
state: ObjectVersionState::Uploading(new_meta.headers.clone()),
};
let tmp_dest_object = Object::new(
dest_bucket.to_string(),
dest_key.to_string(),
vec![tmp_dest_object_version],
);
garage.object_table.insert(&tmp_dest_object).await?;
// Write version in the version table. Even with empty block list,
// this means that the BlockRef entries linked to this version cannot be
// marked as deleted (they are marked as deleted only if the Version
// doesn't exist or is marked as deleted).
let mut dest_version = Version::new(
new_uuid,
dest_bucket.to_string(),
dest_key.to_string(),
false,
source_version.blocks().to_vec(),
);
garage.version_table.insert(&dest_version).await?;
// Fill in block list for version and insert block refs
for (bk, bv) in source_version.blocks.items().iter() {
dest_version.blocks.put(*bk, *bv);
}
let dest_block_refs = dest_version
.blocks
.items()
.iter()
.map(|b| BlockRef {
block: b.1.hash,
version: new_uuid,
deleted: false.into(),
})
.collect::<Vec<_>>();
futures::try_join!(
garage.version_table.insert(&dest_version),
garage.block_ref_table.insert_many(&dest_block_refs[..]),
)?;
// Insert final object
// We do this last because otherwise there is a race condition in the case where
// the copy call has the same source and destination (this happens, rclone does
// it to update the modification timestamp for instance). If we did this concurrently
// with the stuff before, the block's reference counts could be decremented before
// they are incremented again for the new version, leading to data being deleted.
let dest_object_version = ObjectVersion {
uuid: new_uuid,
timestamp: new_timestamp,
state: ObjectVersionState::Complete(ObjectVersionData::FirstBlock(
new_meta,
*first_block_hash,
)),
};
let dest_object = Object::new(
dest_bucket.to_string(),
dest_key.to_string(),
vec![dest_object_version],
);
let dest_block_refs = dest_version
.blocks()
.iter()
.map(|b| BlockRef {
block: b.hash,
version: new_uuid,
deleted: false,
})
.collect::<Vec<_>>();
futures::try_join!(
garage.object_table.insert(&dest_object),
garage.version_table.insert(&dest_version),
garage.block_ref_table.insert_many(&dest_block_refs[..]),
)?;
garage.object_table.insert(&dest_object).await?;
}
}
let now = Utc::now();
let last_modified = now.to_rfc3339_opts(SecondsFormat::Secs, true);
let last_modified = msec_to_rfc3339(new_timestamp);
let mut xml = String::new();
writeln!(&mut xml, r#"<?xml version="1.0" encoding="UTF-8"?>"#).unwrap();
writeln!(&mut xml, r#"<CopyObjectResult>"#).unwrap();
writeln!(&mut xml, "\t<LastModified>{}</LastModified>", last_modified).unwrap();
writeln!(&mut xml, "</CopyObjectResult>").unwrap();
Ok(Response::new(Body::from(xml.into_bytes())))
Ok(Response::builder()
.header("Content-Type", "application/xml")
.body(Body::from(xml.into_bytes()))?)
}

View file

@ -4,12 +4,14 @@ use std::sync::Arc;
use hyper::{Body, Request, Response};
use garage_util::data::*;
use garage_util::time::*;
use garage_model::garage::Garage;
use garage_model::object_table::*;
use crate::encoding::*;
use crate::error::*;
use crate::signature::verify_signed_content;
async fn handle_delete_internal(
garage: &Garage,
@ -28,16 +30,16 @@ async fn handle_delete_internal(
_ => true,
});
let mut must_delete = None;
let mut version_to_delete = None;
let mut timestamp = now_msec();
for v in interesting_versions {
if v.timestamp + 1 > timestamp || must_delete.is_none() {
must_delete = Some(v.uuid);
if v.timestamp + 1 > timestamp || version_to_delete.is_none() {
version_to_delete = Some(v.uuid);
}
timestamp = std::cmp::max(timestamp, v.timestamp + 1);
}
let deleted_version = must_delete.ok_or(Error::NotFound)?;
let deleted_version = version_to_delete.ok_or(Error::NotFound)?;
let version_uuid = gen_uuid();
@ -46,7 +48,7 @@ async fn handle_delete_internal(
key.into(),
vec![ObjectVersion {
uuid: version_uuid,
timestamp: now_msec(),
timestamp,
state: ObjectVersionState::Complete(ObjectVersionData::DeleteMarker),
}],
);
@ -73,8 +75,11 @@ pub async fn handle_delete_objects(
garage: Arc<Garage>,
bucket: &str,
req: Request<Body>,
content_sha256: Option<Hash>,
) -> Result<Response<Body>, Error> {
let body = hyper::body::to_bytes(req.into_body()).await?;
verify_signed_content(content_sha256, &body[..])?;
let cmd_xml = roxmltree::Document::parse(&std::str::from_utf8(&body)?)?;
let cmd = parse_delete_objects_xml(&cmd_xml).ok_or_bad_request("Invalid delete XML query")?;
@ -86,7 +91,7 @@ pub async fn handle_delete_objects(
match handle_delete_internal(&garage, bucket, &obj.key).await {
Ok((deleted_version, delete_marker_version)) => {
writeln!(&mut retxml, "\t<Deleted>").unwrap();
writeln!(&mut retxml, "\t\t<Key>{}</Key>", obj.key).unwrap();
writeln!(&mut retxml, "\t\t<Key>{}</Key>", xml_escape(&obj.key)).unwrap();
writeln!(
&mut retxml,
"\t\t<VersionId>{}</VersionId>",
@ -104,7 +109,7 @@ pub async fn handle_delete_objects(
Err(e) => {
writeln!(&mut retxml, "\t<Error>").unwrap();
writeln!(&mut retxml, "\t\t<Code>{}</Code>", e.http_status_code()).unwrap();
writeln!(&mut retxml, "\t\t<Key>{}</Key>", obj.key).unwrap();
writeln!(&mut retxml, "\t\t<Key>{}</Key>", xml_escape(&obj.key)).unwrap();
writeln!(
&mut retxml,
"\t\t<Message>{}</Message>",
@ -118,7 +123,9 @@ pub async fn handle_delete_objects(
writeln!(&mut retxml, "</DeleteObjectsOutput>").unwrap();
Ok(Response::new(Body::from(retxml.into_bytes())))
Ok(Response::builder()
.header("Content-Type", "application/xml")
.body(Body::from(retxml.into_bytes()))?)
}
struct DeleteRequest {
@ -129,33 +136,27 @@ struct DeleteObject {
key: String,
}
fn parse_delete_objects_xml(xml: &roxmltree::Document) -> Result<DeleteRequest, String> {
fn parse_delete_objects_xml(xml: &roxmltree::Document) -> Option<DeleteRequest> {
let mut ret = DeleteRequest { objects: vec![] };
let root = xml.root();
let delete = root.first_child().ok_or(format!("Delete tag not found"))?;
let delete = root.first_child()?;
if !delete.has_tag_name("Delete") {
return Err(format!("Invalid root tag: {:?}", root));
return None;
}
for item in delete.children() {
if item.has_tag_name("Object") {
if let Some(key) = item.children().find(|e| e.has_tag_name("Key")) {
if let Some(key_str) = key.text() {
ret.objects.push(DeleteObject {
key: key_str.to_string(),
});
} else {
return Err(format!("No text for key: {:?}", key));
}
} else {
return Err(format!("No delete key for item: {:?}", item));
}
let key = item.children().find(|e| e.has_tag_name("Key"))?;
let key_str = key.text()?;
ret.objects.push(DeleteObject {
key: key_str.to_string(),
});
} else {
return Err(format!("Invalid delete item: {:?}", item));
return None;
}
}
Ok(ret)
Some(ret)
}

View file

@ -16,6 +16,8 @@ fn object_headers(
version: &ObjectVersion,
version_meta: &ObjectVersionMeta,
) -> http::response::Builder {
debug!("Version meta: {:?}", version_meta);
let date = UNIX_EPOCH + Duration::from_millis(version.timestamp);
let date_str = httpdate::fmt_http_date(date);
@ -24,11 +26,13 @@ fn object_headers(
"Content-Type",
version_meta.headers.content_type.to_string(),
)
.header("Content-Length", format!("{}", version_meta.size))
.header("ETag", version_meta.etag.to_string())
.header("Last-Modified", date_str)
.header("Accept-Ranges", format!("bytes"));
if !version_meta.etag.is_empty() {
resp = resp.header("ETag", format!("\"{}\"", version_meta.etag));
}
for (k, v) in version_meta.headers.other.iter() {
resp = resp.header(k, v.to_string());
}
@ -36,8 +40,48 @@ fn object_headers(
resp
}
fn try_answer_cached(
version: &ObjectVersion,
version_meta: &ObjectVersionMeta,
req: &Request<Body>,
) -> Option<Response<Body>> {
// <trinity> It is possible, and is even usually the case, [that both If-None-Match and
// If-Modified-Since] are present in a request. In this situation If-None-Match takes
// precedence and If-Modified-Since is ignored (as per 6.Precedence from rfc7232). The rational
// being that etag based matching is more accurate, it has no issue with sub-second precision
// for instance (in case of very fast updates)
let cached = if let Some(none_match) = req.headers().get(http::header::IF_NONE_MATCH) {
let none_match = none_match.to_str().ok()?;
let expected = format!("\"{}\"", version_meta.etag);
let found = none_match
.split(',')
.map(str::trim)
.any(|etag| etag == expected || etag == "\"*\"");
found
} else if let Some(modified_since) = req.headers().get(http::header::IF_MODIFIED_SINCE) {
let modified_since = modified_since.to_str().ok()?;
let client_date = httpdate::parse_http_date(modified_since).ok()?;
let server_date = UNIX_EPOCH + Duration::from_millis(version.timestamp);
client_date >= server_date
} else {
false
};
if cached {
Some(
Response::builder()
.status(StatusCode::NOT_MODIFIED)
.body(Body::empty())
.unwrap(),
)
} else {
None
}
}
pub async fn handle_head(
garage: Arc<Garage>,
req: &Request<Body>,
bucket: &str,
key: &str,
) -> Result<Response<Body>, Error> {
@ -61,8 +105,13 @@ pub async fn handle_head(
_ => unreachable!(),
};
let body: Body = Body::from(vec![]);
if let Some(cached) = try_answer_cached(&version, version_meta, req) {
return Ok(cached);
}
let body: Body = Body::empty();
let response = object_headers(&version, version_meta)
.header("Content-Length", format!("{}", version_meta.size))
.status(StatusCode::OK)
.body(body)
.unwrap();
@ -99,6 +148,10 @@ pub async fn handle_get(
ObjectVersionData::FirstBlock(meta, _) => meta,
};
if let Some(cached) = try_answer_cached(&last_v, last_v_meta, req) {
return Ok(cached);
}
let range = match req.headers().get("range") {
Some(range) => {
let range_str = range.to_str()?;
@ -123,7 +176,9 @@ pub async fn handle_get(
.await;
}
let resp_builder = object_headers(&last_v, last_v_meta).status(StatusCode::OK);
let resp_builder = object_headers(&last_v, last_v_meta)
.header("Content-Length", format!("{}", last_v_meta.size))
.status(StatusCode::OK);
match &last_v_data {
ObjectVersionData::DeleteMarker => unreachable!(),
@ -139,9 +194,10 @@ pub async fn handle_get(
let version = version.ok_or(Error::NotFound)?;
let mut blocks = version
.blocks()
.blocks
.items()
.iter()
.map(|vb| (vb.hash, None))
.map(|(_, vb)| (vb.hash, None))
.collect::<Vec<_>>();
blocks[0].1 = Some(first_block);
@ -161,7 +217,7 @@ pub async fn handle_get(
}
})
.buffered(2);
//let body: Body = Box::new(StreamBody::new(Box::pin(body_stream)));
let body = hyper::body::Body::wrap_stream(body_stream);
Ok(resp_builder.body(body)?)
}
@ -181,9 +237,10 @@ pub async fn handle_get_range(
}
let resp_builder = object_headers(version, version_meta)
.header("Content-Length", format!("{}", end - begin))
.header(
"Content-Range",
format!("bytes {}-{}/{}", begin, end, version_meta.size),
format!("bytes {}-{}/{}", begin, end - 1, version_meta.size),
)
.status(StatusCode::PARTIAL_CONTENT);
@ -206,35 +263,50 @@ pub async fn handle_get_range(
None => return Err(Error::NotFound),
};
let blocks = version
.blocks()
.iter()
.cloned()
.filter(|block| block.offset + block.size > begin && block.offset < end)
.collect::<Vec<_>>();
// We will store here the list of blocks that have an intersection with the requested
// range, as well as their "true offset", which is their actual offset in the complete
// file (whereas block.offset designates the offset of the block WITHIN THE PART
// block.part_number, which is not the same in the case of a multipart upload)
let mut blocks = Vec::with_capacity(std::cmp::min(
version.blocks.len(),
4 + ((end - begin) / std::cmp::max(version.blocks.items()[0].1.size as u64, 1024))
as usize,
));
let mut true_offset = 0;
for (_, b) in version.blocks.items().iter() {
if true_offset >= end {
break;
}
// Keep only blocks that have an intersection with the requested range
if true_offset < end && true_offset + b.size > begin {
blocks.push((b.clone(), true_offset));
}
true_offset += b.size;
}
let body_stream = futures::stream::iter(blocks)
.map(move |block| {
.map(move |(block, true_offset)| {
let garage = garage.clone();
async move {
let data = garage.block_manager.rpc_get_block(&block.hash).await?;
let start_in_block = if block.offset > begin {
let data = Bytes::from(data);
let start_in_block = if true_offset > begin {
0
} else {
begin - block.offset
begin - true_offset
};
let end_in_block = if block.offset + block.size < end {
let end_in_block = if true_offset + block.size < end {
block.size
} else {
end - block.offset
end - true_offset
};
Result::<Bytes, Error>::Ok(Bytes::from(
data[start_in_block as usize..end_in_block as usize].to_vec(),
data.slice(start_in_block as usize..end_in_block as usize),
))
}
})
.buffered(2);
//let body: Body = Box::new(StreamBody::new(Box::pin(body_stream)));
let body = hyper::body::Body::wrap_stream(body_stream);
Ok(resp_builder.body(body)?)
}

View file

@ -1,11 +1,11 @@
use std::collections::{BTreeMap, BTreeSet};
use std::collections::{BTreeMap, BTreeSet, HashMap};
use std::fmt::Write;
use std::sync::Arc;
use chrono::{DateTime, NaiveDateTime, SecondsFormat, Utc};
use hyper::{Body, Response};
use garage_util::error::Error;
use garage_util::error::Error as GarageError;
use garage_util::time::*;
use garage_model::garage::Garage;
use garage_model::object_table::*;
@ -13,92 +13,146 @@ use garage_model::object_table::*;
use garage_table::DeletedFilter;
use crate::encoding::*;
use crate::error::*;
#[derive(Debug)]
pub struct ListObjectsQuery {
pub is_v2: bool,
pub bucket: String,
pub delimiter: Option<String>,
pub max_keys: usize,
pub prefix: String,
pub marker: Option<String>,
pub continuation_token: Option<String>,
pub start_after: Option<String>,
pub urlencode_resp: bool,
}
#[derive(Debug)]
struct ListResultInfo {
last_modified: u64,
size: u64,
etag: String,
}
pub fn parse_list_objects_query(
bucket: &str,
params: &HashMap<String, String>,
) -> Result<ListObjectsQuery, Error> {
Ok(ListObjectsQuery {
is_v2: params.get("list-type").map(|x| x == "2").unwrap_or(false),
bucket: bucket.to_string(),
delimiter: params.get("delimiter").filter(|x| !x.is_empty()).cloned(),
max_keys: params
.get("max-keys")
.map(|x| {
x.parse::<usize>()
.ok_or_bad_request("Invalid value for max-keys")
})
.unwrap_or(Ok(1000))?,
prefix: params.get("prefix").cloned().unwrap_or(String::new()),
marker: params.get("marker").cloned(),
continuation_token: params.get("continuation-token").cloned(),
start_after: params.get("start-after").cloned(),
urlencode_resp: params
.get("encoding-type")
.map(|x| x == "url")
.unwrap_or(false),
})
}
pub async fn handle_list(
garage: Arc<Garage>,
bucket: &str,
delimiter: &str,
max_keys: usize,
prefix: &str,
marker: Option<&str>,
urlencode_resp: bool,
query: &ListObjectsQuery,
) -> Result<Response<Body>, Error> {
let mut result_keys = BTreeMap::<String, ListResultInfo>::new();
let mut result_common_prefixes = BTreeSet::<String>::new();
let mut next_chunk_start = marker.unwrap_or(prefix).to_string();
let mut next_chunk_start = if query.is_v2 {
if let Some(ct) = &query.continuation_token {
String::from_utf8(base64::decode(ct.as_bytes())?)?
} else {
query.start_after.clone().unwrap_or(query.prefix.clone())
}
} else {
query.marker.clone().unwrap_or(query.prefix.clone())
};
debug!("List request: `{}` {} `{}`", delimiter, max_keys, prefix);
debug!(
"List request: `{:?}` {} `{}`",
query.delimiter, query.max_keys, query.prefix
);
let truncated;
'query_loop: loop {
let objects = garage
.object_table
.get_range(
&bucket.to_string(),
&query.bucket,
Some(next_chunk_start.clone()),
Some(DeletedFilter::NotDeleted),
max_keys + 1,
query.max_keys + 1,
)
.await?;
debug!(
"List: get range {} (max {}), results: {}",
next_chunk_start,
max_keys + 1,
query.max_keys + 1,
objects.len()
);
for object in objects.iter() {
if !object.key.starts_with(prefix) {
truncated = false;
if !object.key.starts_with(&query.prefix) {
truncated = None;
break 'query_loop;
}
if query.is_v2 && query.start_after.as_ref() == Some(&object.key) {
continue;
}
if let Some(version) = object.versions().iter().find(|x| x.is_data()) {
if result_keys.len() + result_common_prefixes.len() >= max_keys {
truncated = true;
if result_keys.len() + result_common_prefixes.len() >= query.max_keys {
truncated = Some(object.key.to_string());
break 'query_loop;
}
let common_prefix = if delimiter.len() > 0 {
let relative_key = &object.key[prefix.len()..];
let common_prefix = if let Some(delimiter) = &query.delimiter {
let relative_key = &object.key[query.prefix.len()..];
relative_key
.find(delimiter)
.map(|i| &object.key[..prefix.len() + i + delimiter.len()])
.map(|i| &object.key[..query.prefix.len() + i + delimiter.len()])
} else {
None
};
if let Some(pfx) = common_prefix {
result_common_prefixes.insert(pfx.to_string());
} else {
let size = match &version.state {
ObjectVersionState::Complete(ObjectVersionData::Inline(meta, _)) => {
meta.size
}
let meta = match &version.state {
ObjectVersionState::Complete(ObjectVersionData::Inline(meta, _)) => meta,
ObjectVersionState::Complete(ObjectVersionData::FirstBlock(meta, _)) => {
meta.size
meta
}
_ => unreachable!(),
};
let info = match result_keys.get(&object.key) {
None => ListResultInfo {
last_modified: version.timestamp,
size,
size: meta.size,
etag: meta.etag.to_string(),
},
Some(_lri) => {
return Err(Error::Message(format!("Duplicate key?? {}", object.key)))
return Err(Error::InternalError(GarageError::Message(format!(
"Duplicate key?? {}",
object.key
))))
}
};
result_keys.insert(object.key.clone(), info);
};
}
}
if objects.len() < max_keys + 1 {
truncated = false;
if objects.len() < query.max_keys + 1 {
truncated = None;
break 'query_loop;
}
if objects.len() > 0 {
@ -113,43 +167,119 @@ pub async fn handle_list(
r#"<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">"#
)
.unwrap();
writeln!(&mut xml, "\t<Bucket>{}</Bucket>", bucket).unwrap();
writeln!(&mut xml, "\t<Prefix>{}</Prefix>", prefix).unwrap();
writeln!(&mut xml, "\t<KeyCount>{}</KeyCount>", result_keys.len()).unwrap();
writeln!(&mut xml, "\t<MaxKeys>{}</MaxKeys>", max_keys).unwrap();
writeln!(&mut xml, "\t<IsTruncated>{}</IsTruncated>", truncated).unwrap();
writeln!(&mut xml, "\t<Name>{}</Name>", query.bucket).unwrap();
// TODO: in V1, is this supposed to be urlencoded when encoding-type is URL??
writeln!(
&mut xml,
"\t<Prefix>{}</Prefix>",
xml_encode_key(&query.prefix, query.urlencode_resp),
)
.unwrap();
if let Some(delim) = &query.delimiter {
// TODO: in V1, is this supposed to be urlencoded when encoding-type is URL??
writeln!(
&mut xml,
"\t<Delimiter>{}</Delimiter>",
xml_encode_key(delim, query.urlencode_resp),
)
.unwrap();
}
writeln!(&mut xml, "\t<MaxKeys>{}</MaxKeys>", query.max_keys).unwrap();
if query.urlencode_resp {
writeln!(&mut xml, "\t<EncodingType>url</EncodingType>").unwrap();
}
writeln!(
&mut xml,
"\t<KeyCount>{}</KeyCount>",
result_keys.len() + result_common_prefixes.len()
)
.unwrap();
writeln!(
&mut xml,
"\t<IsTruncated>{}</IsTruncated>",
truncated.is_some()
)
.unwrap();
if query.is_v2 {
if let Some(ct) = &query.continuation_token {
writeln!(&mut xml, "\t<ContinuationToken>{}</ContinuationToken>", ct).unwrap();
}
if let Some(sa) = &query.start_after {
writeln!(
&mut xml,
"\t<StartAfter>{}</StartAfter>",
xml_encode_key(sa, query.urlencode_resp)
)
.unwrap();
}
if let Some(nct) = truncated {
writeln!(
&mut xml,
"\t<NextContinuationToken>{}</NextContinuationToken>",
base64::encode(nct.as_bytes())
)
.unwrap();
}
} else {
// TODO: are these supposed to be urlencoded when encoding-type is URL??
if let Some(mkr) = &query.marker {
writeln!(
&mut xml,
"\t<Marker>{}</Marker>",
xml_encode_key(mkr, query.urlencode_resp)
)
.unwrap();
}
if let Some(next_marker) = truncated {
writeln!(
&mut xml,
"\t<NextMarker>{}</NextMarker>",
xml_encode_key(&next_marker, query.urlencode_resp)
)
.unwrap();
}
}
for (key, info) in result_keys.iter() {
let last_modif = NaiveDateTime::from_timestamp(info.last_modified as i64 / 1000, 0);
let last_modif = DateTime::<Utc>::from_utc(last_modif, Utc);
let last_modif = last_modif.to_rfc3339_opts(SecondsFormat::Millis, true);
let last_modif = msec_to_rfc3339(info.last_modified);
writeln!(&mut xml, "\t<Contents>").unwrap();
writeln!(
&mut xml,
"\t\t<Key>{}</Key>",
xml_escape(key),
//xml_encode_key(key, urlencode_resp) // doesn't work with nextcloud, wtf
xml_encode_key(key, query.urlencode_resp),
)
.unwrap();
writeln!(&mut xml, "\t\t<LastModified>{}</LastModified>", last_modif).unwrap();
writeln!(&mut xml, "\t\t<Size>{}</Size>", info.size).unwrap();
if !info.etag.is_empty() {
writeln!(&mut xml, "\t\t<ETag>\"{}\"</ETag>", info.etag).unwrap();
}
writeln!(&mut xml, "\t\t<StorageClass>STANDARD</StorageClass>").unwrap();
writeln!(&mut xml, "\t</Contents>").unwrap();
}
if result_common_prefixes.len() > 0 {
for pfx in result_common_prefixes.iter() {
writeln!(&mut xml, "\t<CommonPrefixes>").unwrap();
for pfx in result_common_prefixes.iter() {
writeln!(
&mut xml,
"\t\t<Prefix>{}</Prefix>",
xml_escape(pfx),
//xml_encode_key(pfx, urlencode_resp)
)
.unwrap();
}
//TODO: in V1, are these urlencoded when urlencode_resp is true ?? (proably)
writeln!(
&mut xml,
"\t\t<Prefix>{}</Prefix>",
xml_encode_key(pfx, query.urlencode_resp),
)
.unwrap();
writeln!(&mut xml, "\t</CommonPrefixes>").unwrap();
}
writeln!(&mut xml, "</ListBucketResult>").unwrap();
println!("{}", xml);
Ok(Response::new(Body::from(xml.into_bytes())))
writeln!(&mut xml, "</ListBucketResult>").unwrap();
debug!("{}", xml);
Ok(Response::builder()
.header("Content-Type", "application/xml")
.body(Body::from(xml.into_bytes()))?)
}

View file

@ -2,16 +2,17 @@ use std::collections::{BTreeMap, VecDeque};
use std::fmt::Write;
use std::sync::Arc;
use fastcdc::{Chunk, FastCDC};
use futures::stream::*;
use hyper::{Body, Request, Response};
use md5::{digest::generic_array::*, Digest as Md5Digest, Md5};
use sha2::{Digest as Sha256Digest, Sha256};
use sha2::Sha256;
use garage_table::*;
use garage_util::data::*;
use garage_util::error::Error as GarageError;
use garage_util::time::*;
use crate::error::*;
use garage_model::block::INLINE_THRESHOLD;
use garage_model::block_ref_table::*;
use garage_model::garage::Garage;
@ -19,6 +20,10 @@ use garage_model::object_table::*;
use garage_model::version_table::*;
use crate::encoding::*;
use crate::error::*;
use crate::signature::verify_signed_content;
// ---- PutObject call ----
pub async fn handle_put(
garage: Arc<Garage>,
@ -27,85 +32,113 @@ pub async fn handle_put(
key: &str,
content_sha256: Option<Hash>,
) -> Result<Response<Body>, Error> {
// Generate identity of new version
let version_uuid = gen_uuid();
let version_timestamp = now_msec();
// Retrieve interesting headers from request
let headers = get_headers(&req)?;
debug!("Object headers: {:?}", headers);
let content_md5 = match req.headers().get("content-md5") {
Some(x) => Some(x.to_str()?.to_string()),
None => None,
};
// Parse body of uploaded file
let body = req.into_body();
let mut chunker = BodyChunker::new(body, garage.config.block_size);
let first_block = chunker.next().await?.unwrap_or(vec![]);
let mut object_version = ObjectVersion {
uuid: version_uuid,
timestamp: now_msec(),
state: ObjectVersionState::Uploading(headers.clone()),
};
// If body is small enough, store it directly in the object table
// as "inline data". We can then return immediately.
if first_block.len() < INLINE_THRESHOLD {
let mut md5sum = Md5::new();
md5sum.update(&first_block[..]);
let md5sum_arr = md5sum.finalize();
let md5sum_hex = hex::encode(md5sum_arr);
let data_md5sum = md5sum.finalize();
let data_md5sum_hex = hex::encode(data_md5sum);
let mut sha256sum = Sha256::new();
sha256sum.input(&first_block[..]);
let sha256sum_arr = sha256sum.result();
let mut hash = [0u8; 32];
hash.copy_from_slice(&sha256sum_arr[..]);
let sha256sum_hash = Hash::from(hash);
let data_sha256sum = sha256sum(&first_block[..]);
ensure_checksum_matches(
md5sum_arr.as_slice(),
sha256sum_hash,
data_md5sum.as_slice(),
data_sha256sum,
content_md5.as_deref(),
content_sha256,
)?;
object_version.state = ObjectVersionState::Complete(ObjectVersionData::Inline(
ObjectVersionMeta {
headers,
size: first_block.len() as u64,
etag: md5sum_hex.clone(),
},
first_block,
));
let object_version = ObjectVersion {
uuid: version_uuid,
timestamp: version_timestamp,
state: ObjectVersionState::Complete(ObjectVersionData::Inline(
ObjectVersionMeta {
headers,
size: first_block.len() as u64,
etag: data_md5sum_hex.clone(),
},
first_block,
)),
};
let object = Object::new(bucket.into(), key.into(), vec![object_version]);
garage.object_table.insert(&object).await?;
return Ok(put_response(version_uuid, md5sum_hex));
return Ok(put_response(version_uuid, data_md5sum_hex));
}
let version = Version::new(version_uuid, bucket.into(), key.into(), false, vec![]);
let first_block_hash = hash(&first_block[..]);
// Write version identifier in object table so that we have a trace
// that we are uploading something
let mut object_version = ObjectVersion {
uuid: version_uuid,
timestamp: version_timestamp,
state: ObjectVersionState::Uploading(headers.clone()),
};
let object = Object::new(bucket.into(), key.into(), vec![object_version.clone()]);
garage.object_table.insert(&object).await?;
let (total_size, md5sum_arr, sha256sum) = read_and_put_blocks(
// Initialize corresponding entry in version table
// Write this entry now, even with empty block list,
// to prevent block_ref entries from being deleted (they can be deleted
// if the reference a version that isn't found in the version table)
let version = Version::new(version_uuid, bucket.into(), key.into(), false);
garage.version_table.insert(&version).await?;
// Transfer data and verify checksum
let first_block_hash = blake2sum(&first_block[..]);
let tx_result = read_and_put_blocks(
&garage,
version,
&version,
1,
first_block,
first_block_hash,
&mut chunker,
)
.await?;
.await
.and_then(|(total_size, data_md5sum, data_sha256sum)| {
ensure_checksum_matches(
data_md5sum.as_slice(),
data_sha256sum,
content_md5.as_deref(),
content_sha256,
)
.map(|()| (total_size, data_md5sum))
});
ensure_checksum_matches(
md5sum_arr.as_slice(),
sha256sum,
content_md5.as_deref(),
content_sha256,
)?;
// TODO: if at any step we have an error, we should undo everything we did
// If something went wrong, clean up
let (total_size, md5sum_arr) = match tx_result {
Ok(rv) => rv,
Err(e) => {
// Mark object as aborted, this will free the blocks further down
object_version.state = ObjectVersionState::Aborted;
let object = Object::new(bucket.into(), key.into(), vec![object_version.clone()]);
garage.object_table.insert(&object).await?;
return Err(e);
}
};
// Save final object state, marked as Complete
let md5sum_hex = hex::encode(md5sum_arr);
object_version.state = ObjectVersionState::Complete(ObjectVersionData::FirstBlock(
ObjectVersionMeta {
headers,
@ -114,179 +147,22 @@ pub async fn handle_put(
},
first_block_hash,
));
let object = Object::new(bucket.into(), key.into(), vec![object_version]);
garage.object_table.insert(&object).await?;
Ok(put_response(version_uuid, md5sum_hex))
}
/// Validate MD5 sum against content-md5 header
/// and sha256sum against signed content-sha256
fn ensure_checksum_matches(
md5sum: &[u8],
sha256sum: garage_util::data::FixedBytes32,
content_md5: Option<&str>,
content_sha256: Option<garage_util::data::FixedBytes32>,
) -> Result<(), Error> {
if let Some(expected_sha256) = content_sha256 {
if expected_sha256 != sha256sum {
return Err(Error::BadRequest(format!(
"Unable to validate x-amz-content-sha256"
)));
} else {
trace!("Successfully validated x-amz-content-sha256");
}
}
if let Some(expected_md5) = content_md5 {
if expected_md5.trim_matches('"') != base64::encode(md5sum) {
return Err(Error::BadRequest(format!("Unable to validate content-md5")));
} else {
trace!("Successfully validated content-md5");
}
}
Ok(())
}
async fn read_and_put_blocks(
garage: &Arc<Garage>,
version: Version,
part_number: u64,
first_block: Vec<u8>,
first_block_hash: Hash,
chunker: &mut BodyChunker,
) -> Result<(u64, GenericArray<u8, typenum::U16>, Hash), Error> {
let mut md5sum = Md5::new();
let mut sha256sum = Sha256::new();
md5sum.update(&first_block[..]);
sha256sum.input(&first_block[..]);
let mut next_offset = first_block.len();
let mut put_curr_version_block = put_block_meta(
garage.clone(),
&version,
part_number,
0,
first_block_hash,
first_block.len() as u64,
);
let mut put_curr_block = garage
.block_manager
.rpc_put_block(first_block_hash, first_block);
loop {
let (_, _, next_block) =
futures::try_join!(put_curr_block, put_curr_version_block, chunker.next())?;
if let Some(block) = next_block {
md5sum.update(&block[..]);
sha256sum.input(&block[..]);
let block_hash = hash(&block[..]);
let block_len = block.len();
put_curr_version_block = put_block_meta(
garage.clone(),
&version,
part_number,
next_offset as u64,
block_hash,
block_len as u64,
);
put_curr_block = garage.block_manager.rpc_put_block(block_hash, block);
next_offset += block_len;
} else {
break;
}
}
let total_size = next_offset as u64;
let md5sum_arr = md5sum.finalize();
let sha256sum = sha256sum.result();
let mut hash = [0u8; 32];
hash.copy_from_slice(&sha256sum[..]);
let sha256sum = Hash::from(hash);
Ok((total_size, md5sum_arr, sha256sum))
}
async fn put_block_meta(
garage: Arc<Garage>,
version: &Version,
part_number: u64,
offset: u64,
hash: Hash,
size: u64,
) -> Result<(), GarageError> {
// TODO: don't clone, restart from empty block list ??
let mut version = version.clone();
version
.add_block(VersionBlock {
part_number,
offset,
hash,
size,
})
.unwrap();
let block_ref = BlockRef {
block: hash,
version: version.uuid,
deleted: false,
};
futures::try_join!(
garage.version_table.insert(&version),
garage.block_ref_table.insert(&block_ref),
)?;
Ok(())
}
struct BodyChunker {
body: Body,
read_all: bool,
block_size: usize,
buf: VecDeque<u8>,
}
impl BodyChunker {
fn new(body: Body, block_size: usize) -> Self {
Self {
body,
read_all: false,
block_size,
buf: VecDeque::new(),
}
}
async fn next(&mut self) -> Result<Option<Vec<u8>>, GarageError> {
while !self.read_all && self.buf.len() < self.block_size {
if let Some(block) = self.body.next().await {
let bytes = block?;
trace!("Body next: {} bytes", bytes.len());
self.buf.extend(&bytes[..]);
} else {
self.read_all = true;
}
}
if self.buf.len() == 0 {
Ok(None)
} else if self.buf.len() <= self.block_size {
let block = self.buf.drain(..).collect::<Vec<u8>>();
Ok(Some(block))
} else {
let block = self.buf.drain(..self.block_size).collect::<Vec<u8>>();
Ok(Some(block))
}
}
}
pub fn put_response(version_uuid: UUID, etag: String) -> Response<Body> {
pub fn put_response(version_uuid: UUID, md5sum_hex: String) -> Response<Body> {
Response::builder()
.header("x-amz-version-id", hex::encode(version_uuid))
.header("ETag", etag)
// TODO ETag
.header("ETag", format!("\"{}\"", md5sum_hex))
.body(Body::from(vec![]))
.unwrap()
}
// ---- Mutlipart upload calls ----
pub async fn handle_create_multipart_upload(
garage: Arc<Garage>,
req: &Request<Body>,
@ -296,6 +172,7 @@ pub async fn handle_create_multipart_upload(
let version_uuid = gen_uuid();
let headers = get_headers(req)?;
// Create object in object table
let object_version = ObjectVersion {
uuid: version_uuid,
timestamp: now_msec(),
@ -304,6 +181,14 @@ pub async fn handle_create_multipart_upload(
let object = Object::new(bucket.to_string(), key.to_string(), vec![object_version]);
garage.object_table.insert(&object).await?;
// Insert empty version so that block_ref entries refer to something
// (they are inserted concurrently with blocks in the version table, so
// there is the possibility that they are inserted before the version table
// is created, in which case it is allowed to delete them, e.g. in repair_*)
let version = Version::new(version_uuid, bucket.into(), key.into(), false);
garage.version_table.insert(&version).await?;
// Send success response
let mut xml = String::new();
writeln!(&mut xml, r#"<?xml version="1.0" encoding="UTF-8"?>"#).unwrap();
writeln!(
@ -346,13 +231,12 @@ pub async fn handle_put_part(
};
// Read first chuck, and at the same time try to get object to see if it exists
let mut chunker = BodyChunker::new(req.into_body(), garage.config.block_size);
let bucket = bucket.to_string();
let key = key.to_string();
let get_object_fut = garage.object_table.get(&bucket, &key);
let get_first_block_fut = chunker.next();
let (object, first_block) = futures::try_join!(get_object_fut, get_first_block_fut)?;
let mut chunker = BodyChunker::new(req.into_body(), garage.config.block_size);
let (object, first_block) =
futures::try_join!(garage.object_table.get(&bucket, &key), chunker.next(),)?;
// Check object is valid and multipart block can be accepted
let first_block = first_block.ok_or(Error::BadRequest(format!("Empty body")))?;
@ -363,17 +247,15 @@ pub async fn handle_put_part(
.iter()
.any(|v| v.uuid == version_uuid && v.is_uploading())
{
return Err(Error::BadRequest(format!(
"Multipart upload does not exist or is otherwise invalid"
)));
return Err(Error::NotFound);
}
// Copy block to store
let version = Version::new(version_uuid, bucket.into(), key.into(), false, vec![]);
let first_block_hash = hash(&first_block[..]);
let (_, md5sum_arr, sha256sum) = read_and_put_blocks(
let version = Version::new(version_uuid, bucket, key, false);
let first_block_hash = blake2sum(&first_block[..]);
let (_, data_md5sum, data_sha256sum) = read_and_put_blocks(
&garage,
version,
&version,
part_number,
first_block,
first_block_hash,
@ -381,23 +263,48 @@ pub async fn handle_put_part(
)
.await?;
// Verify that checksums map
ensure_checksum_matches(
md5sum_arr.as_slice(),
sha256sum,
data_md5sum.as_slice(),
data_sha256sum,
content_md5.as_deref(),
content_sha256,
)?;
Ok(Response::new(Body::from(vec![])))
// Store part etag in version
let data_md5sum_hex = hex::encode(data_md5sum);
let mut version = version;
version
.parts_etags
.put(part_number, data_md5sum_hex.clone());
garage.version_table.insert(&version).await?;
let response = Response::builder()
.header("ETag", format!("\"{}\"", data_md5sum_hex))
.body(Body::from(vec![]))
.unwrap();
Ok(response)
}
pub async fn handle_complete_multipart_upload(
garage: Arc<Garage>,
_req: Request<Body>,
req: Request<Body>,
bucket: &str,
key: &str,
upload_id: &str,
content_sha256: Option<Hash>,
) -> Result<Response<Body>, Error> {
let body = hyper::body::to_bytes(req.into_body()).await?;
verify_signed_content(content_sha256, &body[..])?;
let body_xml = roxmltree::Document::parse(&std::str::from_utf8(&body)?)?;
let body_list_of_parts = parse_complete_multpart_upload_body(&body_xml)
.ok_or_bad_request("Invalid CompleteMultipartUpload XML")?;
debug!(
"CompleteMultipartUpload list of parts: {:?}",
body_list_of_parts
);
let version_uuid = decode_upload_id(upload_id)?;
let bucket = bucket.to_string();
@ -406,52 +313,70 @@ pub async fn handle_complete_multipart_upload(
garage.object_table.get(&bucket, &key),
garage.version_table.get(&version_uuid, &EmptyKey),
)?;
let object = object.ok_or(Error::BadRequest(format!("Object not found")))?;
let object_version = object
let object = object.ok_or(Error::BadRequest(format!("Object not found")))?;
let mut object_version = object
.versions()
.iter()
.find(|v| v.uuid == version_uuid && v.is_uploading());
let mut object_version = match object_version {
None => {
return Err(Error::BadRequest(format!(
"Multipart upload does not exist or has already been completed"
)))
}
Some(x) => x.clone(),
};
let version = version.ok_or(Error::BadRequest(format!("Version not found")))?;
.find(|v| v.uuid == version_uuid && v.is_uploading())
.cloned()
.ok_or(Error::BadRequest(format!("Version not found")))?;
if version.blocks().len() == 0 {
let version = version.ok_or(Error::BadRequest(format!("Version not found")))?;
if version.blocks.len() == 0 {
return Err(Error::BadRequest(format!("No data was uploaded")));
}
let headers = match object_version.state {
ObjectVersionState::Uploading(headers) => headers.clone(),
_ => unreachable!(),
};
// TODO: check that all the parts that they pretend they gave us are indeed there
// TODO: when we read the XML from _req, remember to check the sha256 sum of the payload
// against the signed x-amz-content-sha256
// TODO: check MD5 sum of all uploaded parts? but that would mean we have to store them somewhere...
let total_size = version
.blocks()
// Check that the list of parts they gave us corresponds to the parts we have here
debug!("Expected parts from request: {:?}", body_list_of_parts);
debug!("Parts stored in version: {:?}", version.parts_etags.items());
let parts = version
.parts_etags
.items()
.iter()
.map(|x| x.size)
.fold(0, |x, y| x + y);
.map(|pair| (&pair.0, &pair.1));
let same_parts = body_list_of_parts
.iter()
.map(|x| (&x.part_number, &x.etag))
.eq(parts);
if !same_parts {
return Err(Error::BadRequest(format!("We don't have the same parts")));
}
// Calculate etag of final object
// To understand how etags are calculated, read more here:
// https://teppen.io/2018/06/23/aws_s3_etags/
let num_parts = version.blocks.items().last().unwrap().0.part_number
- version.blocks.items().first().unwrap().0.part_number
+ 1;
let mut etag_md5_hasher = Md5::new();
for (_, etag) in version.parts_etags.items().iter() {
etag_md5_hasher.update(etag.as_bytes());
}
let etag = format!("{}-{}", hex::encode(etag_md5_hasher.finalize()), num_parts);
// Calculate total size of final object
let total_size = version.blocks.items().iter().map(|x| x.1.size).sum();
// Write final object version
object_version.state = ObjectVersionState::Complete(ObjectVersionData::FirstBlock(
ObjectVersionMeta {
headers,
size: total_size,
etag: "".to_string(), // TODO
etag,
},
version.blocks()[0].hash,
version.blocks.items()[0].1.hash,
));
let final_object = Object::new(bucket.clone(), key.clone(), vec![object_version]);
garage.object_table.insert(&final_object).await?;
// Send response saying ok we're done
let mut xml = String::new();
writeln!(&mut xml, r#"<?xml version="1.0" encoding="UTF-8"?>"#).unwrap();
writeln!(
@ -491,11 +416,7 @@ pub async fn handle_abort_multipart_upload(
.iter()
.find(|v| v.uuid == version_uuid && v.is_uploading());
let mut object_version = match object_version {
None => {
return Err(Error::BadRequest(format!(
"Multipart upload does not exist or has already been completed"
)))
}
None => return Err(Error::NotFound),
Some(x) => x.clone(),
};
@ -506,37 +427,7 @@ pub async fn handle_abort_multipart_upload(
Ok(Response::new(Body::from(vec![])))
}
fn get_mime_type(req: &Request<Body>) -> Result<String, Error> {
Ok(req
.headers()
.get(hyper::header::CONTENT_TYPE)
.map(|x| x.to_str())
.unwrap_or(Ok("blob"))?
.to_string())
}
fn get_headers(req: &Request<Body>) -> Result<ObjectVersionHeaders, Error> {
let content_type = get_mime_type(req)?;
let other_headers = vec![
hyper::header::CACHE_CONTROL,
hyper::header::CONTENT_DISPOSITION,
hyper::header::CONTENT_ENCODING,
hyper::header::CONTENT_LANGUAGE,
hyper::header::EXPIRES,
];
let mut other = BTreeMap::new();
for h in other_headers.iter() {
if let Some(v) = req.headers().get(h) {
if let Ok(v_str) = v.to_str() {
other.insert(h.to_string(), v_str.to_string());
}
}
}
Ok(ObjectVersionHeaders {
content_type,
other: BTreeMap::new(),
})
}
// ---- Parsing input to multipart upload calls ----
fn decode_upload_id(id: &str) -> Result<UUID, Error> {
let id_bin = hex::decode(id).ok_or_bad_request("Invalid upload ID")?;
@ -547,3 +438,260 @@ fn decode_upload_id(id: &str) -> Result<UUID, Error> {
uuid.copy_from_slice(&id_bin[..]);
Ok(UUID::from(uuid))
}
#[derive(Debug)]
struct CompleteMultipartUploadPart {
etag: String,
part_number: u64,
}
fn parse_complete_multpart_upload_body(
xml: &roxmltree::Document,
) -> Option<Vec<CompleteMultipartUploadPart>> {
let mut parts = vec![];
let root = xml.root();
let cmu = root.first_child()?;
if !cmu.has_tag_name("CompleteMultipartUpload") {
return None;
}
for item in cmu.children() {
if item.has_tag_name("Part") {
let etag = item.children().find(|e| e.has_tag_name("ETag"))?.text()?;
let part_number = item
.children()
.find(|e| e.has_tag_name("PartNumber"))?
.text()?;
parts.push(CompleteMultipartUploadPart {
etag: etag.trim_matches('"').to_string(),
part_number: part_number.parse().ok()?,
});
} else {
return None;
}
}
Some(parts)
}
// ---- Common code ----
pub(crate) fn get_headers(req: &Request<Body>) -> Result<ObjectVersionHeaders, Error> {
let content_type = req
.headers()
.get(hyper::header::CONTENT_TYPE)
.map(|x| x.to_str())
.unwrap_or(Ok("blob"))?
.to_string();
let mut other = BTreeMap::new();
// Preserve standard headers
let standard_header = vec![
hyper::header::CACHE_CONTROL,
hyper::header::CONTENT_DISPOSITION,
hyper::header::CONTENT_ENCODING,
hyper::header::CONTENT_LANGUAGE,
hyper::header::EXPIRES,
];
for h in standard_header.iter() {
if let Some(v) = req.headers().get(h) {
match v.to_str() {
Ok(v_str) => {
other.insert(h.to_string(), v_str.to_string());
}
Err(e) => {
warn!("Discarding header {}, error in .to_str(): {}", h, e);
}
}
}
}
// Preserve x-amz-meta- headers
for (k, v) in req.headers().iter() {
if k.as_str().starts_with("x-amz-meta-") {
match v.to_str() {
Ok(v_str) => {
other.insert(k.to_string(), v_str.to_string());
}
Err(e) => {
warn!("Discarding header {}, error in .to_str(): {}", k, e);
}
}
}
}
Ok(ObjectVersionHeaders {
content_type,
other,
})
}
struct BodyChunker {
body: Body,
read_all: bool,
min_block_size: usize,
avg_block_size: usize,
max_block_size: usize,
buf: VecDeque<u8>,
}
impl BodyChunker {
fn new(body: Body, block_size: usize) -> Self {
let min_block_size = block_size / 4 * 3;
let avg_block_size = block_size;
let max_block_size = block_size * 2;
Self {
body,
read_all: false,
min_block_size,
avg_block_size,
max_block_size,
buf: VecDeque::with_capacity(2 * max_block_size),
}
}
async fn next(&mut self) -> Result<Option<Vec<u8>>, GarageError> {
while !self.read_all && self.buf.len() < self.max_block_size {
if let Some(block) = self.body.next().await {
let bytes = block?;
trace!("Body next: {} bytes", bytes.len());
self.buf.extend(&bytes[..]);
} else {
self.read_all = true;
}
}
if self.buf.len() == 0 {
Ok(None)
} else {
let mut iter = FastCDC::with_eof(
self.buf.make_contiguous(),
self.min_block_size,
self.avg_block_size,
self.max_block_size,
self.read_all,
);
if let Some(Chunk { length, .. }) = iter.next() {
let block = self.buf.drain(..length).collect::<Vec<u8>>();
Ok(Some(block))
} else {
unreachable!("FastCDC returned not chunk")
}
}
}
}
async fn read_and_put_blocks(
garage: &Garage,
version: &Version,
part_number: u64,
first_block: Vec<u8>,
first_block_hash: Hash,
chunker: &mut BodyChunker,
) -> Result<(u64, GenericArray<u8, typenum::U16>, Hash), Error> {
let mut md5hasher = Md5::new();
let mut sha256hasher = Sha256::new();
md5hasher.update(&first_block[..]);
sha256hasher.update(&first_block[..]);
let mut next_offset = first_block.len();
let mut put_curr_version_block = put_block_meta(
&garage,
&version,
part_number,
0,
first_block_hash,
first_block.len() as u64,
);
let mut put_curr_block = garage
.block_manager
.rpc_put_block(first_block_hash, first_block);
loop {
let (_, _, next_block) =
futures::try_join!(put_curr_block, put_curr_version_block, chunker.next())?;
if let Some(block) = next_block {
md5hasher.update(&block[..]);
sha256hasher.update(&block[..]);
let block_hash = blake2sum(&block[..]);
let block_len = block.len();
put_curr_version_block = put_block_meta(
&garage,
&version,
part_number,
next_offset as u64,
block_hash,
block_len as u64,
);
put_curr_block = garage.block_manager.rpc_put_block(block_hash, block);
next_offset += block_len;
} else {
break;
}
}
let total_size = next_offset as u64;
let data_md5sum = md5hasher.finalize();
let data_sha256sum = sha256hasher.finalize();
let data_sha256sum = Hash::try_from(&data_sha256sum[..]).unwrap();
Ok((total_size, data_md5sum, data_sha256sum))
}
async fn put_block_meta(
garage: &Garage,
version: &Version,
part_number: u64,
offset: u64,
hash: Hash,
size: u64,
) -> Result<(), GarageError> {
let mut version = version.clone();
version.blocks.put(
VersionBlockKey {
part_number,
offset,
},
VersionBlock { hash, size },
);
let block_ref = BlockRef {
block: hash,
version: version.uuid,
deleted: false.into(),
};
futures::try_join!(
garage.version_table.insert(&version),
garage.block_ref_table.insert(&block_ref),
)?;
Ok(())
}
/// Validate MD5 sum against content-md5 header
/// and sha256sum against signed content-sha256
fn ensure_checksum_matches(
data_md5sum: &[u8],
data_sha256sum: garage_util::data::FixedBytes32,
content_md5: Option<&str>,
content_sha256: Option<garage_util::data::FixedBytes32>,
) -> Result<(), Error> {
if let Some(expected_sha256) = content_sha256 {
if expected_sha256 != data_sha256sum {
return Err(Error::BadRequest(format!(
"Unable to validate x-amz-content-sha256"
)));
} else {
trace!("Successfully validated x-amz-content-sha256");
}
}
if let Some(expected_md5) = content_md5 {
if expected_md5.trim_matches('"') != base64::encode(data_md5sum) {
return Err(Error::BadRequest(format!("Unable to validate content-md5")));
} else {
trace!("Successfully validated content-md5");
}
}
Ok(())
}

View file

@ -1,12 +1,12 @@
use std::collections::HashMap;
use chrono::{DateTime, Duration, NaiveDateTime, Utc};
use hmac::{Hmac, Mac};
use hmac::{Hmac, Mac, NewMac};
use hyper::{Body, Method, Request};
use sha2::{Digest, Sha256};
use garage_table::*;
use garage_util::data::Hash;
use garage_util::data::{sha256sum, Hash};
use garage_model::garage::Garage;
use garage_model::key_table::*;
@ -91,8 +91,8 @@ pub async fn check_signature(
"s3",
)
.ok_or_internal_error("Unable to build signing HMAC")?;
hmac.input(string_to_sign.as_bytes());
let signature = hex::encode(hmac.result().code());
hmac.update(string_to_sign.as_bytes());
let signature = hex::encode(hmac.finalize().into_bytes());
if authorization.signature != signature {
trace!("Canonical request: ``{}``", canonical_request);
@ -106,12 +106,10 @@ pub async fn check_signature(
} else {
let bytes = hex::decode(authorization.content_sha256)
.ok_or_bad_request("Invalid content sha256 hash")?;
let mut hash = [0u8; 32];
if bytes.len() != 32 {
return Err(Error::BadRequest(format!("Invalid content sha256 hash")));
}
hash.copy_from_slice(&bytes[..]);
Some(Hash::from(hash))
Some(
Hash::try_from(&bytes[..])
.ok_or(Error::BadRequest(format!("Invalid content sha256 hash")))?,
)
};
Ok((key, content_sha256))
@ -220,12 +218,12 @@ fn parse_credential(cred: &str) -> Result<(String, String), Error> {
fn string_to_sign(datetime: &DateTime<Utc>, scope_string: &str, canonical_req: &str) -> String {
let mut hasher = Sha256::default();
hasher.input(canonical_req.as_bytes());
hasher.update(canonical_req.as_bytes());
[
"AWS4-HMAC-SHA256",
&datetime.format(LONG_DATETIME).to_string(),
scope_string,
&hex::encode(hasher.result().as_slice()),
&hex::encode(hasher.finalize().as_slice()),
]
.join("\n")
}
@ -238,14 +236,14 @@ fn signing_hmac(
) -> Result<HmacSha256, crypto_mac::InvalidKeyLength> {
let secret = String::from("AWS4") + secret_key;
let mut date_hmac = HmacSha256::new_varkey(secret.as_bytes())?;
date_hmac.input(datetime.format(SHORT_DATE).to_string().as_bytes());
let mut region_hmac = HmacSha256::new_varkey(&date_hmac.result().code())?;
region_hmac.input(region.as_bytes());
let mut service_hmac = HmacSha256::new_varkey(&region_hmac.result().code())?;
service_hmac.input(service.as_bytes());
let mut signing_hmac = HmacSha256::new_varkey(&service_hmac.result().code())?;
signing_hmac.input(b"aws4_request");
let hmac = HmacSha256::new_varkey(&signing_hmac.result().code())?;
date_hmac.update(datetime.format(SHORT_DATE).to_string().as_bytes());
let mut region_hmac = HmacSha256::new_varkey(&date_hmac.finalize().into_bytes())?;
region_hmac.update(region.as_bytes());
let mut service_hmac = HmacSha256::new_varkey(&region_hmac.finalize().into_bytes())?;
service_hmac.update(service.as_bytes());
let mut signing_hmac = HmacSha256::new_varkey(&service_hmac.finalize().into_bytes())?;
signing_hmac.update(b"aws4_request");
let hmac = HmacSha256::new_varkey(&signing_hmac.finalize().into_bytes())?;
Ok(hmac)
}
@ -293,3 +291,14 @@ fn canonical_query_string(uri: &hyper::Uri) -> String {
"".to_string()
}
}
pub fn verify_signed_content(content_sha256: Option<Hash>, body: &[u8]) -> Result<(), Error> {
let expected_sha256 =
content_sha256.ok_or_bad_request("Request content hash not signed, aborting.")?;
if expected_sha256 != sha256sum(body) {
return Err(Error::BadRequest(format!(
"Request content hash does not match signed hash"
)));
}
Ok(())
}

View file

@ -1,9 +1,9 @@
[package]
name = "garage"
version = "0.1.1"
version = "0.2.1"
authors = ["Alex Auvolat <alex@adnab.me>"]
edition = "2018"
license = "GPL-3.0"
license = "AGPL-3.0"
description = "Garage, an S3-compatible distributed object store for self-hosted deployments"
repository = "https://git.deuxfleurs.fr/Deuxfleurs/garage"
@ -14,26 +14,27 @@ path = "main.rs"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
garage_util = { version = "0.1", path = "../util" }
garage_rpc = { version = "0.1", path = "../rpc" }
garage_table = { version = "0.1.1", path = "../table" }
garage_model = { version = "0.1.1", path = "../model" }
garage_api = { version = "0.1.1", path = "../api" }
garage_api = { version = "0.2.1", path = "../api" }
garage_model = { version = "0.2.1", path = "../model" }
garage_rpc = { version = "0.2.1", path = "../rpc" }
garage_table = { version = "0.2.1", path = "../table" }
garage_util = { version = "0.2.1", path = "../util" }
garage_web = { version = "0.2.1", path = "../web" }
bytes = "0.4"
rand = "0.7"
hex = "0.3"
sha2 = "0.8"
bytes = "1.0"
git-version = "0.3.4"
hex = "0.4"
log = "0.4"
pretty_env_logger = "0.4"
rand = "0.8"
sled = "0.31"
sled = "0.34"
rmp-serde = "0.15"
serde = { version = "1.0", default-features = false, features = ["derive", "rc"] }
structopt = { version = "0.3", default-features = false }
toml = "0.5"
rmp-serde = "0.14.3"
serde = { version = "1.0", default-features = false, features = ["derive", "rc"] }
futures = "0.3"
futures-util = "0.3"
tokio = { version = "0.2", default-features = false, features = ["rt-core", "rt-threaded", "io-driver", "net", "tcp", "time", "macros", "sync", "signal", "fs"] }
tokio = { version = "1.0", default-features = false, features = ["rt", "rt-multi-thread", "io-util", "net", "time", "macros", "sync", "signal", "fs"] }

View file

@ -1,3 +1,5 @@
use std::collections::HashMap;
use std::fmt::Write;
use std::sync::Arc;
use serde::{Deserialize, Serialize};
@ -5,6 +7,7 @@ use serde::{Deserialize, Serialize};
use garage_util::error::Error;
use garage_table::crdt::CRDT;
use garage_table::replication::*;
use garage_table::*;
use garage_rpc::rpc_client::*;
@ -14,6 +17,7 @@ use garage_model::bucket_table::*;
use garage_model::garage::Garage;
use garage_model::key_table::*;
use crate::cli::*;
use crate::repair::Repair;
use crate::*;
@ -25,6 +29,7 @@ pub enum AdminRPC {
BucketOperation(BucketOperation),
KeyOperation(KeyOperation),
LaunchRepair(RepairOpt),
Stats(StatsOpt),
// Replies
Ok(String),
@ -55,6 +60,7 @@ impl AdminRpcHandler {
AdminRPC::BucketOperation(bo) => self2.handle_bucket_cmd(bo).await,
AdminRPC::KeyOperation(ko) => self2.handle_key_cmd(ko).await,
AdminRPC::LaunchRepair(opt) => self2.handle_launch_repair(opt).await,
AdminRPC::Stats(opt) => self2.handle_stats(opt).await,
_ => Err(Error::BadRPC(format!("Invalid RPC"))),
}
}
@ -89,7 +95,7 @@ impl AdminRpcHandler {
}
bucket
.state
.update(BucketState::Present(crdt::LWWMap::new()));
.update(BucketState::Present(BucketParams::new()));
bucket
}
None => Bucket::new(query.name.clone()),
@ -116,7 +122,7 @@ impl AdminRpcHandler {
for (key_id, _, _) in bucket.authorized_keys() {
if let Some(key) = self.garage.key_table.get(&EmptyKey, key_id).await? {
if !key.deleted.get() {
self.update_key_bucket(key, &bucket.name, false, false)
self.update_key_bucket(&key, &bucket.name, false, false)
.await?;
}
} else {
@ -128,33 +134,56 @@ impl AdminRpcHandler {
Ok(AdminRPC::Ok(format!("Bucket {} was deleted.", query.name)))
}
BucketOperation::Allow(query) => {
let key = self.get_existing_key(&query.key_id).await?;
let key = self.get_existing_key(&query.key_pattern).await?;
let bucket = self.get_existing_bucket(&query.bucket).await?;
let allow_read = query.read || key.allow_read(&query.bucket);
let allow_write = query.write || key.allow_write(&query.bucket);
self.update_key_bucket(key, &query.bucket, allow_read, allow_write)
self.update_key_bucket(&key, &query.bucket, allow_read, allow_write)
.await?;
self.update_bucket_key(bucket, &query.key_id, allow_read, allow_write)
self.update_bucket_key(bucket, &key.key_id, allow_read, allow_write)
.await?;
Ok(AdminRPC::Ok(format!(
"New permissions for {} on {}: read {}, write {}.",
&query.key_id, &query.bucket, allow_read, allow_write
&key.key_id, &query.bucket, allow_read, allow_write
)))
}
BucketOperation::Deny(query) => {
let key = self.get_existing_key(&query.key_id).await?;
let key = self.get_existing_key(&query.key_pattern).await?;
let bucket = self.get_existing_bucket(&query.bucket).await?;
let allow_read = !query.read && key.allow_read(&query.bucket);
let allow_write = !query.write && key.allow_write(&query.bucket);
self.update_key_bucket(key, &query.bucket, allow_read, allow_write)
self.update_key_bucket(&key, &query.bucket, allow_read, allow_write)
.await?;
self.update_bucket_key(bucket, &query.key_id, allow_read, allow_write)
self.update_bucket_key(bucket, &key.key_id, allow_read, allow_write)
.await?;
Ok(AdminRPC::Ok(format!(
"New permissions for {} on {}: read {}, write {}.",
&query.key_id, &query.bucket, allow_read, allow_write
&key.key_id, &query.bucket, allow_read, allow_write
)))
}
BucketOperation::Website(query) => {
let mut bucket = self.get_existing_bucket(&query.bucket).await?;
if !(query.allow ^ query.deny) {
return Err(Error::Message(format!(
"You must specify exactly one flag, either --allow or --deny"
)));
}
if let BucketState::Present(state) = bucket.state.get_mut() {
state.website.update(query.allow);
self.garage.bucket_table.insert(&bucket).await?;
let msg = if query.allow {
format!("Website access allowed for {}", &query.bucket)
} else {
format!("Website access denied for {}", &query.bucket)
};
Ok(AdminRPC::Ok(msg.to_string()))
} else {
unreachable!();
}
}
}
}
@ -164,7 +193,12 @@ impl AdminRpcHandler {
let key_ids = self
.garage
.key_table
.get_range(&EmptyKey, None, Some(DeletedFilter::NotDeleted), 10000)
.get_range(
&EmptyKey,
None,
Some(KeyFilter::Deleted(DeletedFilter::NotDeleted)),
10000,
)
.await?
.iter()
.map(|k| (k.key_id.to_string(), k.name.get().clone()))
@ -172,7 +206,7 @@ impl AdminRpcHandler {
Ok(AdminRPC::KeyList(key_ids))
}
KeyOperation::Info(query) => {
let key = self.get_existing_key(&query.key_id).await?;
let key = self.get_existing_key(&query.key_pattern).await?;
Ok(AdminRPC::KeyInfo(key))
}
KeyOperation::New(query) => {
@ -181,13 +215,13 @@ impl AdminRpcHandler {
Ok(AdminRPC::KeyInfo(key))
}
KeyOperation::Rename(query) => {
let mut key = self.get_existing_key(&query.key_id).await?;
let mut key = self.get_existing_key(&query.key_pattern).await?;
key.name.update(query.new_name);
self.garage.key_table.insert(&key).await?;
Ok(AdminRPC::KeyInfo(key))
}
KeyOperation::Delete(query) => {
let key = self.get_existing_key(&query.key_id).await?;
let key = self.get_existing_key(&query.key_pattern).await?;
if !query.yes {
return Err(Error::BadRPC(format!(
"Add --yes flag to really perform this operation"
@ -204,13 +238,22 @@ impl AdminRpcHandler {
return Err(Error::Message(format!("Bucket not found: {}", ab_name)));
}
}
let del_key = Key::delete(key.key_id);
let del_key = Key::delete(key.key_id.to_string());
self.garage.key_table.insert(&del_key).await?;
Ok(AdminRPC::Ok(format!(
"Key {} was deleted successfully.",
query.key_id
key.key_id
)))
}
KeyOperation::Import(query) => {
let prev_key = self.garage.key_table.get(&EmptyKey, &query.key_id).await?;
if prev_key.is_some() {
return Err(Error::Message(format!("Key {} already exists in data store. Even if it is deleted, we can't let you create a new key with the same ID. Sorry.", query.key_id)));
}
let imported_key = Key::import(&query.key_id, &query.secret_key, &query.name);
self.garage.key_table.insert(&imported_key).await?;
Ok(AdminRPC::KeyInfo(imported_key))
}
}
}
@ -227,16 +270,31 @@ impl AdminRpcHandler {
))))
}
async fn get_existing_key(&self, id: &String) -> Result<Key, Error> {
self.garage
async fn get_existing_key(&self, pattern: &str) -> Result<Key, Error> {
let candidates = self
.garage
.key_table
.get(&EmptyKey, id)
.get_range(
&EmptyKey,
None,
Some(KeyFilter::Matches(pattern.to_string())),
10,
)
.await?
.into_iter()
.filter(|k| !k.deleted.get())
.map(Ok)
.unwrap_or(Err(Error::BadRPC(format!("Key {} does not exist", id))))
.collect::<Vec<_>>();
if candidates.len() != 1 {
Err(Error::Message(format!(
"{} matching keys",
candidates.len()
)))
} else {
Ok(candidates.into_iter().next().unwrap())
}
}
/// Update **bucket table** to inform of the new linked key
async fn update_bucket_key(
&self,
mut bucket: Bucket,
@ -244,7 +302,8 @@ impl AdminRpcHandler {
allow_read: bool,
allow_write: bool,
) -> Result<(), Error> {
if let BucketState::Present(ak) = bucket.state.get_mut() {
if let BucketState::Present(params) = bucket.state.get_mut() {
let ak = &mut params.authorized_keys;
let old_ak = ak.take_and_clear();
ak.merge(&old_ak.update_mutator(
key_id.to_string(),
@ -262,13 +321,15 @@ impl AdminRpcHandler {
Ok(())
}
/// Update **key table** to inform of the new linked bucket
async fn update_key_bucket(
&self,
mut key: Key,
key: &Key,
bucket: &String,
allow_read: bool,
allow_write: bool,
) -> Result<(), Error> {
let mut key = key.clone();
let old_map = key.authorized_buckets.take_and_clear();
key.authorized_buckets.merge(&old_map.update_mutator(
bucket.clone(),
@ -324,12 +385,118 @@ impl AdminRpcHandler {
.background
.spawn_worker("Repair worker".into(), move |must_exit| async move {
repair.repair_worker(opt, must_exit).await
})
.await;
});
Ok(AdminRPC::Ok(format!(
"Repair launched on {:?}",
self.garage.system.id
)))
}
}
async fn handle_stats(&self, opt: StatsOpt) -> Result<AdminRPC, Error> {
if opt.all_nodes {
let mut ret = String::new();
let ring = self.garage.system.ring.borrow().clone();
for node in ring.config.members.keys() {
let mut opt = opt.clone();
opt.all_nodes = false;
writeln!(&mut ret, "\n======================").unwrap();
writeln!(&mut ret, "Stats for node {:?}:", node).unwrap();
match self
.rpc_client
.call(*node, AdminRPC::Stats(opt), ADMIN_RPC_TIMEOUT)
.await
{
Ok(AdminRPC::Ok(s)) => writeln!(&mut ret, "{}", s).unwrap(),
Ok(x) => writeln!(&mut ret, "Bad answer: {:?}", x).unwrap(),
Err(e) => writeln!(&mut ret, "Error: {}", e).unwrap(),
}
}
Ok(AdminRPC::Ok(ret))
} else {
Ok(AdminRPC::Ok(self.gather_stats_local(opt)?))
}
}
fn gather_stats_local(&self, opt: StatsOpt) -> Result<String, Error> {
let mut ret = String::new();
writeln!(
&mut ret,
"\nGarage version: {}",
git_version::git_version!()
)
.unwrap();
// Gather ring statistics
let ring = self.garage.system.ring.borrow().clone();
let mut ring_nodes = HashMap::new();
for r in ring.ring.iter() {
for n in r.nodes.iter() {
if !ring_nodes.contains_key(n) {
ring_nodes.insert(*n, 0usize);
}
*ring_nodes.get_mut(n).unwrap() += 1;
}
}
writeln!(&mut ret, "\nRing nodes & partition count:").unwrap();
for (n, c) in ring_nodes.iter() {
writeln!(&mut ret, " {:?} {}", n, c).unwrap();
}
self.gather_table_stats(&mut ret, &self.garage.bucket_table, &opt)?;
self.gather_table_stats(&mut ret, &self.garage.key_table, &opt)?;
self.gather_table_stats(&mut ret, &self.garage.object_table, &opt)?;
self.gather_table_stats(&mut ret, &self.garage.version_table, &opt)?;
self.gather_table_stats(&mut ret, &self.garage.block_ref_table, &opt)?;
writeln!(&mut ret, "\nBlock manager stats:").unwrap();
if opt.detailed {
writeln!(
&mut ret,
" number of blocks: {}",
self.garage.block_manager.rc_len()
)
.unwrap();
}
writeln!(
&mut ret,
" resync queue length: {}",
self.garage.block_manager.resync_queue_len()
)
.unwrap();
Ok(ret)
}
fn gather_table_stats<F, R>(
&self,
to: &mut String,
t: &Arc<Table<F, R>>,
opt: &StatsOpt,
) -> Result<(), Error>
where
F: TableSchema + 'static,
R: TableReplication + 'static,
{
writeln!(to, "\nTable stats for {}", t.data.name).unwrap();
if opt.detailed {
writeln!(to, " number of items: {}", t.data.store.len()).unwrap();
writeln!(
to,
" Merkle tree size: {}",
t.merkle_updater.merkle_tree_len()
)
.unwrap();
}
writeln!(
to,
" Merkle updater todo queue length: {}",
t.merkle_updater.todo_len()
)
.unwrap();
writeln!(to, " GC todo queue length: {}", t.data.gc_todo_len()).unwrap();
Ok(())
}
}

566
src/garage/cli.rs Normal file
View file

@ -0,0 +1,566 @@
use std::collections::HashSet;
use std::net::SocketAddr;
use std::path::PathBuf;
use serde::{Deserialize, Serialize};
use structopt::StructOpt;
use garage_util::data::UUID;
use garage_util::error::Error;
use garage_util::time::*;
use garage_rpc::membership::*;
use garage_rpc::ring::*;
use garage_rpc::rpc_client::*;
use garage_model::bucket_table::*;
use garage_model::key_table::*;
use crate::admin_rpc::*;
#[derive(StructOpt, Debug)]
pub enum Command {
/// Run Garage server
#[structopt(name = "server")]
Server(ServerOpt),
/// Get network status
#[structopt(name = "status")]
Status,
/// Garage node operations
#[structopt(name = "node")]
Node(NodeOperation),
/// Bucket operations
#[structopt(name = "bucket")]
Bucket(BucketOperation),
/// Key operations
#[structopt(name = "key")]
Key(KeyOperation),
/// Start repair of node data
#[structopt(name = "repair")]
Repair(RepairOpt),
/// Gather node statistics
#[structopt(name = "stats")]
Stats(StatsOpt),
}
#[derive(StructOpt, Debug)]
pub struct ServerOpt {
/// Configuration file
#[structopt(short = "c", long = "config", default_value = "./config.toml")]
pub config_file: PathBuf,
}
#[derive(StructOpt, Debug)]
pub enum NodeOperation {
/// Configure Garage node
#[structopt(name = "configure")]
Configure(ConfigureNodeOpt),
/// Remove Garage node from cluster
#[structopt(name = "remove")]
Remove(RemoveNodeOpt),
}
#[derive(StructOpt, Debug)]
pub struct ConfigureNodeOpt {
/// Node to configure (prefix of hexadecimal node id)
node_id: String,
/// Location (datacenter) of the node
#[structopt(short = "d", long = "datacenter")]
datacenter: Option<String>,
/// Capacity (in relative terms, use 1 to represent your smallest server)
#[structopt(short = "c", long = "capacity")]
capacity: Option<u32>,
/// Optionnal node tag
#[structopt(short = "t", long = "tag")]
tag: Option<String>,
/// Replaced node(s): list of node IDs that will be removed from the current cluster
#[structopt(long = "replace")]
replace: Vec<String>,
}
#[derive(StructOpt, Debug)]
pub struct RemoveNodeOpt {
/// Node to configure (prefix of hexadecimal node id)
node_id: String,
/// If this flag is not given, the node won't be removed
#[structopt(long = "yes")]
yes: bool,
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub enum BucketOperation {
/// List buckets
#[structopt(name = "list")]
List,
/// Get bucket info
#[structopt(name = "info")]
Info(BucketOpt),
/// Create bucket
#[structopt(name = "create")]
Create(BucketOpt),
/// Delete bucket
#[structopt(name = "delete")]
Delete(DeleteBucketOpt),
/// Allow key to read or write to bucket
#[structopt(name = "allow")]
Allow(PermBucketOpt),
/// Allow key to read or write to bucket
#[structopt(name = "deny")]
Deny(PermBucketOpt),
/// Expose as website or not
#[structopt(name = "website")]
Website(WebsiteOpt),
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub struct WebsiteOpt {
/// Create
#[structopt(long = "allow")]
pub allow: bool,
/// Delete
#[structopt(long = "deny")]
pub deny: bool,
/// Bucket name
pub bucket: String,
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub struct BucketOpt {
/// Bucket name
pub name: String,
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub struct DeleteBucketOpt {
/// Bucket name
pub name: String,
/// If this flag is not given, the bucket won't be deleted
#[structopt(long = "yes")]
pub yes: bool,
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub struct PermBucketOpt {
/// Access key name or ID
#[structopt(long = "key")]
pub key_pattern: String,
/// Allow/deny read operations
#[structopt(long = "read")]
pub read: bool,
/// Allow/deny write operations
#[structopt(long = "write")]
pub write: bool,
/// Bucket name
pub bucket: String,
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub enum KeyOperation {
/// List keys
#[structopt(name = "list")]
List,
/// Get key info
#[structopt(name = "info")]
Info(KeyOpt),
/// Create new key
#[structopt(name = "new")]
New(KeyNewOpt),
/// Rename key
#[structopt(name = "rename")]
Rename(KeyRenameOpt),
/// Delete key
#[structopt(name = "delete")]
Delete(KeyDeleteOpt),
/// Import key
#[structopt(name = "import")]
Import(KeyImportOpt),
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub struct KeyOpt {
/// ID or name of the key
pub key_pattern: String,
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub struct KeyNewOpt {
/// Name of the key
#[structopt(long = "name", default_value = "Unnamed key")]
pub name: String,
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub struct KeyRenameOpt {
/// ID or name of the key
pub key_pattern: String,
/// New name of the key
pub new_name: String,
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub struct KeyDeleteOpt {
/// ID or name of the key
pub key_pattern: String,
/// Confirm deletion
#[structopt(long = "yes")]
pub yes: bool,
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub struct KeyImportOpt {
/// Access key ID
pub key_id: String,
/// Secret access key
pub secret_key: String,
/// Key name
#[structopt(short = "n", default_value = "Imported key")]
pub name: String,
}
#[derive(Serialize, Deserialize, StructOpt, Debug, Clone)]
pub struct RepairOpt {
/// Launch repair operation on all nodes
#[structopt(short = "a", long = "all-nodes")]
pub all_nodes: bool,
/// Confirm the launch of the repair operation
#[structopt(long = "yes")]
pub yes: bool,
#[structopt(subcommand)]
pub what: Option<RepairWhat>,
}
#[derive(Serialize, Deserialize, StructOpt, Debug, Eq, PartialEq, Clone)]
pub enum RepairWhat {
/// Only do a full sync of metadata tables
#[structopt(name = "tables")]
Tables,
/// Only repair (resync/rebalance) the set of stored blocks
#[structopt(name = "blocks")]
Blocks,
/// Only redo the propagation of object deletions to the version table (slow)
#[structopt(name = "versions")]
Versions,
/// Only redo the propagation of version deletions to the block ref table (extremely slow)
#[structopt(name = "block_refs")]
BlockRefs,
}
#[derive(Serialize, Deserialize, StructOpt, Debug, Clone)]
pub struct StatsOpt {
/// Gather statistics from all nodes
#[structopt(short = "a", long = "all-nodes")]
pub all_nodes: bool,
/// Gather detailed statistics (this can be long)
#[structopt(short = "d", long = "detailed")]
pub detailed: bool,
}
pub async fn cli_cmd(
cmd: Command,
membership_rpc_cli: RpcAddrClient<Message>,
admin_rpc_cli: RpcAddrClient<AdminRPC>,
rpc_host: SocketAddr,
) -> Result<(), Error> {
match cmd {
Command::Status => cmd_status(membership_rpc_cli, rpc_host).await,
Command::Node(NodeOperation::Configure(configure_opt)) => {
cmd_configure(membership_rpc_cli, rpc_host, configure_opt).await
}
Command::Node(NodeOperation::Remove(remove_opt)) => {
cmd_remove(membership_rpc_cli, rpc_host, remove_opt).await
}
Command::Bucket(bo) => {
cmd_admin(admin_rpc_cli, rpc_host, AdminRPC::BucketOperation(bo)).await
}
Command::Key(ko) => cmd_admin(admin_rpc_cli, rpc_host, AdminRPC::KeyOperation(ko)).await,
Command::Repair(ro) => cmd_admin(admin_rpc_cli, rpc_host, AdminRPC::LaunchRepair(ro)).await,
Command::Stats(so) => cmd_admin(admin_rpc_cli, rpc_host, AdminRPC::Stats(so)).await,
_ => unreachable!(),
}
}
pub async fn cmd_status(
rpc_cli: RpcAddrClient<Message>,
rpc_host: SocketAddr,
) -> Result<(), Error> {
let status = match rpc_cli
.call(&rpc_host, &Message::PullStatus, ADMIN_RPC_TIMEOUT)
.await??
{
Message::AdvertiseNodesUp(nodes) => nodes,
resp => return Err(Error::Message(format!("Invalid RPC response: {:?}", resp))),
};
let config = match rpc_cli
.call(&rpc_host, &Message::PullConfig, ADMIN_RPC_TIMEOUT)
.await??
{
Message::AdvertiseConfig(cfg) => cfg,
resp => return Err(Error::Message(format!("Invalid RPC response: {:?}", resp))),
};
println!("Healthy nodes:");
for adv in status.iter().filter(|x| x.is_up) {
if let Some(cfg) = config.members.get(&adv.id) {
println!(
"{:?}\t{}\t{}\t[{}]\t{}\t{}",
adv.id, adv.state_info.hostname, adv.addr, cfg.tag, cfg.datacenter, cfg.capacity
);
} else {
println!(
"{:?}\t{}\t{}\tUNCONFIGURED/REMOVED",
adv.id, adv.state_info.hostname, adv.addr
);
}
}
let status_keys = status.iter().map(|x| x.id).collect::<HashSet<_>>();
let failure_case_1 = status.iter().any(|x| !x.is_up);
let failure_case_2 = config
.members
.iter()
.any(|(id, _)| !status_keys.contains(id));
if failure_case_1 || failure_case_2 {
println!("\nFailed nodes:");
for adv in status.iter().filter(|x| !x.is_up) {
if let Some(cfg) = config.members.get(&adv.id) {
println!(
"{:?}\t{}\t{}\t[{}]\t{}\t{}\tlast seen: {}s ago",
adv.id,
adv.state_info.hostname,
adv.addr,
cfg.tag,
cfg.datacenter,
cfg.capacity,
(now_msec() - adv.last_seen) / 1000,
);
}
}
for (id, cfg) in config.members.iter() {
if !status.iter().any(|x| x.id == *id) {
println!(
"{:?}\t{}\t{}\t{}\tnever seen",
id, cfg.tag, cfg.datacenter, cfg.capacity
);
}
}
}
Ok(())
}
pub fn find_matching_node(
cand: impl std::iter::Iterator<Item = UUID>,
pattern: &str,
) -> Result<UUID, Error> {
let mut candidates = vec![];
for c in cand {
if hex::encode(&c).starts_with(&pattern) {
candidates.push(c);
}
}
if candidates.len() != 1 {
Err(Error::Message(format!(
"{} nodes match '{}'",
candidates.len(),
pattern,
)))
} else {
Ok(candidates[0])
}
}
pub async fn cmd_configure(
rpc_cli: RpcAddrClient<Message>,
rpc_host: SocketAddr,
args: ConfigureNodeOpt,
) -> Result<(), Error> {
let status = match rpc_cli
.call(&rpc_host, &Message::PullStatus, ADMIN_RPC_TIMEOUT)
.await??
{
Message::AdvertiseNodesUp(nodes) => nodes,
resp => return Err(Error::Message(format!("Invalid RPC response: {:?}", resp))),
};
let added_node = find_matching_node(status.iter().map(|x| x.id), &args.node_id)?;
let mut config = match rpc_cli
.call(&rpc_host, &Message::PullConfig, ADMIN_RPC_TIMEOUT)
.await??
{
Message::AdvertiseConfig(cfg) => cfg,
resp => return Err(Error::Message(format!("Invalid RPC response: {:?}", resp))),
};
for replaced in args.replace.iter() {
let replaced_node = find_matching_node(config.members.keys().cloned(), replaced)?;
if config.members.remove(&replaced_node).is_none() {
return Err(Error::Message(format!(
"Cannot replace node {:?} as it is not in current configuration",
replaced_node
)));
}
}
let new_entry = match config.members.get(&added_node) {
None => NetworkConfigEntry {
datacenter: args
.datacenter
.expect("Please specifiy a datacenter with the -d flag"),
capacity: args
.capacity
.expect("Please specifiy a capacity with the -c flag"),
tag: args.tag.unwrap_or("".to_string()),
},
Some(old) => NetworkConfigEntry {
datacenter: args.datacenter.unwrap_or(old.datacenter.to_string()),
capacity: args.capacity.unwrap_or(old.capacity),
tag: args.tag.unwrap_or(old.tag.to_string()),
},
};
config.members.insert(added_node, new_entry);
config.version += 1;
rpc_cli
.call(
&rpc_host,
&Message::AdvertiseConfig(config),
ADMIN_RPC_TIMEOUT,
)
.await??;
Ok(())
}
pub async fn cmd_remove(
rpc_cli: RpcAddrClient<Message>,
rpc_host: SocketAddr,
args: RemoveNodeOpt,
) -> Result<(), Error> {
let mut config = match rpc_cli
.call(&rpc_host, &Message::PullConfig, ADMIN_RPC_TIMEOUT)
.await??
{
Message::AdvertiseConfig(cfg) => cfg,
resp => return Err(Error::Message(format!("Invalid RPC response: {:?}", resp))),
};
let deleted_node = find_matching_node(config.members.keys().cloned(), &args.node_id)?;
if !args.yes {
return Err(Error::Message(format!(
"Add the flag --yes to really remove {:?} from the cluster",
deleted_node
)));
}
config.members.remove(&deleted_node);
config.version += 1;
rpc_cli
.call(
&rpc_host,
&Message::AdvertiseConfig(config),
ADMIN_RPC_TIMEOUT,
)
.await??;
Ok(())
}
pub async fn cmd_admin(
rpc_cli: RpcAddrClient<AdminRPC>,
rpc_host: SocketAddr,
args: AdminRPC,
) -> Result<(), Error> {
match rpc_cli.call(&rpc_host, args, ADMIN_RPC_TIMEOUT).await?? {
AdminRPC::Ok(msg) => {
println!("{}", msg);
}
AdminRPC::BucketList(bl) => {
println!("List of buckets:");
for bucket in bl {
println!("{}", bucket);
}
}
AdminRPC::BucketInfo(bucket) => {
print_bucket_info(&bucket);
}
AdminRPC::KeyList(kl) => {
println!("List of keys:");
for key in kl {
println!("{}\t{}", key.0, key.1);
}
}
AdminRPC::KeyInfo(key) => {
print_key_info(&key);
}
r => {
error!("Unexpected response: {:?}", r);
}
}
Ok(())
}
fn print_key_info(key: &Key) {
println!("Key name: {}", key.name.get());
println!("Key ID: {}", key.key_id);
println!("Secret key: {}", key.secret_key);
if key.deleted.get() {
println!("Key is deleted.");
} else {
println!("Authorized buckets:");
for (b, _, perm) in key.authorized_buckets.items().iter() {
println!("- {} R:{} W:{}", b, perm.allow_read, perm.allow_write);
}
}
}
fn print_bucket_info(bucket: &Bucket) {
println!("Bucket name: {}", bucket.name);
match bucket.state.get() {
BucketState::Deleted => println!("Bucket is deleted."),
BucketState::Present(p) => {
println!("Authorized keys:");
for (k, _, perm) in p.authorized_keys.items().iter() {
println!("- {} R:{} W:{}", k, perm.allow_read, perm.allow_write);
}
println!("Website access: {}", p.website.get());
}
};
}

View file

@ -4,270 +4,67 @@
extern crate log;
mod admin_rpc;
mod cli;
mod repair;
mod server;
use std::collections::HashSet;
use std::net::SocketAddr;
use std::path::PathBuf;
use std::sync::Arc;
use std::time::Duration;
use serde::{Deserialize, Serialize};
use structopt::StructOpt;
use garage_util::config::TlsConfig;
use garage_util::data::*;
use garage_util::error::Error;
use garage_rpc::membership::*;
use garage_rpc::rpc_client::*;
use admin_rpc::*;
use cli::*;
#[derive(StructOpt, Debug)]
#[structopt(name = "garage")]
pub struct Opt {
/// RPC connect to this host to execute client operations
#[structopt(short = "h", long = "rpc-host", default_value = "127.0.0.1:3901")]
rpc_host: SocketAddr,
pub rpc_host: SocketAddr,
#[structopt(long = "ca-cert")]
ca_cert: Option<String>,
pub ca_cert: Option<String>,
#[structopt(long = "client-cert")]
client_cert: Option<String>,
pub client_cert: Option<String>,
#[structopt(long = "client-key")]
client_key: Option<String>,
pub client_key: Option<String>,
#[structopt(subcommand)]
cmd: Command,
}
#[derive(StructOpt, Debug)]
pub enum Command {
/// Run Garage server
#[structopt(name = "server")]
Server(ServerOpt),
/// Get network status
#[structopt(name = "status")]
Status,
/// Garage node operations
#[structopt(name = "node")]
Node(NodeOperation),
/// Bucket operations
#[structopt(name = "bucket")]
Bucket(BucketOperation),
/// Key operations
#[structopt(name = "key")]
Key(KeyOperation),
/// Start repair of node data
#[structopt(name = "repair")]
Repair(RepairOpt),
}
#[derive(StructOpt, Debug)]
pub struct ServerOpt {
/// Configuration file
#[structopt(short = "c", long = "config", default_value = "./config.toml")]
config_file: PathBuf,
}
#[derive(StructOpt, Debug)]
pub enum NodeOperation {
/// Configure Garage node
#[structopt(name = "configure")]
Configure(ConfigureNodeOpt),
/// Remove Garage node from cluster
#[structopt(name = "remove")]
Remove(RemoveNodeOpt),
}
#[derive(StructOpt, Debug)]
pub struct ConfigureNodeOpt {
/// Node to configure (prefix of hexadecimal node id)
node_id: String,
/// Location (datacenter) of the node
#[structopt(short = "d", long = "datacenter")]
datacenter: Option<String>,
/// Number of tokens
#[structopt(short = "n", long = "n-tokens")]
n_tokens: Option<u32>,
/// Optionnal node tag
#[structopt(short = "t", long = "tag")]
tag: Option<String>,
}
#[derive(StructOpt, Debug)]
pub struct RemoveNodeOpt {
/// Node to configure (prefix of hexadecimal node id)
node_id: String,
/// If this flag is not given, the node won't be removed
#[structopt(long = "yes")]
yes: bool,
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub enum BucketOperation {
/// List buckets
#[structopt(name = "list")]
List,
/// Get bucket info
#[structopt(name = "info")]
Info(BucketOpt),
/// Create bucket
#[structopt(name = "create")]
Create(BucketOpt),
/// Delete bucket
#[structopt(name = "delete")]
Delete(DeleteBucketOpt),
/// Allow key to read or write to bucket
#[structopt(name = "allow")]
Allow(PermBucketOpt),
/// Allow key to read or write to bucket
#[structopt(name = "deny")]
Deny(PermBucketOpt),
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub struct BucketOpt {
/// Bucket name
pub name: String,
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub struct DeleteBucketOpt {
/// Bucket name
pub name: String,
/// If this flag is not given, the bucket won't be deleted
#[structopt(long = "yes")]
pub yes: bool,
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub struct PermBucketOpt {
/// Access key ID
#[structopt(long = "key")]
pub key_id: String,
/// Allow/deny read operations
#[structopt(long = "read")]
pub read: bool,
/// Allow/deny write operations
#[structopt(long = "write")]
pub write: bool,
/// Bucket name
pub bucket: String,
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub enum KeyOperation {
/// List keys
#[structopt(name = "list")]
List,
/// Get key info
#[structopt(name = "info")]
Info(KeyOpt),
/// Create new key
#[structopt(name = "new")]
New(KeyNewOpt),
/// Rename key
#[structopt(name = "rename")]
Rename(KeyRenameOpt),
/// Delete key
#[structopt(name = "delete")]
Delete(KeyDeleteOpt),
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub struct KeyOpt {
/// ID of the key
key_id: String,
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub struct KeyNewOpt {
/// Name of the key
#[structopt(long = "name", default_value = "Unnamed key")]
name: String,
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub struct KeyRenameOpt {
/// ID of the key
key_id: String,
/// New name of the key
new_name: String,
}
#[derive(Serialize, Deserialize, StructOpt, Debug)]
pub struct KeyDeleteOpt {
/// ID of the key
key_id: String,
/// Confirm deletion
#[structopt(long = "yes")]
yes: bool,
}
#[derive(Serialize, Deserialize, StructOpt, Debug, Clone)]
pub struct RepairOpt {
/// Launch repair operation on all nodes
#[structopt(short = "a", long = "all-nodes")]
pub all_nodes: bool,
/// Confirm the launch of the repair operation
#[structopt(long = "yes")]
pub yes: bool,
#[structopt(subcommand)]
pub what: Option<RepairWhat>,
}
#[derive(Serialize, Deserialize, StructOpt, Debug, Eq, PartialEq, Clone)]
pub enum RepairWhat {
/// Only do a full sync of metadata tables
#[structopt(name = "tables")]
Tables,
/// Only repair (resync/rebalance) the set of stored blocks
#[structopt(name = "blocks")]
Blocks,
/// Only redo the propagation of object deletions to the version table (slow)
#[structopt(name = "versions")]
Versions,
/// Only redo the propagation of version deletions to the block ref table (extremely slow)
#[structopt(name = "block_refs")]
BlockRefs,
}
#[tokio::main]
async fn main() {
pretty_env_logger::init();
let opt = Opt::from_args();
let res = if let Command::Server(server_opt) = opt.cmd {
// Abort on panic (same behavior as in Go)
std::panic::set_hook(Box::new(|panic_info| {
error!("{}", panic_info.to_string());
std::process::abort();
}));
server::run_server(server_opt.config_file).await
} else {
cli_command(opt).await
};
if let Err(e) = res {
error!("{}", e);
}
}
async fn cli_command(opt: Opt) -> Result<(), Error> {
let tls_config = match (opt.ca_cert, opt.client_cert, opt.client_key) {
(Some(ca_cert), Some(client_cert), Some(client_key)) => Some(TlsConfig {
ca_cert,
@ -287,245 +84,5 @@ async fn main() {
RpcAddrClient::new(rpc_http_cli.clone(), MEMBERSHIP_RPC_PATH.to_string());
let admin_rpc_cli = RpcAddrClient::new(rpc_http_cli.clone(), ADMIN_RPC_PATH.to_string());
let resp = match opt.cmd {
Command::Server(server_opt) => {
// Abort on panic (same behavior as in Go)
std::panic::set_hook(Box::new(|panic_info| {
error!("{}", panic_info.to_string());
std::process::abort();
}));
server::run_server(server_opt.config_file).await
}
Command::Status => cmd_status(membership_rpc_cli, opt.rpc_host).await,
Command::Node(NodeOperation::Configure(configure_opt)) => {
cmd_configure(membership_rpc_cli, opt.rpc_host, configure_opt).await
}
Command::Node(NodeOperation::Remove(remove_opt)) => {
cmd_remove(membership_rpc_cli, opt.rpc_host, remove_opt).await
}
Command::Bucket(bo) => {
cmd_admin(admin_rpc_cli, opt.rpc_host, AdminRPC::BucketOperation(bo)).await
}
Command::Key(bo) => {
cmd_admin(admin_rpc_cli, opt.rpc_host, AdminRPC::KeyOperation(bo)).await
}
Command::Repair(ro) => {
cmd_admin(admin_rpc_cli, opt.rpc_host, AdminRPC::LaunchRepair(ro)).await
}
};
if let Err(e) = resp {
error!("Error: {}", e);
}
}
async fn cmd_status(rpc_cli: RpcAddrClient<Message>, rpc_host: SocketAddr) -> Result<(), Error> {
let status = match rpc_cli
.call(&rpc_host, &Message::PullStatus, ADMIN_RPC_TIMEOUT)
.await??
{
Message::AdvertiseNodesUp(nodes) => nodes,
resp => return Err(Error::Message(format!("Invalid RPC response: {:?}", resp))),
};
let config = match rpc_cli
.call(&rpc_host, &Message::PullConfig, ADMIN_RPC_TIMEOUT)
.await??
{
Message::AdvertiseConfig(cfg) => cfg,
resp => return Err(Error::Message(format!("Invalid RPC response: {:?}", resp))),
};
println!("Healthy nodes:");
for adv in status.iter().filter(|x| x.is_up) {
if let Some(cfg) = config.members.get(&adv.id) {
println!(
"{:?}\t{}\t{}\t[{}]\t{}\t{}",
adv.id, adv.state_info.hostname, adv.addr, cfg.tag, cfg.datacenter, cfg.n_tokens
);
} else {
println!(
"{:?}\t{}\t{}\tUNCONFIGURED/REMOVED",
adv.id, adv.state_info.hostname, adv.addr
);
}
}
let status_keys = status.iter().map(|x| x.id).collect::<HashSet<_>>();
let failure_case_1 = status.iter().any(|x| !x.is_up);
let failure_case_2 = config
.members
.iter()
.any(|(id, _)| !status_keys.contains(id));
if failure_case_1 || failure_case_2 {
println!("\nFailed nodes:");
for adv in status.iter().filter(|x| !x.is_up) {
if let Some(cfg) = config.members.get(&adv.id) {
println!(
"{:?}\t{}\t{}\t[{}]\t{}\t{}\tlast seen: {}s ago",
adv.id,
adv.state_info.hostname,
adv.addr,
cfg.tag,
cfg.datacenter,
cfg.n_tokens,
(now_msec() - adv.last_seen) / 1000,
);
}
}
for (id, cfg) in config.members.iter() {
if !status.iter().any(|x| x.id == *id) {
println!(
"{:?}\t{}\t{}\t{}\tnever seen",
id, cfg.tag, cfg.datacenter, cfg.n_tokens
);
}
}
}
Ok(())
}
async fn cmd_configure(
rpc_cli: RpcAddrClient<Message>,
rpc_host: SocketAddr,
args: ConfigureNodeOpt,
) -> Result<(), Error> {
let status = match rpc_cli
.call(&rpc_host, &Message::PullStatus, ADMIN_RPC_TIMEOUT)
.await??
{
Message::AdvertiseNodesUp(nodes) => nodes,
resp => return Err(Error::Message(format!("Invalid RPC response: {:?}", resp))),
};
let mut candidates = vec![];
for adv in status.iter() {
if hex::encode(&adv.id).starts_with(&args.node_id) {
candidates.push(adv.id);
}
}
if candidates.len() != 1 {
return Err(Error::Message(format!(
"{} matching nodes",
candidates.len()
)));
}
let mut config = match rpc_cli
.call(&rpc_host, &Message::PullConfig, ADMIN_RPC_TIMEOUT)
.await??
{
Message::AdvertiseConfig(cfg) => cfg,
resp => return Err(Error::Message(format!("Invalid RPC response: {:?}", resp))),
};
let new_entry = match config.members.get(&candidates[0]) {
None => NetworkConfigEntry {
datacenter: args
.datacenter
.expect("Please specifiy a datacenter with the -d flag"),
n_tokens: args
.n_tokens
.expect("Please specifiy a number of tokens with the -n flag"),
tag: args.tag.unwrap_or("".to_string()),
},
Some(old) => NetworkConfigEntry {
datacenter: args.datacenter.unwrap_or(old.datacenter.to_string()),
n_tokens: args.n_tokens.unwrap_or(old.n_tokens),
tag: args.tag.unwrap_or(old.tag.to_string()),
},
};
config.members.insert(candidates[0].clone(), new_entry);
config.version += 1;
rpc_cli
.call(
&rpc_host,
&Message::AdvertiseConfig(config),
ADMIN_RPC_TIMEOUT,
)
.await??;
Ok(())
}
async fn cmd_remove(
rpc_cli: RpcAddrClient<Message>,
rpc_host: SocketAddr,
args: RemoveNodeOpt,
) -> Result<(), Error> {
let mut config = match rpc_cli
.call(&rpc_host, &Message::PullConfig, ADMIN_RPC_TIMEOUT)
.await??
{
Message::AdvertiseConfig(cfg) => cfg,
resp => return Err(Error::Message(format!("Invalid RPC response: {:?}", resp))),
};
let mut candidates = vec![];
for (key, _) in config.members.iter() {
if hex::encode(key).starts_with(&args.node_id) {
candidates.push(*key);
}
}
if candidates.len() != 1 {
return Err(Error::Message(format!(
"{} matching nodes",
candidates.len()
)));
}
if !args.yes {
return Err(Error::Message(format!(
"Add the flag --yes to really remove {:?} from the cluster",
candidates[0]
)));
}
config.members.remove(&candidates[0]);
config.version += 1;
rpc_cli
.call(
&rpc_host,
&Message::AdvertiseConfig(config),
ADMIN_RPC_TIMEOUT,
)
.await??;
Ok(())
}
async fn cmd_admin(
rpc_cli: RpcAddrClient<AdminRPC>,
rpc_host: SocketAddr,
args: AdminRPC,
) -> Result<(), Error> {
match rpc_cli.call(&rpc_host, args, ADMIN_RPC_TIMEOUT).await?? {
AdminRPC::Ok(msg) => {
println!("{}", msg);
}
AdminRPC::BucketList(bl) => {
println!("List of buckets:");
for bucket in bl {
println!("{}", bucket);
}
}
AdminRPC::BucketInfo(bucket) => {
println!("{:?}", bucket);
}
AdminRPC::KeyList(kl) => {
println!("List of keys:");
for key in kl {
println!("{}\t{}", key.0, key.1);
}
}
AdminRPC::KeyInfo(key) => {
println!("{:?}", key);
}
r => {
error!("Unexpected response: {:?}", r);
}
}
Ok(())
cli_cmd(opt.cmd, membership_rpc_cli, admin_rpc_cli, opt.rpc_host).await
}

View file

@ -16,7 +16,13 @@ pub struct Repair {
}
impl Repair {
pub async fn repair_worker(
pub async fn repair_worker(&self, opt: RepairOpt, must_exit: watch::Receiver<bool>) {
if let Err(e) = self.repair_worker_aux(opt, must_exit).await {
warn!("Repair worker failed with error: {}", e);
}
}
async fn repair_worker_aux(
&self,
opt: RepairOpt,
must_exit: watch::Receiver<bool>,
@ -25,41 +31,11 @@ impl Repair {
if todo(RepairWhat::Tables) {
info!("Launching a full sync of tables");
self.garage
.bucket_table
.syncer
.load_full()
.unwrap()
.add_full_scan()
.await;
self.garage
.object_table
.syncer
.load_full()
.unwrap()
.add_full_scan()
.await;
self.garage
.version_table
.syncer
.load_full()
.unwrap()
.add_full_scan()
.await;
self.garage
.block_ref_table
.syncer
.load_full()
.unwrap()
.add_full_scan()
.await;
self.garage
.key_table
.syncer
.load_full()
.unwrap()
.add_full_scan()
.await;
self.garage.bucket_table.syncer.add_full_sync();
self.garage.object_table.syncer.add_full_sync();
self.garage.version_table.syncer.add_full_sync();
self.garage.block_ref_table.syncer.add_full_sync();
self.garage.key_table.syncer.add_full_sync();
}
// TODO: wait for full sync to finish before proceeding to the rest?
@ -93,11 +69,13 @@ impl Repair {
async fn repair_versions(&self, must_exit: &watch::Receiver<bool>) -> Result<(), Error> {
let mut pos = vec![];
while let Some((item_key, item_bytes)) = self.garage.version_table.store.get_gt(&pos)? {
while let Some((item_key, item_bytes)) =
self.garage.version_table.data.store.get_gt(&pos)?
{
pos = item_key.to_vec();
let version = rmp_serde::decode::from_read_ref::<_, Version>(item_bytes.as_ref())?;
if version.deleted {
if version.deleted.get() {
continue;
}
let object = self
@ -110,13 +88,7 @@ impl Repair {
.versions()
.iter()
.any(|x| x.uuid == version.uuid && x.state != ObjectVersionState::Aborted),
None => {
warn!(
"Repair versions: object for version {:?} not found, skipping.",
version
);
continue;
}
None => false,
};
if !version_exists {
info!("Repair versions: marking version as deleted: {:?}", version);
@ -127,7 +99,6 @@ impl Repair {
version.bucket,
version.key,
true,
vec![],
))
.await?;
}
@ -142,11 +113,13 @@ impl Repair {
async fn repair_block_ref(&self, must_exit: &watch::Receiver<bool>) -> Result<(), Error> {
let mut pos = vec![];
while let Some((item_key, item_bytes)) = self.garage.block_ref_table.store.get_gt(&pos)? {
while let Some((item_key, item_bytes)) =
self.garage.block_ref_table.data.store.get_gt(&pos)?
{
pos = item_key.to_vec();
let block_ref = rmp_serde::decode::from_read_ref::<_, BlockRef>(item_bytes.as_ref())?;
if block_ref.deleted {
if block_ref.deleted.get() {
continue;
}
let version = self
@ -154,16 +127,8 @@ impl Repair {
.version_table
.get(&block_ref.version, &EmptyKey)
.await?;
let ref_exists = match version {
Some(v) => !v.deleted,
None => {
warn!(
"Block ref repair: version for block ref {:?} not found, skipping.",
block_ref
);
continue;
}
};
// The version might not exist if it has been GC'ed
let ref_exists = version.map(|v| !v.deleted.get()).unwrap_or(false);
if !ref_exists {
info!(
"Repair block ref: marking block_ref as deleted: {:?}",
@ -174,7 +139,7 @@ impl Repair {
.insert(&BlockRef {
block: block_ref.block,
version: block_ref.version,
deleted: true,
deleted: true.into(),
})
.await?;
}

View file

@ -11,6 +11,7 @@ use garage_util::error::Error;
use garage_api::api_server;
use garage_model::garage::Garage;
use garage_rpc::rpc_server::RpcServer;
use garage_web::web_server;
use crate::admin_rpc::*;
@ -20,13 +21,13 @@ async fn shutdown_signal(send_cancel: watch::Sender<bool>) -> Result<(), Error>
.await
.expect("failed to install CTRL+C signal handler");
info!("Received CTRL+C, shutting down.");
send_cancel.broadcast(true)?;
send_cancel.send(true)?;
Ok(())
}
async fn wait_from(mut chan: watch::Receiver<bool>) -> () {
while let Some(exit_now) = chan.recv().await {
if exit_now {
while !*chan.borrow() {
if chan.changed().await.is_err() {
return;
}
}
@ -39,16 +40,22 @@ pub async fn run_server(config_file: PathBuf) -> Result<(), Error> {
info!("Opening database...");
let mut db_path = config.metadata_dir.clone();
db_path.push("db");
let db = sled::open(db_path).expect("Unable to open DB");
let db = sled::open(&db_path).expect("Unable to open sled DB");
info!("Initialize RPC server...");
let mut rpc_server = RpcServer::new(config.rpc_bind_addr.clone(), config.rpc_tls.clone());
info!("Initializing background runner...");
let (send_cancel, watch_cancel) = watch::channel(false);
let background = BackgroundRunner::new(16, watch_cancel.clone());
let (background, await_background_done) = BackgroundRunner::new(16, watch_cancel.clone());
let garage = Garage::new(config, db, background.clone(), &mut rpc_server).await;
info!("Initializing Garage main data store...");
let garage = Garage::new(config.clone(), db, background, &mut rpc_server);
let bootstrap = garage.system.clone().bootstrap(
config.bootstrap_peers,
config.consul_host,
config.consul_service_name,
);
info!("Crate admin RPC handler...");
AdminRpcHandler::new(garage.clone()).register_handler(&mut rpc_server);
@ -56,20 +63,13 @@ pub async fn run_server(config_file: PathBuf) -> Result<(), Error> {
info!("Initializing RPC and API servers...");
let run_rpc_server = Arc::new(rpc_server).run(wait_from(watch_cancel.clone()));
let api_server = api_server::run_api_server(garage.clone(), wait_from(watch_cancel.clone()));
let web_server = web_server::run_web_server(garage, wait_from(watch_cancel.clone()));
futures::try_join!(
garage
.system
.clone()
.bootstrap(
&garage.config.bootstrap_peers[..],
garage.config.consul_host.clone(),
garage.config.consul_service_name.clone()
)
.map(|rv| {
info!("Bootstrap done");
Ok(rv)
}),
bootstrap.map(|rv| {
info!("Bootstrap done");
Ok(rv)
}),
run_rpc_server.map(|rv| {
info!("RPC server exited");
rv
@ -78,9 +78,13 @@ pub async fn run_server(config_file: PathBuf) -> Result<(), Error> {
info!("API server exited");
rv
}),
background.run().map(|rv| {
info!("Background runner exited");
Ok(rv)
web_server.map(|rv| {
info!("Web server exited");
rv
}),
await_background_done.map(|rv| {
info!("Background runner exited: {:?}", rv);
Ok(())
}),
shutdown_signal(send_cancel),
)?;

View file

@ -1,9 +1,9 @@
[package]
name = "garage_model"
version = "0.1.1"
version = "0.2.1"
authors = ["Alex Auvolat <alex@adnab.me>"]
edition = "2018"
license = "GPL-3.0"
license = "AGPL-3.0"
description = "Core data model for the Garage object store"
repository = "https://git.deuxfleurs.fr/Deuxfleurs/garage"
@ -13,26 +13,21 @@ path = "lib.rs"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
garage_util = { version = "0.1", path = "../util" }
garage_rpc = { version = "0.1", path = "../rpc" }
garage_table = { version = "0.1.1", path = "../table" }
model010 = { package = "garage_model", version = "0.1.0" }
garage_rpc = { version = "0.2.1", path = "../rpc" }
garage_table = { version = "0.2.1", path = "../table" }
garage_util = { version = "0.2.1", path = "../util" }
bytes = "0.4"
rand = "0.7"
hex = "0.3"
sha2 = "0.8"
arc-swap = "0.4"
arc-swap = "1.0"
hex = "0.4"
log = "0.4"
rand = "0.8"
sled = "0.31"
sled = "0.34"
rmp-serde = "0.14.3"
rmp-serde = "0.15"
serde = { version = "1.0", default-features = false, features = ["derive", "rc"] }
serde_bytes = "0.11"
async-trait = "0.1.30"
futures = "0.3"
futures-util = "0.3"
tokio = { version = "0.2", default-features = false, features = ["rt-core", "rt-threaded", "io-driver", "net", "tcp", "time", "macros", "sync", "signal", "fs"] }
tokio = { version = "1.0", default-features = false, features = ["rt", "rt-multi-thread", "io-util", "net", "time", "macros", "sync", "signal", "fs"] }

View file

@ -5,22 +5,20 @@ use std::time::Duration;
use arc_swap::ArcSwapOption;
use futures::future::*;
use futures::select;
use futures::stream::*;
use serde::{Deserialize, Serialize};
use tokio::fs;
use tokio::prelude::*;
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::sync::{watch, Mutex, Notify};
use garage_util::data;
use garage_util::data::*;
use garage_util::error::Error;
use garage_util::time::*;
use garage_rpc::membership::System;
use garage_rpc::rpc_client::*;
use garage_rpc::rpc_server::*;
use garage_table::table_sharded::TableShardedReplication;
use garage_table::{DeletedFilter, TableReplication};
use garage_table::replication::{sharded::TableShardedReplication, TableReplication};
use crate::block_ref_table::*;
@ -28,7 +26,10 @@ use crate::garage::Garage;
pub const INLINE_THRESHOLD: usize = 3072;
pub const BACKGROUND_WORKERS: u64 = 1;
const BLOCK_RW_TIMEOUT: Duration = Duration::from_secs(42);
const BLOCK_GC_TIMEOUT: Duration = Duration::from_secs(60);
const NEED_BLOCK_QUERY_TIMEOUT: Duration = Duration::from_secs(5);
const RESYNC_RETRY_TIMEOUT: Duration = Duration::from_secs(10);
@ -56,14 +57,14 @@ pub struct BlockManager {
pub data_dir: PathBuf,
pub data_dir_lock: Mutex<()>,
pub rc: sled::Tree,
rc: sled::Tree,
pub resync_queue: sled::Tree,
pub resync_notify: Notify,
resync_queue: sled::Tree,
resync_notify: Notify,
pub system: Arc<System>,
system: Arc<System>,
rpc_client: Arc<RpcClient<Message>>,
pub garage: ArcSwapOption<Garage>,
pub(crate) garage: ArcSwapOption<Garage>,
}
impl BlockManager {
@ -77,7 +78,6 @@ impl BlockManager {
let rc = db
.open_tree("block_local_rc")
.expect("Unable to open block_local_rc tree");
rc.set_merge_operator(rc_merge);
let resync_queue = db
.open_tree("block_local_resync_queue")
@ -127,18 +127,16 @@ impl BlockManager {
}
}
pub async fn spawn_background_worker(self: Arc<Self>) {
pub fn spawn_background_worker(self: Arc<Self>) {
// Launch 2 simultaneous workers for background resync loop preprocessing
for i in 0..2usize {
for i in 0..BACKGROUND_WORKERS {
let bm2 = self.clone();
let background = self.system.background.clone();
tokio::spawn(async move {
tokio::time::delay_for(Duration::from_secs(10)).await;
background
.spawn_worker(format!("block resync worker {}", i), move |must_exit| {
bm2.resync_loop(must_exit)
})
.await;
tokio::time::sleep(Duration::from_secs(10 * (i + 1))).await;
background.spawn_worker(format!("block resync worker {}", i), move |must_exit| {
bm2.resync_loop(must_exit)
});
});
}
}
@ -168,7 +166,7 @@ impl BlockManager {
Ok(f) => f,
Err(e) => {
// Not found but maybe we should have had it ??
self.put_to_resync(hash, 0)?;
self.put_to_resync(hash, Duration::from_millis(0))?;
return Err(Into::into(e));
}
};
@ -176,11 +174,16 @@ impl BlockManager {
f.read_to_end(&mut data).await?;
drop(f);
if data::hash(&data[..]) != *hash {
if blake2sum(&data[..]) != *hash {
let _lock = self.data_dir_lock.lock().await;
warn!("Block {:?} is corrupted. Deleting and resyncing.", hash);
fs::remove_file(path).await?;
self.put_to_resync(&hash, 0)?;
warn!(
"Block {:?} is corrupted. Renaming to .corrupted and resyncing.",
hash
);
let mut path2 = path.clone();
path2.set_extension(".corrupted");
fs::rename(path, path2).await?;
self.put_to_resync(&hash, Duration::from_millis(0))?;
return Err(Error::CorruptData(*hash));
}
@ -191,7 +194,7 @@ impl BlockManager {
let needed = self
.rc
.get(hash.as_ref())?
.map(|x| u64_from_bytes(x.as_ref()) > 0)
.map(|x| u64_from_be_bytes(x) > 0)
.unwrap_or(false);
if needed {
let path = self.block_path(hash);
@ -215,84 +218,95 @@ impl BlockManager {
}
pub fn block_incref(&self, hash: &Hash) -> Result<(), Error> {
let old_rc = self.rc.get(&hash)?;
self.rc.merge(&hash, vec![1])?;
if old_rc.map(|x| u64_from_bytes(&x[..]) == 0).unwrap_or(true) {
self.put_to_resync(&hash, BLOCK_RW_TIMEOUT.as_millis() as u64)?;
let old_rc = self.rc.fetch_and_update(&hash, |old| {
let old_v = old.map(u64_from_be_bytes).unwrap_or(0);
Some(u64::to_be_bytes(old_v + 1).to_vec())
})?;
let old_rc = old_rc.map(u64_from_be_bytes).unwrap_or(0);
if old_rc == 0 {
self.put_to_resync(&hash, BLOCK_RW_TIMEOUT)?;
}
Ok(())
}
pub fn block_decref(&self, hash: &Hash) -> Result<(), Error> {
let new_rc = self.rc.merge(&hash, vec![0])?;
if new_rc.map(|x| u64_from_bytes(&x[..]) == 0).unwrap_or(true) {
self.put_to_resync(&hash, 0)?;
let new_rc = self.rc.update_and_fetch(&hash, |old| {
let old_v = old.map(u64_from_be_bytes).unwrap_or(0);
if old_v > 1 {
Some(u64::to_be_bytes(old_v - 1).to_vec())
} else {
None
}
})?;
if new_rc.is_none() {
self.put_to_resync(&hash, BLOCK_GC_TIMEOUT)?;
}
Ok(())
}
fn put_to_resync(&self, hash: &Hash, delay_millis: u64) -> Result<(), Error> {
let when = now_msec() + delay_millis;
fn put_to_resync(&self, hash: &Hash, delay: Duration) -> Result<(), Error> {
let when = now_msec() + delay.as_millis() as u64;
trace!("Put resync_queue: {} {:?}", when, hash);
let mut key = u64::to_be_bytes(when).to_vec();
key.extend(hash.as_ref());
self.resync_queue.insert(key, hash.as_ref())?;
self.resync_notify.notify();
self.resync_notify.notify_waiters();
Ok(())
}
async fn resync_loop(
self: Arc<Self>,
mut must_exit: watch::Receiver<bool>,
) -> Result<(), Error> {
let mut n_failures = 0usize;
async fn resync_loop(self: Arc<Self>, mut must_exit: watch::Receiver<bool>) {
while !*must_exit.borrow() {
if let Some((time_bytes, hash_bytes)) = self.resync_queue.pop_min()? {
let time_msec = u64_from_bytes(&time_bytes[0..8]);
let now = now_msec();
if now >= time_msec {
let mut hash = [0u8; 32];
hash.copy_from_slice(hash_bytes.as_ref());
let hash = Hash::from(hash);
if let Err(e) = self.resync_iter(&hash).await {
warn!("Failed to resync block {:?}, retrying later: {}", hash, e);
self.put_to_resync(&hash, RESYNC_RETRY_TIMEOUT.as_millis() as u64)?;
n_failures += 1;
if n_failures >= 10 {
warn!("Too many resync failures, throttling.");
tokio::time::delay_for(Duration::from_secs(1)).await;
}
} else {
n_failures = 0;
}
} else {
self.resync_queue.insert(time_bytes, hash_bytes)?;
let delay = tokio::time::delay_for(Duration::from_millis(time_msec - now));
select! {
_ = delay.fuse() => (),
_ = self.resync_notify.notified().fuse() => (),
_ = must_exit.recv().fuse() => (),
}
}
} else {
if let Err(e) = self.resync_iter(&mut must_exit).await {
warn!("Error in block resync loop: {}", e);
select! {
_ = self.resync_notify.notified().fuse() => (),
_ = must_exit.recv().fuse() => (),
_ = tokio::time::sleep(Duration::from_secs(1)).fuse() => (),
_ = must_exit.changed().fuse() => (),
}
}
}
}
async fn resync_iter(&self, must_exit: &mut watch::Receiver<bool>) -> Result<(), Error> {
if let Some(first_item) = self.resync_queue.iter().next() {
let (time_bytes, hash_bytes) = first_item?;
let time_msec = u64_from_be_bytes(&time_bytes[0..8]);
let now = now_msec();
if now >= time_msec {
let hash = Hash::try_from(&hash_bytes[..]).unwrap();
let res = self.resync_block(&hash).await;
if let Err(e) = &res {
warn!("Error when resyncing {:?}: {}", hash, e);
self.put_to_resync(&hash, RESYNC_RETRY_TIMEOUT)?;
}
self.resync_queue.remove(&time_bytes)?;
res?; // propagate error to delay main loop
} else {
let delay = tokio::time::sleep(Duration::from_millis(time_msec - now));
select! {
_ = delay.fuse() => (),
_ = self.resync_notify.notified().fuse() => (),
_ = must_exit.changed().fuse() => (),
}
}
} else {
select! {
_ = self.resync_notify.notified().fuse() => (),
_ = must_exit.changed().fuse() => (),
}
}
Ok(())
}
async fn resync_iter(&self, hash: &Hash) -> Result<(), Error> {
async fn resync_block(&self, hash: &Hash) -> Result<(), Error> {
let lock = self.data_dir_lock.lock().await;
let path = self.block_path(hash);
let exists = fs::metadata(&path).await.is_ok();
let needed = self
.rc
.get(hash.as_ref())?
.map(|x| u64_from_bytes(x.as_ref()) > 0)
.map(|x| u64_from_be_bytes(x) > 0)
.unwrap_or(false);
if exists != needed {
@ -303,61 +317,67 @@ impl BlockManager {
}
if exists && !needed {
let garage = self.garage.load_full().unwrap();
let active_refs = garage
.block_ref_table
.get_range(&hash, None, Some(DeletedFilter::NotDeleted), 1)
.await?;
let needed_by_others = !active_refs.is_empty();
if needed_by_others {
let ring = self.system.ring.borrow().clone();
let who = self.replication.replication_nodes(&hash, &ring);
let msg = Arc::new(Message::NeedBlockQuery(*hash));
let who_needs_fut = who.iter().map(|to| {
self.rpc_client
.call_arc(*to, msg.clone(), NEED_BLOCK_QUERY_TIMEOUT)
});
let who_needs = join_all(who_needs_fut).await;
trace!("Offloading block {:?}", hash);
let mut need_nodes = vec![];
for (node, needed) in who.into_iter().zip(who_needs.iter()) {
match needed {
Ok(Message::NeedBlockReply(needed)) => {
if *needed {
need_nodes.push(node);
}
}
Err(e) => {
return Err(Error::Message(format!(
"Should delete block, but unable to confirm that all other nodes that need it have it: {}",
e
)));
}
Ok(_) => {
return Err(Error::Message(format!(
"Unexpected response to NeedBlockQuery RPC"
)));
let mut who = self.replication.write_nodes(&hash);
if who.len() < self.replication.write_quorum() {
return Err(Error::Message(format!("Not trying to offload block because we don't have a quorum of nodes to write to")));
}
who.retain(|id| *id != self.system.id);
let msg = Arc::new(Message::NeedBlockQuery(*hash));
let who_needs_fut = who.iter().map(|to| {
self.rpc_client
.call_arc(*to, msg.clone(), NEED_BLOCK_QUERY_TIMEOUT)
});
let who_needs_resps = join_all(who_needs_fut).await;
let mut need_nodes = vec![];
for (node, needed) in who.iter().zip(who_needs_resps.into_iter()) {
match needed? {
Message::NeedBlockReply(needed) => {
if needed {
need_nodes.push(*node);
}
}
}
if need_nodes.len() > 0 {
let put_block_message = self.read_block(hash).await?;
self.rpc_client
.try_call_many(
&need_nodes[..],
put_block_message,
RequestStrategy::with_quorum(need_nodes.len())
.with_timeout(BLOCK_RW_TIMEOUT),
)
.await?;
_ => {
return Err(Error::Message(format!(
"Unexpected response to NeedBlockQuery RPC"
)));
}
}
}
if need_nodes.len() > 0 {
trace!(
"Block {:?} needed by {} nodes, sending",
hash,
need_nodes.len()
);
let put_block_message = self.read_block(hash).await?;
self.rpc_client
.try_call_many(
&need_nodes[..],
put_block_message,
RequestStrategy::with_quorum(need_nodes.len())
.with_timeout(BLOCK_RW_TIMEOUT),
)
.await?;
}
info!(
"Deleting block {:?}, offload finished ({} / {})",
hash,
need_nodes.len(),
who.len()
);
fs::remove_file(path).await?;
self.resync_queue.remove(&hash)?;
}
if needed && !exists {
drop(lock);
// TODO find a way to not do this if they are sending it to us
// Let's suppose this isn't an issue for now with the BLOCK_RW_TIMEOUT delay
// between the RC being incremented and this part being called.
@ -369,7 +389,7 @@ impl BlockManager {
}
pub async fn rpc_get_block(&self, hash: &Hash) -> Result<Vec<u8>, Error> {
let who = self.replication.read_nodes(&hash, &self.system);
let who = self.replication.read_nodes(&hash);
let resps = self
.rpc_client
.try_call_many(
@ -393,7 +413,7 @@ impl BlockManager {
}
pub async fn rpc_put_block(&self, hash: Hash, data: Vec<u8>) -> Result<(), Error> {
let who = self.replication.write_nodes(&hash, &self.system);
let who = self.replication.write_nodes(&hash);
self.rpc_client
.try_call_many(
&who[..],
@ -410,15 +430,15 @@ impl BlockManager {
let garage = self.garage.load_full().unwrap();
let mut last_hash = None;
let mut i = 0usize;
for entry in garage.block_ref_table.store.iter() {
for entry in garage.block_ref_table.data.store.iter() {
let (_k, v_bytes) = entry?;
let block_ref = rmp_serde::decode::from_read_ref::<_, BlockRef>(v_bytes.as_ref())?;
if Some(&block_ref.block) == last_hash.as_ref() {
continue;
}
if !block_ref.deleted {
if !block_ref.deleted.get() {
last_hash = Some(block_ref.block);
self.put_to_resync(&block_ref.block, 0)?;
self.put_to_resync(&block_ref.block, Duration::from_secs(0))?;
}
i += 1;
if i & 0xFF == 0 && *must_exit.borrow() {
@ -427,80 +447,69 @@ impl BlockManager {
}
// 2. Repair blocks actually on disk
let mut ls_data_dir = fs::read_dir(&self.data_dir).await?;
while let Some(data_dir_ent) = ls_data_dir.next().await {
let data_dir_ent = data_dir_ent?;
let dir_name = data_dir_ent.file_name();
let dir_name = match dir_name.into_string() {
Ok(x) => x,
Err(_) => continue,
};
if dir_name.len() != 2 || hex::decode(&dir_name).is_err() {
continue;
}
self.repair_aux_read_dir_rec(&self.data_dir, must_exit)
.await?;
let mut ls_data_dir_2 = match fs::read_dir(data_dir_ent.path()).await {
Err(e) => {
warn!(
"Warning: could not list dir {:?}: {}",
data_dir_ent.path().to_str(),
e
);
continue;
}
Ok(x) => x,
};
while let Some(file) = ls_data_dir_2.next().await {
let file = file?;
let file_name = file.file_name();
let file_name = match file_name.into_string() {
Ok(())
}
fn repair_aux_read_dir_rec<'a>(
&'a self,
path: &'a PathBuf,
must_exit: &'a watch::Receiver<bool>,
) -> BoxFuture<'a, Result<(), Error>> {
// Lists all blocks on disk and adds them to the resync queue.
// This allows us to find blocks we are storing but don't actually need,
// so that we can offload them if necessary and then delete them locally.
async move {
let mut ls_data_dir = fs::read_dir(path).await?;
loop {
let data_dir_ent = ls_data_dir.next_entry().await?;
let data_dir_ent = match data_dir_ent {
Some(x) => x,
None => break,
};
let name = data_dir_ent.file_name();
let name = match name.into_string() {
Ok(x) => x,
Err(_) => continue,
};
if file_name.len() != 64 {
continue;
let ent_type = data_dir_ent.file_type().await?;
if name.len() == 2 && hex::decode(&name).is_ok() && ent_type.is_dir() {
self.repair_aux_read_dir_rec(&data_dir_ent.path(), must_exit)
.await?;
} else if name.len() == 64 {
let hash_bytes = match hex::decode(&name) {
Ok(h) => h,
Err(_) => continue,
};
let mut hash = [0u8; 32];
hash.copy_from_slice(&hash_bytes[..]);
self.put_to_resync(&hash.into(), Duration::from_secs(0))?;
}
let hash_bytes = match hex::decode(&file_name) {
Ok(h) => h,
Err(_) => continue,
};
let mut hash = [0u8; 32];
hash.copy_from_slice(&hash_bytes[..]);
self.put_to_resync(&hash.into(), 0)?;
if *must_exit.borrow() {
return Ok(());
break;
}
}
Ok(())
}
Ok(())
.boxed()
}
pub fn resync_queue_len(&self) -> usize {
self.resync_queue.len()
}
pub fn rc_len(&self) -> usize {
self.rc.len()
}
}
fn u64_from_bytes(bytes: &[u8]) -> u64 {
assert!(bytes.len() == 8);
fn u64_from_be_bytes<T: AsRef<[u8]>>(bytes: T) -> u64 {
assert!(bytes.as_ref().len() == 8);
let mut x8 = [0u8; 8];
x8.copy_from_slice(bytes);
x8.copy_from_slice(bytes.as_ref());
u64::from_be_bytes(x8)
}
fn rc_merge(_key: &[u8], old: Option<&[u8]>, new: &[u8]) -> Option<Vec<u8>> {
let old = old.map(u64_from_bytes).unwrap_or(0);
assert!(new.len() == 1);
let new = match new[0] {
0 => {
if old > 0 {
old - 1
} else {
0
}
}
1 => old + 1,
_ => unreachable!(),
};
if new == 0 {
None
} else {
Some(u64::to_be_bytes(new).to_vec())
}
}

View file

@ -1,11 +1,9 @@
use async_trait::async_trait;
use serde::{Deserialize, Serialize};
use std::sync::Arc;
use garage_util::background::*;
use garage_util::data::*;
use garage_util::error::Error;
use garage_table::crdt::CRDT;
use garage_table::*;
use crate::block::*;
@ -19,7 +17,7 @@ pub struct BlockRef {
pub version: UUID,
// Keep track of deleted status
pub deleted: bool,
pub deleted: crdt::Bool,
}
impl Entry<Hash, UUID> for BlockRef {
@ -29,40 +27,44 @@ impl Entry<Hash, UUID> for BlockRef {
fn sort_key(&self) -> &UUID {
&self.version
}
fn is_tombstone(&self) -> bool {
self.deleted.get()
}
}
impl CRDT for BlockRef {
fn merge(&mut self, other: &Self) {
if other.deleted {
self.deleted = true;
}
self.deleted.merge(&other.deleted);
}
}
pub struct BlockRefTable {
pub background: Arc<BackgroundRunner>,
pub block_manager: Arc<BlockManager>,
}
#[async_trait]
impl TableSchema for BlockRefTable {
type P = Hash;
type S = UUID;
type E = BlockRef;
type Filter = DeletedFilter;
async fn updated(&self, old: Option<Self::E>, new: Option<Self::E>) -> Result<(), Error> {
fn updated(&self, old: Option<Self::E>, new: Option<Self::E>) {
let block = &old.as_ref().or(new.as_ref()).unwrap().block;
let was_before = old.as_ref().map(|x| !x.deleted).unwrap_or(false);
let is_after = new.as_ref().map(|x| !x.deleted).unwrap_or(false);
let was_before = old.as_ref().map(|x| !x.deleted.get()).unwrap_or(false);
let is_after = new.as_ref().map(|x| !x.deleted.get()).unwrap_or(false);
if is_after && !was_before {
self.block_manager.block_incref(block)?;
if let Err(e) = self.block_manager.block_incref(block) {
warn!("block_incref failed for block {:?}: {}", block, e);
}
}
if was_before && !is_after {
self.block_manager.block_decref(block)?;
if let Err(e) = self.block_manager.block_decref(block) {
warn!("block_decref failed for block {:?}: {}", block, e);
}
}
Ok(())
}
fn matches_filter(entry: &Self::E, filter: &Self::Filter) -> bool {
filter.apply(entry.deleted)
filter.apply(entry.deleted.get())
}
}

View file

@ -1,15 +1,15 @@
use async_trait::async_trait;
use serde::{Deserialize, Serialize};
use garage_table::crdt::CRDT;
use garage_table::*;
use garage_util::error::Error;
use crate::key_table::PermissionSet;
use model010::bucket_table as prev;
/// A bucket is a collection of objects
///
/// Its parameters are not directly accessible as:
/// - It must be possible to merge paramaters, hence the use of a LWW CRDT.
/// - A bucket has 2 states, Present or Deleted and parameters make sense only if present.
#[derive(PartialEq, Clone, Debug, Serialize, Deserialize)]
pub struct Bucket {
// Primary key
@ -21,27 +21,49 @@ pub struct Bucket {
#[derive(PartialEq, Clone, Debug, Serialize, Deserialize)]
pub enum BucketState {
Deleted,
Present(crdt::LWWMap<String, PermissionSet>),
Present(BucketParams),
}
impl CRDT for BucketState {
fn merge(&mut self, o: &Self) {
match o {
BucketState::Deleted => *self = BucketState::Deleted,
BucketState::Present(other_ak) => {
if let BucketState::Present(ak) = self {
ak.merge(other_ak);
BucketState::Present(other_params) => {
if let BucketState::Present(params) = self {
params.merge(other_params);
}
}
}
}
}
#[derive(PartialEq, Clone, Debug, Serialize, Deserialize)]
pub struct BucketParams {
pub authorized_keys: crdt::LWWMap<String, PermissionSet>,
pub website: crdt::LWW<bool>,
}
impl CRDT for BucketParams {
fn merge(&mut self, o: &Self) {
self.authorized_keys.merge(&o.authorized_keys);
self.website.merge(&o.website);
}
}
impl BucketParams {
pub fn new() -> Self {
BucketParams {
authorized_keys: crdt::LWWMap::new(),
website: crdt::LWW::new(false),
}
}
}
impl Bucket {
pub fn new(name: String) -> Self {
Bucket {
name,
state: crdt::LWW::new(BucketState::Present(crdt::LWWMap::new())),
state: crdt::LWW::new(BucketState::Present(BucketParams::new())),
}
}
pub fn is_deleted(&self) -> bool {
@ -50,7 +72,7 @@ impl Bucket {
pub fn authorized_keys(&self) -> &[(String, u64, PermissionSet)] {
match self.state.get() {
BucketState::Deleted => &[],
BucketState::Present(ak) => ak.items(),
BucketState::Present(state) => state.authorized_keys.items(),
}
}
}
@ -62,7 +84,9 @@ impl Entry<EmptyKey, String> for Bucket {
fn sort_key(&self) -> &String {
&self.name
}
}
impl CRDT for Bucket {
fn merge(&mut self, other: &Self) {
self.state.merge(&other.state);
}
@ -70,47 +94,13 @@ impl Entry<EmptyKey, String> for Bucket {
pub struct BucketTable;
#[async_trait]
impl TableSchema for BucketTable {
type P = EmptyKey;
type S = String;
type E = Bucket;
type Filter = DeletedFilter;
async fn updated(&self, _old: Option<Self::E>, _new: Option<Self::E>) -> Result<(), Error> {
Ok(())
}
fn matches_filter(entry: &Self::E, filter: &Self::Filter) -> bool {
filter.apply(entry.is_deleted())
}
fn try_migrate(bytes: &[u8]) -> Option<Self::E> {
let old = match rmp_serde::decode::from_read_ref::<_, prev::Bucket>(bytes) {
Ok(x) => x,
Err(_) => return None,
};
if old.deleted {
Some(Bucket {
name: old.name,
state: crdt::LWW::migrate_from_raw(old.timestamp, BucketState::Deleted),
})
} else {
let mut keys = crdt::LWWMap::new();
for ak in old.authorized_keys() {
keys.merge(&crdt::LWWMap::migrate_from_raw_item(
ak.key_id.clone(),
ak.timestamp,
PermissionSet {
allow_read: ak.allow_read,
allow_write: ak.allow_write,
},
));
}
Some(Bucket {
name: old.name,
state: crdt::LWW::migrate_from_raw(old.timestamp, BucketState::Present(keys)),
})
}
}
}

View file

@ -7,8 +7,8 @@ use garage_rpc::membership::System;
use garage_rpc::rpc_client::RpcHttpClient;
use garage_rpc::rpc_server::RpcServer;
use garage_table::table_fullcopy::*;
use garage_table::table_sharded::*;
use garage_table::replication::fullcopy::*;
use garage_table::replication::sharded::*;
use garage_table::*;
use crate::block::*;
@ -35,7 +35,7 @@ pub struct Garage {
}
impl Garage {
pub async fn new(
pub fn new(
config: Config,
db: sled::Db,
background: Arc<BackgroundRunner>,
@ -54,21 +54,23 @@ impl Garage {
);
let data_rep_param = TableShardedReplication {
system: system.clone(),
replication_factor: config.data_replication_factor,
write_quorum: (config.data_replication_factor + 1) / 2,
read_quorum: 1,
};
let meta_rep_param = TableShardedReplication {
system: system.clone(),
replication_factor: config.meta_replication_factor,
write_quorum: (config.meta_replication_factor + 1) / 2,
read_quorum: (config.meta_replication_factor + 1) / 2,
};
let control_rep_param = TableFullReplication::new(
config.meta_epidemic_fanout,
(config.meta_epidemic_fanout + 1) / 2,
);
let control_rep_param = TableFullReplication {
system: system.clone(),
max_faults: config.control_write_max_faults,
};
info!("Initialize block manager...");
let block_manager = BlockManager::new(
@ -82,7 +84,6 @@ impl Garage {
info!("Initialize block_ref_table...");
let block_ref_table = Table::new(
BlockRefTable {
background: background.clone(),
block_manager: block_manager.clone(),
},
data_rep_param.clone(),
@ -90,8 +91,7 @@ impl Garage {
&db,
"block_ref".to_string(),
rpc_server,
)
.await;
);
info!("Initialize version_table...");
let version_table = Table::new(
@ -104,8 +104,7 @@ impl Garage {
&db,
"version".to_string(),
rpc_server,
)
.await;
);
info!("Initialize object_table...");
let object_table = Table::new(
@ -118,8 +117,7 @@ impl Garage {
&db,
"object".to_string(),
rpc_server,
)
.await;
);
info!("Initialize bucket_table...");
let bucket_table = Table::new(
@ -129,8 +127,7 @@ impl Garage {
&db,
"bucket".to_string(),
rpc_server,
)
.await;
);
info!("Initialize key_table_table...");
let key_table = Table::new(
@ -140,8 +137,7 @@ impl Garage {
&db,
"key".to_string(),
rpc_server,
)
.await;
);
info!("Initialize Garage...");
let garage = Arc::new(Self {
@ -159,7 +155,7 @@ impl Garage {
info!("Start block manager background thread...");
garage.block_manager.garage.swap(Some(garage.clone()));
garage.block_manager.clone().spawn_background_worker().await;
garage.block_manager.clone().spawn_background_worker();
garage
}

View file

@ -1,13 +1,8 @@
use async_trait::async_trait;
use serde::{Deserialize, Serialize};
use garage_table::crdt::CRDT;
use garage_table::crdt::*;
use garage_table::*;
use garage_util::error::Error;
use model010::key_table as prev;
#[derive(PartialEq, Clone, Debug, Serialize, Deserialize)]
pub struct Key {
// Primary key
@ -39,6 +34,15 @@ impl Key {
authorized_buckets: crdt::LWWMap::new(),
}
}
pub fn import(key_id: &str, secret_key: &str, name: &str) -> Self {
Self {
key_id: key_id.to_string(),
secret_key: secret_key.to_string(),
name: crdt::LWW::new(name.to_string()),
deleted: crdt::Bool::new(false),
authorized_buckets: crdt::LWWMap::new(),
}
}
pub fn delete(key_id: String) -> Self {
Self {
key_id,
@ -69,6 +73,10 @@ pub struct PermissionSet {
pub allow_write: bool,
}
impl AutoCRDT for PermissionSet {
const WARN_IF_DIFFERENT: bool = true;
}
impl Entry<EmptyKey, String> for Key {
fn partition_key(&self) -> &EmptyKey {
&EmptyKey
@ -76,60 +84,43 @@ impl Entry<EmptyKey, String> for Key {
fn sort_key(&self) -> &String {
&self.key_id
}
}
impl CRDT for Key {
fn merge(&mut self, other: &Self) {
self.name.merge(&other.name);
self.deleted.merge(&other.deleted);
if self.deleted.get() {
self.authorized_buckets.clear();
return;
} else {
self.authorized_buckets.merge(&other.authorized_buckets);
}
self.authorized_buckets.merge(&other.authorized_buckets);
}
}
pub struct KeyTable;
#[async_trait]
#[derive(Clone, Debug, Serialize, Deserialize)]
pub enum KeyFilter {
Deleted(DeletedFilter),
Matches(String),
}
impl TableSchema for KeyTable {
type P = EmptyKey;
type S = String;
type E = Key;
type Filter = DeletedFilter;
async fn updated(&self, _old: Option<Self::E>, _new: Option<Self::E>) -> Result<(), Error> {
Ok(())
}
type Filter = KeyFilter;
fn matches_filter(entry: &Self::E, filter: &Self::Filter) -> bool {
filter.apply(entry.deleted.get())
}
fn try_migrate(bytes: &[u8]) -> Option<Self::E> {
let old = match rmp_serde::decode::from_read_ref::<_, prev::Key>(bytes) {
Ok(x) => x,
Err(_) => return None,
};
let mut new = Self::E {
key_id: old.key_id.clone(),
secret_key: old.secret_key.clone(),
name: crdt::LWW::migrate_from_raw(old.name_timestamp, old.name.clone()),
deleted: crdt::Bool::new(old.deleted),
authorized_buckets: crdt::LWWMap::new(),
};
for ab in old.authorized_buckets() {
let it = crdt::LWWMap::migrate_from_raw_item(
ab.bucket.clone(),
ab.timestamp,
PermissionSet {
allow_read: ab.allow_read,
allow_write: ab.allow_write,
},
);
new.authorized_buckets.merge(&it);
match filter {
KeyFilter::Deleted(df) => df.apply(entry.deleted.get()),
KeyFilter::Matches(pat) => {
let pat = pat.to_lowercase();
entry.key_id.to_lowercase().starts_with(&pat)
|| entry.name.get().to_lowercase() == pat
}
}
Some(new)
}
}

View file

@ -1,19 +1,16 @@
use async_trait::async_trait;
use serde::{Deserialize, Serialize};
use std::collections::BTreeMap;
use std::sync::Arc;
use garage_util::background::BackgroundRunner;
use garage_util::data::*;
use garage_util::error::Error;
use garage_table::table_sharded::*;
use garage_table::crdt::*;
use garage_table::replication::sharded::*;
use garage_table::*;
use crate::version_table::*;
use model010::object_table as prev;
#[derive(PartialEq, Clone, Debug, Serialize, Deserialize)]
pub struct Object {
// Primary key
@ -72,7 +69,7 @@ pub enum ObjectVersionState {
Aborted,
}
impl ObjectVersionState {
impl CRDT for ObjectVersionState {
fn merge(&mut self, other: &Self) {
use ObjectVersionState::*;
match other {
@ -93,37 +90,30 @@ impl ObjectVersionState {
}
}
#[derive(PartialEq, Clone, Debug, Serialize, Deserialize)]
#[derive(PartialEq, Eq, PartialOrd, Ord, Clone, Debug, Serialize, Deserialize)]
pub enum ObjectVersionData {
DeleteMarker,
Inline(ObjectVersionMeta, #[serde(with = "serde_bytes")] Vec<u8>),
FirstBlock(ObjectVersionMeta, Hash),
}
#[derive(PartialEq, Clone, Debug, Serialize, Deserialize)]
impl AutoCRDT for ObjectVersionData {
const WARN_IF_DIFFERENT: bool = true;
}
#[derive(PartialEq, Eq, PartialOrd, Ord, Clone, Debug, Serialize, Deserialize)]
pub struct ObjectVersionMeta {
pub headers: ObjectVersionHeaders,
pub size: u64,
pub etag: String,
}
#[derive(PartialEq, Clone, Debug, Serialize, Deserialize)]
#[derive(PartialEq, Eq, PartialOrd, Ord, Clone, Debug, Serialize, Deserialize)]
pub struct ObjectVersionHeaders {
pub content_type: String,
pub other: BTreeMap<String, String>,
}
impl ObjectVersionData {
fn merge(&mut self, b: &Self) {
if *self != *b {
warn!(
"Inconsistent object version data: {:?} (local) vs {:?} (remote)",
self, b
);
}
}
}
impl ObjectVersion {
fn cmp_key(&self) -> (u64, UUID) {
(self.timestamp, self.uuid)
@ -156,8 +146,16 @@ impl Entry<String, String> for Object {
fn sort_key(&self) -> &String {
&self.key
}
fn is_tombstone(&self) -> bool {
self.versions.len() == 1
&& self.versions[0].state
== ObjectVersionState::Complete(ObjectVersionData::DeleteMarker)
}
}
impl CRDT for Object {
fn merge(&mut self, other: &Self) {
// Merge versions from other into here
for other_v in other.versions.iter() {
match self
.versions
@ -171,6 +169,9 @@ impl Entry<String, String> for Object {
}
}
}
// Remove versions which are obsolete, i.e. those that come
// before the last version which .is_complete().
let last_complete = self
.versions
.iter()
@ -191,96 +192,41 @@ pub struct ObjectTable {
pub version_table: Arc<Table<VersionTable, TableShardedReplication>>,
}
#[async_trait]
impl TableSchema for ObjectTable {
type P = String;
type S = String;
type E = Object;
type Filter = DeletedFilter;
async fn updated(&self, old: Option<Self::E>, new: Option<Self::E>) -> Result<(), Error> {
fn updated(&self, old: Option<Self::E>, new: Option<Self::E>) {
let version_table = self.version_table.clone();
if let (Some(old_v), Some(new_v)) = (old, new) {
// Propagate deletion of old versions
for v in old_v.versions.iter() {
let newly_deleted = match new_v
.versions
.binary_search_by(|nv| nv.cmp_key().cmp(&v.cmp_key()))
{
Err(_) => true,
Ok(i) => {
new_v.versions[i].state == ObjectVersionState::Aborted
&& v.state != ObjectVersionState::Aborted
self.background.spawn(async move {
if let (Some(old_v), Some(new_v)) = (old, new) {
// Propagate deletion of old versions
for v in old_v.versions.iter() {
let newly_deleted = match new_v
.versions
.binary_search_by(|nv| nv.cmp_key().cmp(&v.cmp_key()))
{
Err(_) => true,
Ok(i) => {
new_v.versions[i].state == ObjectVersionState::Aborted
&& v.state != ObjectVersionState::Aborted
}
};
if newly_deleted {
let deleted_version =
Version::new(v.uuid, old_v.bucket.clone(), old_v.key.clone(), true);
version_table.insert(&deleted_version).await?;
}
};
if newly_deleted {
let deleted_version = Version::new(
v.uuid,
old_v.bucket.clone(),
old_v.key.clone(),
true,
vec![],
);
version_table.insert(&deleted_version).await?;
}
}
}
Ok(())
Ok(())
})
}
fn matches_filter(entry: &Self::E, filter: &Self::Filter) -> bool {
let deleted = !entry.versions.iter().any(|v| v.is_data());
filter.apply(deleted)
}
fn try_migrate(bytes: &[u8]) -> Option<Self::E> {
let old = match rmp_serde::decode::from_read_ref::<_, prev::Object>(bytes) {
Ok(x) => x,
Err(_) => return None,
};
let new_v = old
.versions()
.iter()
.map(migrate_version)
.collect::<Vec<_>>();
let new = Object::new(old.bucket.clone(), old.key.clone(), new_v);
Some(new)
}
}
fn migrate_version(old: &prev::ObjectVersion) -> ObjectVersion {
let headers = ObjectVersionHeaders {
content_type: old.mime_type.clone(),
other: BTreeMap::new(),
};
let meta = ObjectVersionMeta {
headers: headers.clone(),
size: old.size,
etag: "".to_string(),
};
let state = match old.state {
prev::ObjectVersionState::Uploading => ObjectVersionState::Uploading(headers),
prev::ObjectVersionState::Aborted => ObjectVersionState::Aborted,
prev::ObjectVersionState::Complete => match &old.data {
prev::ObjectVersionData::Uploading => ObjectVersionState::Uploading(headers),
prev::ObjectVersionData::DeleteMarker => {
ObjectVersionState::Complete(ObjectVersionData::DeleteMarker)
}
prev::ObjectVersionData::Inline(x) => {
ObjectVersionState::Complete(ObjectVersionData::Inline(meta, x.clone()))
}
prev::ObjectVersionData::FirstBlock(h) => {
let mut hash = [0u8; 32];
hash.copy_from_slice(h.as_ref());
ObjectVersionState::Complete(ObjectVersionData::FirstBlock(meta, Hash::from(hash)))
}
},
};
let mut uuid = [0u8; 32];
uuid.copy_from_slice(old.uuid.as_ref());
ObjectVersion {
uuid: UUID::from(uuid),
timestamp: old.timestamp,
state,
}
}

View file

@ -1,12 +1,11 @@
use async_trait::async_trait;
use serde::{Deserialize, Serialize};
use std::sync::Arc;
use garage_util::background::BackgroundRunner;
use garage_util::data::*;
use garage_util::error::Error;
use garage_table::table_sharded::*;
use garage_table::crdt::*;
use garage_table::replication::sharded::*;
use garage_table::*;
use crate::block_ref_table::*;
@ -17,8 +16,11 @@ pub struct Version {
pub uuid: UUID,
// Actual data: the blocks for this version
pub deleted: bool,
blocks: Vec<VersionBlock>,
// In the case of a multipart upload, also store the etags
// of individual parts and check them when doing CompleteMultipartUpload
pub deleted: crdt::Bool,
pub blocks: crdt::Map<VersionBlockKey, VersionBlock>,
pub parts_etags: crdt::Map<u64, String>,
// Back link to bucket+key so that we can figure if
// this was deleted later on
@ -27,56 +29,46 @@ pub struct Version {
}
impl Version {
pub fn new(
uuid: UUID,
bucket: String,
key: String,
deleted: bool,
blocks: Vec<VersionBlock>,
) -> Self {
let mut ret = Self {
pub fn new(uuid: UUID, bucket: String, key: String, deleted: bool) -> Self {
Self {
uuid,
deleted,
blocks: vec![],
deleted: deleted.into(),
blocks: crdt::Map::new(),
parts_etags: crdt::Map::new(),
bucket,
key,
};
for b in blocks {
ret.add_block(b)
.expect("Twice the same VersionBlock in Version constructor");
}
ret
}
/// Adds a block if it wasn't already present
pub fn add_block(&mut self, new: VersionBlock) -> Result<(), ()> {
match self
.blocks
.binary_search_by(|b| b.cmp_key().cmp(&new.cmp_key()))
{
Err(i) => {
self.blocks.insert(i, new);
Ok(())
}
Ok(_) => Err(()),
}
}
pub fn blocks(&self) -> &[VersionBlock] {
&self.blocks[..]
}
}
#[derive(PartialEq, Clone, Debug, Serialize, Deserialize)]
pub struct VersionBlock {
#[derive(PartialEq, Eq, Clone, Copy, Debug, Serialize, Deserialize)]
pub struct VersionBlockKey {
pub part_number: u64,
pub offset: u64,
}
impl Ord for VersionBlockKey {
fn cmp(&self, other: &Self) -> std::cmp::Ordering {
self.part_number
.cmp(&other.part_number)
.then(self.offset.cmp(&other.offset))
}
}
impl PartialOrd for VersionBlockKey {
fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
Some(self.cmp(other))
}
}
#[derive(PartialEq, Eq, Ord, PartialOrd, Clone, Copy, Debug, Serialize, Deserialize)]
pub struct VersionBlock {
pub hash: Hash,
pub size: u64,
}
impl VersionBlock {
fn cmp_key(&self) -> (u64, u64) {
(self.part_number, self.offset)
}
impl AutoCRDT for VersionBlock {
const WARN_IF_DIFFERENT: bool = true;
}
impl Entry<Hash, EmptyKey> for Version {
@ -86,23 +78,21 @@ impl Entry<Hash, EmptyKey> for Version {
fn sort_key(&self) -> &EmptyKey {
&EmptyKey
}
fn is_tombstone(&self) -> bool {
self.deleted.get()
}
}
impl CRDT for Version {
fn merge(&mut self, other: &Self) {
if other.deleted {
self.deleted = true;
self.deleted.merge(&other.deleted);
if self.deleted.get() {
self.blocks.clear();
} else if !self.deleted {
for bi in other.blocks.iter() {
match self
.blocks
.binary_search_by(|x| x.cmp_key().cmp(&bi.cmp_key()))
{
Ok(_) => (),
Err(pos) => {
self.blocks.insert(pos, bi.clone());
}
}
}
self.parts_etags.clear();
} else {
self.blocks.merge(&other.blocks);
self.parts_etags.merge(&other.parts_etags);
}
}
}
@ -112,34 +102,36 @@ pub struct VersionTable {
pub block_ref_table: Arc<Table<BlockRefTable, TableShardedReplication>>,
}
#[async_trait]
impl TableSchema for VersionTable {
type P = Hash;
type S = EmptyKey;
type E = Version;
type Filter = DeletedFilter;
async fn updated(&self, old: Option<Self::E>, new: Option<Self::E>) -> Result<(), Error> {
fn updated(&self, old: Option<Self::E>, new: Option<Self::E>) {
let block_ref_table = self.block_ref_table.clone();
if let (Some(old_v), Some(new_v)) = (old, new) {
// Propagate deletion of version blocks
if new_v.deleted && !old_v.deleted {
let deleted_block_refs = old_v
.blocks
.iter()
.map(|vb| BlockRef {
block: vb.hash,
version: old_v.uuid,
deleted: true,
})
.collect::<Vec<_>>();
block_ref_table.insert_many(&deleted_block_refs[..]).await?;
self.background.spawn(async move {
if let (Some(old_v), Some(new_v)) = (old, new) {
// Propagate deletion of version blocks
if new_v.deleted.get() && !old_v.deleted.get() {
let deleted_block_refs = old_v
.blocks
.items()
.iter()
.map(|(_k, vb)| BlockRef {
block: vb.hash,
version: old_v.uuid,
deleted: true.into(),
})
.collect::<Vec<_>>();
block_ref_table.insert_many(&deleted_block_refs[..]).await?;
}
}
}
Ok(())
Ok(())
})
}
fn matches_filter(entry: &Self::E, filter: &Self::Filter) -> bool {
filter.apply(entry.deleted)
filter.apply(entry.deleted.get())
}
}

View file

@ -1,9 +1,9 @@
[package]
name = "garage_rpc"
version = "0.1.0"
version = "0.2.1"
authors = ["Alex Auvolat <alex@adnab.me>"]
edition = "2018"
license = "GPL-3.0"
license = "AGPL-3.0"
description = "Cluster membership management and RPC protocol for the Garage object store"
repository = "https://git.deuxfleurs.fr/Deuxfleurs/garage"
@ -13,29 +13,28 @@ path = "lib.rs"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
garage_util = { version = "0.1", path = "../util" }
garage_util = { version = "0.2.1", path = "../util" }
bytes = "0.4"
rand = "0.7"
hex = "0.3"
sha2 = "0.8"
arc-swap = "0.4"
arc-swap = "1.0"
bytes = "1.0"
gethostname = "0.2"
hex = "0.4"
log = "0.4"
rmp-serde = "0.14.3"
rmp-serde = "0.15"
serde = { version = "1.0", default-features = false, features = ["derive", "rc"] }
serde_json = "1.0"
futures = "0.3"
futures-util = "0.3"
tokio = { version = "0.2", default-features = false, features = ["rt-core", "rt-threaded", "io-driver", "net", "tcp", "time", "macros", "sync", "signal", "fs"] }
tokio = { version = "1.0", default-features = false, features = ["rt", "rt-multi-thread", "io-util", "net", "time", "macros", "sync", "signal", "fs"] }
tokio-stream = { version = "0.1", features = ["net"] }
http = "0.2"
hyper = "0.13"
rustls = "0.17"
tokio-rustls = "0.13"
hyper-rustls = { version = "0.20", default-features = false }
hyper = { version = "0.14", features = ["full"] }
hyper-rustls = { version = "0.22", default-features = false }
rustls = "0.19"
tokio-rustls = "0.22"
webpki = "0.21"

View file

@ -2,7 +2,10 @@
extern crate log;
pub mod consul;
pub(crate) mod tls_util;
pub mod membership;
pub mod ring;
pub mod rpc_client;
pub mod rpc_server;
pub mod tls_util;

View file

@ -1,6 +1,5 @@
use std::collections::HashMap;
use std::hash::Hash as StdHash;
use std::hash::Hasher;
use std::fmt::Write as FmtWrite;
use std::io::{Read, Write};
use std::net::{IpAddr, SocketAddr};
use std::path::PathBuf;
@ -12,21 +11,22 @@ use futures::future::join_all;
use futures::select;
use futures_util::future::*;
use serde::{Deserialize, Serialize};
use sha2::{Digest, Sha256};
use tokio::prelude::*;
use tokio::sync::watch;
use tokio::sync::Mutex;
use garage_util::background::BackgroundRunner;
use garage_util::data::*;
use garage_util::error::Error;
use garage_util::persister::Persister;
use garage_util::time::*;
use crate::consul::get_consul_nodes;
use crate::ring::*;
use crate::rpc_client::*;
use crate::rpc_server::*;
const PING_INTERVAL: Duration = Duration::from_secs(10);
const CONSUL_INTERVAL: Duration = Duration::from_secs(60);
const DISCOVERY_INTERVAL: Duration = Duration::from_secs(60);
const PING_TIMEOUT: Duration = Duration::from_secs(2);
const MAX_FAILURES_BEFORE_CONSIDERED_DOWN: usize = 5;
@ -46,13 +46,13 @@ impl RpcMessage for Message {}
#[derive(Debug, Serialize, Deserialize)]
pub struct PingMessage {
pub id: UUID,
pub rpc_port: u16,
id: UUID,
rpc_port: u16,
pub status_hash: Hash,
pub config_version: u64,
status_hash: Hash,
config_version: u64,
pub state_info: StateInfo,
state_info: StateInfo,
}
#[derive(Clone, Debug, Serialize, Deserialize)]
@ -66,37 +66,31 @@ pub struct AdvertisedNode {
pub state_info: StateInfo,
}
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct NetworkConfig {
pub members: HashMap<UUID, NetworkConfigEntry>,
pub version: u64,
}
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct NetworkConfigEntry {
pub datacenter: String,
pub n_tokens: u32,
pub tag: String,
}
pub struct System {
pub id: UUID,
pub data_dir: PathBuf,
pub rpc_local_port: u16,
pub state_info: StateInfo,
persist_config: Persister<NetworkConfig>,
persist_status: Persister<Vec<AdvertisedNode>>,
rpc_local_port: u16,
pub rpc_http_client: Arc<RpcHttpClient>,
state_info: StateInfo,
rpc_http_client: Arc<RpcHttpClient>,
rpc_client: Arc<RpcClient<Message>>,
pub status: watch::Receiver<Arc<Status>>,
pub(crate) status: watch::Receiver<Arc<Status>>,
pub ring: watch::Receiver<Arc<Ring>>,
update_lock: Mutex<(watch::Sender<Arc<Status>>, watch::Sender<Arc<Ring>>)>,
update_lock: Mutex<Updaters>,
pub background: Arc<BackgroundRunner>,
}
struct Updaters {
update_status: watch::Sender<Arc<Status>>,
update_ring: watch::Sender<Arc<Ring>>,
}
#[derive(Debug, Clone)]
pub struct Status {
pub nodes: HashMap<UUID, Arc<StatusEntry>>,
@ -122,20 +116,6 @@ pub struct StateInfo {
pub hostname: String,
}
#[derive(Clone)]
pub struct Ring {
pub config: NetworkConfig,
pub ring: Vec<RingEntry>,
pub n_datacenters: usize,
}
#[derive(Clone, Debug)]
pub struct RingEntry {
pub location: Hash,
pub node: UUID,
pub datacenter: u64,
}
impl Status {
fn handle_ping(&mut self, ip: IpAddr, info: &PingMessage) -> bool {
let addr = SocketAddr::new(ip, info.rpc_port);
@ -161,96 +141,33 @@ impl Status {
let mut nodes = self.nodes.iter().collect::<Vec<_>>();
nodes.sort_unstable_by_key(|(id, _status)| *id);
let mut hasher = Sha256::new();
let mut nodes_txt = String::new();
debug!("Current set of pingable nodes: --");
for (id, status) in nodes {
debug!("{} {}", hex::encode(&id), status.addr);
hasher.input(format!("{} {}\n", hex::encode(&id), status.addr));
writeln!(&mut nodes_txt, "{} {}", hex::encode(&id), status.addr).unwrap();
}
debug!("END --");
self.hash
.as_slice_mut()
.copy_from_slice(&hasher.result()[..]);
}
}
impl Ring {
fn rebuild_ring(&mut self) {
let mut new_ring = vec![];
let mut datacenters = vec![];
for (id, config) in self.config.members.iter() {
let mut dc_hasher = std::collections::hash_map::DefaultHasher::new();
config.datacenter.hash(&mut dc_hasher);
let datacenter = dc_hasher.finish();
if !datacenters.contains(&datacenter) {
datacenters.push(datacenter);
}
for i in 0..config.n_tokens {
let location = hash(format!("{} {}", hex::encode(&id), i).as_bytes());
new_ring.push(RingEntry {
location: location.into(),
node: *id,
datacenter,
})
}
}
new_ring.sort_unstable_by(|x, y| x.location.cmp(&y.location));
self.ring = new_ring;
self.n_datacenters = datacenters.len();
// eprintln!("RING: --");
// for e in self.ring.iter() {
// eprintln!("{:?}", e);
// }
// eprintln!("END --");
self.hash = blake2sum(nodes_txt.as_bytes());
}
pub fn walk_ring(&self, from: &Hash, n: usize) -> Vec<UUID> {
if n >= self.config.members.len() {
return self.config.members.keys().cloned().collect::<Vec<_>>();
fn to_serializable_membership(&self, system: &System) -> Vec<AdvertisedNode> {
let mut mem = vec![];
for (node, status) in self.nodes.iter() {
let state_info = if *node == system.id {
system.state_info.clone()
} else {
status.state_info.clone()
};
mem.push(AdvertisedNode {
id: *node,
addr: status.addr,
is_up: status.is_up(),
last_seen: status.last_seen,
state_info,
});
}
let start = match self.ring.binary_search_by(|x| x.location.cmp(from)) {
Ok(i) => i,
Err(i) => {
if i == 0 {
self.ring.len() - 1
} else {
i - 1
}
}
};
self.walk_ring_from_pos(start, n)
}
fn walk_ring_from_pos(&self, start: usize, n: usize) -> Vec<UUID> {
if n >= self.config.members.len() {
return self.config.members.keys().cloned().collect::<Vec<_>>();
}
let mut ret = vec![];
let mut datacenters = vec![];
let mut delta = 0;
while ret.len() < n {
let i = (start + delta) % self.ring.len();
delta += 1;
if !datacenters.contains(&self.ring[i].datacenter) {
ret.push(self.ring[i].node);
datacenters.push(self.ring[i].datacenter);
} else if datacenters.len() == self.n_datacenters && !ret.contains(&self.ring[i].node) {
ret.push(self.ring[i].node);
}
}
ret
mem
}
}
@ -277,46 +194,30 @@ fn gen_node_id(metadata_dir: &PathBuf) -> Result<UUID, Error> {
}
}
fn read_network_config(metadata_dir: &PathBuf) -> Result<NetworkConfig, Error> {
let mut path = metadata_dir.clone();
path.push("network_config");
let mut file = std::fs::OpenOptions::new()
.read(true)
.open(path.as_path())?;
let mut net_config_bytes = vec![];
file.read_to_end(&mut net_config_bytes)?;
let net_config = rmp_serde::decode::from_read_ref(&net_config_bytes[..])
.expect("Unable to parse network configuration file (has version format changed?).");
Ok(net_config)
}
impl System {
pub fn new(
data_dir: PathBuf,
metadata_dir: PathBuf,
rpc_http_client: Arc<RpcHttpClient>,
background: Arc<BackgroundRunner>,
rpc_server: &mut RpcServer,
) -> Arc<Self> {
let id = gen_node_id(&data_dir).expect("Unable to read or generate node ID");
let id = gen_node_id(&metadata_dir).expect("Unable to read or generate node ID");
info!("Node ID: {}", hex::encode(&id));
let net_config = match read_network_config(&data_dir) {
let persist_config = Persister::new(&metadata_dir, "network_config");
let persist_status = Persister::new(&metadata_dir, "peer_info");
let net_config = match persist_config.load() {
Ok(x) => x,
Err(e) => {
info!(
"No valid previous network configuration stored ({}), starting fresh.",
e
);
NetworkConfig {
members: HashMap::new(),
version: 0,
}
NetworkConfig::new()
}
};
let mut status = Status {
nodes: HashMap::new(),
hash: Hash::default(),
@ -330,12 +231,7 @@ impl System {
.unwrap_or("<invalid utf-8>".to_string()),
};
let mut ring = Ring {
config: net_config,
ring: Vec::new(),
n_datacenters: 0,
};
ring.rebuild_ring();
let ring = Ring::new(net_config);
let (update_ring, ring) = watch::channel(Arc::new(ring));
let rpc_path = MEMBERSHIP_RPC_PATH.to_string();
@ -347,14 +243,18 @@ impl System {
let sys = Arc::new(System {
id,
data_dir,
persist_config,
persist_status,
rpc_local_port: rpc_server.bind_addr.port(),
state_info,
rpc_http_client,
rpc_client,
status,
ring,
update_lock: Mutex::new((update_status, update_ring)),
update_lock: Mutex::new(Updaters {
update_status,
update_ring,
}),
background,
});
sys.clone().register_handler(rpc_server, rpc_path);
@ -388,18 +288,15 @@ impl System {
}
async fn save_network_config(self: Arc<Self>) -> Result<(), Error> {
let mut path = self.data_dir.clone();
path.push("network_config");
let ring = self.ring.borrow().clone();
let data = rmp_to_vec_all_named(&ring.config)?;
let mut f = tokio::fs::File::create(path.as_path()).await?;
f.write_all(&data[..]).await?;
self.persist_config
.save_async(&ring.config)
.await
.expect("Cannot save current cluster configuration");
Ok(())
}
pub fn make_ping(&self) -> Message {
fn make_ping(&self) -> Message {
let status = self.status.borrow().clone();
let ring = self.ring.borrow().clone();
Message::Ping(PingMessage {
@ -411,7 +308,7 @@ impl System {
})
}
pub async fn broadcast(self: Arc<Self>, msg: Message, timeout: Duration) {
async fn broadcast(self: Arc<Self>, msg: Message, timeout: Duration) {
let status = self.status.borrow().clone();
let to = status
.nodes
@ -424,32 +321,21 @@ impl System {
pub async fn bootstrap(
self: Arc<Self>,
peers: &[SocketAddr],
peers: Vec<SocketAddr>,
consul_host: Option<String>,
consul_service_name: Option<String>,
) {
let bootstrap_peers = peers.iter().map(|ip| (*ip, None)).collect::<Vec<_>>();
self.clone().ping_nodes(bootstrap_peers).await;
let self2 = self.clone();
self.background
.spawn_worker(format!("discovery loop"), |stop_signal| {
self2.discovery_loop(peers, consul_host, consul_service_name, stop_signal)
});
let self2 = self.clone();
self.clone()
.background
self.background
.spawn_worker(format!("ping loop"), |stop_signal| {
self2.ping_loop(stop_signal).map(Ok)
})
.await;
if let (Some(consul_host), Some(consul_service_name)) = (consul_host, consul_service_name) {
let self2 = self.clone();
self.clone()
.background
.spawn_worker(format!("Consul loop"), |stop_signal| {
self2
.consul_loop(stop_signal, consul_host, consul_service_name)
.map(Ok)
})
.await;
}
self2.ping_loop(stop_signal)
});
}
async fn ping_nodes(self: Arc<Self>, peers: Vec<(SocketAddr, Option<UUID>)>) {
@ -516,9 +402,7 @@ impl System {
if has_changes {
status.recalculate_hash();
}
if let Err(e) = update_locked.0.broadcast(Arc::new(status)) {
error!("In ping_nodes: could not save status update ({})", e);
}
self.update_status(&update_locked, status).await;
drop(update_locked);
if to_advertise.len() > 0 {
@ -527,7 +411,7 @@ impl System {
}
}
pub async fn handle_ping(
async fn handle_ping(
self: Arc<Self>,
from: &SocketAddr,
ping: &PingMessage,
@ -542,7 +426,7 @@ impl System {
let status_hash = status.hash;
let config_version = self.ring.borrow().config.version;
update_locked.0.broadcast(Arc::new(status))?;
self.update_status(&update_locked, status).await;
drop(update_locked);
if is_new || status_hash != ping.status_hash {
@ -557,32 +441,18 @@ impl System {
Ok(self.make_ping())
}
pub fn handle_pull_status(&self) -> Result<Message, Error> {
let status = self.status.borrow().clone();
let mut mem = vec![];
for (node, status) in status.nodes.iter() {
let state_info = if *node == self.id {
self.state_info.clone()
} else {
status.state_info.clone()
};
mem.push(AdvertisedNode {
id: *node,
addr: status.addr,
is_up: status.is_up(),
last_seen: status.last_seen,
state_info,
});
}
Ok(Message::AdvertiseNodesUp(mem))
fn handle_pull_status(&self) -> Result<Message, Error> {
Ok(Message::AdvertiseNodesUp(
self.status.borrow().to_serializable_membership(self),
))
}
pub fn handle_pull_config(&self) -> Result<Message, Error> {
fn handle_pull_config(&self) -> Result<Message, Error> {
let ring = self.ring.borrow().clone();
Ok(Message::AdvertiseConfig(ring.config.clone()))
}
pub async fn handle_advertise_nodes_up(
async fn handle_advertise_nodes_up(
self: Arc<Self>,
adv: &[AdvertisedNode],
) -> Result<Message, Error> {
@ -624,7 +494,7 @@ impl System {
if has_changed {
status.recalculate_hash();
}
update_lock.0.broadcast(Arc::new(status))?;
self.update_status(&update_lock, status).await;
drop(update_lock);
if to_ping.len() > 0 {
@ -635,17 +505,16 @@ impl System {
Ok(Message::Ok)
}
pub async fn handle_advertise_config(
async fn handle_advertise_config(
self: Arc<Self>,
adv: &NetworkConfig,
) -> Result<Message, Error> {
let update_lock = self.update_lock.lock().await;
let mut ring: Ring = self.ring.borrow().as_ref().clone();
let ring: Arc<Ring> = self.ring.borrow().clone();
if adv.version > ring.config.version {
ring.config = adv.clone();
ring.rebuild_ring();
update_lock.1.broadcast(Arc::new(ring))?;
let ring = Ring::new(adv.clone());
update_lock.update_ring.send(Arc::new(ring))?;
drop(update_lock);
self.background.spawn_cancellable(
@ -660,8 +529,8 @@ impl System {
}
async fn ping_loop(self: Arc<Self>, mut stop_signal: watch::Receiver<bool>) {
loop {
let restart_at = tokio::time::delay_for(PING_INTERVAL);
while !*stop_signal.borrow() {
let restart_at = tokio::time::sleep(PING_INTERVAL);
let status = self.status.borrow().clone();
let ping_addrs = status
@ -675,48 +544,72 @@ impl System {
select! {
_ = restart_at.fuse() => (),
must_exit = stop_signal.recv().fuse() => {
match must_exit {
None | Some(true) => return,
_ => (),
}
}
_ = stop_signal.changed().fuse() => (),
}
}
}
async fn consul_loop(
async fn discovery_loop(
self: Arc<Self>,
bootstrap_peers: Vec<SocketAddr>,
consul_host: Option<String>,
consul_service_name: Option<String>,
mut stop_signal: watch::Receiver<bool>,
consul_host: String,
consul_service_name: String,
) {
loop {
let restart_at = tokio::time::delay_for(CONSUL_INTERVAL);
let consul_config = match (consul_host, consul_service_name) {
(Some(ch), Some(csn)) => Some((ch, csn)),
_ => None,
};
match get_consul_nodes(&consul_host, &consul_service_name).await {
Ok(mut node_list) => {
let ping_addrs = node_list.drain(..).map(|a| (a, None)).collect::<Vec<_>>();
self.clone().ping_nodes(ping_addrs).await;
while !*stop_signal.borrow() {
let not_configured = self.ring.borrow().config.members.len() == 0;
let no_peers = self.status.borrow().nodes.len() < 3;
let bad_peers = self
.status
.borrow()
.nodes
.iter()
.filter(|(_, v)| v.is_up())
.count() != self.ring.borrow().config.members.len();
if not_configured || no_peers || bad_peers {
info!("Doing a bootstrap/discovery step (not_configured: {}, no_peers: {}, bad_peers: {})", not_configured, no_peers, bad_peers);
let mut ping_list = bootstrap_peers
.iter()
.map(|ip| (*ip, None))
.collect::<Vec<_>>();
match self.persist_status.load_async().await {
Ok(peers) => {
ping_list.extend(peers.iter().map(|x| (x.addr, Some(x.id))));
}
_ => (),
}
Err(e) => {
warn!("Could not retrieve node list from Consul: {}", e);
if let Some((consul_host, consul_service_name)) = &consul_config {
match get_consul_nodes(consul_host, consul_service_name).await {
Ok(node_list) => {
ping_list.extend(node_list.iter().map(|a| (*a, None)));
}
Err(e) => {
warn!("Could not retrieve node list from Consul: {}", e);
}
}
}
self.clone().ping_nodes(ping_list).await;
}
let restart_at = tokio::time::sleep(DISCOVERY_INTERVAL);
select! {
_ = restart_at.fuse() => (),
must_exit = stop_signal.recv().fuse() => {
match must_exit {
None | Some(true) => return,
_ => (),
}
}
_ = stop_signal.changed().fuse() => (),
}
}
}
pub fn pull_status(
fn pull_status(
self: Arc<Self>,
peer: UUID,
) -> impl futures::future::Future<Output = ()> + Send + 'static {
@ -731,7 +624,7 @@ impl System {
}
}
pub async fn pull_config(self: Arc<Self>, peer: UUID) {
async fn pull_config(self: Arc<Self>, peer: UUID) {
let resp = self
.rpc_client
.call(peer, Message::PullConfig, PING_TIMEOUT)
@ -740,4 +633,35 @@ impl System {
let _: Result<_, _> = self.handle_advertise_config(&config).await;
}
}
async fn update_status(self: &Arc<Self>, updaters: &Updaters, status: Status) {
if status.hash != self.status.borrow().hash {
let mut list = status.to_serializable_membership(&self);
// Combine with old peer list to make sure no peer is lost
match self.persist_status.load_async().await {
Ok(old_list) => {
for pp in old_list {
if !list.iter().any(|np| pp.id == np.id) {
list.push(pp);
}
}
}
_ => (),
}
if list.len() > 0 {
info!("Persisting new peer list ({} peers)", list.len());
self.persist_status
.save_async(&list)
.await
.expect("Unable to persist peer list");
}
}
updaters
.update_status
.send(Arc::new(status))
.expect("Could not update internal membership status");
}
}

215
src/rpc/ring.rs Normal file
View file

@ -0,0 +1,215 @@
use std::collections::{HashMap, HashSet};
use std::convert::TryInto;
use serde::{Deserialize, Serialize};
use garage_util::data::*;
// A partition number is encoded on 16 bits,
// i.e. we have up to 2**16 partitions.
// (in practice we have exactly 2**PARTITION_BITS partitions)
pub type Partition = u16;
// TODO: make this constant parametrizable in the config file
// For deployments with many nodes it might make sense to bump
// it up to 10.
// Maximum value : 16
pub const PARTITION_BITS: usize = 8;
const PARTITION_MASK_U16: u16 = ((1 << PARTITION_BITS) - 1) << (16 - PARTITION_BITS);
// TODO: make this constant paraetrizable in the config file
// (most deployments use a replication factor of 3, so...)
pub const MAX_REPLICATION: usize = 3;
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct NetworkConfig {
pub members: HashMap<UUID, NetworkConfigEntry>,
pub version: u64,
}
impl NetworkConfig {
pub(crate) fn new() -> Self {
Self {
members: HashMap::new(),
version: 0,
}
}
}
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct NetworkConfigEntry {
pub datacenter: String,
pub capacity: u32,
pub tag: String,
}
#[derive(Clone)]
pub struct Ring {
pub config: NetworkConfig,
pub ring: Vec<RingEntry>,
}
#[derive(Clone, Debug)]
pub struct RingEntry {
pub location: Hash,
pub nodes: [UUID; MAX_REPLICATION],
}
impl Ring {
pub(crate) fn new(config: NetworkConfig) -> Self {
// Create a vector of partition indices (0 to 2**PARTITION_BITS-1)
let partitions_idx = (0usize..(1usize << PARTITION_BITS)).collect::<Vec<_>>();
let datacenters = config
.members
.iter()
.map(|(_id, info)| info.datacenter.as_str())
.collect::<HashSet<&str>>();
let n_datacenters = datacenters.len();
// Prepare ring
let mut partitions: Vec<Vec<(&UUID, &NetworkConfigEntry)>> = partitions_idx
.iter()
.map(|_i| Vec::new())
.collect::<Vec<_>>();
// Create MagLev priority queues for each node
let mut queues = config
.members
.iter()
.map(|(node_id, node_info)| {
let mut parts = partitions_idx
.iter()
.map(|i| {
let part_data =
[&u16::to_be_bytes(*i as u16)[..], node_id.as_slice()].concat();
(*i, fasthash(&part_data[..]))
})
.collect::<Vec<_>>();
parts.sort_by_key(|(_i, h)| *h);
let parts_i = parts.iter().map(|(i, _h)| *i).collect::<Vec<_>>();
(node_id, node_info, parts_i, 0)
})
.collect::<Vec<_>>();
let max_capacity = config
.members
.iter()
.map(|(_, node_info)| node_info.capacity)
.fold(0, std::cmp::max);
// Fill up ring
for rep in 0..MAX_REPLICATION {
queues.sort_by_key(|(ni, _np, _q, _p)| {
let queue_data = [&u16::to_be_bytes(rep as u16)[..], ni.as_slice()].concat();
fasthash(&queue_data[..])
});
for (_, _, _, pos) in queues.iter_mut() {
*pos = 0;
}
let mut remaining = partitions_idx.len();
while remaining > 0 {
let remaining0 = remaining;
for i_round in 0..max_capacity {
for (node_id, node_info, q, pos) in queues.iter_mut() {
if i_round >= node_info.capacity {
continue;
}
for pos2 in *pos..q.len() {
let qv = q[pos2];
if partitions[qv].len() != rep {
continue;
}
let p_dcs = partitions[qv]
.iter()
.map(|(_id, info)| info.datacenter.as_str())
.collect::<HashSet<&str>>();
if (p_dcs.len() < n_datacenters
&& !p_dcs.contains(&node_info.datacenter.as_str()))
|| (p_dcs.len() == n_datacenters
&& !partitions[qv].iter().any(|(id, _i)| id == node_id))
{
partitions[qv].push((node_id, node_info));
remaining -= 1;
*pos = pos2 + 1;
break;
}
}
}
}
if remaining == remaining0 {
// No progress made, exit
warn!("Could not build ring, not enough nodes configured.");
return Self {
config,
ring: vec![],
};
}
}
}
let ring = partitions
.iter()
.enumerate()
.map(|(i, nodes)| {
let top = (i as u16) << (16 - PARTITION_BITS);
let mut hash = [0u8; 32];
hash[0..2].copy_from_slice(&u16::to_be_bytes(top)[..]);
let nodes = nodes.iter().map(|(id, _info)| **id).collect::<Vec<UUID>>();
RingEntry {
location: hash.into(),
nodes: nodes.try_into().unwrap(),
}
})
.collect::<Vec<_>>();
// eprintln!("RING: --");
// for e in ring.iter() {
// eprintln!("{:?}", e);
// }
// eprintln!("END --");
Self { config, ring }
}
pub fn partition_of(&self, from: &Hash) -> Partition {
let top = u16::from_be_bytes(from.as_slice()[0..2].try_into().unwrap());
top >> (16 - PARTITION_BITS)
}
pub fn partitions(&self) -> Vec<(Partition, Hash)> {
let mut ret = vec![];
for (i, entry) in self.ring.iter().enumerate() {
ret.push((i as u16, entry.location));
}
if ret.len() > 0 {
assert_eq!(ret[0].1, [0u8; 32].into());
}
ret
}
pub fn walk_ring(&self, from: &Hash, n: usize) -> Vec<UUID> {
if self.ring.len() != 1 << PARTITION_BITS {
warn!("Ring not yet ready, read/writes will be lost!");
return vec![];
}
let top = u16::from_be_bytes(from.as_slice()[0..2].try_into().unwrap());
let partition_idx = (top >> (16 - PARTITION_BITS)) as usize;
assert_eq!(partition_idx, self.partition_of(from) as usize);
let partition = &self.ring[partition_idx];
let partition_top =
u16::from_be_bytes(partition.location.as_slice()[0..2].try_into().unwrap());
assert_eq!(partition_top & PARTITION_MASK_U16, top & PARTITION_MASK_U16);
assert!(n <= partition.nodes.len());
partition.nodes[..n].iter().cloned().collect::<Vec<_>>()
}
}

View file

@ -7,7 +7,6 @@ use std::sync::Arc;
use std::time::Duration;
use arc_swap::ArcSwapOption;
use bytes::IntoBuf;
use futures::future::Future;
use futures::stream::futures_unordered::FuturesUnordered;
use futures::stream::StreamExt;
@ -61,7 +60,7 @@ pub struct RpcClient<M: RpcMessage> {
local_handler: ArcSwapOption<(UUID, LocalHandlerFn<M>)>,
pub rpc_addr_client: RpcAddrClient<M>,
rpc_addr_client: RpcAddrClient<M>,
}
impl<M: RpcMessage + 'static> RpcClient<M> {
@ -129,7 +128,6 @@ impl<M: RpcMessage + 'static> RpcClient<M> {
{
Err(rpc_error) => {
node_status.num_failures.fetch_add(1, Ordering::SeqCst);
// TODO: Save failure info somewhere
Err(Error::from(rpc_error))
}
Ok(x) => x,
@ -197,11 +195,8 @@ impl<M: RpcMessage + 'static> RpcClient<M> {
if !strategy.rs_interrupt_after_quorum {
let wait_finished_fut = tokio::spawn(async move {
resp_stream.collect::<Vec<_>>().await;
Ok(())
});
self.background.spawn(wait_finished_fut.map(|x| {
x.unwrap_or_else(|e| Err(Error::Message(format!("Await failed: {}", e))))
}));
self.background.spawn(wait_finished_fut.map(|_| Ok(())));
}
Ok(results)
@ -215,8 +210,8 @@ impl<M: RpcMessage + 'static> RpcClient<M> {
pub struct RpcAddrClient<M: RpcMessage> {
phantom: PhantomData<M>,
pub http_client: Arc<RpcHttpClient>,
pub path: String,
http_client: Arc<RpcHttpClient>,
path: String,
}
impl<M: RpcMessage> RpcAddrClient<M> {
@ -310,7 +305,9 @@ impl RpcHttpClient {
ClientMethod::HTTPS(client) => client.request(req).fuse(),
};
trace!("({}) Acquiring request_limiter slot...", path);
let slot = self.request_limiter.acquire().await;
trace!("({}) Got slot, doing request to {}...", path, to_addr);
let resp = tokio::time::timeout(timeout, resp_fut)
.await
.map_err(|e| {
@ -330,10 +327,11 @@ impl RpcHttpClient {
})?;
let status = resp.status();
trace!("({}) Request returned, got status {}", path, status);
let body = hyper::body::to_bytes(resp.into_body()).await?;
drop(slot);
match rmp_serde::decode::from_read::<_, Result<M, String>>(body.into_buf())? {
match rmp_serde::decode::from_read::<_, Result<M, String>>(&body[..])? {
Err(e) => Ok(Err(Error::RemoteError(e, status))),
Ok(x) => Ok(Ok(x)),
}

View file

@ -4,7 +4,6 @@ use std::pin::Pin;
use std::sync::Arc;
use std::time::Instant;
use bytes::IntoBuf;
use futures::future::Future;
use futures_util::future::*;
use futures_util::stream::*;
@ -15,6 +14,7 @@ use serde::{Deserialize, Serialize};
use tokio::net::{TcpListener, TcpStream};
use tokio_rustls::server::TlsStream;
use tokio_rustls::TlsAcceptor;
use tokio_stream::wrappers::TcpListenerStream;
use garage_util::config::TlsConfig;
use garage_util::data::*;
@ -47,7 +47,17 @@ where
{
let begin_time = Instant::now();
let whole_body = hyper::body::to_bytes(req.into_body()).await?;
let msg = rmp_serde::decode::from_read::<_, M>(whole_body.into_buf())?;
let msg = rmp_serde::decode::from_read::<_, M>(&whole_body[..])?;
trace!(
"Request message: {}",
serde_json::to_string(&msg)
.unwrap_or("<json error>".into())
.chars()
.take(100)
.collect::<String>()
);
match handler(msg, sockaddr).await {
Ok(resp) => {
let resp_bytes = rmp_to_vec_all_named::<Result<M, String>>(&Ok(resp))?;
@ -112,7 +122,8 @@ impl RpcServer {
return Ok(bad_request);
}
let path = &req.uri().path()[1..];
let path = &req.uri().path()[1..].to_string();
let handler = match self.handlers.get(path) {
Some(h) => h,
None => {
@ -122,6 +133,8 @@ impl RpcServer {
}
};
trace!("({}) Handling request", path);
let resp_waiter = tokio::spawn(handler(req, addr));
match resp_waiter.await {
Err(err) => {
@ -131,11 +144,15 @@ impl RpcServer {
Ok(ise)
}
Ok(Err(err)) => {
trace!("({}) Request handler failed: {}", path, err);
let mut bad_request = Response::new(Body::from(format!("{}", err)));
*bad_request.status_mut() = StatusCode::BAD_REQUEST;
Ok(bad_request)
}
Ok(Ok(resp)) => Ok(resp),
Ok(Ok(resp)) => {
trace!("({}) Request handler succeeded", path);
Ok(resp)
}
}
}
@ -158,8 +175,8 @@ impl RpcServer {
config.set_single_cert([&node_certs[..], &ca_certs[..]].concat(), node_key)?;
let tls_acceptor = Arc::new(TlsAcceptor::from(Arc::new(config)));
let mut listener = TcpListener::bind(&self.bind_addr).await?;
let incoming = listener.incoming().filter_map(|socket| async {
let listener = TcpListener::bind(&self.bind_addr).await?;
let incoming = TcpListenerStream::new(listener).filter_map(|socket| async {
match socket {
Ok(stream) => match tls_acceptor.clone().accept(stream).await {
Ok(x) => Some(Ok::<_, hyper::Error>(x)),

View file

@ -1,9 +1,9 @@
[package]
name = "garage_table"
version = "0.1.1"
version = "0.2.1"
authors = ["Alex Auvolat <alex@adnab.me>"]
edition = "2018"
license = "GPL-3.0"
license = "AGPL-3.0"
description = "Table sharding and replication engine (DynamoDB-like) for the Garage object store"
repository = "https://git.deuxfleurs.fr/Deuxfleurs/garage"
@ -13,24 +13,21 @@ path = "lib.rs"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
garage_util = { version = "0.1", path = "../util" }
garage_rpc = { version = "0.1", path = "../rpc" }
garage_rpc = { version = "0.2.1", path = "../rpc" }
garage_util = { version = "0.2.1", path = "../util" }
bytes = "0.4"
rand = "0.7"
hex = "0.3"
arc-swap = "0.4"
log = "0.4"
bytes = "1.0"
hexdump = "0.1"
log = "0.4"
rand = "0.8"
sled = "0.31"
sled = "0.34"
rmp-serde = "0.14.3"
rmp-serde = "0.15"
serde = { version = "1.0", default-features = false, features = ["derive", "rc"] }
serde_bytes = "0.11"
async-trait = "0.1.30"
futures = "0.3"
futures-util = "0.3"
tokio = { version = "0.2", default-features = false, features = ["rt-core", "rt-threaded", "io-driver", "net", "tcp", "time", "macros", "sync", "signal", "fs"] }
tokio = { version = "1.0", default-features = false, features = ["rt", "rt-multi-thread", "io-util", "net", "time", "macros", "sync", "signal", "fs"] }

View file

@ -1,162 +0,0 @@
use serde::{Deserialize, Serialize};
use garage_util::data::*;
pub trait CRDT {
fn merge(&mut self, other: &Self);
}
impl<T> CRDT for T
where
T: Ord + Clone,
{
fn merge(&mut self, other: &Self) {
if other > self {
*self = other.clone();
}
}
}
// ---- LWW Register ----
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq)]
pub struct LWW<T> {
ts: u64,
v: T,
}
impl<T> LWW<T>
where
T: CRDT,
{
pub fn new(value: T) -> Self {
Self {
ts: now_msec(),
v: value,
}
}
pub fn migrate_from_raw(ts: u64, value: T) -> Self {
Self { ts, v: value }
}
pub fn update(&mut self, new_value: T) {
self.ts = std::cmp::max(self.ts + 1, now_msec());
self.v = new_value;
}
pub fn get(&self) -> &T {
&self.v
}
pub fn get_mut(&mut self) -> &mut T {
&mut self.v
}
}
impl<T> CRDT for LWW<T>
where
T: Clone + CRDT,
{
fn merge(&mut self, other: &Self) {
if other.ts > self.ts {
self.ts = other.ts;
self.v = other.v.clone();
} else if other.ts == self.ts {
self.v.merge(&other.v);
}
}
}
// ---- Boolean (true as absorbing state) ----
#[derive(Clone, Copy, Debug, Serialize, Deserialize, PartialEq)]
pub struct Bool(bool);
impl Bool {
pub fn new(b: bool) -> Self {
Self(b)
}
pub fn set(&mut self) {
self.0 = true;
}
pub fn get(&self) -> bool {
self.0
}
}
impl CRDT for Bool {
fn merge(&mut self, other: &Self) {
self.0 = self.0 || other.0;
}
}
// ---- LWW Map ----
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq)]
pub struct LWWMap<K, V> {
vals: Vec<(K, u64, V)>,
}
impl<K, V> LWWMap<K, V>
where
K: Ord,
V: CRDT,
{
pub fn new() -> Self {
Self { vals: vec![] }
}
pub fn migrate_from_raw_item(k: K, ts: u64, v: V) -> Self {
Self {
vals: vec![(k, ts, v)],
}
}
pub fn take_and_clear(&mut self) -> Self {
let vals = std::mem::replace(&mut self.vals, vec![]);
Self { vals }
}
pub fn clear(&mut self) {
self.vals.clear();
}
pub fn update_mutator(&self, k: K, new_v: V) -> Self {
let new_vals = match self.vals.binary_search_by(|(k2, _, _)| k2.cmp(&k)) {
Ok(i) => {
let (_, old_ts, _) = self.vals[i];
let new_ts = std::cmp::max(old_ts + 1, now_msec());
vec![(k, new_ts, new_v)]
}
Err(_) => vec![(k, now_msec(), new_v)],
};
Self { vals: new_vals }
}
pub fn get(&self, k: &K) -> Option<&V> {
match self.vals.binary_search_by(|(k2, _, _)| k2.cmp(&k)) {
Ok(i) => Some(&self.vals[i].2),
Err(_) => None,
}
}
pub fn items(&self) -> &[(K, u64, V)] {
&self.vals[..]
}
}
impl<K, V> CRDT for LWWMap<K, V>
where
K: Clone + Ord,
V: Clone + CRDT,
{
fn merge(&mut self, other: &Self) {
for (k, ts2, v2) in other.vals.iter() {
match self.vals.binary_search_by(|(k2, _, _)| k2.cmp(&k)) {
Ok(i) => {
let (_, ts1, _v1) = &self.vals[i];
if ts2 > ts1 {
self.vals[i].1 = *ts2;
self.vals[i].2 = v2.clone();
} else if ts1 == ts2 {
self.vals[i].2.merge(&v2);
}
}
Err(i) => {
self.vals.insert(i, (k.clone(), *ts2, v2.clone()));
}
}
}
}
}

34
src/table/crdt/bool.rs Normal file
View file

@ -0,0 +1,34 @@
use serde::{Deserialize, Serialize};
use crate::crdt::crdt::*;
/// Boolean, where `true` is an absorbing state
#[derive(Clone, Copy, Debug, Serialize, Deserialize, PartialEq)]
pub struct Bool(bool);
impl Bool {
/// Create a new boolean with the specified value
pub fn new(b: bool) -> Self {
Self(b)
}
/// Set the boolean to true
pub fn set(&mut self) {
self.0 = true;
}
/// Get the boolean value
pub fn get(&self) -> bool {
self.0
}
}
impl From<bool> for Bool {
fn from(b: bool) -> Bool {
Bool::new(b)
}
}
impl CRDT for Bool {
fn merge(&mut self, other: &Self) {
self.0 = self.0 || other.0;
}
}

73
src/table/crdt/crdt.rs Normal file
View file

@ -0,0 +1,73 @@
use garage_util::data::*;
/// Definition of a CRDT - all CRDT Rust types implement this.
///
/// A CRDT is defined as a merge operator that respects a certain set of axioms.
///
/// In particular, the merge operator must be commutative, associative,
/// idempotent, and monotonic.
/// In other words, if `a`, `b` and `c` are CRDTs, and `⊔` denotes the merge operator,
/// the following axioms must apply:
///
/// ```text
/// a ⊔ b = b ⊔ a (commutativity)
/// (a ⊔ b) ⊔ c = a ⊔ (b ⊔ c) (associativity)
/// (a ⊔ b) ⊔ b = a ⊔ b (idempotence)
/// ```
///
/// Moreover, the relationship `≥` defined by `a ≥ b ⇔ ∃c. a = b ⊔ c` must be a partial order.
/// This implies a few properties such as: if `a ⊔ b ≠ a`, then there is no `c` such that `(a ⊔ b) ⊔ c = a`,
/// as this would imply a cycle in the partial order.
pub trait CRDT {
/// Merge the two datastructures according to the CRDT rules.
/// `self` is modified to contain the merged CRDT value. `other` is not modified.
///
/// # Arguments
///
/// * `other` - the other CRDT we wish to merge with
fn merge(&mut self, other: &Self);
}
/// All types that implement `Ord` (a total order) can also implement a trivial CRDT
/// defined by the merge rule: `a ⊔ b = max(a, b)`. Implement this trait for your type
/// to enable this behavior.
pub trait AutoCRDT: Ord + Clone + std::fmt::Debug {
/// WARN_IF_DIFFERENT: emit a warning when values differ. Set this to true if
/// different values in your application should never happen. Set this to false
/// if you are actually relying on the semantics of `a ⊔ b = max(a, b)`.
const WARN_IF_DIFFERENT: bool;
}
impl<T> CRDT for T
where
T: AutoCRDT,
{
fn merge(&mut self, other: &Self) {
if Self::WARN_IF_DIFFERENT && self != other {
warn!(
"Different CRDT values should be the same (logic error!): {:?} vs {:?}",
self, other
);
if other > self {
*self = other.clone();
}
warn!("Making an arbitrary choice: {:?}", self);
} else {
if other > self {
*self = other.clone();
}
}
}
}
impl AutoCRDT for String {
const WARN_IF_DIFFERENT: bool = true;
}
impl AutoCRDT for bool {
const WARN_IF_DIFFERENT: bool = true;
}
impl AutoCRDT for FixedBytes32 {
const WARN_IF_DIFFERENT: bool = true;
}

114
src/table/crdt/lww.rs Normal file
View file

@ -0,0 +1,114 @@
use serde::{Deserialize, Serialize};
use garage_util::time::now_msec;
use crate::crdt::crdt::*;
/// Last Write Win (LWW)
///
/// An LWW CRDT associates a timestamp with a value, in order to implement a
/// time-based reconciliation rule: the most recent write wins.
/// For completeness, the LWW reconciliation rule must also be defined for two LWW CRDTs
/// with the same timestamp but different values.
///
/// In our case, we add the constraint that the value that is wrapped inside the LWW CRDT must
/// itself be a CRDT: in the case when the timestamp does not allow us to decide on which value to
/// keep, the merge rule of the inner CRDT is applied on the wrapped values. (Note that all types
/// that implement the `Ord` trait get a default CRDT implemetnation that keeps the maximum value.
/// This enables us to use LWW directly with primitive data types such as numbers or strings. It is
/// generally desirable in this case to never explicitly produce LWW values with the same timestamp
/// but different inner values, as the rule to keep the maximum value isn't generally the desired
/// semantics.)
///
/// As multiple computers clocks are always desynchronized,
/// when operations are close enough, it is equivalent to
/// take one copy and drop the other one.
///
/// Given that clocks are not too desynchronized, this assumption
/// is enough for most cases, as there is few chance that two humans
/// coordonate themself faster than the time difference between two NTP servers.
///
/// As a more concret example, let's suppose you want to upload a file
/// with the same key (path) in the same bucket at the very same time.
/// For each request, the file will be timestamped by the receiving server
/// and may differ from what you observed with your atomic clock!
///
/// This scheme is used by AWS S3 or Soundcloud and often without knowing
/// in entreprise when reconciliating databases with ad-hoc scripts.
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq)]
pub struct LWW<T> {
ts: u64,
v: T,
}
impl<T> LWW<T>
where
T: CRDT,
{
/// Creates a new CRDT
///
/// CRDT's internal timestamp is set with current node's clock.
pub fn new(value: T) -> Self {
Self {
ts: now_msec(),
v: value,
}
}
/// Build a new CRDT from a previous non-compatible one
///
/// Compared to new, the CRDT's timestamp is not set to now
/// but must be set to the previous, non-compatible, CRDT's timestamp.
pub fn migrate_from_raw(ts: u64, value: T) -> Self {
Self { ts, v: value }
}
/// Update the LWW CRDT while keeping some causal ordering.
///
/// The timestamp of the LWW CRDT is updated to be the current node's clock
/// at time of update, or the previous timestamp + 1 if that's bigger,
/// so that the new timestamp is always strictly larger than the previous one.
/// This ensures that merging the update with the old value will result in keeping
/// the updated value.
pub fn update(&mut self, new_value: T) {
self.ts = std::cmp::max(self.ts + 1, now_msec());
self.v = new_value;
}
/// Get the CRDT value
pub fn get(&self) -> &T {
&self.v
}
/// Get a mutable reference to the CRDT's value
///
/// This is usefull to mutate the inside value without changing the LWW timestamp.
/// When such mutation is done, the merge between two LWW values is done using the inner
/// CRDT's merge operation. This is usefull in the case where the inner CRDT is a large
/// data type, such as a map, and we only want to change a single item in the map.
/// To do this, we can produce a "CRDT delta", i.e. a LWW that contains only the modification.
/// This delta consists in a LWW with the same timestamp, and the map
/// inside only contains the updated value.
/// The advantage of such a delta is that it is much smaller than the whole map.
///
/// Avoid using this if the inner data type is a primitive type such as a number or a string,
/// as you will then rely on the merge function defined on `Ord` types by keeping the maximum
/// of both values.
pub fn get_mut(&mut self) -> &mut T {
&mut self.v
}
}
impl<T> CRDT for LWW<T>
where
T: Clone + CRDT,
{
fn merge(&mut self, other: &Self) {
if other.ts > self.ts {
self.ts = other.ts;
self.v = other.v.clone();
} else if other.ts == self.ts {
self.v.merge(&other.v);
}
}
}

Some files were not shown because too many files have changed in this diff Show more