Commit graph

29 commits

Author SHA1 Message Date
Mendes ceac3713d6 Modifications in several files to:
- have consistent error return types
- store the zone redundancy in an Lww
- print the error and message in the CLI (TODO: for the server API, should msg be returned in the response body?)
2022-10-05 15:29:48 +02:00
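For readers unfamiliar with the term, "Lww" refers to a last-writer-wins register: a value tagged with a timestamp, where merging two copies keeps the most recent write. The sketch below is a generic illustration of that idea, not Garage's actual type; the name `LwwRegister`, its fields, and the tie-breaking rule are assumptions made for the example.

```rust
/// Hypothetical last-writer-wins register (illustration only).
#[derive(Clone, Debug, PartialEq)]
struct LwwRegister<T> {
    ts: u64,
    value: T,
}

impl<T: Clone + Ord> LwwRegister<T> {
    fn new(ts: u64, value: T) -> Self {
        LwwRegister { ts, value }
    }

    /// Local write: the new timestamp is always strictly greater than the
    /// previous one, even if the clock value `now` went backwards.
    fn update(&mut self, now: u64, value: T) {
        self.ts = std::cmp::max(self.ts + 1, now);
        self.value = value;
    }

    /// Merge a remote copy: keep the write with the larger timestamp.
    /// Equal timestamps are broken by the larger value (an assumption of
    /// this sketch) so that merging is commutative.
    fn merge(&mut self, other: &Self) {
        if (other.ts, &other.value) > (self.ts, &self.value) {
            self.ts = other.ts;
            self.value = other.value.clone();
        }
    }
}
```

With a register like this, two nodes that change the zone redundancy concurrently converge to the same value once they have exchanged their copies, whichever order the merges happen in.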
Mendes 829f815a89 Merge remote-tracking branch 'origin/main' into optimal-layout
Some checks failed
continuous-integration/drone/pr Build is failing
continuous-integration/drone/push Build is failing
2022-10-04 18:14:49 +02:00
Mendes 99f96b9564 deleted zone_redundancy from System struct
Some checks are pending
continuous-integration/drone/push Build is pending
continuous-integration/drone/pr Build is pending
2022-10-04 18:09:24 +02:00
Alex ad917ffd3f
Fix instant subtractions that might have panicked
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2022-09-29 15:53:54 +02:00
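The commit title does not say which subtractions were involved. As general background: in Rust's standard library, `Instant - Duration` panics when the result cannot be represented, and `Instant - Instant` has panicked in some Rust versions when the right-hand side is the later instant. A minimal sketch of the non-panicking alternatives follows; the helper names are hypothetical and do not come from the commit.

```rust
use std::time::{Duration, Instant};

/// Time elapsed from `earlier` to `later`; returns zero instead of
/// panicking if the two instants are passed in the wrong order.
fn elapsed_between(later: Instant, earlier: Instant) -> Duration {
    later.saturating_duration_since(earlier)
}

/// Same measurement, but reports the out-of-order case as `None`.
fn checked_elapsed(later: Instant, earlier: Instant) -> Option<Duration> {
    later.checked_duration_since(earlier)
}

/// `t - delta` without the panic: `None` if the result is not representable.
fn instant_minus(t: Instant, delta: Duration) -> Option<Instant> {
    t.checked_sub(delta)
}
```

Whether to saturate or to return an `Option` depends on the call site: saturating suits timeouts and statistics, while `Option` suits code that should treat an out-of-order pair as a bug.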
Mendes 7f3249a237 New version of the algorithm that calculates the layout.
It takes as parameters the replication factor and the zone redundancy, computes the
largest partition size reachable with these constraints, and among the possible
assignments with this partition size, it computes the one that moves the fewest
partitions compared to the previous assignment.
This computation uses graph algorithms defined in graph_algo.rs
2022-09-21 14:39:59 +02:00
Alex 56592e1853
RPC performance changes
Some checks reported errors
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
continuous-integration/drone Build was killed
- configurable ping timeout
- single, much higher, configurable RPC timeout
- no more concurrency semaphore
2022-09-19 20:31:00 +02:00
Alex e46dc2a8ef
Allow for hostnames in bootstrap_peers and rpc_public_addr (fix #353)
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
2022-09-14 16:09:38 +02:00
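Accepting a hostname where only an IP address was accepted before comes down to resolving the address part of each entry. The sketch below shows one way to do that with the standard library; it assumes entries of the form `<node_id>@<host>:<port>`, and the function name `resolve_peer` is hypothetical rather than taken from the commit.

```rust
use std::net::{SocketAddr, ToSocketAddrs};

/// Splits "<node_id>@<host>:<port>" and resolves the host part.
/// `to_socket_addrs` accepts both literal IP addresses and hostnames;
/// a hostname may resolve to several addresses.
fn resolve_peer(peer: &str) -> Result<(String, Vec<SocketAddr>), String> {
    let (id, addr) = peer
        .split_once('@')
        .ok_or_else(|| format!("missing '@' in peer string: {peer}"))?;
    let addrs: Vec<SocketAddr> = addr
        .to_socket_addrs()
        .map_err(|e| format!("cannot resolve {addr}: {e}"))?
        .collect();
    if addrs.is_empty() {
        return Err(format!("no address found for {addr}"));
    }
    Ok((id.to_string(), addrs))
}
```

Note that `to_socket_addrs` blocks while the lookup runs, so an asynchronous server would normally perform the resolution off the main executor.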
Alex ab722cb40f
Add checks on replication_factor of layouts we use (fix #363, fix #364)
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2022-09-13 16:22:23 +02:00
Alex 7f54706b95
Merge branch 'lx-perf-improvements' into netapp-stream-body
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2022-09-08 15:50:56 +02:00
Alex db61f41030
Move GIT_VERSION injection later in build chain to reduce build times
Some checks failed
continuous-integration/drone/pr Build is failing
continuous-integration/drone/push Build is passing
2022-09-07 11:59:56 +02:00
Alex df094bd807
Less strict timeouts 2022-09-01 16:30:44 +02:00
Alex 1921f4f7e6
Merge branch 'lx-perf-improvements' into netapp-stream-body
Some checks failed
continuous-integration/drone/push Build is failing
continuous-integration/drone/pr Build is failing
2022-08-29 16:45:05 +02:00
Quentin 2c7bae935a
Configure structopt to report the right version
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/tag Build is passing
continuous-integration/drone Build is passing
continuous-integration/drone/push Build is passing
By default, structopt reports the value provided by
the env var CARGO_PKG_VERSION, fed by Cargo when reading
Cargo.toml. However for Garage we use a versioning based on git,
so we often report a version that is behind the real version.
In this commit, we create garage_util::version::garage() that
reports the right version and configure all structopt subcommands
to call this function instead of using the env var.
2022-08-11 10:21:45 +02:00
Alex 8e7e680afe
First adaptation to WIP netapp with streaming body 2022-07-29 12:25:02 +02:00
Alex 4f38cadf6e Background task manager (#332)
All checks were successful
continuous-integration/drone/push Build is passing
- [x] New background worker trait
- [x] Adapt all current workers to use new API
- [x] Command to list currently running workers, and whether they are active, idle, or dead
- [x] Error reporting
- Optimizations
  - [x] Merkle updater: several items per iteration
  - [ ] Use `tokio::task::spawn_blocking` where appropriate so that CPU-intensive tasks don't block other things going on
- scrub:
  - [x] have only one worker with a channel to start/pause/cancel
  - [x] automatic scrub
  - [x] ability to view and change tranquility from CLI
  - [x] persistence of a few info
- [ ] Testing

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: #332
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-07-08 13:30:26 +02:00
Alex 382e74c798 First version of admin API (#298)
All checks were successful
continuous-integration/drone/push Build is passing
**Spec:**

- [x] Start writing
- [x] Specify all layout endpoints
- [x] Specify all endpoints for operations on keys
- [x] Specify all endpoints for operations on key/bucket permissions
- [x] Specify all endpoints for operations on buckets
- [x] Specify all endpoints for operations on bucket aliases

View rendered spec at <https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/admin-api/doc/drafts/admin-api.md>

**Code:**

- [x] Refactor code for admin api to use common api code that was created for K2V

**General endpoints:**

- [x] Metrics
- [x] GetClusterStatus
- [x] ConnectClusterNodes
- [x] GetClusterLayout
- [x] UpdateClusterLayout
- [x] ApplyClusterLayout
- [x] RevertClusterLayout

**Key-related endpoints:**

- [x] ListKeys
- [x] CreateKey
- [x] ImportKey
- [x] GetKeyInfo
- [x] UpdateKey
- [x] DeleteKey

**Bucket-related endpoints:**

- [x] ListBuckets
- [x] CreateBucket
- [x] GetBucketInfo
- [x] DeleteBucket
- [x] PutBucketWebsite
- [x] DeleteBucketWebsite

**Operations on key/bucket permissions:**

- [x] BucketAllowKey
- [x] BucketDenyKey

**Operations on bucket aliases:**

- [x] GlobalAliasBucket
- [x] GlobalUnaliasBucket
- [x] LocalAliasBucket
- [x] LocalUnaliasBucket

**And also:**

- [x] Separate error type for the admin API (this PR includes a quite big refactoring of error handling)
- [x] Add management of website access
- [ ] Check that nothing is missing wrt what can be done using the CLI
- [ ] Improve formatting of the spec
- [x] Make sure everyone is cool with the API design

Fix #231
Fix #295

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: #298
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-05-24 12:16:39 +02:00
Alex 9d0ed78887 Add feature flag for Kubernetes discovery 2022-03-24 16:57:43 +01:00
Alex 203e8d2c34
Bump version to 0.7 because of incompatible Netapp 2022-03-14 10:54:24 +01:00
Alex bb04d94fa9
Update to Netapp 0.4 which supports distributed tracing 2022-03-14 10:52:30 +01:00
Max Audron 9d44127245
add support for kubernetes service discovery
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
This commit adds support to discover garage instances running in
kubernetes.

Once enabled by setting `kubernetes_namespace` and
`kubernetes_service_name`, garage will create a Custom Resource
`garagenodes.deuxfleurs.fr` with the node's public key as the resource name
and IP and port information as spec, in the namespace configured by
`kubernetes_namespace`.

For discovering nodes, the resources are filtered with the optionally set
`kubernetes_service_name`, which sets a label
`garage.deuxfleurs.fr/service` on the resources.

This allows separating multiple garage deployments in a single
namespace.

The `kubernetes_skip_crd` variable allows disabling the creation of the
CRD by garage itself. The user must then deploy it manually.
2022-03-12 13:05:52 +01:00
Alex beeef4758e
Some movement of helper code and refactoring of error handling 2022-01-04 12:52:46 +01:00
Alex c94406f428
Improve how node roles are assigned in Garage
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/tag Build is passing
continuous-integration/drone/push Build is passing
continuous-integration/drone Build is passing
- change the terminology: the network configuration becomes the role
  table, the configuration of a node becomes a node's role
- the modification of the role table takes place in two steps: first,
  changes are staged in a CRDT data structure. Then, once the user is
  happy with the changes, they can commit them all at once (or revert
  them).
- update documentation
- fix tests
- implement smarter partition assignment algorithm

This patch breaks the format of the network configuration: when
migrating, the cluster will be in a state where no roles are assigned.
All roles must be re-assigned and committed at once. This migration
should not pose an issue.
2021-11-16 16:05:53 +01:00
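The two-step workflow described in this commit (stage changes, then commit or revert them all at once) can be sketched as follows. This is a simplified illustration only: it uses a plain map for the staging area rather than a CRDT, and every name in it (`Role`, `RoleTable`, `stage`, `apply`, `revert`) is hypothetical.

```rust
use std::collections::BTreeMap;

/// Hypothetical role of a node (fields chosen for the example).
#[derive(Clone, Debug, PartialEq)]
struct Role {
    zone: String,
    capacity: u32,
}

#[derive(Default)]
struct RoleTable {
    /// Incremented each time staged changes are committed.
    version: u64,
    /// Roles currently in effect, keyed by node identifier.
    roles: BTreeMap<String, Role>,
    /// Pending changes: `Some(role)` assigns a role, `None` removes it.
    staging: BTreeMap<String, Option<Role>>,
}

impl RoleTable {
    /// Step 1: record a change without touching the active roles.
    fn stage(&mut self, node: &str, role: Option<Role>) {
        self.staging.insert(node.to_string(), role);
    }

    /// Step 2a: commit every staged change at once and bump the version.
    fn apply(&mut self) {
        for (node, change) in std::mem::take(&mut self.staging) {
            match change {
                Some(role) => {
                    self.roles.insert(node, role);
                }
                None => {
                    self.roles.remove(&node);
                }
            }
        }
        self.version += 1;
    }

    /// Step 2b: discard every staged change, leaving the roles as they were.
    fn revert(&mut self) {
        self.staging.clear();
    }
}
```

Keeping staged changes separate from the active table means a half-finished reconfiguration never affects the running cluster until it is explicitly committed.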
Alex e8811f7c9d
Request strategy: don't launch all 3 requests if not needed
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
continuous-integration/drone/tag Build is passing
continuous-integration/drone Build is passing
2021-11-04 16:19:27 +01:00
Alex 6f13d083ab
Add semaphore to limit RAM used by buffered outgoing requests
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
continuous-integration/drone Build is passing
2021-11-03 18:02:57 +01:00
Alex 8c4f418fe8
Fix peer list persistence: do not forget previous peers
Some checks reported errors
continuous-integration/drone/pr Build was killed
continuous-integration/drone Build is passing
continuous-integration/drone/push Build is passing
2021-11-03 17:34:44 +01:00
Alex ada7899b24
Fix clippy lints (fix #121)
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
2021-10-26 10:20:05 +02:00
Alex de4276202a
Improve CLI, adapt tests, update documentation 2021-10-25 14:21:48 +02:00
Alex 1b450c4b49
Improvements to CLI and various fixes for netapp version
Discovery via consul, persist peer list to file
2021-10-22 16:55:24 +02:00
Alex 4067797d01
First port of Garage to Netapp 2021-10-22 15:55:18 +02:00