Commit graph

56 commits

Author SHA1 Message Date
networkException c99cb58d71
util: move reading secret file into seperate helper
this patch moves the logic to read a secret file (and check for correct
permissions) from `secret_from_file` into a new `read_secret_file`
helper.
2023-10-19 03:29:48 +02:00
networkException 7353038a64
config: allow using paths for unix domain sockets in various places
this patch updates the config format to also allow paths in bind
addresses for unix domain sockets.

this has been added to all apis except rpc.
2023-09-29 18:38:30 +02:00
Alex f8b3883611 config: make block_size and sled_cache_capacity expressable as strings
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
2023-09-11 18:34:59 +02:00
Alex 51b9731a08 make lmdb's map_size configurable (fix #628)
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
2023-09-11 18:03:44 +02:00
Roberto Hidalgo ef8a7add08 set default for [consul-services] api 2023-05-22 08:57:15 -06:00
Roberto Hidalgo b770504126 simplify code according to feedback 2023-05-22 08:57:15 -06:00
Roberto Hidalgo 6b69404f1a rename mode to consul_http_api 2023-05-22 08:57:15 -06:00
Roberto Hidalgo fd7dbea5b8 follow feedback, fold into existing feature 2023-05-22 08:57:15 -06:00
Roberto Hidalgo bd6485565e allow additional ServiceMeta, docs 2023-05-22 08:57:15 -06:00
Roberto Hidalgo 02ba9016ab register consul services against local agent instead of catalog api 2023-05-22 08:57:15 -06:00
Jonathan Davies c783194e8b *: apply clippy recommendations.
All checks were successful
continuous-integration/drone/pr Build is passing
2023-05-09 20:49:34 +01:00
Alex 80e2326998 fixes for pr 499
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2023-02-06 12:23:55 +01:00
Alex 656b8d42de secrets can be passed directly in config, as file, or as env
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
2023-02-03 15:27:39 +01:00
Felix Scheinost d6ea0cbefa Add tests for rpc_secret_file
All checks were successful
continuous-integration/drone/pr Build is passing
2023-01-07 14:19:36 +01:00
Felix Scheinost 7b62fe3f0b Error on both rpc_secret and rpc_secret_file 2023-01-07 13:49:03 +01:00
Felix Scheinost f2106c2733 Implement rpc_secret_file
All checks were successful
continuous-integration/drone/pr Build is passing
2023-01-04 18:35:10 +01:00
Alex 8bc5caf7aa
Fix issue with 'http(s)://' prefix
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
continuous-integration/drone Build is passing
2022-10-18 21:17:11 +02:00
Alex 002b9fc50c
Add TLS support for Consul discovery + refactoring
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
2022-10-18 18:38:20 +02:00
Alex 56592e1853
RPC performance changes
Some checks reported errors
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
continuous-integration/drone Build was killed
- configurable ping timeout
- single, much higher, configurable RPC timeout
- no more concurrency semaphore
2022-09-19 20:31:00 +02:00
Alex e46dc2a8ef
Allow for hostnames in bootstrap_peers and rpc_public_addr (fix #353)
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
2022-09-14 16:09:38 +02:00
Alex 8adc654713
Merge branch 'main' into improve-deps 2022-09-07 18:13:27 +02:00
Alex 2559f63e9b
Make all HTTP services optionnal
Some checks are pending
continuous-integration/drone/push Build is pending
continuous-integration/drone/pr Build is pending
2022-09-07 17:54:16 +02:00
Alex 943d76c583
Ability to dynamically set resync tranquility
All checks were successful
continuous-integration/drone/push Build is passing
2022-09-02 15:34:21 +02:00
Alex b44d3fc796 Abstract database behind generic interface and implement alternative drivers (#322)
All checks were successful
continuous-integration/drone/push Build is passing
- [x] Design interface
- [x] Implement Sled backend
  - [x] Re-implement the SledCountedTree hack ~~on Sled backend~~ on all backends (i.e. over the abstraction)
- [x] Convert Garage code to use generic interface
- [x] Proof-read converted Garage code
- [ ] Test everything well
- [x] Implement sqlite backend
- [x] Implement LMDB backend
- [ ] (Implement Persy backend?)
- [ ] (Implement other backends? (like RocksDB, ...))
- [x] Implement backend choice in config file and garage server module
- [x] Add CLI for converting between DB formats
- Exploit the new interface to put more things in transactions
  - [x] `.updated()` trigger on Garage tables

Fix #284

**Bugs**

- [x] When exporting sqlite, trees iterate empty??
- [x] LMDB doesn't work

**Known issues for various back-ends**

- Sled:
  - Eats all my RAM and also all my disk space
  - `.len()` has to traverse the whole table
  - Is actually quite slow on some operations
  - And is actually pretty bad code...
- Sqlite:
  - Requires a lock to be taken on all operations. The lock is also taken when iterating on a table with `.iter()`, and the lock isn't released until the iterator is dropped. This means that we must be VERY carefull to not do anything else inside a `.iter()` loop or else we will have a deadlock! Most such cases have been eliminated from the Garage codebase, but there might still be some that remain. If your Garage-over-Sqlite seems to hang/freeze, this is the reason.
  - (adapter uses a bunch of unsafe code)
- Heed (LMDB):
  - Not suited for 32-bit machines as it has to map the whole DB in memory.
  - (adpater uses a tiny bit of unsafe code)

**My recommendation:** avoid 32-bit machines and use LMDB as much as possible.

**Converting databases** is actually quite easy. For example from Sled to LMDB:

```bash
cd src/db
cargo run --features cli --bin convert -- -i path/to/garage/meta/db -a sled -o path/to/garage/meta/db.lmdb -b lmdb
```

Then, just add this to your `config.toml`:

```toml
db_engine = "lmdb"
```

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: #322
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-06-08 10:01:44 +02:00
Alex 382e74c798 First version of admin API (#298)
All checks were successful
continuous-integration/drone/push Build is passing
**Spec:**

- [x] Start writing
- [x] Specify all layout endpoints
- [x] Specify all endpoints for operations on keys
- [x] Specify all endpoints for operations on key/bucket permissions
- [x] Specify all endpoints for operations on buckets
- [x] Specify all endpoints for operations on bucket aliases

View rendered spec at <https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/admin-api/doc/drafts/admin-api.md>

**Code:**

- [x] Refactor code for admin api to use common api code that was created for K2V

**General endpoints:**

- [x] Metrics
- [x] GetClusterStatus
- [x] ConnectClusterNodes
- [x] GetClusterLayout
- [x] UpdateClusterLayout
- [x] ApplyClusterLayout
- [x] RevertClusterLayout

**Key-related endpoints:**

- [x] ListKeys
- [x] CreateKey
- [x] ImportKey
- [x] GetKeyInfo
- [x] UpdateKey
- [x] DeleteKey

**Bucket-related endpoints:**

- [x] ListBuckets
- [x] CreateBucket
- [x] GetBucketInfo
- [x] DeleteBucket
- [x] PutBucketWebsite
- [x] DeleteBucketWebsite

**Operations on key/bucket permissions:**

- [x] BucketAllowKey
- [x] BucketDenyKey

**Operations on bucket aliases:**

- [x] GlobalAliasBucket
- [x] GlobalUnaliasBucket
- [x] LocalAliasBucket
- [x] LocalUnaliasBucket

**And also:**

- [x] Separate error type for the admin API (this PR includes a quite big refactoring of error handling)
- [x] Add management of website access
- [ ] Check that nothing is missing wrt what can be done using the CLI
- [ ] Improve formatting of the spec
- [x] Make sure everyone is cool with the API design

Fix #231
Fix #295

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: #298
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-05-24 12:16:39 +02:00
Alex 5768bf3622 First implementation of K2V (#293)
All checks were successful
continuous-integration/drone/push Build is passing
**Specification:**

View spec at [this URL](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/k2v/doc/drafts/k2v-spec.md)

- [x] Specify the structure of K2V triples
- [x] Specify the DVVS format used for causality detection
- [x] Specify the K2V index (just a counter of number of values per partition key)
- [x] Specify single-item endpoints: ReadItem, InsertItem, DeleteItem
- [x] Specify index endpoint: ReadIndex
- [x] Specify multi-item endpoints: InsertBatch, ReadBatch, DeleteBatch
- [x] Move to JSON objects instead of tuples
- [x] Specify endpoints for polling for updates on single values (PollItem)

**Implementation:**

- [x] Table for K2V items, causal contexts
- [x] Indexing mechanism and table for K2V index
- [x] Make API handlers a bit more generic
- [x] K2V API endpoint
- [x] K2V API router
- [x] ReadItem
- [x] InsertItem
- [x] DeleteItem
- [x] PollItem
- [x] ReadIndex
- [x] InsertBatch
- [x] ReadBatch
- [x] DeleteBatch

**Testing:**

- [x] Just a simple Python script that does some requests to check visually that things are going right (does not contain parsing of results or assertions on returned values)
- [x] Actual tests:
  - [x] Adapt testing framework
  - [x] Simple test with InsertItem + ReadItem
  - [x] Test with several Insert/Read/DeleteItem + ReadIndex
  - [x] Test all combinations of return formats for ReadItem
  - [x] Test with ReadBatch, InsertBatch, DeleteBatch
  - [x] Test with PollItem
  - [x] Test error codes
- [ ] Fix most broken stuff
  - [x] test PollItem broken randomly
  - [x] when invalid causality tokens are given, errors should be 4xx not 5xx

**Improvements:**

- [x] Descending range queries
  - [x] Specify
  - [x] Implement
  - [x] Add test
- [x] Batch updates to index counter
- [x] Put K2V behind `k2v` feature flag

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: #293
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-05-10 13:16:57 +02:00
Alex e480aaf338
Make background tranquility a configurable parameter 2022-03-23 10:25:19 +01:00
Alex 9b2b531f4d
Make admin server optional
Some checks reported errors
continuous-integration/drone/push Build was killed
continuous-integration/drone/pr Build was killed
2022-03-14 10:54:25 +01:00
Alex dc8d0496cc
Refactoring: rename config files, make modifications less invasive 2022-03-14 10:53:51 +01:00
Alex 8c2fb0c066
Add tracing integration with opentelemetry 2022-03-14 10:52:13 +01:00
mricher e349af13a7
Update dependencies and add admin module with metrics
- Global dependencies updated in Cargo.lock
- New module created in src/admin to host:
  - the (future) admin REST API
  - the metric collection
- add configuration block

No metrics implemented yet
2022-03-14 10:51:12 +01:00
Max Audron 9d44127245
add support for kubernetes service discovery
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
This commit adds support to discover garage instances running in
kubernetes.

Once enabled by setting `kubernetes_namespace` and
`kubernetes_service_name` garage will create a Custom Resources
`garagenodes.deuxfleurs.fr` with nodes public key as the resource name.
and IP and Port information as spec in the namespace configured by
`kubernetes_namespace`.

For discovering nodes the resources are filtered with the optionally set
`kubernetes_service_name` which sets a label
`garage.deuxfleurs.fr/service` on the resources.

This allows to separate multiple garage deployments in a single
namespace.

the `kubernetes_skip_crd` variable allows to disable the creation of the
CRD by garage itself. The user must deploy this manually.
2022-03-12 13:05:52 +01:00
Alex d4dd2e2640
Make use of website config, return error document on error 2022-01-13 14:25:19 +01:00
trinity-1686a 1eb972b1ac Add compression using zstd (#173)
All checks were successful
continuous-integration/drone/push Build is passing
fix #27

Co-authored-by: Trinity Pointard <trinity.pointard@gmail.com>
Reviewed-on: #173
Co-authored-by: trinity-1686a <trinity.pointard@gmail.com>
Co-committed-by: trinity-1686a <trinity.pointard@gmail.com>
2021-12-15 11:26:43 +01:00
Trinity Pointard 9c58ec28d3 add support for vhost-style s3 bucket 2021-11-16 15:41:41 +01:00
Trinity Pointard 9d7535c3f5 allow missing bootstrap_peers in garage.toml
All checks were successful
continuous-integration/drone/pr Build is passing
continuous-integration/drone/push Build is passing
2021-11-05 16:36:25 +01:00
Alex de4276202a
Improve CLI, adapt tests, update documentation 2021-10-25 14:21:48 +02:00
Alex 1b450c4b49
Improvements to CLI and various fixes for netapp version
Discovery via consul, persist peer list to file
2021-10-22 16:55:24 +02:00
Alex 4067797d01
First port of Garage to Netapp 2021-10-22 15:55:18 +02:00
Alex b490ebc7f6
Many improvements on ring/replication and its configuration:
- Explicit "replication_mode" configuration parameters that takes
  either "none", "2" or "3" as values, instead of letting user configure
  replication factor themselves. These are presets whose corresponding
  replication/quorum values can be found in replication/mode.rs

- Explicit support for single-node and two-node deployments
  (number of nodes must be at least "replication_mode", with "none"
  we can have only one node)

- Ring is now stored much more compactly with 256*8 + n*32 bytes,
  instead of 256*32 bytes

- Support for gateway-only nodes that do not store data
  (these nodes still need a metadata_directory to store the list
  of bucket and keys since those are stored on all nodes; it also
  technically needs a data_directory to start but it will stay
  empty unless we have bugs)
2021-05-28 14:07:36 +02:00
Alex 575726358c
Tune Sled configuration
- Make sled cache size and flush interval configurable
- Set less agressive default values:
  - cache size 128MB instead of 1GB
  - Flush interval 2 seconds instead of .5 seconds
2021-05-03 17:27:43 +02:00
Trinity Pointard f871689571
run cargo fmt on util and make missing doc warning 2021-04-27 16:37:10 +02:00
Trinity Pointard f9bd2d8fb7
document util crate 2021-04-27 16:37:10 +02:00
Trinity Pointard f17cb6c969 resolve domain to multiple addresses
All checks were successful
continuous-integration/drone/pr Build is passing
And warn instead of failling when a domain can't be resolved
2021-03-18 21:04:30 +01:00
Trinity Pointard c8a7ce5cdf remove domain resolution for *_bind_addr
All checks were successful
continuous-integration/drone/pr Build is passing
2021-03-18 19:47:51 +01:00
Trinity Pointard 81e9db783f simplify addresse deserialialiser and limit allocations 2021-03-18 19:47:51 +01:00
Trinity Pointard ae3b7029a9 add support for using domain name in configuration 2021-03-18 19:47:51 +01:00
Alex 3882d5ba36 Remove epidemic propagation for fully replicated stuff: write directly to all nodes 2021-03-05 15:09:18 +01:00
Quentin 2765291796 Build path correctly 2020-11-11 19:48:01 +01:00
Quentin 4093833ae8 Extract bucket 2020-11-10 09:57:07 +01:00