Commit Graph

66 Commits

Author SHA1 Message Date
Alex dc0b78cdb8
[block-ref-repair] Block refcount recalculation and repair
- We always recalculate the reference count of a block before deleting
  it locally, to make sure that it is indeed zero.

- If we had to fetch a remote block but we were not able to get it,
  check that refcount is indeed > 0.

- Repair procedure that checks everything
2024-03-19 16:20:22 +01:00
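The first bullet of the `[block-ref-repair]` commit above (recount references before deleting a block locally) can be sketched as follows. This is a minimal illustration under stated assumptions, not Garage's implementation: `BlockStore`, `recalculate_rc` and `delete_if_unreferenced` are hypothetical names, and the in-memory vectors stand in for the real tables.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, shortened block hash type (illustration only).
type Hash = [u8; 4];

/// Toy store: a cached refcount table plus the source of truth
/// (one entry per block reference, with a "deleted" flag).
#[derive(Default)]
struct BlockStore {
    cached_rc: HashMap<Hash, u64>,
    block_refs: Vec<(Hash, bool)>, // (block hash, reference is deleted?)
    local_blocks: HashSet<Hash>,
}

impl BlockStore {
    /// Recount live references from the source of truth and fix the cache.
    fn recalculate_rc(&mut self, hash: &Hash) -> u64 {
        let rc = self
            .block_refs
            .iter()
            .filter(|(h, deleted)| h == hash && !*deleted)
            .count() as u64;
        self.cached_rc.insert(*hash, rc);
        rc
    }

    /// Delete the local copy only if the recalculated count is zero.
    /// Returns true when a local copy was actually removed.
    fn delete_if_unreferenced(&mut self, hash: &Hash) -> bool {
        if self.recalculate_rc(hash) == 0 {
            self.local_blocks.remove(hash)
        } else {
            false
        }
    }
}
```

The point of the sketch is that the cached counter is never trusted for a destructive operation: a stale zero in `cached_rc` is corrected by the recount before anything is removed.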
Alex 0038ca8a78
Merge branch 'main' into next-0.10
2024-03-18 20:19:30 +01:00
Alex 1e42808a59
[db-snapshot] implement meta_auto_snapshot_interval 2024-03-15 13:51:31 +01:00
Alex 7c86ff6c37
[disable-scrub] implement a `disable_scrub` configuration option 2024-03-14 17:01:16 +01:00
Alex 44454aac01
[rm-sled] Remove the Sled database engine
2024-03-08 14:11:02 +01:00
Alex 1ace34adbb
Merge branch 'main' into next-0.10
2024-03-08 13:57:10 +01:00
Alex ec34728b27
[factor-db-open] Combine logic for opening db engines
2024-03-08 12:58:17 +01:00
Yureka c1769bbe69 ReplicationMode -> ConsistencyMode+ReplicationFactor
2024-03-07 12:45:33 +01:00
Yureka 6760895926 refactor: remove max_write_errors and max_faults
2024-03-04 18:39:56 +01:00
Alex cff702a951
[lock-createbucket] Add node-global lock for bucket/key operations (fix #723)
2024-02-22 12:28:21 +01:00
Alex 5ea24254a9
[import-netapp] import Netapp code into Garage codebase 2024-02-15 12:15:07 +01:00
Alex 51abbb02d8 Merge branch 'main' into next
2023-09-11 20:00:02 +02:00
Alex f8b3883611 config: make block_size and sled_cache_capacity expressable as strings
2023-09-11 18:34:59 +02:00
Alex 51b9731a08 make lmdb's map_size configurable (fix #628)
2023-09-11 18:03:44 +02:00
Alex 93114a9747 block manager: refactoring 2023-09-06 16:35:28 +02:00
Alex 71c0188055 block manager: skeleton for multi-hdd support 2023-09-06 16:35:28 +02:00
Alex 0f1849e1ac lifecycle worker: launch with the rest of Garage
2023-08-30 14:51:08 +02:00
Alex 511e07ecd4 fix mpu counter (add missing workers) and report info at appropriate places 2023-06-09 16:23:37 +02:00
Alex 38d6ac4295 New multipart upload table layout 2023-06-09 16:23:37 +02:00
Alex e7e164a280 Make fsync an option for meta and data
2023-06-09 16:23:21 +02:00
Alex 19639705e6 Mark sled as deprecated, make lmdb default, and improve sqlite and lmdb defaults
2023-05-17 14:30:53 +02:00
Alex b8123fb6cd Clearer error message when LMDB has oom error (fix #517)
2023-03-06 11:38:49 +01:00
Alex 656b8d42de secrets can be passed directly in config, as file, or as env
2023-02-03 15:27:39 +01:00
Alex 8e93d69974 More clippy fixes
2023-01-26 17:26:32 +01:00
Alex dac254a6e7
Merge branch 'main' into k2v-watch-range-2
2023-01-11 17:09:37 +01:00
Alex 9f5419f465
Make K2V item timestamps globally increasing on each node
2023-01-10 11:03:52 +01:00
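The commit above (9f5419f465) makes K2V item timestamps increase on each node. One common way to get that property is to remember the last issued value and never return anything smaller or equal. The sketch below shows that technique only; `MonotonicClock` and its methods are hypothetical names, and whether Garage persists the last value or uses a different scheme is not stated in the commit message.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical per-node timestamp source (not Garage's actual type).
struct MonotonicClock {
    last: AtomicU64,
}

impl MonotonicClock {
    fn new() -> Self {
        Self { last: AtomicU64::new(0) }
    }

    /// Wall-clock time in milliseconds since the Unix epoch.
    fn now_msec() -> u64 {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_millis() as u64)
            .unwrap_or(0)
    }

    /// Given a wall-clock reading `now`, return a timestamp strictly
    /// greater than every timestamp returned before.
    fn next_from(&self, now: u64) -> u64 {
        let prev = self
            .last
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |last| {
                Some(now.max(last + 1))
            })
            .expect("closure always returns Some");
        // Same value the closure stored for this `prev`.
        now.max(prev + 1)
    }

    /// Convenience wrapper that reads the system clock.
    fn next(&self) -> u64 {
        self.next_from(Self::now_msec())
    }
}
```

If the wall clock jumps backwards, `next_from` keeps counting up from the last issued value instead of following the clock.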
Alex a48e2e0cb2
K2V: Subscription to ranges of items
2023-01-10 10:30:59 +01:00
Felix Scheinost f2106c2733 Implement `rpc_secret_file`
2023-01-04 18:35:10 +01:00
Alex f3f27293df
Uniform framework for bg variable management
2023-01-04 13:07:13 +01:00
Alex d56c472712
Refactor background runner and get rid of job worker
2022-12-14 12:51:42 +01:00
Alex 2183518edc
Spawn all background workers in a separate step 2022-12-14 12:28:07 +01:00
Alex 280d1be7b1
Refactor health check and add ability to return it in json
2022-12-05 15:28:57 +01:00
Alex 2065f011ca
Implement /health admin API endpoint to check node health
2022-12-05 14:59:15 +01:00
Alex ab722cb40f
Add checks on replication_factor of layouts we use (fix #363, fix #364)
2022-09-13 16:22:23 +02:00
Alex 07febd3ecd
Ensure data dir is created immediately when Garage starts (fix #349)
2022-09-13 15:57:27 +02:00
Alex d9d199a6c9
Merge branch 'main' into lx-perf-improvements
2022-09-08 15:49:17 +02:00
Alex 8adc654713
Merge branch 'main' into improve-deps 2022-09-07 18:13:27 +02:00
Alex b886c75450
Make all DB engines optional build features 2022-09-06 17:09:43 +02:00
Alex 07e6bcde85
Merge branch 'main' into lx-perf-improvements
2022-09-05 12:40:17 +02:00
Alex 943d76c583
Ability to dynamically set resync tranquility
2022-09-02 15:34:21 +02:00
Alex 2f111e6b3d
Performance improvements:
- reduce contention on mutation_lock by having 256 of them
- better lmdb defaults
2022-07-29 12:24:48 +02:00
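The `mutation_lock` change in the commit above (2f111e6b3d) is an instance of lock striping: one global lock is replaced by 256 locks, and each operation takes only the lock covering its key. A minimal sketch follows. `StripedLocks` is a hypothetical name, and picking the stripe from the first byte of a 32-byte block hash is an assumption made for illustration, not something the commit message states.

```rust
use std::sync::{Mutex, MutexGuard};

/// Number of stripes, matching the "256 of them" in the commit message.
const N_LOCKS: usize = 256;

/// Hypothetical block hash type (illustration only).
type BlockHash = [u8; 32];

/// Hypothetical striped lock set replacing a single global mutex.
struct StripedLocks {
    locks: Vec<Mutex<()>>,
}

impl StripedLocks {
    fn new() -> Self {
        Self {
            locks: (0..N_LOCKS).map(|_| Mutex::new(())).collect(),
        }
    }

    /// Assumption: the first byte of the hash selects the stripe,
    /// which gives exactly 256 possible values.
    fn stripe_of(hash: &BlockHash) -> usize {
        hash[0] as usize
    }

    /// Lock only the stripe that covers `hash`; operations on blocks
    /// in other stripes can proceed concurrently.
    fn lock(&self, hash: &BlockHash) -> MutexGuard<'_, ()> {
        self.locks[Self::stripe_of(hash)]
            .lock()
            .expect("lock poisoned")
    }
}
```

Two mutations contend only when their hashes fall in the same stripe, so with well-distributed hashes contention drops by roughly the number of stripes.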
Alex 77e3fd6db2 improve internal item counter mechanisms and implement bucket quotas (#326)
- [x] Refactoring of internal counting API
- [x] Repair procedure for counters (it's an offline procedure!!!)
- [x] New counter for objects in buckets
- [x] Add quotas to buckets struct
- [x] Add CLI to manage bucket quotas
- [x] Add admin API to manage bucket quotas
- [x] Apply quotas by adding checks on put operations
- [x] Proof-read
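
The "checks on put operations" item above can be sketched as a pure function over the bucket's counters. This is an illustrative sketch under stated assumptions: the type and field names (`BucketQuotas`, `max_size`, `max_objects`, `BucketUsage`, `check_quotas`) are hypothetical here, and the handling of overwritten objects is one plausible choice, not necessarily what the PR does.

```rust
/// Hypothetical quota settings; `None` means unlimited.
#[derive(Default, Clone, Copy)]
struct BucketQuotas {
    max_size: Option<u64>,
    max_objects: Option<u64>,
}

/// Current usage, as tracked by the bucket's counters.
#[derive(Clone, Copy)]
struct BucketUsage {
    bytes: u64,
    objects: u64,
}

#[derive(Debug, PartialEq)]
enum QuotaError {
    SizeExceeded,
    ObjectCountExceeded,
}

/// Check whether storing an object of `new_size` bytes fits the quotas.
/// `replaced_size` is the size of the object being overwritten, if any
/// (assumption: an overwrite does not add to the object count).
fn check_quotas(
    quotas: BucketQuotas,
    usage: BucketUsage,
    new_size: u64,
    replaced_size: Option<u64>,
) -> Result<(), QuotaError> {
    let objects_after = if replaced_size.is_some() {
        usage.objects
    } else {
        usage.objects + 1
    };
    let bytes_after = usage
        .bytes
        .saturating_sub(replaced_size.unwrap_or(0))
        .saturating_add(new_size);

    if let Some(max) = quotas.max_objects {
        if objects_after > max {
            return Err(QuotaError::ObjectCountExceeded);
        }
    }
    if let Some(max) = quotas.max_size {
        if bytes_after > max {
            return Err(QuotaError::SizeExceeded);
        }
    }
    Ok(())
}
```

A put handler would call such a check before accepting the upload and map the error to a client-side (4xx) response.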

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: #326
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-06-15 20:20:28 +02:00
Alex b44d3fc796 Abstract database behind generic interface and implement alternative drivers (#322)
- [x] Design interface
- [x] Implement Sled backend
  - [x] Re-implement the SledCountedTree hack ~~on Sled backend~~ on all backends (i.e. over the abstraction)
- [x] Convert Garage code to use generic interface
- [x] Proof-read converted Garage code
- [ ] Test everything well
- [x] Implement sqlite backend
- [x] Implement LMDB backend
- [ ] (Implement Persy backend?)
- [ ] (Implement other backends? (like RocksDB, ...))
- [x] Implement backend choice in config file and garage server module
- [x] Add CLI for converting between DB formats
- Exploit the new interface to put more things in transactions
  - [x] `.updated()` trigger on Garage tables

Fix #284

**Bugs**

- [x] When exporting sqlite, trees iterate empty??
- [x] LMDB doesn't work

**Known issues for various back-ends**

- Sled:
  - Eats all my RAM and also all my disk space
  - `.len()` has to traverse the whole table
  - Is actually quite slow on some operations
  - And is actually pretty bad code...
- Sqlite:
  - Requires a lock to be taken on all operations. The lock is also taken when iterating on a table with `.iter()`, and the lock isn't released until the iterator is dropped. This means that we must be VERY careful not to do anything else inside a `.iter()` loop, or else we will have a deadlock! Most such cases have been eliminated from the Garage codebase, but some might still remain. If your Garage-over-Sqlite seems to hang/freeze, this is the reason.
  - (adapter uses a bunch of unsafe code)
- Heed (LMDB):
  - Not suited for 32-bit machines as it has to map the whole DB in memory.
  - (adapter uses a tiny bit of unsafe code)
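
The Sqlite deadlock described in the list above comes from an iterator that holds the database lock until it is dropped. The sketch below reproduces the shape of the problem with a toy store and shows the safe pattern: copy the items out, drop the iterator, then write. `LockedDb`, `LockedIter` and `double_all` are hypothetical names; this is not the Garage adapter.

```rust
use std::collections::BTreeMap;
use std::sync::{Mutex, MutexGuard};

/// Toy stand-in for a database where every operation takes one lock.
struct LockedDb {
    inner: Mutex<BTreeMap<String, u64>>,
}

/// Iterator-like handle that keeps the lock until it is dropped.
struct LockedIter<'a> {
    guard: MutexGuard<'a, BTreeMap<String, u64>>,
}

impl LockedDb {
    fn new() -> Self {
        Self { inner: Mutex::new(BTreeMap::new()) }
    }

    /// Takes the lock and keeps it for the lifetime of the handle.
    fn iter(&self) -> LockedIter<'_> {
        LockedIter { guard: self.inner.lock().unwrap() }
    }

    /// Takes the lock too: calling this while a `LockedIter` is alive
    /// on the same thread would never return.
    fn insert(&self, key: &str, value: u64) {
        self.inner.lock().unwrap().insert(key.to_string(), value);
    }
}

impl LockedIter<'_> {
    /// Copy all items out so the handle can be dropped early.
    fn snapshot(&self) -> Vec<(String, u64)> {
        self.guard.iter().map(|(k, v)| (k.clone(), *v)).collect()
    }
}

/// Safe pattern: snapshot, release the lock, then mutate.
fn double_all(db: &LockedDb) {
    let it = db.iter();
    let items = it.snapshot();
    drop(it); // releases the lock before any write
    for (key, value) in items {
        db.insert(&key, value * 2);
    }
}
```

The cost of this pattern is the memory for the snapshot; for large tables, iterating in bounded batches and releasing the lock between batches is the usual compromise.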

**My recommendation:** avoid 32-bit machines and use LMDB as much as possible.

**Converting databases** is actually quite easy. For example from Sled to LMDB:

```bash
cd src/db
cargo run --features cli --bin convert -- -i path/to/garage/meta/db -a sled -o path/to/garage/meta/db.lmdb -b lmdb
```

Then, just add this to your `config.toml`:

```toml
db_engine = "lmdb"
```

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: #322
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-06-08 10:01:44 +02:00
Alex 382e74c798 First version of admin API (#298)
**Spec:**

- [x] Start writing
- [x] Specify all layout endpoints
- [x] Specify all endpoints for operations on keys
- [x] Specify all endpoints for operations on key/bucket permissions
- [x] Specify all endpoints for operations on buckets
- [x] Specify all endpoints for operations on bucket aliases

View rendered spec at <https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/admin-api/doc/drafts/admin-api.md>

**Code:**

- [x] Refactor code for admin api to use common api code that was created for K2V

**General endpoints:**

- [x] Metrics
- [x] GetClusterStatus
- [x] ConnectClusterNodes
- [x] GetClusterLayout
- [x] UpdateClusterLayout
- [x] ApplyClusterLayout
- [x] RevertClusterLayout

**Key-related endpoints:**

- [x] ListKeys
- [x] CreateKey
- [x] ImportKey
- [x] GetKeyInfo
- [x] UpdateKey
- [x] DeleteKey

**Bucket-related endpoints:**

- [x] ListBuckets
- [x] CreateBucket
- [x] GetBucketInfo
- [x] DeleteBucket
- [x] PutBucketWebsite
- [x] DeleteBucketWebsite

**Operations on key/bucket permissions:**

- [x] BucketAllowKey
- [x] BucketDenyKey

**Operations on bucket aliases:**

- [x] GlobalAliasBucket
- [x] GlobalUnaliasBucket
- [x] LocalAliasBucket
- [x] LocalUnaliasBucket

**And also:**

- [x] Separate error type for the admin API (this PR includes a quite big refactoring of error handling)
- [x] Add management of website access
- [ ] Check that nothing is missing wrt what can be done using the CLI
- [ ] Improve formatting of the spec
- [x] Make sure everyone is cool with the API design

Fix #231
Fix #295

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: #298
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-05-24 12:16:39 +02:00
Alex 5768bf3622 First implementation of K2V (#293)
**Specification:**

View spec at [this URL](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/k2v/doc/drafts/k2v-spec.md)

- [x] Specify the structure of K2V triples
- [x] Specify the DVVS format used for causality detection
- [x] Specify the K2V index (just a counter of number of values per partition key)
- [x] Specify single-item endpoints: ReadItem, InsertItem, DeleteItem
- [x] Specify index endpoint: ReadIndex
- [x] Specify multi-item endpoints: InsertBatch, ReadBatch, DeleteBatch
- [x] Move to JSON objects instead of tuples
- [x] Specify endpoints for polling for updates on single values (PollItem)

**Implementation:**

- [x] Table for K2V items, causal contexts
- [x] Indexing mechanism and table for K2V index
- [x] Make API handlers a bit more generic
- [x] K2V API endpoint
- [x] K2V API router
- [x] ReadItem
- [x] InsertItem
- [x] DeleteItem
- [x] PollItem
- [x] ReadIndex
- [x] InsertBatch
- [x] ReadBatch
- [x] DeleteBatch

**Testing:**

- [x] Just a simple Python script that does some requests to check visually that things are going right (does not contain parsing of results or assertions on returned values)
- [x] Actual tests:
  - [x] Adapt testing framework
  - [x] Simple test with InsertItem + ReadItem
  - [x] Test with several Insert/Read/DeleteItem + ReadIndex
  - [x] Test all combinations of return formats for ReadItem
  - [x] Test with ReadBatch, InsertBatch, DeleteBatch
  - [x] Test with PollItem
  - [x] Test error codes
- [ ] Fix most broken stuff
  - [x] test PollItem broken randomly
  - [x] when invalid causality tokens are given, errors should be 4xx not 5xx

**Improvements:**

- [x] Descending range queries
  - [x] Specify
  - [x] Implement
  - [x] Add test
- [x] Batch updates to index counter
- [x] Put K2V behind `k2v` feature flag

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: #293
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-05-10 13:16:57 +02:00
Alex 077dd1cde9
Clippy 2022-03-23 10:25:39 +01:00
Alex e480aaf338
Make background tranquility a configurable parameter 2022-03-23 10:25:19 +01:00
Alex c3982a90b6
Move DataBlock out of manager.rs 2022-03-23 10:25:19 +01:00
Alex c1d9854d2c
Move block manager to separate module 2022-03-23 10:25:15 +01:00
Alex beeef4758e
Some movement of helper code and refactoring of error handling 2022-01-04 12:52:46 +01:00