Commit graph

59 commits

Author SHA1 Message Date
85b5a6bcd1
fix some clippy lints 2023-12-11 15:31:47 +01:00
e4f493b481
table: remove redundant tracing in insert_many 2023-12-11 14:57:42 +01:00
f8df90b79b
table: fix insert_many to not send duplicates 2023-12-08 14:54:11 +01:00
95eb13eb08
rpc: refactor result tracking for quorum sets 2023-12-07 10:57:21 +01:00
3ecd14b9f6
table: implement write sets for insert_many 2023-11-16 16:41:45 +01:00
33c8a489b0
layout: implement ack locking 2023-11-15 15:40:44 +01:00
90e1619b1e
table: take into account multiple write sets in inserts 2023-11-14 15:40:46 +01:00
3b361d2959
layout: prepare for write sets 2023-11-14 14:28:16 +01:00
df36cf3099
layout: add helpers to LayoutHistory and prepare integration with Table 2023-11-09 16:32:31 +01:00
8a2b1dd422
wip: split out layout management from System into separate LayoutManager 2023-11-09 12:55:36 +01:00
426d8784da
cleanup 2023-01-03 15:08:37 +01:00
cdb2a591e9
Refactor how things are migrated 2023-01-03 14:44:47 +01:00
510b620108
Get rid of background::spawn 2022-12-14 16:08:05 +01:00
d56c472712
Refactor background runner and get rid of job worker 2022-12-14 12:51:42 +01:00
2183518edc
Spawn all background workers in a separate step 2022-12-14 12:28:07 +01:00
83c8467e23
Proper queueing for delayed inserts, now backed to disk 2022-12-14 11:58:06 +01:00
56592e1853
RPC performance changes
- configurable ping timeout
- single, much higher, configurable RPC timeout
- no more concurrency semaphore
2022-09-19 20:31:00 +02:00
44733474bb
Remove/change println! in server code (fix #358) 2022-09-13 16:01:55 +02:00
df094bd807
Less strict timeouts 2022-09-01 16:30:44 +02:00
b44d3fc796 Abstract database behind generic interface and implement alternative drivers (#322)
- [x] Design interface
- [x] Implement Sled backend
  - [x] Re-implement the SledCountedTree hack ~~on Sled backend~~ on all backends (i.e. over the abstraction)
- [x] Convert Garage code to use generic interface
- [x] Proof-read converted Garage code
- [ ] Test everything well
- [x] Implement sqlite backend
- [x] Implement LMDB backend
- [ ] (Implement Persy backend?)
- [ ] (Implement other backends? (like RocksDB, ...))
- [x] Implement backend choice in config file and garage server module
- [x] Add CLI for converting between DB formats
- Exploit the new interface to put more things in transactions
  - [x] `.updated()` trigger on Garage tables

Fix #284

**Bugs**

- [x] When exporting sqlite, trees iterate empty??
- [x] LMDB doesn't work

**Known issues for various back-ends**

- Sled:
  - Eats all my RAM and also all my disk space
  - `.len()` has to traverse the whole table
  - Is actually quite slow on some operations
  - And is actually pretty bad code...
- Sqlite:
  - Requires a lock to be taken on all operations. The lock is also taken when iterating on a table with `.iter()`, and it isn't released until the iterator is dropped. This means that we must be VERY careful not to do anything else inside a `.iter()` loop, or else we will have a deadlock! Most such cases have been eliminated from the Garage codebase, but some might remain. If your Garage-over-Sqlite seems to hang/freeze, this is the likely reason.
  - (adapter uses a bunch of unsafe code)
- Heed (LMDB):
  - Not suited for 32-bit machines as it has to map the whole DB in memory.
  - (adapter uses a tiny bit of unsafe code)
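
To illustrate the Sqlite locking issue above, here is a minimal sketch of the usual workaround: copy the entries out while holding the lock, let the lock go, and only then do other operations. `LockedTable`, `snapshot` and `double_all` are hypothetical names invented for this example, and a plain `std::sync::Mutex` stands in for the adapter's lock. This is not the actual Garage adapter code.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;

/// Hypothetical stand-in for a table where every operation takes one lock.
struct LockedTable {
    inner: Mutex<BTreeMap<String, u64>>,
}

impl LockedTable {
    fn new() -> Self {
        Self { inner: Mutex::new(BTreeMap::new()) }
    }

    /// Takes the lock for the duration of the insert only.
    fn insert(&self, key: &str, value: u64) {
        self.inner.lock().unwrap().insert(key.to_string(), value);
    }

    /// Takes the lock for the duration of the lookup only.
    fn get(&self, key: &str) -> Option<u64> {
        self.inner.lock().unwrap().get(key).copied()
    }

    /// Copies all entries while holding the lock. The guard is dropped
    /// when this function returns, so the caller iterates lock-free.
    fn snapshot(&self) -> Vec<(String, u64)> {
        let guard = self.inner.lock().unwrap();
        guard.iter().map(|(k, v)| (k.clone(), *v)).collect()
    }
}

/// Doubles every value. Because the loop runs over a snapshot, the lock is
/// free when `insert` is called. Calling `insert` while still holding the
/// guard from `lock()` on the same thread would deadlock or panic instead,
/// since `std::sync::Mutex` is not reentrant.
fn double_all(table: &LockedTable) {
    for (key, value) in table.snapshot() {
        table.insert(&key, value * 2);
    }
}
```

The trade-off is memory: the snapshot holds a copy of every entry, so for large tables one would more likely collect in bounded batches.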

**My recommendation:** avoid 32-bit machines and use LMDB as much as possible.

**Converting databases** is actually quite easy. For example from Sled to LMDB:

```bash
cd src/db
cargo run --features cli --bin convert -- -i path/to/garage/meta/db -a sled -o path/to/garage/meta/db.lmdb -b lmdb
```

Then, just add this to your `config.toml`:

```toml
db_engine = "lmdb"
```

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: Deuxfleurs/garage#322
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-06-08 10:01:44 +02:00
5768bf3622 First implementation of K2V (#293)
**Specification:**

View spec at [this URL](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/k2v/doc/drafts/k2v-spec.md)

- [x] Specify the structure of K2V triples
- [x] Specify the DVVS format used for causality detection
- [x] Specify the K2V index (just a counter of number of values per partition key)
- [x] Specify single-item endpoints: ReadItem, InsertItem, DeleteItem
- [x] Specify index endpoint: ReadIndex
- [x] Specify multi-item endpoints: InsertBatch, ReadBatch, DeleteBatch
- [x] Move to JSON objects instead of tuples
- [x] Specify endpoints for polling for updates on single values (PollItem)

**Implementation:**

- [x] Table for K2V items, causal contexts
- [x] Indexing mechanism and table for K2V index
- [x] Make API handlers a bit more generic
- [x] K2V API endpoint
- [x] K2V API router
- [x] ReadItem
- [x] InsertItem
- [x] DeleteItem
- [x] PollItem
- [x] ReadIndex
- [x] InsertBatch
- [x] ReadBatch
- [x] DeleteBatch

**Testing:**

- [x] Just a simple Python script that does some requests to check visually that things are going right (does not contain parsing of results or assertions on returned values)
- [x] Actual tests:
  - [x] Adapt testing framework
  - [x] Simple test with InsertItem + ReadItem
  - [x] Test with several Insert/Read/DeleteItem + ReadIndex
  - [x] Test all combinations of return formats for ReadItem
  - [x] Test with ReadBatch, InsertBatch, DeleteBatch
  - [x] Test with PollItem
  - [x] Test error codes
- [ ] Fix most broken stuff
  - [x] test PollItem broken randomly
  - [x] when invalid causality tokens are given, errors should be 4xx not 5xx

**Improvements:**

- [x] Descending range queries
  - [x] Specify
  - [x] Implement
  - [x] Add test
- [x] Batch updates to index counter
- [x] Put K2V behind `k2v` feature flag

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: Deuxfleurs/garage#293
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-05-10 13:16:57 +02:00
f869ca625d
Add spans to table calls, change span names in RPC 2022-03-14 10:54:12 +01:00
2a5609b292
Add metrics to API endpoint 2022-03-14 10:53:36 +01:00
818daa5c78
Refactor how durations are measured 2022-03-14 10:53:35 +01:00
2cab84b1fe
Add many metrics in table/ and rpc/ 2022-03-14 10:51:50 +01:00
beeef4758e
Some movement of helper code and refactoring of error handling 2022-01-04 12:52:46 +01:00
8f6026de5e
Make table name a const in trait 2021-12-15 15:39:10 +01:00
1b450c4b49
Improvements to CLI and various fixes for netapp version
Discovery via consul, persist peer list to file
2021-10-22 16:55:24 +02:00
4067797d01
First port of Garage to Netapp 2021-10-22 15:55:18 +02:00
e4b9e4e24d
rename types to CamelCase 2021-05-03 22:15:09 +02:00
f5a0cf0414
fix clippy warnings on table 2021-05-03 22:11:41 +02:00
7b10245dfb Leader-based GC 2021-03-16 18:42:33 +01:00
0aad2f2e06 some reordering 2021-03-16 11:47:39 +01:00
515029d026 Refactor code 2021-03-16 11:43:58 +01:00
1d9961e411 Simplify replication logic 2021-03-16 11:14:27 +01:00
831eb35763 cargo fmt 2021-03-12 21:52:19 +01:00
c475471e7a Implement table gc, currently for block_ref and version only 2021-03-12 19:57:37 +01:00
cbe7e1a66a Move table rpc client out of tableaux 2021-03-12 15:07:23 +01:00
8860aa19b8 Make syncer have its own rpc client/server 2021-03-12 15:05:26 +01:00
046b649bcc (not well tested) use merkle tree for sync 2021-03-11 18:28:27 +01:00
94f3d28774 WIP big refactoring 2021-03-11 16:54:15 +01:00
8d63738cb0 Checkpoint: add merkle tree in data table 2021-03-11 13:47:21 +01:00
f319a7d374 Refactor model stuff, including cleaner CRDTs 2021-03-10 16:21:56 +01:00
3882d5ba36 Remove epidemic propagation for fully replicated stuff: write directly to all nodes 2021-03-05 15:09:18 +01:00
09fd6ea7f0 I was tired yesterday 2021-02-24 11:05:59 +01:00
a52ab69640 fix misuse of sled transactions 2021-02-23 22:45:36 +01:00
20e6e9fa20 Update sled & try to debug deadlock (but it's in sled...) 2021-02-23 21:27:28 +01:00
bf25c95fe2 Make updated() be a sync function that doesn't fail 2021-02-23 20:25:15 +01:00
28bc967c83 Correctly handle deletions due to offloading 2021-02-23 19:59:43 +01:00
55156cca9d Several changes in table_sync:
- separate path for case of offloading a partition we don't store
- use sync::Mutex instead of tokio::Mutex, make less fn's async
2021-02-23 19:11:02 +01:00