K2V #293
1 changed file with 25 additions and 4 deletions

@@ -104,12 +104,12 @@ For instance, here is a possible sequence of events:

```
(node2, tdiscard2', (v4, t4)) ; tdiscard2' = t3
```
**Generic algorithm for handling insertions:** A certain node n handles the InsertItem and is responsible for the correctness of this procedure; a sketch of these steps in code follows the list below.

1. Lock the key (or the whole table?) at this node to prevent concurrent updates of the value that would mess things up
2. Read current set of values
3. Generate a new timestamp that is larger than the largest timestamp for node n
4. Add the inserted value to the list of values of node n
5. Update the discard times to be the times set in the context, and accordingly discard overwritten values
6. Release lock
7. Propagate updated value to other nodes
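
To make the steps concrete, here is a minimal sketch in Rust of one way node n could execute them. Every type and helper in it (`Entry`, `Timestamp`, the whole-table `Mutex`) is a hypothetical stand-in, not Garage's actual implementation, and step 7 is only marked where it would happen:

```rust
// Sketch of the seven steps above. All types and helpers here are
// hypothetical stand-ins, not Garage's actual implementation.
use std::collections::BTreeMap;
use std::sync::Mutex;

type NodeId = u64;
type Value = Vec<u8>;

#[derive(Clone, Copy)]
struct Timestamp {
    time: u64,
    node: NodeId,
}

#[derive(Default)]
struct Entry {
    // Per-node discard time: values of node n with t <= tdiscard[n] are dropped.
    tdiscard: BTreeMap<NodeId, u64>,
    // Concurrent values still alive, tagged with their write timestamp.
    values: Vec<(Value, Timestamp)>,
}

fn insert_item(
    node: NodeId,
    table: &Mutex<BTreeMap<String, Entry>>,
    key: &str,
    new_value: Value,
    context_tdiscard: &BTreeMap<NodeId, u64>, // discard times from the causal context
) {
    // 1. Lock to prevent concurrent updates (here a whole-table mutex;
    //    a per-key lock would be finer-grained).
    let mut guard = table.lock().unwrap();

    // 2. Read the current set of values for this key.
    let entry = guard.entry(key.to_string()).or_default();

    // 3. Generate a timestamp strictly larger than the largest timestamp
    //    known for this node (including its discard time).
    let t_max = entry
        .values
        .iter()
        .filter(|(_, t)| t.node == node)
        .map(|(_, t)| t.time)
        .chain(entry.tdiscard.get(&node).copied())
        .max()
        .unwrap_or(0);
    let t_new = Timestamp { time: t_max + 1, node };

    // 4. Add the inserted value to this node's list of values.
    entry.values.push((new_value, t_new));

    // 5. Take the discard times set in the context and drop overwritten values.
    for (&n, &td) in context_tdiscard {
        let cur = entry.tdiscard.entry(n).or_insert(0);
        *cur = (*cur).max(td);
    }
    let tdiscard = entry.tdiscard.clone();
    entry
        .values
        .retain(|(_, t)| t.time > tdiscard.get(&t.node).copied().unwrap_or(0));

    // 6. The lock is released when `guard` is dropped at the end of this scope.
    // 7. Propagating the updated entry to the other nodes is omitted here.
}
```

The question left open in step 1 also shows up here: the sketch locks the whole table, which is simplest, while a per-key lock would reduce contention.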

@@ -136,7 +136,28 @@ that keeps track of the number of triples stored for each partition key.
This allows easy listing of all of the partition keys for which triples exist
in a bucket, as the partition key becomes the sort key in the index.
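
For illustration, here is a sketch of such a listing as a range scan; the flat `BTreeMap` keyed by `(bucket, partition key)` is an assumption standing in for the real index table, not Garage's API:

```rust
use std::collections::BTreeMap;

// Stand-in for the index: keyed by (bucket, partition key), so that a
// range scan over one bucket enumerates its partition keys in sorted order.
fn list_partition_keys(index: &BTreeMap<(String, String), u64>, bucket: &str) -> Vec<String> {
    index
        .range((bucket.to_string(), String::new())..)
        .take_while(|((b, _), _)| b.as_str() == bucket)
        .map(|((_, pk), _)| pk.clone())
        .collect()
}
```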

How indexing works:

- Each node keeps a local count of how many items it stores for each partition,
  in a local Sled tree that is updated atomically when an item is modified.
- These local counters are asynchronously stored in the index table, which is
  a regular Garage table spread over the network. Counters are stored as LWW
  values, so the final table will have the following structure:

```
- pk: bucket
- sk: partition key for which we are counting
- v: lwwmap (node id -> number of items)
```
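
As a sketch of what such an entry could look like in Rust (`Lww` and `IndexEntry` are illustrative assumptions, not Garage's actual schema types), together with the LWW merge rule:

```rust
// Hypothetical sketch of an index entry; these names are assumptions,
// not Garage's actual table schema.
use std::collections::BTreeMap;

type NodeId = u64;

/// Last-write-wins register: on merge, the copy with the higher timestamp wins.
struct Lww<T> {
    timestamp: u64,
    value: T,
}

struct IndexEntry {
    pk: String,                           // bucket
    sk: String,                           // partition key being counted
    counters: BTreeMap<NodeId, Lww<u64>>, // node id -> number of items, as LWW
}

impl IndexEntry {
    /// CRDT merge: for each node id, keep the counter with the newest timestamp.
    fn merge(&mut self, other: &IndexEntry) {
        for (&node, theirs) in &other.counters {
            let keep_ours = self
                .counters
                .get(&node)
                .map(|ours| ours.timestamp >= theirs.timestamp)
                .unwrap_or(false);
            if !keep_ours {
                self.counters.insert(
                    node,
                    Lww { timestamp: theirs.timestamp, value: theirs.value },
                );
            }
        }
    }
}
```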

The final number of items present in the partition can be estimated by taking
the maximum of the values (i.e. the value for the node that announces having
the most items for that partition). In most cases, the values for different
node IDs should all be the same; more precisely, three node IDs should map to
the same non-zero value, and all other node IDs that are present are tombstones
that map to zeroes. Note that we need to filter out values from nodes that are
no longer part of the cluster layout: when nodes are removed, they won't
necessarily have had the time to set their counters to zero.
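
A sketch of this estimation rule, assuming a plain map of node id to count and a `layout_nodes` list standing in for the cluster layout:

```rust
use std::collections::BTreeMap;

type NodeId = u64;

/// Estimate the number of items in a partition from the per-node counters,
/// ignoring counters from nodes that are no longer in the cluster layout.
fn estimate_count(counters: &BTreeMap<NodeId, u64>, layout_nodes: &[NodeId]) -> u64 {
    counters
        .iter()
        .filter(|&(node, _)| layout_nodes.contains(node))
        .map(|(_, &count)| count)
        .max()
        .unwrap_or(0)
}
```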
## API Endpoints