K2V #293

Merged
lx merged 68 commits from k2v into main 2022-05-10 11:16:58 +00:00
Showing only changes of commit 834e564efa

@ -90,21 +90,26 @@ For instance, here is a possible sequence of events:
1. First we have the set of values v1, v2 and v3 described above. 1. First we have the set of values v1, v2 and v3 described above.
A node that reads it obtains values v1, v2 and v3 with context `[(node1, t2), (node2, t3)]`. A node that reads it obtains values v1, v2 and v3 with context `[(node1, t2), (node2, t3)]`.
2. A node writes a value `v5` with context `[(node1, t1)]`, i.e. `v5` is only a successor of v1 but not of v2 or v3. Suppose node1 receives the write, it will generate a new timestamp `t5` larger than all of the timestamps it knows of, i.e. `t5 > t2`. We will now have: 2. A node writes a value `v5` with context `[(node1, t1)]`, i.e. `v5` is only a
successor of v1 but not of v2 or v3. Suppose node1 receives the write, it
will generate a new timestamp `t5` larger than all of the timestamps it
knows of, i.e. `t5 > t2`. We will now have:
``` ```
(node1, tdiscard1'', (v2, t2), (v5, t5)) ; tdiscard1'' = t1 < t2 < t5 (node1, tdiscard1'', (v2, t2), (v5, t5)) ; tdiscard1'' = t1 < t2 < t5
(node2, tdiscard2, (v3, t3)) ; tdiscard2 < t3 (node2, tdiscard2, (v3, t3)) ; tdiscard2 < t3
``` ```
3. Now `v4` is written with context `[(node1, t2), (node2, t3)]`, and node2 processes the query. It will generate `t4 > t3` and the state will become: 3. Now `v4` is written with context `[(node1, t2), (node2, t3)]`, and node2
processes the query. It will generate `t4 > t3` and the state will become:
``` ```
(node1, tdiscard1', (v5, t5)) ; tdiscard1' = t2 < t5 (node1, tdiscard1', (v5, t5)) ; tdiscard1' = t2 < t5
(node2, tdiscard2', (v4, t4)) ; tdiscard2' = t3 (node2, tdiscard2', (v4, t4)) ; tdiscard2' = t3
``` ```
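The sequence of events above can be replayed with a small model. This is only a sketch under assumptions, not the actual implementation: timestamps are plain integers, `NodeEntry` and `apply_write` are hypothetical names, and the handling node picks its new timestamp from its own entry only.

```python
from dataclasses import dataclass, field


@dataclass
class NodeEntry:
    """Per-node state: a discard timestamp and the surviving (value, timestamp) pairs."""
    t_discard: int = 0
    values: list = field(default_factory=list)


def apply_write(state, handler, context, value):
    """Apply a write carrying `context`, processed by node `handler`.

    `state` maps node name -> NodeEntry; `context` is a list of
    (node, timestamp) pairs returned by an earlier read.
    """
    # Everything the writer has already seen is superseded and discarded.
    for node, seen_ts in context:
        entry = state.setdefault(node, NodeEntry())
        entry.t_discard = max(entry.t_discard, seen_ts)
        entry.values = [(v, ts) for (v, ts) in entry.values if ts > entry.t_discard]

    # The handling node generates a timestamp larger than any it knows of
    # (assumption: only its own entry is consulted here).
    entry = state.setdefault(handler, NodeEntry())
    new_ts = max([entry.t_discard] + [ts for _, ts in entry.values]) + 1
    entry.values.append((value, new_ts))
    return new_ts
```

Starting from `node1` holding `v1` and `v2` and `node2` holding `v3`, calling `apply_write(state, "node1", [("node1", 1)], "v5")` followed by `apply_write(state, "node2", [("node1", 2), ("node2", 3)], "v4")` leaves one value per node, matching steps 2 and 3.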
**Generic algorithm for handling insertions:** A certain node n handles the InsertItem and is responsible for the correctness of this procedure. **Generic algorithm for handling insertions:** A certain node n handles the
InsertItem and is responsible for the correctness of this procedure.
1. Lock the key (or the whole table?) at this node to prevent concurrent updates of the value that would mess things up 1. Lock the key (or the whole table?) at this node to prevent concurrent updates of the value that would mess things up
2. Read current set of values 2. Read current set of values
@ -357,11 +362,11 @@ HTTP/1.1 200 OK
end: null, end: null,
limit: null, limit: null,
partition_keys: [ partition_keys: [
[ "keys", 3043 ], { pk: "keys", n: 3043 },
[ "mailbox:INBOX", 42 ], { pk: "mailbox:INBOX", n: 42 },
[ "mailbox:Junk", 2991 ], { pk: "mailbox:Junk", n: 2991 },
[ "mailbox:Trash", 10 ], { pk: "mailbox:Trash", n: 10 },
[ "mailboxes", 3 ], { pk: "mailboxes", n: 3 },
], ],
more: false, more: false,
nextStart: null, nextStart: null,
@ -374,8 +379,8 @@ HTTP/1.1 200 OK
**InsertBatch: `POST /<bucket>`** **InsertBatch: `POST /<bucket>`**
Simple insertion and deletion of triplets. The body is just a list of items to Simple insertion and deletion of triplets. The body is just a list of items to
insert in the following format: `[ "<partition key>", "<sort key>", "<causality insert in the following format:
token>"|null, "<value>"|null ]`. `{ pk: "<partition key>", sk: "<sort key>", ct: "<causality token>"|null, v: "<value>"|null }`.
The causality token should be the one returned in a previous read request (e.g. The causality token should be the one returned in a previous read request (e.g.
by ReadItem or ReadBatch), to indicate that this write takes into account the by ReadItem or ReadBatch), to indicate that this write takes into account the
@ -397,9 +402,9 @@ Example query:
POST /my_bucket HTTP/1.1 POST /my_bucket HTTP/1.1
[ [
[ "mailbox:INBOX", "001892831", "opaquetoken321", "b64cryptoblob321updated" ], { pk: "mailbox:INBOX", sk: "001892831", ct: "opaquetoken321", v: "b64cryptoblob321updated" },
[ "mailbox:INBOX", "001892912", null, "b64cryptoblob444" ], { pk: "mailbox:INBOX", sk: "001892912", ct: null, v: "b64cryptoblob444" },
[ "mailbox:INBOX", "001892932", "opaquetoken654", null ], { pk: "mailbox:INBOX", sk: "001892932", ct: "opaquetoken654", v: null },
] ]
``` ```
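A client could build such a body as follows. This is a minimal sketch, assuming the body is sent as strict JSON (quoted keys) and that values are base64-encoded by the client; `make_item` and `insert_batch_body` are hypothetical helper names, not part of the K2V API.

```python
import base64
import json


def make_item(pk, sk, ct=None, value=None):
    """Build one InsertBatch item in the new object format.

    `ct` is the causality token from a previous read, or None.
    `value` is raw bytes, or None to request a deletion.
    """
    encoded = None if value is None else base64.b64encode(value).decode("ascii")
    return {"pk": pk, "sk": sk, "ct": ct, "v": encoded}


def insert_batch_body(items):
    """Serialize a list of items as the request body of `POST /<bucket>`."""
    return json.dumps(items)
```

For example, `insert_batch_body([make_item("mailbox:INBOX", "001892932", "opaquetoken654", None)])` produces a one-element list whose `v` field is `null`, i.e. a deletion that takes the given causality token into account.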
@ -440,17 +445,16 @@ The result is a list of length the number of searches, that consists in for
each search a JSON object specified similarly to the result of ReadIndex, but each search a JSON object specified similarly to the result of ReadIndex, but
that lists triples within a partition key. that lists triples within a partition key.
The format of returned tuples is as follows: `[ "<sort key>", "<causality The format of returned tuples is as follows: `{ sk: "<sort key>", ct: "<causality
token>", "<value1>", ...]`, with the following fields: token>", v: ["<value1>", ...] }`, with the following fields:
- sort key: any unicode string used as a sort key - `sk` (sort key): any unicode string used as a sort key
- causality token: an opaque token served by the server (generally - `ct` (causality token): an opaque token served by the server (generally
base64-encoded) to be used in subsequent writes to this key base64-encoded) to be used in subsequent writes to this key
- value: binary blob, always base64-encoded - `v` (list of values): each value is a binary blob, always base64-encoded;
contains multiple items when concurrent values exist
- if several concurrent values exist, they are appended at the end
- in case of concurrent update and deletion, a `null` is added to the list of concurrent values - in case of concurrent update and deletion, a `null` is added to the list of concurrent values
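The fields listed above could be consumed on the client side roughly like this. It is a sketch under assumptions: the item has already been parsed from JSON into a dict, and `decode_item` and `has_conflict` are hypothetical names.

```python
import base64


def decode_item(item):
    """Decode one returned item `{sk, ct, v}`.

    Returns (sort key, causality token, values), where each value is
    raw bytes, or None for a concurrent deletion.
    """
    values = [None if v is None else base64.b64decode(v) for v in item["v"]]
    return item["sk"], item["ct"], values


def has_conflict(item):
    """True when the item carries several concurrent values."""
    return len(item["v"]) > 1
```

A caller that finds `has_conflict(item)` true would typically merge the values and write the result back with the returned `ct`, so that the new write supersedes all of them.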
@ -497,9 +501,9 @@ HTTP/1.1 200 OK
tombstones: false, tombstones: false,
single_item: false, single_item: false,
items: [ items: [
[ "INBOX", "opaquetoken123", "b64cryptoblob123", "b64cryptoblob'123" ], { sk: "INBOX", ct: "opaquetoken123", v: ["b64cryptoblob123", "b64cryptoblob'123"] },
[ "Trash", "opaquetoken456", "b64cryptoblob456" ], { sk: "Trash", ct: "opaquetoken456", v: ["b64cryptoblob456"] },
[ "Junk", "opaquetoken789", "b64cryptoblob789" ], { sk: "Junk", ct: "opaquetoken789", v: ["b64cryptoblob789"] },
], ],
more: false, more: false,
nextStart: null, nextStart: null,
@ -513,9 +517,9 @@ HTTP/1.1 200 OK
tombstones: false, tombstones: false,
single_item: false, single_item: false,
items: [ items: [
[ "001892831", "opaquetoken321", "b64cryptoblob321" ], { sk: "001892831", ct: "opaquetoken321", v: ["b64cryptoblob321"] },
[ "001892832", "opaquetoken654", "b64cryptoblob654" ], { sk: "001892832", ct: "opaquetoken654", v: ["b64cryptoblob654"] },
[ "001892874", "opaquetoken987", "b64cryptoblob987" ], { sk: "001892874", ct: "opaquetoken987", v: ["b64cryptoblob987"] },
], ],
more: true, more: true,
nextStart: "001892898", nextStart: "001892898",
@ -529,7 +533,7 @@ HTTP/1.1 200 OK
limit: null, limit: null,
single_item: true, single_item: true,
items: [ items: [
[ "0", "opaquetoken999", "b64binarystuff999" ], { sk: "0", ct: "opaquetoken999", v: ["b64binarystuff999"] },
], ],
more: false, more: false,
nextStart: null, nextStart: null,