WIP: Add a roadmap draft #7

Draft
quentin wants to merge 1 commits from roadmap into master
Owner
No description provided.
quentin added 1 commit 2022-04-07 09:39:36 +00:00
continuous-integration/drone/push Build is passing Details
continuous-integration/drone/pr Build is passing Details
a24e8b43d4
Add a roadmap draft
quentin added a new dependency 2022-04-07 09:39:51 +00:00
lx reviewed 2022-04-12 12:58:43 +00:00
lx left a comment
Owner

Added some notes on my perspective for the roadmap

@ -0,0 +16,4 @@
## Feature completeness
Most importantly, we still need to fix some corner cases on advanced S3 endpoints (e.g. [#263](https://git.deuxfleurs.fr/Deuxfleurs/garage/issues), [#248](https://git.deuxfleurs.fr/Deuxfleurs/garage/issues/248), [#204](https://git.deuxfleurs.fr/Deuxfleurs/garage/issues/204)). Based on community feedback, we might also consider implementing additional endpoints (e.g. [#166](https://git.deuxfleurs.fr/Deuxfleurs/garage/issues/166)) or quotas (e.g. [#71](https://git.deuxfleurs.fr/Deuxfleurs/garage/issues/71)) but we can't make any promises (sorry!).
Owner

#263 (anonymous access) -> this shouldn't be the first thing we mention. For me it's low priority because the web endpoint mostly serves the same purpose. However, if there is a clear vision of something we want to achieve with it that currently can't be done, I'm interested.

#248 (fix uploadpartcopy) -> we already have a patch for this, maybe not necessary to have it in the roadmap?

#166 (versioning) -> it looks to me like there is quite some demand for this, so it would be nice to do it. However, it's a lot of work and we need to plan some time dedicated to this issue if we want to have it. Moreover, it would imply a radical overhaul of the internal data structures of Garage.

#204 (correct multipart uploads) -> we will need to have this at some point; however it also requires changing the data model in a non-trivial way. If we decide to spend time on #166 we can do both at the same time, which would be simpler.

#71 (multi-tenancy) -> as Quentin said in the comments, we can start with a simpler accounting scheme, just to know how much storage is used by each bucket, which can be opportunistically done when we develop the index-counting mechanism for K2V. I don't really know the priority level for this but I'd guess quite low; I haven't heard of anybody who needs this now.
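
As a rough illustration of the "simpler accounting scheme" idea, here is a minimal single-process sketch that keeps per-bucket object and byte counters up to date on every write and delete. Everything in it (the `BucketUsage` class, its method names, the in-memory dictionaries) is hypothetical and is not taken from Garage; a real implementation would also have to deal with distribution and concurrent updates, which this sketch ignores.

```python
class BucketUsage:
    """Hypothetical per-bucket accounting: object count and total bytes."""

    def __init__(self):
        self._sizes = {}   # (bucket, key) -> size in bytes of the stored object
        self._totals = {}  # bucket -> {"objects": int, "bytes": int}

    def put(self, bucket, key, size):
        """Record a write; overwriting a key replaces its previous size."""
        totals = self._totals.setdefault(bucket, {"objects": 0, "bytes": 0})
        old_size = self._sizes.get((bucket, key))
        if old_size is None:
            totals["objects"] += 1
        else:
            totals["bytes"] -= old_size
        totals["bytes"] += size
        self._sizes[(bucket, key)] = size

    def delete(self, bucket, key):
        """Record a delete; deleting an unknown key is a no-op."""
        old_size = self._sizes.pop((bucket, key), None)
        if old_size is not None:
            totals = self._totals[bucket]
            totals["objects"] -= 1
            totals["bytes"] -= old_size

    def usage(self, bucket):
        """Return a copy of the counters for a bucket (zeros if unknown)."""
        return dict(self._totals.get(bucket, {"objects": 0, "bytes": 0}))
```

The point of the sketch is only that usage can be read in constant time from counters maintained at write time, instead of listing every object in the bucket.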

@ -0,0 +18,4 @@
We also made a series of observations (the S3 API is not suited to handling many small objects, much software requires a database, we already have an internal database for metadata) which make us believe that Garage could leverage its internal database system to provide a simple key-value store. We have already written [an API draft](https://p.adnab.me/code/#/2/code/view/eUNPbfoUrMbCY+CoMXaqed4jmWlmvWALHNDcfuM-O5o/embed/present/) for an API we named K2V. In the coming months, we will try to implement it and merge it if it makes sense. In spirit, we would like it to be close to the original [Amazon Dynamo paper](https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf), or if you prefer approximate comparisons, K2V could be to Cassandra what sqlite is to PostgreSQL.
Owner

for the API draft please use the following URL: https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/k2v/doc/drafts/k2v-spec.md

Also I don't really like the comparison with sqlite, it feels wrong to me. I'd rather say we are to Cassandra what LMDB (or BerkeleyDB) is to Sqlite, but I guess nobody knows what LMDB or BerkeleyDB is.

@ -0,0 +19,4 @@
If you worry about feature bloat, rest assured that we do not plan to extend Garage beyond these points!
Owner

This looks a bit strange to me: how can we know in advance where we will stop? We don't really know who will use Garage in the future and what needs they will have.

@ -0,0 +27,4 @@
Garage currently has 2 APIs: S3 and Admin, with different *Consistency Models*. The S3 API's consistency is aligned with [Amazon's new S3 Strong Consistency](https://aws.amazon.com/s3/consistency/) while the Admin API's eventual consistency is not yet specified. We want to first document the Admin API's consistency, then we could make the explanation of S3 consistency more approachable.
The Admin API is also currently only exposed through our custom RPC endpoint. We would like to expose it through a REST endpoint ([#231](https://git.deuxfleurs.fr/Deuxfleurs/garage/issues/231)) to ease *Administration*. This REST API would make it possible to build a web interface to manage Garage ([#232](https://git.deuxfleurs.fr/Deuxfleurs/garage/issues/232)).
Owner

#231 (REST admin endpoints) -> we will need this at Deuxfleurs to interconnect with Guichet so it's relatively high in the priority list. Also it's quite easy work which can probably be done without my intervention.

#232 (admin web UI) -> I don't think we should spend time developing a web UI just for Garage "in the general case". What we need for ourselves is something specific that integrates with Guichet, and we should focus on that.

The following are missing in the list:

#207 monitoring of background tasks -> this would bring great quality-of-life improvements for Garage admins so I'd say it's pretty high in the list

#255 automatically scrub regularly -> for durability we need this; it would probably be better to develop it in conjunction with #207
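
To make the idea of a scrub concrete, here is a minimal sketch of what a scrub pass checks. It assumes, purely for illustration, that each data block is stored under the SHA-256 hex digest of its content; the `scrub` function and the in-memory `blocks` dictionary are hypothetical and do not reflect Garage's actual storage layout, hash function, or scheduling.

```python
import hashlib


def scrub(blocks):
    """Re-hash every stored block and report the ones that no longer match.

    `blocks` maps a block identifier (assumed here to be the SHA-256 hex
    digest of the original content) to the bytes currently on disk.
    Returns the sorted list of identifiers whose content is corrupted.
    """
    corrupted = []
    for block_id, data in blocks.items():
        if hashlib.sha256(data).hexdigest() != block_id:
            corrupted.append(block_id)
    return sorted(corrupted)
```

Running such a pass regularly is what lets silent corruption be detected and repaired from another replica before the remaining copies degrade as well.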

@ -0,0 +35,4 @@
<!--2) Explaining how Garage can take its place in the existing ecosystem, including among the other distributed storage systems, but also in term of uses cases and deployments (how does it perform at scale, with which hardware, for which application, etc.) 3) Make possible to manage Garage from a REST API, possibly write a web GUI to make administration easier, 4) help people understand the reliability and storage density they will have for a specific Garage deployment, if possible through a simulator, 5) we might consider adding a system of quota to protect a cluster from a misbehaving user. 6) Integration in ecosystems -->
## Correctness
Owner

this section could be called "correctness and performance"

@ -0,0 +41,4 @@
But we still need to make sure that in practice our implementation is correct, and thus has these defined properties.
[Jepsen](https://jepsen.io/) is a well-known tool to test distributed system properties by simulating some system states and sequencing packets in a specific order.
We would like to learn it and apply it to Garage to better convince ourselves that we
Owner

Jepsen -> we don't have time, let's drop it for now

@ -0,0 +44,4 @@
Consider control theory, speak about tranquilizer ([#145](https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/145)).
Owner

Very low priority for me

@ -0,0 +49,4 @@
Horizontal scaling made possible no leader also no erasure coding.
Track bottlenecks, instability and overload.
We also plan to deploy Garage on multiple clusters and run a large series of benchmarks.
Owner

It looks like we are entering the "performance" domain here. If so, we probably want to mention a rework of the RPC stack to allow streaming RPCs (maybe based on QUIC but maybe not), which would allow us to reduce the TTFB and general latency of many requests while also enabling us to increase the block size, bringing further performance improvements. To me this is probably the highest priority item in the "performance" milestone.
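
The TTFB argument can be illustrated with a small sketch that contrasts a buffered response with a streamed one. Both functions and the `read_block` callback are hypothetical stand-ins, not Garage's RPC code: the buffered version must read every block before the caller sees anything, while the streamed version hands over the first block after a single read.

```python
from typing import Callable, Iterator


def fetch_buffered(read_block: Callable[[int], bytes], n_blocks: int) -> Iterator[bytes]:
    """Read every block first, then deliver the whole payload at once."""
    payload = b"".join(read_block(i) for i in range(n_blocks))
    yield payload


def fetch_streamed(read_block: Callable[[int], bytes], n_blocks: int) -> Iterator[bytes]:
    """Deliver each block as soon as it has been read."""
    for i in range(n_blocks):
        yield read_block(i)
```

With streaming, the time to first byte depends on one block read rather than on the whole object, which is also why a larger block size becomes less costly for latency.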

All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
This pull request is marked as a work in progress.
This branch is out-of-date with the base branch

Depends on
#6 Blog post introducing Garage v0.7
Deuxfleurs/garagehq.deuxfleurs.fr
Reference: Deuxfleurs/garagehq.deuxfleurs.fr#7