From 56384677fa70bace19a4f2b555d84de7f77339e0 Mon Sep 17 00:00:00 2001 From: Alex Auvolat Date: Mon, 30 Jan 2023 17:43:25 +0100 Subject: [PATCH 1/5] Add links to presentations --- doc/book/design/_index.md | 6 ++++-- doc/book/design/related-work.md | 3 +-- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/doc/book/design/_index.md b/doc/book/design/_index.md index a3a6ac11..b54b4f8e 100644 --- a/doc/book/design/_index.md +++ b/doc/book/design/_index.md @@ -20,12 +20,14 @@ and could not do, etc. We love to talk and hear about Garage, that's why we keep a log here: + - [(en, 2023-01-18) Presentation of Garage at Inria](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/commit/4cff37397f626ef063dad29e5b5e97ab1206015d/doc/talks/2023-01-18-tocatta/talk.pdf) + + - [(fr, 2022-11-19) De l'auto-hébergement à l'entre-hébergement : Garage, pour conserver ses données ensemble](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/commit/4cff37397f626ef063dad29e5b5e97ab1206015d/doc/talks/2022-11-19-Capitole-du-Libre/pr%C3%A9sentation.pdf) + - [(fr, 2021-11-13, video) Garage : Mille et une façons de stocker vos données](https://video.tedomum.net/w/moYKcv198dyMrT8hCS5jz9) and [slides (html)](https://rfid.deuxfleurs.fr/presentations/2021-11-13/garage/) - during [RFID#1](https://rfid.deuxfleurs.fr/programme/2021-11-13/) event - [(en, 2021-04-28) Distributed object storage is centralised](https://git.deuxfleurs.fr/Deuxfleurs/garage/raw/commit/b1f60579a13d3c5eba7f74b1775c84639ea9b51a/doc/talks/2021-04-28_spirals-team/talk.pdf) - [(fr, 2020-12-02) Garage : jouer dans la cour des grands quand on est un hébergeur associatif](https://git.deuxfleurs.fr/Deuxfleurs/garage/raw/commit/b1f60579a13d3c5eba7f74b1775c84639ea9b51a/doc/talks/2020-12-02_wide-team/talk.pdf) -*Did you write or talk about Garage? 
[Open a pull request](https://git.deuxfleurs.fr/Deuxfleurs/garage/) to add a link here!* - diff --git a/doc/book/design/related-work.md b/doc/book/design/related-work.md index f96c6618..6c1a6b12 100644 --- a/doc/book/design/related-work.md +++ b/doc/book/design/related-work.md @@ -72,8 +72,7 @@ We considered there v2's design but concluded that it does not fit both our *Sel **[Riak CS](https://docs.riak.com/riak/cs/2.1.1/index.html):** *Not written yet* -**[IPFS](https://ipfs.io/):** -*Not written yet* +**[IPFS](https://ipfs.io/):** IPFS has design goals radically different from Garage, we have [a blog post](@/blog/2022-ipfs/index.md) talking about it. ## Specific research papers From 44f8b1d71abf661fb4e2a34b22c00569efc09481 Mon Sep 17 00:00:00 2001 From: Alex Auvolat Date: Mon, 30 Jan 2023 18:00:01 +0100 Subject: [PATCH 2/5] Reorder reference manual section, move metrics list to there --- doc/book/cookbook/monitoring.md | 276 +---------------- doc/book/reference-manual/admin-api.md | 2 +- doc/book/reference-manual/k2v.md | 2 +- doc/book/reference-manual/monitoring.md | 285 ++++++++++++++++++ doc/book/reference-manual/s3-compatibility.md | 2 +- 5 files changed, 289 insertions(+), 278 deletions(-) create mode 100644 doc/book/reference-manual/monitoring.md diff --git a/doc/book/cookbook/monitoring.md b/doc/book/cookbook/monitoring.md index f2240e8c..8313daa9 100644 --- a/doc/book/cookbook/monitoring.md +++ b/doc/book/cookbook/monitoring.md @@ -52,280 +52,6 @@ or make your own. We detail below the list of exposed metrics and their meaning. - ## List of exported metrics -### Garage system metrics - -#### `garage_build_info` (counter) - -Exposes the Garage version number running on a node. 
- -``` -garage_build_info{version="1.0"} 1 -``` - -#### `garage_replication_factor` (counter) - -Exposes the Garage replication factor configured on the node - -``` -garage_replication_factor 3 -``` - -### Metrics of the API endpoints - -#### `api_admin_request_counter` (counter) - -Counts the number of requests to a given endpoint of the administration API. Example: - -``` -api_admin_request_counter{api_endpoint="Metrics"} 127041 -``` - -#### `api_admin_request_duration` (histogram) - -Evaluates the duration of API calls to the various administration API endpoint. Example: - -``` -api_admin_request_duration_bucket{api_endpoint="Metrics",le="0.5"} 127041 -api_admin_request_duration_sum{api_endpoint="Metrics"} 605.250344830999 -api_admin_request_duration_count{api_endpoint="Metrics"} 127041 -``` - -#### `api_s3_request_counter` (counter) - -Counts the number of requests to a given endpoint of the S3 API. Example: - -``` -api_s3_request_counter{api_endpoint="CreateMultipartUpload"} 1 -``` - -#### `api_s3_error_counter` (counter) - -Counts the number of requests to a given endpoint of the S3 API that returned an error. Example: - -``` -api_s3_error_counter{api_endpoint="GetObject",status_code="404"} 39 -``` - -#### `api_s3_request_duration` (histogram) - -Evaluates the duration of API calls to the various S3 API endpoints. Example: - -``` -api_s3_request_duration_bucket{api_endpoint="CreateMultipartUpload",le="0.5"} 1 -api_s3_request_duration_sum{api_endpoint="CreateMultipartUpload"} 0.046340762 -api_s3_request_duration_count{api_endpoint="CreateMultipartUpload"} 1 -``` - -#### `api_k2v_request_counter` (counter), `api_k2v_error_counter` (counter), `api_k2v_error_duration` (histogram) - -Same as for S3, for the K2V API. 
- - -### Metrics of the Web endpoint - - -#### `web_request_counter` (counter) - -Number of requests to the web endpoint - -``` -web_request_counter{method="GET"} 80 -``` - -#### `web_request_duration` (histogram) - -Duration of requests to the web endpoint - -``` -web_request_duration_bucket{method="GET",le="0.5"} 80 -web_request_duration_sum{method="GET"} 1.0528433229999998 -web_request_duration_count{method="GET"} 80 -``` - -#### `web_error_counter` (counter) - -Number of requests to the web endpoint resulting in errors - -``` -web_error_counter{method="GET",status_code="404 Not Found"} 64 -``` - - -### Metrics of the data block manager - -#### `block_bytes_read`, `block_bytes_written` (counter) - -Number of bytes read/written to/from disk in the data storage directory. - -``` -block_bytes_read 120586322022 -block_bytes_written 3386618077 -``` - -#### `block_compression_level` (counter) - -Exposes the block compression level configured for the Garage node. - -``` -block_compression_level 3 -``` - -#### `block_read_duration`, `block_write_duration` (histograms) - -Evaluates the duration of the reading/writing of individual data blocks in the data storage directory. - -``` -block_read_duration_bucket{le="0.5"} 169229 -block_read_duration_sum 2761.6902550310056 -block_read_duration_count 169240 -block_write_duration_bucket{le="0.5"} 3559 -block_write_duration_sum 195.59170078500006 -block_write_duration_count 3571 -``` - -#### `block_delete_counter` (counter) - -Counts the number of data blocks that have been deleted from storage. - -``` -block_delete_counter 122 -``` - -#### `block_resync_counter` (counter), `block_resync_duration` (histogram) - -Counts the number of resync operations the node has executed, and evaluates their duration. 
- -``` -block_resync_counter 308897 -block_resync_duration_bucket{le="0.5"} 308892 -block_resync_duration_sum 139.64204196100016 -block_resync_duration_count 308897 -``` - -#### `block_resync_queue_length` (gauge) - -The number of block hashes currently queued for a resync. -This is normal to be nonzero for long periods of time. - -``` -block_resync_queue_length 0 -``` - -#### `block_resync_errored_blocks` (gauge) - -The number of block hashes that we were unable to resync last time we tried. -**THIS SHOULD BE ZERO, OR FALL BACK TO ZERO RAPIDLY, IN A HEALTHY CLUSTER.** -Persistent nonzero values indicate that some data is likely to be lost. - -``` -block_resync_errored_blocks 0 -``` - - -### Metrics related to RPCs (remote procedure calls) between nodes - -#### `rpc_netapp_request_counter` (counter) - -Number of RPC requests emitted - -``` -rpc_request_counter{from="",rpc_endpoint="garage_block/manager.rs/Rpc",to=""} 176 -``` - -#### `rpc_netapp_error_counter` (counter) - -Number of communication errors (errors in the Netapp library, generally due to disconnected nodes) - -``` -rpc_netapp_error_counter{from="",rpc_endpoint="garage_block/manager.rs/Rpc",to=""} 354 -``` - -#### `rpc_timeout_counter` (counter) - -Number of RPC timeouts, should be close to zero in a healthy cluster. - -``` -rpc_timeout_counter{from="",rpc_endpoint="garage_rpc/membership.rs/SystemRpc",to=""} 1 -``` - -#### `rpc_duration` (histogram) - -The duration of internal RPC calls between Garage nodes. 
- -``` -rpc_duration_bucket{from="",rpc_endpoint="garage_block/manager.rs/Rpc",to="",le="0.5"} 166 -rpc_duration_sum{from="",rpc_endpoint="garage_block/manager.rs/Rpc",to=""} 35.172253716 -rpc_duration_count{from="",rpc_endpoint="garage_block/manager.rs/Rpc",to=""} 174 -``` - - -### Metrics of the metadata table manager - -#### `table_gc_todo_queue_length` (gauge) - -Table garbage collector TODO queue length - -``` -table_gc_todo_queue_length{table_name="block_ref"} 0 -``` - -#### `table_get_request_counter` (counter), `table_get_request_duration` (histogram) - -Number of get/get_range requests internally made on each table, and their duration. - -``` -table_get_request_counter{table_name="bucket_alias"} 315 -table_get_request_duration_bucket{table_name="bucket_alias",le="0.5"} 315 -table_get_request_duration_sum{table_name="bucket_alias"} 0.048509778000000024 -table_get_request_duration_count{table_name="bucket_alias"} 315 -``` - - -#### `table_put_request_counter` (counter), `table_put_request_duration` (histogram) - -Number of insert/insert_many requests internally made on this table, and their duration - -``` -table_put_request_counter{table_name="block_ref"} 677 -table_put_request_duration_bucket{table_name="block_ref",le="0.5"} 677 -table_put_request_duration_sum{table_name="block_ref"} 61.617528636 -table_put_request_duration_count{table_name="block_ref"} 677 -``` - -#### `table_internal_delete_counter` (counter) - -Number of value deletions in the tree (due to GC or repartitioning) - -``` -table_internal_delete_counter{table_name="block_ref"} 2296 -``` - -#### `table_internal_update_counter` (counter) - -Number of value updates where the value actually changes (includes creation of new key and update of existing key) - -``` -table_internal_update_counter{table_name="block_ref"} 5996 -``` - -#### `table_merkle_updater_todo_queue_length` (gauge) - -Merkle tree updater TODO queue length (should fall to zero rapidly) - -``` 
-table_merkle_updater_todo_queue_length{table_name="block_ref"} 0 -``` - -#### `table_sync_items_received`, `table_sync_items_sent` (counters) - -Number of data items sent to/recieved from other nodes during resync procedures - -``` -table_sync_items_received{from="",table_name="bucket_v2"} 3 -table_sync_items_sent{table_name="block_ref",to=""} 2 -``` - - +See our [dedicated page](@/documentation/reference-manual/monitoring.md) in the Reference manual section. diff --git a/doc/book/reference-manual/admin-api.md b/doc/book/reference-manual/admin-api.md index 0b7e2e16..363bc886 100644 --- a/doc/book/reference-manual/admin-api.md +++ b/doc/book/reference-manual/admin-api.md @@ -1,6 +1,6 @@ +++ title = "Administration API" -weight = 60 +weight = 40 +++ The Garage administration API is accessible through a dedicated server whose diff --git a/doc/book/reference-manual/k2v.md b/doc/book/reference-manual/k2v.md index 207d056a..d40ec854 100644 --- a/doc/book/reference-manual/k2v.md +++ b/doc/book/reference-manual/k2v.md @@ -1,6 +1,6 @@ +++ title = "K2V" -weight = 70 +weight = 100 +++ Starting with version 0.7.2, Garage introduces an optionnal feature, K2V, diff --git a/doc/book/reference-manual/monitoring.md b/doc/book/reference-manual/monitoring.md new file mode 100644 index 00000000..97c533d3 --- /dev/null +++ b/doc/book/reference-manual/monitoring.md @@ -0,0 +1,285 @@ + ++++ +title = "Monitoring" +weight = 60 ++++ + + +For information on setting up monitoring, see our [dedicated page](@/documentation/cookbook/monitoring.md) in the Cookbook section. + +## List of exported metrics + +### Garage system metrics + +#### `garage_build_info` (counter) + +Exposes the Garage version number running on a node. 
+ +``` +garage_build_info{version="1.0"} 1 +``` + +#### `garage_replication_factor` (counter) + +Exposes the Garage replication factor configured on the node. + +``` +garage_replication_factor 3 +``` + +### Metrics of the API endpoints + +#### `api_admin_request_counter` (counter) + +Counts the number of requests to a given endpoint of the administration API. Example: + +``` +api_admin_request_counter{api_endpoint="Metrics"} 127041 +``` + +#### `api_admin_request_duration` (histogram) + +Evaluates the duration of API calls to the various administration API endpoints. Example: + +``` +api_admin_request_duration_bucket{api_endpoint="Metrics",le="0.5"} 127041 +api_admin_request_duration_sum{api_endpoint="Metrics"} 605.250344830999 +api_admin_request_duration_count{api_endpoint="Metrics"} 127041 +``` + +#### `api_s3_request_counter` (counter) + +Counts the number of requests to a given endpoint of the S3 API. Example: + +``` +api_s3_request_counter{api_endpoint="CreateMultipartUpload"} 1 +``` + +#### `api_s3_error_counter` (counter) + +Counts the number of requests to a given endpoint of the S3 API that returned an error. Example: + +``` +api_s3_error_counter{api_endpoint="GetObject",status_code="404"} 39 +``` + +#### `api_s3_request_duration` (histogram) + +Evaluates the duration of API calls to the various S3 API endpoints. Example: + +``` +api_s3_request_duration_bucket{api_endpoint="CreateMultipartUpload",le="0.5"} 1 +api_s3_request_duration_sum{api_endpoint="CreateMultipartUpload"} 0.046340762 +api_s3_request_duration_count{api_endpoint="CreateMultipartUpload"} 1 +``` + +#### `api_k2v_request_counter` (counter), `api_k2v_error_counter` (counter), `api_k2v_error_duration` (histogram) + +Same as for S3, for the K2V API. 
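The histogram metrics listed above are exported as `_bucket`, `_sum` and `_count` series, so a mean request duration can be derived as sum divided by count. The following is a minimal Python sketch of that calculation, run on the sample lines quoted in the documentation. The helper names `parse_samples` and `mean_duration` are hypothetical and not part of Garage, and the parser is a simplification that assumes label values contain no `}` character.

```python
import re

# One sample line of the Prometheus text format: metric name,
# optional {labels}, then a numeric value. Simplified on purpose:
# label values containing "}" are not handled.
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(\{[^}]*\})?\s+(\S+)$')


def parse_samples(text):
    """Return a dict mapping (metric_name, raw_label_string) to a float."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        name, labels, value = match.groups()
        samples[(name, labels or "")] = float(value)
    return samples


def mean_duration(samples, metric, labels=""):
    """Mean of a histogram: <metric>_sum / <metric>_count (None if count is 0)."""
    total = samples[(metric + "_sum", labels)]
    count = samples[(metric + "_count", labels)]
    return total / count if count else None


# Sample lines copied from the api_s3_request_duration example above.
EXAMPLE = """\
api_s3_request_duration_bucket{api_endpoint="CreateMultipartUpload",le="0.5"} 1
api_s3_request_duration_sum{api_endpoint="CreateMultipartUpload"} 0.046340762
api_s3_request_duration_count{api_endpoint="CreateMultipartUpload"} 1
"""

if __name__ == "__main__":
    parsed = parse_samples(EXAMPLE)
    labels = '{api_endpoint="CreateMultipartUpload"}'
    print(mean_duration(parsed, "api_s3_request_duration", labels))
```

In a real deployment this kind of computation would more likely be done in the monitoring system's query language; the sketch only illustrates how the three series relate to one another.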
+ + +### Metrics of the Web endpoint + + +#### `web_request_counter` (counter) + +Number of requests to the web endpoint + +``` +web_request_counter{method="GET"} 80 +``` + +#### `web_request_duration` (histogram) + +Duration of requests to the web endpoint + +``` +web_request_duration_bucket{method="GET",le="0.5"} 80 +web_request_duration_sum{method="GET"} 1.0528433229999998 +web_request_duration_count{method="GET"} 80 +``` + +#### `web_error_counter` (counter) + +Number of requests to the web endpoint resulting in errors + +``` +web_error_counter{method="GET",status_code="404 Not Found"} 64 +``` + + +### Metrics of the data block manager + +#### `block_bytes_read`, `block_bytes_written` (counter) + +Number of bytes read/written to/from disk in the data storage directory. + +``` +block_bytes_read 120586322022 +block_bytes_written 3386618077 +``` + +#### `block_compression_level` (counter) + +Exposes the block compression level configured for the Garage node. + +``` +block_compression_level 3 +``` + +#### `block_read_duration`, `block_write_duration` (histograms) + +Evaluates the duration of the reading/writing of individual data blocks in the data storage directory. + +``` +block_read_duration_bucket{le="0.5"} 169229 +block_read_duration_sum 2761.6902550310056 +block_read_duration_count 169240 +block_write_duration_bucket{le="0.5"} 3559 +block_write_duration_sum 195.59170078500006 +block_write_duration_count 3571 +``` + +#### `block_delete_counter` (counter) + +Counts the number of data blocks that have been deleted from storage. + +``` +block_delete_counter 122 +``` + +#### `block_resync_counter` (counter), `block_resync_duration` (histogram) + +Counts the number of resync operations the node has executed, and evaluates their duration. 
+ +``` +block_resync_counter 308897 +block_resync_duration_bucket{le="0.5"} 308892 +block_resync_duration_sum 139.64204196100016 +block_resync_duration_count 308897 +``` + +#### `block_resync_queue_length` (gauge) + +The number of block hashes currently queued for a resync. +This is normal to be nonzero for long periods of time. + +``` +block_resync_queue_length 0 +``` + +#### `block_resync_errored_blocks` (gauge) + +The number of block hashes that we were unable to resync last time we tried. +**THIS SHOULD BE ZERO, OR FALL BACK TO ZERO RAPIDLY, IN A HEALTHY CLUSTER.** +Persistent nonzero values indicate that some data is likely to be lost. + +``` +block_resync_errored_blocks 0 +``` + + +### Metrics related to RPCs (remote procedure calls) between nodes + +#### `rpc_netapp_request_counter` (counter) + +Number of RPC requests emitted + +``` +rpc_request_counter{from="",rpc_endpoint="garage_block/manager.rs/Rpc",to=""} 176 +``` + +#### `rpc_netapp_error_counter` (counter) + +Number of communication errors (errors in the Netapp library, generally due to disconnected nodes) + +``` +rpc_netapp_error_counter{from="",rpc_endpoint="garage_block/manager.rs/Rpc",to=""} 354 +``` + +#### `rpc_timeout_counter` (counter) + +Number of RPC timeouts, should be close to zero in a healthy cluster. + +``` +rpc_timeout_counter{from="",rpc_endpoint="garage_rpc/membership.rs/SystemRpc",to=""} 1 +``` + +#### `rpc_duration` (histogram) + +The duration of internal RPC calls between Garage nodes. 
+ +``` +rpc_duration_bucket{from="",rpc_endpoint="garage_block/manager.rs/Rpc",to="",le="0.5"} 166 +rpc_duration_sum{from="",rpc_endpoint="garage_block/manager.rs/Rpc",to=""} 35.172253716 +rpc_duration_count{from="",rpc_endpoint="garage_block/manager.rs/Rpc",to=""} 174 +``` + + +### Metrics of the metadata table manager + +#### `table_gc_todo_queue_length` (gauge) + +Table garbage collector TODO queue length + +``` +table_gc_todo_queue_length{table_name="block_ref"} 0 +``` + +#### `table_get_request_counter` (counter), `table_get_request_duration` (histogram) + +Number of get/get_range requests internally made on each table, and their duration. + +``` +table_get_request_counter{table_name="bucket_alias"} 315 +table_get_request_duration_bucket{table_name="bucket_alias",le="0.5"} 315 +table_get_request_duration_sum{table_name="bucket_alias"} 0.048509778000000024 +table_get_request_duration_count{table_name="bucket_alias"} 315 +``` + + +#### `table_put_request_counter` (counter), `table_put_request_duration` (histogram) + +Number of insert/insert_many requests internally made on this table, and their duration + +``` +table_put_request_counter{table_name="block_ref"} 677 +table_put_request_duration_bucket{table_name="block_ref",le="0.5"} 677 +table_put_request_duration_sum{table_name="block_ref"} 61.617528636 +table_put_request_duration_count{table_name="block_ref"} 677 +``` + +#### `table_internal_delete_counter` (counter) + +Number of value deletions in the tree (due to GC or repartitioning) + +``` +table_internal_delete_counter{table_name="block_ref"} 2296 +``` + +#### `table_internal_update_counter` (counter) + +Number of value updates where the value actually changes (includes creation of new key and update of existing key) + +``` +table_internal_update_counter{table_name="block_ref"} 5996 +``` + +#### `table_merkle_updater_todo_queue_length` (gauge) + +Merkle tree updater TODO queue length (should fall to zero rapidly) + +``` 
+table_merkle_updater_todo_queue_length{table_name="block_ref"} 0 +``` + +#### `table_sync_items_received`, `table_sync_items_sent` (counters) + +Number of data items sent to/received from other nodes during resync procedures + +``` +table_sync_items_received{from="",table_name="bucket_v2"} 3 +table_sync_items_sent{table_name="block_ref",to=""} 2 +``` + + diff --git a/doc/book/reference-manual/s3-compatibility.md b/doc/book/reference-manual/s3-compatibility.md index dd3492a0..15b29bd1 100644 --- a/doc/book/reference-manual/s3-compatibility.md +++ b/doc/book/reference-manual/s3-compatibility.md @@ -1,6 +1,6 @@ +++ title = "S3 Compatibility status" -weight = 40 +weight = 70 +++ ## DISCLAIMER From 7f715ba94fd636c5fb9d19686e5bf9f51242df06 Mon Sep 17 00:00:00 2001 From: Alex Auvolat Date: Mon, 30 Jan 2023 18:41:04 +0100 Subject: [PATCH 3/5] zero-downtime migration procedure --- doc/book/cookbook/upgrading.md | 77 +++++++++++++++------- doc/book/development/release-process.md | 2 +- doc/book/working-documents/migration-08.md | 25 ++++++- 3 files changed, 79 insertions(+), 25 deletions(-) diff --git a/doc/book/cookbook/upgrading.md b/doc/book/cookbook/upgrading.md index 9f2ba73b..dd9974d1 100644 --- a/doc/book/cookbook/upgrading.md +++ b/doc/book/cookbook/upgrading.md @@ -6,45 +6,76 @@ weight = 60 Garage is a stateful clustered application, where all nodes are communicating together and share data structures. It makes upgrade more difficult than stateless applications so you must be more careful when upgrading. 
On a new version release, there is 2 possibilities: - - protocols and data structures remained the same ➡️ this is a **straightforward upgrade** - - protocols or data structures changed ➡️ this is an **advanced upgrade** + - protocols and data structures remained the same ➡️ this is a **minor upgrade** + - protocols or data structures changed ➡️ this is a **major upgrade** -You can quickly now what type of update you will have to operate by looking at the version identifier. -Following the [SemVer ](https://semver.org/) terminology, if only the *patch* number changed, it will only need a straightforward upgrade. -Example: an upgrade from v0.6.0 from v0.6.1 is a straightforward upgrade. -If the *minor* or *major* number changed however, you will have to do an advanced upgrade. Example: from v0.6.1 to v0.7.0. +You can quickly know what type of upgrade you will have to perform by looking at the version identifier: +when we require our users to do a major upgrade, we will always bump the first nonzero component of the version identifier +(e.g. from v0.7.2 to v0.8.0). +Conversely, for versions that only require a minor upgrade, the first nonzero component will always stay the same (e.g. from v0.8.0 to v0.8.1). -Migrations are designed to be run only between contiguous versions (from a *major*.*minor* perspective, *patches* can be skipped). -Example: migrations from v0.6.1 to v0.7.0 and from v0.6.0 to v0.7.0 are supported but migrations from v0.5.0 to v0.7.0 are not supported. +Major upgrades are designed to be run only between contiguous versions. +Example: migrations from v0.7.1 to v0.8.0 and from v0.7.0 to v0.8.2 are supported but migrations from v0.6.0 to v0.8.0 are not supported. -## Straightforward upgrades +## Minor upgrades -Straightforward upgrades do not imply cluster downtime. +Minor upgrades do not imply cluster downtime. 
Before upgrading, you should still read [the changelog](https://git.deuxfleurs.fr/Deuxfleurs/garage/releases) and ideally test your deployment on a staging cluster before. When you are ready, start by checking the health of your cluster. -You can force some checks with `garage repair`, we recommend at least running `garage repair --all-nodes --yes` that is very quick to run (less than a minute). -You will see that the command correctly terminated in the logs of your daemon. +You can force some checks with `garage repair`, we recommend at least running `garage repair --all-nodes --yes tables` which is very quick to run (less than a minute). +You will see that the command correctly terminated in the logs of your daemon, or using `garage worker list` (the repair workers should be in the `Done` state). -Finally, you can simply upgrades nodes one by one. -For each node: stop it, install the new binary, edit the configuration if needed, restart it. +Finally, you can simply upgrade nodes one by one. +For each node: stop it, install the new binary, edit the configuration if needed, restart it. -## Advanced upgrades +## Major upgrades -Advanced upgrades will imply cluster downtime. +Major upgrades can be done with minimal downtime with a bit of preparation, but the simplest way is usually to put the cluster offline for the duration of the migration. Before upgrading, you must read [the changelog](https://git.deuxfleurs.fr/Deuxfleurs/garage/releases) and you must test your deployment on a staging cluster before. -From a high level perspective, an advanced upgrade looks like this: - 1. Make sure the health of your cluster is good (see `garage repair`) - 2. Disable API access (comment the configuration in your reverse proxy) - 3. Check that your cluster is idle +We write guides for each major upgrade, they are stored under the "Working Documents" section of this documentation. 
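The version-numbering rule stated in this page (a major upgrade always bumps the first nonzero component of the version identifier, a minor upgrade leaves it unchanged) can be sketched as a small Python helper. The function names `parse_version`, `significant_prefix` and `upgrade_kind` are hypothetical and exist only for illustration. The sketch assumes plain `vX.Y.Z` identifiers and does not check that the two versions are contiguous.

```python
def parse_version(tag):
    """Turn a tag such as 'v0.8.1' into a tuple of integers, e.g. (0, 8, 1)."""
    if tag.startswith("v"):
        tag = tag[1:]
    return tuple(int(part) for part in tag.split("."))


def significant_prefix(version):
    """Return the components up to and including the first nonzero one."""
    for index, component in enumerate(version):
        if component != 0:
            return version[: index + 1]
    return version  # all components are zero


def upgrade_kind(old, new):
    """Classify an upgrade as 'minor' or 'major' following the rule above.

    Two versions sharing the same leading zeros and the same first nonzero
    component only need a minor upgrade. Anything else is treated as major.
    """
    if significant_prefix(parse_version(old)) == significant_prefix(parse_version(new)):
        return "minor"
    return "major"


if __name__ == "__main__":
    print(upgrade_kind("v0.7.2", "v0.8.0"))
    print(upgrade_kind("v0.8.0", "v0.8.1"))
```

Such a check is only a convenience for deployment scripts; the changelog and the per-version migration guide remain the reference for what a given upgrade requires.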
+ +### Major upgrades with full downtime + +From a high-level perspective, a major upgrade looks like this: + + 1. Disable API access (for instance in your reverse proxy, or by commenting the corresponding section in your Garage configuration file and restarting Garage) + 2. Check that your cluster is idle + 3. Make sure the health of your cluster is good (see `garage repair`) 4. Stop the whole cluster - 5. Backup the metadata folder of all your nodes, so that you will be able to restore it quickly if the upgrade fails (blocks being immutable, they should not be impacted) + 5. Back up the metadata folder of all your nodes, so that you will be able to restore it if the upgrade fails (data blocks being immutable, they should not be impacted) 6. Install the new binary, update the configuration 7. Start the whole cluster 8. If needed, run the corresponding migration from `garage migrate` 9. Make sure the health of your cluster is good - 10. Enable API access (uncomment the configuration in your reverse proxy) + 10. Enable API access (reverse step 1) 11. Monitor your cluster while load comes back, check that all your applications are happy with this new version -We write guides for each advanced upgrade, they are stored under the "Working Documents" section of this documentation. +### Major upgrades with minimal downtime + +There is only one operation that has to be coordinated cluster-wide: the transition from one version of the internal RPC protocol to the next. +This means that an upgrade with very limited downtime can simply be performed from one major version to the next by restarting all nodes +simultaneously in the new version. +The downtime will simply be the time required for all nodes to stop and start again, which should be less than a minute. +If all nodes fail to stop and restart simultaneously, some nodes might be temporarily shut out from the cluster as nodes using different RPC protocol +versions are prevented from talking to one another. 
+ +The entire procedure would look something like this: + +1. Make sure the health of your cluster is good (see `garage repair`) + +2. Take each node offline individually to back up its metadata folder, then bring it back online once the backup is done. + You can do all of the nodes in a single zone at once as that won't impact global cluster availability. + Do not try to make a backup of the metadata folder of a running node. + +3. Prepare your binaries and configuration files for the new Garage version + +4. Restart all nodes simultaneously in the new version + +5. If any specific migration procedure is required, it usually falls into one of two cases: + + - It can be run on online nodes after the new version has started, during regular cluster operation. + - It has to be run offline. + + For this last step, please refer to the specific documentation pertaining to the version upgrade you are doing. diff --git a/doc/book/development/release-process.md b/doc/book/development/release-process.md index f6db971a..3fed4add 100644 --- a/doc/book/development/release-process.md +++ b/doc/book/development/release-process.md @@ -11,7 +11,7 @@ We define them as our release process. While we run some tests on every commits, we do not make a release for all of them. A release can be triggered manually by "promoting" a successful build. -Otherwise, every weeks, a release build is triggered on the `main` branch. +Otherwise, every night, a release build is triggered on the `main` branch. If the build is from a tag following the regex: `v[0-9]+\.[0-9]+\.[0-9]+`, it will be listed as stable. If it is a tag but with a different format, it will be listed as Extra. 
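The tag rule quoted in the release process hunk above (tags matching `v[0-9]+\.[0-9]+\.[0-9]+` are listed as stable, other tags as Extra) can be illustrated with a short Python sketch. The function name `release_channel` is hypothetical, and requiring the regex to match the whole tag is an assumption: the text does not say whether the actual release tooling anchors the pattern.

```python
import re

# Regex taken verbatim from the release process documentation.
STABLE_TAG_RE = re.compile(r"v[0-9]+\.[0-9]+\.[0-9]+")


def release_channel(tag):
    """Return 'stable' for tags like v0.8.1 and 'extra' for any other tag.

    Assumption: the whole tag must match the regex (fullmatch), so a tag
    such as 'v0.8.0-rc1' is classified as 'extra'.
    """
    return "stable" if STABLE_TAG_RE.fullmatch(tag) else "extra"


if __name__ == "__main__":
    for tag in ("v0.8.1", "v0.8.0-rc1", "nightly"):
        print(tag, release_channel(tag))
```

Builds that do not come from a tag at all (manual promotions and the nightly build on `main`) are outside the scope of this sketch.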
diff --git a/doc/book/working-documents/migration-08.md b/doc/book/working-documents/migration-08.md index 5f97c45b..b7c4c783 100644 --- a/doc/book/working-documents/migration-08.md +++ b/doc/book/working-documents/migration-08.md @@ -12,13 +12,15 @@ back up all your data before attempting it!** Garage v0.8 introduces new data tables that allow the counting of objects in buckets in order to implement bucket quotas. A manual migration step is required to first count objects in Garage buckets and populate these tables with accurate data. +## Simple migration procedure (takes cluster offline for a while) + The migration steps are as follows: 1. Disable API and web access. Garage v0.7 does not support disabling these endpoints but you can change the port number or stop your reverse proxy for instance. 2. Do `garage repair --all-nodes --yes tables` and `garage repair --all-nodes --yes blocks`, check the logs and check that all data seems to be synced correctly between - nodes. If you have time, do additional checks (`scrub`, `block_refs`, etc.) + nodes. If you have time, do additional checks (`versions`, `block_refs`, etc.) 3. Check that queues are empty: run `garage stats` to query them or inspect metrics in the Grafana dashboard. 4. Turn off Garage v0.7 5. **Backup the metadata folder of all your nodes!** For instance, use the following command @@ -32,3 +34,24 @@ The migration steps are as follows: 10. Your upgraded cluster should be in a working state. Re-enable API and Web access and check that everything went well. 11. Monitor your cluster in the next hours to see if it works well under your production load, report any issue. + +## Minimal downtime migration procedure + +The migration to Garage v0.8 can be done with almost no downtime, +by restarting all nodes at once in the new version. 
The only limitation with this +method is that bucket sizes and item counts will not be estimated correctly +until all nodes have had a chance to run their offline migration procedure. + +The migration steps are as follows: + +1. Do `garage repair --all-nodes --yes tables` and `garage repair --all-nodes --yes blocks`, + check the logs and check that all data seems to be synced correctly between + nodes. If you have time, do additional checks (`versions`, `block_refs`, etc.) + +2. Turn off each node individually; back up its metadata folder (see above); turn it back on again. This will allow you to take a backup of all nodes without impacting global cluster availability. You can do all nodes of a single zone at once as this does not impact the availability of Garage. + +3. Prepare your binaries and configuration files for Garage v0.8 + +4. Shut down all v0.7 nodes simultaneously, and restart them all simultaneously in v0.8. Use your favorite deployment tool (Ansible, Kubernetes, Nomad) to achieve this as fast as possible. + +5. At this point, Garage will indicate invalid values for the size and number of objects in each bucket (most likely, it will indicate zero). To fix this, take each node offline individually to do the offline migration step: `garage offline-repair --yes object_counters`. Again you can do all nodes of a single zone at once. From 2ba9463a8acc86b18f5eb483e3184c789bbd78df Mon Sep 17 00:00:00 2001 From: Alex Auvolat Date: Mon, 30 Jan 2023 18:48:00 +0100 Subject: [PATCH 4/5] Raw links to presentations --- doc/book/design/_index.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/book/design/_index.md b/doc/book/design/_index.md index b54b4f8e..efef0b6e 100644 --- a/doc/book/design/_index.md +++ b/doc/book/design/_index.md @@ -20,9 +20,9 @@ and could not do, etc. 
We love to talk and hear about Garage, that's why we keep a log here: - - [(en, 2023-01-18) Presentation of Garage at Inria](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/commit/4cff37397f626ef063dad29e5b5e97ab1206015d/doc/talks/2023-01-18-tocatta/talk.pdf) + - [(en, 2023-01-18) Presentation of Garage at Inria](https://git.deuxfleurs.fr/Deuxfleurs/garage/raw/commit/4cff37397f626ef063dad29e5b5e97ab1206015d/doc/talks/2023-01-18-tocatta/talk.pdf) - - [(fr, 2022-11-19) De l'auto-hébergement à l'entre-hébergement : Garage, pour conserver ses données ensemble](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/commit/4cff37397f626ef063dad29e5b5e97ab1206015d/doc/talks/2022-11-19-Capitole-du-Libre/pr%C3%A9sentation.pdf) + - [(fr, 2022-11-19) De l'auto-hébergement à l'entre-hébergement : Garage, pour conserver ses données ensemble](https://git.deuxfleurs.fr/Deuxfleurs/garage/raw/commit/4cff37397f626ef063dad29e5b5e97ab1206015d/doc/talks/2022-11-19-Capitole-du-Libre/pr%C3%A9sentation.pdf) - [(fr, 2021-11-13, video) Garage : Mille et une façons de stocker vos données](https://video.tedomum.net/w/moYKcv198dyMrT8hCS5jz9) and [slides (html)](https://rfid.deuxfleurs.fr/presentations/2021-11-13/garage/) - during [RFID#1](https://rfid.deuxfleurs.fr/programme/2021-11-13/) event From 8013a5cd584de891f1fa0099f775954ba5bdd82d Mon Sep 17 00:00:00 2001 From: Alex Auvolat Date: Mon, 30 Jan 2023 18:50:38 +0100 Subject: [PATCH 5/5] Change talk links more --- doc/book/design/_index.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/doc/book/design/_index.md b/doc/book/design/_index.md index efef0b6e..50933139 100644 --- a/doc/book/design/_index.md +++ b/doc/book/design/_index.md @@ -20,14 +20,16 @@ and could not do, etc. 
We love to talk and hear about Garage, that's why we keep a log here: - - [(en, 2023-01-18) Presentation of Garage at Inria](https://git.deuxfleurs.fr/Deuxfleurs/garage/raw/commit/4cff37397f626ef063dad29e5b5e97ab1206015d/doc/talks/2023-01-18-tocatta/talk.pdf) + - [(en, 2023-01-18) Presentation of Garage with some details on CRDTs and data partitioning among nodes](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/commit/4cff37397f626ef063dad29e5b5e97ab1206015d/doc/talks/2023-01-18-tocatta/talk.pdf) - - [(fr, 2022-11-19) De l'auto-hébergement à l'entre-hébergement : Garage, pour conserver ses données ensemble](https://git.deuxfleurs.fr/Deuxfleurs/garage/raw/commit/4cff37397f626ef063dad29e5b5e97ab1206015d/doc/talks/2022-11-19-Capitole-du-Libre/pr%C3%A9sentation.pdf) + - [(fr, 2022-11-19) De l'auto-hébergement à l'entre-hébergement : Garage, pour conserver ses données ensemble](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/commit/4cff37397f626ef063dad29e5b5e97ab1206015d/doc/talks/2022-11-19-Capitole-du-Libre/pr%C3%A9sentation.pdf) + + - [(en, 2022-06-23) General presentation of Garage](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/commit/4cff37397f626ef063dad29e5b5e97ab1206015d/doc/talks/2022-06-23-stack/talk.pdf) - [(fr, 2021-11-13, video) Garage : Mille et une façons de stocker vos données](https://video.tedomum.net/w/moYKcv198dyMrT8hCS5jz9) and [slides (html)](https://rfid.deuxfleurs.fr/presentations/2021-11-13/garage/) - during [RFID#1](https://rfid.deuxfleurs.fr/programme/2021-11-13/) event - - [(en, 2021-04-28) Distributed object storage is centralised](https://git.deuxfleurs.fr/Deuxfleurs/garage/raw/commit/b1f60579a13d3c5eba7f74b1775c84639ea9b51a/doc/talks/2021-04-28_spirals-team/talk.pdf) + - [(en, 2021-04-28) Distributed object storage is centralised](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/commit/b1f60579a13d3c5eba7f74b1775c84639ea9b51a/doc/talks/2021-04-28_spirals-team/talk.pdf) - - [(fr, 2020-12-02) Garage : jouer dans la cour des grands quand on 
est un hébergeur associatif](https://git.deuxfleurs.fr/Deuxfleurs/garage/raw/commit/b1f60579a13d3c5eba7f74b1775c84639ea9b51a/doc/talks/2020-12-02_wide-team/talk.pdf) + - [(fr, 2020-12-02) Garage : jouer dans la cour des grands quand on est un hébergeur associatif](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/commit/b1f60579a13d3c5eba7f74b1775c84639ea9b51a/doc/talks/2020-12-02_wide-team/talk.pdf)