diff --git a/doc/book/src/SUMMARY.md b/doc/book/src/SUMMARY.md index 18fad2c..b88ebb4 100644 --- a/doc/book/src/SUMMARY.md +++ b/doc/book/src/SUMMARY.md @@ -3,12 +3,13 @@ [The Garage Data Store](./intro.md) - [Getting Started](./getting_started/index.md) - - [Get a binary](./getting_started/binary.md) - - [Configure the daemon](./getting_started/daemon.md) - - [Control the daemon](./getting_started/control.md) - - [Configure a cluster](./getting_started/cluster.md) - - [Create buckets and keys](./getting_started/bucket.md) - - [Handle files](./getting_started/files.md) + - [Get a binary](./getting_started/01_binary.md) + - [Configuring a test deployment](./getting_started/02_test_deployment.md) + - [Configure a real-world deployment](./getting_started/03_real_world_deployment.md) + - [Control the daemon](./getting_started/04_control.md) + - [Configure a cluster](./getting_started/05_cluster.md) + - [Create buckets and keys](./getting_started/06_bucket.md) + - [Handle files](./getting_started/07_files.md) - [Cookbook](./cookbook/index.md) - [Host a website](./cookbook/website.md) @@ -17,7 +18,8 @@ - [Recovering from failures](./cookbook/recovering.md) - [Reference Manual](./reference_manual/index.md) - - [Garage CLI]() + - [Garage configuration file](./reference_manual/configuration.md) + - [Garage CLI](./reference_manual/cli.md) - [S3 API](./reference_manual/s3_compatibility.md) - [Design](./design/index.md) diff --git a/doc/book/src/cookbook/website.md b/doc/book/src/cookbook/website.md index 2ea82a9..b3dd1b5 100644 --- a/doc/book/src/cookbook/website.md +++ b/doc/book/src/cookbook/website.md @@ -1 +1,3 @@ # Host a website + +TODO diff --git a/doc/book/src/getting_started/binary.md b/doc/book/src/getting_started/01_binary.md similarity index 89% rename from doc/book/src/getting_started/binary.md rename to doc/book/src/getting_started/01_binary.md index e48500a..2719d95 100644 --- a/doc/book/src/getting_started/binary.md +++ 
b/doc/book/src/getting_started/01_binary.md @@ -7,14 +7,14 @@ We did not test other architecture/operating system but, as long as your archite ## From Docker Our docker image is currently named `lxpz/garage_amd64` and is stored on the [Docker Hub](https://hub.docker.com/r/lxpz/garage_amd64/tags?page=1&ordering=last_updated). -We encourage you to use a fixed tag (eg. `v0.2.1`) and not the `latest` tag. -For this example, we will use the latest published version at the time of the writing which is `v0.2.1` but it's up to you +We encourage you to use a fixed tag (eg. `v0.3.0`) and not the `latest` tag. +For this example, we will use the latest published version at the time of the writing which is `v0.3.0` but it's up to you to check [the most recent versions on the Docker Hub](https://hub.docker.com/r/lxpz/garage_amd64/tags?page=1&ordering=last_updated). For example: ``` -sudo docker pull lxpz/garage_amd64:v0.2.1 +sudo docker pull lxpz/garage_amd64:v0.3.0 ``` ## From source diff --git a/doc/book/src/getting_started/02_test_deployment.md b/doc/book/src/getting_started/02_test_deployment.md new file mode 100644 index 0000000..16f40dc --- /dev/null +++ b/doc/book/src/getting_started/02_test_deployment.md @@ -0,0 +1,107 @@ +# Configuring a test deployment + +This section describes how to run a simple test Garage deployment with a single node. +Note that this kind of deployment should not be used in production, as it provides +no redundancy for your data! +We will also skip intra-cluster TLS configuration, meaning that if you add nodes +to your cluster, communication between them will not be secure. + +First, make sure that you have Garage installed in your command line environment. +We will explain how to launch Garage in a Docker container, however we still +recommend that you install the `garage` CLI on your host system in order to control +the daemon. 
+ +## Writing a first configuration file + +This first configuration file should allow you to get started easily with the simplest +possible Garage deployment: + +```toml +metadata_dir = "/tmp/meta" +data_dir = "/tmp/data" + +replication_mode = "none" + +rpc_bind_addr = "[::]:3901" + +bootstrap_peers = [] + +[s3_api] +s3_region = "garage" +api_bind_addr = "[::]:3900" + +[s3_web] +bind_addr = "[::]:3902" +root_domain = ".web.garage" +index = "index.html" +``` + +Save your configuration file as `garage.toml`. + +As you can see in the `metadata_dir` and `data_dir` parameters, we are saving Garage's data +in `/tmp` which gets erased when your system reboots. This means that data stored on this +Garage server will not be persistent. Change these to locations on your HDD if you want +your data to be persisted properly. + +## Launching the Garage server + +#### Option 1: directly (without Docker) + +Use the following command to launch the Garage server with our configuration file: + +``` +garage server -c garage.toml +``` + +By default, Garage displays almost no output. You can tune Garage's verbosity as follows +(from less verbose to more verbose): + +``` +RUST_LOG=garage=info garage server -c garage.toml +RUST_LOG=garage=debug garage server -c garage.toml +RUST_LOG=garage=trace garage server -c garage.toml +``` + +Log level `info` is recommended for most use cases. +Log level `debug` can help you check why your S3 API calls are not working. + +#### Option 2: in a Docker container + +Use the following command to start Garage in a docker container: + +``` +docker run -d \ + -p 3901:3901 -p 3902:3902 -p 3900:3900 \ + -v ./config.toml:/garage/config.toml \ + lxpz/garage_amd64:v0.3.0 +``` + +To tune Garage's verbosity level, set the `RUST_LOG` environment variable in the configuration +at launch time. 
For instance: + +``` +docker run -d \ + -p 3901:3901 -p 3902:3902 -p 3900:3900 \ + -v ./config.toml:/garage/config.toml \ + -e RUST_LOG=garage=info \ + lxpz/garage_amd64:v0.3.0 +``` + +## Checking that Garage runs correctly + +The `garage` utility is also used as a CLI tool to configure your Garage deployment. +It tries to connect to a Garage server through the RPC protocol, by default looking +for a Garage server at `localhost:3901`. + +Since our deployment already binds to port 3901, the following command should be sufficient +to show Garage's status, provided that you installed the `garage` binary on your host system: + +``` +garage status +``` + +Move on to [controlling the Garage daemon](04_control.md) to learn more about how to +use the Garage CLI to control your cluster. + +Move on to [configuring your cluster](05_cluster.md) in order to configure +your single-node deployment for actual use! diff --git a/doc/book/src/getting_started/03_real_world_deployment.md b/doc/book/src/getting_started/03_real_world_deployment.md new file mode 100644 index 0000000..81b929c --- /dev/null +++ b/doc/book/src/getting_started/03_real_world_deployment.md @@ -0,0 +1,154 @@ +# Configuring a real-world Garage deployment + +To run Garage in cluster mode, we recommend having at least 3 nodes. +This will allow you to set up Garage for three-way replication of your data, +the safest and most available mode. + +## Generating a TLS Certificate + +You first need to generate TLS certificates to encrypt traffic between Garage nodes +(referred to as RPC traffic). + +To generate your TLS certificates, run on your machine: + +``` +wget https://git.deuxfleurs.fr/Deuxfleurs/garage/raw/branch/master/genkeys.sh +chmod +x genkeys.sh +./genkeys.sh +``` + +It will create a folder named `pki/` containing the keys that you will use for the cluster. 
+ +## Real-world deployment + +To run a real-world deployment, make sure the following conditions are met: + +- You have at least three machines with sufficient storage space available + +- Each machine has a public IP address which is reachable by other machines. + Running behind a NAT is possible, but having several Garage nodes behind a single NAT + is slightly more involved as each will have to have a different RPC port number + (the local port number of a node must be the same as the port number exposed publicly + by the NAT). + +- Ideally, each machine should have an SSD available in addition to the HDD you are dedicating + to Garage. This will allow for faster access to metadata and has the potential + to drastically reduce Garage's response times. + +Before deploying garage on your infrastructure, you must inventory your machines. +For our example, we will suppose the following infrastructure with IPv6 connectivity: + +| Location | Name | IP Address | Disk Space | +|----------|---------|------------|------------| +| Paris | Mercury | fc00:1::1 | 1 To | +| Paris | Venus | fc00:1::2 | 2 To | +| London | Earth | fc00:B::1 | 2 To | +| Brussels | Mars | fc00:F::1 | 1.5 To | + + +On each machine, we will have a similar setup; +in particular, you must consider the following folders/files: + + - `/etc/garage/config.toml`: Garage daemon's configuration (see below) + - `/etc/garage/pki/`: Folder containing Garage certificates, must be generated on your computer and copied to the servers + - `/var/lib/garage/meta/`: Folder containing Garage's metadata, put this folder on an SSD if possible + - `/var/lib/garage/data/`: Folder containing Garage's data, this folder will grow and must be on large storage, possibly big HDDs. 
+ - `/etc/systemd/system/garage.service`: Service file to start garage at boot automatically (defined below, not required if you use docker) + +A valid `/etc/garage/config.toml` for our cluster would be: + +```toml +metadata_dir = "/var/lib/garage/meta" +data_dir = "/var/lib/garage/data" + +replication_mode = "3" + +rpc_bind_addr = "[::]:3901" + +bootstrap_peers = [ + "[fc00:1::1]:3901", + "[fc00:1::2]:3901", + "[fc00:B::1]:3901", + "[fc00:F::1]:3901", +] + +[rpc_tls] +ca_cert = "/etc/garage/pki/garage-ca.crt" +node_cert = "/etc/garage/pki/garage.crt" +node_key = "/etc/garage/pki/garage.key" + +[s3_api] +s3_region = "garage" +api_bind_addr = "[::]:3900" + +[s3_web] +bind_addr = "[::]:3902" +root_domain = ".web.garage" +index = "index.html" +``` + +Please make sure to change `bootstrap_peers` to **your** IP addresses! + +Check the [configuration file reference documentation](../reference_manual/configuration.md) +to learn more about all available configuration options. + +### For docker users + +On each machine, you can run the daemon with: + +```bash +docker run \ + -d \ + --name garaged \ + --restart always \ + --network host \ + -v /etc/garage/pki:/etc/garage/pki \ + -v /etc/garage/config.toml:/garage/config.toml \ + -v /var/lib/garage/meta:/var/lib/garage/meta \ + -v /var/lib/garage/data:/var/lib/garage/data \ + lxpz/garage_amd64:v0.3.0 +``` + +It should be restarted automatically at each reboot. +Please note that we use host networking as otherwise Docker containers +cannot communicate over IPv6. + +Upgrading between Garage versions should be supported transparently, +but please check the release notes before doing so! +To upgrade, simply stop and remove this container and +run the command again with a new version of Garage. 
+ +### For systemd/raw binary users + +Create a file named `/etc/systemd/system/garage.service`: + +```toml +[Unit] +Description=Garage Data Store +After=network-online.target +Wants=network-online.target + +[Service] +Environment='RUST_LOG=garage=info' 'RUST_BACKTRACE=1' +ExecStart=/usr/local/bin/garage server -c /etc/garage/config.toml + +[Install] +WantedBy=multi-user.target +``` + +To start the service then automatically enable it at boot: + +```bash +sudo systemctl start garage +sudo systemctl enable garage +``` + +To see if the service is running and to browse its logs: + +```bash +sudo systemctl status garage +sudo journalctl -u garage +``` + +If you want to modify the service file, do not forget to run `systemctl daemon-reload` +to inform `systemd` of your modifications. diff --git a/doc/book/src/getting_started/control.md b/doc/book/src/getting_started/04_control.md similarity index 80% rename from doc/book/src/getting_started/control.md rename to doc/book/src/getting_started/04_control.md index 9a66a0d..018d326 100644 --- a/doc/book/src/getting_started/control.md +++ b/doc/book/src/getting_started/04_control.md @@ -6,8 +6,9 @@ The `garage` binary has two purposes: In this section, we will see how to use the `garage` binary as a control tool for the daemon we just started. 
You first need to get a shell having access to this binary, which depends of your configuration: - - with `docker-compose`, run `sudo docker-compose exec g1 bash` then `/garage/garage` - - with `docker`, run `sudo docker exec -ti garaged bash` then `/garage/garage` + + - with `docker`, run `sudo docker exec -ti garaged bash`, you will now have a shell + where the Garage binary is available as `/garage/garage` - with `systemd`, simply run `/usr/local/bin/garage` if you followed previous instructions *You can also install the binary on your machine to remotely control the cluster.* @@ -27,14 +28,12 @@ The 3 first ones are certificates and keys needed by TLS, the last one is simply Because we configure garage directly from the server, we do not need to set `--rpc-host`. To avoid typing the 3 first options each time we want to run a command, we will create an alias. -### `docker-compose` alias +### test deployment + +If you have simply deployed Garage on your local machine, without TLS, you can invoke +`garage` directly without any of these parameters and without making a `garagectl` alias +(replace mentions of `garagectl` in the next sections by `garage`). -```bash -alias garagectl='/garage/garage \ - --ca-cert /pki/garage-ca.crt \ - --client-cert /pki/garage.crt \ - --client-key /pki/garage.key' -``` ### `docker` alias @@ -45,7 +44,6 @@ alias garagectl='/garage/garage \ --client-key /etc/garage/pki/garage.key' ``` - ### raw binary alias ```bash @@ -74,4 +72,4 @@ Healthy nodes: 8781c50c410a41b3… 758338dde686 [::ffff:172.20.0.102]:3901 UNCONFIGURED/REMOVED ``` -...which means that you are ready to configure your cluster! +...which means that you are ready to [configure your cluster](05_cluster.md)! 
diff --git a/doc/book/src/getting_started/cluster.md b/doc/book/src/getting_started/05_cluster.md similarity index 68% rename from doc/book/src/getting_started/cluster.md rename to doc/book/src/getting_started/05_cluster.md index c9c1868..83beb66 100644 --- a/doc/book/src/getting_started/cluster.md +++ b/doc/book/src/getting_started/05_cluster.md @@ -7,7 +7,7 @@ as well as the site (think datacenter) of each machine. ## Test cluster -As this part is not relevant for a test cluster, you can use this one-liner to create a basic topology: +As this part is not relevant for a test cluster, you can use this three-liner to create a basic topology: ```bash garagectl status | grep UNCONFIGURED | grep -Po '^[0-9a-f]+' | while read id; do @@ -19,7 +19,7 @@ done For our example, we will suppose we have the following infrastructure (Capacity, Identifier and Datacenter are specific values to garage described in the following): -| Location | Name | Disk Space | `Capacity` | `Identifier` | `Datacenter` | +| Location | Name | Disk Space | `Capacity` | `Identifier` | `Zone` | |----------|---------|------------|------------|--------------|--------------| | Paris | Mercury | 1 To | `2` | `8781c5` | `par1` | | Paris | Venus | 2 To | `4` | `2a638e` | `par1` | @@ -45,6 +45,15 @@ garagectl status It will display the IP address associated with each node; from the IP address you will be able to recognize the node. +### Zones + +A zone is simply a user-chosen identifier for a group of servers that are grouped together logically. +It is up to the system administrator deploying Garage to decide what "grouped together" means. + +In most cases, a zone will correspond to a geographical location (i.e. a datacenter). +Behind the scenes, Garage uses zone definitions to try to store copies of the same data in different zones, +in order to provide high availability despite the failure of a zone. 
+ +### Capacity Garage reasons on an arbitrary metric about disk storage that is named the *capacity* of a node. @@ -55,19 +64,19 @@ Additionaly, the capacity values used in Garage should be as small as possible, Here we chose that 1 unit of capacity = 0.5 To, so that we can express servers of size 1 To and 2 To, as wel as the intermediate size 1.5 To. -### Datacenter - -Datacenter are simply a user-chosen identifier that identify a group of server that are located in the same place. -It is up to the system administrator deploying garage to identify what does "the same place" means. -Behind the scene, garage will try to store the same data on different sites to provide high availability despite a data center failure. +Note that the amount of data stored by Garage on each server may not be strictly proportional to +its capacity value, as Garage will prioritize having 3 copies of data in different zones, +even if this means that capacities will not be strictly respected. For instance, in the example above, +nodes Earth and Mars will each always store a copy of everything, and the third copy will +have a 66% chance of being stored by Venus and a 33% chance of being stored by Mercury. 
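The 66% and 33% figures in the paragraph above can be reproduced with a small calculation. This is only a sketch: it assumes that the third copy is split between the two `par1` nodes in proportion to their capacity values (`4` for Venus, `2` for Mercury, as in the table), which is what the text suggests but not a statement about Garage's actual placement algorithm. The variable names are made up for the illustration.

```bash
# Capacities of the two nodes in zone par1, taken from the example table.
venus=4
mercury=2
total=$((venus + mercury))

# Assumed rule: the share of the third copy is proportional to capacity.
# Shell integer division truncates, so 66.67 becomes 66 and 33.33 becomes 33.
venus_pct=$((100 * venus / total))
mercury_pct=$((100 * mercury / total))

echo "venus: ${venus_pct}%  mercury: ${mercury_pct}%"
```

Earth and Mars do not appear in the calculation because each is alone in its zone, so each always receives one of the three copies.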
### Inject the topology Given the information above, we will configure our cluster as follow: ``` -garagectl node configure --datacenter par1 -c 2 -t mercury 8781c5 -garagectl node configure --datacenter par1 -c 4 -t venus 2a638e -garagectl node configure --datacenter lon1 -c 4 -t earth 68143d -garagectl node configure --datacenter bru1 -c 3 -t mars 212f75 +garagectl node configure -z par1 -c 2 -t mercury 8781c5 +garagectl node configure -z par1 -c 4 -t venus 2a638e +garagectl node configure -z lon1 -c 4 -t earth 68143d +garagectl node configure -z bru1 -c 3 -t mars 212f75 ``` diff --git a/doc/book/src/getting_started/bucket.md b/doc/book/src/getting_started/06_bucket.md similarity index 100% rename from doc/book/src/getting_started/bucket.md rename to doc/book/src/getting_started/06_bucket.md diff --git a/doc/book/src/getting_started/files.md b/doc/book/src/getting_started/07_files.md similarity index 88% rename from doc/book/src/getting_started/files.md rename to doc/book/src/getting_started/07_files.md index 0e3939c..cdd5d94 100644 --- a/doc/book/src/getting_started/files.md +++ b/doc/book/src/getting_started/07_files.md @@ -4,6 +4,9 @@ We recommend the use of MinIO Client to interact with Garage files (`mc`). Instructions to install it and use it are provided on the [MinIO website](https://docs.min.io/docs/minio-client-quickstart-guide.html). Before reading the following, you need a working `mc` command on your path. +Note that on certain Linux distributions such as Arch Linux, the Minio client binary +is called `mcli` instead of `mc` (to avoid name clashes with the Midnight Commander). + ## Configure `mc` You need your access key and secret key created in the [previous section](bucket.md). 
diff --git a/doc/book/src/getting_started/daemon.md b/doc/book/src/getting_started/daemon.md deleted file mode 100644 index 0f45dae..0000000 --- a/doc/book/src/getting_started/daemon.md +++ /dev/null @@ -1,222 +0,0 @@ -# Configure the daemon - -Garage is a software that can be run only in a cluster and requires at least 3 instances. -In our getting started guide, we document two deployment types: - - [Test deployment](#test-deployment) though `docker-compose` - - [Real-world deployment](#real-world-deployment) through `docker` or `systemd` - -In any case, you first need to generate TLS certificates, as traffic is encrypted between Garage's nodes. - -## Generating a TLS Certificate - -To generate your TLS certificates, run on your machine: - -``` -wget https://git.deuxfleurs.fr/Deuxfleurs/garage/raw/branch/master/genkeys.sh -chmod +x genkeys.sh -./genkeys.sh -``` - -It will creates a folder named `pki` containing the keys that you will used for the cluster. - -## Test deployment - -Single machine deployment is only described through `docker-compose`. - -Before starting, we recommend you create a folder for our deployment: - -```bash -mkdir garage-single -cd garage-single -``` - -We start by creating a file named `docker-compose.yml` describing our network and our containers: - -```yml -version: '3.4' - -networks: { virtnet: { ipam: { config: [ subnet: 172.20.0.0/24 ]}}} - -services: - g1: - image: lxpz/garage_amd64:v0.1.1d - networks: { virtnet: { ipv4_address: 172.20.0.101 }} - volumes: - - "./pki:/pki" - - "./config.toml:/garage/config.toml" - - g2: - image: lxpz/garage_amd64:v0.1.1d - networks: { virtnet: { ipv4_address: 172.20.0.102 }} - volumes: - - "./pki:/pki" - - "./config.toml:/garage/config.toml" - - g3: - image: lxpz/garage_amd64:v0.1.1d - networks: { virtnet: { ipv4_address: 172.20.0.103 }} - volumes: - - "./pki:/pki" - - "./config.toml:/garage/config.toml" -``` - -*We define a static network here which is not considered as a best practise on Docker. 
-The rational is that Garage only supports IP address and not domain names in its configuration, so we need to know the IP address in advance.* - -and then create the `config.toml` file next to it as follow: - -```toml -metadata_dir = "/garage/meta" -data_dir = "/garage/data" -rpc_bind_addr = "[::]:3901" -bootstrap_peers = [ - "172.20.0.101:3901", - "172.20.0.102:3901", - "172.20.0.103:3901", -] - -[rpc_tls] -ca_cert = "/pki/garage-ca.crt" -node_cert = "/pki/garage.crt" -node_key = "/pki/garage.key" - -[s3_api] -s3_region = "garage" -api_bind_addr = "[::]:3900" - -[s3_web] -bind_addr = "[::]:3902" -root_domain = ".web.garage" -index = "index.html" -``` - -*Please note that we have not mounted `/garage/meta` or `/garage/data` on the host: data will be lost when the container will be destroyed.* - -And that's all, you are ready to launch your cluster! - -``` -sudo docker-compose up -``` - -While your daemons are up, your cluster is still not configured yet. -However, you can check that your services are still listening as expected by querying them from your host: - -```bash -curl http://172.20.0.{101,102,103}:3902 -``` - -which should give you: - -``` -Not found -Not found -Not found -``` - -That's all, you are ready to [configure your cluster!](./cluster.md). - -## Real-world deployment - -Before deploying garage on your infrastructure, you must inventory your machines. 
-For our example, we will suppose the following infrastructure: - -| Location | Name | IP Address | Disk Space | -|----------|---------|------------|------------| -| Paris | Mercury | fc00:1::1 | 1 To | -| Paris | Venus | fc00:1::2 | 2 To | -| London | Earth | fc00:B::1 | 2 To | -| Brussels | Mars | fc00:F::1 | 1.5 To | - -On each machine, we will have a similar setup, especially you must consider the following folders/files: - - `/etc/garage/pki`: Garage certificates, must be generated on your computer and copied on the servers - - `/etc/garage/config.toml`: Garage daemon's configuration (defined below) - - `/etc/systemd/system/garage.service`: Service file to start garage at boot automatically (defined below, not required if you use docker) - - `/var/lib/garage/meta`: Contains Garage's metadata, put this folder on a SSD if possible - - `/var/lib/garage/data`: Contains Garage's data, this folder will grows and must be on a large storage, possibly big HDDs. - -A valid `/etc/garage/config.toml` for our cluster would be: - -```toml -metadata_dir = "/var/lib/garage/meta" -data_dir = "/var/lib/garage/data" -rpc_bind_addr = "[::]:3901" -bootstrap_peers = [ - "[fc00:1::1]:3901", - "[fc00:1::2]:3901", - "[fc00:B::1]:3901", - "[fc00:F::1]:3901", -] - -[rpc_tls] -ca_cert = "/etc/garage/pki/garage-ca.crt" -node_cert = "/etc/garage/pki/garage.crt" -node_key = "/etc/garage/pki/garage.key" - -[s3_api] -s3_region = "garage" -api_bind_addr = "[::]:3900" - -[s3_web] -bind_addr = "[::]:3902" -root_domain = ".web.garage" -index = "index.html" -``` - -Please make sure to change `bootstrap_peers` to **your** IP addresses! 
- -### For docker users - -On each machine, you can run the daemon with: - -```bash -docker run \ - -d \ - --name garaged \ - --restart always \ - --network host \ - -v /etc/garage/pki:/etc/garage/pki \ - -v /etc/garage/config.toml:/garage/config.toml \ - -v /var/lib/garage/meta:/var/lib/garage/meta \ - -v /var/lib/garage/data:/var/lib/garage/data \ - lxpz/garage_amd64:v0.1.1d -``` - -It should be restart automatically at each reboot. -Please note that we use host networking as otherwise Docker containers can no communicate with IPv6. - -To upgrade, simply stop and remove this container and start again the command with a new version of garage. - -### For systemd/raw binary users - -Create a file named `/etc/systemd/system/garage.service`: - -```toml -[Unit] -Description=Garage Data Store -After=network-online.target -Wants=network-online.target - -[Service] -Environment='RUST_LOG=garage=info' 'RUST_BACKTRACE=1' -ExecStart=/usr/local/bin/garage server -c /etc/garage/config.toml - -[Install] -WantedBy=multi-user.target -``` - -To start the service then automatically enable it at boot: - -```bash -sudo systemctl start garage -sudo systemctl enable garage -``` - -To see if the service is running and to browse its logs: - -```bash -sudo systemctl status garage -sudo journalctl -u garage -``` - -If you want to modify the service file, do not forget to run `systemctl daemon-reload` -to inform `systemd` of your modifications. diff --git a/doc/book/src/reference_manual/cli.md b/doc/book/src/reference_manual/cli.md new file mode 100644 index 0000000..80789b9 --- /dev/null +++ b/doc/book/src/reference_manual/cli.md @@ -0,0 +1,4 @@ +# Garage CLI + +The Garage CLI is mostly self-documented. Make use of the `help` subcommand +and the `--help` flag to discover all available options. 
diff --git a/doc/book/src/reference_manual/configuration.md b/doc/book/src/reference_manual/configuration.md new file mode 100644 index 0000000..6c8d5eb --- /dev/null +++ b/doc/book/src/reference_manual/configuration.md @@ -0,0 +1,196 @@ +# Garage configuration file format reference + +Here is an example `garage.toml` configuration file that illustrates all of the possible options: + +```toml +metadata_dir = "/var/lib/garage/meta" +data_dir = "/var/lib/garage/data" + +block_size = 1048576 + +replication_mode = "3" + +rpc_bind_addr = "[::]:3901" + +bootstrap_peers = [ + "[fc00:1::1]:3901", + "[fc00:1::2]:3901", + "[fc00:B::1]:3901", + "[fc00:F::1]:3901", +] + +consul_host = "consul.service" +consul_service_name = "garage-daemon" + +max_concurrent_rpc_requests = 12 + +sled_cache_capacity = 134217728 +sled_flush_every_ms = 2000 + +[rpc_tls] +ca_cert = "/etc/garage/pki/garage-ca.crt" +node_cert = "/etc/garage/pki/garage.crt" +node_key = "/etc/garage/pki/garage.key" + +[s3_api] +s3_region = "garage" +api_bind_addr = "[::]:3900" + +[s3_web] +bind_addr = "[::]:3902" +root_domain = ".web.garage" +index = "index.html" +``` + +The following gives details about each available configuration option. + +## Available configuration options + +#### `metadata_dir` + +The directory in which Garage will store its metadata. This contains the node identifier, +the network configuration and the peer list, the list of buckets and keys as well +as the index of all objects, object version and object blocks. + +Store this folder on a fast SSD drive if possible to maximize Garage's performance. + +#### `data_dir` + +The directory in which Garage will store the data blocks of objects. +This folder can be placed on an HDD. The space available for `data_dir` +should be counted to determine a node's capacity +when [configuring it](../getting_started/05_cluster.md). 
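As a rough illustration of how the space available for `data_dir` translates into a capacity value, the sketch below uses the convention chosen in the getting started guide (1 unit of capacity = 0.5 To). The helper name `capacity_units`, the use of gigabytes, and the assumption that 1 To = 1000 GB are all choices made for this example only.

```bash
# Hypothetical helper: convert usable disk space (in GB) into capacity units,
# assuming 1 unit of capacity = 0.5 To = 500 GB.
unit_gb=500

capacity_units() {
  # Integer division: a remainder smaller than one unit is dropped.
  echo $(( $1 / unit_gb ))
}

capacity_units 1000   # Mercury, 1 To
capacity_units 2000   # Venus and Earth, 2 To
capacity_units 1500   # Mars, 1.5 To
```

With these inputs the helper gives the same values as the `Capacity` column of the example cluster (`2`, `4` and `3`).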
+ +#### `block_size` + +Garage splits stored objects into consecutive chunks of size `block_size` (except the last +one which might be smaller). The default size is 1MB and should work in most cases. +If you are interested in tuning this, feel free to do so (and remember to report your +findings to us!) + +#### `replication_mode` + +Garage supports the following replication modes: + +- `none` or `1`: data stored on Garage is stored on a single node. There is no redundancy, + and data will be unavailable as soon as one node fails or its network is disconnected. + Do not use this for anything other than test deployments. + +- `2`: data stored on Garage will be stored on two different nodes, if possible in different + zones. Garage tolerates one node failure before losing data. Data should be available + read-only when one node is down, but write operations will fail. + Use this only if you really have to. + +- `3`: data stored on Garage will be stored on three different nodes, if possible each in + a different zone. + Garage tolerates two node failures before losing data. Data should be available + read-only when two nodes are down, and writes should be possible if only a single node + is down. + +Note that in modes `2` and `3`, +if at least the same number of zones are available, an arbitrary number of failures in +any given zone is tolerated as copies of data will be spread over several zones. + +**Make sure `replication_mode` is the same in the configuration files of all nodes. +Never run a Garage cluster where that is not the case.** + +Changing the `replication_mode` of a cluster might work (make sure to shut down all nodes +and change it everywhere at the same time), but is not officially supported. + +#### `rpc_bind_addr` + +The address and port on which to bind for intra-cluster communications +(referred to as RPC for remote procedure calls). 
+The port specified here should be the same one that other nodes will use to contact +the node, even in the case of a NAT: the NAT should be configured to forward the external +port number to the same internal port number. This means that if you have several nodes running +behind a NAT, they should each use a different RPC port number. + +#### `bootstrap_peers` + +A list of IPs and ports on which to contact other Garage peers of this cluster. +This should correspond to the RPC ports set up with `rpc_bind_addr`. + +#### `consul_host` and `consul_service_name` + +Garage supports discovering other nodes of the cluster using Consul. +This works only when nodes are announced in Consul by an orchestrator such as Nomad, +as Garage is not able to announce itself. + +The `consul_host` parameter should be set to the hostname of the Consul server, +and `consul_service_name` should be set to the service name under which Garage's +RPC ports are announced. + +#### `max_concurrent_rpc_requests` + +Garage implements rate limiting for RPC requests: no more than +`max_concurrent_rpc_requests` concurrent outbound RPC requests will be made +by a Garage node (additional requests will be put in a waiting queue). + +#### `sled_cache_capacity` + +This parameter can be used to tune the capacity of the cache used by +[sled](https://sled.rs), the database Garage uses internally to store metadata. +Tune this to fit the RAM you wish to make available to your Garage instance. +More cache means faster Garage, but the default value (128MB) should be plenty +for most use cases. + +#### `sled_flush_every_ms` + +This parameter can be used to tune the flushing interval of sled. +Increase this if sled is thrashing your SSD, at the risk of losing more data in case +of a power outage (though this should not matter much as data is replicated on other +nodes). The default value, 2000ms, should be appropriate for most use cases. 
+ + +## The `[rpc_tls]` section + +This section should be used to configure the TLS certificates used to encrypt +intra-cluster traffic (RPC traffic). The following parameters should be set: + +- `ca_cert`: the certificate of the CA that is allowed to sign individual node certificates +- `node_cert`: the node certificate for the current node +- `node_key`: the key associated with the node certificate + +Note that several nodes may use the same node certificate, as long as it is signed +by the CA. + +If this section is absent, TLS is not used to encrypt intra-cluster traffic. + + +## The `[s3_api]` section + +#### `api_bind_addr` + +The IP and port on which to bind for accepting S3 API calls. +This endpoint does not support TLS: a reverse proxy should be used to provide it. + +#### `s3_region` + +Garage will accept S3 API calls that are targeted to the S3 region defined here. +API calls targeted to other regions will fail with an AuthorizationHeaderMalformed error +message that redirects the client to the correct region. + + +## The `[s3_web]` section + +Garage can publish the content of buckets as websites. This section configures the +behaviour of this module. + +#### `bind_addr` + +The IP and port on which to bind for accepting HTTP requests to buckets configured +for website access. +This endpoint does not support TLS: a reverse proxy should be used to provide it. + +#### `root_domain` + +The optional suffix appended to bucket names for the corresponding HTTP Host. + +For instance, if `root_domain` is `web.garage.eu`, a bucket called `deuxfleurs.fr` +will be accessible either with hostname `deuxfleurs.fr.web.garage.eu` +or with hostname `deuxfleurs.fr`. + +#### `index` + +The name of the index file to return for requests ending with `/` (usually `index.html`). 
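The `root_domain` rule described above can be sketched as a small shell function that maps an HTTP Host value to a bucket name. This is an illustrative sketch only, not Garage's routing code: the function name `bucket_for_host` and its variables are invented here, and tolerating a leading dot (as in the `.web.garage` value used in the example configuration) is an assumption.

```bash
# Hypothetical helper: map an HTTP Host value to a bucket name.
#   $1 = host name from the request, $2 = configured root_domain
bucket_for_host() {
  bfh_host=$1
  bfh_root=${2#.}   # assumption: accept root_domain with or without a leading dot
  case "$bfh_host" in
    *".$bfh_root")
      # The host ends with the root_domain suffix: strip it to get the bucket name.
      bfh_bucket=${bfh_host%".$bfh_root"}
      ;;
    *)
      # Otherwise the host name itself is taken as the bucket name.
      bfh_bucket=$bfh_host
      ;;
  esac
  echo "$bfh_bucket"
}

bucket_for_host deuxfleurs.fr.web.garage.eu web.garage.eu
bucket_for_host deuxfleurs.fr web.garage.eu
```

Both calls resolve to the bucket `deuxfleurs.fr`, matching the two hostnames given in the `root_domain` example.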
diff --git a/doc/book/src/reference_manual/s3_compatibility.md b/doc/book/src/reference_manual/s3_compatibility.md index c0fc286..5f9f527 100644 --- a/doc/book/src/reference_manual/s3_compatibility.md +++ b/doc/book/src/reference_manual/s3_compatibility.md @@ -1,6 +1,6 @@ -## S3 Compatibility status +# S3 Compatibility status -### Global S3 features +## Global S3 features Implemented: @@ -18,7 +18,7 @@ Not implemented: - most `x-amz-` headers -### Endpoint implementation +## Endpoint implementation All APIs that are not mentionned are not implemented and will return a 400 bad request. diff --git a/doc/book/src/working_documents/load_balancing.md b/doc/book/src/working_documents/load_balancing.md index 583b608..c436fdc 100644 --- a/doc/book/src/working_documents/load_balancing.md +++ b/doc/book/src/working_documents/load_balancing.md @@ -1,8 +1,8 @@ -## Load Balancing Data (planned for version 0.2) +# Load Balancing Data (planned for version 0.2) I have conducted a quick study of different methods to load-balance data over different Garage nodes using consistent hashing. -### Requirements +## Requirements - *good balancing*: two nodes that have the same announced capacity should receive close to the same number of items @@ -15,9 +15,9 @@ I have conducted a quick study of different methods to load-balance data over di replicas, independently of the order in which nodes were added/removed (this is to keep the implementation simple) -### Methods +## Methods -#### Naive multi-DC ring walking strategy +### Naive multi-DC ring walking strategy This strategy can be used with any ring-like algorithm to make it aware of the *multi-datacenter* requirement: @@ -38,7 +38,7 @@ This method was implemented in the first version of Garage, with the basic ring construction from Dynamo DB that consists in associating `n_token` random positions to each node (I know it's not optimal, the Dynamo paper already studies this). 
-#### Better rings +### Better rings The ring construction that selects `n_token` random positions for each nodes gives a ring of positions that is not well-balanced: the space between the tokens varies a lot, and some partitions are thus bigger than others. @@ -150,7 +150,7 @@ removing grisou gipsie : 49.22% 36.52% 12.79% 1.46% on average: 62.94% 27.89% 8.61% 0.57% <-- WORSE THAN PREVIOUSLY ``` -#### The magical solution: multi-DC aware MagLev +### The magical solution: multi-DC aware MagLev Suppose we want to select three replicas for each partition (this is what we do in our simulation and in most Garage deployments). We apply MagLev three times consecutively, one for each replica selection.