---
layout: post
slug: a-quick-tour-of-garage
status: published
sitemap: true
title: A quick tour of Garage
description: Garage is a new lightweight and versatile object storage platform implementing the S3 API; let's quickly deploy it.
category: operation
tags:
---
[Garage](https://garagehq.deuxfleurs.fr) is an object storage platform that can be used as a drop-in replacement for AWS S3.
It is designed to run on bare metal hardware and has no dependency on any cloud provider.
In this article, I quickly deploy a geo-distributed Garage cluster and use it as a backend for Nextcloud.
I chose to use a cloud platform (Scaleway) to easily and quickly spawn geo-distributed machines.
To abstract away machine deployment, I wrote a small tool named [nuage](https://git.deuxfleurs.fr/quentin/nuage).
You will need a working account on [Scaleway](https://console.scaleway.com).
Then, we will need to install some tools on our machine (be sure to have [go](https://golang.org) installed).
Let's install Scaleway's CLI tool:
```bash
sudo curl -o /usr/local/bin/scw -L "https://github.com/scaleway/scaleway-cli/releases/download/v2.3.1/scw-2.3.1-linux-x86_64"
sudo chmod +x /usr/local/bin/scw
scw init # enter your scaleway credentials
```
And my helper to easily deploy instances on Scaleway:
```bash
go install git.deuxfleurs.fr/quentin/nuage@latest
export PATH="$PATH:$HOME/go/bin"
nuage # display how to use the tool
```
Now, we are ready to spawn some machines!
## Spawn machines
We start by creating our `nuage` inventory in a file named `garage-inventory.txt` (one instance per line: zone, instance type, image, instance name):
```
fr-par-1 dev1-s debian_bullseye garage-fr-1
fr-par-1 dev1-s debian_bullseye garage-fr-2
pl-waw-1 dev1-s debian_bullseye garage-pl-1
pl-waw-1 dev1-s debian_bullseye garage-pl-2
nl-ams-1 dev1-s debian_bullseye garage-nl-1
nl-ams-1 dev1-s debian_bullseye garage-nl-2
```
Then let's pass it to `nuage`:
```bash
nuage spawn < ./garage-inventory.txt
```
`nuage` will spawn 6 machines:
- 2 machines in Paris, France
- 2 machines in Warsaw, Poland
- 2 machines in Amsterdam, Netherlands
All instances will run Debian on Scaleway's cheap [dev1-s](https://www.scaleway.com/en/pricing/#development-instances) instances.
Instance names follow the pattern `garage-<zone>-<id>`.
From now on, we assume the instances are started and you are able to log in to them, for example:
```
ssh root@51.15.227.63
```
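If you want to double-check from Scaleway's side that all six instances are up, the official CLI can list them (a quick sanity check; the exact columns depend on the CLI version):
```bash
# List the instances in one zone; repeat for fr-par-1, pl-waw-1 and nl-ams-1
scw instance server list zone=fr-par-1
```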
## Some crypto
Our Garage instances will communicate with each other securely over TLS.
We need to generate some certificates locally that we will deploy on the remote instances later.
To ease the operation, we provide a small script named `genkeys.sh` that generates all the needed keys:
```
wget https://git.deuxfleurs.fr/Deuxfleurs/garage/raw/tag/v0.3.0/genkeys.sh
chmod +x ./genkeys.sh
./genkeys.sh
```
Now you should have a folder named `pki` containing both the key and the certificate for your CA and your end-entity.
Ideally, each node would have its own end-entity certificate, but to simplify the configuration, we will use only one in this tour.
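Before deploying, a quick check that the files we will reference below are present (the exact file list may vary slightly depending on the `genkeys.sh` version):
```bash
ls pki/
# We expect at least the three files deployed by the next script:
# garage-ca.crt  garage.crt  garage.key
# (the CA private key is also generated here and stays on your local machine)
```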
Let's create a script to deploy our pki:
```bash
cat > deploy_pki.sh <<EOF
#!/bin/bash
mkdir -p /etc/garage/pki
cat > /etc/garage/pki/garage-ca.crt <<EOG
$(cat pki/garage-ca.crt)
EOG
cat > /etc/garage/pki/garage.crt <<EOG
$(cat pki/garage.crt)
EOG
cat > /etc/garage/pki/garage.key <<EOG
$(cat pki/garage.key)
EOG
EOF
```
Then we send and execute the generated script on each of our machines:
```
nuage run ./deploy_pki.sh < ./garage-inventory.txt
```
## Configuration
Garage needs a small configuration file to work.
Again, we will write a deployment script.
You must adapt the `bootstrap_peers` section to your instances; you can run `nuage spawn < ./garage-inventory.txt` again to get their addresses.
Not all IP addresses are needed: after a discovery phase, Garage maintains its own list and exchanges it regularly with its peers.
Save the following file as `deploy_conf.sh` once you have edited it (we arbitrarily chose to put 3 IPs here):
```bash
#!/bin/bash
cat > /etc/garage/config.toml <<EOF
metadata_dir = "/var/lib/garage/meta"
data_dir = "/var/lib/garage/data"
replication_mode = "3"
rpc_bind_addr = "[::]:3901"

bootstrap_peers = [
  "51.15.59.148:3901",
  "51.15.227.63:3901",
  "51.15.206.116:3901",
]

[rpc_tls]
ca_cert = "/etc/garage/pki/garage-ca.crt"
node_cert = "/etc/garage/pki/garage.crt"
node_key = "/etc/garage/pki/garage.key"

[s3_api]
s3_region = "garage"
api_bind_addr = "[::]:3900"

[s3_web]
bind_addr = "[::]:3902"
root_domain = ".web.garage"
index = "index.html"
EOF
```
And now, the deployment:
```bash
nuage run ./deploy_conf.sh < ./garage-inventory.txt
```
## Binary and service
And this is already the last step of our deployment: installing the binary and the systemd service.
Again, we write a deployment script, named `deploy_bin.sh` this time:
```bash
#!/bin/bash
# Downloading Garage
wget https://garagehq.deuxfleurs.fr/_releases/v0.3.0/x86_64-unknown-linux-musl/garage -O /usr/local/bin/garage
chmod +x /usr/local/bin/garage
# Creating a control command
cat > /usr/local/bin/garagectl <<EOF
#!/bin/bash
/usr/local/bin/garage \
--ca-cert /etc/garage/pki/garage-ca.crt \
--client-cert /etc/garage/pki/garage.crt \
--client-key /etc/garage/pki/garage.key \
"\$@"
EOF
chmod +x /usr/local/bin/garagectl
# Creating a systemd service
cat > /etc/systemd/system/garage.service <<EOF
[Unit]
Description=Garage Data Store
After=network-online.target
Wants=network-online.target

[Service]
Environment='RUST_LOG=garage=info' 'RUST_BACKTRACE=1'
ExecStart=/usr/local/bin/garage server -c /etc/garage/config.toml
DynamicUser=true
StateDirectory=garage

[Install]
WantedBy=multi-user.target
EOF
# Activating it
systemctl daemon-reload
systemctl enable garage
systemctl start garage
```
And we execute it:
```
nuage run ./deploy_bin.sh < ./garage-inventory.txt
```
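Before configuring the cluster, it is worth checking that the service actually started; for example, on any of the nodes:
```bash
# Replace the IP with one of your instances
ssh root@51.15.227.63 'systemctl is-active garage && journalctl -u garage -n 10 --no-pager'
```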
## garagectl
Now that we have built a cluster, we can connect to a machine and use `garagectl` to configure cluster-wide parameters.
So, first connect to any server (you can run `nuage spawn < ./garage-inventory.txt` again to get your nodes' IP addresses).
For example:
```bash
ssh root@51.158.182.206
```
You can see the current cluster status with:
```bash
garagectl status
```
We will then configure each node (replace the `??` in the commands below with the node identifiers shown by `garagectl status`), assigning it:
- a capacity (relative size): since all our nodes have the same storage space, we use a capacity of 1 everywhere;
- a zone, matching the country where the instance is hosted.
```bash
garagectl status
garagectl node configure -c 1 -z pl ??
garagectl node configure -c 1 -z pl ??
garagectl node configure -c 1 -z fr ??
# etc.
```
Now, we can create a key, a bucket, and allow the key to access the bucket (the key identifier and its secret are printed when you create the key; substitute your own identifier for `GKfd49e3906e5d2e3e23ee07f9`):
```bash
garagectl key new --name quentin
garagectl bucket create my_files
garagectl bucket allow my_files --read --write --key GKfd49e3906e5d2e3e23ee07f9
```
Back on our local machine, we can already interact with our cluster through `awscli`.
You can install `awscli` as follows:
```
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
```
And quickly set it up by creating a file `~/.awsrc` (edit it with your access key, secret key, and endpoint):
```bash
export AWS_ACCESS_KEY_ID=GKfd49e3906e5d2e3e23ee07f9
export AWS_SECRET_ACCESS_KEY=xxxxxx
export AWS_DEFAULT_REGION='garage'
function aws { command aws --endpoint-url http://51.158.182.206:3900 "$@" ; }
aws --version
```
And then, each time you want to use it, run:
```bash
source ~/.awsrc
```
Now, you should be able to use the awscli command freely:
```bash
aws s3 ls # list buckets
aws s3 cp garage-inventory.txt s3://my_files/inventory.txt # send a file
aws s3 ls my_files # list files in the bucket
```
## Nextcloud
We will provision another machine specifically for Nextcloud.
We start by creating a file named `nextcloud-inventory.txt` containing:
```
fr-par-1 dev1-s debian_bullseye nextcloud-fr-1
```
And spawn it:
```
nuage spawn < ./nextcloud-inventory.txt
```
Then we create an install script for Nextcloud named `deploy_nextcloud.sh`:
```bash
#!/bin/bash
apt-get update
apt-get install -y apache2 mariadb-server libapache2-mod-php7.4 php7.4-gd \
php7.4-mysql php7.4-curl php7.4-mbstring php7.4-intl php7.4-gmp \
php7.4-bcmath php-imagick php7.4-xml php7.4-zip unzip
systemctl start mysql
mysql -u root --password="" <<EOF
CREATE DATABASE IF NOT EXISTS nextcloud CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
CREATE USER 'nextcloud'@'localhost' IDENTIFIED BY 'nextcloud';
GRANT ALL PRIVILEGES ON nextcloud.* TO 'nextcloud'@'localhost';
FLUSH PRIVILEGES;
EOF
rm -fr nextcloud.zip nextcloud/
wget https://download.nextcloud.com/server/releases/nextcloud-22.1.1.zip -O nextcloud.zip
unzip nextcloud.zip
rm -fr /var/www/nextcloud
mv nextcloud /var/www
cat > /etc/apache2/sites-available/nextcloud.conf <<EOF
Alias /nextcloud "/var/www/nextcloud/"
<Directory /var/www/nextcloud/>
Require all granted
AllowOverride All
Options FollowSymLinks MultiViews
<IfModule mod_dav.c>
Dav off
</IfModule>
</Directory>
EOF
a2ensite nextcloud.conf
a2enmod rewrite
a2enmod headers
a2enmod env
a2enmod dir
a2enmod mime
systemctl restart apache2
chown -R www-data:www-data /var/www/nextcloud/
```
Then deploy it:
```
nuage run ./deploy_nextcloud.sh < ./nextcloud-inventory.txt
```
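Before opening the browser, a quick check that Apache answers (the IP is the one reported by `nuage spawn` for the Nextcloud instance, here 212.47.230.180):
```bash
# Print only the HTTP status line of the response
curl -sI http://212.47.230.180/nextcloud/ | head -n 1
```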
Then open Nextcloud in your browser, in my case http://212.47.230.180/nextcloud.
Finish the installation by providing the requested information.
Now we will configure Nextcloud to use Garage as its primary object storage.
You can also [read its documentation](https://docs.nextcloud.com/server/latest/admin_manual/configuration_files/primary_storage.html).
First, we need to create the bucket and the key (and we will also grant read access to our own key):
```bash
garagectl bucket create nextcloud
garagectl key new --name nextcloud
garagectl bucket allow nextcloud --read --write GK872ebec80feae4ad663e82ec
garagectl bucket allow nextcloud --read GKfd49e3906e5d2e3e23ee07f9
```
We SSH to the server and edit `config.php`:
```bash
ssh root@212.47.230.180
vim /var/www/nextcloud/config/config.php
```
and add:
```php
<?php
$CONFIG = array(
  /* some other config */
  'objectstore' => [
    'class' => '\\OC\\Files\\ObjectStore\\S3',
    'arguments' => [
      'bucket' => 'nextcloud',
      'autocreate' => false,
      'key' => 'GK872ebec80feae4ad663e82ec',
      'secret' => 'xxxxxxxxxxxxx',
      'hostname' => '51.158.182.206',
      'port' => 3900,
      'use_ssl' => false,
      'region' => 'garage',
      // required for some non Amazon S3 implementations
      'use_path_style' => true
    ],
  ],
);
```
*If you get errors after reloading the page, run `tail -f /var/www/nextcloud/media/nextcloud.log`*
Primary storage is only one way to integrate Garage with Nextcloud; it is also possible to integrate it through the "External storage" plugin.
This method is not covered here but you can refer to [Nextcloud's documentation](https://docs.nextcloud.com/server/latest/admin_manual/configuration_files/external_storage_configuration_gui.html).
After uploading a file, you can see how Nextcloud stores files internally through `awscli`:
```
aws s3 ls nextcloud
```
Our current deployment has some drawbacks: Nextcloud talks to a single Garage node, which is a single point of failure, and data is not sent encrypted.
One solution is to deploy Garage locally on our Nextcloud server, as a gateway.
First, we install it normally:
```bash
nuage run ./deploy_pki.sh < ./nextcloud-inventory.txt
nuage run ./deploy_conf.sh < ./nextcloud-inventory.txt
nuage run ./deploy_bin.sh < ./nextcloud-inventory.txt
```
Then, we configure our new node as a gateway: we do not want to store data on it, we just want it to route requests:
```bash
garagectl status
garagectl node configure -z fr -g 2b145f7b4c15c2a4
```
Then we edit Nextcloud's configuration `/var/www/nextcloud/config/config.php` to just change the hostname:
```php
<?php
$CONFIG = array(
  /* some other config */
  'objectstore' => [
    'class' => '\\OC\\Files\\ObjectStore\\S3',
    'arguments' => [
      /* other arguments */
      'hostname' => '127.0.0.1',
      /* other arguments */
    ],
  ],
);
```
Now we have a highly available backend, as our local gateway will route our requests to the available servers.
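As a quick sanity check (assuming `awscli` is available on the Nextcloud machine), you can query the local gateway directly with the Nextcloud key and the secret that was printed when the key was created:
```bash
export AWS_ACCESS_KEY_ID=GK872ebec80feae4ad663e82ec
export AWS_SECRET_ACCESS_KEY=xxxxxx  # the secret of the nextcloud key
export AWS_DEFAULT_REGION='garage'
aws --endpoint-url http://127.0.0.1:3900 s3 ls s3://nextcloud
```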
## Handle crashes
Start by choosing a node you want to crash:
```bash
nuage spawn < ./garage-inventory.txt
```
SSH to it; we will simulate its failure by simply stopping the service (for Garage, there is no difference between a graceful shutdown and a crash):
```bash
ssh root@151.115.34.19
systemctl stop garage
```
Connect to another node and note that the stopped node is now reported as unavailable:
```bash
ssh root@51.15.59.148
garagectl status
```
For now, no re-balancing has been triggered: Garage tolerates transient failures.
If you want to re-balance, you have to explicitly remove the node from Garage.
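For reference only (we do not do it here, and the exact subcommand may differ between Garage versions, so check `garagectl --help` on your cluster), removing a failed node would look something like this:
```bash
# <old-node-id> is a placeholder for the identifier reported as failed by `garagectl status`
garagectl node remove --yes <old-node-id>
```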
Now, let's assume this is only a transient failure, and let's restart it:
```bash
ssh root@151.115.34.19
systemctl start garage
journalctl -fu garage
```
Note how the repair is automatically triggered.
You can still manually trigger a repair if you want:
```bash
garagectl repair --yes
```
Now let's assume that the machine burnt down and all its disks are lost:
```bash
systemctl stop garage
rm -r /var/lib/garage/
systemctl start garage
garagectl status
```
Now our node is seen as a new one, and its old ID is reported as failed.
We replace the old node with the new one using a single command:
```
garagectl node configure --replace 212027752f40c4d4 -c 1 -z pl 375690c499627ea8
garagectl status
```
We do not cover it here, but you can also add or remove nodes at any time and trigger a re-balance.
## Destroy our VMs
When you're done with this tour, just destroy the resources you created:
```bash
nuage destroy < ./garage-inventory.txt
nuage destroy < ./nextcloud-inventory.txt
```
Thanks a lot, this is the end of my tour of Garage, see you next time :)