Add a mdbook documentation to present garage and help user on-boarding #45

Merged
lx merged 9 commits from feature/mdbook into master 2021-03-18 09:39:59 +00:00
4 changed files with 15 additions and 256 deletions
Showing only changes of commit b82a61fba2

README.md

@@ -18,119 +18,4 @@ Non-goals include:
Our main use case is to provide a distributed storage layer for small-scale self hosted services such as [Deuxfleurs](https://deuxfleurs.fr).
Check our [compatibility page](doc/Compatibility.md) to view details of the S3 API compatibility.
**[Go to the documentation](https://garagehq.deuxfleurs.fr)**
## Development
We propose the following quickstart to set up a full development environment as quickly as possible:
1. Set up a Rust/Cargo environment, e.g. `dnf install rust cargo`
2. Install awscli v2 by following the guide [here](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html).
3. Run `cargo build` to build the project
4. Run `./script/dev-cluster.sh` to launch a test cluster (feel free to read the script)
5. Run `./script/dev-configure.sh` to configure your test cluster with default values (same datacenter, 100 tokens)
6. Run `./script/dev-bucket.sh` to create a bucket named `eprouvette` and an API key that will be stored in `/tmp/garage.s3`
7. Run `source ./script/dev-env-aws.sh` to configure your CLI environment
8. You can use `garage` to manage the cluster. Try `garage --help`.
9. You can use the `awsgrg` alias to add, list, and delete files. Try `awsgrg help`, `awsgrg cp /proc/cpuinfo s3://eprouvette/cpuinfo.txt`, or `awsgrg ls s3://eprouvette`. `awsgrg` is a wrapper around the `aws s3` command, pre-configured with the previously generated API key (the one in `/tmp/garage.s3`) and localhost as the endpoint.
Now you should be ready to start hacking on garage!
## S3 compatibility
Only a subset of S3 is supported: adding, listing, getting and deleting files in a bucket.
Bucket management, ACL and other advanced features are not (yet?) handled through the S3 API but through the `garage` CLI.
We primarily test `garage` against the `awscli` tool and `nextcloud`.
## Setting up Garage
Use the `genkeys.sh` script to generate TLS keys for encrypting communications between Garage nodes.
The script takes no arguments and will generate keys in `pki/`.
This script creates a certificate authority `garage-ca` which signs certificates for individual Garage nodes.
Garage nodes from the same cluster authenticate themselves by verifying that they have certificates signed by the same certificate authority.
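To check by hand that a node certificate was signed by the cluster's certificate authority, something like the following could be used. This is only a sketch: `verify_node_cert` is a hypothetical helper, the file names in the usage note assume the `pki/` layout produced by `genkeys.sh`, and it relies on the standard `openssl verify` command rather than on anything Garage itself provides.

```shell
# Hypothetical helper (not part of Garage): check that a certificate
# chains to a given certificate authority using the openssl CLI.
verify_node_cert() {
  ca_cert="$1"    # CA certificate, e.g. the garage-ca certificate
  node_cert="$2"  # certificate of one Garage node
  # openssl exits with a non-zero status if the chain cannot be verified
  openssl verify -CAfile "$ca_cert" "$node_cert"
}
```

Assuming the files are named as in the configuration example, a call would look like `verify_node_cert pki/garage-ca.crt pki/garage.crt`.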
Garage requires two locations to store its data: a metadata directory, and a data directory.
The metadata directory is used to store metadata such as object lists, and should ideally be located on an SSD drive.
The data directory is used to store the chunks of data of the objects stored in Garage.
In a typical deployment the data directory is stored on a standard HDD.
Garage does not handle TLS for its S3 API endpoint. This should be handled by adding a reverse proxy.
Create a configuration file with the following structure:
```
block_size = 1048576 # objects are split into blocks of at most this many bytes
metadata_dir = "/path/to/ssd/metadata/directory"
data_dir = "/path/to/hdd/data/directory"
rpc_bind_addr = "[::]:3901" # the port other Garage nodes will use to talk to this node
bootstrap_peers = [
# Ideally this list should contain the IP addresses of all other Garage nodes of the cluster.
# Use Ansible or any kind of configuration templating to generate this automatically.
"10.0.0.1:3901",
"10.0.0.2:3901",
"10.0.0.3:3901",
]
# optional: garage can find cluster nodes automatically using a Consul server
# garage only does lookups and does not register itself; registration should be handled externally, e.g. by Nomad
consul_host = "localhost:8500" # optional: host name of a Consul server for automatic peer discovery
consul_service_name = "garage" # optional: service name to look up on Consul
max_concurrent_rpc_requests = 12
data_replication_factor = 3
meta_replication_factor = 3
meta_epidemic_fanout = 3
[rpc_tls]
# NOT RECOMMENDED: you can skip this section if you don't want to encrypt intra-cluster traffic
# Thanks to genkeys.sh, generating the keys and certificates is easy, so there is NO REASON NOT TO DO IT.
ca_cert = "/path/to/garage/pki/garage-ca.crt"
node_cert = "/path/to/garage/pki/garage.crt"
node_key = "/path/to/garage/pki/garage.key"
[s3_api]
api_bind_addr = "[::1]:3900" # the S3 API port, HTTP without TLS. Add a reverse proxy for the TLS part.
s3_region = "garage" # set this to anything. S3 API calls will fail if they are not made against the region set here.
[s3_web]
bind_addr = "[::1]:3902"
root_domain = ".garage.tld"
index = "index.html"
```
Build Garage using `cargo build --release`.
Then, run it using either `./target/release/garage server -c path/to/config_file.toml` or `cargo run --release -- server -c path/to/config_file.toml`.
Set the `RUST_LOG` environment variable to `garage=debug` to dump some debug information.
Set it to `garage=trace` to dump even more debug information.
Set it to `garage=warn` to show nothing except warnings and errors.
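The three log levels above can be wrapped in a small helper so that the right `RUST_LOG` value is picked by name. This is a sketch: `garage_log_level` and the verbosity words `quiet`, `debug` and `trace` are hypothetical names chosen for illustration; only the `garage=...` values come from the text above.

```shell
# Hypothetical helper: map a verbosity word to one of the RUST_LOG
# values described above.
garage_log_level() {
  case "$1" in
    quiet) echo "garage=warn"  ;;  # warnings and errors only
    debug) echo "garage=debug" ;;  # some debug information
    trace) echo "garage=trace" ;;  # even more debug information
    *)     echo "unknown verbosity: $1" >&2; return 1 ;;
  esac
}
```

A node could then be started with `RUST_LOG="$(garage_log_level debug)" ./target/release/garage server -c path/to/config_file.toml`.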
## Setting up cluster nodes
Once all your `garage` nodes are running, you will need to:
1. check that they are correctly talking to one another;
2. configure them with their physical location (in the case of a multi-dc deployment) and a number of "ring tokens" proportional to the storage space available on each node;
3. create some S3 API keys and buckets;
4. ???;
5. profit!
To run these administrative tasks, you will need to use the `garage` command line tool and have it connect to any of the cluster's nodes on the RPC port.
The `garage` CLI also needs TLS keys and certificates of its own to authenticate and be authenticated in the cluster.
A typical invocation will be as follows:
```
./target/release/garage --ca-cert=pki/garage-ca.crt --client-cert=pki/garage-client.crt --client-key=pki/garage-client.key <...>
```
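For step 2 of the list above, token counts proportional to storage space could be worked out as in the following sketch. The scale of 100 tokens per terabyte, the node names and the disk sizes are all assumptions made up for illustration, and `tokens_for` is a hypothetical helper; the resulting numbers would still have to be applied with the `garage` CLI.

```shell
# Assumed scale, chosen only for illustration.
TOKENS_PER_TB=100

# Hypothetical helper: print a whole number of tokens for a disk size
# given in terabytes (fractional sizes allowed), rounded to nearest.
tokens_for() {
  awk -v tb="$1" -v scale="$TOKENS_PER_TB" 'BEGIN { printf "%d\n", tb * scale + 0.5 }'
}

# Made-up inventory in the form name:size_in_TB
for entry in node-a:1 node-b:2 node-c:1.5; do
  name=${entry%%:*}   # part before the colon
  size=${entry#*:}    # part after the colon
  echo "$name $(tokens_for "$size")"
done
```

Keeping the scale in a single variable means the ratio between nodes stays the same if you later decide to use more or fewer tokens overall.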
## Notes to self
### What to repair
- `tables`: performs a full sync of metadata; this should not be necessary because the system does it every hour
- `versions` and `block_refs`: very time consuming; useful if deletions have not been propagated, improves garbage collection
- `blocks`: very useful to resync/rebalance blocks between nodes


@@ -1,140 +0,0 @@
# Quickstart on an existing deployment
First, chances are that your garage deployment is secured by TLS.
All your commands must therefore include the TLS certificate options.
I will define an alias once and for all to ease future commands.
Please adapt the path of the binary and certificates to your installation!
```
alias grg="/garage/garage --ca-cert /secrets/garage-ca.crt --client-cert /secrets/garage.crt --client-key /secrets/garage.key"
```
Now we can check that everything is going well by checking our cluster status:
```
grg status
```
Don't forget that the `help` command and the `--help` flag on subcommands are available everywhere: the CLI tool is self-documented! Two examples:
```
grg help
grg bucket allow --help
```
Fine, now let's create a bucket (let's assume that you want to deploy Nextcloud):
```
grg bucket create nextcloud-bucket
```
Check that everything went well:
```
grg bucket list
grg bucket info nextcloud-bucket
```
Now we will generate an API key to access this bucket.
Note that API keys are independent of buckets: one key can access multiple buckets, multiple keys can access one bucket.
Now, let's start by creating a key only for our PHP application:
```
grg key new --name nextcloud-app-key
```
You will have the following output (this one is fake, `key_id` and `secret_key` were generated with the openssl CLI tool):
```
Key { key_id: "GK3515373e4c851ebaad366558", secret_key: "7d37d093435a41f2aab8f13c19ba067d9776c90215f56614adad6ece597dbb34", name: "nextcloud-app-key", name_timestamp: 1603280506694, deleted: false, authorized_buckets: [] }
```
Check that everything works as intended (be careful: `info` only works with your key identifier, not with its friendly name!):
```
grg key list
grg key info GK3515373e4c851ebaad366558
```
Now that we have a bucket and a key, we need to give permissions to the key on the bucket!
```
grg bucket allow --read --write nextcloud-bucket --key GK3515373e4c851ebaad366558
```
At any time, you can check which keys are allowed on your bucket with:
```
grg bucket info nextcloud-bucket
```
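Since one key can access multiple buckets, the same `bucket allow` command can simply be repeated for each bucket. The sketch below only prints the commands so that they can be reviewed before being run; `print_grants` is a hypothetical helper, `nextcloud-backup-bucket` is a made-up second bucket, and `grg` is the alias defined earlier.

```shell
# Fake key identifier from the example output above.
KEY_ID="GK3515373e4c851ebaad366558"

# Hypothetical helper: print one `bucket allow` command per bucket name
# given as argument, without executing anything.
print_grants() {
  for bucket in "$@"; do
    echo "grg bucket allow --read --write $bucket --key $KEY_ID"
  done
}

# nextcloud-backup-bucket is a made-up name used for illustration.
print_grants nextcloud-bucket nextcloud-backup-bucket
```

Once the printed commands look right, they can be pasted into a shell where the `grg` alias is defined.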
Now, let's move to the S3 API!
We will use the `s3cmd` CLI tool.
You can install it via your favorite package manager.
Otherwise, check [their website](https://s3tools.org/s3cmd).
We will configure `s3cmd` with its interactive configuration tool. Be careful: not all endpoints are implemented!
In particular, the test run at the end does not work (yet).
```
$ s3cmd --configure
Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.
Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: GK3515373e4c851ebaad366558
Secret Key: 7d37d093435a41f2aab8f13c19ba067d9776c90215f56614adad6ece597dbb34
Default Region [US]: garage
Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint [s3.amazonaws.com]: garage.deuxfleurs.fr
Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.s3.amazonaws.com]: garage.deuxfleurs.fr
Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password:
Path to GPG program [/usr/bin/gpg]:
When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [Yes]:
On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:
New settings:
Access Key: GK3515373e4c851ebaad366558
Secret Key: 7d37d093435a41f2aab8f13c19ba067d9776c90215f56614adad6ece597dbb34
Default Region: garage
S3 Endpoint: garage.deuxfleurs.fr
DNS-style bucket+hostname:port template for accessing a bucket: garage.deuxfleurs.fr
Encryption password:
Path to GPG program: /usr/bin/gpg
Use HTTPS protocol: True
HTTP Proxy server name:
HTTP Proxy server port: 0
Test access with supplied credentials? [Y/n] n
Save settings? [y/N] y
Configuration saved to '/home/quentin/.s3cfg'
```
Now, if everything is configured correctly, the following commands should succeed:
```
echo hello world > hello.txt
s3cmd put hello.txt s3://nextcloud-bucket
s3cmd ls s3://nextcloud-bucket
s3cmd rm s3://nextcloud-bucket/hello.txt
```
That's all for now!


@@ -5,6 +5,7 @@
- [Getting Started](./getting_started/index.md)
- [Get a binary](./getting_started/binary.md)
- [Configure the daemon](./getting_started/daemon.md)
- [Control the daemon](./getting_started/control.md)
- [Configure a cluster](./getting_started/cluster.md)
- [Create buckets and keys](./getting_started/bucket.md)
- [Handle files](./getting_started/files.md)


@@ -1 +1,14 @@
# Configure a cluster
## Test cluster
## Real-world cluster
For our example, we will assume the following infrastructure:
| Location | Name    | IP Address | Disk Space |
|----------|---------|------------|------------|
| Paris    | Mercury | fc00:1::1  | 1 TB       |
| Paris    | Venus   | fc00:1::2  | 2 TB       |
| London   | Earth   | fc00:1::2  | 2 TB       |
| Brussels | Mars    | fc00:B::1  | 1.5 TB     |