add an article about matrix synapse and s3

2021-09-21 12:29:39 +02:00 · 2021-09-21 12:29:39 +02:00 · 54567241cd
commit 54567241cd
parent e1c9067b8a
2 changed files with 199 additions and 1 deletions
--- a/_posts/2021-07-12-chroniques-administration-synapse.md
+++ b/_posts/2021-07-12-chroniques-administration-synapse.md
@@ -2,7 +2,7 @@
 layout: post
 slug: chroniques-administration-synapse
 status: published
-sitemap: false
+sitemap: true
 title: Chroniques d'administration de Synapse
 description: Pour l'instant tout va bien, pour l'instant tout...
 category: operation
--- a/_posts/2021-09-14-synapse-media-storage-provider.md
+++ b/_posts/2021-09-14-synapse-media-storage-provider.md
@@ -0,0 +1,198 @@
+---
+layout: post
+slug: matrix-synapse-s3-storage
+status: published
+sitemap: true
+title: Storing Matrix media on an S3 backend
+description: Matrix has multiple solutions to store its media on S3; we review them and point out their drawbacks
+category: operation
+tags:
+---
+
+By default, Matrix Synapse stores its media on the local filesystem, which raises many issues.
+It exposes your users to data loss and availability issues, but mainly to scalability and sizing issues.
+Especially as we live in an era where users expect no resource limitation and where software is not
+designed to garbage collect or even track resource usage, it is really hard to plan ahead the resources you will use.
+
+In practice, this leads to two common approaches: resource overprovisioning and distributed filesystems.
+The first one often leads to wasted resources, while the second one is often hard to manage and requires expensive hardware and network.
+
+Thankfully, as we store blob data, we do not need the full power of a filesystem, and a more lightweight API like S3 is enough.
+In Matrix Synapse terminology, these solutions are referred to as storage providers.
+In this article, we will see how we migrated from GlusterFS to Matrix's S3 storage provider combined with our [Garage](https://garagehq.deuxfleurs.fr/) backend.
+
+## Internals
+
+First, Matrix's developers make a distinction between a *media provider* and a *storage provider*.
+It appears that files are always stored in the *media provider* even if a *storage provider* is registered, and there is no way
+to change this behavior in the code. Unfortunately, the *media provider* can only use the filesystem.
+
+For example, when fetching a media file, we can see [in the code](
+https://github.com/matrix-org/synapse/blob/b996782df51eaa5dd30635a7c59c93994d3a735e/synapse/rest/media/v1/media_storage.py#L185-L198) that the filesystem is always probed first, and only then our remote backend.
+
+We also see [in the code](
+https://github.com/matrix-org/synapse/blob/b996782df51eaa5dd30635a7c59c93994d3a735e/synapse/rest/media/v1/media_storage.py#L202-L211) that the *media provider* is also referred to as the local cache and that some parts of the code may require a file to be in the local cache.
+
+In conclusion, the best we can do is to keep the *media provider* as a local cache.
+But even in this case, it is our responsibility to garbage collect the cache.
+
+## Migration
+
+We can easily configure the S3 storage provider in our `homeserver.yaml`:
+
+```yaml
+media_storage_providers:
+- module: s3_storage_provider.S3StorageProviderBackend
+  store_local: True
+  store_remote: True
+  store_synchronous: True
+  config:
+    bucket: matrix
+    region_name: garage
+    endpoint_url: XXXXXXXXXXXXXX
+    access_key_id: XXXXXXXXXXXXXX
+    secret_access_key: XXXXXXXXXXX
+```
+
+But registering it like that will only be useful for new media (we activated `store_local` and `store_remote` so that both local and remote content is automatically pushed to our S3 backend).
+
+Old media must be migrated with a script named `s3_media_upload`. First, we need some setup to use this tool:
+  - Postgres credentials and endpoint must be stored in a `database.yaml` file
+  - S3 credentials must be configured as per the [boto convention](https://boto3.amazonaws.com/v1/documentation/api/1.9.46/guide/configuration.html) and the endpoint can be specified on the command line
+  - the path to the local cache/media repository is also passed through the command line
+  
+This script needs to keep some state between command executions and thus will create a SQLite database named `cache.db` in your working directory. Do not delete it!
+  
+In practice, your database configuration may be created as follows:
+  
+```bash
+cat > database.yaml <<EOF
+user: xxxxx
+password: xxxxx
+database: xxxxxx
+host: xxxxxxxx
+port: 5432
+EOF
+```
+
+And S3 can be configured through environment variables:
+
+```bash
+export AWS_ACCESS_KEY_ID=""
+export AWS_SECRET_ACCESS_KEY=""
+export AWS_DEFAULT_REGION="garage"
+```
+
+We are now ready; the other parameters will be passed on the command line.
+
+## Use the tool
+
+First we must build a list of media that we want to send to S3.
+I guess that the developers designed this tool with the idea that S3 is an archive target and that we want to keep recent data locally.
+That's why a duration is required: they want to send only old data to S3.
+Here, we will fetch media that are at least one day (`1d`) old, but you can set 1 month (`1m`) to keep more media locally or 0 days (`0d`) if you want close to no local cache. For more details, check [the source code](https://github.com/matrix-org/synapse-s3-storage-provider/blob/main/scripts/s3_media_upload#L140-L185).
+
+```bash
+./s3_media_upload update-db 1d
+```
+
+
+Next, `check-deleted` filters out media that are no longer on the local filesystem, either because they were already uploaded to our S3 backend or because they are lost. [See the code](https://github.com/matrix-org/synapse-s3-storage-provider/blob/main/scripts/s3_media_upload#L188-L217).
+
+
+*Please note that I deactivated the progress bar because it is buggy on my docker exec inside a screen inside an ssh session.*
+
+```bash
+./s3_media_upload --no-progress check-deleted /var/lib/matrix-synapse/media 
+```
+
+If we want to combine `update-db` and `check-deleted`, we can run `update`.
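+
+As a sketch, assuming `update` takes the media path first and the duration second (check `./s3_media_upload update --help` to confirm on your version), the two previous steps could be merged as:
+
+```bash
+# Assumed argument order: <media path> <duration>; verify with --help before relying on it.
+./s3_media_upload --no-progress update /var/lib/matrix-synapse/media 1d
+```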
+
+Now, before doing any action, we might want to see our candidates.
+These candidates may already be present on our S3 target so you may end up uploading less data.
+
+```bash
+./s3_media_upload write
+```
+
+
+The `upload` command does many things at once:
+  - check again that the file is still on the local filesystem
+  - check if the file exists on S3
+  - upload it to S3 if needed
+  - optionally delete the local file
+
+Ideally, I would use only our S3 target and no longer the local filesystem.
+Because this is not possible with this module, I at least delete uploaded content from the local filesystem.
+[See the source code for more details](https://github.com/matrix-org/synapse-s3-storage-provider/blob/main/scripts/s3_media_upload#L220-L282).
+
+My final command looks like this:
+
+```bash
+./s3_media_upload --no-progress upload /var/lib/matrix-synapse/media matrix --delete --endpoint-url https://garage.deuxfleurs.fr 
+```
+
+## GlusterFS again
+
+By running this script one month after activating the main module, I observed that many files, around 60%, were missing on our S3 target.
+Our setup was as follows:
+  - The media repository (the local filesystem) was on GlusterFS
+  - The storage provider (our S3 target) was handled by Garage
+  
+We know that our GlusterFS target suffers from severe performance issues.
+I manually migrated the files then deployed a second setup:
+
+  - The media repository was now mounted in RAM (a tmpfs)
+  - The storage provider was still our S3 target
+  
+And now, all of our media are successfully sent to our S3 target.
+My guess is that each media file is first written to the local filesystem and then sent to S3.
+Because GlusterFS is slow and error prone, some exceptions or timeouts may be raised before the file is uploaded to S3.
+
+At least, we now consider the problem solved.
+We only need one more step: regularly cleaning up the local filesystem so that it does not fill our RAM.
+
+
+## Write a synchro script
+
+Because there is no elegant solution and my time is limited, I chose to write a script that runs every 10 minutes.
+It checks that the files are already in the S3 bucket and then deletes them from the filesystem.
+
+  
+```bash
+#!/bin/bash
+
+cat > database.yaml <<EOF
+user: $PG_USER
+password: $PG_PASS
+database: $PG_DB
+host: $PG_HOST
+port: $PG_PORT
+EOF
+
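+# Loop forever: index all media, drop entries already gone locally, then upload and delete the rest.
+while true; do
+  s3_media_upload update-db 0d
+  s3_media_upload --no-progress check-deleted "$MEDIA_PATH"
+  s3_media_upload --no-progress upload "$MEDIA_PATH" "$BUCKET" --delete --endpoint-url "$ENDPOINT"
+  sleep 600
+done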
+```
+
+To use it, you must set the following environment variables:
+  - For AWS: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_DEFAULT_REGION`, `ENDPOINT`, `BUCKET`
+  - For Postgres: `PG_USER`, `PG_PASS`, `PG_DB`, `PG_HOST`, `PG_PORT`
+  - For the filesystem: `MEDIA_PATH` (we assume `s3_media_upload` is in your `PATH`)
+
+## matrix-media-repo
+
+I presented the "native" way to handle media on Matrix Synapse, but there is also a community-managed project named [`matrix-media-repo`](https://docs.t2bot.io/matrix-media-repo) with a slightly different goal. The author wanted a common media repository for multiple servers to reduce storage costs.
+
+matrix-media-repo is implementation independent: it shadows the Matrix endpoint used for media, `/_matrix/media`, and is thus compatible with any Matrix server, such as Dendrite or Conduit. Its main advantage over our solution is that it does not have this mandatory cache: it can upload to and serve directly from an S3 backend, which simplifies management.
+
+Depending on your reverse proxy, if `matrix-media-repo` is down, users might be redirected to the original endpoint, which should not be used anymore, leading to data loss and strange behavior. It seems that [an option](https://github.com/matrix-org/synapse/blob/v1.42.0/synapse/config/server.py#L265-L269) in Synapse allows deactivating the media repository; it might save you some time if it works.
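+
+A minimal sketch, assuming the linked option is `enable_media_repo` in `homeserver.yaml` (untested in this setup):
+
+```yaml
+# Assumption: this is the option referenced above. It stops Synapse from serving /_matrix/media itself.
+enable_media_repo: false
+```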
+
+## Conclusion
+
+Using an S3 target with Matrix is not trivial. `matrix-media-repo` seems to be a better solution, but in practice it also has its own drawbacks. For now, even if not optimal, our deployed solution works well, and that's what matters.
+
+