Encryption in Garage? #416

Closed
opened 2022-11-15 10:23:41 +00:00 by quentin · 3 comments
Owner

Encryption is a recurring subject when discussing Garage. I tagged this issue as "documentation" because many things can already be done with Garage's current feature set and the existing ecosystem. If we identify some unaddressed use cases after this review, we may consider specifying some encryption mechanisms in Garage.

I think we need to have a broader answer, documenting LUKS and client-side encryption.

We should take a very high-level approach to answering users' needs, without limiting ourselves to the scope of the software. This is even more appropriate as we are trying to solve this problem with Aerogramme.

Some logs of discussions we had on Garage's channel:

Make clear that traffic is (always) encrypted between Garage nodes

We should write about it, but your RPCs are encrypted. More specifically, unlike much distributed software, it is impossible in Garage to have clear-text RPC.
We use the kuska-handshake library, which implements a protocol that has been properly reviewed (I can't recall its name) and that is also used in Secure Scuttlebutt.
That's why setting an rpc_secret is mandatory, and that's (also) why your nodes have super long identifiers.

Correct, if the cluster is properly configured with TLS to serve static files. All RPCs between nodes are also encrypted.
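
To make the rpc_secret remark concrete, here is a minimal sketch of generating one. It assumes Garage expects the secret to be 32 random bytes written as 64 hexadecimal characters; the function names `generate_rpc_secret` and `looks_like_rpc_secret` are hypothetical helpers, not part of Garage.

```python
import secrets
import string


def generate_rpc_secret() -> str:
    """Return 32 random bytes encoded as 64 lowercase hex characters."""
    return secrets.token_hex(32)


def looks_like_rpc_secret(value: str) -> bool:
    """Check the shape only: exactly 64 hexadecimal characters."""
    return len(value) == 64 and all(c in string.hexdigits for c in value)


if __name__ == "__main__":
    # Assumption: the value goes into the rpc_secret field of the config file.
    print(f'rpc_secret = "{generate_rpc_secret()}"')
```

Every node of the cluster has to share the same value, so it should be generated once and distributed over a channel you already trust.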

Explain how to make sure that the traffic between a Garage node and your client is encrypted

Garage's HTTP API endpoints are served in clear text. There are several ways to get encryption between your client and a node:

  • Set up a reverse proxy with TLS / ACME / Let's Encrypt
  • Set up a Garage gateway locally
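
The two options above can be turned into a small client-side guard. This is only a sketch: `is_transport_protected` is a hypothetical helper, and the URLs and port in the usage lines are examples. It accepts an endpoint if it uses HTTPS (the reverse proxy case) or if it is plain HTTP on a loopback address (the local gateway case, where the request never leaves the machine in clear text and the gateway then talks to the cluster over encrypted RPC).

```python
import ipaddress
from urllib.parse import urlsplit


def is_transport_protected(endpoint: str) -> bool:
    """Return True for https:// endpoints or http:// endpoints on loopback."""
    parts = urlsplit(endpoint)
    if parts.scheme == "https":
        return True
    if parts.scheme != "http" or parts.hostname is None:
        return False
    if parts.hostname == "localhost":
        return True
    try:
        # Literal IP addresses: accept 127.0.0.0/8 and ::1 only.
        return ipaddress.ip_address(parts.hostname).is_loopback
    except ValueError:
        # Any other hostname over plain HTTP would cross the network unencrypted.
        return False


if __name__ == "__main__":
    for url in ("https://s3.example.com", "http://127.0.0.1:3900",
                "http://garage.example.com:3900"):
        print(url, is_transport_protected(url))
```

Calling such a check before handing the endpoint to an S3 client catches the common mistake of pointing a client at a remote node over plain HTTP.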

Make clear that data is (always) stored in plain text on the filesystem

LUKS is indeed better but complex to set up. It’s not trivial when you install it on a machine with a keyboard and a screen attached to it, and it’s even more complex when you install it remotely and have to use Dropbear SSH or similar. I would guess many people don’t bother. Having storage encryption in Garage would simplify things a lot. That’s about it.

For encryption of data, where could Garage get the encryption keys from? If we encrypt data but keep the keys in a plaintext file next to them, it's useless. We don't want to have to manage secrets in Garage; I don't even know how we would do that in a secure way.

If you are interested in encryption in Garage, I think a safe bet would be to look at Amazon's work.
They have both client-side and server-side encryption:

  • https://docs.aws.amazon.com/general/latest/gr/aws_sdk_cryptography.html
  • https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucket-encryption.html

You can already do client-side encryption in Garage, either by using Amazon's client-side encryption SDK or by rolling your own client-side encryption (as we are doing in Aerogramme: https://aerogramme.deuxfleurs.fr/index.html ).
For server-side encryption, they use an external service named KMS, which probably relies on some hardware checks and so on. Implementing such logic in Garage would probably require using either a TPM or an external integration like HashiCorp Vault.
But I am pretty sure that it is very hard to get right, to develop, and to maintain, and in the end you can get the same result with LUKS.

Give some hints on how to encrypt data at rest

  • LUKS

Give some hints on how to encrypt data on the client side

  • Matrix
  • Aerogramme
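
As a generic illustration of "rolling your own" client-side encryption, here is a minimal sketch. It is not how Matrix or Aerogramme do it. It assumes the third-party `cryptography` package is installed, uses AES-256-GCM, and binds each ciphertext to its object key so that a blob moved to another key fails to decrypt. The helper names and the object key in the usage lines are hypothetical; key storage is left entirely to the client.

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_LEN = 12  # 96-bit nonce, the usual size for AES-GCM


def encrypt_object(key: bytes, object_key: str, plaintext: bytes) -> bytes:
    """Encrypt plaintext; the object key is authenticated but not encrypted."""
    nonce = os.urandom(NONCE_LEN)
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, object_key.encode())
    # Store the nonce in front of the ciphertext so decryption can find it.
    return nonce + ciphertext


def decrypt_object(key: bytes, object_key: str, blob: bytes) -> bytes:
    """Reverse encrypt_object; raises InvalidTag on tampering or a wrong key."""
    nonce, ciphertext = blob[:NONCE_LEN], blob[NONCE_LEN:]
    return AESGCM(key).decrypt(nonce, ciphertext, object_key.encode())


if __name__ == "__main__":
    key = AESGCM.generate_key(bit_length=256)  # must stay on the client
    blob = encrypt_object(key, "notes/todo.txt", b"hello")
    print(decrypt_object(key, "notes/todo.txt", blob))
```

The resulting blob can be uploaded with any S3 client; Garage and its administrators only ever see the nonce and the ciphertext, while object names and sizes remain visible.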

Feedback that could help us

  • Why do you want encryption in Garage?
  • What is your threat model? What are you afraid of?
    • A stolen HDD?
    • A curious administrator?
    • A malicious administrator?
    • A remote attacker?
    • etc.
  • What services do you want to protect with encryption?
    • An existing application? Which one? (e.g. Nextcloud)
    • An application that you are writing?
  • Any expertise you may have on the subject
  • Any other thoughts that can help us
quentin added the Documentation label 2022-11-15 10:23:41 +00:00
quentin changed title from Best practises for encryption in Garage to Encryption in Garage? 2022-11-15 10:24:09 +00:00

I am using encryption at rest for my Garage data. My filesystem is ZFS, so I'm using the native ZFS-on-Linux encryption. For how I manage keys so that the disks are mounted at boot without human intervention, see my blog: https://withblue.ink/2020/01/19/auto-mounting-encrypted-drives-with-a-remote-key-on-linux.html

In 2022, encrypting data at rest should really be the default, as the overhead is negligible and there's no reason not to :) Plus, it makes decommissioning hard drives much easier, as there's no risk that someone buying the used drives (or finding them in the trash!) could recover anything.

I am also exposing Garage behind Traefik which uses Let's Encrypt to add TLS.


We have a strange use case where we would like to share one Garage network among friends. Everyone who is part of the network shares non-redundant storage space, and gets redundant storage space in return.

It would be nice if a Garage node could transparently encrypt data in a bucket, using an encryption key known only to that Garage node. That way, we would not be able to read each other's data. Of course, we still need to trust each other not to delete data.

Contributor

@twee_bloemen You'd be better served by using something like git-annex, documentation for which I've just committed in 4c143776bf.

git-annex also tracks where your files go, so you can mitigate the risk of an admin deleting the data by also copying it to a local external drive.

lx closed this issue 2023-06-14 10:57:33 +00:00
Reference: Deuxfleurs/garage#416