Reduce FullTableReplication Write Quorum? #821

Open
opened 2024-05-15 06:52:53 +00:00 by quentin · 2 comments
Owner

Currently, FullTableReplication is configured as follows:

  • N -> all nodes of the cluster
  • R -> 1
  • W -> N-1

We don't have the nice property R+W > N, which is required for the "read your writes" guarantee where, whichever node you query, you always get the value you previously wrote. So we are only eventually consistent in the end.
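To make the quorum arithmetic concrete, here is a minimal sketch of the overlap check. The function name and the cluster size of 6 are illustrative assumptions, not taken from the Garage code base.

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """Return True when every read quorum intersects every write quorum,
    i.e. when R + W > N holds."""
    return r + w > n


n = 6  # hypothetical cluster size, for illustration only

# Current configuration: R = 1, W = N-1, so R + W = N, which is not > N.
print(quorums_overlap(n, r=1, w=n - 1))  # False

# With R = 1, only W = N would give overlapping quorums.
print(quorums_overlap(n, r=1, w=n))  # True
```

With R = 1, the condition only holds when W = N, so any write quorum that tolerates a failure already gives up "read your writes".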

Knowing that, my proposal is to further reduce W from N-1 to N/2+1. Reducing the number of required writes will allow users to create buckets, rename them, delete them, and manage their keys, even in the presence of a failure.
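As a rough illustration of what the change buys, the sketch below compares how many unavailable nodes each write quorum tolerates. It assumes N/2+1 uses integer division (an assumption, the issue does not spell it out), and the helper names and cluster sizes are hypothetical.

```python
def write_quorum_current(n: int) -> int:
    """Current write quorum: W = N-1."""
    return n - 1


def write_quorum_proposed(n: int) -> int:
    """Proposed write quorum: W = N/2+1 (assumed to be integer division)."""
    return n // 2 + 1


def tolerated_failures(n: int, w: int) -> int:
    """Nodes that can be unavailable while a write still reaches W nodes."""
    return n - w


# Hypothetical cluster sizes, for illustration only.
for n in (3, 5, 6, 9):
    cur = write_quorum_current(n)
    new = write_quorum_proposed(n)
    print(
        f"N={n}: W={cur} tolerates {tolerated_failures(n, cur)} down, "
        f"W={new} tolerates {tolerated_failures(n, new)} down"
    )
```

Under these assumptions, W = N-1 always tolerates exactly one unavailable node, while W = N/2+1 tolerates just under half of the cluster. For N = 3 or N = 4 the two formulas give the same quorum, so the difference only appears on larger clusters.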

This issue is motivated by recent events on the Deuxfleurs production deployment, where one of our geographical zones was unavailable for an extended period of time, which prevented our users from publishing their websites.

A commit has already been written here: https://git.deuxfleurs.fr/Deuxfleurs/garage/commit/6558c158633a2a6ce8141189cab2a5e992d520cf
It is currently deployed on Deuxfleurs production servers as a hotfix. Could we consider upstreaming the change?

quentin added the Improvement label 2024-05-15 06:52:53 +00:00
Owner

I'm fine with changing the write quorum to N/2+1. Should we merge your commit into the main branch?
Author
Owner

If you trust it, sure!
Reference: Deuxfleurs/garage#821