WIP: blog post on garage community survey results #17

Draft
lx wants to merge 1 commit from blog-survey into master
35 changed files with 264 additions and 0 deletions
Showing only changes of commit 2da43a2472 - Show all commits


@ -0,0 +1,261 @@
+++
title="Results of the community survey"
date=2024-03-12
+++
*We ran a community survey to gather feedback from Garage users and potential
users during a two-month period. One of the main objectives of
this survey was to determine expectations from the community for Garage's
upcoming v1.0 release and for future work. Read this article for a discussion
of the results.*
<!-- more -->
---
The survey collected 127 responses over a period of almost 2 months,
from the 15th of January to the 12th of March.
The first question we asked users was how they had heard of Garage:
the majority answered that they had heard of Garage through a link
aggregator or social network such as Reddit or HN. A portion of
users heard of it through word of mouth, and a significant portion also
answered "Other". Unfortunately we didn't ask respondents for details
if they selected "Other", so I'm quite curious as to what this could be.
The other choices received an almost negligible number of responses.
<center><img src="all-how-known.png" /></center>
Half of the respondents indicated that they are currently running a Garage cluster
for production data, of which a small fraction indicated running it in a commercial
setting. Another third of respondents indicated that they are currently testing Garage
or have tested it previously.
<center><img src="all-currently-admin.png" /></center>
## About currently running Garage installations
We first asked users what kind of data they were storing in Garage.
The first answer, selected by about half of the participants,
is for storing back-ups, followed closely by personal files.
Other answers follow in a roughly linearly decreasing pattern.
<center><img src="all-data-kind.png" /></center>
The majority of users are not running Garage in geodistributed mode,
but many users are also running in 2, 3 or even 4 locations.
<center><img src="all-n-zones.png" /></center>
A large majority of users are only using Garage through the S3 API.
The remaining users are mostly using a mix of S3 API and web API,
with a small number of users (5) using Garage primarily as a web server.
<center><img src="all-access-mode.png" /></center>
Regarding the size of clusters, the majority of installed clusters are less
than 1TB in size. The others are almost all between 1TB and 10TB. 8 users
indicated that they are running clusters of more than 10TB. Two users
reported running clusters of more than 100TB, but they also indicated that they
are not currently using Garage, so I think that's the size of the data they
would like/need to store on Garage, but not the actual size of an
installed cluster. The number of objects stored in clusters is quite evenly
split between less than 10k, 10k to 100k, and more than 100k.
<center><img src="all-cluster-size.png" /></center>
<center><img src="all-cluster-object-count.png" /></center>
For about half of respondents, this means storing mostly objects of around 100MB in size.
For the others, it's mostly objects of around 10MB. This is very inexact since the
proposed answers for cluster size and object count had such large ranges.
<center><img src="all-object-size.png" /></center>
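To make the arithmetic behind this estimate concrete, here is a minimal sketch. It assumes each answer range is reduced to a single representative value (here, its upper bound); the post does not state which values were actually used for the chart, so the function name and the chosen values are illustrative assumptions only.

```python
# Rough average object size from a cluster-size answer and an object-count answer.
# Assumption: each range is represented by one value (its upper bound here);
# the method actually used for the chart is not stated in the post.

TB = 10**12  # decimal terabyte, in bytes
MB = 10**6   # decimal megabyte, in bytes


def estimate_avg_object_size_mb(cluster_bytes: int, object_count: int) -> float:
    """Return the average object size in MB for a total size and an object count."""
    if object_count <= 0:
        raise ValueError("object_count must be positive")
    return cluster_bytes / object_count / MB


# A cluster of up to 1TB holding up to 10k objects: objects of around 100MB.
small_count = estimate_avg_object_size_mb(1 * TB, 10_000)
# The same cluster size with up to 100k objects: objects of around 10MB.
large_count = estimate_avg_object_size_mb(1 * TB, 100_000)

print(f"{small_count:.0f} MB, {large_count:.0f} MB")
```

Since each answer range spans about an order of magnitude, an estimate obtained this way is only meaningful to within roughly an order of magnitude too.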
## Satisfaction regarding Garage
A majority of users reported a high degree of satisfaction with Garage.
About a quarter said that Garage has some significant flaws. A small portion
of respondents indicated that they cannot use Garage due to missing
important features or critical bugs, but still took the time to answer
the survey (thanks to them!).
<center><img src="all-satisfaction.png" /></center>
The top 3 strong points of Garage reported by its users are: good S3 compatibility
(first place, with 2/3 of respondents agreeing), good performance on small / low-power
machines, and easy setup. I'd say we are pretty much on target, as these are some of the
main objectives of Garage.
<center><img src="all-strong-points.png" /></center>
As for most wanted features in Garage, there is a clear winner with a web interface
for cluster administration, with over 40% of users mentioning it. The second most
wanted feature is support for S3 versioning, with almost 30% of answers.
<center><img src="all-wanted-features.png" /></center>
The vast majority of users reported never losing data that they stored in Garage.
Only one indicated that they lost data and it was Garage's fault: this was
because they tried to move an LMDB database between machines with different
architectures, but the LMDB on-disk format is architecture-specific. We should
probably be clearer about this in the documentation.
<center><img src="all-lose-data.png" /></center>
# Users in a "homelab/self-hosted setting"
52 respondents indicated that they are using Garage for storing production
data in a homelab or self-hosted setting. I'd say this is the most
representative portion of Garage users, as it is its primary target.
Let's look at the answers from these users only.
## About the clusters
Personal files now take first place among the kinds of data stored on these clusters,
still closely followed by back-ups.
<center><img src="homelab-data-kind.png" /></center>
These users are mostly not using Garage in a geodistributed setting.
The distribution of answers is very similar to the overall distribution.
<center><img src="homelab-n-zones.png" /></center>
Most clusters of these users are less than 1TB in size,
and the remainder are mostly in the 1TB - 10TB range.
There are fewer clusters than average storing more than 100k objects in this population,
but the distribution of object sizes (not shown) is very similar to the overall distribution.
<center><img src="homelab-cluster-size.png" /></center>
<center><img src="homelab-cluster-object-count.png" /></center>
## Satisfaction regarding Garage
Homelab/self-hosting users reported a slightly higher level of satisfaction with Garage,
with almost 3/4 very satisfied.
<center><img src="homelab-satisfaction.png" /></center>
The top 3 reasons for using Garage are the same, but good performance on small
/ low-power machines now takes first place.
<center><img src="homelab-strong-points.png" /></center>
The top 2 wanted features are still the same, now with an equal number of votes.
<center><img src="homelab-wanted-features.png" /></center>
# Users in a "commercial setting"
Fewer users indicated that they are running Garage in a commercial setting,
as this concerned only 12 of the respondents to the survey.
## About the clusters
Half of users reported using Garage to store back-ups,
and almost half reported storing observability data and web app / service data.
One third selected static websites.
<center><img src="commercial-data-kind.png" /></center>
Users in a commercial setting are more consistent in their use of the
geo-distribution features offered by Garage. Only one third of users are
not running in geo-distributed mode. Another third is running Garage in 2 locations,
and the last third is running in 3 or more locations, thus benefitting from
the best resiliency properties that Garage can offer.
<center><img src="commercial-n-zones.png" /></center>
The majority of commercial deployments are storing between 1TB and 10TB of data.
About a quarter are storing more than 1 million objects.
<center><img src="commercial-cluster-size.png" /></center>
<center><img src="commercial-cluster-object-count.png" /></center>
It seems that the average object size is much smaller in this population:
the majority of answers correspond to average object sizes of less than 10MB,
and one fourth of answers correspond to objects of around 1MB.
<center><img src="commercial-object-size.png" /></center>
## Satisfaction regarding Garage
Three quarters of these users reported a high degree of satisfaction with Garage,
about the same as for homelab users.
<center><img src="commercial-satisfaction.png" /></center>
The most liked qualities of Garage are a bit different. Fewer users reported
satisfaction due to the easy setup of Garage, but more users indicated
that the possibility of easily adding and removing nodes was important to them.
Good tolerance to offline nodes and crashes, and good performance in the face
of latency, which are the core properties that make Garage work well in
geo-distributed settings, were selected by two thirds of users, most likely
the same ones who said they are running in geo-distributed mode.
<center><img src="commercial-strong-points.png" /></center>
A web interface for cluster administration is still the most wanted feature, with 40%
of votes. Then, one third voted for better monitoring and observability, and for
per-bucket levels of consistency and numbers of replicas. Only 25% voted for S3
versioning.
<center><img src="commercial-wanted-features.png" /></center>
# Users that have the biggest clusters
7 users reported running clusters storing more than 10TB of data.
About half of these users are using Garage for a homelab or self-hosted setup,
and one is in a commercial setting.
<center><img src="big-currently-admin.png" /></center>
## About the clusters
Almost all of these users are using Garage to store back-ups.
Multimedia files are the second most selected option, which
would explain why these clusters are so big.
<center><img src="big-data-kind.png" /></center>
These deployments are quite evenly split between not
being geo-replicated and being geo-replicated in 2 or 3 locations.
<center><img src="big-n-zones.png" /></center>
## Satisfaction regarding Garage
A majority of users report a high degree of satisfaction with Garage,
but many users also reported significant flaws.
<center><img src="big-satisfaction.png" /></center>
Unsurprisingly, when clusters start becoming big enough, the most requested
improvement is better performance across the board.
Per-bucket levels of consistency and numbers of replicas were also selected
by almost half of users.
<center><img src="big-wanted-features.png" /></center>
# Users that reported that Garage had some significant flaws
Focusing on users that reported that Garage is usable for them but has "significant flaws",
the two most requested features were a web administration interface and S3 versioning.
Bucket-level ACLs (that would allow anonymous access directly from the S3 endpoint)
and performance improvements came next.
<center><img src="flaws-wanted-features.png" /></center>
As for users who said that Garage has critical issues that are preventing
them from using it, the "Other" option was the most selected answer for the
requested features. Licensing issues allegedly preventing commercial use were
cited by a few users (hint: it's actually a non-issue, and we will write about
this at some point), but I think for most of these users, they have a specific
use case in mind which is not targeted by Garage. For instance, several have
indicated that they would need POSIX filesystem compatibility and/or the
possibility to use Garage as a CSI driver in Kubernetes (unfortunately, this is
mostly impossible to achieve with good performance in a geo-distributed
environment, and the principles on which Garage is based explicitly prevent it
from fulfilling this role).


@ -0,0 +1,3 @@
for f in *.png; do
cp -v "/home/lx/Deuxfleurs/documents/survey/$f" .
done