Check data_dir valid on startup #601

Closed
opened 2023-07-18 12:32:00 +00:00 by jpds · 4 comments
Contributor

Garage should check that the data_dir provided is valid on start-up, somewhere around:

- https://git.deuxfleurs.fr/Deuxfleurs/garage/src/commit/6ba611361e6d3ae701ea211adddbed61ea338da7/src/model/garage.rs#L170

For my deployment, these directories are on a separate volume (which I hadn't remounted due to maintenance); however, Garage started up just fine without it.
Owner

We already have [this line](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/commit/6ba611361e6d3ae701ea211adddbed61ea338da7/src/model/garage.rs#L85) that creates the data directory if it doesn't exist (we have the same for metadata). This contradicts the behaviour you are proposing ("check that data_dir and metadata_dir exist and are directories, and fail to start otherwise"). I don't feel strongly about either behaviour (actually, I think I prefer your suggestion), but I'd like to hear people's opinions on this matter before changing anything.
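
To make the two behaviours concrete, here is a minimal sketch of how a single helper could support both. This is not Garage's actual code: `DirPolicy` and `ensure_dir` are hypothetical names invented for illustration, and it only uses `std::fs`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical start-up policy; not an existing Garage option.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum DirPolicy {
    /// Create the directory if it is missing (the behaviour described above).
    CreateIfMissing,
    /// Refuse to start unless the directory already exists.
    FailIfMissing,
}

/// Check that `dir` exists and is a directory, applying `policy` when it is missing.
pub fn ensure_dir(dir: &Path, policy: DirPolicy) -> io::Result<()> {
    match fs::metadata(dir) {
        // Already present and a directory: nothing to do.
        Ok(meta) if meta.is_dir() => Ok(()),
        // Present but something else (e.g. a regular file): always an error.
        Ok(_) => Err(io::Error::new(
            io::ErrorKind::Other,
            format!("{} exists but is not a directory", dir.display()),
        )),
        // Missing: create or fail, depending on the policy.
        Err(e) if e.kind() == io::ErrorKind::NotFound => match policy {
            DirPolicy::CreateIfMissing => fs::create_dir_all(dir),
            DirPolicy::FailIfMissing => Err(io::Error::new(
                io::ErrorKind::NotFound,
                format!("{} does not exist", dir.display()),
            )),
        },
        // Any other I/O error (permissions, etc.) is passed through unchanged.
        Err(e) => Err(e),
    }
}
```

With something like this, the same call could be made for both data_dir and metadata_dir, and only the policy would differ between a test setup and a production one.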
Author
Contributor

Yes, I thought that it'd be better to check for a random block that the metadata directory thinks should exist on the node.
Contributor

I can see use cases for both modes:

The 'create if not initialized' mode suits the case where you want a Garage cluster for testing. I do that occasionally when testing deployments of services that use S3, like Grafana Mimir, and want an empty cluster as quickly and easily as possible.

The 'fail if not initialized' mode suits production, to get fast feedback on misconfiguration. I would not want to start a node and have it re-create its directories and start syncing, only to fill the root partition because the proper disk was not mounted.

Regarding the check itself: I don't think checking just for directory presence is enough. Especially when metadata and data are on different drives (like a split between SSD and HDD), the directories may be present but not actually mounted. We could then start even in strict mode and fill up the root partition.

Actually checking the content of the metadata/data directories would be preferable: either content we know should be there, or a custom file signalling that the content has been initialized. The second variant might be useful for other purposes too, like storing information about which Garage version created the content and with which options.
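
As a rough illustration of the marker-file variant, something along these lines might work. The file name `.garage-initialized`, its `version=` content, and both function names are assumptions made up for this sketch; nothing here reflects an existing Garage format.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical marker file name, chosen only for this example.
const MARKER_FILE: &str = ".garage-initialized";

/// Write the marker when a directory is first initialized,
/// recording which version created the content.
pub fn write_marker(dir: &Path, version: &str) -> io::Result<()> {
    fs::write(dir.join(MARKER_FILE), format!("version={}\n", version))
}

/// Strict-mode check: return the marker's contents, or an error if the
/// directory does not contain one.
pub fn read_marker(dir: &Path) -> io::Result<String> {
    fs::read_to_string(dir.join(MARKER_FILE)).map_err(|e| {
        io::Error::new(
            e.kind(),
            format!(
                "{} does not look initialized (no {}): {}",
                dir.display(),
                MARKER_FILE,
                e
            ),
        )
    })
}
```

An unmounted mount point is normally just an empty directory on the root partition, so `read_marker` would fail there even though the directory itself exists, which is the case a plain presence check misses.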
lx added the Improvement label 2024-02-16 10:17:52 +00:00
lx added this to the v1.0 milestone 2024-02-16 10:17:56 +00:00
Author
Contributor

I was able to work around this issue by adding this to my garage unit's systemd config:

  systemd.services.garage.unitConfig = {
    AssertPathIsMountPoint = [ "/srv/meta" "/srv/data" ];
  };

...probably still useful to have a check done internally.
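
For what an internal check could look like, here is a minimal, Unix-only sketch of a mount-point test. It uses a common heuristic (a directory is treated as a mount point if it lives on a different device than its parent, or if it is the filesystem root) and is not claimed to match what systemd does. `is_mount_point` is a hypothetical helper, not an existing Garage function.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Heuristic mount-point check (Unix only).
/// Limitation: a bind mount from the same filesystem has the same device ID
/// as its parent and is therefore not detected.
pub fn is_mount_point(path: &Path) -> io::Result<bool> {
    // Resolve symlinks and relative components first, so that `parent()`
    // refers to the real parent directory.
    let path = fs::canonicalize(path)?;
    let meta = fs::metadata(&path)?;
    if !meta.is_dir() {
        return Ok(false);
    }
    match path.parent() {
        // The filesystem root has no parent and is always a mount point.
        None => Ok(true),
        // Otherwise compare the device IDs of the directory and its parent.
        Some(parent) => Ok(meta.dev() != fs::metadata(parent)?.dev()),
    }
}
```

A start-up check could call this for each configured directory and refuse to start when it returns `false`, though that would have to be optional, since not every deployment puts these directories on their own volume.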
lx closed this issue 2024-03-20 16:53:48 +00:00
Reference: Deuxfleurs/garage#601