Document built-in caching behavior (or absence thereof) #874

Open
opened 2024-09-06 17:16:01 +00:00 by aichrist · 1 comment

Hi, I looked at the docs/website and found some references to caching website requests here:

https://garagehq.deuxfleurs.fr/documentation/cookbook/reverse-proxy/

But I haven't found anything (yet) about how caching works internally within Garage. For example, if we have a 3x replicated bucket and request a file from a Garage node that does not have a copy of the object, will that node cache the object that it retrieves, or will it pull it over the network every time? The docs are fairly clear that the object will be served from the node itself if it has a copy, and pulled over the network otherwise, but they don't say anything about caching.

The features page (https://garagehq.deuxfleurs.fr/documentation/reference-manual/features/) doesn't mention caching either.

When I read the goals and use cases (https://garagehq.deuxfleurs.fr/documentation/design/goals/), one of the main goals is to operate well on geo-distributed clusters with disparate link types. I believe that to operate well in that kind of cluster, caching of object reads would be highly desirable, as it would eliminate a significant amount of transfer over potentially slower/higher-latency links, particularly in use cases where reads are much more common than writes.

I think that the docs should have more detail on the caching behavior, and possibly recommendations on adding caching layers if that is the intent.

If the project owners also see a valid argument for building caching behavior into Garage (if it doesn't already exist), then it would be good to create a ticket to implement that as well.

aichrist changed title from Document built-in caching behavior to Document built-in caching behavior (or lack thereof) 2024-09-06 17:16:58 +00:00
aichrist changed title from Document built-in caching behavior (or lack thereof) to Document built-in caching behavior (or absence thereof) 2024-09-06 17:17:08 +00:00
Owner

I agree that caching might be useful in some scenarios. Our use case is that we have nodes in 3 zones, and 3 copies of all data, so each zone contains a whole copy of everything. This means that when a node needs to look for an object, even if it doesn't have a copy itself, there is a copy in the same zone, so we do not need to traverse high-latency links. I agree however that this is not the only scenario in which Garage can be deployed.

Caching data blocks is the first kind of caching that we could add, and it is already tracked in #179. It would be relatively straightforward to implement: data blocks are content-addressed, so there is no cache invalidation to handle. It would also address the bulk of the issue, since it has the potential to significantly reduce traffic for large, frequently requested objects.
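
To make the "no invalidation" point concrete, here is a minimal sketch of such a content-addressed block cache (invented names, standard library only; this is not Garage's actual code, and the size-bounded FIFO eviction is just a placeholder policy). Because the cache key is the hash of the block's contents, an entry can never become stale, and the only decision left is eviction:

```rust
// Hypothetical sketch, not Garage's actual code: a content-addressed block
// cache. The key is the hash of the block's contents, so a cached entry can
// never become stale -- the only policy to decide on is eviction.
use std::collections::{HashMap, VecDeque};

/// Garage identifies data blocks by a 32-byte content hash.
type BlockHash = [u8; 32];

struct BlockCache {
    max_bytes: usize,
    used_bytes: usize,
    blocks: HashMap<BlockHash, Vec<u8>>,
    // FIFO order is enough for a sketch; a real cache would likely use LRU.
    order: VecDeque<BlockHash>,
}

impl BlockCache {
    fn new(max_bytes: usize) -> Self {
        Self {
            max_bytes,
            used_bytes: 0,
            blocks: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// Return a locally cached copy of the block, if any.
    fn get(&self, hash: &BlockHash) -> Option<&[u8]> {
        self.blocks.get(hash).map(|b| b.as_slice())
    }

    /// Store a block that was just fetched from a remote node.
    /// There is no invalidation logic: the same hash always means the same bytes.
    fn put(&mut self, hash: BlockHash, data: Vec<u8>) {
        if self.blocks.contains_key(&hash) {
            return;
        }
        self.used_bytes += data.len();
        self.blocks.insert(hash, data);
        self.order.push_back(hash);
        // Evict oldest entries once the size budget is exceeded.
        while self.used_bytes > self.max_bytes {
            match self.order.pop_front() {
                Some(old) => {
                    if let Some(removed) = self.blocks.remove(&old) {
                        self.used_bytes -= removed.len();
                    }
                }
                None => break,
            }
        }
    }
}

fn main() {
    let mut cache = BlockCache::new(64 * 1024 * 1024);
    let hash: BlockHash = [0u8; 32]; // placeholder hash, for illustration only
    cache.put(hash, vec![42u8; 1024]);
    assert!(cache.get(&hash).is_some());
    println!("cached {} bytes", cache.used_bytes);
}
```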

I think we do not want to go further and try to cache object metadata, as this would impact the consistency properties of Garage. It means that when fetching an object from Garage, there will always be at least one inter-zone RPC call before the Garage daemon can give its answer, so there is an incompressible latency there. If we skipped that call by caching metadata, we would risk returning old versions of objects, which is not acceptable for the S3 API, and probably not for the web endpoint either, as it is frequently used as a public access point for data added programmatically by external applications (such as media files).
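
As a purely conceptual sketch (invented names and stub functions, not Garage's real read path), the distinction could look like this: the metadata lookup has to go through a read quorum and therefore cannot be cached without risking stale versions, while the immutable, content-addressed blocks it points to could safely be served from a local cache:

```rust
// Purely conceptual sketch with invented names and stub functions; it is not
// Garage's real read path. It only illustrates which step could be cached.

/// Metadata describing the current version of an object: the ordered list of
/// content-addressed blocks that make up its body.
struct ObjectMeta {
    block_hashes: Vec<[u8; 32]>,
}

fn get_object(key: &str) -> Vec<u8> {
    // 1. Resolve the current version of the object through a read quorum.
    //    In a geo-distributed layout this is the incompressible inter-zone
    //    round trip: caching this step could return a stale version.
    let meta = quorum_read_metadata(key);

    // 2. Assemble the body from immutable, content-addressed blocks. Each
    //    block can safely come from a local cache when present, because a
    //    given hash always refers to the same bytes.
    let mut body = Vec::new();
    for hash in &meta.block_hashes {
        let block = match local_block_cache(hash) {
            Some(data) => data,
            None => fetch_block_over_rpc(hash),
        };
        body.extend_from_slice(&block);
    }
    body
}

// Stubs standing in for the RPC and storage layers, so the sketch compiles.
fn quorum_read_metadata(_key: &str) -> ObjectMeta {
    ObjectMeta { block_hashes: vec![[0u8; 32]] }
}
fn local_block_cache(_hash: &[u8; 32]) -> Option<Vec<u8>> {
    None // pretend nothing is cached locally
}
fn fetch_block_over_rpc(_hash: &[u8; 32]) -> Vec<u8> {
    vec![0u8; 4] // pretend we fetched 4 bytes from another node
}

fn main() {
    let body = get_object("my-bucket/some/key");
    println!("assembled {} bytes", body.len());
}
```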

lx added the scope/documentation label 2024-09-22 12:01:31 +00:00
Reference: Deuxfleurs/garage#874