Alex lx
lx created branch fix-ci in Deuxfleurs/garage 2024-03-28 12:08:51 +00:00
lx pushed to fix-ci at Deuxfleurs/garage 2024-03-28 12:08:51 +00:00
e1dc84e123 [fix-ci] CI: properly cleanup between garage integration tests
lx commented on pull request Deuxfleurs/garage#792 2024-03-28 12:02:36 +00:00
Fix unbounded buffering when one node has slower network

Confirmed that this avoids unbounded memory usage growth inducing OOM kill on a test deployment when sending a single 8GB file to an S3 API server on localhost, which had to send to two other…

lx pushed to main at Deuxfleurs/nixcfg 2024-03-28 10:57:09 +00:00
5b89004c0f staging: deploy garage 0.10 beta + fix monitoring
lx commented on issue Deuxfleurs/garage#788 2024-03-28 10:28:11 +00:00
Unbounded block buffering, was: nomad (lmdb / sqlite): inevitable OOM

Thank you for your patience in trying to work this out.

Can you confirm that this memory is not just the memory map of LMDB's data file, but actual allocations made by Garage like buffer and…

lx commented on issue Deuxfleurs/garage#788 2024-03-27 23:44:16 +00:00
Unbounded block buffering, was: nomad (lmdb / sqlite): inevitable OOM

The Docker image is dxflrs/garage:85f580cbde4913fe8382316ff3c27b8443c61dd7

lx pushed to fix-buffering at Deuxfleurs/garage 2024-03-27 15:23:01 +00:00
85f580cbde [fix-buffering] change request sending strategy and fix priorities
0d3e285d13 [fix-buffering] implement block_ram_buffer_max to avoid excessive RAM usage
lx commented on issue Deuxfleurs/garage#788 2024-03-27 15:08:24 +00:00
Unbounded block buffering, was: nomad (lmdb / sqlite): inevitable OOM

@Cycneuramus Could you check if PR #792 fixes the issue? I can trigger a release build / docker container build on our CI if that's more practical for you.

lx created pull request Deuxfleurs/garage#792 2024-03-27 15:07:43 +00:00
Fix buffering issues (fix #788)
lx pushed to fix-buffering at Deuxfleurs/garage 2024-03-27 15:06:17 +00:00
9613f9495e [fix-buffering] change request sending strategy and fix priorities
dd4e1894ee [fix-buffering] implement block_ram_buffer_max to avoid excessive RAM usage
lx pushed to fix-buffering at Deuxfleurs/garage 2024-03-27 15:02:03 +00:00
4d651e505a [fix-buffering] change request sending strategy and fix priorities
lx created branch fix-buffering in Deuxfleurs/garage 2024-03-27 14:26:51 +00:00
lx pushed to fix-buffering at Deuxfleurs/garage 2024-03-27 14:26:51 +00:00
c544a74c51 [fix-buffering] implement block_ram_buffer_max to avoid excessive RAM usage
lx commented on issue Deuxfleurs/garage#788 2024-03-27 13:41:56 +00:00
Unbounded block buffering, was: nomad (lmdb / sqlite): inevitable OOM

Oh, actually I read the units wrong: I thought it was bits and not bytes. If it is bytes, then it looks like it's maxing out your 200Mbps link, which is nice.

lx commented on issue Deuxfleurs/garage#788 2024-03-27 13:37:57 +00:00
Unbounded block buffering, was: nomad (lmdb / sqlite): inevitable OOM

Thanks, these results seem pretty consistent with what I was expecting.

Do you have an idea why Garage is not maxing out your network connection? Are your nodes running at 100% CPU all this…

lx pushed to next-0.10 at Deuxfleurs/garage 2024-03-27 12:56:03 +00:00
25c196f34d [next-0.10] admin api: fix logic in get cluster status
lx pushed to next-0.10 at Deuxfleurs/garage 2024-03-27 12:47:23 +00:00
4eba32f29f [next-0.10] layout helper: rename & clarify updates to update trackers
lx pushed to next-0.10 at Deuxfleurs/garage 2024-03-27 12:37:36 +00:00
32f1786f9f [next-0.10] cache layout check result
lx pushed to next-0.10 at Deuxfleurs/garage 2024-03-27 12:32:29 +00:00
01a0bd5410 [next-0.10] remove impl Deref for LayoutHelper
lx commented on issue Deuxfleurs/garage#788 2024-03-27 11:56:12 +00:00
Unbounded block buffering, was: nomad (lmdb / sqlite): inevitable OOM

I'm not quite sure what you mean by this. My understanding of replication_mode = "3" (which is what my cluster is configured with) was that write operations need to complete on all three nodes…