OutOfMemory error from LMDB when starting a fresh installation #439
2 participants
Reference: Deuxfleurs/garage#439
First of all, thanks a lot for this amazing project!
I encountered an issue where the application would panic right after starting. It seems to come from LMDB.
The only workaround known to me is to switch the DB engine to another one.
I went with SQLite, but it is noticeably slower.
I tested it in three different environments, but I can't tell what's causing the panic.
In each environment, I use docker and docker-compose to spin up Garage.
I used the same docker-compose.yml and garage.toml files (see below), with one exception: on the MacBook, I used my own Dockerfile to statically copy the garage.toml file, because it would somehow get mounted as a directory inside the container.
First environment: VPS running Arch Linux
✅ Works
Kernel:
Linux neptune.internal 6.0.10-arch2-1 #1 SMP PREEMPT_DYNAMIC Sat, 26 Nov 2022 16:51:18 +0000 x86_64 GNU/Linux
Docker:
Docker version 20.10.21, build baeda1f82a
Second environment: Raspberry Pi 4 running Void Linux musl
❌ Does not work
Kernel:
Linux m-pi 5.15.72_1 #1 SMP PREEMPT Sun Oct 16 14:46:40 UTC 2022 aarch64 GNU/Linux
Docker:
Docker version 20.10.21, build tag v20.10.21
No matter where the mounted directories are located (in the same directory as the compose file, or on another HDD),
and no matter what the permissions and ownership (root or not) are, it just won't work.
At the time of running, this is the resource usage:
Third environment: MacBook Pro running Docker in Minikube
⚠️ Works sometimes
OS: Catalina 10.15.7, Intel Core i5 (x86_64)
Minikube:
minikube version: v1.28.0 commit: 986b1ebd987211ed16f8cc10aed7d2c42fc8392f
Docker:
Docker version 20.10.21, build baeda1f82a
And, truth be told, I can't reproduce the panic reliably.
I know it does not work when I start a clean container with empty mounts as a user (in this case, user: "510:80", which is my user and the "admin" group on Mac).
But I also had the panic when starting as root, and I can't reproduce that anymore.
Stack trace
docker-compose.yml
generate-config.sh
Thank you for this detailed report. LMDB works in a bit of a special way, as it requires a large section of virtual memory address space to be reserved so that it can have a view into the entire database file (it doesn't have to read it all from disk into RAM, but it kind of pretends it did). Since we don't know in advance how big the LMDB database will be, we have to use a really big chunk of virtual address space for this, to be sure that we won't run into issues.
In theory there is no problem with mapping a huge portion of address space, as only small pieces of it are actually filled with data from the database file. The rest is not assigned to any physical memory on your computer, and it's the job of the OS kernel to read the data from disk and put it at the correct location in RAM every time LMDB tries to read from it.
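To illustrate the idea, here is a minimal Python sketch of the same mechanism. It is not LMDB's or Garage's actual code, and the function name `map_sparse_file` is made up for this example. It grows a temporary file without writing data, maps the whole file into the address space, and touches only the first few bytes. On filesystems that support sparse files, the space used on disk stays far below the mapped size.

```python
import mmap
import os
import tempfile


def map_sparse_file(size_bytes):
    """Map a file of size_bytes into memory, touch a few bytes, report sizes.

    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryFile() as f:
        # Grow the file without writing data. On most filesystems this
        # creates a sparse file that occupies almost no disk space.
        f.truncate(size_bytes)
        # Reserve size_bytes of virtual address space backed by the file.
        # If the process has a restrictive address-space limit, this call
        # is where an out-of-memory OSError would be raised.
        with mmap.mmap(f.fileno(), size_bytes) as view:
            view[0:5] = b"hello"      # only this page is actually touched
            view.flush()
            first = bytes(view[0:5])
        st = os.fstat(f.fileno())
        return {
            "mapped": size_bytes,
            "apparent": st.st_size,        # logical file size
            "on_disk": st.st_blocks * 512,  # space actually allocated
            "first": first,
        }


if __name__ == "__main__":
    print(map_sparse_file(64 * 1024 * 1024))
```

The mapping step is the part that depends on the address-space limit of the process, which is why a large map can fail even when plenty of RAM is free.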
I think the out-of-memory error here is most likely due to a parameter in the configuration of your OS kernel that limits the size of such chunks of memory that can be allocated. On your OSX machine, this would seem consistent with the fact that mapping this portion of memory is allowed as root, but forbidden as a non-root user. You can check whether a limitation on virtual memory is set using the ulimit -v command (make sure to run it in the same conditions as your Garage binary, so that you will see the limitations applied to it). You can also check /proc/<pid>/limits at the "Max address space" line.
Thanks a lot for the info, @lx, and sorry for the delayed reply.
The theory regarding limits makes a lot of sense.
I had a look at the limits (in running containers, just to be sure), and they are indeed different:
Arch Linux VPS
Void Linux Raspberry Pi
But even though I increased the number of open files to 1048576 and set the max locked memory to unlimited, it doesn't seem to make a difference for LMDB.
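The open-files and locked-memory limits are separate from the address-space limit that ulimit -v and the "Max address space" line report. As a minimal sketch for checking that specific limit from inside a container, the Python below reads it for the current process and parses text in the /proc/<pid>/limits layout. It is not part of Garage, and the names `parse_limits` and `address_space_limit` are made up for this example.

```python
import re
import resource

# Matches rows such as "Max address space   unlimited   unlimited   bytes".
_ROW = re.compile(r"^(Max .+?)\s+(unlimited|\d+)\s+(unlimited|\d+)")


def _value(field):
    """Convert a limits field to an int, or None for 'unlimited'."""
    return None if field == "unlimited" else int(field)


def parse_limits(text):
    """Parse /proc/<pid>/limits text into {name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines():
        m = _ROW.match(line)
        if m:
            limits[m.group(1)] = (_value(m.group(2)), _value(m.group(3)))
    return limits


def address_space_limit():
    """Return (soft, hard) RLIMIT_AS for this process; None means unlimited."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)

    def to_opt(v):
        return None if v == resource.RLIM_INFINITY else v

    return to_opt(soft), to_opt(hard)


if __name__ == "__main__":
    print("RLIMIT_AS (soft, hard):", address_space_limit())
    try:
        with open("/proc/self/limits") as f:
            parsed = parse_limits(f.read())
        print("Max address space:", parsed.get("Max address space"))
    except FileNotFoundError:
        print("/proc/self/limits is not available on this system")
```

Running this inside the container, as the same user as Garage, shows the limit that applies to the process doing the mapping.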
I'm not quite sure where to go from here. If the problem is limited to RPi installations (whatever may be different there), using SQLite isn't a huge dealbreaker.
Moreover, FWIW it might make more sense for me to open an issue in their issue tracker.
Either way, I guess I'll close the issue here: This problem doesn't seem to be specific to garage.