WIP: Fjall DB engine #906

Draft
withings wants to merge 2 commits from withings/garage:feat/fjall-db-engine into main
Contributor

This is a draft implementation for a new meta backend based on LSM trees using fjall. A couple of things to note so far:

  • ITx::clear could not be implemented (short of iterating over and deleting every key), but I believe this method is never used. Since this is not the first time I have run into that issue, perhaps we could just remove the method?
  • New configuration option fjall_block_cache_size to set the block cache size.
  • I could not for the life of me find a way around the 'r lifetimes in ITx::range and ITx::range_rev, so I ended up cloning the bounds to avoid conflicts with '_.
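
To illustrate the first and third points, here is a minimal sketch of both workarounds. It assumes a hypothetical in-memory `MemTree` built on `BTreeMap`; it is not Garage's `ITx` trait and does not use fjall's API. `own_bound` copies each borrowed bound into an owned `Vec<u8>`, so the returned iterator borrows only the tree, and `clear_by_iteration` is the "iterate and delete all keys" fallback.

```rust
use std::collections::btree_map::Range;
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical stand-in for one tree inside a transaction.
struct MemTree {
    map: BTreeMap<Vec<u8>, Vec<u8>>,
}

/// Copy a borrowed bound into an owned one, so the result no longer
/// depends on the lifetime of the caller's slice.
fn own_bound(b: Bound<&[u8]>) -> Bound<Vec<u8>> {
    match b {
        Bound::Included(k) => Bound::Included(k.to_vec()),
        Bound::Excluded(k) => Bound::Excluded(k.to_vec()),
        Bound::Unbounded => Bound::Unbounded,
    }
}

impl MemTree {
    /// Range scan whose iterator borrows only `self`, not the bounds.
    fn range<'a>(
        &'a self,
        low: Bound<&[u8]>,
        high: Bound<&[u8]>,
    ) -> Range<'a, Vec<u8>, Vec<u8>> {
        self.map
            .range::<Vec<u8>, _>((own_bound(low), own_bound(high)))
    }

    /// Fallback `clear`: collect the keys first, then delete them one by
    /// one. Returns the number of keys removed.
    fn clear_by_iteration(&mut self) -> usize {
        let keys: Vec<Vec<u8>> = self.map.keys().cloned().collect();
        for k in &keys {
            self.map.remove(k);
        }
        keys.len()
    }
}
```

The cost of the owned bounds is one allocation per bound per scan. The cost of the fallback `clear` is linear in the number of keys, which is why removing the method may be preferable if nothing calls it.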

Write performance has been pretty low so far, so there is definitely room for improvement in this PR. Using the dashboard from #851:

image

image

image

In my test setup there were also some nasty crashes, which have yet to be explained... Somehow the backend messes with the integrity of the Merkle trees...

======== PANIC (internal Garage error) ========
panicked at /home/build/garage/src/table/merkle.rs:219:10:
called `Option::unwrap()` on a `None` value
Panics are internal errors that Garage is unable to handle on its own.
They can be caused by bugs in Garage's code, or by corrupted data in
the node's storage. If you feel that this error is likely to be a bug
in Garage, please report it on our issue tracker a the following address:
     https://git.deuxfleurs.fr/Deuxfleurs/garage/issues
Please include the last log messages and the the full backtrace below in
your bug report, as well as any relevant information on the context in
which Garage was running when this error occurred.
GARAGE VERSION: git:v1.0.1-13-gc50cab80-modified [features: k2v, lmdb, sqlite, fjall, metrics, bundled-libs]
BACKTRACE:
    0: garage::main::{{closure}}::{{closure}}
              at home/build/garage/src/garage/main.rs:135:21
    1: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/alloc/src/boxed.rs:2245:9
       std::panicking::rust_panic_with_hook
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panicking.rs:805:13
    2: std::panicking::begin_panic_handler::{{closure}}
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panicking.rs:664:13
    3: std::sys::backtrace::__rust_end_short_backtrace
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/sys/backtrace.rs:170:18
    4: rust_begin_unwind
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panicking.rs:662:5
    5: core::panicking::panic_fmt
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/panicking.rs:74:14
    6: core::panicking::panic
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/panicking.rs:148:5
    7: core::option::unwrap_failed
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/option.rs:2015:5
    8: core::option::Option<T>::unwrap
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/option.rs:965:21
       garage_table::merkle::MerkleUpdater<F,R>::update_item_rec
              at home/build/garage/src/table/merkle.rs:211:28
    9: garage_table::merkle::MerkleUpdater<F,R>::update_item_rec
              at home/build/garage/src/table/merkle.rs:149:28
   10: garage_table::merkle::MerkleUpdater<F,R>::update_item::{{closure}}
              at home/build/garage/src/table/merkle.rs:111:22
       <garage_db::TxFn<F,R,E> as garage_db::ITxFn>::try_on
              at home/build/garage/src/db/lib.rs:417:13
   11: <garage_db::metric_proxy::MetricITxFnProxy as garage_db::ITxFn>::try_on
              at home/build/garage/src/db/metric_proxy.rs:149:3
   12: <garage_db::fjall_adapter::FjallDb as garage_db::IDb>::transaction
              at home/build/garage/src/db/fjall_adapter.rs:228:13
   13: <garage_db::metric_proxy::MetricDbProxy as garage_db::IDb>::transaction::{{closure}}
              at home/build/garage/src/db/metric_proxy.rs:135:7
       garage_db::metric_proxy::MetricDbProxy::instrument
              at home/build/garage/src/db/metric_proxy.rs:50:13
       <garage_db::metric_proxy::MetricDbProxy as garage_db::IDb>::transaction
              at home/build/garage/src/db/metric_proxy.rs:134:3
   14: garage_db::Db::transaction
              at home/build/garage/src/db/lib.rs:110:16
       garage_table::merkle::MerkleUpdater<F,R>::update_item
              at home/build/garage/src/table/merkle.rs:108:3
       garage_table::merkle::MerkleUpdater<F,R>::updater_loop_iter
              at home/build/garage/src/table/merkle.rs:85:4
       <garage_table::merkle::MerkleWorker<F,R> as garage_util::background::worker::Worker>::work::{{closure}}::{{closure}}
              at home/build/garage/src/table/merkle.rs:318:13
       <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
              at home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/blocking/task.rs:42:21
       tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
              at home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/core.rs:328:17
       tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
              at home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/loom/std/unsafe_cell.rs:16:9
       tokio::runtime::task::core::Core<T,S>::poll
              at home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/core.rs:317:13
       tokio::runtime::task::harness::poll_future::{{closure}}
              at home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/harness.rs:485:19
       <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/panic/unwind_safe.rs:272:9
       std::panicking::try::do_call
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panicking.rs:554:40
       std::panicking::try
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panicking.rs:518:19
       std::panic::catch_unwind
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panic.rs:345:14
       tokio::runtime::task::harness::poll_future
              at home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/harness.rs:473:18
       tokio::runtime::task::harness::Harness<T,S>::poll_inner
              at home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/harness.rs:208:27
       tokio::runtime::task::harness::Harness<T,S>::poll
              at home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/harness.rs:153:15
       tokio::runtime::task::raw::poll
              at home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/raw.rs:271:5
   15: tokio::runtime::task::raw::RawTask::poll
              at home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/raw.rs:201:18
       tokio::runtime::task::UnownedTask<S>::run
              at home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/task/mod.rs:453:9
   16: tokio::runtime::blocking::pool::Task::run
              at home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/blocking/pool.rs:159:9
       tokio::runtime::blocking::pool::Inner::run
              at home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/blocking/pool.rs:513:17
       tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
              at home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/blocking/pool.rs:471:13
       std::sys::backtrace::__rust_begin_short_backtrace
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/sys/backtrace.rs:154:18
   17: std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}}
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/thread/mod.rs:522:17
       <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/panic/unwind_safe.rs:272:9
       std::panicking::try::do_call
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panicking.rs:554:40
       std::panicking::try
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panicking.rs:518:19
       std::panic::catch_unwind
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panic.rs:345:14
       std::thread::Builder::spawn_unchecked_::{{closure}}
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/thread/mod.rs:521:30
       core::ops::function::FnOnce::call_once{{vtable.shim}}
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/ops/function.rs:250:5
   18: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/alloc/src/boxed.rs:2231:9
       <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/alloc/src/boxed.rs:2231:9
       std::sys::pal::unix::thread::Thread::new::thread_start
              at rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/sys/pal/unix/thread.rs:105:17
withings added 2 commits 2024-11-21 17:37:42 +00:00
Contributor

I think it would be interesting to see metrics for individual KV operations. I don't see how raw write speed should be slow or get slower, so I'm wondering if the PutObject call is (heavily) read I/O limited, and if so, why.

Also, how much data was written roughly with how much cache and memory, and how large were the resulting trees?
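
As a rough idea of what per-operation metrics could look like, here is a small sketch. `OpStats` and its methods are hypothetical names for illustration, not Garage's existing metrics proxy. It wraps each KV call in a timer and keeps a call count and total duration per operation name.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-operation counters: call count and total time.
#[derive(Default)]
struct OpStats {
    stats: HashMap<&'static str, (u64, Duration)>,
}

impl OpStats {
    /// Run `f`, recording one call and its elapsed time under `op`.
    fn instrument<T>(&mut self, op: &'static str, f: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = f();
        let elapsed = start.elapsed();
        let entry = self.stats.entry(op).or_insert((0, Duration::ZERO));
        entry.0 += 1;
        entry.1 += elapsed;
        out
    }

    /// Mean latency for `op`, or `None` if it was never called.
    fn mean(&self, op: &str) -> Option<Duration> {
        self.stats.get(op).map(|(n, total)| *total / (*n as u32))
    }
}
```

Comparing the mean latency of reads against writes during a PutObject workload would show whether the call is read-bound.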


One obvious thing I'm noticing across the code base is that IDs are fully random. A time-ordered format such as UUIDv7 or ULID works much better with every KV store, because it implicitly places new keys close together, which helps with locality. If it is possible for Garage to use a time-ordered ID format, I would strongly recommend it.
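
A minimal sketch of the idea, assuming a 16-byte ID and a hypothetical `time_ordered_id` helper. The layout is only UUIDv7-like: it puts a 48-bit millisecond timestamp first and does not set the UUID version or variant bits. The random bytes are passed in by the caller so that the example needs nothing beyond the standard library.

```rust
/// Build a 16-byte ID: 6 bytes of big-endian Unix milliseconds followed
/// by 10 caller-supplied random bytes. Illustrative layout only, not a
/// spec-compliant UUIDv7 or ULID.
fn time_ordered_id(unix_ms: u64, random: [u8; 10]) -> [u8; 16] {
    let mut id = [0u8; 16];
    // Keep the low 48 bits of the timestamp, most significant byte
    // first, so that byte-wise key order follows creation time.
    id[..6].copy_from_slice(&unix_ms.to_be_bytes()[2..]);
    id[6..].copy_from_slice(&random);
    id
}
```

Because the timestamp occupies the leading bytes, IDs created in different milliseconds compare in creation order regardless of the random suffix, so keys written around the same time land next to each other in the tree.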

Reference: Deuxfleurs/garage#906