use netapp streaming body #343

Merged
lx merged 31 commits from netapp-stream-body into main 2022-09-13 13:26:09 +00:00
Owner

TODO:

- [x] Test OrderTag works (check trace-level logs in integration test)
- [x] Publish netapp 0.5.0 and use that as a dependency (revert change to .nix to pull from git repo)
- [x] In get with range, use streaming also (use StreamExt::scan for slicing)
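The last TODO item mentions slicing a streamed body with StreamExt::scan. As a rough sketch of that idea (not the code merged in this PR), the example below uses the standard library's synchronous `Iterator::scan` as a stand-in for the async `StreamExt::scan` from the `futures` crate. The function name `slice_chunks` and the sample data are hypothetical, chosen only for illustration. The scan state tracks the byte offset of each incoming chunk, so only the bytes inside the requested range are emitted and iteration stops once the range has been passed.

```rust
use std::ops::Range;

/// Keep only the bytes of `chunks` that fall inside `range`, where `range`
/// is given in byte offsets over the concatenation of all chunks.
/// (Hypothetical helper: a synchronous analogue of slicing a byte stream.)
fn slice_chunks<I>(chunks: I, range: Range<usize>) -> impl Iterator<Item = Vec<u8>>
where
    I: Iterator<Item = Vec<u8>>,
{
    chunks
        // Scan state: offset of the first byte of the current chunk.
        .scan(0usize, move |offset, chunk| {
            let chunk_start = *offset;
            let chunk_end = chunk_start + chunk.len();
            *offset = chunk_end;

            if chunk_start >= range.end {
                // Past the end of the range: returning None ends the iteration,
                // so later chunks are never pulled from the source.
                return None;
            }
            let from = range.start.max(chunk_start);
            let to = range.end.min(chunk_end);
            if from >= to {
                // Chunk lies entirely before the range: emit an empty part.
                Some(Vec::new())
            } else {
                Some(chunk[from - chunk_start..to - chunk_start].to_vec())
            }
        })
        // Drop the empty parts produced for chunks before the range.
        .filter(|part| !part.is_empty())
}

fn main() {
    let chunks = vec![
        b"hello ".to_vec(),
        b"streaming ".to_vec(),
        b"world".to_vec(),
    ];
    let sliced: Vec<u8> = slice_chunks(chunks.into_iter(), 4..18).flatten().collect();
    println!("{}", String::from_utf8_lossy(&sliced));
}
```

With an async stream the shape would be similar, except that the closure passed to `StreamExt::scan` returns a future resolving to an `Option` rather than the `Option` itself.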
lx force-pushed netapp-stream-body from 433cbd65d1 to 6a78c0715c 2022-07-22 14:46:02 +00:00 Compare
lx changed target branch from main to lx-perf-improvements 2022-07-22 14:46:11 +00:00
lx force-pushed netapp-stream-body from d888c9c193 to fe5dadb756 2022-07-22 17:03:16 +00:00 Compare
lx force-pushed netapp-stream-body from 9c1889c630 to f728893dae 2022-07-25 10:05:19 +00:00 Compare
lx force-pushed netapp-stream-body from f728893dae to 326d418367 2022-07-25 10:06:52 +00:00 Compare
Author
Owner

Currently stalled because of an issue I'm unable to fix. `test-smoke.sh` reproduces it fairly consistently: transfers (GetObject requests) get stuck partway through. I don't understand exactly what is going on, but entire netapp connections appear to be blocked, since pings start timing out; it is not just a single stream ending prematurely.

Lines 188 and 229 in `block/manager.rs` need to be commented out for the bug to happen: with these lines commented out, nodes won't prioritize reading blocks from local storage and will instead ask remote nodes most of the time. This is the condition under which the issue happens. (When request prioritization is enabled, nodes in test-smoke always read locally, so the bug doesn't happen.)

I don't want to spend too much time on this; merging it is not a high priority.

Next steps: ??? Maybe try to reproduce the issue with a simpler netapp program rather than an entire Garage (here there are too many connections open at once and we can't really see what is happening).
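A later commit in this PR is titled "netapp recv with unbounded channel removes deadlock". The sketch below is not netapp's code and is not a diagnosis of the bug described above. It only illustrates, under assumptions, one general way a multiplexed connection can stall: if a single receive loop forwards chunks into a bounded per-stream queue and that stream's consumer stops reading, a blocking send would park the loop and starve everything else on the connection, pings included. The function names are hypothetical, and `try_send` is used instead of a blocking `send` so the example terminates.

```rust
use std::sync::mpsc::{channel, sync_channel, TrySendError};

/// Push `n` chunks into a bounded queue whose consumer never reads.
/// Returns how many chunks were accepted before the queue was full.
/// (Hypothetical stand-in for a connection's receive loop.)
fn dispatch_bounded(n: usize, capacity: usize) -> usize {
    // `_rx` is kept alive but never read: a stalled consumer.
    let (tx, _rx) = sync_channel::<Vec<u8>>(capacity);
    let mut accepted = 0;
    for i in 0..n {
        match tx.try_send(vec![i as u8]) {
            Ok(()) => accepted += 1,
            // A blocking `send` would wait here indefinitely, and the loop
            // could no longer serve any other stream on the connection.
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}

/// Same loop with an unbounded queue: sends never block, at the cost of
/// buffering every chunk in memory until the consumer catches up.
fn dispatch_unbounded(n: usize) -> usize {
    let (tx, _rx) = channel::<Vec<u8>>();
    let mut accepted = 0;
    for i in 0..n {
        if tx.send(vec![i as u8]).is_ok() {
            accepted += 1;
        }
    }
    accepted
}

fn main() {
    println!("bounded queue accepted {} of 10 chunks", dispatch_bounded(10, 2));
    println!("unbounded queue accepted {} of 10 chunks", dispatch_unbounded(10));
}
```

The trade-off in this sketch is liveness versus memory: the unbounded queue keeps the receive loop responsive, but nothing limits how much data piles up behind a slow consumer.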

lx force-pushed netapp-stream-body from a1c224e2e8 to e935861854 2022-07-29 10:25:11 +00:00 Compare
lx added 5 commits 2022-08-29 14:45:10 +00:00
continuous-integration/drone Build is passing Details
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone/push Build is passing Details
8cd02639dc
drone: set TARGET env as needed by "to_s3" func
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone/tag Build is passing Details
continuous-integration/drone Build is passing Details
continuous-integration/drone/push Build is passing Details
2c7bae935a
Configure structopt to report the right version
By default, structopt reports the value provided by
the env var CARGO_PKG_VERSION, which Cargo sets when reading
Cargo.toml. However, for Garage we use versioning based on git,
so we often report a version that is behind the real version.
In this commit, we create garage_util::version::garage() that
reports the right version and configure all structopt subcommands
to call this function instead of using the env var.
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone Build is passing Details
continuous-integration/drone/push Build is pending Details
532eca7ff9
Add some documentation for Caddy
continuous-integration/drone/push Build is passing Details
continuous-integration/drone/pr Build is passing Details
ebc20a8798
Merge branch 'main' into lx-perf-improvements
continuous-integration/drone/push Build is failing Details
continuous-integration/drone/pr Build is failing Details
1921f4f7e6
Merge branch 'lx-perf-improvements' into netapp-stream-body
lx added 2 commits 2022-08-29 14:48:48 +00:00
continuous-integration/drone/push Build is passing Details
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone Build is passing Details
4da67b0035
Update drone signature
continuous-integration/drone/push Build is failing Details
continuous-integration/drone/pr Build is failing Details
continuous-integration/drone Build is failing Details
52749e28f7
Merge branch 'lx-perf-improvements' into netapp-stream-body
lx added 1 commit 2022-08-29 15:25:05 +00:00
continuous-integration/drone/push Build is failing Details
continuous-integration/drone/pr Build is failing Details
5d065b8a0f
cargo2nix fix to fetchCrateGit
lx added 1 commit 2022-08-29 15:32:54 +00:00
continuous-integration/drone/pr Build is failing Details
continuous-integration/drone/push Build was killed Details
continuous-integration/drone Build was killed Details
322dafc761
Try to fix clippy
lx added 5 commits 2022-08-31 15:42:37 +00:00
continuous-integration/drone/push Build is passing Details
continuous-integration/drone/pr Build is passing Details
dd5304f6fc
Replace logging crate pretty_env_logger by tracing_subscriber::fmt
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone/push Build is passing Details
44cd98d2e4
Tracing-subscriber: write to stderr
continuous-integration/drone/pr Build is failing Details
continuous-integration/drone/push Build is failing Details
efbca67ce4
Add env filter to tracing subscriber
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone/push Build is passing Details
eb97e13a6a
update cargo.nix
continuous-integration/drone/pr Build is failing Details
continuous-integration/drone/push Build is failing Details
c9bc9d89de
Merge branch 'lx-perf-improvements' into netapp-stream-body
lx added 1 commit 2022-08-31 17:27:36 +00:00
continuous-integration/drone/push Build is failing Details
continuous-integration/drone/pr Build is failing Details
e598231ca4
update netapp git commit
lx added 1 commit 2022-08-31 17:44:38 +00:00
continuous-integration/drone/push Build is failing Details
continuous-integration/drone/pr Build is failing Details
70231d68b2
Fix bytes_read counter
lx added 1 commit 2022-09-01 07:47:45 +00:00
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone/push Build is passing Details
4b726b0941
netapp recv with unbounded channel removes deadlock
lx added 1 commit 2022-09-01 10:58:46 +00:00
continuous-integration/drone/push Build is failing Details
continuous-integration/drone/pr Build is failing Details
bc977f9a7a
Update to Netapp with OrderTag support and exploit OrderTags
lx added 1 commit 2022-09-01 12:24:05 +00:00
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone/push Build is passing Details
f3bf34b6a1
update netapp: straming + fix-ping
lx added 2 commits 2022-09-01 14:31:13 +00:00
continuous-integration/drone/push Build is passing Details
continuous-integration/drone/pr Build is passing Details
e648bf7b69
update cargo.nix
lx added 1 commit 2022-09-01 14:35:53 +00:00
continuous-integration/drone/pr Build is failing Details
continuous-integration/drone/push Build is failing Details
99b532b85b
Apply PRIO_SECONDARY to block data transfers
lx added 1 commit 2022-09-02 11:38:45 +00:00
continuous-integration/drone/push Build is passing Details
continuous-integration/drone/pr Build is passing Details
1ef87ac4cb
cargo fmt
lx added 1 commit 2022-09-02 11:46:55 +00:00
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone/push Build is passing Details
13b5f28c7e
Make use of BytesBuf from new Netapp
lx added 1 commit 2022-09-06 17:31:54 +00:00
continuous-integration/drone/pr Build is failing Details
continuous-integration/drone/push Build is passing Details
c2cc08852b
Reenable node ordering
lx added 1 commit 2022-09-06 17:45:16 +00:00
continuous-integration/drone/pr Build is failing Details
continuous-integration/drone/push Build is passing Details
4024822585
Update netapp to lastest git version with LAS scheduling
lx added 12 commits 2022-09-06 20:13:21 +00:00
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone/push Build is passing Details
6226f5ceca
Update to netapp 0.4.5 - fixed ping
continuous-integration/drone/push Build is passing Details
943d76c583
Ability to dynamically set resync tranquility
continuous-integration/drone/push Build is passing Details
47be652a1f
block manager: refactor: split resync into separate file
continuous-integration/drone/push Build is passing Details
5e8baa433d
Make BlockManagerLocked fully private again
continuous-integration/drone/pr Build is failing Details
continuous-integration/drone/push Build is failing Details
5d4b937a00
Ability to have up to 4 concurrently working resync workers
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone/push Build is passing Details
e1751c8a9c
fix clippy
continuous-integration/drone/push Build is passing Details
0009fd136c
Merge pull request 'Make block resync speed dynamically configurable' (#369) from resync-ajustable-speed into main
Included in this PR:

- [x] Small refactor, resync code is moved to a separate `block/resync.rs` file
- [x] Block resync tranquility is no longer in config file, it is set dynamically using `garage worker set resync-tranquility` (this parameter is persisted over Garage restarts)
- [x] Up to 4 block resync workers can be activated to run simultaneously to speed up big resyncs, this parameter is set dynamically using `garage worker set resync-n-workers`

Reviewed-on: #369
continuous-integration/drone/push Build is passing Details
continuous-integration/drone/pr Build is passing Details
07e6bcde85
Merge branch 'main' into lx-perf-improvements
continuous-integration/drone/push Build is passing Details
continuous-integration/drone/pr Build is passing Details
fd8074ad9b
Update .drone.yml signature
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone/push Build is passing Details
d23b3a14fc
Merge branch 'main' into lx-perf-improvements
continuous-integration/drone/push Build is passing Details
continuous-integration/drone/pr Build is passing Details
6b958979bd
Merge branch 'lx-perf-improvements' into netapp-stream-body
lx added 1 commit 2022-09-06 20:25:41 +00:00
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone/push Build is passing Details
907054775d
Faster copy, better get error message
lx added 31 commits 2022-09-08 13:51:01 +00:00
a6e40b75ea Add feature "system-libs" to enable linking against system libraries
If this feature is enabled, libsodium-sys and zstd-sys will link
dynamically against system-provided libraries instead of building
and linking statically the bundled (possibly outdated and vulnerable)
copies of them. This feature is intended mainly for linux package
maintainers.
continuous-integration/drone/push Build is pending Details
continuous-integration/drone/pr Build is pending Details
7511ba5530
Allow linking against system-provided libsqlite
Unfortunately, rusqlite uses the opposite logic from the others
(libsodium-sys, zstd-sys) for enabling/disabling bundled libraries.
Cargo features are very limited and don't allow enabling feature A
in a dependency iff feature B is disabled.

Note, lmdb-rkv-sys doesn't need any special treatment because it
automatically links against system liblmdb if found via pkgconf.

Linux distros should build garage with
`--no-default-features --features system-libs` to disable bundled-libs
and enable system-libs.
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone/push Build is passing Details
729a910e14
Remove Heed default features
db72812f01 Use the new cargo feature resolver "2"
Garage currently uses the legacy resolver "1". The new one is used
by default if the root package specifies 'edition = 2021', which
Garage does not (yet).

The problem with the legacy resolver is, among others, that features
enabled by dev-dependencies are propagated to normal dependencies.
This affects e.g. hyper - one of the dev-dependencies enables "http2"
feature that adds many extra dependencies. If we build garage without
opentelemetry-otlp (this is enabled in the following commit), there's
no normal dependency enabling "http2" feature.

See https://doc.rust-lang.org/cargo/reference/resolver.html#feature-resolver-version-2
e7af006c1c Make OTLP exporter optional via feature "telemetry-otlp"
opentelemetry-otlp add 48 (!) extra dependencies and increases the
size of the garage binary by ~11 % (with fat LTO).
continuous-integration/drone/pr Build is failing Details
ea36b9ff90
Allow building without Prometheus exporter (/metrics endpoint)
prometheus and opentelemetry-prometheus add 7 extra dependencies in
total and increases the size of the garage binary by ~7 % (with
fat LTO).
continuous-integration/drone/push Build is failing Details
continuous-integration/drone/pr Build is failing Details
ed7796924b
Merge pull request 'Make OTLP exporter optional and allow building without Prometheus exporter (/metrics)' (#372) from jirutka/garage:telemetry-and-metrics into improve-deps
Reviewed-on: #372
Reviewed-by: Alex <alex@adnab.me>
continuous-integration/drone/pr Build is failing Details
continuous-integration/drone/push Build is failing Details
8d77a76df1
Update .nix files
continuous-integration/drone/pr Build is failing Details
continuous-integration/drone/push Build is failing Details
7de53a4d66
Force disable pkg-config for libsodum-sys and libzstd-sys
continuous-integration/drone/pr Build is failing Details
continuous-integration/drone/push Build is failing Details
48ffaaadfc
Bump versions to 0.8.0 (compatibility is broken already)
continuous-integration/drone/push Build is pending Details
continuous-integration/drone/pr Build is pending Details
bbb970965c
Document available build features
continuous-integration/drone/push Build is pending Details
continuous-integration/drone/pr Build is pending Details
2c2b93acdf
Update Nix files with optional db engines
continuous-integration/drone/push Build is failing Details
continuous-integration/drone/pr Build is failing Details
431dee050f
Remove opentelemetry-otlp dep in api/
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone/push Build is passing Details
1e92e9f782
Disable k2v tests when feature is disabled
continuous-integration/drone/push Build is failing Details
continuous-integration/drone/pr Build is failing Details
0f5689c169
Include code from v0.5.1 directly to remove dependencies
continuous-integration/drone/push Build is passing Details
continuous-integration/drone/pr Build is passing Details
6f02c36a89
cargo fmt
continuous-integration/drone/pr Build is failing Details
continuous-integration/drone/push Build is passing Details
db61f41030
Move GIT_VERSION injection later in build chain to reduce build times
continuous-integration/drone/pr Build is failing Details
continuous-integration/drone/push Build is passing Details
28d86e7602
Report build features in garage --help
continuous-integration/drone/push Build is pending Details
continuous-integration/drone/pr Build is pending Details
2559f63e9b
Make all HTTP services optionnal
continuous-integration/drone/push Build is pending Details
continuous-integration/drone/pr Build is pending Details
2e00809af5
Error messages when system-libs XOR bundled-libs != 1
continuous-integration/drone/push Build is failing Details
continuous-integration/drone/pr Build is failing Details
1449204439
Add warnings when features are not included in build
continuous-integration/drone/pr Build is failing Details
continuous-integration/drone/push Build is passing Details
107853334b
Fix build error
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone/push Build is passing Details
06df301de5
Fix merge
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone/push Build is passing Details
f310fce34b
Inject GIT_VERSION even later
continuous-integration/drone/pr Build is passing Details
continuous-integration/drone/push Build is passing Details
ceb1f0229a
Move version back into util
continuous-integration/drone/push Build was killed Details
03c40a0b24
Merge pull request 'Reorganize dependencies' (#373) from improve-deps into main
This PR includes work from @jirutka :

- [x] Allow linking against system-provided libraries (libsodium, libsqlite, libzstd) #370
- [x] Make OTLP exporter optional and allow building without Prometheus exporter (/metrics) #372

And also:

- [x] Update `.nix` files
- [x] Remove heed default-features
- [x] Bump versions of all Garage crates to 0.8.0
- [x] Make db engines (lmdb, sled, sqlite) optional
- [x] Add documentation for available features
- [x] Directly include code of previous versions used for migration in order to reduce dependencies
- [x] Read variable `GIT_VERSION` from garage main instead of in crate garage_util to make builds faster
- [x] Report features used in the build somewhere? (in `garage --version` or something)
- [x] Check we `warn!` correctly if we try to use deactivated feature
- [x] Allow not to launch S3 endpoint if not in config

Reviewed-on: #373
continuous-integration/drone/push Build is passing Details
continuous-integration/drone/pr Build is passing Details
d9d199a6c9
Merge branch 'main' into lx-perf-improvements
continuous-integration/drone/push Build is passing Details
continuous-integration/drone/pr Build is passing Details
7f54706b95
Merge branch 'lx-perf-improvements' into netapp-stream-body
lx changed target branch from lx-perf-improvements to main 2022-09-12 14:38:57 +00:00
lx added 1 commit 2022-09-12 14:57:55 +00:00
continuous-integration/drone/push Build is passing Details
b823151a0b
improvements in block manager
lx added 1 commit 2022-09-13 11:12:03 +00:00
continuous-integration/drone/push Build is passing Details
continuous-integration/drone/pr Build is passing Details
28a4af73ca
Use netapp 0.5 published from crates.io
lx changed title from WIP: use netapp streaming body to use netapp streaming body 2022-09-13 12:40:56 +00:00
lx added 1 commit 2022-09-13 13:13:22 +00:00
continuous-integration/drone/push Build is passing Details
continuous-integration/drone/pr Build is passing Details
ff30891999
Use streaming block API for get with Range requests
lx merged commit 11bdc971e2 into main 2022-09-13 13:26:09 +00:00