Discussing Arc<Pool>

This commit is contained in:
Quentin 2024-02-23 13:17:43 +01:00
parent 723cc14227
commit e1e4497c64
Signed by: quentin
GPG key ID: E9602264D639FF68

View file

@ -37,11 +37,27 @@ by your system (25MB, multiplied by a small constant).
It has also other benefits: it prevents the file descriptor and other network ressource exhaustion,
and it adds fairness between users. Indeed, a user can't monopolize all the ressources of the servers (CPU, I/O, etc.) anymore, and thus multiple users
requests are thus intertwined by the server. And again, it leads to better predictability, as per-user requests completion will be less impacted
by other requests.
requests are thus intertwined by the server (we assume the number of users is greatly superior to the number of cores). And again, it leads to better predictability, as per-user requests completion will be less impacted by other requests.
Another concern is the RAM consumption with the IDLE feature.
The real cause is that we retain in-memory the full user profile (mailbox data, IO connectors, etc.): we should instead
keep only the minimum data to be waken-up. That's the ideal fix, the final solution, that would take lots of time to design and implement.
This fix is not necessary *now*, instead we can simply try to optimize the size of a full user profile in memory.
On this aspect, the [aws-sdk-s3](https://crates.io/crates/aws-sdk-s3) crate has [the following note](https://docs.rs/aws-sdk-s3/1.16.0/aws_sdk_s3/client/index.html):
> Client construction is expensive due to connection thread pool initialization, and should be done once at application start-up.
Digging deeper in the crate dependencies, we learn from the [aws-smithy-runtime](https://crates.io/crates/aws-smithy-runtime) crate, [we can read](https://docs.rs/aws-smithy-runtime/1.1.7/aws_smithy_runtime/client/http/hyper_014/struct.HyperClientBuilder.html):
> [Constructing] a Hyper client with the default TLS implementation (rustls) [...] can be useful when you want to share a Hyper connector between multiple generated Smithy clients.
It seems to be exactly what we want to do: to the best of my knowledge and my high-level understanding of the Rust aws-sdk ecosystem, the thread pool referenced early
is in fact the thread pool created by the Hyper Client. Looking at [Hyper 0.14 client documentation](https://docs.rs/hyper/0.14.28/hyper/client/index.html), we indeed learn that:
> The default Client provides these things on top of the lower-level API: [...] A pool of existing connections, allowing better performance when making multiple requests to the same hostname.
That's exactly what we want: we are doing requests to a single hostname, so we could have a single TCP connection, instead of *n* connections where *n* is the number of connected users! However, it now means sharing a tokio client among multiple threads. Before Hyper 0.11.2, [it was even impossible](https://stackoverflow.com/questions/44866366/how-can-i-use-hyperclient-from-another-thread). Starting from 0.11.3, the Client Pool is behind an Arc reference which allows to share it between threads, which is not necessarily something desirable: we now have synchronizations on this object. Given our workloads (a high number of users, where we expect the load to be evenly spread between them), a share nothing architecture is possible. So ideally we want one thread per core, and as few communication as possible between these threads. Like all other design changes: we are discussing long-term planning/changes, for now having a bit more synchronization could be an acceptable trade-off.
*TODO AWS SDK*
## Users feedbacks