Here is why it's a positive thing: now the memory consumption of Aerogramme is capped approximately by the biggest email accepted by your system (25MB, multiplied by a small constant).

It also has other benefits: it prevents file descriptor and other network resource exhaustion, and it adds fairness between users.
Indeed, a user can't monopolize all the resources of the server (CPU, I/O, etc.) anymore, so requests from multiple users
are intertwined by the server (we assume the number of users is much greater than the number of cores). And again, it leads to better predictability, as the completion of a given user's requests will be less impacted by other users' requests.

Another concern is the RAM consumption of the IDLE feature.
The real cause is that we retain the full user profile in memory (mailbox data, IO connectors, etc.): we should instead
keep only the minimum data needed to wake the user up. That's the ideal fix, the final solution, but it would take a lot of time to design and implement.
This fix is not necessary *now*; instead, we can simply try to optimize the size of a full user profile in memory.
On this aspect, the [aws-sdk-s3](https://crates.io/crates/aws-sdk-s3) crate has [the following note](https://docs.rs/aws-sdk-s3/1.16.0/aws_sdk_s3/client/index.html):

> Client construction is expensive due to connection thread pool initialization, and should be done once at application start-up.

Digging deeper into the crate's dependencies, in the [aws-smithy-runtime](https://crates.io/crates/aws-smithy-runtime) crate [we can read](https://docs.rs/aws-smithy-runtime/1.1.7/aws_smithy_runtime/client/http/hyper_014/struct.HyperClientBuilder.html):

> [Constructing] a Hyper client with the default TLS implementation (rustls) [...] can be useful when you want to share a Hyper connector between multiple generated Smithy clients.

It seems to be exactly what we want to do: to the best of my knowledge and my high-level understanding of the Rust aws-sdk ecosystem, the thread pool referenced earlier
is in fact the thread pool created by the Hyper Client. Looking at the [Hyper 0.14 client documentation](https://docs.rs/hyper/0.14.28/hyper/client/index.html), we indeed learn that:

> The default Client provides these things on top of the lower-level API: [...] A pool of existing connections, allowing better performance when making multiple requests to the same hostname.

That's exactly what we want: we are making requests to a single hostname, so we could share a single connection pool instead of *n* separate ones, where *n* is the number of connected users!

However, it now means sharing a single hyper client among multiple threads. Before Hyper 0.11.2, [it was even impossible](https://stackoverflow.com/questions/44866366/how-can-i-use-hyperclient-from-another-thread).
Starting from 0.11.3, the client's pool is behind an Arc reference, which allows sharing it between threads, but that is not necessarily desirable: we now have synchronization on this object. Given our workload (a high number of users, with the load expected to be evenly spread between them), a share-nothing architecture is possible: ideally we would want one thread per core, and as little communication as possible between these threads. Like the other design changes, this is long-term planning; for now, having a bit more synchronization is an acceptable trade-off. Two small sketches of these ideas follow.
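
To make the first idea concrete, here is a minimal sketch (not Aerogramme's actual code) of sharing a single hyper-based HTTP client between several per-user S3 clients. It assumes aws-sdk-s3 1.x with aws-smithy-runtime 1.1 and its `tls-rustls` feature enabled; the variable names are made up for the example:

```rust
use aws_config::BehaviorVersion;
use aws_smithy_runtime::client::http::hyper_014::HyperClientBuilder;

#[tokio::main]
async fn main() {
    // Build the hyper-based HTTP client (and thus its connection pool) once,
    // at application start-up, as the aws-sdk-s3 documentation recommends.
    let http_client = HyperClientBuilder::new().build_https();

    // Inject it into the shared SDK configuration.
    let sdk_config = aws_config::defaults(BehaviorVersion::latest())
        .http_client(http_client)
        .load()
        .await;

    // Every per-user S3 client is built from this configuration and reuses
    // the same connector instead of initializing its own pool.
    let _user_a = aws_sdk_s3::Client::new(&sdk_config);
    let _user_b = aws_sdk_s3::Client::new(&sdk_config);
}
```

If this understanding of the SDK is correct, the number of pools (and of the threads behind them) stops growing with the number of connected users.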
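
And here is a rough sketch of the share-nothing direction, again an illustration rather than a committed design: one OS thread per core, each running its own single-threaded Tokio runtime, with users partitioned across these shards so that threads barely have to communicate.

```rust
use std::thread;
use tokio::runtime::Builder;

fn main() {
    let cores = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);

    // One OS thread per core, each with its own single-threaded Tokio runtime:
    // a shard owns its users' state and shares (almost) nothing with the others.
    let shards: Vec<_> = (0..cores)
        .map(|shard_id| {
            thread::spawn(move || {
                let runtime = Builder::new_current_thread()
                    .enable_all()
                    .build()
                    .expect("failed to build the shard runtime");
                runtime.block_on(async move {
                    // Hypothetical per-shard loop: accept and serve only the
                    // users assigned to `shard_id`.
                    println!("shard {shard_id} started");
                });
            })
        })
        .collect();

    for shard in shards {
        shard.join().expect("a shard thread panicked");
    }
}
```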

## User feedback