Done with the article content
parent 5adc7baf24
commit 718830a0b0
1 changed file with 72 additions and 7 deletions
|
@ -1,10 +1,11 @@
|
||||||
+++
|
+++
|
||||||
title="Does Aerogramme use lot of RAM?"
|
title="Does Aerogramme use too much RAM?"
|
||||||
date=2024-02-15
|
date=2024-02-15
|
||||||
+++
|
+++
|
||||||
|
|
||||||
*"Will Aerogramme use lot of RAM" was the first question we asked ourselves
|
*"Will Aerogramme use too much RAM?" was the first question we asked ourselves
|
||||||
when designing email mailboxes as an encrypted event log. This blog post
|
when designing email mailboxes as an encrypted event log, which is very different from
|
||||||
|
existing designs that are heavily optimized. This blog post
|
||||||
tries to evaluate our design assumptions against the real-world implementation,
|
tries to evaluate our design assumptions against the real-world implementation,
|
||||||
similarly to what we have done [on Garage](https://garagehq.deuxfleurs.fr/blog/2022-perf/).*
|
similarly to what we have done [on Garage](https://garagehq.deuxfleurs.fr/blog/2022-perf/).*
|
||||||
|
|
||||||
|
@ -265,6 +266,12 @@ and 2) mutualizing the notification/wake up mechanism.
|
||||||
|
|
||||||
## Query Commands
|
## Query Commands
|
||||||
|
|
||||||
|
Query commands are the most used commands in IMAP.
|
||||||
|
They are very expressive and allow the client to fetch only what it needs:
|
||||||
|
a list of emails without their content, an email body without
|
||||||
|
its attachments, etc.
|
||||||
|
Of course, this expressivity creates complexity!
|
||||||
|
|
||||||
### Fetching emails
|
### Fetching emails
|
||||||
|
|
||||||
Often, IMAP clients are at first only interested in email metadata.
|
Often, IMAP clients are at first only interested in email metadata.
|
||||||
|
@ -317,7 +324,7 @@ UID SEARCH BEFORE 2024-02-09
|
||||||
First, we start with a SEARCH command inspired by what we have seen in the logs on the whole mailbox, and that can be run
|
First, we start with a SEARCH command inspired by what we have seen in the logs on the whole mailbox, and that can be run
|
||||||
without fetching the full email from the blob storage.
|
without fetching the full email from the blob storage.
|
||||||
|
|
||||||
![Search meta](./search-meta.png)
|
![Search meta](search-meta.png)
|
||||||
|
|
||||||
Spike order: 1) artifact, ignored, 2) login+select, 3) search, 4) logout
|
Spike order: 1) artifact, ignored, 2) login+select, 3) search, 4) logout
|
||||||
We load ~10MB in memory to make our request that is quite fast.
|
We load ~10MB in memory to make our request that is quite fast.
|
||||||
|
@ -325,7 +332,7 @@ We load ~10MB in memory to make our request that is quite fast.
|
||||||
But we also know that some SEARCH requests will require fetching some content
|
But we also know that some SEARCH requests will require fetching some content
|
||||||
from the S3 object storage, and in this case, the profile is different.
|
from the S3 object storage, and in this case, the profile is different.
|
||||||
|
|
||||||
![Search body](./search-body.png)
|
![Search body](search-body.png)
|
||||||
|
|
||||||
We have the same profile as FETCH FULL: a huge allocation of memory and a very CPU intensive task.
|
We have the same profile as FETCH FULL: a huge allocation of memory and a very CPU intensive task.
|
||||||
The conclusion is similar to FETCH: while these commands are OK to be slow, it's not OK to allocate so much memory.
|
The conclusion is similar to FETCH: while these commands are OK to be slow, it's not OK to allocate so much memory.
|
||||||
|
@ -345,8 +352,66 @@ for the LIST command in itself.
|
||||||
|
|
||||||
## Discussion
|
## Discussion
|
||||||
|
|
||||||
*TODO*
|
At this level of maturity, the main goal for Aerogramme is predictable and stable server-side resource usage.
|
||||||
|
Indeed, there is nothing more annoying than a user, honest or malicious, breaking the server by running
|
||||||
|
a resource-intensive command.
|
||||||
|
|
||||||
|
**Querying bodies** - These commands may load the full mailbox in RAM. For some parameters (e.g. FETCH BODY),
|
||||||
|
it might be possible to precompute data, but for others (e.g. SEARCH TEXT "something") it is not, so precomputing
|
||||||
|
is not a definitive solution. Being slow is also acceptable here: we just want to avoid being resource intensive. The solution I envision is to "stream"
|
||||||
|
the fetched emails and process them one by one. For some commands like FETCH 1:* BODY[], which to the best of my knowledge are
|
||||||
|
never run by real clients, this will not be enough, however. Indeed, Aerogramme will drop the fetched emails, but it will still hold an in-memory copy inside
|
||||||
|
the response object awaiting delivery. So we might want to implement response streaming too.
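
The "one by one" idea can be sketched as follows. This is only an illustration in Python, not Aerogramme's code: `fake_blob_storage`, `search_text_eager` and `search_text_streaming` are hypothetical names, the `(uid, raw bytes)` shape is an assumption, and the plain substring match ignores the case-insensitivity and MIME decoding a real SEARCH TEXT would need.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical shape for this sketch: (uid, raw email bytes).
Email = Tuple[int, bytes]


def fake_blob_storage() -> Iterator[Email]:
    """Stand-in for the blob storage: yields emails lazily, one at a time."""
    yield 1, b"Subject: hello\r\n\r\nsomething here"
    yield 2, b"Subject: other\r\n\r\nnothing to see"
    yield 3, b"Subject: again\r\n\r\nsomething else"


def search_text_eager(fetch: Callable[[], Iterable[Email]], needle: bytes) -> List[int]:
    """Materializes the whole mailbox first: peak memory grows with mailbox size."""
    emails = list(fetch())
    return [uid for uid, raw in emails if needle in raw]


def search_text_streaming(fetch: Callable[[], Iterable[Email]], needle: bytes) -> Iterator[int]:
    """Keeps a single email alive at a time and yields matching UIDs as it goes."""
    for uid, raw in fetch():
        if needle in raw:
            yield uid


if __name__ == "__main__":
    # Both variants return the same UIDs; only their peak memory differs.
    print(list(search_text_streaming(fake_blob_storage, b"something")))
```

Because the streaming variant is itself a generator, the same pattern could in principle feed a streamed response, which is the second half of the problem described above.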
|
||||||
|
|
||||||
|
**argon2** - Login resource usage is due to argon2, and compared to many other protocols, authentication in IMAP occurs far more often.
|
||||||
|
For example, if a user configures their email client to poll their mailbox every 5 minutes, this client will authenticate 288 times in a single day!
|
||||||
|
argon2 resource usage is a tradeoff with brute-force difficulty; reducing its resource usage is thus a possible performance mitigation,
|
||||||
|
at the cost of reduced security. Other options might lie in the evolution of our authentication system: even if I don't
|
||||||
|
know the current state of OAuth support in existing IMAP clients, it could be considered as an option.
|
||||||
|
With such a scheme, the user enters their password once, during the configuration of their account, and then a token is generated and stored in the client.
|
||||||
|
The credentials could be stored in the token (think of a JSON Web Token for example), avoiding the expensive KDF on each connection.
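
The login count above is simple arithmetic, sketched here in Python. The helper names are made up for the illustration, and the 0.5 second KDF cost is a placeholder assumption, not a measurement from this benchmark.

```python
def logins_per_day(poll_interval_minutes: int) -> int:
    """Number of authentications per day for a client polling at a fixed interval."""
    return (24 * 60) // poll_interval_minutes


def kdf_cpu_seconds_per_day(poll_interval_minutes: int, kdf_seconds_per_login: float) -> float:
    """Daily CPU time spent in the KDF for one polling client."""
    return logins_per_day(poll_interval_minutes) * kdf_seconds_per_login


if __name__ == "__main__":
    print(logins_per_day(5))  # 288, as in the text
    # 0.5 s per login is a hypothetical figure, used only to show the scaling.
    print(kdf_cpu_seconds_per_day(5, 0.5))
```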
|
||||||
|
|
||||||
|
**IDLE** - IDLE RAM usage is the same as for other commands, but IDLE keeps the RAM allocated for much longer.
|
||||||
|
To illustrate my point, let's suppose an IMAP session uses 5MB of RAM,
|
||||||
|
and two populations of 1 000 users. We suppose users monitor only one mailbox (which is not true in many cases).
|
||||||
|
The first population, aka *the poll population*, configures their client to poll the server every 5 minutes; each poll takes 2 seconds (LOGIN + SELECT + LOGOUT).
|
||||||
|
The second population, aka *the push population*, configures their client to "receive push messages", i.e. to use IMAP IDLE.
|
||||||
|
|
||||||
|
For the *push population*, the RAM usage will
|
||||||
|
be a stable 5GB (5MB * 1 000 users): all users will always be connected.
|
||||||
|
With a 1MB session, we could reduce the RAM usage to 1GB: any improvement on the session base RAM
|
||||||
|
will be critical to IDLE with the current design.
|
||||||
|
|
||||||
|
For the *poll population*, we can split the time into 150 ticks (5 minutes / 2 seconds = 150 ticks).
|
||||||
|
It seems the problem can be mapped to a [Balls into bins problem](https://en.wikipedia.org/wiki/Balls_into_bins_problem)
|
||||||
|
with random allocation (yes, I know, assuming random allocation might not hold in many situations).
|
||||||
|
Based on the Wikipedia formula, if I am not mistaken, we can suppose that, with high probability, at most 13 clients will be connected
|
||||||
|
at once, which means 65MB of RAM usage (5MB * 13 clients/tick = 65MB/tick).
|
||||||
|
|
||||||
|
With these *back-of-the-envelope calculations*, we understand how crucial the IDLE RAM consumption is compared
|
||||||
|
to other commands, and how the base RAM consumption of a user will impact the service.
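
These estimates can be reproduced with a few lines of Python. The sketch assumes the heavily loaded case of the balls-into-bins bound (m > n ln n), takes the max load as roughly m/n + sqrt(m ln n / n), and treats the hidden constant of that asymptotic bound as 1. That is a guess at how the figure of 13 was obtained, not something the text states.

```python
import math

SESSION_MB = 5              # RAM per IMAP session, as supposed in the text
USERS = 1_000               # size of each population
POLL_PERIOD_S = 5 * 60      # one poll every 5 minutes
POLL_DURATION_S = 2         # LOGIN + SELECT + LOGOUT


def push_ram_mb(users: int, session_mb: int) -> int:
    """Push population: every user holds a session open all the time."""
    return users * session_mb


def poll_max_clients(users: int, ticks: int) -> int:
    """Balls-into-bins estimate of the busiest tick (heavily loaded case).

    Assumes the constant hidden in the asymptotic bound is 1.
    """
    return math.ceil(users / ticks + math.sqrt(users * math.log(ticks) / ticks))


if __name__ == "__main__":
    ticks = POLL_PERIOD_S // POLL_DURATION_S          # 150 ticks
    peak = poll_max_clients(USERS, ticks)             # 13 clients
    print(f"push: {push_ram_mb(USERS, SESSION_MB)} MB")
    print(f"poll: {peak} clients, {peak * SESSION_MB} MB")
```

Under these assumptions the push population needs about 77 times more RAM than the poll population (5 000MB versus 65MB).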
|
||||||
|
|
||||||
|
**Large email streaming** - Finally, email streaming could really improve RAM usage for large emails, especially on APPEND and LMTP delivery,
|
||||||
|
or even on FETCH when the body is required. But implementing such a feature would require an email parser that
|
||||||
|
can work on streams, which in turn [is not something trivial](https://github.com/rust-bakery/nom/issues/1160).
|
||||||
|
|
||||||
|
While it seems untimely to act now, these spots are great candidates for closer monitoring in future performance evaluations.
|
||||||
|
Fixing these points, beyond the simple mitigations, will involve significant design changes in Aerogramme, which means
|
||||||
|
in the end: writing a lot of code! That's why I think Aerogramme can work with these "limitations" for now,
|
||||||
|
and we will make decisions about these points when it is *really* required.
|
||||||
|
|
||||||
## Conclusion
|
## Conclusion
|
||||||
|
|
||||||
*TODO*
|
Back to the question "Does Aerogramme use too much RAM?",
|
||||||
|
it of course depends on the context.
|
||||||
|
For now, we want to start with 1 000 users, a 1k-email INBOX, and a server with 8GB of RAM,
|
||||||
|
and Aerogramme seems ready for a first deployment in this context. Of course, in the long term,
|
||||||
|
we expect better resource usage.
|
||||||
|
|
||||||
|
Based on this benchmark, I identify 3 low-hanging fruits to improve performance that do not require major design changes
|
||||||
|
: 1) in FETCH+SEARCH queries, handling emails one after another instead of loading the full mailbox in memory
|
||||||
|
, 2) streaming FETCH responses instead of aggregating them in memory
|
||||||
|
, and 3) reducing the RAM usage of a base user by tweaking its Garage connectors configuration.
|
||||||
|
|
||||||
|
Collecting production data will then help prioritize other, more ambitious work on the authentication side,
|
||||||
|
on the idling side, and on the email streaming side.
|
||||||
|
|