Quentin 2024-02-17 17:52:41 +01:00
parent 03d58045ab
commit 2a6437600b
Signed by: quentin
GPG key ID: E9602264D639FF68
3 changed files with 71 additions and 17 deletions


@ -100,12 +100,7 @@ UID FETCH 2 (RFC822.HEADER BODY.PEEK[2]<0.10240>)
Flags, date, headers
```
SEARCH UNDELETED SINCE 17-Nov-2023
UID SEARCH HEADER "Message-ID" "<x@y.z>" UNDELETED
UID SEARCH 1:* UNSEEN
UID SEARCH BEFORE 9-Feb-2024
```
-->
@ -175,7 +170,7 @@ copy a same email multiple times in RAM.
It seems in this first test that Aerogramme is particularly sensitive to 1) login commands, due to argon2, and 2) large emails.
### Copy and Move
### Re-organizing your mailbox
You might need to organize your folders, copying or moving your email across your mailboxes.
COPY is a standard IMAP command, MOVE is an extension.
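To illustrate the difference, a client would typically check the capabilities advertised by the server and fall back to COPY, then flag and expunge, when MOVE is missing. The sketch below only builds command strings; `plan_move` is a hypothetical helper, not part of Aerogramme or of any client library.

```python
def plan_move(capabilities, sequence_set, destination):
    """Return the IMAP commands a client could send to move messages.

    Hypothetical helper for illustration: tags and response handling are omitted.
    """
    caps = {c.upper() for c in capabilities}
    if "MOVE" in caps:
        # The MOVE extension does everything in a single command.
        return [f"MOVE {sequence_set} {destination}"]
    # Fallback with standard commands only. Note that a plain EXPUNGE
    # removes every message flagged \Deleted, not only the ones copied here.
    return [
        f"COPY {sequence_set} {destination}",
        f"STORE {sequence_set} +FLAGS.SILENT (\\Deleted)",
        "EXPUNGE",
    ]


print(plan_move(["IMAP4rev1", "MOVE"], "1:10", "Archive"))
print(plan_move(["IMAP4rev1"], "1:10", "Archive"))
```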
@ -193,7 +188,7 @@ to keep the UI responsive.
While CPU optimizations could probably be imagined, I find this behavior satisfying, especially as memory remains stable and low.
### Setting flags
### Messing with flags
Setting flags (Seen, Deleted, Answered, NonJunk, etc.) is done through the STORE command.
Our run will be made in 3 parts: 1) putting one flag on one email, 2) putting 16 flags on one email, and 3) putting one flag on 1k emails.
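The three parts could look like the commands built below. This is only a sketch: the exact flags, and whether the third part is sent as a single STORE over a range or as 1k separate commands, are assumptions that the text does not specify. `store_command` and the `Keyword…` flag names are hypothetical.

```python
# 5 system flags plus 11 made-up keywords, 16 flags in total (assumed set).
SIXTEEN_FLAGS = ["\\Seen", "\\Answered", "\\Flagged", "\\Deleted", "\\Draft"] + [
    f"Keyword{i}" for i in range(11)
]


def store_command(sequence_set, flags):
    """Build an untagged STORE command that adds `flags` to the given messages."""
    return f"STORE {sequence_set} +FLAGS ({' '.join(flags)})"


runs = [
    store_command("1", ["\\Seen"]),       # 1) one flag on one email
    store_command("1", SIXTEEN_FLAGS),    # 2) 16 flags on one email
    store_command("1:1000", ["\\Seen"]),  # 3) one flag on 1k emails
]
for command in runs:
    print(command)
```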
@ -255,37 +250,96 @@ that maintain a socket opened. These commands are sensitive, as while many proto
are one shot, and then your users spread their requests over time, with these commands,
all your users are continuously connected.
In the graph below, we plot the resource usage of 16 users that log into the system,
select inbox, and switch to IDLE, then, one by one, they receive an email and are notified.
In the graph below, we plot the resource usage of 16 users, each with a mailbox of 100 emails, who log into the system,
select their inbox, and switch to IDLE; then, one by one, they receive an email and are notified.
![Idle Parallel](05-idle-parallel.png)
Memory usage is linear with the number of users.
If we extrapolate this observation, it would imply that 1k users = 2GB of RAM.
That's not something negligible, and it should be observed closely.
In the future, if it appears that's an issue, we could consider optimizations like 1) unloading the mailbox index
and 2) sharing the notification/wake-up mechanism between users.
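The extrapolation above is simple proportionality. As a rough worked example, assuming memory grows linearly from zero and counting 1GB as 1000MB, the 2GB figure corresponds to about 2MB per idle user (the function names are illustrative only):

```python
def per_user_mb(total_gb, users):
    """Memory per user in MB, assuming strictly linear growth from zero."""
    return total_gb * 1000 / users


def projected_gb(users, mb_per_user):
    """Total memory in GB for a given number of idle users."""
    return users * mb_per_user / 1000


cost = per_user_mb(2, 1000)       # 2.0 MB per user
print(cost)
print(projected_gb(16, cost))     # the 16 users of the test
print(projected_gb(10_000, cost)) # a larger deployment
```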
## Query Commands
`FETCH 1:* ALL`
### Fetching emails
Often, IMAP clients are at first only interested in email metadata.
For example, the ALL keyword fetches some metadata, like flags, size, sender, recipient, etc.
The resource usage of fetching this information on 1k emails is depicted below.
![Fetch All 1k mail](06-fetch-all.png)
`FETCH 1:* FULL`
The CPU spike is short and memory usage is low: nothing alarming in terms of performance.
![Fetch Full 1k mail](07-fetch-full.png)
IMAP standardizes another keyword, FULL, that also returns the "shape" of a MIME email as an S-Expression.
Indeed, a MIME email can be seen as a tree where each node or leaf is a "part".
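To make the tree idea concrete, here is a small sketch using Python's standard `email` module on a made-up message. It only keeps content types; the real IMAP body structure carries more fields (parameters, encoding, size, etc.), and this is not how Aerogramme computes it.

```python
from email import message_from_string

# A made-up message: a multipart/alternative and an attachment inside a multipart/mixed.
RAW = """\
From: alice@example.org
To: bob@example.org
Subject: demo
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="outer"

--outer
Content-Type: multipart/alternative; boundary="inner"

--inner
Content-Type: text/plain

hello
--inner
Content-Type: text/html

hello, but in html
--inner--

--outer
Content-Type: application/pdf

not really a pdf
--outer--
"""


def shape(part):
    """Return the content type of a leaf, or (content type, children) for a node."""
    if part.is_multipart():
        return (part.get_content_type(), [shape(p) for p in part.get_payload()])
    return part.get_content_type()


tree = shape(message_from_string(RAW))
print(tree)
```

Note that the whole message has to be parsed to obtain this shape, which is why not storing it is costly.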
Which crashed the Garage server:
In Aerogramme, as of 2024-02-17, this shape is neither pre-computed nor saved in the database, so the full email must be fetched and parsed.
So, when I tried to fetch this shape on 1k emails, Garage crashed:
```
ERROR hyper::server::tcp: accept error: No file descriptors available (os error 24)
```
`SEARCH`
Indeed, the open file limit (`ulimit -n`) is set to 1024 on my machine, and apparently, I tried to open more than 1024 descriptors
for a single request... It's definitely an issue that must be fixed, but for this article,
I will increase the limit to make the request succeed.
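For reference, the limit in question is the per-process `RLIMIT_NOFILE`. The sketch below shows how a Unix process can inspect and raise its own soft limit with Python's `resource` module; it only illustrates the mechanism. The target of 4096 is an arbitrary assumption, and in the test above the limit has to be raised for the server process itself, for example with `ulimit -n` in the shell that starts it.

```python
import resource


def pick_soft_limit(soft, hard, wanted):
    """Choose a new soft limit: never lower the current one, never exceed the hard one."""
    if soft == resource.RLIM_INFINITY:
        return soft
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)
    return max(soft, wanted)


def raise_nofile(wanted=4096):
    """Raise the soft open-files limit of the current process (Unix only)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    new_soft = pick_soft_limit(soft, hard, wanted)
    if new_soft != soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return new_soft


if __name__ == "__main__":
    print("open files limit before:", resource.getrlimit(resource.RLIMIT_NOFILE))
    print("soft limit after:", raise_nofile())
```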
I get the following graph.
*TODO*
![Fetch Full 1k mail](07-fetch-full.png)
`LIST`
With a spike at 300MB, it's clear we are fetching the full mailbox before starting to process it.
While it's a performance issue, it's also a stability/predictability issue: any user could trigger huge allocations on the server.
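One common way to bound such allocations is to fetch and process emails in fixed-size batches instead of loading everything first. The sketch below illustrates the idea in isolation, with stand-in `fetch` and `handle` callbacks and an arbitrary batch size; it is not Aerogramme's implementation nor a fix planned by the project.

```python
def batches(items, size):
    """Yield successive lists of at most `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def process_mailbox(keys, fetch, handle, batch_size=32):
    """Fetch and handle emails batch by batch; return the peak number held at once."""
    peak = 0
    for batch in batches(keys, batch_size):
        bodies = [fetch(k) for k in batch]  # at most batch_size bodies in memory
        peak = max(peak, len(bodies))
        for body in bodies:
            handle(body)
    return peak


seen = []
peak = process_mailbox(range(1000), lambda k: f"email-{k}", seen.append, batch_size=32)
print(len(seen), peak)
```

With this shape, both the number of simultaneous reads and the number of emails kept in memory depend on the batch size rather than on the mailbox size.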
### Searching
<!--
```
3 SEARCH HEADER "Message-ID" "<4a83801e-4848-fbe5-0afa-ef8592d99a52@saint-ex.deuxfleurs.org>" UNDELETED SINCE 1-Jan-2020
```
-->
<!--
```
SEARCH UNDELETED SINCE 17-Nov-2023
UID SEARCH HEADER "Message-ID" "<x@y.z>" UNDELETED
UID SEARCH 1:* UNSEEN
UID SEARCH BEFORE 9-Feb-2024
```
-->
First, we run, on the whole mailbox, a SEARCH command inspired by what we have seen in the logs, one that can be answered
without fetching the full email from the blob storage.
![Search meta](./search-meta.png)
Spike order: 1) artifact, ignored, 2) login+select, 3) search, 4) logout
We load ~10MB in memory to serve the request, which is quite fast.
But we also know that some SEARCH requests require fetching some content
from the S3 object storage, and in this case, the profile is different.
![Search body](./search-body.png)
We have the same profile as FETCH FULL: a huge allocation of memory and a very CPU intensive task.
The conclusion is similar to FETCH: while these commands are OK to be slow, it's not OK to allocate so much memory.
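The two profiles correspond to two classes of search keys. As a rough sketch, assuming that only the standard `BODY` and `TEXT` keys need the email content while flags, dates and headers are answered from metadata (an assumption about the split, not a description of Aerogramme's code), a query could be classified like this:

```python
# Assumed split: these standard search keys match against the email content.
CONTENT_KEYS = {"BODY", "TEXT"}


def needs_content(search_keys):
    """Tell whether a list of already-tokenized search key names needs the stored email."""
    return any(key.upper() in CONTENT_KEYS for key in search_keys)


print(needs_content(["HEADER", "UNDELETED", "SINCE"]))  # metadata only
print(needs_content(["UNDELETED", "BODY"]))             # needs the blob storage
```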
### Listing mailboxes
*TODO*
---
## Discussion
## Conclusion
*TBD*

Binary file not shown.


Binary file not shown.
