*"Will Aerogramme use a lot of RAM?" was the first question we asked ourselves
when designing email mailboxes as an encrypted event log. This blog post
compares our design assumptions with the real-world implementation,
similarly to what we have done [on Garage](https://garagehq.deuxfleurs.fr/blog/2022-perf/).*
<!-- more -->
---
## Methodology
Brendan Gregg, a very respected figure in the world of systems performance, says that, for many reasons,
[~100% of benchmarks are wrong](https://www.brendangregg.com/Slides/Velocity2015_LinuxPerfTools.pdf).
This benchmark will be wrong too in multiple ways:
1. It will not say anything about Aerogramme's performance in real-world deployments
2. It will not say anything about Aerogramme's performance compared to other email servers
However, I pursue a very specific goal with this benchmark: validating whether the assumptions we made
during the design phase, in terms of compute and memory complexity, hold in practice.
I will observe only two metrics: the CPU time used by the program (everything except idle and iowait, based on the [psutil](https://pypi.org/project/psutil/) code) for the compute complexity, and the [Resident Set Size](https://en.wikipedia.org/wiki/Resident_set_size) (data held in RAM) for the memory complexity.
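To make the CPU metric concrete, here is a minimal sketch of the "everything except idle and iowait" rule. It is not the psutil or psrecord implementation: the function names, the plain-dictionary input and the sample values are assumptions made for illustration only.

```python
from typing import Mapping

# Fields treated as "not working" time. This follows the rule stated above
# (everything except idle and iowait) and is an assumption of this sketch.
NON_BUSY_FIELDS = ("idle", "iowait")


def busy_seconds(cpu_times: Mapping[str, float]) -> float:
    """Sum every CPU time field except idle and iowait."""
    return sum(v for k, v in cpu_times.items() if k not in NON_BUSY_FIELDS)


def busy_percent(before: Mapping[str, float], after: Mapping[str, float]) -> float:
    """Share of the CPU time elapsed between two samples that was busy, in percent."""
    total = sum(after.values()) - sum(before.values())
    if total <= 0:
        return 0.0
    return 100.0 * (busy_seconds(after) - busy_seconds(before)) / total


# Made-up samples (seconds), not measurements.
before = {"user": 10.0, "system": 5.0, "idle": 80.0, "iowait": 5.0}
after = {"user": 16.0, "system": 7.0, "idle": 90.0, "iowait": 7.0}
print(busy_percent(before, after))  # 8 busy seconds out of 20 elapsed -> 40.0
```

With psutil installed, the memory metric needs no computation: `psutil.Process(pid).memory_info().rss` returns the Resident Set Size in bytes.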
<!--My baseline will be the compute and space complexity of the code that I have in mind. For example,
I know we have a "3 layers" data model: an index stored in RAM, a summary of the emails stored in K2V, a database, and the full email stored in S3, an object store.
Commands that can be solved only with the index should use a very low amount of RAM compared to . In turn, commands that require the full email will require to fetch lots of data from S3.-->
## Testing environment
I ran all the tests on my personal computer, a Dell Inspiron 7775 with an AMD Ryzen 7 1700, 16GB of RAM, an encrypted SSD, on NixOS 23.11.
The setup is made of Aerogramme (compiled in release mode) connected to a local, single node, Garage server.
Observations and graphs are produced all at once thanks to the [psrecord](https://github.com/astrofrog/psrecord) tool.
I did not try to make the following values reproducible, as this is more an exploration than a definitive review.
## Mailbox dataset
I will use [a dataset of 100 emails](https://git.deuxfleurs.fr/Deuxfleurs/aerogramme/src/commit/0b20d726bbc75e0dfd2ba1900ca5ea697645a8f1/tests/emails/aero100.mbox.zstd) I have made specifically for the occasion.
It contains some emails with various attachments, some emails with lots of text, emails generated by many different clients (Thunderbird, Geary, Sogo, Alps, Outlook iOS, GMail iOS, Windows Mail, Postbox, Mailbird, etc.), etc.
The mbox file weighs 23MB uncompressed.
One question that arises is: how representative of a real mailbox is this dataset? While a definitive answer is not possible, I compared the email sizes of this dataset to the 2 367 emails in my personal inbox.
Below I plotted the empirical distribution for both my dataset and my personal inbox (note that the x axis is not linear but logarithmic).
*[Get the 100 emails dataset](https://git.deuxfleurs.fr/Deuxfleurs/aerogramme/src/commit/0b20d726bbc75e0dfd2ba1900ca5ea697645a8f1/tests/emails/aero100.mbox.zstd) - [Get the CSV used to plot this graph](https://git.deuxfleurs.fr/Deuxfleurs/aerogramme/src/branch/perf/cpu-ram-bottleneck/tests/emails/mailbox_email_sizes.csv)*
We see that the curves are close together and follow the same pattern: most emails are between 1kB and 100kB, and then we have a long tail (up to 20MB in my inbox, up to 6MB in the dataset).
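For readers who want to reproduce this kind of comparison on their own mailbox, the empirical distribution can be computed in a few lines. This is a sketch only: the helper names are hypothetical and the sample sizes are made up, not taken from the dataset or from the CSV linked above.

```python
def ecdf(sizes):
    """Return (size, fraction of emails with size <= this one) pairs, sorted by size."""
    ordered = sorted(sizes)
    n = len(ordered)
    return [(size, (i + 1) / n) for i, size in enumerate(ordered)]


def share_between(sizes, low, high):
    """Fraction of emails whose size lies in [low, high] (bounds in bytes)."""
    if not sizes:
        return 0.0
    return sum(low <= s <= high for s in sizes) / len(sizes)


# Made-up sizes in bytes, for illustration only.
sample = [800, 2_500, 4_000, 12_000, 35_000, 90_000, 450_000, 6_000_000]

for size, fraction in ecdf(sample):
    print(f"{size:>9} B  {fraction:.3f}")

# Share of emails between 1 kB and 100 kB in the made-up sample.
print(share_between(sample, 1_000, 100_000))
```

Plotting the pairs returned by `ecdf` as a step curve with a logarithmic x axis gives the same kind of graph as the one discussed here.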
It's not that surprising: in many places on the Internet, the limit on email size is set to 25MB. Overall I am quite satisfied with this simple dataset, even if having one or two bigger emails could make it even more representative of my real inbox...
Mailboxes with only 100 emails are not that common (mine has 2k emails...), so to emulate bigger mailboxes, I simply inject the dataset multiple times (e.g. 20 times for 2k emails).
## Command dataset
Having a representative mailbox is one thing, but we also need to know which commands IMAP clients typically send.
As I have set up a test instance of Aerogramme (see [my FOSDEM talk](https://fosdem.org/2024/schedule/event/fosdem-2024-2642--servers-aerogramme-a-multi-region-imap-server/)),
I was able to extract 4 619 IMAP commands sent by various clients. Many of them are identical, and in the end, only 248 are truly unique.
The following bar plot depicts the command distribution per command name; top is the raw count, bottom is the unique count.
*[Get the IMAP command log](https://git.deuxfleurs.fr/Deuxfleurs/aerogramme/src/branch/perf/cpu-ram-bottleneck/tests/emails/imap_commands_dataset.log) - [Get the CSV used to plot this graph](https://git.deuxfleurs.fr/Deuxfleurs/aerogramme/src/branch/perf/cpu-ram-bottleneck/tests/emails/imap_commands_summary.csv)*
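The raw and unique counts can be derived from a command log with a short script. The sketch below rests on assumptions that may not match the linked log: each line is taken to look like `tag COMMAND arguments`, `UID FETCH` style commands are counted under the command that follows `UID`, and two commands are considered identical when everything after the tag is the same. The function names and sample lines are made up.

```python
from collections import Counter


def command_name(line):
    """Extract the command name from a line assumed to look like 'tag COMMAND args'."""
    parts = line.split()
    if len(parts) < 2:
        return None
    name = parts[1].upper()
    # 'UID FETCH' and similar: count them under the command that follows UID.
    if name == "UID" and len(parts) > 2:
        name = parts[2].upper()
    return name


def summarize(lines):
    """Return (raw, unique) counters keyed by command name."""
    raw, unique, seen = Counter(), Counter(), set()
    for line in lines:
        name = command_name(line)
        if name is None:
            continue
        raw[name] += 1
        # Ignore the client-chosen tag when deciding whether two commands are identical.
        body = " ".join(line.split()[1:])
        if body not in seen:
            seen.add(body)
            unique[name] += 1
    return raw, unique


# Made-up log lines, for illustration only.
log = [
    "a1 FETCH 1:* (FLAGS)",
    "a2 FETCH 1:* (FLAGS)",
    "a3 UID FETCH 1:5 (FLAGS)",
    "a4 NOOP",
    "a5 NOOP",
]
raw, unique = summarize(log)
print(raw.most_common())
print(unique.most_common())
```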
First, we can set some commands aside: LOGIN, CAPABILITY, ENABLE, SELECT, EXAMINE, CLOSE, UNSELECT and LOGOUT, as they are part of a **connection workflow**.
We do not plan on studying them directly as they will be used in all other tests.
CHECK, NOOP, IDLE, and STATUS are different approaches to detect a change in the current mailbox (or in other mailboxes in the case of STATUS);
I group these commands together as a **notification** mechanism.
FETCH, SEARCH and LIST are **query** commands: the first two for emails, the last one for mailboxes.
FETCH is by far the most used command (1187 occurrences), with the most variations (128 unique combinations of parameters).
SEARCH is also used a lot (658 occurrences, 14 unique).
APPEND, STORE, EXPUNGE, MOVE, COPY, LSUB, SUBSCRIBE, CREATE, DELETE are commands to **write** things: flags, emails or mailboxes.
They are not used a lot but some writes are hidden in other commands (CLOSE, FETCH), and when mails arrive, they are delivered through a different protocol (LMTP) that does not appear here.
In the following, we will assume that APPEND behaves more or less like an LMTP delivery.
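The four groups described in this section can be written down as a small lookup table, which is convenient when aggregating a command log per group. The names `CATEGORIES`, `CATEGORY_OF` and `categorize`, as well as the `"other"` fallback, are hypothetical and only serve this sketch.

```python
# Command groups as described in this section.
CATEGORIES = {
    "connection": ["LOGIN", "CAPABILITY", "ENABLE", "SELECT", "EXAMINE",
                   "CLOSE", "UNSELECT", "LOGOUT"],
    "notification": ["CHECK", "NOOP", "IDLE", "STATUS"],
    "query": ["FETCH", "SEARCH", "LIST"],
    "write": ["APPEND", "STORE", "EXPUNGE", "MOVE", "COPY", "LSUB",
              "SUBSCRIBE", "CREATE", "DELETE"],
}

# Reverse index: command name -> group name.
CATEGORY_OF = {cmd: group for group, cmds in CATEGORIES.items() for cmd in cmds}


def categorize(command):
    """Return the group of an IMAP command name, or 'other' if it is not listed."""
    return CATEGORY_OF.get(command.upper(), "other")


print(categorize("fetch"))   # query
print(categorize("APPEND"))  # write
```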
<!--
Focus on `FETCH` (128 unique commands), `SEARCH` (14 unique commands)