
# Spam account management for Forgejo

## Usage

- remove `model.json` to start without a pre-existing model of what is spam,
  or keep it to reuse the current classifier. The tool updates this file as
  you use it: the classifier learns from your spam/legit decisions and should
  get progressively better at identifying spam.
- run: `cargo run`
- classify users as spam/not spam. By default, the classification is only
  stored locally in `db.json` and no concrete action is taken (see the
  `ACTUALLY_BAN_USERS` environment variable below).
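
The steps above can be sketched as a dry-run session. This is a minimal
sketch, not an official script: the token value is a placeholder, and the
`Cargo.toml` check is only an assumption that you run it from the repository
root.

```shell
# Placeholder values: replace them with your own instance URL and admin token.
export FORGE_URL="https://git.deuxfleurs.fr"
export FORGE_API_TOKEN="replace-with-your-admin-token"

# Leave ACTUALLY_BAN_USERS unset (it defaults to false), so that decisions
# are only recorded in db.json and no account is locked or deleted.
unset ACTUALLY_BAN_USERS

# Optional: uncomment to start from an empty classifier.
# rm -f model.json

# Start the tool only when run from the repository root.
if [ -f Cargo.toml ]; then
    cargo run
fi
```

With the default `BIND_ADDR`, the web interface should then be reachable on
`127.0.0.1:8080`.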

## Configuration

Forgery reads the following environment variables:
- `FORGE_URL` (**mandatory**): URL of the Forgejo instance (e.g.
  https://git.deuxfleurs.fr)
- `FORGE_API_TOKEN` (**mandatory**): Forgejo API token *granting admin access*.
  You can generate an API token using the Forgejo web interface in `Settings ->
  Applications -> Generate New Token`.
- `ACTUALLY_BAN_USERS` (default: `false`): set it to `true` to actually lock
  user accounts, send notification emails and eventually delete user accounts.
  Otherwise, no action is taken: spammers are only listed in the database.
  Set this variable in production, but probably not when testing.
- `STORAGE_BACKEND` (default: `local`): either `local` or `s3`. Choose `local`
  to store the application state in local files, or `s3` to store it in
  S3-compatible storage (see below for corresponding configuration variables).
- `BIND_ADDR` (default: `127.0.0.1:8080`): address on which the webserver listens
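
Since two of these variables are mandatory, it can help to check the
environment before starting the tool. The function below is a hypothetical
helper, not part of Forgery; it only encodes the rules listed above
(`FORGE_URL` and `FORGE_API_TOKEN` must be set, `STORAGE_BACKEND` is `local`
or `s3` and defaults to `local`).

```shell
# Hypothetical pre-flight check, not part of forgery itself.
# Prints one line per problem on stderr and returns non-zero if any was found.
check_forgery_env() {
    rc=0
    # Mandatory variables
    if [ -z "${FORGE_URL:-}" ]; then
        echo "FORGE_URL is not set" >&2
        rc=1
    fi
    if [ -z "${FORGE_API_TOKEN:-}" ]; then
        echo "FORGE_API_TOKEN is not set" >&2
        rc=1
    fi
    # STORAGE_BACKEND is optional and defaults to local
    case "${STORAGE_BACKEND:-local}" in
        local|s3) ;;
        *)
            echo "STORAGE_BACKEND must be local or s3" >&2
            rc=1
            ;;
    esac
    return "$rc"
}
```

A possible use is `check_forgery_env && cargo run`, which only starts the
tool when the check passes.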

Environment variables read when `ACTUALLY_BAN_USERS=true`:
- `SMTP_ADDRESS`: address of the SMTP relay used to send email notifications
- `SMTP_USERNAME`: SMTP username
- `SMTP_PASSWORD`: SMTP password
- `ADMIN_CONTACT_EMAIL`: email that can be used to contact admins of your
  instance (included in the notification email sent when locking accounts)
- `ORG_NAME`: organization name (used in the notification email sent when
  locking accounts)
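
As an illustration, a production setup that actually acts on spam accounts
might export the following. All values are placeholders, and the exact format
expected for `SMTP_ADDRESS` (for example whether it includes a port) is an
assumption to check against your relay and the source code.

```shell
# Enable real actions: lock accounts, send emails, eventually delete accounts.
export ACTUALLY_BAN_USERS=true

# Placeholder SMTP relay settings (format of SMTP_ADDRESS is an assumption).
export SMTP_ADDRESS="smtp.example.org"
export SMTP_USERNAME="forgery@example.org"
export SMTP_PASSWORD="replace-with-your-smtp-password"

# Placeholder values included in the notification email.
export ADMIN_CONTACT_EMAIL="admin@example.org"
export ORG_NAME="Example Org"
```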

Environment variables read when `STORAGE_BACKEND=local`:
- `STORAGE_LOCAL_DIR` (default: `.`): path to the local directory in which the
  application data is stored (as two files `db.json` and `model.json`).
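
For example, to keep the state out of the source tree, one could point the
local backend at a dedicated directory. The `data` directory name is
hypothetical, and creating it beforehand is only a precaution in case Forgery
does not create it itself.

```shell
# Store db.json and model.json in ./data instead of the current directory.
export STORAGE_BACKEND=local
export STORAGE_LOCAL_DIR="$PWD/data"

# Make sure the directory exists before starting the tool.
mkdir -p "$STORAGE_LOCAL_DIR"
```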

Environment variables read when `STORAGE_BACKEND=s3`:
- `STORAGE_S3_BUCKET`: name of the bucket in which the application data is
  stored (as two entries `db.json` and `model.json`).
- `AWS_DEFAULT_REGION`: S3 endpoint region
- `AWS_ENDPOINT_URL`: S3 endpoint URL
- `AWS_ACCESS_KEY_ID`: S3 key id
- `AWS_SECRET_ACCESS_KEY`: S3 key secret
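
A configuration for the S3 backend might look like the sketch below. The
bucket name, region, endpoint and credentials are all placeholders to replace
with the values of your S3-compatible provider.

```shell
# Store db.json and model.json in an S3-compatible bucket.
export STORAGE_BACKEND=s3
export STORAGE_S3_BUCKET="forgery"

# Placeholder endpoint and credentials.
export AWS_DEFAULT_REGION="us-east-1"
export AWS_ENDPOINT_URL="https://s3.example.org"
export AWS_ACCESS_KEY_ID="replace-with-your-key-id"
export AWS_SECRET_ACCESS_KEY="replace-with-your-key-secret"
```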

## Todos

- discuss the current design choices for handling failures when locking an
  account or sending a notification email.
  (Current behavior: retry periodically, never delete an account that could
   not be locked, but delete the account after the grace period even if the
   email could not be sent…)
- auth: add support for connecting to the forge using OAuth?
- improve error handling? Currently the app panics if writing to the storage
  backend fails. Can we do better?
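
To make the first item easier to discuss, the current deletion rule can be
written down as a small decision function. This is only a reading of the
behavior described above, not the actual implementation; the function name
and its `true`/`false` arguments are hypothetical.

```shell
# Hypothetical sketch of the deletion rule described in the first item.
# Arguments: locked, grace_period_over, email_sent (each "true" or "false").
# Returns 0 (delete) only if the account was locked and the grace period is
# over. Whether the notification email was sent is deliberately ignored.
should_delete_account() {
    locked="$1"
    grace_period_over="$2"
    email_sent="$3"   # unused: a failed email does not block deletion
    [ "$locked" = "true" ] && [ "$grace_period_over" = "true" ]
}
```

For instance, `should_delete_account true true false` succeeds (the account
is deleted although the email failed), while
`should_delete_account false true true` fails (the account was never locked).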