# Spam account management for Forgejo

## Usage

- remove `model.json` to start without a pre-existing model of what is spam,
  or keep it to use the current classifier. The tool updates this file as you
  use it: the classifier learns from each spam/legit decision and should get
  progressively better at identifying spam.
- run: `cargo run`
- classify users as spam/not spam. By default, the classification is only
  stored locally in `db.json` and no concrete action is taken (see the
  `ACTUALLY_BAN_USERS` environment variable below).
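
As a sketch of the first step, the commands below set the current classifier aside instead of deleting it, so that it can be restored later. They assume `model.json` sits in the current directory; the `model.json.bak` name is only an illustration.

```shell
# Set the current classifier aside rather than deleting it outright.
# Assumes model.json is in the current directory; the .bak name is arbitrary.
if [ -f model.json ]; then
    mv model.json model.json.bak
fi
```

To go back to the previous classifier, move `model.json.bak` back to `model.json` before starting the tool again.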

## Configuration

Forgery reads the following environment variables:
- `FORGE_URL` (**mandatory**): URL of the Forgejo instance (e.g.
  https://git.deuxfleurs.fr)
- `FORGE_API_TOKEN` (**mandatory**): Forgejo API token *granting admin access*.
  You can generate an API token using the Forgejo web interface in `Settings ->
  Applications -> Generate New Token`.
- `ACTUALLY_BAN_USERS` (default: `false`): set it to `true` to actually lock
  user accounts, send notification emails and eventually delete user accounts.
  Otherwise, no action is taken: spammers are only listed in the database. The
  variable should be set to `true` in production, but probably not for
  testing.
- `STORAGE_BACKEND` (default: `local`): either `local` or `s3`. Choose `local`
  to store the application state in local files, or `s3` to store it in
  S3-compatible storage (see below for the corresponding configuration
  variables).
- `BIND_ADDR` (default: `127.0.0.1:8080`): address on which the webserver listens
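
For a first dry run, only the two mandatory variables need to be set. The snippet below is a minimal sketch: the URL and token are placeholders to replace with your own values, and the optional variables are written out with their documented defaults purely for illustration.

```shell
# Placeholder values: replace them with your own instance URL and admin token.
export FORGE_URL="https://forge.example.org"
export FORGE_API_TOKEN="replace-with-your-admin-token"

# Optional variables, shown here with their documented defaults.
export ACTUALLY_BAN_USERS=false   # dry run: decisions are only recorded
export STORAGE_BACKEND=local
export BIND_ADDR="127.0.0.1:8080"
```

With these exported in the current shell, `cargo run` starts the tool, and the web interface should then be reachable on the address given by `BIND_ADDR`.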

Environment variables read when `ACTUALLY_BAN_USERS=true`:
- `SMTP_ADDRESS`: address of the SMTP relay used to send email notifications
- `SMTP_USERNAME`: SMTP username
- `SMTP_PASSWORD`: SMTP password
- `ADMIN_CONTACT_EMAIL`: email address that can be used to contact the admins
  of your instance (included in the notification email sent when locking
  accounts)
- `ORG_NAME`: organization name (used in the notification email sent when
  locking accounts)
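
A production setup that actually bans users might look like the sketch below. Every value is a placeholder. In particular, the exact format expected for `SMTP_ADDRESS` (host name only, or host and port) is not specified here, so check it against your own relay setup.

```shell
# Enable destructive actions: accounts are locked, notified, then deleted.
export ACTUALLY_BAN_USERS=true

# Placeholder SMTP settings (the SMTP_ADDRESS format used here is an assumption).
export SMTP_ADDRESS="smtp.example.org"
export SMTP_USERNAME="forgery"
export SMTP_PASSWORD="replace-with-your-smtp-password"

# Included in the notification email sent when locking accounts.
export ADMIN_CONTACT_EMAIL="admin@example.org"
export ORG_NAME="Example Org"
```

Keeping these lines in a file readable only by the service user, and sourcing it before `cargo run`, avoids leaving the password in the shell history.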

Environment variables read when `STORAGE_BACKEND=local`:
- `STORAGE_LOCAL_DIR` (default: `.`): path to the local directory in which the
  application data is stored (as two files, `db.json` and `model.json`).
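
For example, to keep the state files out of the source tree, the local backend can be pointed at a dedicated directory. The `./forgery-data` path is hypothetical, and creating the directory beforehand is only a precaution: whether the tool creates it on its own is not stated here.

```shell
# Hypothetical data directory; any writable path would do.
export STORAGE_BACKEND=local
export STORAGE_LOCAL_DIR="./forgery-data"

# Create the directory up front in case the tool does not do so itself.
mkdir -p "$STORAGE_LOCAL_DIR"
```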

Environment variables read when `STORAGE_BACKEND=s3`:
- `STORAGE_S3_BUCKET`: name of the bucket in which the application data is
  stored (as two entries, `db.json` and `model.json`).
- `AWS_DEFAULT_REGION`: S3 endpoint region
- `AWS_ENDPOINT_URL`: S3 endpoint URL
- `AWS_ACCESS_KEY_ID`: S3 key id
- `AWS_SECRET_ACCESS_KEY`: S3 key secret
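
A sketch of an S3 setup, with placeholder values throughout: the bucket name, region, endpoint and credentials all need to come from your own S3-compatible provider.

```shell
# Switch the storage backend to S3 (all values below are placeholders).
export STORAGE_BACKEND=s3
export STORAGE_S3_BUCKET="forgery-state"
export AWS_DEFAULT_REGION="us-east-1"
export AWS_ENDPOINT_URL="https://s3.example.org"
export AWS_ACCESS_KEY_ID="replace-with-your-key-id"
export AWS_SECRET_ACCESS_KEY="replace-with-your-key-secret"
```

With this configuration, `db.json` and `model.json` are stored as two entries in the bucket rather than as local files.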

## Todos

- discuss the current design choices for when locking the account/sending a
  notification email fails.
  (Current behavior is to periodically retry, avoid deleting if the account
   could not be locked, but delete the account after the grace period even if
   the email could not be sent…)
- auth: add support for connecting to the forge using OAuth?
- improve error handling? currently the app will panic if writing to the storage
  backend fails. Can we do better?