# Spam account management for Forgejo

## Usage

- Remove `model.json` to start without a pre-existing model of what is spam,
  or keep it to reuse the current classifier. The file is updated as you use
  the tool: the classifier learns from your spam/legit decisions and should
  get progressively better at identifying spam.
- Run `cargo run`.
- Classify users as spam or not spam. By default, the classification is only
  stored locally in `db.json` and no concrete action is taken (see the
  `ACTUALLY_BAN_USERS` environment variable below).

## Configuration

Forgery reads the following environment variables:
- `FORGE_URL`: URL of the Forgejo instance (e.g. `https://git.deuxfleurs.fr`)
- `FORGE_API_TOKEN`: Forgejo API token *granting admin access*. Required. You
  can generate an API token using the Forgejo web interface in `Settings ->
  Applications -> Generate New Token`.
- `ORG_NAME`: organization name (used in the notification email sent when
  locking accounts)
- `ADMIN_CONTACT_EMAIL`: email that can be used to contact admins of your
  instance (included in the notification email sent when locking accounts)
- `ACTUALLY_BAN_USERS`: set it to `true` to actually lock user accounts, send
  notification emails and eventually delete user accounts. If unset (the
  default) or set to `false`, no action is taken: spammers are only listed in
  the database. Set it to `true` in production, but probably not when testing.
- `STORAGE_BACKEND`: either `local` (default) or `s3`. Choose `local` to store
  the application state in local files, or `s3` to store it in S3-compatible
  storage (see below for the corresponding configuration variables).
- `LISTEN_ADDR`: address on which the webserver listens (default: `0.0.0.0`)
- `LISTEN_PORT`: port on which the webserver listens (default: `8080`)
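
As an illustration of the variables above, a dry-run setup for local testing
might look like the following sketch. All values are placeholders, and it is an
assumption that `ORG_NAME` and `ADMIN_CONTACT_EMAIL` need to be set even when no
email is sent. `ACTUALLY_BAN_USERS` is left unset so that no account is touched.

```shell
# Placeholder values throughout: replace them with your own.
export FORGE_URL="https://forge.example.org"
export FORGE_API_TOKEN="replace-with-an-admin-token"
export ORG_NAME="Example Org"
export ADMIN_CONTACT_EMAIL="admin@example.org"

# Listen on localhost only while testing (the defaults are 0.0.0.0 and 8080).
export LISTEN_ADDR="127.0.0.1"
export LISTEN_PORT="8080"

# Make sure the dry-run default applies: nothing is locked or deleted.
unset ACTUALLY_BAN_USERS
```

With these exported, `cargo run` starts the tool and spam/legit decisions are
only recorded in `db.json`.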

Environment variables read when `ACTUALLY_BAN_USERS=true`:
- `SMTP_ADDRESS`: address of the SMTP relay used to send email notifications
- `SMTP_USERNAME`: SMTP username
- `SMTP_PASSWORD`: SMTP password
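
A possible way to switch from dry-run mode to actually banning users is
sketched below. The relay name and credentials are hypothetical, and the exact
format expected for `SMTP_ADDRESS` (bare hostname, or hostname with port) is
not specified here, so a bare hostname is assumed.

```shell
# Hypothetical SMTP relay and credentials: replace them with your own.
# Assumption: SMTP_ADDRESS takes a bare hostname.
export SMTP_ADDRESS="smtp.example.org"
export SMTP_USERNAME="forgery"
export SMTP_PASSWORD="replace-with-the-smtp-password"

# Enable locking accounts, sending notification emails and deleting accounts.
export ACTUALLY_BAN_USERS="true"
```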

Environment variables read when `STORAGE_BACKEND=local`:
- `STORAGE_LOCAL_DIR`: path to the local directory where the application data
  is stored (as two files, `db.json` and `model.json`). Defaults to `.` if not
  defined.

Environment variables read when `STORAGE_BACKEND=s3`:
- `STORAGE_S3_BUCKET`: name of the bucket where the application data is stored
  (as two entries, `db.json` and `model.json`).
- `AWS_DEFAULT_REGION`: S3 endpoint region
- `AWS_ENDPOINT_URL`: S3 endpoint URL
- `AWS_ACCESS_KEY_ID`: S3 access key ID
- `AWS_SECRET_ACCESS_KEY`: S3 secret access key
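
For instance, pointing the tool at an S3-compatible store could look like the
sketch below. The bucket name, region, endpoint and keys are all hypothetical
placeholders to be replaced with the values of your own provider.

```shell
# Hypothetical bucket, region, endpoint and keys: replace them with your own.
export STORAGE_BACKEND="s3"
export STORAGE_S3_BUCKET="forgery-data"
export AWS_DEFAULT_REGION="us-east-1"
export AWS_ENDPOINT_URL="https://s3.example.org"
export AWS_ACCESS_KEY_ID="replace-with-the-key-id"
export AWS_SECRET_ACCESS_KEY="replace-with-the-key-secret"
```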

## Todos

- Discuss the current design choices for when locking the account or sending
  the notification email fails.
  (The current behavior is to retry periodically and to never delete an
   account that could not be locked, but to delete the account after the grace
   period even if the email could not be sent…)
- Auth: add support for connecting to the forge using OAuth?
- Improve error handling? Currently the app panics if writing to the storage
  backend fails. Can we do better?