It's mildly annoying when trying to execute these scripts while logged
in as a regular user, as the missing execute permissions will hinder
autocompletion even when trying to use with sudo.
These shell scripts don't contain secrets, but may fail when ran by a
regular user. The failure is due to the lack of access to the /matrix
directory, and does not result in any damage.
This commit introduces a new role that downloads and installs the
prometheus community postgres exporter https://github.com/prometheus-community/postgres_exporter.
A new credential is added to matrix_postgres_additional_databases that
allows the exporter access to the database to gather statistics.
A new dashboard was added to the grafana role, with some refactoring
to enable the dashboard only if the new role is enabled.
I've included some basic instructions for how to enable the role in
the Docs section.
In terms of testing, I've tested enabling the role, and disabling
it to make sure it cleans up the container and systemd role.
Added a mid-sized VPS configuration with configuration changes to the PostgreSQL database config.
Deleted single quotes in one of the examples to unify the examples
I changed the conditional statement in prosody systemd template to bind the localhost port by default if people have set ```matrix_nginx_proxy_enabled == false ```.
Hopefully that should make it the default behaviour now.
**HSTS Preloading:**
In its strongest and recommended form, the [HSTS policy](https://www.chromium.org/hsts) includes all subdomains, and indicates a willingness to be “preloaded” into browsers:
`Strict-Transport-Security: max-age=31536000; includeSubDomains; preload`
**X-Xss-Protection:**
`1; mode=block` which tells the browser to block the response if it detects an attack rather than sanitising the script.
@ -8,9 +8,7 @@ Members can be assigned a server from Digitalocean, or they can connect their ow
The AWX system is arranged into 'members' each with their own 'subscriptions'. After creating a subscription the user enters the 'provision stage' where they defined the URLs they will use, the servers location and whether or not there's already a website at the base domain. They then proceed onto the 'deploy stage' where they can configure their Matrix server.
Ideally this system can manage the updates, configuration, backups and monitoring on it's own. It is an extension of the popular deploy script [spantaleev/matrix-docker-ansible-deploy](https://github.com/spantaleev/matrix-docker-ansible-deploy).
Warning: This project is currently alpha quality and should only be run by the brave.
This system can manage the updates, configuration, import and export, backups and monitoring on its own. It is an extension of the popular deploy script [spantaleev/matrix-docker-ansible-deploy](https://github.com/spantaleev/matrix-docker-ansible-deploy).
## Other Required Playbooks
@ -23,6 +21,7 @@ The following repositories allow you to copy and use this setup:
[Ansible Provision Server](https://gitlab.com/GoMatrixHosting/ansible-provision-server) - Used by AWX members to perform initial configuration of their DigitalOcean or On-Premises server.
@ -4,8 +4,6 @@ The playbook can install and configure the [Mjolnir](https://github.com/matrix-o
See the project's [documentation](https://github.com/matrix-org/mjolnir) to learn what it does and why it might be useful to you.
Note: the playbook does not currently support the Mjolnir Synapse module. The playbook does support another antispam module, see [Setting up Synapse Simple Antispam](configuring-playbook-synapse-simple-antispam.md).
# Enabling metrics and graphs for Postgres (optional)
Expanding on the metrics exposed by the [synapse exporter and the node exporter](configuring-playbook-prometheus-grafana.md), the playbook enables the [postgres exporter](https://github.com/prometheus-community/postgres_exporter) that exposes more detailed information about what's happening on your postgres database.
You can enable this with the following settings in your configuration file (`inventory/host_vars/matrix.<your-domain>/vars.yml`):
```yaml
matrix_prometheus_postgres_exporter_enabled: true
# the role creates a postgres user as credential. You can configure these if required:
`matrix_prometheus_postgres_exporter_enabled`|Enable the postgres prometheus exporter. This sets up the docker container, connects it to the database and adds a 'job' to the prometheus config which tells prometheus about this new exporter. The default is 'false'
`matrix_prometheus_postgres_exporter_database_username`| The 'username' for the user that the exporter uses to connect to the database. The default is 'matrix_prometheus_postgres_exporter'
`matrix_prometheus_postgres_exporter_database_password`| The 'password' for the user that the exporter uses to connect to the database.
@ -6,8 +6,6 @@ It's a web UI tool you can use to **administrate users and rooms on your Matrix
See the project's [documentation](https://github.com/Awesome-Technologies/synapse-admin) to learn what it does and why it might be useful to you.
**Warning**: Synapse Admin will likely not work with Synapse v1.32 for now. See [this issue](https://github.com/Awesome-Technologies/synapse-admin/issues/132). If you insist on using Synapse Admin before there's a solution to this issue, you may wish to downgrade Synapse (adding `matrix_synapse_version: v1.31.0` or `matrix_synapse_version_arm64: v1.31.0` to your `inventory/host_vars/matrix.DOMAIN/vars.yml` file).
> **Note**: This migration guide is applicable if you migrate from one server to another server having the same CPU architecture (e.g. both servers being `amd64`).
>
> If you're trying to migrate between different architectures (e.g. `amd64` --> `arm64`), simply copying the complete `/matrix` directory is not possible as it would move the raw PostgreSQL data between different architectures. In this specific case, you can use the guide below as a reference, but you would also need to dump the database on your current server and import it properly on the new server. See our [Backing up PostgreSQL](maintenance-postgres.md#backing-up-postgresql) docs for help with PostgreSQL backup/restore.
# Migrating to new server
1. Prepare by lowering DNS TTL for your domains (`matrix.DOMAIN`, etc.), so that DNS record changes (step 4 below) would happen faster, leading to less downtime
PostgreSQL can be tuned to make it run faster. This is done by passing extra arguments to Postgres with the `matrix_postgres_process_extra_arguments` variable. You should use a website like https://pgtune.leopard.in.ua/ or information from https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server to determine what Postgres settings you should change.
**Note**: the configuration generator at https://pgtune.leopard.in.ua/ adds spaces around the `=` sign, which is invalid. You'll need to remove it manually (`max_connections = 300` -> `max_connections=300`)
### Here are some examples:
These are not recommended values and they may not work well for you. This is just to give you an idea of some of the options that can be set. If you are an experienced PostgreSQL admin feel free to update this documentation with better examples.
@ -106,11 +108,33 @@ These are not recommended values and they may not work well for you. This is jus
Here is an example config for a small 2 core server with 4GB of RAM and SSD storage:
```
matrix_postgres_process_extra_arguments: [
"-c 'shared_buffers=128MB'",
"-c 'effective_cache_size=2304MB'",
"-c 'effective_io_concurrency=100'",
"-c 'random_page_cost=2.0'",
"-c 'min_wal_size=500MB'",
"-c shared_buffers=128MB",
"-c effective_cache_size=2304MB",
"-c effective_io_concurrency=100",
"-c random_page_cost=2.0",
"-c min_wal_size=500MB",
]
```
Here is an example config for a 4 core server with 8GB of RAM on a Virtual Private Server (VPS); the paramters have been configured using https://pgtune.leopard.in.ua with the following setup: PostgreSQL version 12, OS Type: Linux, DB Type: Mixed type of application, Data Storage: SSD storage:
when:(purge_mode.find("No local users [recommended]") != -1) or (purge_mode.find("Number of users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1)
when:(purge_mode.find("No local users [recommended]") != -1) or (purge_mode.find("Number of users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1)
register:janitors_token
no_log:True
- name:Collect total number of rooms
- name:Copy build_room_list.py script to target machine
when:(purge_mode.find("No local users [recommended]") != -1) or (purge_mode.find("Number of users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1)
when:(purge_mode.find("No local users [recommended]") != -1) or (purge_mode.find("Number of users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1)
- name:Print total number of rooms
debug:
msg:'{{ rooms_total.stdout }}'
when:purge_rooms|bool
- name:Calculate every 100 values for total number of rooms
delegate_to:127.0.0.1
shell:|
seq 0 100 {{ rooms_total.stdout }}
when:purge_rooms|bool
register:every_100_rooms
- name:Fetch complete room list from target machine
when:(purge_mode.find("No local users [recommended]") != -1) or (purge_mode.find("Number of users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1)
- name:Ensure room_list_complete.json file exists
delegate_to:127.0.0.1
- name:Remove complete room list from target machine
when:(purge_mode.find("No local users [recommended]") != -1) or (purge_mode.find("Number of users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1)
when:(purge_mode.find("No local users [recommended]") != -1) or (purge_mode.find("Number of users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1)
when:(purge_mode.find("No local users [recommended]") != -1) or (purge_mode.find("Number of users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1)
when:(purge_mode.find("No local users [recommended]") != -1) or (purge_mode.find("Number of users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1)
when:(purge_mode.find("No local users [recommended]") != -1) or (purge_mode.find("Number of users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1)
- name:Collect epoche time from date
delegate_to:127.0.0.1
shell:|
date -d '{{ purge_date }}' +"%s"
when:purge_rooms|bool
when:(purge_mode.find("Numberof users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1)
register:purge_epoche_time
- name:Generate list of rooms with more then N users
when:(purge_mode.find("No local users [recommended]") != -1) or (purge_mode.find("Number of users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1) or (purge_mode.find("Skip purging rooms [faster]") != -1)
job_template:"{{ matrix_domain }} - 0 - Deploy/Update a Server"
tags:"rust-synapse-compress-state"
wait:yes
tower_host:"https://{{ tower_host }}"
tower_oauthtoken:"{{ tower_token.stdout }}"
validate_certs:yes
register:job
when:(purge_mode.find("No local users [recommended]") != -1) or (purge_mode.find("Number of users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1) or (purge_mode.find("Skip purging rooms [faster]") != -1)
- name:Stop Synapse service
shell:systemctl stop matrix-synapse.service
- name:Revert 'Deploy/Update a Server' job template
delegate_to:127.0.0.1
awx.awx.tower_job_template:
name:"{{ matrix_domain }} - 0 - Deploy/Update a Server"
description:"Creates a new matrix service with Spantaleev's playbooks"
when:(purge_mode.find("No local users [recommended]") != -1) or (purge_mode.find("Number of users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1) or (purge_mode.find("Skip purging rooms [faster]") != -1)
- name:Ensure matrix-synapse is stopped
service:
name:matrix-synapse
state:stopped
daemon_reload:yes
when:(purge_mode.find("Perform final shrink") != -1)
when:(purge_mode.find("Perform final shrink") != -1)
- name:Cleanup room_list files
delegate_to:127.0.0.1
shell:|
rm /tmp/{{ subscription_id }}_room_list*
when:purge_rooms|bool
when:(purge_mode.find("No local users [recommended]") != -1) or (purge_mode.find("Number of users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1)
ignore_errors:yes
- name:Collect size of Synapse database
- name:Collect after shrink size of Synapse database
shell:du -sh /matrix/postgres/data
register:db_size_after_stat
when:(purge_mode.find("Perform final shrink") != -1)
no_log:True
- name:Print total number of rooms processed
debug:
msg:'{{ rooms_total.stdout }}'
when:purge_rooms|bool
when:(purge_mode.find("No local users [recommended]") != -1) or (purge_mode.find("Number of users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1)
- name:Print the number of rooms purged with no local users
debug:
msg:'{{ rooms_no_local_total.stdout }}'
when:purge_rooms|bool
when:(purge_mode.find("No local users [recommended]") != -1) or (purge_mode.find("Number of users [slower]") != -1) or (purge_mode.find("Number of events [slower]") != -1)
- name:Print the number of rooms purged with more then N users
debug:
msg:'{{ rooms_join_members_total.stdout }}'
when:(purge_metric.find("Number of users") != -1) and (purge_rooms|bool)
when:purge_mode.find("Number of users") != -1
- name:Print the number of rooms purged with more then N events
debug:
msg:'{{ rooms_state_events_total.stdout }}'
when:(purge_metric.find("Number of events") != -1) and (purge_rooms|bool)
when:purge_mode.find("Number of events") != -1
- name:Print before purge size of Synapse database
# For OCSP purposes, we need to define a resolver at the `server{}` level or `http{}` level (we do the latter).
#
# Otherwise, we get warnings like this:
# > [warn] 22#22: no resolver defined to resolve r3.o.lencr.org while requesting certificate status, responder: r3.o.lencr.org, certificate: "/matrix/ssl/config/live/.../fullchain.pem"
#
# We point it to the internal Docker resolver, which likely delegates to nameservers defined in `/etc/resolv.conf`.
#
# When nginx proxy is disabled, our configuration is likely used by non-containerized nginx, so can't use the internal Docker resolver.
# Pointing `resolver` to some public DNS server might be an option, but for now we impose DNS servers on people.
# It might also be that no such warnings occur when not running in a container.
matrix_nginx_proxy_http_level_resolver:"{{ '127.0.0.11' if matrix_nginx_proxy_enabled else '' }}"
# By default, this playbook automatically retrieves and auto-renews