Commit Graph

262 Commits

Author SHA1 Message Date
Slavi Pantaleev
0be7b25c64 Make (most) containers run with a read-only filesystem 2019-01-29 18:52:02 +02:00
Slavi Pantaleev
b77b967171 Merge branch 'master' into non-root-containers 2019-01-29 18:00:11 +02:00
Slavi Pantaleev
cbc1cdbbf0 Do not try to load certificates
Seems like we unintentionally removed the mounting of certificates
(the `/matrix-config` mount) as part of splitting the playbook into
roles in 51312b8250.

It appears that those certificates weren't necessary for coturn to
funciton though, so we might just get rid of the configuration as well.
2019-01-29 17:56:40 +02:00
Slavi Pantaleev
bf10331456 Make mautrix-whatsapp run as non-root and w/o capabilities 2019-01-28 15:55:58 +02:00
Slavi Pantaleev
8a3f942d93 Make mautrix-telegram run as non-root and w/o capabilities 2019-01-28 15:40:16 +02:00
Slavi Pantaleev
3e8a4159e6 Uncomment unintentionally-commented logic 2019-01-28 14:25:03 +02:00
Slavi Pantaleev
9830a0871d Fix self-check for mxisd not being enabled 2019-01-28 11:47:31 +02:00
Slavi Pantaleev
9438402f61 Drop capabilities in a few more places
Continuation of 316d653d3e
2019-01-28 11:43:32 +02:00
Slavi Pantaleev
316d653d3e Drop capabilities in containers
We run containers as a non-root user (no effective capabilities).

Still, if a setuid binary is available in a container image, it could
potentially be used to give the user the default capabilities that the
container was started with. For Docker, the default set currently is:
- "CAP_CHOWN"
- "CAP_DAC_OVERRIDE"
- "CAP_FSETID"
- "CAP_FOWNER"
- "CAP_MKNOD"
- "CAP_NET_RAW"
- "CAP_SETGID"
- "CAP_SETUID"
- "CAP_SETFCAP"
- "CAP_SETPCAP"
- "CAP_NET_BIND_SERVICE"
- "CAP_SYS_CHROOT"
- "CAP_KILL"
- "CAP_AUDIT_WRITE"

We'd rather prevent such a potential escalation by dropping ALL
capabilities.

The problem is nicely explained here: https://github.com/projectatomic/atomic-site/issues/203
2019-01-28 11:22:54 +02:00
Slavi Pantaleev
0ff6735546 Fall back to dig for SRV lookup, if no dnspython
This is a known/intentional regression since f92c4d5a27.

The new stance on this is that most people would not have
dnspython, but may have the `dig` tool. There's no good
reason for not increasing our chances of success by trying both
methods (Ansible dig lookup and using the `dig` CLI tool).

Fixes #85 (Github issue).
2019-01-28 09:42:10 +02:00
Slavi Pantaleev
299a8c4c7c Make (most) containers start as non-root
This makes all containers (except mautrix-telegram and
mautrix-whatsapp), start as a non-root user.

We do this, because we don't trust some of the images.
In any case, we'd rather not trust ALL images and avoid giving
`root` access at all. We can't be sure they would drop privileges
or what they might do before they do it.

Because Postfix doesn't support running as non-root,
it had to be replaced by an Exim mail server.

The matrix-nginx-proxy nginx container image is patched up
(by replacing its main configuration) so that it can work as non-root.
It seems like there's no other good image that we can use and that is up-to-date
(https://hub.docker.com/r/nginxinc/nginx-unprivileged is outdated).

Likewise for riot-web (https://hub.docker.com/r/bubuntux/riot-web/),
we patch it up ourselves when starting (replacing the main nginx
configuration).
Ideally, it would be fixed upstream so we can simplify.
2019-01-27 20:25:13 +02:00
Slavi Pantaleev
56d501679d Be explicit about the UID/GID we start Synapse with
We do match the defaults anyway (by default that is),
but people can customize `matrix_user_uid` and `matrix_user_uid`
and it wouldn't be correct then.

In any case, it's better to be explicit about such an important thing.
2019-01-26 20:21:18 +02:00
Slavi Pantaleev
1a80058a2a Indent (non-YAML) using tabs
Fixes #83 (Github issue)
2019-01-26 09:37:29 +02:00
Slavi Pantaleev
a88b24ed2c Update matrix-corporal (1.2.2 -> 1.3.0) 2019-01-25 16:58:20 +02:00
Slavi Pantaleev
fcceb3143d Update riot-web (0.17.8 -> 0.17.9) 2019-01-23 08:13:27 +02:00
Slavi Pantaleev
a4e7ad5566 Use async Ansible task for importing Postgres
A long-running import task may hit the SSH timeout value
and die. Using async is supposed to improve reliability
in such scenarios.
2019-01-21 08:34:49 +02:00
Slavi Pantaleev
0392822aa7 Show Postgres import command and mention manual importing 2019-01-21 08:33:10 +02:00
Slavi Pantaleev
8d186e5194 Fix Postgres import when Postgres had never started
If this is a brand new server and Postgres had never started,
detecting it before we even start it is not possible.

This moves the logic, so that it happens later on, when Postgres
would have had the chance to start and possibly initialize
a new empty database.

Fixes #82 (Github issue)
2019-01-21 07:32:19 +02:00
Slavi Pantaleev
fef6c052c3 Pass Host/X-Forwarded-For everywhere
It hasn't mattered much to have these so far, but
it's probably a good idea to have them.
2019-01-17 16:25:08 +02:00
Slavi Pantaleev
ba75ab496d Send Host/X-Forwarded-For to mxisd
It worked without it too, but doing this is more consistent with the
mxisd recommendations.
2019-01-17 16:22:49 +02:00
Slavi Pantaleev
cb11548eec Use mxisd for user directory searches
Implements #77 (Github issue).
2019-01-17 15:55:23 +02:00
Slavi Pantaleev
df0d465482 Fix typos in some variables (matrix_mxid -> matrix_mxisd) 2019-01-17 14:47:37 +02:00
Slavi Pantaleev
f4f06ae068 Make matrix-nginx-proxy role independent of others
The matrix-nginx-proxy role can now be used independently.
This makes it consistent with all other roles, with
the `matrix-base` role remaining as their only dependency.

Separating matrix-nginx-proxy was relatively straightforward, with
the exception of the Mautrix Telegram reverse-proxying configuration.
Mautrix Telegram, being an extension/bridge, does not feel important enough
to justify its own special handling in matrix-nginx-proxy.

Thus, we've introduced the concept of "additional configuration blocks"
(`matrix_nginx_proxy_proxy_matrix_additional_server_configuration_blocks`),
where any module can register its own custom nginx server blocks.

For such dynamic registration to work, the order of role execution
becomes important. To make it possible for each module participating
in dynamic registration to verify that the order of execution is
correct, we've also introduced a `matrix_nginx_proxy_role_executed`
variable.

It should be noted that this doesn't make the matrix-synapse role
dependent on matrix-nginx-proxy. It's optional runtime detection
and registration, and it only happens in the matrix-synapse role
when `matrix_mautrix_telegram_enabled: true`.
2019-01-17 13:32:46 +02:00
Slavi Pantaleev
c10182e5a6 Make roles more independent of one another
With this change, the following roles are now only dependent
on the minimal `matrix-base` role:
- `matrix-corporal`
- `matrix-coturn`
- `matrix-mailer`
- `matrix-mxisd`
- `matrix-postgres`
- `matrix-riot-web`
- `matrix-synapse`

The `matrix-nginx-proxy` role still does too much and remains
dependent on the others.

Wiring up the various (now-independent) roles happens
via a glue variables file (`group_vars/matrix-servers`).
It's triggered for all hosts in the `matrix-servers` group.

According to Ansible's rules of priority, we have the following
chain of inclusion/overriding now:
- role defaults (mostly empty or good for independent usage)
- playbook glue variables (`group_vars/matrix-servers`)
- inventory host variables (`inventory/host_vars/matrix.<your-domain>`)

All roles default to enabling their main component
(e.g. `matrix_mxisd_enabled: true`, `matrix_riot_web_enabled: true`).
Reasoning: if a role is included in a playbook (especially separately,
in another playbook), it should "work" by default.

Our playbook disables some of those if they are not generally useful
(e.g. `matrix_corporal_enabled: false`).
2019-01-16 18:05:48 +02:00
Slavi Pantaleev
294a5c9083 Fix YAML serialization of empty matrix_synapse_federation_domain_whitelist
We've previously changed a bunch of lists in `homeserver.yaml.j2`
to be serialized using `|to_nice_yaml`, as that generates a more
readable list in YAML.

`matrix_synapse_federation_domain_whitelist`, however, couldn't have
been changed to that, as it can potentially be an empty list.

We may be able to differentiate between empty and non-empty now
and serialize it accordingly (favoring `|to_nice_yaml` if non-empty),
but it's not important enough to be justified. Thus, always
serializing with `|to_json`.

Fixes #78 (Github issue)
2019-01-16 17:06:58 +02:00
Sylvia van Os
cec2aa61c1 Fix scalar widgets
Riot-web parses integrations_widgets_urls as a list, thus causing it to incorrectly think Scalar widgets are non-Scalar and not passing the scalar token
2019-01-16 14:03:39 +01:00
Stuart Mumford
f8ebd94d08
Make the mode of the base path configurable 2019-01-14 14:40:11 +00:00
Slavi Pantaleev
e8c78c1572 Merge branch 'master' into split-into-multiple-roles 2019-01-14 08:27:53 +02:00
Slavi Pantaleev
857603d9d7 Make nginx-proxy files owned by matrix:matrix, not root:root 2019-01-14 08:26:56 +02:00
Slavi Pantaleev
b80d44afaa Stop Postgres before finding files to move over 2019-01-12 18:16:08 +02:00
Slavi Pantaleev
51312b8250 Split playbook into multiple roles
As suggested in #63 (Github issue), splitting the
playbook's logic into multiple roles will be beneficial for
maintainability.

This patch realizes this split. Still, some components
affect others, so the roles are not really independent of one
another. For example:
- disabling mxisd (`matrix_mxisd_enabled: false`), causes Synapse
and riot-web to reconfigure themselves with other (public)
Identity servers.

- enabling matrix-corporal (`matrix_corporal_enabled: true`) affects
how reverse-proxying (by `matrix-nginx-proxy`) is done, in order to
put matrix-corporal's gateway server in front of Synapse

We may be able to move away from such dependencies in the future,
at the expense of a more complicated manual configuration, but
it's probably not worth sacrificing the convenience we have now.

As part of this work, the way we do "start components" has been
redone now to use a loop, as suggested in #65 (Github issue).
This should make restarting faster and more reliable.
2019-01-12 18:01:10 +02:00
Slavi Pantaleev
6d253ff571 Switch to a better riot-web image (avhost/docker-matrix-riot -> bubuntux/riot-web)
The new container image is about 20x smaller in size, faster to start up, etc.

This also fixes #26 (Github issue).
2019-01-11 21:20:21 +02:00
Slavi Pantaleev
14a237885a Fix missing SMTP configuration for mxisd
Regression since 9a9b7383e9.
2019-01-11 20:26:40 +02:00
Slavi Pantaleev
9a9b7383e9 Completely redo how mxisd configuration gets generated
This change is provoked by a few different things:

- #54 (Github Pull Request), which rightfully says that we need a
way to support ALL mxisd configuration options easily

- the upcoming mxisd 1.3.0 release, which drops support for
property-style configuration (dot-notation), forcing us to
redo the way we generate the configuration file

With this, mxisd is much more easily configurable now
and much more easily maintaneable by us in the future
(no need to introduce additional playbook variables and logic).
2019-01-11 19:33:54 +02:00
Slavi Pantaleev
fca2f2e036 Catch misconfigured REST Auth password provider during installation 2019-01-11 01:03:35 +02:00
Slavi Pantaleev
46c5d11d56 Update components 2019-01-10 19:29:56 +02:00
Slavi Pantaleev
2ae7c5e177
Merge pull request #68 from spantaleev/manage-cronjobs-with-cron-module
Switch to managing cronjobs with the Ansible cron module
2019-01-08 16:21:57 +02:00
Slavi Pantaleev
00ae435044 Use |to_json filter for serializing booleans to JSON
This should account for all cases where we were still doing such a thing.

Improvement suggested in #65 (Github issue).
2019-01-08 13:12:56 +02:00
Slavi Pantaleev
b222d26c86 Switch to managing cronjobs with the Ansible cron module
As suggested in #65 (Github issue), this patch switches
cronjob management from using templates to using Ansible's `cron` module.

It also moves the management of the nginx-reload cronjob to `setup_ssl_lets_encrypt.yml`,
which is a more fitting place for it (given that this cronjob is only required when
Let's Encrypt is used).

Pros:
- using a module is more Ansible-ish than templating our own files in
special directories

- more reliable: will fail early (during playbook execution) if `/usr/bin/crontab`
is not available, which is more of a guarantee that cron is working fine
(idea: we should probably install some cron package using the playbook)

Cons:
- invocation schedule is no longer configurable, unless we define individual
variables for everything or do something smart (splitting on ' ', etc.).
Likely not necessary, however.

- requires us to deprecate and clean-up after the old way of managing cronjobs,
because it's not compatible (using the same file as before means appending
additional jobs to it)
2019-01-08 12:52:03 +02:00
Slavi Pantaleev
ef2dc3745a Check DNS SRV record for _matrix-identity._tcp when mxisd enabled 2019-01-08 10:39:22 +02:00
Slavi Pantaleev
f92c4d5a27 Use Ansible dig lookup instead of calling the dig program
This means we no longer have a dependency on the `dig` program,
but we do have a dependency on `dnspython`.

Improves things as suggested in #65 (Github issue).
2019-01-08 10:19:45 +02:00
Jan Christian Grünhage
29d10804f0 Use yaml syntax instead of key=value syntax consistently
fixes #62
2019-01-07 23:38:39 +01:00
Slavi Pantaleev
5135c0cc0a Add Ansible guide and Ansible version checks
After having multiple people report issues with retrieving
SSL certificates, we've finally discovered the culprit to be
Ansible 2.5.1 (default and latest version on Ubuntu 18.04 LTS).

As silly as it is, certain distributions ("LTS" even) are 13 bugfix
versions of Ansible behind.

From now on, we try to auto-detect buggy Ansible versions and tell the
user. We also provide some tips for how to upgrade Ansible or
run it from inside a Docker container.

My testing shows that Ansible 2.4.0 and 2.4.6 are OK.
All other intermediate 2.4.x versions haven't been tested, but we
trust they're OK too.

From the 2.5.x releases, only 2.5.0 and 2.5.1 seem to be affected.
Ansible 2.5.2 corrects the problem with `include_tasks` + `with_items`.
2019-01-03 16:24:14 +02:00
Slavi Pantaleev
99af4543ac Replace include usage with include_tasks and import_tasks
The long-deprecated (since Ansible 2.4) use of include is
no more.
2019-01-03 15:24:08 +02:00
Slavi Pantaleev
76506f34e0 Make media-store restore work with server files, not local
This is a simplification and a way to make it consistent with
how we do Postgres imports (see 6d89319822), using
files coming from the server, not from the local machine.

By encouraging people NOT to use local files,
we potentially avoid problems such as #34 (Github issue),
where people would download `media_store` to their Mac's filesystem
and case-sensitivity issues will actually corrupt it.

By not encouraging local files usage, it's less likely that
people would copy (huge) directories to their local machine like that.
2019-01-01 15:57:50 +02:00
Slavi Pantaleev
e604a7bd43 Fix error message inaccuracy 2019-01-01 15:25:52 +02:00
Slavi Pantaleev
4c2e1a0588 Make SQLite database import work with server files, not local
This is a simplification and a way to make it consistent with
how we do Postgres imports (see 6d89319822), using
files coming from the server, not from the local machine.
2019-01-01 15:21:52 +02:00
Slavi Pantaleev
f153c70a60 Reorganize some files 2019-01-01 14:47:22 +02:00
Slavi Pantaleev
6d89319822 Add support for importing an existing Postgres database 2019-01-01 14:45:37 +02:00
Slavi Pantaleev
f472c1b9e5 Ensure psql returns a failure exit code when it fails
Until now, if the .sql file contained invalid data, psql would
choke on it, but still return an exit code of 0.
This is very misleading.

We need to pass `-v ON_ERROR_STOP=1` to make it exit
with a proper error exit code when failures happen.
2019-01-01 14:05:11 +02:00