Neilj/improved delegation doc 2 (#4832)

Improved federation configuration docs.  Specifically detailing  .well-known and SRV based delegation methods. 

Inspiration Valentin Lab <valentin.lab@kalysto.org> for https://github.com/matrix-org/synapse/pull/4781
This commit is contained in:
Neil Johnson 2019-03-12 14:23:28 +00:00 committed by GitHub
parent b70ea3fa78
commit 8b692bf7c2
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
4 changed files with 202 additions and 148 deletions

View File

@ -71,7 +71,8 @@ set this to the hostname of your server. For a more production-ready setup, you
will probably want to specify your domain (`example.com`) rather than a will probably want to specify your domain (`example.com`) rather than a
matrix-specific hostname here (in the same way that your email address is matrix-specific hostname here (in the same way that your email address is
probably `user@example.com` rather than `user@email.example.com`) - but probably `user@example.com` rather than `user@email.example.com`) - but
doing so may require more advanced setup. - see [Setting up Federation](README.rst#setting-up-federation). Beware that the server name cannot be changed later. doing so may require more advanced setup: see [Setting up Federation](docs/federate.md).
Beware that the server name cannot be changed later.
This command will generate you a config file that you can then customise, but it will This command will generate you a config file that you can then customise, but it will
also generate a set of keys for you. These keys will allow your Home Server to also generate a set of keys for you. These keys will allow your Home Server to
@ -378,6 +379,9 @@ To configure Synapse to expose an HTTPS port, you will need to edit
for having Synapse automatically provision and renew federation for having Synapse automatically provision and renew federation
certificates through ACME can be found at [ACME.md](docs/ACME.md). certificates through ACME can be found at [ACME.md](docs/ACME.md).
For those of you upgrading your TLS certificate in readiness for Synapse 1.0,
please take a look at `our guide <docs/MSC1711_certificates_FAQ.md#configuring-certificates-for-compatibility-with-synapse-100>`_.
## Registering a user ## Registering a user
You will need at least one user on your server in order to use a Matrix You will need at least one user on your server in order to use a Matrix

View File

@ -80,7 +80,10 @@ Thanks for using Matrix!
Synapse Installation Synapse Installation
==================== ====================
For details on how to install synapse, see `<INSTALL.md>`_. .. _federation:
* For details on how to install synapse, see `<INSTALL.md>`_.
* For specific details on how to configure Synapse for federation see `docs/federate.md <docs/federate.md>`_
Connecting to Synapse from a client Connecting to Synapse from a client
@ -151,56 +154,6 @@ server on the same domain.
See https://github.com/vector-im/riot-web/issues/1977 and See https://github.com/vector-im/riot-web/issues/1977 and
https://developer.github.com/changes/2014-04-25-user-content-security for more details. https://developer.github.com/changes/2014-04-25-user-content-security for more details.
Troubleshooting
===============
Running out of File Handles
---------------------------
If synapse runs out of filehandles, it typically fails badly - live-locking
at 100% CPU, and/or failing to accept new TCP connections (blocking the
connecting client). Matrix currently can legitimately use a lot of file handles,
thanks to busy rooms like #matrix:matrix.org containing hundreds of participating
servers. The first time a server talks in a room it will try to connect
simultaneously to all participating servers, which could exhaust the available
file descriptors between DNS queries & HTTPS sockets, especially if DNS is slow
to respond. (We need to improve the routing algorithm used to be better than
full mesh, but as of June 2017 this hasn't happened yet).
If you hit this failure mode, we recommend increasing the maximum number of
open file handles to be at least 4096 (assuming a default of 1024 or 256).
This is typically done by editing ``/etc/security/limits.conf``
Separately, Synapse may leak file handles if inbound HTTP requests get stuck
during processing - e.g. blocked behind a lock or talking to a remote server etc.
This is best diagnosed by matching up the 'Received request' and 'Processed request'
log lines and looking for any 'Processed request' lines which take more than
a few seconds to execute. Please let us know at #synapse:matrix.org if
you see this failure mode so we can help debug it, however.
Help!! Synapse eats all my RAM!
-------------------------------
Synapse's architecture is quite RAM hungry currently - we deliberately
cache a lot of recent room data and metadata in RAM in order to speed up
common requests. We'll improve this in future, but for now the easiest
way to either reduce the RAM usage (at the risk of slowing things down)
is to set the almost-undocumented ``SYNAPSE_CACHE_FACTOR`` environment
variable. The default is 0.5, which can be decreased to reduce RAM usage
in memory constrained enviroments, or increased if performance starts to
degrade.
Using `libjemalloc <http://jemalloc.net/>`_ can also yield a significant
improvement in overall amount, and especially in terms of giving back RAM
to the OS. To use it, the library must simply be put in the LD_PRELOAD
environment variable when launching Synapse. On Debian, this can be done
by installing the ``libjemalloc1`` package and adding this line to
``/etc/default/matrix-synapse``::
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1
This can make a significant difference on Python 2.7 - it's unclear how
much of an improvement it provides on Python 3.x.
Upgrading an existing Synapse Upgrading an existing Synapse
============================= =============================
@ -211,100 +164,19 @@ versions of synapse.
.. _UPGRADE.rst: UPGRADE.rst .. _UPGRADE.rst: UPGRADE.rst
.. _federation:
Setting up Federation
=====================
Federation is the process by which users on different servers can participate
in the same room. For this to work, those other servers must be able to contact
yours to send messages.
The ``server_name`` in your ``homeserver.yaml`` file determines the way that
other servers will reach yours. By default, they will treat it as a hostname
and try to connect to port 8448. This is easy to set up and will work with the
default configuration, provided you set the ``server_name`` to match your
machine's public DNS hostname, and give Synapse a TLS certificate which is
valid for your ``server_name``.
For a more flexible configuration, you can set up a DNS SRV record. This allows
you to run your server on a machine that might not have the same name as your
domain name. For example, you might want to run your server at
``synapse.example.com``, but have your Matrix user-ids look like
``@user:example.com``. (A SRV record also allows you to change the port from
the default 8448).
To use a SRV record, first create your SRV record and publish it in DNS. This
should have the format ``_matrix._tcp.<yourdomain.com> <ttl> IN SRV 10 0 <port>
<synapse.server.name>``. The DNS record should then look something like::
$ dig -t srv _matrix._tcp.example.com
_matrix._tcp.example.com. 3600 IN SRV 10 0 8448 synapse.example.com.
Note that the server hostname cannot be an alias (CNAME record): it has to point
directly to the server hosting the synapse instance.
You can then configure your homeserver to use ``<yourdomain.com>`` as the domain in
its user-ids, by setting ``server_name``::
python -m synapse.app.homeserver \
--server-name <yourdomain.com> \
--config-path homeserver.yaml \
--generate-config
python -m synapse.app.homeserver --config-path homeserver.yaml
If you've already generated the config file, you need to edit the ``server_name``
in your ``homeserver.yaml`` file. If you've already started Synapse and a
database has been created, you will have to recreate the database.
If all goes well, you should be able to `connect to your server with a client`__,
and then join a room via federation. (Try ``#matrix-dev:matrix.org`` as a first
step. "Matrix HQ"'s sheer size and activity level tends to make even the
largest boxes pause for thought.)
.. __: `Connecting to Synapse from a client`_
Troubleshooting
---------------
You can use the `federation tester <https://matrix.org/federationtester>`_ to
check if your homeserver is all set.
The typical failure mode with federation is that when you try to join a room,
it is rejected with "401: Unauthorized". Generally this means that other
servers in the room couldn't access yours. (Joining a room over federation is a
complicated dance which requires connections in both directions).
So, things to check are:
* If you are not using a SRV record, check that your ``server_name`` (the part
of your user-id after the ``:``) matches your hostname, and that port 8448 on
that hostname is reachable from outside your network.
* If you *are* using a SRV record, check that it matches your ``server_name``
(it should be ``_matrix._tcp.<server_name>``), and that the port and hostname
it specifies are reachable from outside your network.
Another common problem is that people on other servers can't join rooms that
you invite them to. This can be caused by an incorrectly-configured reverse
proxy: see `<docs/reverse_proxy.rst>`_ for instructions on how to correctly
configure a reverse proxy.
Running a Demo Federation of Synapses
-------------------------------------
If you want to get up and running quickly with a trio of homeservers in a
private federation, there is a script in the ``demo`` directory. This is mainly
useful just for development purposes. See `<demo/README>`_.
Using PostgreSQL Using PostgreSQL
================ ================
As of Synapse 0.9, `PostgreSQL <https://www.postgresql.org>`_ is supported as an Synapse offers two database engines:
alternative to the `SQLite <https://sqlite.org/>`_ database that Synapse has * `SQLite <https://sqlite.org/>`_
traditionally used for convenience and simplicity. * `PostgreSQL <https://www.postgresql.org>`_
The advantages of Postgres include: By default Synapse uses SQLite in and doing so trades performance for convenience.
SQLite is only recommended in Synapse for testing purposes or for servers with
light workloads.
Almost all installations should opt to use PostreSQL. Advantages include:
* significant performance improvements due to the superior threading and * significant performance improvements due to the superior threading and
caching model, smarter query optimiser caching model, smarter query optimiser
@ -440,3 +312,54 @@ sphinxcontrib-napoleon::
Building internal API documentation:: Building internal API documentation::
python setup.py build_sphinx python setup.py build_sphinx
Troubleshooting
===============
Running out of File Handles
---------------------------
If synapse runs out of file handles, it typically fails badly - live-locking
at 100% CPU, and/or failing to accept new TCP connections (blocking the
connecting client). Matrix currently can legitimately use a lot of file handles,
thanks to busy rooms like #matrix:matrix.org containing hundreds of participating
servers. The first time a server talks in a room it will try to connect
simultaneously to all participating servers, which could exhaust the available
file descriptors between DNS queries & HTTPS sockets, especially if DNS is slow
to respond. (We need to improve the routing algorithm used to be better than
full mesh, but as of March 2019 this hasn't happened yet).
If you hit this failure mode, we recommend increasing the maximum number of
open file handles to be at least 4096 (assuming a default of 1024 or 256).
This is typically done by editing ``/etc/security/limits.conf``
Separately, Synapse may leak file handles if inbound HTTP requests get stuck
during processing - e.g. blocked behind a lock or talking to a remote server etc.
This is best diagnosed by matching up the 'Received request' and 'Processed request'
log lines and looking for any 'Processed request' lines which take more than
a few seconds to execute. Please let us know at #synapse:matrix.org if
you see this failure mode so we can help debug it, however.
Help!! Synapse eats all my RAM!
-------------------------------
Synapse's architecture is quite RAM hungry currently - we deliberately
cache a lot of recent room data and metadata in RAM in order to speed up
common requests. We'll improve this in the future, but for now the easiest
way to either reduce the RAM usage (at the risk of slowing things down)
is to set the almost-undocumented ``SYNAPSE_CACHE_FACTOR`` environment
variable. The default is 0.5, which can be decreased to reduce RAM usage
in memory constrained enviroments, or increased if performance starts to
degrade.
Using `libjemalloc <http://jemalloc.net/>`_ can also yield a significant
improvement in overall amount, and especially in terms of giving back RAM
to the OS. To use it, the library must simply be put in the LD_PRELOAD
environment variable when launching Synapse. On Debian, this can be done
by installing the ``libjemalloc1`` package and adding this line to
``/etc/default/matrix-synapse``::
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1
This can make a significant difference on Python 2.7 - it's unclear how
much of an improvement it provides on Python 3.x.

1
changelog.d/4832.misc Normal file
View File

@ -0,0 +1 @@
Improve federation documentation, specifically .well-known support. Many thanks to @vaab.

126
docs/federate.md Normal file
View File

@ -0,0 +1,126 @@
Setting up Federation
=====================
Federation is the process by which users on different servers can participate
in the same room. For this to work, those other servers must be able to contact
yours to send messages.
The ``server_name`` configured in the Synapse configuration file (often
``homeserver.yaml``) defines how resources (users, rooms, etc.) will be
identified (eg: ``@user:example.com``, ``#room:example.com``). By
default, it is also the domain that other servers will use to
try to reach your server (via port 8448). This is easy to set
up and will work provided you set the ``server_name`` to match your
machine's public DNS hostname, and provide Synapse with a TLS certificate
which is valid for your ``server_name``.
Once you have completed the steps necessary to federate, you should be able to
join a room via federation. (A good place to start is ``#synapse:matrix.org``
- a room for Synapse admins.)
## Delegation
For a more flexible configuration, you can have ``server_name``
resources (eg: ``@user:example.com``) served by a different host and
port (eg: ``synapse.example.com:443``). There are two ways to do this:
- adding a ``/.well-known/matrix/server`` URL served on ``https://example.com``.
- adding a DNS ``SRV`` record in the DNS zone of domain
``example.com``.
Without configuring delegation, the matrix federation will
expect to find your server via ``example.com:8448``. The following methods
allow you retain a `server_name` of `example.com` so that your user IDs, room
aliases, etc continue to look like `*:example.com`, whilst having your
federation traffic routed to a different server.
### .well-known delegation
To use this method, you need to be able to alter the
``server_name`` 's https server to serve the ``/.well-known/matrix/server``
URL. Having an active server (with a valid TLS certificate) serving your
``server_name`` domain is out of the scope of this documentation.
The URL ``https://<server_name>/.well-known/matrix/server`` should
return a JSON structure containing the key ``m.server`` like so:
{
"m.server": "<synapse.server.name>[:<yourport>]"
}
In our example, this would mean that URL ``https://example.com/.well-known/matrix/server``
should return:
{
"m.server": "synapse.example.com:443"
}
Note, specifying a port is optional. If a port is not specified an SRV lookup
is performed, as described below. If the target of the
delegation does not have an SRV record, then the port defaults to 8448.
Most installations will not need to configure .well-known. However, it can be
useful in cases where the admin is hosting on behalf of someone else and
therefore cannot gain access to the necessary certificate. With .well-known,
federation servers will check for a valid TLS certificate for the delegated
hostname (in our example: ``synapse.example.com``).
.well-known support first appeared in Synapse v0.99.0. To federate with older
servers you may need to additionally configure SRV delegation. Alternatively,
encourage the server admin in question to upgrade :).
### DNS SRV delegation
To use this delegation method, you need to have write access to your
``server_name`` 's domain zone DNS records (in our example it would be
``example.com`` DNS zone).
This method requires the target server to provide a
valid TLS certificate for the original ``server_name``.
domain zone.
You need to add a SRV record in your ``server_name`` 's DNS zone with
this format:
_matrix._tcp.<yourdomain.com> <ttl> IN SRV <priority> <weight> <port> <synapse.server.name>
In our example, we would need to add this SRV record in the
``example.com`` DNS zone:
_matrix._tcp.example.com. 3600 IN SRV 10 5 443 synapse.example.com.
Once done and set up, you can check the DNS record with ``dig -t srv
_matrix._tcp.<server_name>``. In our example, we would expect this:
$ dig -t srv _matrix._tcp.example.com
_matrix._tcp.example.com. 3600 IN SRV 10 0 443 synapse.example.com.
Note that the target of a SRV record cannot be an alias (CNAME record): it has to point
directly to the server hosting the synapse instance.
## Troubleshooting
You can use the [federation tester](
<https://matrix.org/federationtester>) to check if your homeserver is
configured correctly. Alternatively try the [JSON API used by the federation tester](https://matrix.org/federationtester/api/report?server_name=DOMAIN).
Note that you'll have to modify this URL to replace ``DOMAIN`` with your
``server_name``. Hitting the API directly provides extra detail.
The typical failure mode for federation is that when the server tries to join
a room, it is rejected with "401: Unauthorized". Generally this means that other
servers in the room could not access yours. (Joining a room over federation is
a complicated dance which requires connections in both directions).
Another common problem is that people on other servers can't join rooms that
you invite them to. This can be caused by an incorrectly-configured reverse
proxy: see [reverse_proxy.rst](<reverse_proxy.rst>) for instructions on how to correctly
configure a reverse proxy.
## Running a Demo Federation of Synapses
If you want to get up and running quickly with a trio of homeservers in a
private federation, there is a script in the ``demo`` directory. This is mainly
useful just for development purposes. See [demo/README](<../demo/README>).