Commit Graph

914 Commits

Author SHA1 Message Date
reivilibre
eebfd024e9
Factorise get_datastore calls in phone_stats_home. ()
Follow-up to .
2021-07-19 19:31:17 +01:00
reivilibre
4e340412c0
Add a new version of the R30 phone-home metric, which removes a false impression of retention given by the old R30 metric ()
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
2021-07-19 16:11:34 +01:00
Jonathan de Jong
95e47b2e78
[pyupgrade] synapse/ ()
This PR is tantamount to running 
```
pyupgrade --py36-plus --keep-percent-format `find synapse/ -type f -name "*.py"`
```

Part of 
2021-07-19 15:28:05 +01:00
Jonathan de Jong
bf72d10dbf
Use inline type hints in various other places (in synapse/) () 2021-07-15 11:02:43 +01:00
Erik Johnston
7a5873277e
Add support for evicting cache entries based on last access time. () 2021-07-05 16:32:12 +01:00
Erik Johnston
85d237eba7
Add a distributed lock ()
This adds a simple best-effort locking mechanism that works across workers.
2021-06-29 19:15:47 +01:00
Richard van der Hoff
107c06081f
Ensure that errors during startup are written to the logs and the console. ()
* Defer stdio redirection until we are about to start the reactor

* Catch and handle exceptions during startup
2021-06-21 11:41:25 +01:00
Brendan Abolivier
1b3e398bea
Standardise the module interface ()
This PR adds a common configuration section for all modules (see docs). These modules are then loaded at startup by the homeserver. Modules register their hooks and web resources using the new `register_[...]_callbacks` and `register_web_resource` methods of the module API.
2021-06-18 12:15:52 +01:00
Brendan Abolivier
08c8469322
Remove support for ACME v1 ()
Fixes 

ACME v1 was fully decommissioned for existing installs on June 1st 2021 (see https://community.letsencrypt.org/t/end-of-life-plan-for-acmev1/88430/27), so we can now safely remove it from Synapse.
2021-06-17 18:56:48 +01:00
Richard van der Hoff
9cf6e0eae7
Rip out the DNS lookup limiter ()
As I've written in various places in the past (, ) I'm pretty sure this is doing nothing useful at all.
2021-06-17 16:22:41 +01:00
Andrew Morgan
a15a046c93
Clean up a broken import in admin_cmd.py () 2021-06-11 11:34:40 +01:00
Erik Johnston
5eed6348ce
Move some more endpoints off master () 2021-05-27 22:45:43 +01:00
Richard van der Hoff
fe5dad46b0
Remove redundant code to reload tls cert ()
we don't need to reload the tls cert if we don't have any tls listeners.

Follow-up to .
2021-05-27 10:34:24 +01:00
Erik Johnston
3e831f24ff
Don't hammer the database for destination retry timings every ~5mins () 2021-05-21 17:57:08 +01:00
Erik Johnston
8771b1337d
Export jemalloc stats to prometheus when used () 2021-05-06 15:54:07 +01:00
Erik Johnston
ef889c98a6
Optionally track memory usage of each LruCache ()
This will double count slightly in the presence of interned strings. It's off by default as it can consume a lot of resources.
2021-05-05 16:54:36 +01:00
Erik Johnston
1fb9a2d0bf
Limit how often GC happens by time. ()
Synapse can be quite memory intensive, and unless care is taken to tune
the GC thresholds it can end up thrashing, causing noticeable performance
problems for large servers. We fix this by limiting how often we GC a
given generation, regardless of current counts/thresholds.

This does not help with the reverse problem where the thresholds are set
too high, but that should only happen in situations where they've been
manually configured.

Adds a `gc_min_seconds_between` config option to override the defaults.

Fixes .
2021-05-05 16:53:45 +01:00
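The time-based GC limiting described in commit 1fb9a2d0bf above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Synapse's actual implementation: the names `pick_generation`, `maybe_collect`, and `MIN_SECONDS_BETWEEN`, as well as the interval values, are hypothetical. Only `gc.get_count`, `gc.get_threshold`, `gc.collect`, and `time.monotonic` are real standard-library calls.

```python
import gc
import time

# Hypothetical minimum number of seconds between collections of
# generations 0, 1 and 2. Illustrative values, not Synapse's defaults.
MIN_SECONDS_BETWEEN = (1.0, 10.0, 30.0)

# Time at which maybe_collect() last collected each generation.
_last_collected = [0.0, 0.0, 0.0]


def pick_generation(counts, thresholds, last_collected, now,
                    min_seconds=MIN_SECONDS_BETWEEN):
    """Return the oldest generation that is over its threshold AND has not
    been collected within its minimum interval, or None if there is none."""
    for gen in (2, 1, 0):
        over_threshold = counts[gen] > thresholds[gen]
        waited_enough = now - last_collected[gen] >= min_seconds[gen]
        if over_threshold and waited_enough:
            return gen
    return None


def maybe_collect():
    """Collect at most one generation, regardless of how far over the
    thresholds the current counts are."""
    now = time.monotonic()
    gen = pick_generation(gc.get_count(), gc.get_threshold(),
                          _last_collected, now)
    if gen is not None:
        gc.collect(gen)
        _last_collected[gen] = now
    return gen
```

In a sketch like this, the caller would presumably turn off automatic collection with `gc.disable()` and invoke `maybe_collect()` from a periodic timer, so that the time limit is the only thing that triggers a collection.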
Richard van der Hoff
3ff2251754
Improved validation for received requests ()
* Simplify `start_listening` callpath

* Correctly check the size of uploaded files
2021-04-23 19:20:44 +01:00
Richard van der Hoff
59d24c5bef
pass a reactor into SynapseSite () 2021-04-23 17:06:47 +01:00
Erik Johnston
9d25a0ae65
Split presence out of master () 2021-04-23 12:21:55 +01:00
Richard van der Hoff
5a153772c1
remove HomeServer.get_config ()
Every single time I want to access the config object, I have to remember
whether or not we use `get_config`. Let's just get rid of it.
2021-04-14 19:09:08 +01:00
Erik Johnston
00a6db9676
Move some replication processing out of generic_worker ()
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
2021-04-14 17:06:06 +01:00
Jonathan de Jong
4b965c862d
Remove redundant "coding: utf-8" lines ()
Part of 

Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now.

`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
2021-04-14 15:34:27 +01:00
Andrew Morgan
04819239ba
Add a Synapse Module for configuring presence update routing ()
At the moment, if you'd like to share presence between local or remote users, those users must be sharing a room together. This isn't always the most convenient or useful situation though.

This PR adds a module to Synapse that will allow deployments to set up extra logic on where presence updates should be routed. The module must implement two methods, `get_users_for_states` and `get_interested_users`. These methods are given presence updates or user IDs and must return information that Synapse will use to grant passing presence updates around.

A method is additionally added to `ModuleApi` which allows triggering a set of users to receive the current, online presence information for all users they are considered interested in. This is the equivalent of that user receiving presence information during an initial sync. 

The goal of this module is to be fairly generic and useful for a variety of applications, with hard requirements being:

* Sending state for a specific set or all known users to a defined set of local and remote users.
* The ability to trigger an initial sync for specific users, so they receive all current state.
2021-04-06 14:38:30 +01:00
Patrick Cloke
da75d2ea1f
Add type hints for the federation sender. ()
Includes an abstract base class which both the FederationSender
and the FederationRemoteSendQueue must implement.
2021-03-29 11:43:20 -04:00
Richard van der Hoff
7c8402ddb8
Suppress CryptographyDeprecationWarning ()
This warning is somewhat confusing to users, so let's suppress it
2021-03-26 17:33:55 +00:00
Brendan Abolivier
0b56481caa
Fix lint 2021-03-19 16:11:08 +01:00
Brendan Abolivier
066c703729
Move support for MSC3026 behind an experimental flag 2021-03-18 18:37:19 +01:00
Brendan Abolivier
405aeb0b2c
Implement MSC3026: busy presence state 2021-03-18 16:34:47 +01:00
Jonathan de Jong
27d2820c33
Enable flake8-bugbear, but disable most checks. ()
* Adds B00 to ignored checks.
* Fixes remaining issues.
2021-03-16 14:19:27 -04:00
Richard van der Hoff
4db07f9aef
Set X-Forwarded-Proto header when frontend-proxy proxies a request ()
Should fix some remaining warnings
2021-03-03 18:49:08 +00:00
Jonathan de Jong
e12077a78a
Allow bytecode again ()
In , bytecode was disabled (from a bit of FUD back in `python<2.4` days, according to dev chat), I think it's safe enough to enable it again.

Added in `__pycache__/` and `.pyc`/`.pyd` to `.gitignore`, to extra-insure compiled files don't get committed.

`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
2021-02-26 18:30:54 +00:00
Erik Johnston
2927921942
Clean up ShardedWorkerHandlingConfig ()
* Split ShardedWorkerHandlingConfig

This is so that we have a type level understanding of when it is safe to
call `get_instance(..)` (as opposed to `should_handle(..)`).

* Remove special cases in ShardedWorkerHandlingConfig.

`ShardedWorkerHandlingConfig` tried to handle the various different ways
it was possible to configure federation senders and pushers. This led to
special cases that weren't hit during testing.

To fix this the handling of the different cases is moved from there and
`generic_worker` into the worker config class. This allows us to have
the logic in one place and allows the rest of the code to ignore the
different cases.
2021-02-24 13:23:18 +00:00
Erik Johnston
66f4949e7f
Fix deleting pushers when using sharded pushers. () 2021-02-22 21:14:42 +00:00
Eric Eastwood
0a00b7ff14
Update black, and run auto formatting over the codebase ()
- Update black version to the latest
 - Run black auto formatting over the codebase
    - Run autoformatting according to [`docs/code_style.md`](80d6dc9783/docs/code_style.md)
 - Update `code_style.md` docs around installing black to use the correct version
2021-02-16 22:32:34 +00:00
Richard van der Hoff
18ab35284a Merge branch 'social_login' into develop 2021-02-01 17:28:37 +00:00
Jan Christian Grünhage
43dd93bb26
Add phone home stats for encrypted messages. ()
Signed-off-by: Jan Christian Grünhage <jan.christian@gruenhage.xyz>
2021-02-01 17:06:22 +00:00
Richard van der Hoff
9c715a5f19
Fix SSO on workers ()
Fixes .

* Factor out build_synapse_client_resource_tree

Start a function which will mount resources common to all workers.

* Move sso init into build_synapse_client_resource_tree

... so that we don't have to do it for each worker

* Fix SSO-login-via-a-worker

Expose the SSO login endpoints on workers, like the documentation says.

* Update workers config for new endpoints

Add documentation for endpoints recently added (, , )

* remove submit_token from workers endpoints list

this *doesn't* work on workers (yet).

* changelog

* Add a comment about the odd path for SAML2Resource
2021-02-01 15:47:59 +00:00
Richard van der Hoff
f78d07bf00
Split out a separate endpoint to complete SSO registration ()
There are going to be a couple of paths to get to the final step of SSO reg, and I want the URL in the browser to be consistent. So, let's move the final step onto a separate path, which we redirect to.
2021-02-01 13:15:51 +00:00
Ivan Shapovalov
13c7ab8181
Fixes for PyPy compatibility ()
* synapse.app.base: only call gc.freeze() on CPython

gc.freeze() is an implementation detail of CPython garbage collector,
and notably does not exist on PyPy.

Rather than playing whack-a-mole and skipping the call when under PyPy,
simply restrict it to CPython because the whole gc module is
implementation-defined.

Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>
2021-01-30 17:22:05 +00:00
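The approach in commit 13c7ab8181 above (restricting `gc.freeze()` to CPython rather than special-casing PyPy) might look something like the sketch below. The helper name `freeze_if_cpython` is hypothetical; `platform.python_implementation()` and `gc.freeze()` are standard-library calls, the latter available on CPython 3.7 and later.

```python
import gc
import platform


def freeze_if_cpython():
    """Call gc.freeze() only when running on CPython.

    Returns True if the freeze was performed, False otherwise.
    """
    # Allow-list CPython instead of deny-listing PyPy: other
    # implementations may lack gc.freeze() as well.
    if platform.python_implementation() == "CPython":
        gc.freeze()
        return True
    return False
```

Checking for the one implementation known to support the call avoids having to enumerate every interpreter that does not.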
Erik Johnston
6633a4015a
Allow moving account data and receipts streams off master () 2021-01-18 15:47:59 +00:00
Richard van der Hoff
21a296cd5a
Split OidcProvider out of OidcHandler ()
The idea here is that we will have an instance of OidcProvider for each
configured IdP, with OidcHandler just doing the marshalling of them.

For now it's still hardcoded with a single provider.
2021-01-14 13:29:17 +00:00
Patrick Cloke
d1eb1b96e8
Register the /devices endpoint on workers. () 2021-01-13 12:35:40 -05:00
Erik Johnston
c9195744a4
Move more encryption endpoints off master () 2021-01-11 18:01:27 +00:00
Richard van der Hoff
671138f658
Clean up exception handling in the startup code ()
Factor out the exception handling in the startup code to a utility function,
and fix some of the logging and exit code stuff.
2021-01-11 15:55:05 +00:00
Richard van der Hoff
7db2622d30
Remove unused SynapseService () 2021-01-11 10:24:22 +00:00
Erik Johnston
b530eaa262
Allow running sendToDevice on workers () 2021-01-07 20:19:26 +00:00
Richard van der Hoff
111b673fc1
Add initial support for a "pick your IdP" page ()
During login, if there are multiple IdPs enabled, offer the user a choice of
IdPs.
2021-01-05 11:25:28 +00:00
Patrick Cloke
68bb26da69
Allow redacting events on workers ()
Adds the redacts endpoint to workers that have the client listener.
2020-12-29 07:40:12 -05:00
Richard van der Hoff
28877fade9
Implement a username picker for synapse ()
The final part (for now) of my work to implement a username picker in synapse itself. The idea is that we allow
`UsernameMappingProvider`s to return `localpart=None`, in which case, rather than redirecting the browser
back to the client, we redirect to a username-picker resource, which allows the user to enter a username.
We *then* complete the SSO flow (including doing the client permission checks).

The static resources for the username picker itself (in 
https://github.com/matrix-org/synapse/tree/rav/username_picker/synapse/res/username_picker)
are essentially lifted wholesale from
https://github.com/matrix-org/matrix-synapse-saml-mozilla/tree/master/matrix_synapse_saml_mozilla/res. 
As the comment says, we might want to think about making them customisable, but that can be a follow-up. 

Fixes .
2020-12-18 14:19:46 +00:00
Erik Johnston
80a992d7b9
Fix deadlock on SIGHUP ()
Fixes 
2020-12-10 16:56:05 +00:00
Richard van der Hoff
ab7a24cc6b
Better formatting for config errors from modules ()
The idea is that the parse_config method of extension modules can raise either a ConfigError or a JsonValidationError,
and it will be magically turned into a legible error message. There's a few components to it:

* Separating the "path" and the "message" parts of a ConfigError, so that we can fiddle with the path bit to turn it
   into an absolute path.
* Generally improving the way ConfigErrors get printed.
* Passing in the config path to load_module so that it can wrap any exceptions that get caught appropriately.
2020-12-08 14:04:35 +00:00
Patrick Cloke
30fba62108
Apply an IP range blacklist to push and key revocation requests. ()
Replaces the `federation_ip_range_blacklist` configuration setting with an
`ip_range_blacklist` setting with wider scope. It now applies to:

* Federation
* Identity servers
* Push notifications
* Checking key validity for third-party invite events

The old `federation_ip_range_blacklist` setting is still honored if present, but
with reduced scope (it only applies to federation and identity servers).
2020-12-02 11:09:24 -05:00
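To illustrate what an IP range blacklist check of the kind described in commit 30fba62108 involves, here is a minimal sketch using only the standard-library `ipaddress` module. It is not Synapse's implementation: the function `is_blacklisted` and the contents of `EXAMPLE_BLACKLIST` are assumptions made for the example.

```python
import ipaddress

# Hypothetical blacklist of CIDR ranges; a real deployment would read
# these from the `ip_range_blacklist` setting.
EXAMPLE_BLACKLIST = [
    ipaddress.ip_network(cidr)
    for cidr in ("127.0.0.0/8", "10.0.0.0/8", "192.168.0.0/16", "::1/128")
]


def is_blacklisted(ip, blacklist):
    """Return True if the address `ip` falls inside any blacklisted network."""
    addr = ipaddress.ip_address(ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixed-version lists are safe to scan.
    return any(addr in network for network in blacklist)
```

An outbound request (federation, identity server, push, or key validity check) would be refused when the resolved address of the target matches the list.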
Erik Johnston
382b4e83f1
Defer SIGHUP handlers to reactor. ()
We can get a SIGHUP at any point, including times where we are not in a
sane state. By deferring calling the handlers until the next reactor
tick we ensure that we don't get unexpected conflicts, e.g. trying to
flush logs from the signal handler while the code was in the process of
writing a log entry.

Fixes .
2020-11-26 11:18:10 +00:00
Richard van der Hoff
fb56dfdccd
Fix SIGHUP handler ()
Fixes:

```
builtins.TypeError: _reload_logging_config() takes 1 positional argument but 2 were given
```
2020-11-06 11:42:07 +00:00
Erik Johnston
921a3f8a59
Fix not sending events over federation when using sharded event persisters ()
* Fix outbound federation with multiple event persisters.

We incorrectly notified federation senders that the minimum persisted
stream position had advanced when we got an `RDATA` from an event
persister.

Notifying of federation senders already correctly happens in the
notifier, so we just delete the offending line.

* Change some interfaces to use RoomStreamToken.

By enforcing use of `RoomStreamTokens` we make it less likely that
people pass in random ints that they got from somewhere random.
2020-10-14 13:27:51 +01:00
Patrick Cloke
fe0f4a3591
Move additional tasks to the background worker, part 3 () 2020-10-09 07:37:51 -04:00
Patrick Cloke
c9c0ad5e20
Remove the deprecated Handlers object ()
All handlers now available via get_*_handler() methods on the HomeServer.
2020-10-09 07:24:34 -04:00
Patrick Cloke
e4f72ddc44
Move additional tasks to the background worker () 2020-10-07 11:27:56 -04:00
Patrick Cloke
8dbf62fada
Include the configured log level in phone home stats. ()
By reporting the log level of the synapse logger as a string.
2020-10-07 11:13:38 -04:00
Richard van der Hoff
4f0637346a
Combine SpamCheckerApi with the more generic ModuleApi. ()
Lots of different module apis is not easy to maintain.

Rather than adding yet another ModuleApi(hs, hs.get_auth_handler()) incantation, first add an hs.get_module_api() method and use it where possible.
2020-10-07 12:03:26 +01:00
Erik Johnston
e3debf9682
Add logging on startup/shutdown ()
This is so we can tell what is going on when things are taking a while to start up.

The main change here is to ensure that transactions that are created during startup get correctly logged like normal transactions.
2020-10-02 15:20:45 +01:00
Patrick Cloke
62894673e6
Allow background tasks to be run on a separate worker. () 2020-10-02 08:23:15 -04:00
Patrick Cloke
8a4a4186de
Simplify super() calls to Python 3 syntax. ()
This converts calls like super(Foo, self) -> super().

Generated with:

    sed -i "" -Ee 's/super\([^\(]+\)/super()/g' **/*.py
2020-09-18 09:56:44 -04:00
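For readers without a BSD-style `sed`, the substitution quoted in commit 8a4a4186de above can be approximated in Python with the same pattern. The function name `simplify_super` is hypothetical, and like the original one-liner the pattern is coarse: it rewrites any `super(...)` whose arguments contain no opening parenthesis.

```python
import re

# Same pattern as the sed expression: "super(", one or more characters
# that are not "(", then ")".
SUPER_CALL = re.compile(r"super\([^\(]+\)")


def simplify_super(source: str) -> str:
    """Rewrite Python 2 style super(Foo, self) calls to super()."""
    return SUPER_CALL.sub("super()", source)


if __name__ == "__main__":
    print(simplify_super("super(Foo, self).__init__(x)"))
```

Calls that are already in the zero-argument form are left alone, because the pattern requires at least one character between the parentheses and a closing parenthesis before the next opening one.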
Jonathan de Jong
837293c314
Remove obsolete __future__ imports () 2020-09-17 08:37:01 -04:00
Andrew Morgan
a3a90ee031
Show a confirmation page during user password reset ()
This PR adds a confirmation step to resetting your user password between clicking the link in your email and your password actually being reset.

This is to better align our password reset flow with the industry standard of requiring a confirmation from the user after email validation.
2020-09-10 11:45:12 +01:00
Patrick Cloke
72bec36d50
Directly import json from the standard library. ()
By importing from canonicaljson the simplejson module was still being used
in some situations. After this change the std lib json is consistently used
throughout Synapse.
2020-09-08 07:33:48 -04:00
Patrick Cloke
c619253db8
Stop sub-classing object () 2020-09-04 06:54:56 -04:00
Patrick Cloke
d250521cf5
Convert the main methods run by the reactor to async. () 2020-09-02 07:44:50 -04:00
Richard van der Hoff
8027166dd5 Add a comment about _LimitedHostnameResolver 2020-08-29 00:06:00 +01:00
Erik Johnston
0f1afbe8dc Change HomeServer definition to work with typing.
Duplicating function signatures between server.py and server.pyi is
silly. This commit changes that by changing all `build_*` methods to
`get_*` methods and changing the `_make_dependency_method` to work
as a descriptor that caches the produced value.

There are some changes in other files that were made to fix the typing
in server.py.
2020-08-11 18:00:17 +01:00
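A descriptor that caches the produced value, as mentioned in commit 0f1afbe8dc above, could be sketched like this. It is not the code from that commit: `cached_getter` and `ExampleServer` are hypothetical names, and the real `HomeServer` may differ in how it stores the cached values.

```python
class cached_getter:
    """Descriptor wrapping a builder method so that `obj.get_x()` builds
    the value on the first call and returns the cached one afterwards."""

    def __init__(self, builder):
        self._builder = builder
        self._attr = "_cached_" + builder.__name__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class itself

        def getter():
            try:
                return instance.__dict__[self._attr]
            except KeyError:
                value = self._builder(instance)
                instance.__dict__[self._attr] = value
                return value

        return getter


class ExampleServer:
    """Hypothetical stand-in for a class with many get_* dependencies."""

    builds = 0  # counts how often the builder actually ran

    @cached_getter
    def get_clock(self):
        ExampleServer.builds += 1
        return object()
```

Because the signature lives on the decorated method itself, a type checker can see it without a separate `.pyi` stub, which is the duplication the commit message complains about.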
Erik Johnston
7620912d84
Add health check endpoint () 2020-08-07 14:21:24 +01:00
Erik Johnston
a7bdf98d01
Rename database classes to make some sense () 2020-08-05 21:38:57 +01:00
Richard van der Hoff
916cf2d439
re-implement daemonize ()
This has long been something I've wanted to do. Basically the `Daemonize` code
is both too flexible and not flexible enough, in that it offers a bunch of
features that we don't use (changing UID, closing FDs in the child, logging to
syslog) and doesn't offer a bunch that we could do with (redirecting stdout/err
to a file instead of /dev/null; having the parent not exit until the child is
running).

As a first step, I've lifted the Daemonize code and removed the bits we don't
use. This should be a non-functional change. Fixing everything else will come
later.
2020-08-04 10:03:41 +01:00
Patrick Cloke
db5970ac6d
Convert ACME code to async/await. () 2020-08-03 07:09:33 -04:00
Olivier Wilkinson (reivilibre)
3aa36b782c Merge branch 'master' into develop 2020-07-30 15:18:36 +01:00
Patrick Cloke
3950ae51ef
Ensure that remove_pusher is always async () 2020-07-30 06:56:55 -04:00
Erik Johnston
2c1b9d6763
Update worker docs with recent enhancements () 2020-07-29 23:22:13 +01:00
Erik Johnston
84d099ae11
Fix typing replication not being handled on master ()
Handling of incoming typing stream updates from replication was not
hooked up on master, affecting setups where typing was handled on a
different worker.

This is really only a problem if the master process is also handling
sync requests, which is unlikely for those that are at the stage of
moving typing off.

The other observable effect is that if a worker restarts or a
replication connection drops then the typing worker will issue a
`POSITION typing`, triggering master process to try and stream *all*
typing updates from position 0.

Fixes 
2020-07-27 14:10:53 +01:00
Patrick Cloke
00e57b755c
Convert synapse.app to async/await. () 2020-07-17 07:08:56 -04:00
Erik Johnston
f2e38ca867
Allow moving typing off master () 2020-07-16 15:12:54 +01:00
Erik Johnston
f299441cc6
Add ability to shard the federation sender () 2020-07-10 18:26:36 +01:00
Patrick Cloke
8fa7fdd4cb
Pass original request headers from workers to the main process. () 2020-07-09 07:34:46 -04:00
Patrick Cloke
4d978d7db4 Merge branch 'master' into develop 2020-07-02 10:55:41 -04:00
Patrick Cloke
ea26e9a98b Ensure that HTML pages served from Synapse include headers to avoid embedding. 2020-07-02 09:58:31 -04:00
Richard van der Hoff
03619324fc
Create a ListenerConfig object ()
This ended up being a bit more invasive than I'd hoped for (not helped by
generic_worker duplicating some of the code from homeserver), but hopefully
it's an improvement.

The idea is that, rather than storing unstructured `dict`s in the config for
the listener configurations, we instead parse it into a structured
`ListenerConfig` object.
2020-06-16 12:44:07 +01:00
Patrick Cloke
7d2532be36
Discard RDATA from already seen positions. () 2020-06-15 08:44:54 -04:00
Patrick Cloke
bd6dc17221
Replace iteritems/itervalues/iterkeys with native versions. () 2020-06-15 07:03:36 -04:00
Patrick Cloke
02f345d053
Attempt to fix PhoneHomeStatsTestCase.test_performance_100 being flaky. () 2020-06-05 07:36:47 -04:00
Andrew Morgan
e91abfd291
async/await get_user_id_by_threepid ()
Based on  

async's `get_user_id_by_threepid` and its call stack.
2020-06-03 17:15:57 +01:00
Erik Johnston
ef3934ec8f Ensure we persist and ack the same token 2020-05-27 19:45:42 +01:00
Erik Johnston
35c308731d Speed up processing of federation stream RDATA rows.
Instead of storing and sending an ACK for every single row we send
synchronously, we instead do it asynchronously while batching up
updates.
2020-05-27 19:34:07 +01:00
Richard van der Hoff
04729b86f8
Fix incorrect exception handling in KeyUploadServlet.on_POST ()
Introduced in 
2020-05-26 11:42:22 +01:00
Richard van der Hoff
00db90f409
Fix recording of federation stream token ()
A couple of changes of significance:

 * remove the `_last_ack < federation_position` condition, so that
   updates will still be correctly processed after restart

 * Correctly wire up send_federation_ack to the right class.
2020-05-26 11:41:38 +01:00
Erik Johnston
e5c67d04db
Add option to move event persistence off master () 2020-05-22 16:11:35 +01:00
Patrick Cloke
4429764c9f
Return 200 OK for all OPTIONS requests () 2020-05-22 09:30:07 -04:00
Erik Johnston
547e4dd83e
Fix exception reporting due to HTTP request errors. ()
These are business as usual errors, rather than stuff we want to log at
error.
2020-05-22 11:39:20 +01:00
Richard van der Hoff
0bbbd10513
Stub out GET presence requests in the frontend proxy ()
We don't really make any promises about returning accurate presence data when
presence is disabled, so we may as well just return a static response, rather
than making the master handle a request.
2020-05-21 14:36:46 +01:00
Erik Johnston
51055c8c44
Allow ReplicationRestResource to be added to workers ()
This allows workers to talk to each other over HTTP replication.
2020-05-18 12:24:48 +01:00
Erik Johnston
03aff4c75e
Add a worker store for search insertion. ()
This is required as both event persistence and the background update need access to this function. It should be perfectly safe for two workers to write to that table at the same time.
2020-05-15 17:22:47 +01:00
Erik Johnston
4734a7bbe4
Move EventStream handling into default ReplicationDataHandler ()
This is so that the logic can happen on both master and workers when we move event persistence out.
2020-05-14 14:01:39 +01:00
Erik Johnston
1124111a12
Allow censoring of events to happen on workers. ()
This is safe as we can now write to cache invalidation stream on workers, and is required for when we move event persistence off master.
2020-05-13 17:15:40 +01:00
Erik Johnston
1a1da60ad2
Fix new flake8 errors () 2020-05-12 11:20:48 +01:00
Amber Brown
7cb8b4bc67
Allow configuration of Synapse's cache without using synctl or environment variables () 2020-05-11 18:45:23 +01:00
Quentin Gliech
616af44137
Implement OpenID Connect-based login () 2020-05-08 08:30:40 -04:00
Erik Johnston
0e719f2398
Thread through instance name to replication client. ()
For in memory streams when fetching updates on workers we need to query the source of the stream, which currently is hard coded to be master. This PR threads through the source instance we received via `POSITION` through to the update function in each stream, which can then be passed to the replication client for in memory streams.
2020-05-01 17:19:56 +01:00
Erik Johnston
3085cde577
Use stream.current_token() and remove stream_positions() ()
We move the processing of typing and federation replication traffic into their handlers so that `Stream.current_token()` points to a valid token. This allows us to remove `get_streams_to_replicate()` and `stream_positions()`.
2020-05-01 15:21:35 +01:00
Patrick Cloke
627b0f5f27
Persist user interactive authentication sessions ()
By persisting the user interactive authentication sessions to the database, this fixes
situations where a user hits different workers throughout their auth session and also
allows sessions to persist through restarts of Synapse.
2020-04-30 13:47:49 -04:00
Erik Johnston
37f6823f5b
Add instance name to RDATA/POSITION commands ()
This is primarily for allowing us to send those commands from workers, but for now simply allows us to ignore echoed RDATA/POSITION commands that we sent (we get echoes of sent commands when using redis). Currently we log a WARNING on the master process every time we receive an echoed RDATA.
2020-04-29 16:23:08 +01:00
Erik Johnston
38919b521e
Run replication streamers on workers ()
Currently we never write to streams from workers, but that will change soon
2020-04-28 13:34:12 +01:00
Richard van der Hoff
71a1abb8a1
Stop the master relaying USER_SYNC for other workers ()
Long story short: if we're handling presence on the current worker, we shouldn't be sending USER_SYNC commands over replication.

In an attempt to figure out what is going on here, I ended up refactoring some bits of the presencehandler code, so the first 4 commits here are non-functional refactors to move this code slightly closer to sanity. (There's still plenty to do here :/). Suggest reviewing individual commits.

Fixes (I hope) .
2020-04-22 22:39:04 +01:00
Richard van der Hoff
2aa5bf13c8 Merge branch 'release-v1.12.4' into develop 2020-04-22 13:09:23 +01:00
Erik Johnston
51f7eaf908
Add ability to run replication protocol over redis. ()
This is configured via the `redis` config options.
2020-04-22 13:07:41 +01:00
Richard van der Hoff
974c0d726a
Support GET account_data requests on a worker () 2020-04-21 10:46:30 +01:00
Erik Johnston
5016b162fc
Move client command handling out of TCP protocol ()
The aim here is to move the command handling out of the TCP protocol classes and to also merge the client and server command handling (so that we can reuse them for redis protocol). This PR simply moves the client paths to the new `ReplicationCommandHandler`, a future PR will move the server paths too.
2020-04-06 09:58:42 +01:00
Martin Milata
b0db928c63
Extend web_client_location to handle absolute URLs ()
Log warning when filesystem path is used.

Signed-off-by: Martin Milata <martin@martinmilata.cz>
2020-04-03 11:57:34 -04:00
Richard van der Hoff
bae32740da
Remove some run_in_background calls in replication code ()
By running this stuff with `run_in_background`, it won't be correctly reported
against the relevant CPU usage stats.

Fixes 
2020-04-03 12:29:30 +01:00
Erik Johnston
db098ec994 Fix starting workers when federation sending not split out. 2020-03-31 11:25:21 +01:00
Erik Johnston
4f21c33be3
Remove usage of "conn_id" for presence. ()
* Remove `conn_id` usage for UserSyncCommand.

Each tcp replication connection is assigned a "conn_id", which is used
to give an ID to a remotely connected worker. In a redis world, there
will no longer be a one to one mapping between connection and instance,
so instead we need to replace such usages with an ID generated by the
remote instances and included in the replication commands.

This really only affects UserSyncCommand.

* Add CLEAR_USER_SYNCS command that is sent on shutdown.

This should help with the case where a synchrotron gets restarted
gracefully, rather than rely on 5 minute timeout.
2020-03-30 16:37:24 +01:00
Erik Johnston
4cff617df1
Move catchup of replication streams to worker. ()
This changes the replication protocol so that the server does not send down `RDATA` for rows that happened before the client connected. Instead, the server will send a `POSITION` and clients then query the database (or master out of band) to get up to date.
2020-03-25 14:54:01 +00:00
Erik Johnston
b1cfaf08af
Merge pull request from matrix-org/erikj/fix_worker_startup
Fix starting workers when federation sending not split out.
2020-03-25 09:42:39 +00:00
Erik Johnston
c816072d47 Fix starting workers when federation sending not split out. 2020-03-24 10:35:00 +00:00
Richard van der Hoff
a564b92d37
Convert *StreamRow classes to inner classes ()
This just helps keep the rows closer to their streams, so that it's easier to
see what the format of each stream is.
2020-03-23 13:59:11 +00:00
Richard van der Hoff
b3cee0ce67
Fix processing of groups stream, and use symbolic names for streams ()
`groups` != `receipts`

Introduced in 
2020-03-23 11:39:36 +00:00
Erik Johnston
a319cb1dd1
Change device list streams to have one row per ID ()
* Add 'device_lists_outbound_pokes' as extra table.

This makes sure we check all the relevant tables to get the current max
stream ID.

Currently not doing so isn't problematic as the max stream ID in
`device_lists_outbound_pokes` is the same as in `device_lists_stream`,
however that will change.

* Change device lists stream to have one row per id.

This will make it possible to process the streams more incrementally,
avoiding having to process large chunks at once.

* Change device list replication to match new semantics.

Instead of sending down batches of user ID/host tuples, send down a row
per entity (user ID or host).

* Newsfile

* Remove handling of multiple rows per ID

* Fix worker handling

* Comments from review
2020-03-19 11:36:53 +00:00
Richard van der Hoff
443162e577
Move pusherpool startup into _base.setup ()
This should be safe to do on all workers/masters because it is guarded by
a config option which will ensure it is only actually done on the worker
assigned as a pusher.
2020-03-19 09:48:45 +00:00
Erik Johnston
6e6476ef07 Comments from review 2020-03-18 10:13:55 +00:00
Neil Johnson
1d66dce83e
Break down monthly active users by appservice_id ()
* Break down monthly active users by appservice_id and emit via prometheus.

Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
2020-03-06 18:14:19 +00:00
Erik Johnston
e53744c737 Fix worker handling 2020-03-02 12:52:28 +00:00
Erik Johnston
9ce4e344a8 Change device list replication to match new semantics.
Instead of sending down batches of user ID/host tuples, send down a row
per entity (user ID or host).
2020-02-28 11:25:34 +00:00
Erik Johnston
2201bc9795
Don't refuse to start worker if media listener configured. ()
Instead let's just warn if the worker has a media listener configured but
has the media repository disabled.

Previously non media repository workers would just ignore the media
listener.
2020-02-27 16:33:21 +00:00
Erik Johnston
bbf8886a05
Merge worker apps into one. () 2020-02-25 16:56:55 +00:00
Patrick Cloke
509e381afa
Clarify list/set/dict/tuple comprehensions and enforce via flake8 ()
Ensure good comprehension hygiene using flake8-comprehensions.
2020-02-21 07:15:07 -05:00
Erik Johnston
fc87d2ffb3
Freeze allocated objects on startup. ()
This may make gc go a bit faster as the gc will know things like
caches/data stores etc. are frozen without having to check.
2020-02-19 15:09:00 +00:00
Erik Johnston
21db35f77e
Add support for putting fed user query API on workers () 2020-02-07 15:45:39 +00:00
Erik Johnston
de2d267375
Allow moving group read APIs to workers () 2020-02-07 11:14:19 +00:00
Erik Johnston
6b9e1014cf
Fix race in federation sender that delayed device updates. ()
We were sending device updates down both the federation stream and
device streams. This meant there was a race if the federation sender
worker processed the federation stream first, as when the sender checked
if there were new device updates the slaved ID generator hadn't been
updated with the new stream IDs and so returned nothing.

This situation is correctly handled by events/receipts/etc by not
sending updates down the federation stream and instead having the
federation sender worker listen on the other streams and poke the
transaction queues as appropriate.
2020-01-29 11:23:01 +00:00
Neil Johnson
5e52d8563b Allow monthly active user limiting support for worker mode, fixes . () 2020-01-22 11:05:14 +00:00
Erik Johnston
a8a50f5b57
Wake up transaction queue when remote server comes back online ()
This will be used to retry outbound transactions to a remote server if
we think it might have come back up.
2020-01-17 10:27:19 +00:00
Erik Johnston
48c3a96886
Port synapse.replication.tcp to async/await ()
* Port synapse.replication.tcp to async/await

* Newsfile

* Correctly document type of on_<FOO> functions as async

* Don't be overenthusiastic with the asyncing....
2020-01-16 09:16:12 +00:00
Richard van der Hoff
8039685051
Allow additional_resources to implement Resource directly ()
AdditionalResource really doesn't add any value, and it gets in the way for
resources which want to support child resources or the like. So, if the
resource object already implements the IResource interface, don't bother
wrapping it.
2020-01-13 12:42:44 +00:00
Erik Johnston
1adf27c82a Import RoomStore in media worker to fix admin APIs 2020-01-08 13:26:20 +00:00
Richard van der Hoff
26c5d3d398 Fix exceptions in log when rejected event is replicated 2020-01-06 17:16:28 +00:00
Richard van der Hoff
c74de81bfc async/await for SyncReplicationHandler.process_and_notify 2020-01-06 17:14:28 +00:00
Richard van der Hoff
e484101306
Raise an error if someone tries to use the log_file config option ()
This has caused some confusion for people who didn't notice it going away.
2020-01-03 17:11:29 +00:00
Richard van der Hoff
98247c4a0e
Remove unused, undocumented "content repo" resource ()
This looks like it got half-killed back in .

Fixes .
2020-01-03 17:10:52 +00:00
Erik Johnston
3d46124ad0
Port some admin handlers to async/await () 2019-12-19 15:07:28 +00:00
Richard van der Hoff
bca30cefee
Improve diagnostics on database upgrade failure ()
`Failed to upgrade database` is not helpful, and it's unlikely that UPGRADE.rst
has anything useful.
2019-12-19 14:53:15 +00:00
Richard van der Hoff
0b794cbd7b
Fix sdnotify with acme enabled ()
If acme was enabled, the sdnotify startup hook would never be run because we
would try to add it to a hook which had already fired.

There's no need to delay it: we can sdnotify as soon as we've started the
listeners.
2019-12-19 14:52:52 +00:00
Erik Johnston
b8e4b39b69
Merge pull request from matrix-org/erikj/remove_db_config_from_apps
Move database config from apps into HomeServer object
2019-12-12 10:37:56 +00:00