Commit Graph

707 Commits

Author SHA1 Message Date
Patrick Cloke
627b0f5f27
Persist user interactive authentication sessions (#7302)
By persisting the user interactive authentication sessions to the database, this fixes
situations where a user hits different works throughout their auth session and also
allows sessions to persist through restarts of Synapse.
2020-04-30 13:47:49 -04:00
Erik Johnston
37f6823f5b
Add instance name to RDATA/POSITION commands (#7364)
This is primarily for allowing us to send those commands from workers, but for now simply allows us to ignore echoed RDATA/POSITION commands that we sent (we get echoes of sent commands when using redis). Currently we log a WARNING on the master process every time we receive an echoed RDATA.
2020-04-29 16:23:08 +01:00
Erik Johnston
38919b521e
Run replication streamers on workers (#7146)
Currently we never write to streams from workers, but that will change soon
2020-04-28 13:34:12 +01:00
Richard van der Hoff
71a1abb8a1
Stop the master relaying USER_SYNC for other workers (#7318)
Long story short: if we're handling presence on the current worker, we shouldn't be sending USER_SYNC commands over replication.

In an attempt to figure out what is going on here, I ended up refactoring some bits of the presencehandler code, so the first 4 commits here are non-functional refactors to move this code slightly closer to sanity. (There's still plenty to do here :/). Suggest reviewing individual commits.

Fixes (I hope) #7257.
2020-04-22 22:39:04 +01:00
Richard van der Hoff
2aa5bf13c8 Merge branch 'release-v1.12.4' into develop 2020-04-22 13:09:23 +01:00
Erik Johnston
51f7eaf908
Add ability to run replication protocol over redis. (#7040)
This is configured via the `redis` config options.
2020-04-22 13:07:41 +01:00
Richard van der Hoff
974c0d726a
Support GET account_data requests on a worker (#7311) 2020-04-21 10:46:30 +01:00
Erik Johnston
5016b162fc
Move client command handling out of TCP protocol (#7185)
The aim here is to move the command handling out of the TCP protocol classes and to also merge the client and server command handling (so that we can reuse them for redis protocol). This PR simply moves the client paths to the new `ReplicationCommandHandler`, a future PR will move the server paths too.
2020-04-06 09:58:42 +01:00
Martin Milata
b0db928c63
Extend web_client_location to handle absolute URLs (#7006)
Log warning when filesystem path is used.

Signed-off-by: Martin Milata <martin@martinmilata.cz>
2020-04-03 11:57:34 -04:00
Richard van der Hoff
bae32740da
Remove some run_in_background calls in replication code (#7203)
By running this stuff with `run_in_background`, it won't be correctly reported
against the relevant CPU usage stats.

Fixes #7202
2020-04-03 12:29:30 +01:00
Erik Johnston
db098ec994 Fix starting workers when federation sending not split out. 2020-03-31 11:25:21 +01:00
Erik Johnston
4f21c33be3
Remove usage of "conn_id" for presence. (#7128)
* Remove `conn_id` usage for UserSyncCommand.

Each tcp replication connection is assigned a "conn_id", which is used
to give an ID to a remotely connected worker. In a redis world, there
will no longer be a one to one mapping between connection and instance,
so instead we need to replace such usages with an ID generated by the
remote instances and included in the replicaiton commands.

This really only effects UserSyncCommand.

* Add CLEAR_USER_SYNCS command that is sent on shutdown.

This should help with the case where a synchrotron gets restarted
gracefully, rather than rely on 5 minute timeout.
2020-03-30 16:37:24 +01:00
Erik Johnston
4cff617df1
Move catchup of replication streams to worker. (#7024)
This changes the replication protocol so that the server does not send down `RDATA` for rows that happened before the client connected. Instead, the server will send a `POSITION` and clients then query the database (or master out of band) to get up to date.
2020-03-25 14:54:01 +00:00
Erik Johnston
b1cfaf08af
Merge pull request #7133 from matrix-org/erikj/fix_worker_startup
Fix starting workers when federation sending not split out.
2020-03-25 09:42:39 +00:00
Erik Johnston
c816072d47 Fix starting workers when federation sending not split out. 2020-03-24 10:35:00 +00:00
Richard van der Hoff
a564b92d37
Convert *StreamRow classes to inner classes (#7116)
This just helps keep the rows closer to their streams, so that it's easier to
see what the format of each stream is.
2020-03-23 13:59:11 +00:00
Richard van der Hoff
b3cee0ce67
Fix processing of groups stream, and use symbolic names for streams (#7117)
`groups` != `receipts`

Introduced in #6964
2020-03-23 11:39:36 +00:00
Erik Johnston
a319cb1dd1
Change device list streams to have one row per ID (#7010)
* Add 'device_lists_outbound_pokes' as extra table.

This makes sure we check all the relevant tables to get the current max
stream ID.

Currently not doing so isn't problematic as the max stream ID in
`device_lists_outbound_pokes` is the same as in `device_lists_stream`,
however that will change.

* Change device lists stream to have one row per id.

This will make it possible to process the streams more incrementally,
avoiding having to process large chunks at once.

* Change device list replication to match new semantics.

Instead of sending down batches of user ID/host tuples, send down a row
per entity (user ID or host).

* Newsfile

* Remove handling of multiple rows per ID

* Fix worker handling

* Comments from review
2020-03-19 11:36:53 +00:00
Richard van der Hoff
443162e577
Move pusherpool startup into _base.setup (#7104)
This should be safe to do on all workers/masters because it is guarded by
a config option which will ensure it is only actually done on the worker
assigned as a pusher.
2020-03-19 09:48:45 +00:00
Erik Johnston
6e6476ef07 Comments from review 2020-03-18 10:13:55 +00:00
Neil Johnson
1d66dce83e
Break down monthly active users by appservice_id (#7030)
* Break down monthly active users by appservice_id and emit via prometheus.

Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
2020-03-06 18:14:19 +00:00
Erik Johnston
e53744c737 Fix worker handling 2020-03-02 12:52:28 +00:00
Erik Johnston
9ce4e344a8 Change device list replication to match new semantics.
Instead of sending down batches of user ID/host tuples, send down a row
per entity (user ID or host).
2020-02-28 11:25:34 +00:00
Erik Johnston
2201bc9795
Don't refuse to start worker if media listener configured. (#7002)
Instead lets just warn if the worker has a media listener configured but
has the media repository disabled.

Previously non media repository workers would just ignore the media
listener.
2020-02-27 16:33:21 +00:00
Erik Johnston
bbf8886a05
Merge worker apps into one. (#6964) 2020-02-25 16:56:55 +00:00
Patrick Cloke
509e381afa
Clarify list/set/dict/tuple comprehensions and enforce via flake8 (#6957)
Ensure good comprehension hygiene using flake8-comprehensions.
2020-02-21 07:15:07 -05:00
Erik Johnston
fc87d2ffb3
Freeze allocated objects on startup. (#6953)
This may make gc go a bit faster as the gc will know things like
caches/data stores etc. are frozen without having to check.
2020-02-19 15:09:00 +00:00
Erik Johnston
21db35f77e
Add support for putting fed user query API on workers (#6873) 2020-02-07 15:45:39 +00:00
Erik Johnston
de2d267375
Allow moving group read APIs to workers (#6866) 2020-02-07 11:14:19 +00:00
Erik Johnston
6b9e1014cf
Fix race in federation sender that delayed device updates. (#6799)
We were sending device updates down both the federation stream and
device streams. This mean there was a race if the federation sender
worker processed the federation stream first, as when the sender checked
if there were new device updates the slaved ID generator hadn't been
updated with the new stream IDs and so returned nothing.

This situation is correctly handled by events/receipts/etc by not
sending updates down the federation stream and instead having the
federation sender worker listen on the other streams and poke the
transaction queues as appropriate.
2020-01-29 11:23:01 +00:00
Neil Johnson
5e52d8563b Allow monthly active user limiting support for worker mode, fixes #4639. (#6742) 2020-01-22 11:05:14 +00:00
Erik Johnston
a8a50f5b57
Wake up transaction queue when remote server comes back online (#6706)
This will be used to retry outbound transactions to a remote server if
we think it might have come back up.
2020-01-17 10:27:19 +00:00
Erik Johnston
48c3a96886
Port synapse.replication.tcp to async/await (#6666)
* Port synapse.replication.tcp to async/await

* Newsfile

* Correctly document type of on_<FOO> functions as async

* Don't be overenthusiastic with the asyncing....
2020-01-16 09:16:12 +00:00
Richard van der Hoff
8039685051
Allow additional_resources to implement Resource directly (#6686)
AdditionalResource really doesn't add any value, and it gets in the way for
resources which want to support child resources or the like. So, if the
resource object already implements the IResource interface, don't bother
wrapping it.
2020-01-13 12:42:44 +00:00
Erik Johnston
1adf27c82a Import RoomStore in media worker to fix admin APIs 2020-01-08 13:26:20 +00:00
Richard van der Hoff
26c5d3d398 Fix exceptions in log when rejected event is replicated 2020-01-06 17:16:28 +00:00
Richard van der Hoff
c74de81bfc async/await for SyncReplicationHandler.process_and_notify 2020-01-06 17:14:28 +00:00
Richard van der Hoff
e484101306
Raise an error if someone tries to use the log_file config option (#6626)
This has caused some confusion for people who didn't notice it going away.
2020-01-03 17:11:29 +00:00
Richard van der Hoff
98247c4a0e
Remove unused, undocumented "content repo" resource (#6628)
This looks like it got half-killed back in #888.

Fixes #6567.
2020-01-03 17:10:52 +00:00
Erik Johnston
3d46124ad0
Port some admin handlers to async/await (#6559) 2019-12-19 15:07:28 +00:00
Richard van der Hoff
bca30cefee
Improve diagnostics on database upgrade failure (#6570)
`Failed to upgrade database` is not helpful, and it's unlikely that UPGRADE.rst
has anything useful.
2019-12-19 14:53:15 +00:00
Richard van der Hoff
0b794cbd7b
Fix sdnotify with acme enabled (#6571)
If acme was enabled, the sdnotify startup hook would never be run because we
would try to add it to a hook which had already fired.

There's no need to delay it: we can sdnotify as soon as we've started the
listeners.
2019-12-19 14:52:52 +00:00
Erik Johnston
b8e4b39b69
Merge pull request #6511 from matrix-org/erikj/remove_db_config_from_apps
Move database config from apps into HomeServer object
2019-12-12 10:37:56 +00:00
Erik Johnston
adb3a873fd Synapse 1.7.0rc2 (2019-12-11)
=============================
 
 Bugfixes
 --------
 
 - Fix incorrect error message for invalid requests when setting user's avatar URL. ([\#6497](https://github.com/matrix-org/synapse/issues/6497))
 - Fix support for SQLite 3.7. ([\#6499](https://github.com/matrix-org/synapse/issues/6499))
 - Fix regression where sending email push would not work when using a pusher worker. ([\#6507](https://github.com/matrix-org/synapse/issues/6507), [\#6509](https://github.com/matrix-org/synapse/issues/6509))
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEEumuwyPtYLL2OMhYdOtoG7cdT0R4FAl3w+RwQHGVyaWtAbWF0
 cml4Lm9yZwAKCRA62gbtx1PRHijfD/43EQ58jqKvD+qKZwVFOE6JJ3SCS7UJbi3f
 zq9KWDuCB6EAFjAgmJikbBBqHPO5qq2WtNXaCpexx44s8Mk8SZKHg56dP6Fk651C
 assAb/Nsh6CAlPUcRkx8I0L/kYXMPDyATlLVBHVOi3pFDJ093mdOQ4q8yP9iUTM+
 OPsbT8k/pMhrhCH951bGmB6/SEcju+ubObW+bRFe8o3v1KE9jVYQjUGMhuoXp3pM
 z/OB8idZcqOvCc6HMo83tg9FuI613Jy80PMIc1ofyJgvnu+aDBepWuldvFEMnmSR
 D862jMor7+WdnDOTeWZrC+DXjl0qCP8F6ahs5rEllRglt/Ep2wEDPA/8YrRoEI3V
 RWe9W7XDdFCXdzlvXheOfETqTu9kdsurTwBEeJrWQ0vOLY86hxt9KKKcHVhy5eKq
 kNfRtvSVLmRhIssp7hVBcywRwnaxN7R2OoRq/TWnTZz+xEOPzYFU6r0l9kk5dbg6
 fqYYxIXbgZlhjihLWNIMNwwZH9ll/eoPiinkRTZNz40THP/VDTR9DM4tQ6jrhrr6
 sP0qM7LljvrdiimXtV00tUDyUspNgJl6xDJyGDHWM9uoCB7uorEpMQZSzlZhZe8s
 6q+fQPHlfW4JYOejfihPkjrV1ViawvEucqWPaIsD+v26C4RP7qCTtYj3NPiA65of
 zhOjomWWcg==
 =ABoZ
 -----END PGP SIGNATURE-----

Merge tag 'v1.7.0rc2' into develop

Synapse 1.7.0rc2 (2019-12-11)
=============================

Bugfixes
--------

- Fix incorrect error message for invalid requests when setting user's avatar URL. ([\#6497](https://github.com/matrix-org/synapse/issues/6497))
- Fix support for SQLite 3.7. ([\#6499](https://github.com/matrix-org/synapse/issues/6499))
- Fix regression where sending email push would not work when using a pusher worker. ([\#6507](https://github.com/matrix-org/synapse/issues/6507), [\#6509](https://github.com/matrix-org/synapse/issues/6509))
2019-12-11 14:14:30 +00:00
Erik Johnston
bc5cb8bfe8 Remove database config parsing from apps. 2019-12-10 14:34:17 +00:00
Erik Johnston
663238aeb4 Phone home stats DB reporting should not assume a single DB. 2019-12-10 13:21:04 +00:00
Brendan Abolivier
2ac78438d8
Make the PusherSlaveStore inherit from the slave RoomStore
So that it has access to the get_retention_policy_for_room function which is required by filter_events_for_client.
2019-12-10 12:31:03 +00:00
Erik Johnston
75f87450d8 Move start up DB checks to main data store. 2019-12-06 16:02:21 +00:00
Erik Johnston
d64bb32a73 Move are_all_users_on_domain checks to main data store. 2019-12-06 13:43:40 +00:00
Erik Johnston
9a4fb457cf Change DataStores to accept 'database' param. 2019-12-06 13:30:06 +00:00