Commit Graph

67 Commits

Author SHA1 Message Date
Richard van der Hoff
5d281c10dd
Stop BackgroundProcessLoggingContext making new prometheus timeseries (#9854)
This undoes part of b076bc276e.
2021-04-21 10:03:31 +01:00
Patrick Cloke
b076bc276e
Always use the name as the log ID. (#9829)
As far as I can tell our logging contexts are meant to log the request ID, or sometimes the request ID followed by a suffix (this is generally stored in the name field of LoggingContext). There's also code to log the name@memory location, but I'm not sure this is ever used.

This simplifies the code paths to require every logging context to have a name and use that in logging. For sub-contexts (created via nested_logging_contexts, defer_to_threadpool, Measure) we use the current context's str (which becomes their name or the string "sentinel") and then potentially modify that (e.g. add a suffix).
2021-04-20 14:19:00 +01:00
Patrick Cloke
48d44ab142
Record more information into structured logs. (#9654)
Records additional request information into the structured logs,
e.g. the requester, IP address, etc.
2021-04-08 08:01:14 -04:00
Erik Johnston
b5efcb577e
Make it possible to use dmypy (#9692)
Running `dmypy run` will do a `mypy` check while spinning up a daemon
that makes rerunning `dmypy run` a lot faster.

`dmypy` doesn't support `follow_imports = silent` and has
`local_partial_types` enabled, so this PR enables those options and
fixes the issues that were newly raised. Note that `local_partial_types`
will be enabled by default in upcoming mypy releases.
2021-03-26 16:49:46 +00:00
Patrick Cloke
d29b71aa50
Fix remaining mypy issues due to Twisted upgrade. (#9608) 2021-03-15 11:14:39 -04:00
Patrick Cloke
55da8df078
Fix additional type hints from Twisted 21.2.0. (#9591) 2021-03-12 11:37:57 -05:00
Eric Eastwood
0a00b7ff14
Update black, and run auto formatting over the codebase (#9381)
- Update black version to the latest
 - Run black auto formatting over the codebase
    - Run autoformatting according to [`docs/code_style.md
`](80d6dc9783/docs/code_style.md)
 - Update `code_style.md` docs around installing black to use the correct version
2021-02-16 22:32:34 +00:00
Patrick Cloke
1619802228
Various clean-ups to the logging context code (#8935) 2020-12-14 14:19:47 -05:00
Patrick Cloke
4ff0201e62
Enable mypy checking for unreachable code and fix instances. (#8432) 2020-10-01 08:09:18 -04:00
Patrick Cloke
c619253db8
Stop sub-classing object (#8249) 2020-09-04 06:54:56 -04:00
Richard van der Hoff
f57b99af22
Handle replication commands synchronously where possible (#7876)
Most of the stuff we do for replication commands can be done synchronously. There's no point spinning up background processes if we're not going to need them.
2020-07-27 18:54:43 +01:00
Richard van der Hoff
05060e0223
Track command processing as a background process (#7879)
I'm going to be doing more stuff synchronously, and I don't want to lose the
CPU metrics down the sofa.
2020-07-22 00:40:42 +01:00
Patrick Cloke
38e1fac886
Fix some spelling mistakes / typos. (#7811) 2020-07-09 09:52:58 -04:00
Erik Johnston
3eab76ad43
Don't relay REMOTE_SERVER_UP cmds to same conn. (#7352)
For direct TCP connections we need the master to relay REMOTE_SERVER_UP
commands to the other connections so that all instances get notified
about it. The old implementation just relayed to all connections,
assuming that sending back to the original sender of the command was
safe. This is not true for redis, where commands sent get echoed back to
the sender, which was causing master to effectively infinite loop
sending and then re-receiving REMOTE_SERVER_UP commands that it sent.

The fix is to ensure that we only relay to *other* connections and not
to the connection we received the notification from.

Fixes #7334.
2020-04-29 14:10:59 +01:00
Erik Johnston
841c581c40
Fix replication metrics when using redis (#7325) 2020-04-22 16:26:19 +01:00
Erik Johnston
51f7eaf908
Add ability to run replication protocol over redis. (#7040)
This is configured via the `redis` config options.
2020-04-22 13:07:41 +01:00
Richard van der Hoff
e13c6c7a96 Handle one-word replication commands correctly
`REPLICATE` is now a valid command, and it's nice if you can issue it from the
console without remembering to call it `REPLICATE ` with a trailing space.
2020-04-07 17:43:46 +01:00
Erik Johnston
82498ee901
Move server command handling out of TCP protocol (#7187)
This completes the merging of server and client command processing.
2020-04-07 10:51:07 +01:00
Erik Johnston
5016b162fc
Move client command handling out of TCP protocol (#7185)
The aim here is to move the command handling out of the TCP protocol classes and to also merge the client and server command handling (so that we can reuse them for redis protocol). This PR simply moves the client paths to the new `ReplicationCommandHandler`, a future PR will move the server paths too.
2020-04-06 09:58:42 +01:00
Erik Johnston
4f21c33be3
Remove usage of "conn_id" for presence. (#7128)
* Remove `conn_id` usage for UserSyncCommand.

Each tcp replication connection is assigned a "conn_id", which is used
to give an ID to a remotely connected worker. In a redis world, there
will no longer be a one to one mapping between connection and instance,
so instead we need to replace such usages with an ID generated by the
remote instances and included in the replicaiton commands.

This really only effects UserSyncCommand.

* Add CLEAR_USER_SYNCS command that is sent on shutdown.

This should help with the case where a synchrotron gets restarted
gracefully, rather than rely on 5 minute timeout.
2020-03-30 16:37:24 +01:00
Erik Johnston
4cff617df1
Move catchup of replication streams to worker. (#7024)
This changes the replication protocol so that the server does not send down `RDATA` for rows that happened before the client connected. Instead, the server will send a `POSITION` and clients then query the database (or master out of band) to get up to date.
2020-03-25 14:54:01 +00:00
Erik Johnston
d5275fc55f
Propagate cache invalidates from workers to other workers. (#6748)
Currently if a worker invalidates a cache it will be streamed to master, which then didn't forward those to other workers.
2020-01-27 13:47:50 +00:00
Erik Johnston
a8a50f5b57
Wake up transaction queue when remote server comes back online (#6706)
This will be used to retry outbound transactions to a remote server if
we think it might have come back up.
2020-01-17 10:27:19 +00:00
Erik Johnston
48c3a96886
Port synapse.replication.tcp to async/await (#6666)
* Port synapse.replication.tcp to async/await

* Newsfile

* Correctly document type of on_<FOO> functions as async

* Don't be overenthusiastic with the asyncing....
2020-01-16 09:16:12 +00:00
Erik Johnston
e8b68a4e4b
Fixup synapse.replication to pass mypy checks (#6667) 2020-01-14 14:08:06 +00:00
Richard van der Hoff
cc6243b4c0
document the REPLICATE command a bit better (#6305)
since I found myself wonder how it works
2019-11-04 12:40:18 +00:00
Andrew Morgan
54fef094b3
Remove usage of deprecated logger.warn method from codebase (#6271)
Replace every instance of `logger.warn` with `logger.warning` as the former is deprecated.
2019-10-31 10:23:24 +00:00
Amber Brown
463b072b12
Move logging utilities out of the side drawer of util/ and into logging/ (#5606) 2019-07-04 00:07:04 +10:00
Amber Brown
32e7c9e7f2
Run Black. (#5482) 2019-06-20 19:32:02 +10:00
Richard van der Hoff
1f6d6f918a Make EventStream rows have a type
... as a precursor to combining it with the CurrentStateDelta stream.
2019-03-27 22:07:05 +00:00
Richard van der Hoff
f570916a3e Add parse_row method to replication stream class
This will allow individual stream classes to override how a row is parsed.
2019-03-27 21:32:33 +00:00
Richard van der Hoff
8cbbedaa2b
Fix ClientReplicationStreamProtocol.__str__ (#4929)
`__str__` depended on `self.addr`, which was absent from
ClientReplicationStreamProtocol, so attempting to call str on such an object
would raise an exception.

We can calculate the peer addr from the transport, so there is no need for addr
anyway.
2019-03-25 16:41:51 +00:00
Richard van der Hoff
9bde730ef8
Fix bug where read-receipts lost their timestamps (#4927)
Make sure that they are sent correctly over the replication stream.

Fixes: #4898
2019-03-25 16:38:05 +00:00
Andrew Morgan
b9f6163092 Simplify token replication logic 2019-03-05 13:58:30 +00:00
Andrew Morgan
fe7bd23a85 Clean up logic and add comments 2019-03-04 15:08:15 +00:00
Andrew Morgan
9f7cdf3da1 Clearer branching, fix missing list clear 2019-03-04 14:36:52 +00:00
Andrew Morgan
5f0c449dd5 Prevent replication wedging 2019-03-04 14:03:18 +00:00
Erik Johnston
7590e9fa28
Merge pull request #4749 from matrix-org/erikj/replication_connection_backoff
Fix tightloop over connecting to replication server
2019-02-27 11:00:59 +00:00
Erik Johnston
6bb1c028f1 Limit cache invalidation replication line length (#4748) 2019-02-27 10:28:37 +00:00
Erik Johnston
6870fc496f Move connecting logic into ClientReplicationStreamProtocol 2019-02-27 10:23:51 +00:00
Erik Johnston
a163b748a5 Don't truncate command name in metrics 2018-10-29 17:34:21 +00:00
Erik Johnston
3e242dc149 Remove conn_id 2018-09-04 11:45:52 +01:00
Erik Johnston
b13836da7f Remove conn_id from repl prometheus metrics
`conn_id` gets set to a random string, and so we end up filling up
prometheus with tonnes of data series, which is bad.
2018-09-03 17:22:49 +01:00
Richard van der Hoff
0e8d78f6aa Logcontexts for replication command handlers
Run the handlers for replication commands as background processes. This should
improve the visibility in our metrics, and reduce the number of "running db
transaction from sentinel context" warnings.

Ideally it means converting the things that fire off deferreds into the night
into things that actually return a Deferred when they are done. I've made a bit
of a stab at this, but it will probably be leaky.
2018-08-17 00:43:43 +01:00
Amber Brown
49af402019 run isort 2018-07-09 16:09:20 +10:00
Amber Brown
99b77aa829
Fix tcp protocol metrics naming (#3410) 2018-06-21 09:39:27 +01:00
Richard van der Hoff
b7e7fd2d0e Fix replication metrics
fix bug introduced in #3256
2018-06-04 16:23:05 +01:00
Amber Brown
754826a830 Merge remote-tracking branch 'origin/develop' into 3218-official-prom 2018-05-28 18:57:23 +10:00
Amber Brown
b6063631c3 more cleanup 2018-05-22 17:36:20 -05:00
Amber Brown
228f1f584e fix the test failures 2018-05-22 15:02:38 -05:00