Commit Graph

170 Commits

Author SHA1 Message Date
Erik Johnston
a03d71dc9d
Fix "Starting metrics collection from sentinel context" errors () 2021-01-08 14:33:53 +00:00
Patrick Cloke
be2db93b3c
Do not assume that the contents dictionary includes history_visibility. () 2020-12-16 08:46:37 -05:00
Erik Johnston
a6ea1a957e
Don't pull event from DB when handling replication traffic. ()
I was trying to make it so that we didn't have to start a background task when handling RDATA, but that is a bigger job (due to all the code in `generic_worker`). However I still think not pulling the event from the DB may help reduce some DB usage due to replication, even if most workers will simply go and pull that event from the DB later anyway.

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2020-10-28 12:11:45 +00:00
Erik Johnston
2b7c180879
Start fewer opentracing spans ()
 started a span for every background process. This is good as it means all Synapse code that gets run should be in a span (unless in the sentinel logging context), but it means we generate about 15x the number of spans as we did previously.

This PR attempts to reduce that number by a) not starting one for send commands to Redis, and b) deferring starting background processes until after we're sure they're necessary.

I don't really know how much this will help.
2020-10-26 09:30:19 +00:00
Patrick Cloke
34a5696f93
Fix typos and spelling errors. () 2020-10-23 12:38:40 -04:00
Will Hunt
c276bd9969
Send some ephemeral events to appservices ()
Optionally sends typing, presence, and read receipt information to appservices.
2020-10-15 12:33:28 -04:00
Erik Johnston
921a3f8a59
Fix not sending events over federation when using sharded event persisters ()
* Fix outbound federaion with multiple event persisters.

We incorrectly notified federation senders that the minimum persisted
stream position had advanced when we got an `RDATA` from an event
persister.

Notifying of federation senders already correctly happens in the
notifier, so we just delete the offending line.

* Change some interfaces to use RoomStreamToken.

By enforcing use of `RoomStreamTokens` we make it less likely that
people pass in random ints that they got from somewhere random.
2020-10-14 13:27:51 +01:00
Patrick Cloke
a93f3121f8
Add type hints to some handlers () 2020-10-09 07:20:51 -04:00
Erik Johnston
ea70f1c362
Various clean ups to room stream tokens. () 2020-09-29 21:48:33 +01:00
Erik Johnston
ac11fcbbb8
Add EventStreamPosition type ()
The idea is to remove some of the places we pass around `int`, where it can represent one of two things:

1. the position of an event in the stream; or
2. a token that partitions the stream, used as part of the stream tokens.

The valid operations are then:

1. did a position happen before or after a token;
2. get all events that happened before or after a token; and
3. get all events between two tokens.

(Note that we don't want to allow other operations as we want to change the tokens to be vector clocks rather than simple ints)
2020-09-24 13:24:17 +01:00
Patrick Cloke
aec294ee0d
Use slots in attrs classes where possible ()
slots use less memory (and attribute access is faster) while slightly
limiting the flexibility of the class attributes. This focuses on objects
which are instantiated "often" and for short periods of time.
2020-09-14 12:50:06 -04:00
Erik Johnston
fe8ed1b46f
Make StreamToken.room_key be a RoomStreamToken instance. () 2020-09-11 12:22:55 +01:00
Erik Johnston
5d3e306d9f
Clean up Notifier.on_new_room_event code path ()
The idea here is that we pass the `max_stream_id` to everything, and only use the stream ID of the particular event to figure out *when* the max stream position has caught up to the event and we can notify people about it.

This is to maintain the distinction between the position of an item in the stream (i.e. event A has stream ID 513) and a token that can be used to partition the stream (i.e. give me all events after stream ID 352). This distinction becomes important when the tokens are more complicated than a single number, which they will be once we start tracking the position of multiple writers in the tokens.

The valid operations here are:

1. Is a position before or after a token
2. Fetching all events between two tokens
3. Merging multiple tokens to get the "max", i.e. `C = max(A, B)` means that for all positions P where P is before A *or* before B, then P is before C.

Future PR will change the token type to a dedicated type.
2020-09-10 13:24:43 +01:00
Erik Johnston
0f545e6b96
Clean up types for PaginationConfig ()
This removes `SourcePaginationConfig` and `get_pagination_rows`. The reasoning behind this is that these generic classes/functions erased the types of the IDs it used (i.e. instead of passing around `StreamToken` it'd pass in e.g. `token.room_key`, which don't have uniform types).
2020-09-08 15:00:17 +01:00
Patrick Cloke
c619253db8
Stop sub-classing object () 2020-09-04 06:54:56 -04:00
Erik Johnston
9d1e4942ab
Fix typing for notifier () 2020-08-12 14:03:08 +01:00
Erik Johnston
a1e9bb9eae
Add typing info to Notifier () 2020-08-11 19:40:02 +01:00
Patrick Cloke
e19de43eb5
Convert streams to async. () 2020-08-04 07:21:47 -04:00
Patrick Cloke
38e1fac886
Fix some spelling mistakes / typos. () 2020-07-09 09:52:58 -04:00
Erik Johnston
1a1da60ad2
Fix new flake8 errors () 2020-05-12 11:20:48 +01:00
Patrick Cloke
b0cbc57375
Convert the synapse.notifier module to async/await. () 2020-05-01 15:14:49 -04:00
Erik Johnston
3eab76ad43
Don't relay REMOTE_SERVER_UP cmds to same conn. ()
For direct TCP connections we need the master to relay REMOTE_SERVER_UP
commands to the other connections so that all instances get notified
about it. The old implementation just relayed to all connections,
assuming that sending back to the original sender of the command was
safe. This is not true for redis, where commands sent get echoed back to
the sender, which was causing master to effectively infinite loop
sending and then re-receiving REMOTE_SERVER_UP commands that it sent.

The fix is to ensure that we only relay to *other* connections and not
to the connection we received the notification from.

Fixes .
2020-04-29 14:10:59 +01:00
Erik Johnston
a8a50f5b57
Wake up transaction queue when remote server comes back online ()
This will be used to retry outbound transactions to a remote server if
we think it might have come back up.
2020-01-17 10:27:19 +00:00
Erik Johnston
8437e2383e Port SyncHandler to async/await 2019-12-05 17:58:25 +00:00
Erik Johnston
69f0054ce6 Port to use state storage 2019-10-30 14:46:54 +00:00
Andrew Morgan
4548d1f87e
Remove unnecessary parentheses around return statements ()
Python will return a tuple whether there are parentheses around the returned values or not.

I'm just sick of my editor complaining about this all over the place :)
2019-08-30 16:28:26 +01:00
Amber Brown
4806651744
Replace returnValue with return () 2019-07-23 23:00:55 +10:00
Amber Brown
463b072b12
Move logging utilities out of the side drawer of util/ and into logging/ () 2019-07-04 00:07:04 +10:00
Amber Brown
32e7c9e7f2
Run Black. () 2019-06-20 19:32:02 +10:00
Richard van der Hoff
c7325776a7 Remove redundant PreserveLoggingContext
Both (!) things that register as replication listeners do the right thing wrt
logcontexts, so this is redundant.
2019-03-04 18:31:18 +00:00
Richard van der Hoff
daa10e3e66 Remove unused wait_for_replication method
I guess this was used once? It's not now, anyway.
2019-03-04 18:27:32 +00:00
Amber Brown
b69216f768
Make the metrics less racy () 2018-10-19 21:45:45 +11:00
Richard van der Hoff
a87d419a85 Run notify_app_services as a bg process
This ensures that its resource usage metrics get recorded somewhere rather than
getting lost.

(It also fixes an error when called from a nested logging context which
completes before the bg process)
2018-09-26 17:00:40 +01:00
Erik Johnston
80d2d50f47 Fixup 2018-09-19 11:19:47 +01:00
Erik Johnston
9407bcf37a Replace custom DeferredTimeoutError with defer.TimeoutError 2018-09-19 11:07:29 +01:00
Erik Johnston
a334e1cace Update to use new timeout function everywhere.
The existing deferred timeout helper function (and the one into twisted)
suffer from a bug when a deferred's canceller throws an exception, .

The new helper function doesn't suffer from this problem.
2018-09-19 10:39:40 +01:00
Amber Brown
b37c472419
Rename async to async_helpers because async is a keyword on Python 3.7 () 2018-08-10 23:50:21 +10:00
Matthew Hodgson
5797f5542b WIP to announce deleted devices over federation
Previously we queued up the poke correctly when the device was deleted,
but then the actual EDU wouldn't get sent, as the device was no longer known.
Instead, we now send EDUs for deleted devices too if there's a poke for them.
2018-07-12 01:32:39 +01:00
Amber Brown
49af402019 run isort 2018-07-09 16:09:20 +10:00
Amber Brown
77ac14b960
Pass around the reactor explicitly () 2018-06-22 09:37:10 +01:00
Amber Brown
071206304d cleanup pep8 errors 2018-05-22 16:54:22 -05:00
Amber Brown
85ba83eb51 fixes 2018-05-22 16:28:23 -05:00
Amber Brown
df9f72d9e5 replacing portions 2018-05-21 19:47:37 -05:00
Richard van der Hoff
6146332387 Merge remote-tracking branch 'origin/develop' into rav/deferred_timeout 2018-04-27 14:18:00 +01:00
Richard van der Hoff
9d2c1b8429 Backport deferred.addTimeout
Twisted 16.0 doesn't have addTimeout, so let's backport it.
2018-04-27 12:52:30 +01:00
Richard van der Hoff
9255a6cb17 Improve exception handling for background processes
There were a bunch of places where we fire off a process to happen in the
background, but don't have any exception handling on it - instead relying on
the unhandled error being logged when the relevent deferred gets
garbage-collected.

This is unsatisfactory for a number of reasons:
 - logging on garbage collection is best-effort and may happen some time after
   the error, if at all
 - it can be hard to figure out where the error actually happened.
 - it is logged as a scary CRITICAL error which (a) I always forget to grep for
   and (b) it's not really CRITICAL if a background process we don't care about
   fails.

So this is an attempt to add exception handling to everything we fire off into
the background.
2018-04-27 11:07:40 +01:00
Richard van der Hoff
1ea904b9f0 Use deferred.addTimeout instead of time_bound_deferred
This doesn't feel like a wheel we need to reinvent.
2018-04-23 00:53:18 +01:00
Adrian Tschira
f63ff73c7f add __bool__ alias to __nonzero__ methods
Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-04-15 20:40:47 +02:00
Richard van der Hoff
d4fb4f7c52 Clear logcontext before starting fed txn queue runner
These processes take a long time compared to the request, so there is lots of
"Entering|Restoring dead context" in the logs. Let's try to shut it up a bit.
2017-11-28 15:26:14 +00:00
Richard van der Hoff
eaaabc6c4f replace 'except:' with 'except Exception:'
what could possibly go wrong
2017-10-23 15:52:32 +01:00