Commit Graph

600 Commits

Author SHA1 Message Date
Richard van der Hoff
6e3fc657b4 Resource tracking for background processes
This introduces a mechanism for tracking resource usage by background
processes, along with an example of how it will be used.

This will help address #3518, but more importantly will give us better insights
into things which are happening but not being shown up by the request metrics.

We *could* do this with Measure blocks, but:
 - I think having them pulled out as a completely separate metric class will
   make it easier to distinguish top-level processes from those which are
   nested.

 - I want to be able to report on in-flight background processes, and I don't
   think we want to do this for *all* Measure blocks.
2018-07-18 10:50:33 +01:00
Krombel
3366b9c534 rename assert_params_in_request to assert_params_in_dict
the method "assert_params_in_request" does handle dicts and not
requests. A request body has to be parsed to json before this method
can be used
2018-07-13 21:53:01 +02:00
Amber Brown
49af402019 run isort 2018-07-09 16:09:20 +10:00
Richard van der Hoff
3cf3e08a97 Implementation of server_acls
... as described at
https://docs.google.com/document/d/1EttUVzjc2DWe2ciw4XPtNpUpIl9lWXGEsy2ewDS7rtw.
2018-07-04 19:06:20 +01:00
Richard van der Hoff
546bc9e28b More server_name validation
We need to do a bit more validation when we get a server name, but don't want
to be re-doing it all over the shop, so factor out a separate
parse_and_validate_server_name, and do the extra validation.

Also, use it to verify the server name in the config file.
2018-07-04 18:59:51 +01:00
Richard van der Hoff
508196e08a
Reject invalid server names (#3480)
Make sure that server_names used in auth headers are sane, and reject them with
a sensible error code, before they disappear off into the depths of the system.
2018-07-03 14:36:14 +01:00
Erik Johnston
e3b4043800
Merge pull request #3456 from matrix-org/hawkowl/federation-prevevent-checking
Check the state of prev_events a bit more thoroughly when coming over federation
2018-06-29 13:55:02 +01:00
Amber Brown
6350bf925e
Attempt to be more performant on PyPy (#3462) 2018-06-28 14:49:57 +01:00
Amber Brown
77078d6c8e handle federation not telling us about prev_events 2018-06-27 11:27:32 +01:00
Erik Johnston
b4a5d767a9
Merge pull request #3428 from matrix-org/erikj/persisted_pdu
Simplify get_persisted_pdu
2018-06-22 14:47:43 +01:00
Amber Brown
c2eff937ac
Populate synapse_federation_client_sent_pdu_destinations:count again (#3386) 2018-06-21 09:39:58 +01:00
Amber Brown
a61738b316
Remove run_on_reactor (#3395) 2018-06-14 18:27:37 +10:00
Richard van der Hoff
9fc5b74b24 simplify get_persisted_pdu
it doesn't make much sense to use get_persisted_pdu on the receive path: just
get the event straight from the store.
2018-06-12 09:51:31 +01:00
Ivan Shapovalov
c88d50aa8f federation/send_queue.py: fix usage of sortedcontainers.SortedDict 2018-06-06 02:45:18 +03:00
Amber Brown
f7869f8f8b
Port to sortedcontainers (with tests!) (#3332) 2018-06-06 00:13:57 +10:00
Ivan Shapovalov
7d9d75e4e8 federation/send_queue.py: fix usage of LaterGauge
Fixes a startup crash due to commit df9f72d9e5
"replacing portions".
2018-06-03 14:16:17 +03:00
Amber Brown
c936a52a9e
Consistently use six's iteritems and wrap lazy keys/values in list() if they're not meant to be lazy (#3307) 2018-05-31 19:03:47 +10:00
Amber Brown
e987079037 fixes 2018-05-23 13:03:51 -05:00
Amber Brown
53cc2cde1f cleanup 2018-05-22 17:32:57 -05:00
Amber Brown
071206304d cleanup pep8 errors 2018-05-22 16:54:22 -05:00
Amber Brown
85ba83eb51 fixes 2018-05-22 16:28:23 -05:00
Amber Brown
df9f72d9e5 replacing portions 2018-05-21 19:47:37 -05:00
Richard van der Hoff
23e2dfe940
Merge pull request #3209 from damir-manapov/master
transaction_id, destination defined twice
2018-05-11 00:35:13 +01:00
Damir Manapov
db18d854cd transaction_id, destination twice 2018-05-10 22:13:31 +03:00
Richard van der Hoff
ca7211104e Merge branch 'release-v0.28.1' into develop 2018-05-01 18:16:57 +01:00
Richard van der Hoff
33f469ba19 Apply some limits to depth to counter abuse
* When creating a new event, cap its depth to 2^63 - 1
* When receiving events, reject any without a sensible depth

As per https://docs.google.com/document/d/1I3fi2S-XnpO45qrpCsowZv8P8dHcNZ4fsBsbOW7KABI
2018-05-01 17:54:19 +01:00
Richard van der Hoff
db75c86e84
Merge branch 'develop' into py3-xrange-1 2018-04-30 01:02:25 +01:00
Adrian Tschira
d82b6ea9e6 Move more xrange to six
plus a bonus next()

Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-04-28 13:57:00 +02:00
Richard van der Hoff
fc149b4eeb Merge remote-tracking branch 'origin/develop' into rav/use_run_in_background 2018-04-27 14:31:23 +01:00
Richard van der Hoff
2a13af23bc Use run_in_background in preference to preserve_fn
While I was going through uses of preserve_fn for other PRs, I converted places
which only use the wrapped function once to use run_in_background, to avoid
creating the function object.
2018-04-27 12:55:51 +01:00
Richard van der Hoff
9255a6cb17 Improve exception handling for background processes
There were a bunch of places where we fire off a process to happen in the
background, but don't have any exception handling on it - instead relying on
the unhandled error being logged when the relevent deferred gets
garbage-collected.

This is unsatisfactory for a number of reasons:
 - logging on garbage collection is best-effort and may happen some time after
   the error, if at all
 - it can be hard to figure out where the error actually happened.
 - it is logged as a scary CRITICAL error which (a) I always forget to grep for
   and (b) it's not really CRITICAL if a background process we don't care about
   fails.

So this is an attempt to add exception handling to everything we fire off into
the background.
2018-04-27 11:07:40 +01:00
Richard van der Hoff
77ebef9d43
Merge pull request #3118 from matrix-org/rav/reject_prev_events
Reject events which have lots of prev_events
2018-04-23 17:51:38 +01:00
Richard van der Hoff
dc875d2712
Merge pull request #3106 from NotAFile/py3-six-itervalues-1
Use six.itervalues in some places
2018-04-20 15:43:52 +01:00
Richard van der Hoff
11a67b7c9d
Merge pull request #3093 from matrix-org/rav/response_cache_wrap
Refactor ResponseCache usage
2018-04-20 11:31:17 +01:00
Richard van der Hoff
0c280d4d99 Reinstate linearizer for federation_server.on_context_state_request 2018-04-20 11:10:04 +01:00
Richard van der Hoff
b1dfbc3c40 Refactor store.have_events
It turns out that most of the time we were calling have_events, we were only
using half of the result. Replace have_events with have_seen_events and
get_rejection_reasons, so that we can see what's going on a bit more clearly.
2018-04-20 10:25:56 +01:00
Richard van der Hoff
1f4b498b73 Add some comments 2018-04-18 00:15:36 +01:00
Adrian Tschira
36c59ce669 Use six.itervalues in some places
There's more where that came from

Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-04-15 20:39:43 +02:00
Matthew Hodgson
78a9698650 fix federation_domain_whitelist
we were checking the wrong server_name on inbound requests
2018-04-13 15:47:43 +01:00
Matthew Hodgson
25b0ba30b1 revert last to PR properly 2018-04-13 15:46:37 +01:00
Matthew Hodgson
f8d46cad3c correctly auth inbound federation_domain_whitelist reqs 2018-04-13 15:41:52 +01:00
Richard van der Hoff
d3347ad485 Revert "Use sortedcontainers instead of blist"
This reverts commit 9fbe70a7dc.

It turns out that sortedcontainers.SortedDict is not an exact match for
blist.sorteddict; in particular, `popitem()` removes things from the opposite
end of the dict.

This is trivial to fix, but I want to add some unit tests, and potentially some
more thought about it, before we do so.
2018-04-13 11:16:43 +01:00
Richard van der Hoff
b78395b7fe Refactor ResponseCache usage
Adds a `.wrap` method to ResponseCache which wraps up the boilerplate of a
(get, set) pair, and then use it throughout the codebase.

This will be largely non-functional, but does include the following functional
changes:

* federation_server.on_context_state_request: drops use of _server_linearizer
  which looked redundant and could cause incorrect cache misses by yielding
  between the get and the set.
* RoomListHandler.get_remote_public_room_list(): fixes logcontext leaks
* the wrap function includes some logging. I'm hoping this won't be too noisy
  on production.
2018-04-12 13:02:15 +01:00
Richard van der Hoff
d5c74b9f6c
Merge pull request #3092 from matrix-org/rav/response_cache_metrics
Add metrics for ResponseCache
2018-04-12 12:59:36 +01:00
Erik Johnston
f67e906e18 Set all metrics at the same time 2018-04-12 11:18:19 +01:00
Erik Johnston
4dae4a97ed Track last processed event received_ts 2018-04-11 14:27:09 +01:00
Erik Johnston
92e34615c5 Track where event stream processing have gotten up to 2018-04-11 12:13:40 +01:00
Richard van der Hoff
233699c42e
Merge pull request #2760 from Valodim/pypy
Synapse on PyPy
2018-04-11 11:20:01 +01:00
Richard van der Hoff
b3384232a0 Add metrics for ResponseCache 2018-04-10 23:14:47 +01:00
Erik Johnston
9daf82278f
Merge pull request #3078 from matrix-org/erikj/federation_sender
Send federation events concurrently
2018-04-10 16:43:48 +01:00
Erik Johnston
a060dfa132 Use run_in_background instead 2018-04-10 14:25:11 +01:00
Erik Johnston
1246d23710 Preserve log contexts correctly 2018-04-10 12:04:32 +01:00
Erik Johnston
d49cbf712f Log event ID on exception 2018-04-10 12:03:41 +01:00
Erik Johnston
11d2609da7 Ensure slashes are escaped 2018-04-10 11:24:40 +01:00
Erik Johnston
dab87b84a3 URL quote path segments over federation 2018-04-10 11:16:08 +01:00
Vincent Breitmoser
9fbe70a7dc Use sortedcontainers instead of blist
This commit drop-in replaces blist with SortedContainers. They are
written in pure python so work with pypy, but perform as good as
native implementations, at least in a couple benchmarks:

http://www.grantjenks.com/docs/sortedcontainers/performance.html
2018-04-10 11:29:51 +02:00
Erik Johnston
6e025a97b4 Handle all events in a room correctly 2018-04-09 16:02:48 +01:00
Erik Johnston
11974f3787 Send federation events concurrently 2018-04-09 11:47:10 +01:00
Erik Johnston
145d14656b Handle exceptions in get_hosts_for_room when sending events over federation 2018-04-09 11:47:01 +01:00
Luke Barnard
112c2253e2 pep8 2018-04-06 15:43:27 +01:00
Luke Barnard
f8d1917fce Fix federation client set_group_joinable typo 2018-04-06 15:43:27 +01:00
David Baker
b370fe61c0 Implement group join API 2018-04-06 15:43:27 +01:00
Krombel
1d71f484d4 use PUT instead of POST for federating groups/m.join_policy 2018-04-06 12:54:09 +02:00
Luke Barnard
104c0bc1d5 Use "/settings/" (plural) 2018-04-05 14:07:16 +01:00
Luke Barnard
eb8d8d6f57 Use join_policy API instead of joinable
The API is now under
 /groups/$group_id/setting/m.join_policy

and expects a JSON blob of the shape

```json
{
  "m.join_policy": {
    "type": "invite"
  }
}
```

where "invite" could alternatively be "open".
2018-04-03 16:16:40 +01:00
David Baker
32260baa41 pep8 2018-03-28 14:29:42 +01:00
David Baker
79452edeee Add joinability for groups
Adds API to set the 'joinable' flag, and corresponding flag in the
table.
2018-03-28 14:03:37 +01:00
Erik Johnston
95cb401ae0
Merge pull request #2978 from matrix-org/erikj/refactor_replication_layer
Remove ReplicationLayer and user Client/Server directly
2018-03-13 15:45:08 +00:00
Erik Johnston
56e709857c
Merge pull request #2979 from matrix-org/erikj/no_handlers
Don't build handlers on workers unnecessarily
2018-03-13 13:46:38 +00:00
Erik Johnston
cea462e285 s/replication_server/federation_server 2018-03-13 13:22:21 +00:00
Erik Johnston
9a2d9b4789
Merge pull request #2977 from matrix-org/erikj/replication_move_props
Move property setting from ReplicationLayer to base classes
2018-03-13 11:45:25 +00:00
Erik Johnston
f43b6d6d9b Fix docstring types 2018-03-13 11:29:35 +00:00
Erik Johnston
ea7b3c4b1b Remove unused ReplicationLayer 2018-03-13 11:00:04 +00:00
Erik Johnston
265b993b8a Split replication layer into two 2018-03-13 10:55:47 +00:00
Erik Johnston
e05bf34117 Move property setting from ReplicationLayer to FederationBase 2018-03-13 10:51:30 +00:00
Erik Johnston
c3f79c9da5 Split out edu/query registration to a separate class 2018-03-13 10:24:27 +00:00
Matthew Hodgson
ab9f844aaf
Add federation_domain_whitelist option (#2820)
Add federation_domain_whitelist

gives a way to restrict which domains your HS is allowed to federate with.
useful mainly for gracefully preventing a private but internet-connected HS from trying to federate to the wider public Matrix network
2018-01-22 19:11:18 +01:00
Richard van der Hoff
a027c2af8d Metrics for events processed in appservice and fed sender
More metrics I wished I'd had
2018-01-15 18:23:24 +00:00
Richard van der Hoff
bd91857028 Check missing fields in event_from_pdu_json
Return a 400 rather than a 500 when somebody messes up their send_join
2017-12-30 18:40:19 +00:00
Richard van der Hoff
3079f80d4a Factor out event_from_pdu_json
turns out we have two copies of this, and neither needs to be an instance
method
2017-12-30 18:40:19 +00:00
Richard van der Hoff
65abc90fb6 federation_server: clean up imports 2017-12-30 18:40:19 +00:00
Richard van der Hoff
a7b726ad18 federation_client: clean up imports 2017-12-30 18:40:19 +00:00
Richard van der Hoff
d4fb4f7c52 Clear logcontext before starting fed txn queue runner
These processes take a long time compared to the request, so there is lots of
"Entering|Restoring dead context" in the logs. Let's try to shut it up a bit.
2017-11-28 15:26:14 +00:00
Richard van der Hoff
7e6fa29cb5 Remove preserve_context_over_{fn, deferred}
Both of these functions ae known to leak logcontexts. Replace the remaining
calls to them and kill them off.
2017-11-14 11:22:42 +00:00
Erik Johnston
82e4bfb53d Add brackets 2017-11-09 10:06:42 +00:00
Erik Johnston
e8814410ef Have an explicit API to update room config 2017-11-08 16:13:27 +00:00
Erik Johnston
94ff2cda73
Revert "Modify group room association API to allow modification of is_public" 2017-11-08 15:43:34 +00:00
Luke Barnard
207fabbc6a Update docs for updating room group association 2017-11-01 09:35:15 +00:00
Luke Barnard
13b3d7b4a0 Flake8 2017-10-31 17:20:11 +00:00
Luke Barnard
20fe347906 Modify group room association API to allow modification of is_public
also includes renamings to make things more consistent.
2017-10-31 17:04:28 +00:00
Erik Johnston
2a7e9faeec Do logcontexts outside ResponseCache 2017-10-25 15:21:08 +01:00
Erik Johnston
39dc52157d Merge branch 'develop' of github.com:matrix-org/synapse into erikj/group_fed_update_profile 2017-10-24 09:16:20 +01:00
Richard van der Hoff
eaaabc6c4f replace 'except:' with 'except Exception:'
what could possibly go wrong
2017-10-23 15:52:32 +01:00
Erik Johnston
ce6d4914f4 Correctly wire in update group profile over federation 2017-10-23 15:21:24 +01:00
Erik Johnston
011d03a0f6 Fix typo 2017-10-19 11:22:48 +01:00
Erik Johnston
9ab859f27b Fix typo in group attestation handling 2017-10-19 10:55:52 +01:00
Richard van der Hoff
582bd19ee9 Fix 500 error when we get an error handling a PDU
FederationServer doesn't have a send_failure (and nor does its subclass,
ReplicationLayer), so this was failing.

I'm not really sure what the idea behind send_failure is, given (a) we don't do
anything at the other end with it except log it, and (b) we also send back the
failure via the transaction response. I suspect there's a whole lot of dead
code around it, but for now I'm just removing the broken bit.
2017-10-17 20:52:40 +01:00
Luke Barnard
85f5674e44 Delint 2017-10-16 15:52:17 +01:00
Luke Barnard
2c5972f87f Implement GET /groups/$groupId/invited_users 2017-10-16 15:31:11 +01:00
Richard van der Hoff
8dd0c85ac5 Merge pull request #2529 from matrix-org/rav/fix_transaction_failure_handling
log pdu_failures from incoming transactions
2017-10-11 17:29:14 +01:00