Commit Graph

4102 Commits

Author SHA1 Message Date
Erik Johnston
1579fdd54a
Ensure we always drop the federation inbound lock (#10336) 2021-07-09 10:16:54 +01:00
Brendan Abolivier
9ad8455895
ANALYZE new stream ordering column (#10326)
Fixes #10325
2021-07-07 11:56:17 +02:00
Dirk Klimpel
bcb0962a72
Fix deactivate a user if he does not have a profile (#10252) 2021-07-06 13:08:53 +01:00
Erik Johnston
6655ea5587
Add script for getting info about recently registered users (#10290) 2021-07-06 13:03:16 +01:00
Erik Johnston
c65067d673
Handle old staged inbound events (#10303)
We might have events in the staging area if the service was restarted while there were unhandled events in the staging area.

Fixes #10295
2021-07-06 13:02:37 +01:00
Richard van der Hoff
0aab50c772
fix ordering of bg update (#10291)
this was a typo introduced in #10282. We don't want to end up doing the
`replace_stream_ordering_column` update after anything that comes up in
migration 60/03.
2021-07-01 18:45:55 +01:00
Erik Johnston
76addadd7c
Add some metrics to staging area (#10284) 2021-07-01 10:18:25 +01:00
Richard van der Hoff
b6dbf89fae
Change more stream_ordering columns to BIGINT (#10286) 2021-06-30 17:27:20 +01:00
Richard van der Hoff
859dc05b36
Rebuild other indexes using stream_ordering (#10282)
We need to rebuild *all* of the indexes that use the current `stream_ordering`
column.
2021-06-30 15:01:24 +01:00
Erik Johnston
329ef5c715
Fix the inbound PDU metric (#10279)
This broke in #10272
2021-06-30 12:07:16 +01:00
Richard van der Hoff
785bceef72 Merge branch 'release-v1.37' into develop 2021-06-29 20:25:47 +01:00
Erik Johnston
c54db67d0e
Handle inbound events from federation asynchronously (#10272)
Fixes #9490

This will break a couple of SyTest that are expecting failures to be added to the response of a federation /send, which obviously doesn't happen now that things are asynchronous.

Two drawbacks:

    Currently there is no logic to handle any events left in the staging area after restart, and so they'll only be handled on the next incoming event in that room. That can be fixed separately.
    We now only process one event per room at a time. This can be fixed up further down the line.
2021-06-29 19:55:22 +01:00
Erik Johnston
85d237eba7
Add a distributed lock (#10269)
This adds a simple best effort locking mechanism that works cross workers.
2021-06-29 19:15:47 +01:00
Richard van der Hoff
7647b0337f
Fix populate_stream_ordering2 background job (#10267)
It was possible for us not to find any rows in a batch, and hence conclude that
we had finished. Let's not do that.
2021-06-29 12:43:36 +01:00
Richard van der Hoff
60efc51a2b
Migrate stream_ordering to a bigint (#10264)
* Move background update names out to a separate class

`EventsBackgroundUpdatesStore` gets inherited and we don't really want to
further pollute the namespace.

* Migrate stream_ordering to a bigint

* changelog
2021-06-29 11:25:34 +01:00
Quentin Gliech
bd4919fb72
MSC2918 Refresh tokens implementation (#9450)
This implements refresh tokens, as defined by MSC2918

This MSC has been implemented client side in Hydrogen Web: vector-im/hydrogen-web#235

The basics of the MSC works: requesting refresh tokens on login, having the access tokens expire, and using the refresh token to get a new one.

Signed-off-by: Quentin Gliech <quentingliech@gmail.com>
2021-06-24 14:33:20 +01:00
Erik Johnston
33701dc116
Fix schema delta to not take as long on large servers (#10227)
Introduced in #6739
2021-06-22 12:00:45 +01:00
Eric Eastwood
96f6293de5
Add endpoints for backfilling history (MSC2716) (#9247)
Work on https://github.com/matrix-org/matrix-doc/pull/2716
2021-06-22 10:02:53 +01:00
Erik Johnston
a5cd05beee
Fix performance of responding to user key requests over federation (#10221)
We were repeatedly looking up a config option in a loop (using the
unclassed config style), which is expensive enough that it can cause
large CPU usage.
2021-06-21 14:38:59 +01:00
Andrew Morgan
6f1a28de19
Fix incorrect time magnitude on delayed call (#10195)
Fixes https://github.com/matrix-org/synapse/issues/10030.

We were expecting milliseconds where we should have provided a value in seconds.

The impact of this bug isn't too bad. The code is intended to count the number of remote servers that the homeserver can see and report that as a metric. This metric is supposed to run initially 1 second after server startup, and every 60s as well. Instead, it ran 1,000 seconds after server startup, and every 60s after startup.

This fix allows for the correct metrics to be collected immediately, as well as preventing a random collection 1,000s in the future after startup.
2021-06-17 15:04:26 +01:00
Richard van der Hoff
52c60bd0a9
Fix persist_events to stop leaking opentracing contexts (#10193) 2021-06-17 11:21:53 +01:00
Richard van der Hoff
9e405034e5
Make opentracing trace into event persistence (#10134)
* Trace event persistence

When we persist a batch of events, set the parent opentracing span to the that
from the request, so that we can trace all the way in.

* changelog

* When we force tracing, set a baggage item

... so that we can check again later.

* Link in both directions between persist_events spans
2021-06-16 11:41:15 +01:00
Richard van der Hoff
1dfdc87b9b
Refactor EventPersistenceQueue (#10145)
some cleanup, pulled out of #10134.
2021-06-14 11:59:27 +01:00
Richard van der Hoff
c1b9922498
Support for database schema version ranges (#9933)
This is essentially an implementation of the proposal made at https://hackmd.io/@richvdh/BJYXQMQHO, though the details have ended up looking slightly different.
2021-06-11 14:45:53 +01:00
Erik Johnston
d26d15ba3d
Fix bug when running presence off master (#10149)
Hopefully fixes #10027.
2021-06-11 10:27:12 +01:00
Andrew Morgan
a7a37437bc
Integrate knock rooms with the public rooms directory (#9359)
This PR implements the ["Changes regarding the Public Rooms Directory"](https://github.com/Sorunome/matrix-doc/blob/soru/knock/proposals/2403-knock.md#changes-regarding-the-public-rooms-directory) section of knocking MSC2403.

Specifically, it:

* Allows rooms with `join_rule` "knock" to be returned by the query behind the public rooms directory
* Adds the field `join_rule` to each room entry returned by a public rooms directory query, so clients can know whether to attempt a join or knock on a room

Based on https://github.com/matrix-org/synapse/issues/6739. Complement tests for this change: https://github.com/matrix-org/complement/pull/72
2021-06-09 20:31:31 +01:00
Sorunome
d936371b69
Implement knock feature (#6739)
This PR aims to implement the knock feature as proposed in https://github.com/matrix-org/matrix-doc/pull/2403

Signed-off-by: Sorunome mail@sorunome.de
Signed-off-by: Andrew Morgan andrewm@element.io
2021-06-09 19:39:51 +01:00
Erik Johnston
1092718cac
Fix logging context when opening new DB connection (#10141)
Fixes #10140
2021-06-08 13:49:29 +01:00
Richard van der Hoff
0acb5010ec
More database opentracing (#10136)
Add a couple of extra logs/spans, to give a bit of a better idea.
2021-06-07 18:01:32 +01:00
Richard van der Hoff
9eea4646be
Add OpenTracing for database activity. (#10113)
This adds quite a lot of OpenTracing decoration for database activity. Specifically it adds tracing at four different levels:

 * emit a span for each "interaction" - ie, the top level database function that we tend to call "transaction", but isn't really, because it can end up as multiple transactions.
 * emit a span while we hold a database connection open
 * emit a span for each database transaction - actual actual transaction.
 * emit a span for each database query.

I'm aware this might be quite a lot of overhead, but even just running it on a local Synapse it looks really interesting, and I hope the overhead can be offset just by turning down the sampling frequency and finding other ways of tracing requests of interest (eg, the `force_tracing_for_users` setting).
2021-06-03 16:31:56 +01:00
Dirk Klimpel
0284d2a297
Add new admin APIs to remove media by media ID from quarantine. (#10044)
Related to: #6681, #5956, #10040

Signed-off-by: Dirk Klimpel dirk@klimpel.org
2021-06-02 18:50:35 +01:00
Richard van der Hoff
b4b2fd2ece
add a cache to have_seen_event (#9953)
Empirically, this helped my server considerably when handling gaps in Matrix HQ. The problem was that we would repeatedly call have_seen_events for the same set of (50K or so) auth_events, each of which would take many minutes to complete, even though it's only an index scan.
2021-06-01 12:04:47 +01:00
Callum Brown
8fb9af570f
Make reason and score optional for report_event (#10077)
Implements MSC2414: https://github.com/matrix-org/matrix-doc/pull/2414
See #8551 

Signed-off-by: Callum Brown <callum@calcuode.com>
2021-05-27 18:42:23 +01:00
Richard van der Hoff
224f2f949b
Combine LruCache.invalidate and invalidate_many (#9973)
* Make `invalidate` and `invalidate_many` do the same thing

... so that we can do either over the invalidation replication stream, and also
because they always confused me a bit.

* Kill off `invalidate_many`

* changelog
2021-05-27 10:33:56 +01:00
Dirk Klimpel
65e6c64d83
Add an admin API for unprotecting local media from quarantine (#10040)
Signed-off-by: Dirk Klimpel dirk@klimpel.org
2021-05-26 11:19:47 +01:00
Patrick Cloke
7adcb20fc0
Add missing type hints to synapse.util (#9982) 2021-05-24 15:32:01 -04:00
Richard van der Hoff
c0df6bae06
Remove keylen from LruCache. (#9993)
`keylen` seems to be a thing that is frequently incorrectly set, and we don't really need it.

The only time it was used was to figure out if we had removed a subtree in `del_multi`, which we can do better by changing `TreeCache.pop` to return a different type (`TreeCacheNode`).

Commits should be independently reviewable.
2021-05-24 14:02:01 +01:00
Eric Eastwood
5f1198a67e
Fix get_state_ids_for_event return type typo to match what the function actually does (#10050)
It looks like a typo copy/paste from `get_state_for_event` above.
2021-05-24 10:43:33 +01:00
Erik Johnston
3e831f24ff
Don't hammer the database for destination retry timings every ~5mins (#10036) 2021-05-21 17:57:08 +01:00
Marek Matys
6a8643ff3d
Fixed removal of new presence stream states (#10014)
Fixes: https://github.com/matrix-org/synapse/issues/9962

This is a fix for above problem.

I fixed it by swaping the order of insertion of new records and deletion of old ones. This ensures that we don't delete fresh database records as we do deletes before inserts.

Signed-off-by: Marek Matys <themarcq@gmail.com>
2021-05-21 12:02:06 +01:00
Andrew Morgan
4d6e5a5e99
Use a database table to hold the users that should have full presence sent to them, instead of something in-memory (#9823) 2021-05-18 14:13:45 +01:00
Richard van der Hoff
5090f26b63
Minor @cachedList enhancements (#9975)
- use a tuple rather than a list for the iterable that is passed into the
  wrapped function, for performance

- test that we can pass an iterable and that keys are correctly deduped.
2021-05-14 11:12:36 +01:00
Dan Callahan
52ed9655ed
Remove unnecessary SystemRandom from SQLBaseStore (#9987)
It's not obvious that instances of SQLBaseStore each need their own
instances of random.SystemRandom(); let's just use random directly.

Introduced by 52839886d6

Signed-off-by: Dan Callahan <danc@element.io>
2021-05-14 10:59:10 +01:00
Aaron Raimist
dc6366a9bd
Add config option to hide device names over federation (#9945)
Now that cross signing exists there is much less of a need for other people to look at devices and verify them individually. This PR adds a config option to allow you to prevent device display names from being shared with other servers.

Signed-off-by: Aaron Raimist <aaron@raim.ist>
2021-05-11 14:03:23 +01:00
Richard van der Hoff
b378d98c8f
Add debug logging for issue #9533 (#9959)
Hopefully this will help us track down where to-device messages are getting
lost/delayed.
2021-05-11 11:04:03 +01:00
Richard van der Hoff
25f43faa70
Reorganise the database schema directories (#9932)
The hope here is that by moving all the schema files into synapse/storage/schema, it gets a bit easier for newcomers to navigate.

It certainly got easier for me to write a helpful README. There's more to do on that front, but I'll follow up with other PRs for that.
2021-05-07 10:22:05 +01:00
Erik Johnston
d0aee697ac
Use get_current_users_in_room from store and not StateHandler (#9910) 2021-05-05 16:49:34 +01:00
Patrick Cloke
10a08ab88a
Use the parent's logging context name for runWithConnection. (#9895)
This fixes a regression where the logging context for runWithConnection
was reported as runWithConnection instead of the connection name,
e.g. "POST-XYZ".
2021-04-28 07:44:52 -04:00
Andrew Morgan
4e0fd35bc9 Revert "Experimental Federation Speedup (#9702)"
This reverts commit 05e8c70c05.
2021-04-28 11:38:33 +01:00
Andrew Morgan
fe604a022a
Remove various bits of compatibility code for Python <3.6 (#9879)
I went through and removed a bunch of cruft that was lying around for compatibility with old Python versions. This PR also will now prevent Synapse from starting unless you're running Python 3.6+.
2021-04-27 13:13:07 +01:00