* fix incorrect unwrapFirstError import
this was being imported from the wrong place
* Refactor `concurrently_execute` to use `yieldable_gather_results`
* Improve exception handling in `yieldable_gather_results`
Try to avoid swallowing so many stack traces.
* mark unwrapFirstError deprecated
* changelog
...and various code supporting it.
The /spaces endpoint was from an old version of MSC2946 and included
both a Client-Server and Server-Server API. Note that the unstable
/hierarchy endpoint (from the final version of MSC2946) is not yet
removed.
* Fix `PushRuleEvaluator` to work on frozendicts
frozendicts do not (necessarily) inherit from dict, so this needs to handle
them correctly.
* Fix event filtering for frozen events
Looks like this one was introduced by #11194.
Before this fix, a legitimate 404 from a federation endpoint (e.g. due
to an unknown room) would be treated as an unknown endpoint. This
could cause unnecessary federation traffic.
Don't attempt to add non-string `value`s to `event_search` and add a
background update to clear out bad rows from `event_search` when
using sqlite.
Signed-off-by: Sean Quah <seanq@element.io>
The complement.sh script relies on the name of the ref matching the name
of the unpacked folder. The branch redirect from renaming the default
branch breaks that assumption.
Signed-off-by: Nicolas Werner <n.werner@famedly.com>
* Remove `trial` section from setup.cfg
This was added in the initial commit from 2014. I can't see that it does
anything. Maybe it's there so that you can run `trial` without any extra
args, but if I do that then I just get the `--help` message.
* Move flake8's config to its own file
This is an endpoint that we have server-side support for, but no client-side support. It's going to be useful for resyncing partial-stated rooms, so let's introduce it.
msc3706 proposes changing the `/send_join` response:
> Any events returned within `state` can be omitted from `auth_chain`.
Currently, we rely on `m.room.create` being returned in `auth_chain`, but since
the `m.room.create` event must necessarily be part of the state, the above
change will break this.
In short, let's look for `m.room.create` in `state` rather than `auth_chain`.
For users with large accounts it is inefficient to calculate the set of
users they share a room with (and takes a lot of space in the cache).
Instead we can look at users whose devices have changed since the last
sync and check if they share a room with the syncing user.
When the server leaves a room the `get_rooms_for_user` cache is not
correctly invalidated for the remote users in the room. This means that
subsequent calls to `get_rooms_for_user` for the remote users would
incorrectly include the room (it shouldn't be included because the
server no longer knows anything about the room).
The driver for this is to stop Complement complaining about it, but as far as I can tell it was pointless and needed to go away anyway.
I'm a bit unclear about what exactly VOLUME does, but I think what it means is that, if you don't override it with an explicit -v argument, then docker run will create a temporary volume, and copy things into it. The temporary volume is then deleted when the container finishes.
That only sounds useful if your image has something to copy into it (otherwise you may as well just use the default root filesystem), and our image notably doesn't copy anything into /data.
So... this wasn't doing anything, except annoying Complement?
Splits the search code into a few logical functions instead of a single
unreadable function.
There are also a few additional changes for readability.
After refactoring it was clear to see there were some unused and
unnecessary variables, which were simplified.
If the latest event in a thread was edited than the original
event content was included in bundled aggregation for
threads instead of the edited event content.
* Make `get_auth_chain_ids` return a Set
It has a set internally, and a set is often useful where it gets used, so let's
avoid converting to an intermediate list.
* Minor refactors in `on_send_join_request`
A little bit of non-functional groundwork
* Implement MSC3706: partial state in /send_join response
This should reduce database usage when fetching bundled aggregations
as the number of individual queries (and round trips to the database) are
reduced.
If ther are more than 100 to-device messages pending for a device
`/sync` will only return the first 100, however the next batch token was
incorrectly calculated and so all other pending messages would be
dropped.
This is due to `txn.rowcount` only returning the number of rows that
*changed*, rather than the number *selected* in SQLite.
If we prepopulate the test homeserver with a key for a remote homeserver, we
can make federation requests to it without having to stub out the
authenticator. This has two advantages:
* means that what we are testing is closer to reality (ie, we now have
complete tests for the incoming-request-authorisation flow)
* some tests require that other objects be signed by the remote server (eg,
the event in `/send_join`), and doing that would require a whole separate
set of mocking out. It's much simpler just to use real keys.
This implements an allow list for content types for which Synapse will attempt URL preview. If a URL resolves to a resource with a content type which isn't in the list, the download will terminate immediately.
This makes sense given that Synapse would never successfully generate a URL preview for such files in the first place, and helps prevent issues with streaming media servers, such as #8302.
Signed-off-by: Denis Kasak dkasak@termina.org.uk
This should reduce database usage when fetching bundled aggregations
as the number of individual queries (and round trips to the database) are
reduced.
Part of the Tchap Synapse mainlining.
This allows modules to implement extra logic to figure out whether a given 3PID can be added to the local homeserver. In the Tchap use case, this will allow a Synapse module to interface with the custom endpoint /internal_info.
Since #11811 there has been general Complement flakiness around networking.
It seems like tests are hitting the wrong containers. In an effort to diagnose
the cause of this, as well as reduce its impact on this project, set the
parallelsim to 1 (no parallelism) when running tests.
If this fixes the flakiness then this indicates the cause and I can diagnose
this further. If this doesn't fix the flakiness then that implies some kind
of test pollution which also helps to diagnose this further.
The idea here is to set the parent span for incoming federation requests to the
*outgoing* span on the other end. That means that you can see (most of) the
full end-to-end flow when you have a process that includes federation requests.
However, in order not to lose information, we still want a link to the
`incoming-federation-request` span from the servlet, so we have to create
another span to do exactly that.
`start_active_span` was inconsistent as to whether it would activate the span
immediately, or wait for `scope.__enter__` to happen (it depended on whether
the current logcontext already had an associated scope). The inconsistency was
rather confusing if you were hoping to set up a couple of separate spans before
activating either.
Looking at the other implementations of opentracing `ScopeManager`s, the
intention is that it *should* be activated immediately, as the name
implies. Indeed, the idea is that you don't have to use the scope as a
contextmanager at all - you can just call `.close` on the result. Hence, our
cleanup has to happen in `.close` rather than `.__exit__`.
So, the main change here is to ensure that `start_active_span` does activate
the span, and that `scope.close()` does close the scope.
We also add some tests, which requires a `tracer` param so that we don't have
to rely on the global variable in unit tests.
The get_users_in_room and get_users_in_room_with_profiles
are now only invalidated when the membership of a room changes,
instead of during any state change in the room.
* Fix losing incoming EDUs if debug logging enabled
Fixes#11889. Homeservers should only be affected if the
`synapse.8631_debug` logger was enabled for DEBUG mode.
I am not sure if this merits a bugfix release: I think the logging can
be disabled in config if anyone is affected? But it is still pretty bad.
Only allow files which file size and content types match configured
limits to be set as avatar.
Most of the inspiration from the non-test code comes from matrix-org/synapse-dinsic#19
* Include 'prev_content' field in AS events
Signed-off-by: Vaishnav Nair <nairvaishnav007@icloud.com>
Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
This is in the context of mainlining the Tchap fork of Synapse. Currently in Tchap usernames are derived from the user's email address (extracted from the UIA results, more specifically the m.login.email.identity step).
This change also exports the check_username method from the registration handler as part of the module API, so that a module can check if the username it's trying to generate is correct and doesn't conflict with an existing one, and fallback gracefully if not.
Co-authored-by: David Robertson <davidr@element.io>
This is some odds and ends found during the review of #11791
and while continuing to work in this code:
* Return attrs classes instead of dictionaries from some methods
to improve type safety.
* Call `get_bundled_aggregations` fewer times.
* Adds a missing assertion in the tests.
* Do not return empty bundled aggregations for an event (preferring
to not include the bundle at all, as the docstring states).
This is mostly motivated by the tchap use case, where usernames are automatically generated from the user's email address (in a way that allows figuring out the email address from the username). Therefore, it's an issue if we respond to requests on /register and /register/available with M_USER_IN_USE, because it can potentially leak email addresses (which include the user's real name and place of work).
This commit adds a flag to inhibit the M_USER_IN_USE errors that are raised both by /register/available, and when providing a username early into the registration process. This error will still be raised if the user completes the registration process but the username conflicts. This is particularly useful when using modules (https://github.com/matrix-org/synapse/pull/11790 adds a module callback to set the username of users at registration) or SSO, since they can ensure the username is unique.
More context is available in the PR that introduced this behaviour to synapse-dinsic: matrix-org/synapse-dinsic#48 - as well as the issue in the matrix-dinsic repo: matrix-org/matrix-dinsic#476
Similar to #11817.
In `_create_power_level_validator` we
- retrieve `validator`. This is a class implementing the
`jsonschema.protocols.Validator` interface. In other words,
`validator: Type[jsonschema.protocols.Validator]`.
- we then create an second validator class by modifying the original
`validator`. We return that class, which is also of type
`Type[jsonschema.protocols.Validator]`.
So the original annotation was incorrect: it claimed we were returning
an instance of jsonSchema.Draft7Validator, not the class (or a subclass)
itself. (Strictly speaking this is incorrect, because `POWER_LEVELS_SCHEMA`
isn't pinned to a particular version of JSON Schema. But there are other
complications with the type stubs if you try to fix this; I felt like
the change herein was a decent compromise that better expresses intent).
(I suspect/hope the typeshed project would welcome an effort to improve
the jsonschema stubs. Let's see if I get some spare time.)
* add check that gc.freeze is available before calling
* newsfragment
* lint
* Update comment
Co-authored-by: Dan Callahan <danc@element.io>
Co-authored-by: Dan Callahan <danc@element.io>
* CI: run Complement on the VM, not inside Docker
This requires https://github.com/matrix-org/complement/pull/289
We now run Complement on the VM instead of inside a Docker container.
This is to allow Complement to bind to any high-numbered port when it
starts up its own federation servers. We want to do this to allow for
more concurrency when running complement tests. Previously, Complement
only ever bound to `:8448` when running its own federation server. This
prevented multiple federation tests running at the same time as they would
fight each other on the port. This did however allow Complement to run
in Docker, as the host could just port forward `:8448` to allow homeserver
containers to communicate to Complement. Now that we are using random
ports however, we cannot use Docker to run Complement. This ends up
being a good thing because:
- Running Complement tests locally is closer to how they run in CI.
- Allows the `CI` env var to be removed in Complement.
- Slightly speeds up runs as we don't need to pull down the Complement
image prior to running tests. This assumes GHA caches actions sensibly.
* Changelog
* Full stop
* Update .github/workflows/tests.yml
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* Review comments
* Update .github/workflows/tests.yml
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* Docs: add missing PR submission process how-tos
The documentation says that in order to submit a pull request you have to run the linter and links to [Run the linters](https://matrix-org.github.io/synapse/latest/development/contributing_guide.html#run-the-linters). IMO "Run the linters" should explain that development dependencies are a pre-requisite.
I also included `pip install wheel` which I had to run inside my virtual environment on ubuntu before I `pip install -e ".[all,dev]"` would succeed.
PyNaCl's recent 1.5.0 release on PyPi includes arm64 wheels, which means our
arm64 docker images now build in a sensible amount of time, so we can skip the
amd64-only build.
* remove reference in comments to python3.6
* upgrade tox python env in script
* bump python version in example for completeness
* upgrade python version requirement in setup doc
* upgrade necessary python version in __init__.py
* upgrade python version in setup.py
* newsfragment
* drops refs to bionic and replace with focal
* bump refs to postgres 9.6 to 10
* fix hanging ci
* try installing tzdata first
* revert change made in b979f336
* ignore new random mypy error while debugging other error
* fix lint error for temporary workaround
* revert change to install list
* try passing env var
* export debian frontend var?
* move line and add comment
* bump pillow dependency
* bump lxml depenency
* install libjpeg-dev for pillow
* bump automat version to one compatible with py3.8
* add libwebp for pillow
* bump twisted trunk python version
* change suffix of newsfragment
* remove redundant python 3.7 checks
* lint
Debug for #8631.
I'm having a hard time tracking down what's going wrong in that issue.
In the reported example, I could see server A sending federation traffic
to server B and all was well. Yet B reports out-of-sync device updates
from A.
I couldn't see what was _in_ the events being sent from A to B. So I
have added some crude logging to track
- when we have updates to send to a remote HS
- the edus we actually accumulate to send
- when a federation transaction includes a device list update edu
- when such an EDU is received
This is a bit of a sledgehammer.
By scraping Open Graph information from the HTML even
when an autodiscovery endpoint is found. The results are
then combined to capture as much information as possible
from the page.
I've never found this terribly useful. I think it was added in the early days
of Synapse, without much thought as to what would actually be useful to log,
and has just been cargo-culted ever since.
Rather, it tends to clutter up debug logs with useless information.
The existing implementation of the `python_twisted_reactor_tick_time` metric is pretty useless, because it *only*
measures the time taken to execute timed calls and callbacks from threads. That neglects everything that
happens off the back of I/O, which is obviously quite a lot for us.
To improve this, I've hooked into a different place in the reactor - in particular, where it calls `epoll`. That call is
the only place it should wait for something to happen - the rest of the loop *should* be quick.
I've also removed `python_twisted_reactor_pending_calls`, because I don't believe anyone ever looks at it, and
it's a nuisance to populate.
Always add state.room_id after the configurable ORDER BY. Otherwise,
for any sort, certain pages can contain results from
other pages. (Especially when sorting by creator, since there may
be many rooms by the same creator)
* Document different order direction of numerical fields
"joined_members", "joined_local_members", "version" and "state_events"
are ordered in descending direction by default (dir=f). Added a note
in tests to explain the differences in ordering.
Signed-off-by: Daniël Sonck <daniel@sonck.nl>
documentation claims that you can use the %(app)s variable in password_reset and email_validation subjects, but if you do you end up with an error 500
Co-authored-by: br4nnigan <10244835+br4nnigan@users.noreply.github.com>
Rather than hooking into the reactor loop, just add a timed task that runs every 100 ms to do the garbage collection.
Part 1 of a quest to simplify the reactor monkey-patching.
Currently when puppeting another user, the user doing the puppeting is
tracked for client IPs and MAU (if configured).
When tracking MAU is important, it becomes necessary to be possible to
also track the client IPs and MAU of puppeted users. As an example a
client that manages user creation and creation of tokens via the Synapse
admin API, passing those tokens for the client to use.
This PR adds optional configuration to enable tracking of puppeted users
into monthly active users. The default behaviour stays the same.
Signed-off-by: Jason Robinson <jasonr@matrix.org>
* Deal with mypy errors w/ type-hinted pynacl 1.5.0
Fixes#11644.
I really don't like that we're monkey patching pynacl SignedKey
instances with alg and version objects. But I'm too scared to make the
changes necessary right now.
(Ideally I would replace `signedjson.types.SingingKey` with a runtime class which
wraps or inherits from `nacl.signing.SigningKey`.) C.f. https://github.com/matrix-org/python-signedjson/issues/16
By returning all of the m.space.child state of the space, not just
the first 50. The number of rooms returned is still capped at 50.
For the federation API this implies that the requesting server will
need to individually query for any other rooms it is not joined to.
* Optionally use an on-disk sqlite db in tests
When debugging a test it is sometimes useful to inspect the state of the
DB. This is not easy when the db is in-memory: one cannot attach the
sqlite CLI to another process's DB.
With this change, if SYNAPSE_TEST_PERSIST_SQLITE_DB is set, we use
`_trial_temp/test.db` as our sqlite database. One can then use
`sqlite3 _trial_temp/test.db` and query to your heart's content.
The DB is destroyed and recreated between different test cases.
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
This makes the serialization of events synchronous (and it no
longer access the database), but we must manually calculate and
provide the bundled aggregations.
Overall this should cause no change in behavior, but is prep work
for other improvements.
Fixes minor discrepancies between the /hierarchy endpoint described
in MSC2946 and the implementation.
Note that the changes impact the stable and unstable /hierarchy and
unstable /spaces endpoints for both client and federation APIs.
* `_auth_and_persist_outliers`: mark persisted events as outliers
Mark any events that get persisted via `_auth_and_persist_outliers` as, well,
outliers.
Currently this will be a no-op as everything will already be flagged as an
outlier, but I'm going to change that.
* `process_remote_join`: stop flagging as outlier
The events are now flagged as outliers later on, by `_auth_and_persist_outliers`.
* `send_join`: remove `outlier=True`
The events created here are returned in the result of `send_join` to
`FederationHandler.do_invite_join`. From there they are passed into
`FederationEventHandler.process_remote_join`, which passes them to
`_auth_and_persist_outliers`... which sets the `outlier` flag.
* `get_event_auth`: remove `outlier=True`
stop flagging the events returned by `get_event_auth` as outliers. This method
is only called by `_get_remote_auth_chain_for_event`, which passes the results
into `_auth_and_persist_outliers`, which will flag them as outliers.
* `_get_remote_auth_chain_for_event`: remove `outlier=True`
we pass all the events into `_auth_and_persist_outliers`, which will now flag
the events as outliers.
* `_check_sigs_and_hash_and_fetch`: remove unused `outlier` parameter
This param is now never set to True, so we can remove it.
* `_check_sigs_and_hash_and_fetch_one`: remove unused `outlier` param
This is no longer set anywhere, so we can remove it.
* `get_pdu`: remove unused `outlier` parameter
... and chase it down into `get_pdu_from_destination_raw`.
* `event_from_pdu_json`: remove redundant `outlier` param
This is never set to `True`, so can be removed.
* changelog
* update docstring
* Fix AssertionErrors after purging events
If you purged a bunch of events from your database, and then restarted synapse
without receiving more events, then you would get a bunch of AssertionErrors on
restart.
This fixes the situation by rewinding the stream processors.
* `check-newsfragment`: ignore deleted newsfiles
Events returned by `backfill` should not be flagged as outliers.
Fixes:
```
AssertionError: null
File "synapse/handlers/federation.py", line 313, in try_backfill
dom, room_id, limit=100, extremities=extremities
File "synapse/handlers/federation_event.py", line 517, in backfill
await self._process_pulled_events(dest, events, backfilled=True)
File "synapse/handlers/federation_event.py", line 642, in _process_pulled_events
await self._process_pulled_event(origin, ev, backfilled=backfilled)
File "synapse/handlers/federation_event.py", line 669, in _process_pulled_event
assert not event.internal_metadata.is_outlier()
```
See https://sentry.matrix.org/sentry/synapse-matrixorg/issues/231992Fixes#8894.
* Push `get_room_{min,max_stream_ordering}` into StreamStore
Both implementations of this are identical, so we may as well push it down and
get rid of the abstract base class nonsense.
* Remove redundant `StreamStore` class
This is empty now
* Remove redundant `get_current_events_token`
This was an exact duplicate of `get_room_max_stream_ordering`, so let's get rid
of it.
* newsfile
* remove python 3.6 and postgres 9.6 from github workflow
* remove python 3.6 env from tox
* newsfragment
* correct postgres version
* add py310 to tox env list
* Wrap `auth.get_user_by_req` in an opentracing span
give `get_user_by_req` its own opentracing span, since it can result in a
non-trivial number of sub-spans which it is useful to group together.
This requires a bit of reorganisation because it also sets some tags (and may
force tracing) on the servlet span.
* Emit opentracing span for encoding json responses
This can be a significant time sink.
* Rename all sync spans with a prefix
* Write an opentracing span for encoding sync response
* opentracing span to group generate_room_entries
* opentracing spans within sync.encode_response
* changelog
* Use the `trace` decorator instead of context managers
This adds some opentracing annotations to ResponseCache, to make it easier to see what's going on; in particular, it adds a link back to the initial trace which is actually doing the work of generating the response.
* remove `start_active_span_from_request`
Instead, pull out a separate function, `span_context_from_request`, to extract
the parent span, which we can then pass into `start_active_span` as
normal. This seems to be clearer all round.
* Remove redundant tags from `incoming-federation-request`
These are all wrapped up inside a parent span generated in AsyncResource, so
there's no point duplicating all the tags that are set there.
* Leave request spans open until the request completes
It may take some time for the response to be encoded into JSON, and that JSON
to be streamed back to the client, and really we want that inside the top-level
span, so let's hand responsibility for closure to the SynapseRequest.
* opentracing logs for HTTP request events
* changelog
* Disable aggregation bundling on `/sync` responses
A partial revert of #11478. This turns out to have had a significant CPU impact
on initial-sync handling. For now, let's disable it, until we find a more
efficient way of achieving this.
* Fix tests.
Co-authored-by: Patrick Cloke <patrickc@matrix.org>
A couple of safety-checks to hopefully stop people doing what I just did, and create a storage
function which only works the first time it is called (and not when it is re-run due to a database
concurrency error or similar).
* Splits the logic for parsing HTML from the resource handling code.
* Fix a circular import in the oEmbed code (which uses the HTML parsing code).
* Renames some of the HTML parsing methods to:
* Make it clear which methods are "internal" to the module.
* Clarify what the methods do.
Create a new dict helper method `simple_insert_many_values_txn`, which takes
raw row values, rather than {key=>value} dicts. This saves us a bunch of dict
munging, and makes it easier to use generators rather than creating
intermediate lists and dicts.
* Move sync_token up to the top
* Pull out _get_ignored_users
* Try to signpost the body of `_generate_sync_entry_for_rooms`
* Pull out _calculate_user_changes
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
After #10847, `looping_background_call` would print an error in the logs
every time a non-async function was called. Since the error would be
caught and ignored immediately, there were no other side effects.
Due to updates to MSC2675 this includes a few fixes:
* Include bundled aggregations for /sync.
* Do not include bundled aggregations for /initialSync and /events.
* Do not bundle aggregations for state events.
* Clarifies comments and variable names.
This mainly consists of docstrings and inline comments. There are one or two type annotations and variable renames thrown in while I was here.
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
* move wiki pages to synapse/docs and add a few titles where necessary
* update SUMMARY.md with added pages
* add changelog
* move incorrectly located newsfragment
* update changelog number
* snake case added files and update summary.md accordingly
* update issue/pr links
* update relative links to docs
* update changelog to indicate that we moved wiki pages to the docs and state reasoning
* requested changes to admin_faq.md
* requested changes to database_maintenance_tools.md
* requested changes to understanding_synapse_through_graphana_graphs.md
* add changelog
* fix leftover merge errata
* fix unwanted changes from merge
* use two spaces between entries
* outdent code blocks
MSC3030: https://github.com/matrix-org/matrix-doc/pull/3030
Client API endpoint. This will also go and fetch from the federation API endpoint if unable to find an event locally or we found an extremity with possibly a closer event we don't know about.
```
GET /_matrix/client/unstable/org.matrix.msc3030/rooms/<roomID>/timestamp_to_event?ts=<timestamp>&dir=<direction>
{
"event_id": ...
"origin_server_ts": ...
}
```
Federation API endpoint:
```
GET /_matrix/federation/unstable/org.matrix.msc3030/timestamp_to_event/<roomID>?ts=<timestamp>&dir=<direction>
{
"event_id": ...
"origin_server_ts": ...
}
```
Co-authored-by: Erik Johnston <erik@matrix.org>
* move wiki pages to synapse/docs and add a few titles where necessary
* update SUMMARY.md with added pages
* add changelog
* move incorrectly located newsfragment
* update changelog number
* snake case added files and update summary.md accordingly
* update issue/pr links
* update relative links to docs
* update changelog to indicate that we moved wiki pages to the docs and state reasoning
* revert unintentional change to CHANGES.md
* add link
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
* Update CHANGES.md
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
* Add check to catch syanpse master process starting when workers are configured
* add test to verify that starting master process with worker config raises error
* newsfragment
* specify config.worker.worker_app in check
* update test
* report specific config option that triggered the error
Co-authored-by: reivilibre <oliverw@matrix.org>
* clarify error message
Co-authored-by: reivilibre <oliverw@matrix.org>
Co-authored-by: reivilibre <oliverw@matrix.org>
When all entries in an `LruCache` have a size of 0 according to the
provided `size_callback`, and `drop_from_cache` is called on a cache
node, the node would be unlinked from the LRU linked list but remain in
the cache dictionary. An assertion would be later be tripped due to the
inconsistency.
Avoid unintentionally calling `__len__` and use a strict `is None`
check instead when unwrapping the weak reference.
Part of https://github.com/matrix-org/synapse/issues/11300
Call stack:
- `_persist_events_and_state_updates` (added `use_negative_stream_ordering`)
- `_persist_events_txn`
- `_update_room_depths_txn` (added `update_room_forward_stream_ordering`)
- `_update_metadata_tables_txn`
- `_store_room_members_txn` (added `inhibit_local_membership_updates`)
Using keyword-only arguments (`*`) to reduce the mistakes from `backfilled` being left as a positional argument somewhere and being interpreted wrong by our new arguments.
Since e81fa92648, Synapse depends on
the use_float flag which has been introduced in ijson 3.1 and
is not available in 3.0. This is known to cause runtime errors
with send_join.
Signed-off-by: Daniel Molkentin <danimo@infra.run>
Co-authored-by: Daniel Molkentin <danimo@infra.run>
The previous fix for the ongoing event fetches counter
(8eec25a1d9) was both insufficient and
incorrect.
When the database is unreachable, `_do_fetch` never gets run and so
`_event_fetch_ongoing` is never decremented.
The previous fix also moved the `_event_fetch_ongoing` decrement outside
of the `_event_fetch_lock` which allowed race conditions to corrupt the
counter.
This change makes mypy complain if the constants are ever reassigned,
and, more usefully, makes mypy type them as `Literal`s instead of `str`s,
allowing code of the following form to pass mypy:
```py
def do_something(membership: Literal["join", "leave"], ...): ...
do_something(Membership.JOIN, ...)
```
* remove background update code related to deprecated config flag
* changelog entry
* update changelog
* Delete 11394.removal
Duplicate, wrong number
* add no-op background update and change newfragment so it will be consolidated with associated work
* remove unused code
* Remove code associated with deprecated flag from legacy docker dynamic config file
Co-authored-by: reivilibre <oliverw@matrix.org>
Instead of only known relation types. This also reworks the background
update for thread relations to crawl events and search for any relation
type, not just threaded relations.
If `room_list_publication_rules` was configured with a rule with a
non-wildcard alias and a room was created with an alias then an
internal server error would have been thrown.
This fixes the error and properly applies the publication rules
during room creation.
Fixes a bug introduced in #11129: objects signed by the local server, but with
keys other than the current one, could not be successfully verified.
We need to check the key id in the signature, and track down the right key.
* remove code legacy code related to deprecated config flag "trust_identity_server_for_password_resets" from synapse/config/emailconfig.py
* remove legacy code supporting depreciated config flag "trust_identity_server_for_password_resets" from synapse/config/registration.py
* remove legacy code supporting depreciated config flag "trust_identity_server_for_password_resets" from synapse/handlers/identity.py
* add tests to ensure config error is thrown and synapse refuses to start when depreciated config flag is found
* add changelog
* slightly change behavior to only check for deprecated flag if set to 'true'
* Update changelog.d/11333.misc
Co-authored-by: reivilibre <oliverw@matrix.org>
Co-authored-by: reivilibre <oliverw@matrix.org>
Adds validation to the Client-Server API to ensure that
the potential thread head does not relate to another event
already. This results in not allowing a thread to "fork" into
other threads.
If the target event is unknown for some reason (maybe it isn't
visible to your homeserver), but is the target of other events
it is assumed that the thread can be created from it. Otherwise,
it is rejected as an unknown event.
Otherwise I get this beautiful stacktrace:
```
python3 -m synapse.app.homeserver --config-path /etc/matrix/homeserver.yaml
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/root/synapse/synapse/app/homeserver.py", line 455, in <module>
main()
File "/root/synapse/synapse/app/homeserver.py", line 445, in main
hs = setup(sys.argv[1:])
File "/root/synapse/synapse/app/homeserver.py", line 345, in setup
config = HomeServerConfig.load_or_generate_config(
File "/root/synapse/synapse/config/_base.py", line 671, in load_or_generate_config
config_dict = read_config_files(config_files)
File "/root/synapse/synapse/config/_base.py", line 717, in read_config_files
yaml_config = yaml.safe_load(file_stream)
File "/root/synapse/env/lib/python3.8/site-packages/yaml/__init__.py", line 125, in safe_load
return load(stream, SafeLoader)
File "/root/synapse/env/lib/python3.8/site-packages/yaml/__init__.py", line 81, in load
return loader.get_single_data()
File "/root/synapse/env/lib/python3.8/site-packages/yaml/constructor.py", line 49, in get_single_data
node = self.get_single_node()
File "/root/synapse/env/lib/python3.8/site-packages/yaml/composer.py", line 36, in get_single_node
document = self.compose_document()
File "/root/synapse/env/lib/python3.8/site-packages/yaml/composer.py", line 55, in compose_document
node = self.compose_node(None, None)
File "/root/synapse/env/lib/python3.8/site-packages/yaml/composer.py", line 84, in compose_node
node = self.compose_mapping_node(anchor)
File "/root/synapse/env/lib/python3.8/site-packages/yaml/composer.py", line 133, in compose_mapping_node
item_value = self.compose_node(node, item_key)
File "/root/synapse/env/lib/python3.8/site-packages/yaml/composer.py", line 82, in compose_node
node = self.compose_sequence_node(anchor)
File "/root/synapse/env/lib/python3.8/site-packages/yaml/composer.py", line 110, in compose_sequence_node
while not self.check_event(SequenceEndEvent):
File "/root/synapse/env/lib/python3.8/site-packages/yaml/parser.py", line 98, in check_event
self.current_event = self.state()
File "/root/synapse/env/lib/python3.8/site-packages/yaml/parser.py", line 379, in parse_block_sequence_first_entry
return self.parse_block_sequence_entry()
File "/root/synapse/env/lib/python3.8/site-packages/yaml/parser.py", line 384, in parse_block_sequence_entry
if not self.check_token(BlockEntryToken, BlockEndToken):
File "/root/synapse/env/lib/python3.8/site-packages/yaml/scanner.py", line 116, in check_token
self.fetch_more_tokens()
File "/root/synapse/env/lib/python3.8/site-packages/yaml/scanner.py", line 227, in fetch_more_tokens
return self.fetch_alias()
File "/root/synapse/env/lib/python3.8/site-packages/yaml/scanner.py", line 610, in fetch_alias
self.tokens.append(self.scan_anchor(AliasToken))
File "/root/synapse/env/lib/python3.8/site-packages/yaml/scanner.py", line 922, in scan_anchor
raise ScannerError("while scanning an %s" % name, start_mark,
yaml.scanner.ScannerError: while scanning an alias
in "/etc/matrix/homeserver.yaml", line 614, column 5
expected alphabetic or numeric character, but found '.'
in "/etc/matrix/homeserver.yaml", line 614, column 6
```
Signed-off-by: Nicolai Søborg <git@xn--sb-lka.org>
It already seems to pass mypy. I wonder what changed, given that it was
on the exclusion list. So this commit consists of me ensuring
`--disallow-untyped-defs` passes and a minor fixup to a function that
returned either `True` or `None`.
* Add support for the stable version of MSC2778
Signed-off-by: Tulir Asokan <tulir@maunium.net>
* Expect m.login.application_service in login and password provider tests
Signed-off-by: Tulir Asokan <tulir@maunium.net>
* remove unused tables room_stats_historical and user_stats_historical
* update changelog number
* Bump schema compat version comment
* make linter happy
* Update comment to give more info
Co-authored-by: reivilibre <oliverw@matrix.org>
Co-authored-by: reivilibre <oliverw@matrix.org>
* Prefer `HTTPStatus` over plain `int`
This is an Opinion that no-one has seemed to object to yet.
* `--disallow-untyped-defs` for `tests.rest.client.test_directory`
* Improve synapse's annotations for deleting aliases
* Test case for deleting a room alias
* Changelog
* change display names/avatar URLS to None if they contain null bytes
* add changelog
* add POC test, requested changes
* add a saner test and remove old one
* update test to verify that display name has been changed to None
* make test less fragile
* Make DataStore inherit from EventForwardExtremitiesStore before CacheInvalidationWorkerStore
the former implicitly inherits from the latter, so they should be
ordered like this when used.
* Annotate HomeserverTestCase.servlets
* Correct annotation of federation_auth_origin
* Use AnyStr custom_headers instead of a Union
This allows (str, str) and (bytes, bytes).
This disallows (str, bytes) and (bytes, str)
* DomainSpecificString.SIGIL is a ClassVar
==============================
This fixes an issue with publishing the Debian packages for 1.47.0rc1.
It is otherwise identical to 1.47.0rc1.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8SRSDO7gYkSP4chELS76LzL74EcFAmGLl4sACgkQLS76LzL7
4EdpXxAArEBqEWCUCy6wSNRfexzI+qAITzhhqR5BiDDtkt33GzXLWDN83lT3kCS/
xNZyhiwwUbtt/TqkZ0/Tqu2afI5JZCizpP/kXLVS2WA03jY8+l+eQbKEQR5vEsEV
752J8OVJ9GUunewOI4Uo4xRqndzvOMQBKaoPzyq44PcFd6lS2rkAnJYstVnow+rB
JvRIXFjOocaOzpemal5Mh8ToH5Y6yfe7MEE8B0s3PjX1FMAv87475X1oLaK7GeKn
3yf8XJ1mJcvJfkHuopqX8PxW+pFGorb4N1AFk2BikxB4XV0nCo5VHUejSWaEP4oH
uHtTYDV5RHwHhZyHYMbBOyPD2bSIKZ6wXqgn/o5io7mfsUS77SdwuGhRDqnupCi/
Vg+4F5e1nuSPaLEJd5Qfb1NM2xErGm8gfYN3DlRBzpuQ+Jx/034YWds42ZuNv2IO
qdAOeztiOMdrPkdPTL+XlNNXV80waMsQFt2EaycYkYmPVtgvlKvr/c/Wg06jyD7u
dll33KlE8d+jM3JbOf6D/Ze5ECQlf0gGtCukXTMIEOorrCdQuLH4R8hb49YqmiLq
nL8aPXCv1pZOrMbHGbcYfHoZIV+qMhUK04PN7jMLF/+nyBkTI1wRODN0lvBh8gy6
dNWTzM/dMk5IvFY0FDipkR7Cv5c2U2Wu1xCy8026VHCu/DIzVRw=
=Baxl
-----END PGP SIGNATURE-----
Merge tag 'v1.47.0rc2' into develop
Synapse 1.47.0rc2 (2021-11-10)
==============================
This fixes an issue with publishing the Debian packages for 1.47.0rc1.
It is otherwise identical to 1.47.0rc1.
Co-authored-by: Dirk Klimpel <5740567+dklimpel@users.noreply.github.com>
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
* Make lock better handle process being killed
If the process gets killed and restarted (so that it didn't have a
chance to drop its locks gracefully) then there may still be locks in
the DB that are for the same instance that haven't yet timed out but are
safe to delete.
We handle this case by a) checking if the current instance already has
taken out the lock, and b) if not then ignoring locks that are for the
same instance.
* Periodically check for old staged events
This is to protect against other instances dying and their locks timing
out.
* Remove unused Vagrant scripts
* Change package Architecture to any
* Preinstall the wheel package when building venvs.
Addresses the following warnings during Debian builds:
Using legacy 'setup.py install' for jaeger-client, since package 'wheel' is not installed.
Using legacy 'setup.py install' for matrix-synapse-ldap3, since package 'wheel' is not installed.
Using legacy 'setup.py install' for opentracing, since package 'wheel' is not installed.
Using legacy 'setup.py install' for psycopg2, since package 'wheel' is not installed.
Using legacy 'setup.py install' for systemd-python, since package 'wheel' is not installed.
Using legacy 'setup.py install' for pympler, since package 'wheel' is not installed.
Using legacy 'setup.py install' for threadloop, since package 'wheel' is not installed.
Using legacy 'setup.py install' for thrift, since package 'wheel' is not installed.
* Allow /etc/default/matrix-synapse to be missing
Per the systemd.exec manpage, prefixing an EnvironmentFile with "-":
> indicates that if the file does not exist, it will not be read and no
> error or warning message is logged.
Signed-off-by: Dan Callahan <danc@element.io>
When an event fetcher aborts due to an exception, `_event_fetch_ongoing`
must be decremented, otherwise the event fetcher would never be
replaced. If enough event fetchers were to fail, no more events would be
fetched and requests would get stuck waiting for events.
* add code to handle missing content-type header and a test to verify that it works
* add handling for missing content-type in the /upload endpoint as well
* slightly refactor test code to put private method in approriate place
* handle possible null value for content-type when pulling from the local db
* add changelog
* refactor test and add code to handle missing content-type in cached remote media
* requested changes
* Update changelog.d/11200.bugfix
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
* Docker image: avoid changing user during `generate`
The intention was always that the config files get written as the initial user
(normally root) - only the data directory needs to be writable by Synapse. This
got changed in https://github.com/matrix-org/synapse/pull/5970, but that seems
to have been a mistake.
* Avoid changing user if no explicit UID is given
* changelog
* Labeled a lot more code blocks with the appropriate type
* Fixed a couple of minor typos (missing/extraneous commas)
Signed-off-by: Sumner Evans <me@sumnerevans.com>
* add tests for fetching key locally
* add logic to check if origin server is same as host and fetch verify key locally rather than over federation
* add changelog
* slight refactor, add docstring, change changelog entry
* Make changelog entry one line
* remove verify_json_locally and push locality check to process_request, add function process_request_locally
* remove leftover code reference
* refactor to add common call to 'verify_json and associated handling code
* add type hint to process_json
* add some docstrings + very slight refactor
* Teach MyPy that the sentinel context is False
This means that if `ctx: LoggingContextOrSentinel`
then `bool(ctx)` narrows us to `ctx:LoggingContext`, which is a really
neat find!
* Annotate RequestMetrics
- Raise errors for sentry if we use the sentinel context
- Ensure we don't raise an error and carry on, but not recording stats
- Include stack trace in the error case to lower Sean's blood pressure
* Make mypy pass for synapse.http.request_metrics
* Make synapse.http.connectproxyclient pass mypy
Co-authored-by: reivilibre <oliverw@matrix.org>
Users admin API can now also modify user
type in addition to allowing it to be
set on user creation.
Signed-off-by: Jason Robinson <jasonr@matrix.org>
Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
This is the final piece of the jigsaw for #9595. As with other changes before this one (eg #10771), we need to make sure that we auth the auth events in the right order, and actually check that their predecessors haven't been rejected.
To do this I've reused the existing code we use when persisting outliers elsewhere.
I've removed the code for attempting to fetch missing auth_events - the events should have been present in the send_join response, so the likely reason they are missing is that we couldn't verify them, so requesting them again is unlikely to help. Instead, we simply drop any state which relies on those auth events, as we do at a backwards-extremity. See also matrix-org/complement#216 for a test for this.
`synapse.config.__main__` has the possibility to read a config item. This can be used to conveniently also validate the config is valid before trying to start Synapse.
The "read" command broke in https://github.com/matrix-org/synapse/pull/10916 as it now requires passing in "server.server_name" for example.
Also made the read command optional so one can just call this with just the confirm file reference and get a "Config parses OK" if things are ok.
Signed-off-by: Jason Robinson <jasonr@matrix.org>
Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
* We only need to fetch users in private rooms
* Filter out `user_id` at the top
* Discard excluded users in the top loop
We weren't doing this in the "First, if they're our user" branch so this
is a bugfix.
* The caller must check that `user_id` is included
This is in the docstring. There are two call sites:
- one in `_handle_room_publicity_change`, which explicitly checks before calling;
- and another in `_handle_room_membership_event`, which returns early if
the user is excluded.
So this change is safe.
* Test joining a private room with an excluded user
* Tweak an existing test
* Changelog
* test docstring
* lint
If we find ourselves dealing with rejected events, we proably want to know
about it. Let's include it in the stringification of the event so that it gets
logged.
Currently, when we receive an event whose auth_events differ from those we expect, we state-resolve between the two state sets, and check that the event passes auth based on the resolved state.
This means that it's possible for us to accept events which don't pass auth at their declared auth_events (or where the auth events themselves were rejected), leading to problems down the line like #10083.
This change means we will:
* ignore any events where we cannot find the auth events
* reject any events whose auth events were rejected
* reject any events which do not pass auth at their declared auth_events.
Together with a whole raft of previous work, this is a partial fix to #9595.
Fixes#6643.
Based on #11009.
This fixes a bug where we would accept an event whose `auth_events` include
rejected events, if the rejected event was shadowed by another `auth_event`
with same `(type, state_key)`.
The approach is to pass a list of auth events into
`check_auth_rules_for_event` instead of a dict, which of course means updating
the call sites.
This is an extension of #10956.
Instead of triggering `__exit__` manually on the replication handler's
logging context, use it as a context manager so that there is an
`__enter__` call to balance the `__exit__`.
Found while working on the Gitter backfill script and noticed
it only happened after we sent 7 batches, https://gitlab.com/gitterHQ/webapp/-/merge_requests/2229#note_665906390
When there are more than 5 backward extremities for a given depth,
backfill will throw an error because we sliced the extremity list
to 5 but then try to iterate over the full list. This causes
us to look for state that we never fetched and we get a `KeyError`.
Before when calling `/messages` when there are more than 5 backward extremities:
```
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 258, in _async_render_wrapper
callback_return = await self._async_render(request)
File "/usr/local/lib/python3.8/site-packages/synapse/http/server.py", line 446, in _async_render
callback_return = await raw_callback_return
File "/usr/local/lib/python3.8/site-packages/synapse/rest/client/room.py", line 580, in on_GET
msgs = await self.pagination_handler.get_messages(
File "/usr/local/lib/python3.8/site-packages/synapse/handlers/pagination.py", line 396, in get_messages
await self.hs.get_federation_handler().maybe_backfill(
File "/usr/local/lib/python3.8/site-packages/synapse/handlers/federation.py", line 133, in maybe_backfill
return await self._maybe_backfill_inner(room_id, current_depth, limit)
File "/usr/local/lib/python3.8/site-packages/synapse/handlers/federation.py", line 386, in _maybe_backfill_inner
likely_extremeties_domains = get_domains_from_state(states[e_id])
KeyError: '$zpFflMEBtZdgcMQWTakaVItTLMjLFdKcRWUPHbbSZJl'
```
==============================
**Note:** This release candidate [fixes](https://github.com/matrix-org/synapse/issues/11053) the user directory [bug](https://github.com/matrix-org/synapse/issues/11025) present in 1.45.0rc1. However, the [performance issue](https://github.com/matrix-org/synapse/issues/11049) which appeared in v1.44.0 is yet to be resolved.
Bugfixes
--------
- Fix a long-standing bug when using multiple event persister workers where events were not correctly sent down `/sync` due to a race. ([\#11045](https://github.com/matrix-org/synapse/issues/11045))
- Fix a bug introduced in Synapse 1.45.0rc1 where the user directory would stop updating if it processed an event from a
user not in the `users` table. ([\#11053](https://github.com/matrix-org/synapse/issues/11053))
- Fix a bug introduced in Synapse v1.44.0 when logging errors during oEmbed processing. ([\#11061](https://github.com/matrix-org/synapse/issues/11061))
Internal Changes
----------------
- Add an 'approximate difference' method to `StateFilter`. ([\#10825](https://github.com/matrix-org/synapse/issues/10825))
- Fix inconsistent behavior of `get_last_client_by_ip` when reporting data that has not been stored in the database yet. ([\#10970](https://github.com/matrix-org/synapse/issues/10970))
- Fix a bug introduced in Synapse 1.21.0 that causes opentracing and Prometheus metrics for replication requests to be measured incorrectly. ([\#10996](https://github.com/matrix-org/synapse/issues/10996))
- Ensure that cache config tests do not share state. ([\#11036](https://github.com/matrix-org/synapse/issues/11036))
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE1508oLYUKainYFJakD7OEIo53t0FAmFoBHYACgkQkD7OEIo5
3t2+wRAAlk2GR9Ezi8E/AG6f5Dsq0+guswHK9b34PtI0wcLQFgy76y200g8D/V0F
XFdby4IZsSLVmP2EVF+vIaU0/zviClHR81b8pvNcIRxORBAhV6fH5JkCRPNAlurS
aGnoKA1TSSg4HwNHn+dWyg7VCHl76oLEBhixRdz7aWPhLnCNkCM62E+jlwVq19BH
lPIDcwEvtq2ovHIGyE/sTg6Shrf9ELy0mYMQpPJNfDmxqjWqRhKaSpx5JSEJnzih
amONFY7Nc8kIUdP6JSR6105IhUDAnItScmyyUowyCvwtAQsXsgMd8Cs0FdiJ0JLy
BtE9qsBmBpmVWjfk7ESXE1eTHqH9Eexsy5dcB43hiVZ3ihhCCLmtVANej3QaRnRL
rYAiH+W9WK7n3189hzkmEG/BKk+zu65dLaHSMjBvEDMyiYLoPRlYnmunLeo6mESq
HHa25o5WiY8+UQ4RgcYqScIqvlgFqODmZT21OlknZfupUd2kDCN7Ga7TY599sBp2
XSGNB6l/eGWRWtOPcdrandsDnppcoYsfJkXIfAUIxwjnSN/RU4kDPDiYDXhU8d+D
hW/f59XwfMy3D3zZI4a+pEY8ZhcSHTnLJni95UFyb/7zOwnT+pZXUfVX46727nxg
imUfXsXIxUYQnDkAGBXdt7QMZPy4gvTGenllqaA987xjfd5/fKs=
=TTLK
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE1508oLYUKainYFJakD7OEIo53t0FAmFoCZQACgkQkD7OEIo5
3t2Ygw/9E48TTeW7F5j16aCVVmRYnlAoh+rBlrKiaIE4MSJtwI+TEdIIrwGm+JdB
lh59N3YmkXrn2No4ew0UY9fjD8LgofX3GAmlP2vN/fHst9kfe/UtoIDHWdQG1Xhh
xeHnt9vXirQMOLxapkDM4ymBW1WAQdvgdn3Ozy3hQ+ZUQiMqJg4CQVMb7mUpjdJz
D6Jb0Ub5FGnCGcSrofBk9KvHSZQb9Qrp8Rz3kokib98/+eJGdJDXhf0Yg4junJDy
mOvTAhbJYG9TS3taFajuw8IfcngJYyBMUOf7HlcqMtyEPdHFW2wClUHzeQIPOurU
YQmpqpDO7LSrHhFOt3cdmIlCGpGxKgR9AIUn8Volb2pxwH2i/BA7ryP9ZKz9dht3
VnPFLEHYumBYOAB05TsUkDlpr/0x3YS1JCgzHGABM8spsi8XO/ZPRkwpjdc/jSRB
3A4I7WGpKddk7pY3mCOnDx8hm2asgTPgTHkvJr2v2wYndIHxW91BDKhhSZ4tzjIY
aOr2CtJmoFb3Buek/s1S2AGVLRXMYa/IIRHl8FjmL7WhsqebJkcRfcaMfHHtCq5Y
tLhK2CYavEOu8TrkFWyIPPCCn8Y6Y8qnfg8c8f1ncfHsVHE4sOs9hjsVVz+n4QBd
LD+ICWBeLvpGQK+0cZu3QFtIyo6XKBJlYuAyLD5saNkuLnupA3Y=
=A9jl
-----END PGP SIGNATURE-----
Merge tag 'v1.45.0rc2' into develop
Synapse 1.45.0rc2 (2021-10-14)
==============================
**Note:** This release candidate [fixes](https://github.com/matrix-org/synapse/issues/11053) the user directory [bug](https://github.com/matrix-org/synapse/issues/11025) present in 1.45.0rc1. However, the [performance issue](https://github.com/matrix-org/synapse/issues/11049) which appeared in v1.44.0 is yet to be resolved.
Bugfixes
--------
- Fix a long-standing bug when using multiple event persister workers where events were not correctly sent down `/sync` due to a race. ([\#11045](https://github.com/matrix-org/synapse/issues/11045))
- Fix a bug introduced in Synapse 1.45.0rc1 where the user directory would stop updating if it processed an event from a
user not in the `users` table. ([\#11053](https://github.com/matrix-org/synapse/issues/11053))
- Fix a bug introduced in Synapse v1.44.0 when logging errors during oEmbed processing. ([\#11061](https://github.com/matrix-org/synapse/issues/11061))
Internal Changes
----------------
- Add an 'approximate difference' method to `StateFilter`. ([\#10825](https://github.com/matrix-org/synapse/issues/10825))
- Fix inconsistent behavior of `get_last_client_by_ip` when reporting data that has not been stored in the database yet. ([\#10970](https://github.com/matrix-org/synapse/issues/10970))
- Fix a bug introduced in Synapse 1.21.0 that causes opentracing and Prometheus metrics for replication requests to be measured incorrectly. ([\#10996](https://github.com/matrix-org/synapse/issues/10996))
- Ensure that cache config tests do not share state. ([\#11036](https://github.com/matrix-org/synapse/issues/11036))
Resolve and share `state_groups` for all historical events in batch. This also helps for showing the appropriate avatar/displayname in Element and will work whenever `/messages` has one of the historical messages as the first message in the batch.
This does have the flaw where if you just insert a single historical event somewhere, it probably won't resolve the state correctly from `/messages` or `/context` since it will grab a non historical event above or below with resolved state which never included the historical state back then. For the same reasions, this also does not work in Element between the transition from actual messages to historical messages. In the Gitter case, this isn't really a problem since all of the historical messages are in one big lump at the beginning of the room.
For a future iteration, might be good to look at `/messages` and `/context` to additionally add the `state` for any historical messages in that batch.
---
How are the `state_groups` shared? To illustrate the `state_group` sharing, see this example:
**Before** (new `state_group` for every event 😬, very inefficient):
```
# Tests from https://github.com/matrix-org/complement/pull/206
$ COMPLEMENT_ALWAYS_PRINT_SERVER_LOGS=1 COMPLEMENT_DIR=../complement ./scripts-dev/complement.sh TestBackfillingHistory/parallel/should_resolve_member_state_events_for_historical_events
create_new_client_event m.room.member event=$_JXfwUDIWS6xKGG4SmZXjSFrizhARM7QblhATVWWUcA state_group=None
create_new_client_event org.matrix.msc2716.insertion event=$1ZBfmBKEjg94d-vGYymKrVYeghwBOuGJ3wubU1-I9y0 state_group=9
create_new_client_event org.matrix.msc2716.insertion event=$Mq2JvRetTyclPuozRI682SAjYp3GqRuPc8_cH5-ezPY state_group=10
create_new_client_event m.room.message event=$MfmY4rBQkxrIp8jVwVMTJ4PKnxSigpG9E2cn7S0AtTo state_group=11
create_new_client_event m.room.message event=$uYOv6V8wiF7xHwOMt-60d1AoOIbqLgrDLz6ZIQDdWUI state_group=12
create_new_client_event m.room.message event=$PAbkJRMxb0bX4A6av463faiAhxkE3FEObM1xB4D0UG4 state_group=13
create_new_client_event org.matrix.msc2716.batch event=$Oy_S7AWN7rJQe_MYwGPEy6RtbYklrI-tAhmfiLrCaKI state_group=14
```
**After** (all events in batch sharing `state_group=10`) (the base insertion event has `state_group=8` which matches the `prev_event` we're inserting next to):
```
# Tests from https://github.com/matrix-org/complement/pull/206
$ COMPLEMENT_ALWAYS_PRINT_SERVER_LOGS=1 COMPLEMENT_DIR=../complement ./scripts-dev/complement.sh TestBackfillingHistory/parallel/should_resolve_member_state_events_for_historical_events
create_new_client_event m.room.member event=$PWomJ8PwENYEYuVNoG30gqtybuQQSZ55eldBUSs0i0U state_group=None
create_new_client_event org.matrix.msc2716.insertion event=$e_mCU7Eah9ABF6nQU7lu4E1RxIWccNF05AKaTT5m3lw state_group=9
create_new_client_event org.matrix.msc2716.insertion event=$ui7A3_GdXIcJq0C8GpyrF8X7B3DTjMd_WGCjogax7xU state_group=10
create_new_client_event m.room.message event=$EnTIM5rEGVezQJiYl62uFBl6kJ7B-sMxWqe2D_4FX1I state_group=10
create_new_client_event m.room.message event=$LGx5jGONnBPuNhAuZqHeEoXChd9ryVkuTZatGisOPjk state_group=10
create_new_client_event m.room.message event=$wW0zwoN50lbLu1KoKbybVMxLbKUj7GV_olozIc5i3M0 state_group=10
create_new_client_event org.matrix.msc2716.batch event=$5ZB6dtzqFBCEuMRgpkU201Qhx3WtXZGTz_YgldL6JrQ state_group=10
```
* Pull out `_handle_room_membership_event`
* Discard excluded users early
* Rearrange logic so the change is membership is effectively switched over. See PR for rationale.
The following scenarios would halt the user directory updater:
- user joins room
- user leaves room
- user present in room which switches from private to public, or vice versa.
for two classes of users:
- appservice senders
- users missing from the user table.
If this happened, the user directory would be stuck, unable to make forward progress.
Exclude both cases from the user directory, so that we ignore them.
Co-authored-by: Eric Eastwood <erice@element.io>
Co-authored-by: reivilibre <oliverw@matrix.org>
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
The race allowed the current position to advance too far when stream IDs
are still being persisted.
This happened when it received a new stream ID from a remote write
between a new stream ID being allocated and it being added to the set of
unpersisted stream IDs.
Fixes#9424.
This reverts #11019 and structures the code a bit more like it was before #10985.
The global cache state must be reset before running the tests since other test
cases might have configured caching (and thus touched the global state).
Make `get_last_client_by_ip` return the same dictionary structure
regardless of whether the data has been persisted to the database.
This change will allow slightly cleaner type hints to be applied later
on.
This commit fixes two bugs to do with decorators not instrumenting
`ReplicationEndpoint`'s `send_request` correctly. There are two
decorators on `send_request`: Prometheus' `Gauge.track_inprogress()`
and Synapse's `opentracing.trace`.
`Gauge.track_inprogress()` does not have any support for async
functions when used as a decorator. Since async functions behave like
regular functions that return coroutines, only the creation of the
coroutine was covered by the metric and none of the actual body of
`send_request`.
`Gauge.track_inprogress()` returns a regular, non-async function
wrapping `send_request`, which is the source of the next bug.
The `opentracing.trace` decorator would normally handle async functions
correctly, but since the wrapped `send_request` is a non-async function,
the decorator ends up suffering from the same issue as
`Gauge.track_inprogress()`: the opentracing span only measures the
creation of the coroutine and none of the actual function body.
Using `Gauge.track_inprogress()` as a context manager instead of a
decorator resolves both bugs.
Updating mypy past version 0.9 means that third-party stubs are no-longer distributed with typeshed. See http://mypy-lang.blogspot.com/2021/06/mypy-0900-released.html for details.
We therefore pull in stub packages in setup.py
Additionally, some modules that we were previously ignoring import failures for now have stubs. So let's use them.
The rest of this change consists of fixups to make the newer mypy + stubs pass CI.
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
This splits apart `handle_new_user` into a function which adds an entry to the `user_directory` and a function which updates the room sharing tables. I plan to continue doing more of this kind of refactoring to clarify the implementation.
The shared ratelimit function was replaced with a dedicated
RequestRatelimiter class (accessible from the HomeServer
object).
Other properties were copied to each sub-class that inherited
from BaseHandler.
Use `PreserveLoggingContext()` to ensure that logging contexts are not
lost when exiting a read/write lock.
When exiting a read/write lock, callbacks on a `Deferred` are triggered
as a signal to any waiting coroutines. Any waiting coroutine that
becomes runnable is likely to follow the Synapse logging context rules
and will restore its own logging context, then either run to completion
or await another `Deferred`, resetting the logging context in the
process.
This removes the magic allowing accessing configurable
variables directly from the config object. It is now required
that a specific configuration class is used (e.g. `config.foo`
must be replaced with `config.server.foo`).
Fix a long-standing bug where a batch of user directory changes would be
silently dropped if the server left a room early in the batch.
* Pull out `wait_for_background_update` in tests
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
The following modules now pass `disallow_untyped_defs`:
* synapse.util.caches.cached_call
* synapse.util.caches.lrucache
* synapse.util.caches.response_cache
* synapse.util.caches.stream_change_cache
* synapse.util.caches.ttlcache pass
* synapse.util.daemonize
* synapse.util.patch_inline_callbacks pass `no-untyped-defs`
* synapse.util.versionstring
Additional typing in synapse.util.metrics. Didn't get this to pass `no-untyped-defs`, think I'll need to watch #10847
There are two steps to rebuilding the user directory:
1. a scan over rooms, followed by
2. a scan over local users.
The former reads avatars and display names from the `room_memberships`
table and therefore contains potentially private avatars and
display names. The latter reads from the the `profiles` table which only
contains public data; moreover it will overwrite any private profiles
that the rooms scan may have written to the user directory. This means
that the rebuild could leak private user while the rebuild was in
progress, only to later cover up the leaks once the rebuild had completed.
This change skips over local users when writing user_directory rows
when scanning rooms. Doing so means that it'll take longer for a rebuild
to make local users searchable, which is unfortunate. I think a future
PR can improve this by swapping the order of the two steps above. (And
indeed there's more to do here, e.g. copying from `profiles` without
going via Python.)
Small tidy-ups while I'm here:
* Remove duplicated code from test_initial. This was meant to be pulled into `purge_and_rebuild_user_dir`.
* Move `is_public` before updating sharing tables. No functional change; it's still before the first read of `is_public`.
* Don't bother creating a set from dict keys. Slightly nicer and makes the code simpler.
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
We correctly allowed using the MSC2716 batch endpoint for
the room creator in existing room versions but accidentally didn't track
the events because of a logic flaw.
This prevented you from connecting subsequent chunks together because it would
throw the unknown batch ID error.
We only want to process MSC2716 events when:
- The room version supports MSC2716
- Any room where the homeserver has the `msc2716_enabled` experimental feature enabled and the event is from the room creator
`_check_event_auth` is only called in two places, and only one of those sets
`send_on_behalf_of`. Warming the cache isn't really part of auth anyway, so
moving it out makes a lot more sense.
There's little point in doing a fancy state reconciliation dance if the event
itself is invalid.
Likewise, there's no point checking it again in `_check_for_soft_fail`.
* add test
* add function to remove user from monthly active table in deactivate code
* add function to remove user from monthly active table
* add changelog entry
* update changelog number
* requested changes
* update docstring on new function
* fix lint error
* Update synapse/storage/databases/main/monthly_active_users.py
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
==============================
Bugfixes
--------
- Fix a bug introduced in Synapse v1.40.0 where changing a user's display name or avatar in a restricted room would cause an authentication error. ([\#10933](https://github.com/matrix-org/synapse/issues/10933))
- Fix `/admin/whois/{user_id}` endpoint, which was broken in v1.44.0rc1. ([\#10968](https://github.com/matrix-org/synapse/issues/10968))
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdVkXOgzrGzds0jtrHgFcFF8ZFs0FAmFbCRUACgkQHgFcFF8Z
Fs1n2Q/+P6WZmqNPEyPb4zgEhHgBZtAiYsS8i+6TBdu/viricfe3xmiifmKYyblV
C07a8Qu14OMgmXKYR8AY0HvLRx+zdhJFGuYa3I0+yY1GTlQWlf+OKnQtsbgdmJ6O
b8HbmGB1qJCJ0rbbWhCQvx/XXlqqMIATc6qJj3HTwOpWxNL8TymOFg6TGtb+rLkN
/s2NfxWPJqKMKTLVgZUVjkNGBBtJmu/+Ow3PYz7J9hEW69REONed/014wVgiWx+W
BJOoUj8dHAS5ZhdKWiSKAzl5A9FQyxUPDwkO44wsK5OIu+dVy4eF9HCMi0EP/9nT
G3VWwr3z/TA55foL8XdrIPdp1SFqJmIEwDLgaibISD1/8MoC15YzAkt4CoKYOsL6
EQtDDQal1BNSFZPITz7PWSFGOMgN3tKfMQUqCC+eagHvNfSxVH+J1zNg7ve2/24h
PbRU/tt4mJiPu7M2Ejj0EWNHyI2fp92ARzzAzQ0JlrPJe34Z/hAiqf7w4kjtJ7Ew
Lm890EAw2azo7RYU9xevkOsU2CEtLEnKsGMW8pJc9eRxUjRKfk/EW39dCQq9Myhu
uQFeYHALcn4vu9jGhGyoO9fJDKIxpM76h37Cwu7shg84Gp2ZwucSjwW2l2rZKiIS
YUeruLOavUueUZYTcyXxAqAAsH8z1hbKmnXIbNxFrcu+NOl+o84=
=HR3N
-----END PGP SIGNATURE-----
Merge tag 'v1.44.0rc3' into develop
Synapse 1.44.0rc3 (2021-10-04)
==============================
Bugfixes
--------
- Fix a bug introduced in Synapse v1.40.0 where changing a user's display name or avatar in a restricted room would cause an authentication error. ([\#10933](https://github.com/matrix-org/synapse/issues/10933))
- Fix `/admin/whois/{user_id}` endpoint, which was broken in v1.44.0rc1. ([\#10968](https://github.com/matrix-org/synapse/issues/10968))
* Introduce `should_include_local_users_in_dir`
We exclude three kinds of local users from the user_directory tables. At
present we don't consistently exclude all three in the same places. This
commit introduces a new function to gather those exclusion conditions
together. Because we have to handle local and remote users in different
ways, I've made that function only consider the case of remote users.
It's the caller's responsibility to make the local versus remote
distinction clear and correct.
A test fixup is required. The test now hits a path which makes db
queries against the users table. The expected rows were missing, because
we were using a dummy user that hadn't actually been registered.
We also add new test cases to covert the exclusion logic.
----
By my reading this makes these changes:
* When an app service user registers or changes their profile, they will
_not_ be added to the user directory. (Previously only support and
deactivated users were excluded). This is consistent with the logic that
rebuilds the user directory. See also [the discussion
here](https://github.com/matrix-org/synapse/pull/10914#discussion_r716859548).
* When rebuilding the directory, exclude support and disabled users from
room sharing tables. Previously only appservice users were excluded.
* Exclude all three categories of local users when rebuilding the
directory. Previously `_populate_user_directory_process_users` didn't do
any exclusion.
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
This fixes a "Event not signed by authorising server" error when
transition room member from join -> join, e.g. when updating a
display name or avatar URL for restricted rooms.
This fixes a "Event not signed by authorising server" error when
transition room member from join -> join, e.g. when updating a
display name or avatar URL for restricted rooms.
This follows a correction made in twisted/twisted#1664 and should fix our Twisted Trial CI job.
Until that change is in a twisted release, we'll have to ignore the type
of the `host` argument. I've raised #10899 to remind us to review the
issue in a few months' time.
Fix event context for outlier causing failures in all of the MSC2716
Complement tests.
The `EventContext.for_outlier` refactor happened in
https://github.com/matrix-org/synapse/pull/10883
and this spot was left out.
* Pull out GetUserDirectoryTables helper
* Don't rebuild the dir in tests that don't need it
In #10796 I changed registering a user to add directory entries under.
This means we don't have to force a directory regbuild in to tests of
the user directory search.
* Move test_initial to tests/storage
* Add type hints to both test_user_directory files
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Broadly, the existing `event_auth.check` function has two parts:
* a validation section: checks that the event isn't too big, that it has the rught signatures, etc.
This bit is independent of the rest of the state in the room, and so need only be done once
for each event.
* an auth section: ensures that the event is allowed, given the rest of the state in the room.
This gets done multiple times, against various sets of room state, because it forms part of
the state res algorithm.
Currently, this is implemented with `do_sig_check` and `do_size_check` parameters, but I think
that makes everything hard to follow. Instead, we split the function in two and call each part
separately where it is needed.
Before Synapse 1.31 (#9411), we relied on `outlier` being stored in the
`internal_metadata` column. We can now assume nobody will roll back their
deployment that far and drop the legacy support.
* Inline `_check_event_auth` for outliers
When we are persisting an outlier, most of `_check_event_auth` is redundant:
* `_update_auth_events_and_context_for_auth` does nothing, because the
`input_auth_events` are (now) exactly the event's auth_events,
which means that `missing_auth` is empty.
* we don't care about soft-fail, kicking guest users or `send_on_behalf_of`
for outliers
... so the only thing that matters is the auth itself, so let's just do that.
* `_auth_and_persist_fetched_events_inner`: de-async `prep`
`prep` no longer calls any `async` methods, so let's make it synchronous.
* Simplify `_check_event_auth`
We no longer need to support outliers here, which makes things rather simpler.
* changelog
* lint