This implements an allow list for content types for which Synapse will attempt URL preview. If a URL resolves to a resource with a content type which isn't in the list, the download will terminate immediately.
This makes sense given that Synapse would never successfully generate a URL preview for such files in the first place, and helps prevent issues with streaming media servers, such as #8302.
Signed-off-by: Denis Kasak dkasak@termina.org.uk
Part of the Tchap Synapse mainlining.
This allows modules to implement extra logic to figure out whether a given 3PID can be added to the local homeserver. In the Tchap use case, this will allow a Synapse module to interface with the custom endpoint /internal_info.
`start_active_span` was inconsistent as to whether it would activate the span
immediately, or wait for `scope.__enter__` to happen (it depended on whether
the current logcontext already had an associated scope). The inconsistency was
rather confusing if you were hoping to set up a couple of separate spans before
activating either.
Looking at the other implementations of opentracing `ScopeManager`s, the
intention is that it *should* be activated immediately, as the name
implies. Indeed, the idea is that you don't have to use the scope as a
contextmanager at all - you can just call `.close` on the result. Hence, our
cleanup has to happen in `.close` rather than `.__exit__`.
So, the main change here is to ensure that `start_active_span` does activate
the span, and that `scope.close()` does close the scope.
We also add some tests, which requires a `tracer` param so that we don't have
to rely on the global variable in unit tests.
Only allow files which file size and content types match configured
limits to be set as avatar.
Most of the inspiration from the non-test code comes from matrix-org/synapse-dinsic#19
This is in the context of mainlining the Tchap fork of Synapse. Currently in Tchap usernames are derived from the user's email address (extracted from the UIA results, more specifically the m.login.email.identity step).
This change also exports the check_username method from the registration handler as part of the module API, so that a module can check if the username it's trying to generate is correct and doesn't conflict with an existing one, and fallback gracefully if not.
Co-authored-by: David Robertson <davidr@element.io>
This is some odds and ends found during the review of #11791
and while continuing to work in this code:
* Return attrs classes instead of dictionaries from some methods
to improve type safety.
* Call `get_bundled_aggregations` fewer times.
* Adds a missing assertion in the tests.
* Do not return empty bundled aggregations for an event (preferring
to not include the bundle at all, as the docstring states).
This is mostly motivated by the tchap use case, where usernames are automatically generated from the user's email address (in a way that allows figuring out the email address from the username). Therefore, it's an issue if we respond to requests on /register and /register/available with M_USER_IN_USE, because it can potentially leak email addresses (which include the user's real name and place of work).
This commit adds a flag to inhibit the M_USER_IN_USE errors that are raised both by /register/available, and when providing a username early into the registration process. This error will still be raised if the user completes the registration process but the username conflicts. This is particularly useful when using modules (https://github.com/matrix-org/synapse/pull/11790 adds a module callback to set the username of users at registration) or SSO, since they can ensure the username is unique.
More context is available in the PR that introduced this behaviour to synapse-dinsic: matrix-org/synapse-dinsic#48 - as well as the issue in the matrix-dinsic repo: matrix-org/matrix-dinsic#476
==============================
Bugfixes
--------
- Fix a bug introduced in Synapse 1.40.0 that caused Synapse to fail to process incoming federation traffic after handling a large amount of events in a v1 room. ([\#11806](https://github.com/matrix-org/synapse/issues/11806))
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEgQG31Z317NrSMt0QiISIDS7+X/QFAmHum5QTHGFuZHJld0Bh
bW9yZ2FuLnh5egAKCRCIhIgNLv5f9G6vD/9Dw6V0687vzahnU6aYWaecRSO1sbww
EtcCiXOh0r8HXPAwEIXJYomSTTbPl0eAwP8T1the1WZVQArvRW2VvMzDoBheO8bt
dCm4CTKFCGUsI/GrlFDPtjAEd6kruAETmpHZn7bAFSqtIpissD6FjUwg/ND/NJfM
zVM/bMgM0+js6eD9J5/k16E1ZWj7Lbp+/fKN+qTeQrXzIeT13WQZ4Nz1o/cqe/21
o/coI3FmYq8CzKpfA6qZMDd/OtYpYqwwr7otSmW+6qHeZI/yqeoxlpzfgSZGpRUe
mtXqbQllZQeqrbm8oK96GNmKcThy6awTwJPoD46b1AOkmFdf/lGSqO0lnTQRRPqR
hyPPdrx6Lt2t7DDbVVyUElzkqLPhVJtrItPDC8669sWvmSgsPQdUqIsRmhF+8aJe
ffjRvKbGRICnNZkT+qGf1HwuMHajZyAIHAS/kyFDKUZCvau3VQ1wlyOqVQ1hWr7+
3k3CobjSx4y1bYKXncAK6hWH6lE8M319jaTnVfYXocDLWRonyFf7o286Q+c8WBGF
tY0FzvUPb5S2kktC4WLwcqtcTWK1cu5MI6GfD69EqL7iifJRVyKDeUoD7tiUvgzN
O2wBl2soJIsAU+8y16WQu7p+k2nZIomCCAySnW3C8mlxvv7CKQu0Url8J7IH8uZE
ZVN2MMe7H+ZHJQ==
=Fpsa
-----END PGP SIGNATURE-----
Merge tag 'v1.51.0rc2' into develop
Synapse 1.51.0rc2 (2022-01-24)
==============================
Bugfixes
--------
- Fix a bug introduced in Synapse 1.40.0 that caused Synapse to fail to process incoming federation traffic after handling a large amount of events in a v1 room. ([\#11806](https://github.com/matrix-org/synapse/issues/11806))
Always add state.room_id after the configurable ORDER BY. Otherwise,
for any sort, certain pages can contain results from
other pages. (Especially when sorting by creator, since there may
be many rooms by the same creator)
* Document different order direction of numerical fields
"joined_members", "joined_local_members", "version" and "state_events"
are ordered in descending direction by default (dir=f). Added a note
in tests to explain the differences in ordering.
Signed-off-by: Daniël Sonck <daniel@sonck.nl>
Currently when puppeting another user, the user doing the puppeting is
tracked for client IPs and MAU (if configured).
When tracking MAU is important, it becomes necessary to be possible to
also track the client IPs and MAU of puppeted users. As an example a
client that manages user creation and creation of tokens via the Synapse
admin API, passing those tokens for the client to use.
This PR adds optional configuration to enable tracking of puppeted users
into monthly active users. The default behaviour stays the same.
Signed-off-by: Jason Robinson <jasonr@matrix.org>
* Deal with mypy errors w/ type-hinted pynacl 1.5.0
Fixes#11644.
I really don't like that we're monkey patching pynacl SignedKey
instances with alg and version objects. But I'm too scared to make the
changes necessary right now.
(Ideally I would replace `signedjson.types.SingingKey` with a runtime class which
wraps or inherits from `nacl.signing.SigningKey`.) C.f. https://github.com/matrix-org/python-signedjson/issues/16
* Deal with mypy errors w/ type-hinted pynacl 1.5.0
Fixes#11644.
I really don't like that we're monkey patching pynacl SignedKey
instances with alg and version objects. But I'm too scared to make the
changes necessary right now.
(Ideally I would replace `signedjson.types.SingingKey` with a runtime class which
wraps or inherits from `nacl.signing.SigningKey`.) C.f. https://github.com/matrix-org/python-signedjson/issues/16
By returning all of the m.space.child state of the space, not just
the first 50. The number of rooms returned is still capped at 50.
For the federation API this implies that the requesting server will
need to individually query for any other rooms it is not joined to.
* Optionally use an on-disk sqlite db in tests
When debugging a test it is sometimes useful to inspect the state of the
DB. This is not easy when the db is in-memory: one cannot attach the
sqlite CLI to another process's DB.
With this change, if SYNAPSE_TEST_PERSIST_SQLITE_DB is set, we use
`_trial_temp/test.db` as our sqlite database. One can then use
`sqlite3 _trial_temp/test.db` and query to your heart's content.
The DB is destroyed and recreated between different test cases.
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
This makes the serialization of events synchronous (and it no
longer access the database), but we must manually calculate and
provide the bundled aggregations.
Overall this should cause no change in behavior, but is prep work
for other improvements.
* `_auth_and_persist_outliers`: mark persisted events as outliers
Mark any events that get persisted via `_auth_and_persist_outliers` as, well,
outliers.
Currently this will be a no-op as everything will already be flagged as an
outlier, but I'm going to change that.
* `process_remote_join`: stop flagging as outlier
The events are now flagged as outliers later on, by `_auth_and_persist_outliers`.
* `send_join`: remove `outlier=True`
The events created here are returned in the result of `send_join` to
`FederationHandler.do_invite_join`. From there they are passed into
`FederationEventHandler.process_remote_join`, which passes them to
`_auth_and_persist_outliers`... which sets the `outlier` flag.
* `get_event_auth`: remove `outlier=True`
stop flagging the events returned by `get_event_auth` as outliers. This method
is only called by `_get_remote_auth_chain_for_event`, which passes the results
into `_auth_and_persist_outliers`, which will flag them as outliers.
* `_get_remote_auth_chain_for_event`: remove `outlier=True`
we pass all the events into `_auth_and_persist_outliers`, which will now flag
the events as outliers.
* `_check_sigs_and_hash_and_fetch`: remove unused `outlier` parameter
This param is now never set to True, so we can remove it.
* `_check_sigs_and_hash_and_fetch_one`: remove unused `outlier` param
This is no longer set anywhere, so we can remove it.
* `get_pdu`: remove unused `outlier` parameter
... and chase it down into `get_pdu_from_destination_raw`.
* `event_from_pdu_json`: remove redundant `outlier` param
This is never set to `True`, so can be removed.
* changelog
* update docstring
This adds some opentracing annotations to ResponseCache, to make it easier to see what's going on; in particular, it adds a link back to the initial trace which is actually doing the work of generating the response.
* Disable aggregation bundling on `/sync` responses
A partial revert of #11478. This turns out to have had a significant CPU impact
on initial-sync handling. For now, let's disable it, until we find a more
efficient way of achieving this.
* Fix tests.
Co-authored-by: Patrick Cloke <patrickc@matrix.org>
* Splits the logic for parsing HTML from the resource handling code.
* Fix a circular import in the oEmbed code (which uses the HTML parsing code).
* Renames some of the HTML parsing methods to:
* Make it clear which methods are "internal" to the module.
* Clarify what the methods do.
Due to updates to MSC2675 this includes a few fixes:
* Include bundled aggregations for /sync.
* Do not include bundled aggregations for /initialSync and /events.
* Do not bundle aggregations for state events.
* Clarifies comments and variable names.
by calling into `make_test_homeserver_synchronous`.
The function *could* have been inlined at this point but the function is big enough
and it felt fine to leave it as is.
At least there isn't a confusing name clash anymore!
It had no users.
We have just taken the identity of a previous function but don't provide the same
behaviour, so we need to fix this in the next commit...