* Limit AS transactions to 100 events
* Update changelog.d/8606.feature
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
* Add tests
* Update synapse/appservice/scheduler.py
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
I noticed in https://github.com/matrix-org/synapse/issues/8575 that the `end_error` variable in `synapse_port_db` is set to an `Exception`, even though later we expect it to be a `str`.
This PR simply casts an exception raised to a string. I'm doing this instead of having `end_error` be of type exception as we explicitly set `end_error` to a str here:
d25eb8f370/scripts/synapse_port_db (L542-L547)
This whole file could probably use some heavy refactoring, but until then at least this fix will prevent exception contents from being hidden from us and users.
* Add `DeferredCache.get_immediate` method
A bunch of things that are currently calling `DeferredCache.get` are only
really interested in the result if it's completed. We can optimise and simplify
this case.
* Remove unused 'default' parameter to DeferredCache.get()
* another get_immediate instance
* type annotations for LruCache
* changelog
* Apply suggestions from code review
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
* review comments
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
This implements a more standard API for instantiating a homeserver and
moves some of the dependency injection into the test suite.
More concretely this stops using `setattr` on all `kwargs` passed to `HomeServer`.
This PR makes several changes to the `./scripts-dev/lint.sh` script, which lints the codebase with a number of tools:
* Adds usage information, with `-h` flag to show it. Otherwise it will show when providing an unknown flag.
* Adds option `-d` which will check both staged and unstaged files that have changed since the last commit and add them to the list of files to lint.
- Note that only files without an extension, or with a `.py` extension will be allowed. This prevents editing bash scripts causing the linters to break on non-python files.
* Improves the print-out of which files/directories are being linted.
We asserted that the IDs returned by postgres sequence was greater than
any we had seen, however this is technically racey as we may update the
current positions out of order.
We now assert that the sequences are correct on startup, so the
assertion is no longer really required, so we remove them.
Autocommit means that we don't wrap the functions in transactions, and instead get executed directly. Introduced in #8456. This will help:
1. reduce the number of `could not serialize access due to concurrent delete` errors that we see (though there are a few functions that often cause serialization errors that we don't fix here);
2. improve the DB performance, as it no longer needs to deal with the overhead of `REPEATABLE READ` isolation levels; and
3. improve wall clock speed of these functions, as we no longer need to send `BEGIN` and `COMMIT` to the DB.
Some notes about the differences between autocommit mode and our default `REPEATABLE READ` transactions:
1. Currently `autocommit` only applies when using PostgreSQL, and is ignored when using SQLite (due to silliness with [Twisted DB classes](https://twistedmatrix.com/trac/ticket/9998)).
2. Autocommit functions may get retried on error, which means they can get applied *twice* (or more) to the DB (since they are not in a transaction the previous call would not get rolled back). This means that the functions need to be idempotent (or otherwise not care about being called multiple times). Read queries, simple deletes, and updates/upserts that replace rows (rather than generating new values from existing rows) are all idempotent.
3. Autocommit functions no longer get executed in [`REPEATABLE READ`](https://www.postgresql.org/docs/current/transaction-iso.html) isolation level, and so data can change queries, which is fine for single statement queries.
We asserted that the IDs returned by postgres sequence was greater than
any we had seen, however this is technically racey as we may update the
current positions out of order.
We now assert that the sequences are correct on startup, so the
assertion is no longer really required, so we remove them.
* Fix outbound federaion with multiple event persisters.
We incorrectly notified federation senders that the minimum persisted
stream position had advanced when we got an `RDATA` from an event
persister.
Notifying of federation senders already correctly happens in the
notifier, so we just delete the offending line.
* Change some interfaces to use RoomStreamToken.
By enforcing use of `RoomStreamTokens` we make it less likely that
people pass in random ints that they got from somewhere random.
Currently background proccesses stream the events stream use the "minimum persisted position" (i.e. `get_current_token()`) rather than the vector clock style tokens. This is broadly fine as it doesn't matter if the background processes lag a small amount. However, in extreme cases (i.e. SyTests) where we only write to one event persister the background processes will never make progress.
This PR changes it so that the `MultiWriterIDGenerator` keeps the current position of a given instance as up to date as possible (i.e using the latest token it sees if its not in the process of persisting anything), and then periodically announces that over replication. This then allows the "minimum persisted position" to advance, albeit with a small lag.
This could, very occasionally, cause:
```
tests.test_visibility.FilterEventsForServerTestCase.test_large_room
===============================================================================
[ERROR]
Traceback (most recent call last):
File "/src/tests/rest/media/v1/test_media_storage.py", line 86, in test_ensure_media_is_in_local_cache
self.wait_on_thread(x)
File "/src/tests/unittest.py", line 296, in wait_on_thread
self.reactor.advance(0.01)
File "/src/.tox/py35/lib/python3.5/site-packages/twisted/internet/task.py", line 826, in advance
self._sortCalls()
File "/src/.tox/py35/lib/python3.5/site-packages/twisted/internet/task.py", line 787, in _sortCalls
self.calls.sort(key=lambda a: a.getTime())
builtins.ValueError: list modified during sort
tests.rest.media.v1.test_media_storage.MediaStorageTests.test_ensure_media_is_in_local_cache
```
This PR allows Synapse modules making use of the `ModuleApi` to create and send non-membership events into a room. This can useful to have modules send messages, or change power levels in a room etc. Note that they must send event through a user that's already in the room.
The non-membership event limitation is currently arbitrary, as it's another chunk of work and not necessary at the moment.
==============================
Bugfixes
--------
- Fix duplication of events on high traffic servers, caused by PostgreSQL `could not serialize access due to concurrent update` errors. ([\#8456](https://github.com/matrix-org/synapse/issues/8456))
Internal Changes
----------------
- Add Groovy Gorilla to the list of distributions we build `.deb`s for. ([\#8475](https://github.com/matrix-org/synapse/issues/8475))
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEBTGR3/RnAzBGUif3pULk7RsPrAkFAl9+7JIQHGVyaWtAbWF0
cml4Lm9yZwAKCRClQuTtGw+sCUc1B/9Bz8uAEJWX9qahuWZGrsvj0ukF/9O6YWrU
ZZGHnyaAXg4MF0lR6DVCV2NhW/l3RWO01G/LgGT//0vac+EFxWDbjuaHP9pRVBln
02TQpn12ceZtW/n+X+8wmQzfSn1zNWRUycV43MyGu49z3IAG7KVjjVDQPkY0mEHC
h6THy30SYIOKDCdXUUL9i+sV1JLzKPX0NlzO3ONKe2LbfLiBTvdfDRlZGq6rJu5H
GAsIPwgadhanhXBPhG4WrealG31TkcPbZCwFMv9aFzltU841jps+QN6OKvbGFVer
FboD2df0ZJqKNJOJVVlOMs7I5uSYbrg+n1zolZfrbo0lk/xEyIAT
=F5cw
-----END PGP SIGNATURE-----
Merge tag 'v1.21.0rc3' into develop
Synapse 1.21.0rc3 (2020-10-08)
==============================
Bugfixes
--------
- Fix duplication of events on high traffic servers, caused by PostgreSQL `could not serialize access due to concurrent update` errors. ([\#8456](https://github.com/matrix-org/synapse/issues/8456))
Internal Changes
----------------
- Add Groovy Gorilla to the list of distributions we build `.deb`s for. ([\#8475](https://github.com/matrix-org/synapse/issues/8475))
Added shields directing to synapse-dev room, showing license, latest version on PyPi and supported Python versions.
I've moved substitution definitions to the bottom to improve readability.
Signed-off-by: Mateusz Przybyłowicz <uamfhq@gmail.com>
We call `_update_stream_positions_table_txn` a lot, which is an UPSERT
that can conflict in `REPEATABLE READ` isolation level. Instead of doing
a transaction consisting of a single query we may as well run it outside
of a transaction.
We call `_update_stream_positions_table_txn` a lot, which is an UPSERT
that can conflict in `REPEATABLE READ` isolation level. Instead of doing
a transaction consisting of a single query we may as well run it outside
of a transaction.
Currently when using multiple event persisters we (in the worst case) don't tell clients about events until all event persisters have persisted new events after the original event. This is a suboptimal, especially if one of the event persisters goes down.
To handle this, we encode the position of each event persister in the room tokens so that we can send events to clients immediately. To reduce the size of the token we do two things:
1. We create a unique immutable persistent mapping between instance names and a generated small integer ID, which we can encode in the tokens instead of the instance name; and
2. We encode the "persisted upto position" of the room token and then only explicitly include instances that have positions strictly greater than that.
The new tokens look something like: `m3478~1.3488~2.3489`, where the first number is the min position, and the subsequent `-` separated pairs are the instance ID to positions map. (We use `.` and `~` as separators as they're URL safe and not already used by `StreamToken`).
It seems most of these blacklisted tests do actually pass most of the time.
I'm of the opinion that having them blacklisted here means there is very little incentive for us to deflake any flaky tests, and meanwhile any value in those tests is completely lost.
Lots of different module apis is not easy to maintain.
Rather than adding yet another ModuleApi(hs, hs.get_auth_handler()) incantation, first add an hs.get_module_api() method and use it where possible.
https://github.com/matrix-org/synapse/tree/develop/docs/sphinx doesn't seem to really be utilised or changed recently since the initial commit. I like the idea of exportable documentation of the codebase, but at the moment after running through the build instructions the generated website wasn't very useful...
* Optimise and test state fetching for 3p event rules
Getting all the events at once is much more efficient than getting them
individually
* Test that 3p event rules can modify events
PR #8292 tried to maintain backwards compat with modules which don't provide a
`check_visibility_can_be_modified` method, but the tests weren't being run,
and the check didn't work.
This PR allows `ThirdPartyEventRules` modules to view, manipulate and block changes to the state of whether a room is published in the public rooms directory.
While the idea of whether a room is in the public rooms list is not kept within an event in the room, `ThirdPartyEventRules` generally deal with controlling which modifications can happen to a room. Public rooms fits within that idea, even if its toggle state isn't controlled through a state event.
There's no need for it to be in the dict as well as the events table. Instead,
we store it in a separate attribute in the EventInternalMetadata object, and
populate that on load.
This means that we can rely on it being correctly populated for any event which
has been persited to the database.
This is so we can tell what is going on when things are taking a while to start up.
The main change here is to ensure that transactions that are created during startup get correctly logged like normal transactions.
#7124 changed the behaviour of remote thumbnails so that the thumbnailing method was included in the filename of the thumbnail. To support existing files, it included a fallback so that we would check the old filename if the new filename didn't exist.
Unfortunately, it didn't apply this logic to storage providers, so any thumbnails stored on such a storage provider was broken.