During the migration the automated script to update the copyright
headers accidentally got rid of some of the existing copyright lines.
Reinstate them.
The current query supports passing in a list of users, which generates a
query using `user_id = ANY(..)`. This is generates a less efficient
query plan that is notably slower than a simple `user_id = ?` condition.
Note: The new function is mostly a copy and paste and then a
simplification of the existing function.
The crux of the change is to try and make the queries simpler and pull
out fewer rows. Before, there were quite a few joins against subqueries,
which caused postgres to pull out more rows than necessary.
Instead, let's simplify the query and do some of the filtering out in
Python instead, letting Postgres do better optimizations now that it
doesn't have to deal with joins against subqueries.
Review note: this is a complete rewrite of the function, so not sure how
useful the diff is.
---------
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
There are two changes here:
1. Only pull out the required state when handling the request.
2. Change the get filtered state return type to check that we're only
querying state that was requested
---------
Co-authored-by: reivilibre <oliverw@matrix.org>
There are a couple of things we need to be careful of here:
1. The current python code does no validation when loading from the DB,
so we need to be careful to ignore such errors (at least on jki.re there
are some old events with internal metadata fields of the wrong type).
2. We want to be memory efficient, as we often have many hundreds of
thousands of events in the cache at a time.
---------
Co-authored-by: Quentin Gliech <quenting@element.io>
We remove these fields as they're just duplicating data the event
already stores, and (for reasons 🤫) I'd like to simplify
the class to only store simple types.
I'm not entirely convinced that we shouldn't instead add helper methods
to the event class to generate stream tokens, but I don't really think
that's where they belong either
* Describe `insert_client_ip`
* Pull out client_ips and MAU tracking to BaseAuth
* Define HAS_AUTHLIB once in tests
sick of copypasting
* Track ips and token usage when delegating auth
* Test that we track MAU and user_ips
* Don't track `__oidc_admin`
Keeping track of a lower bound of stream ID where we've deleted everything below makes the queries much faster. Otherwise, every time we scan for rows to delete we'd re-scan across all the rows that have previously deleted (until the next table VACUUM).
If simple_{insert,upsert,update}_many_txn is called without any data
to modify then return instead of executing the query.
This matches the behavior of simple_{select,delete}_many_txn.
Fetch information needed for push rule evaluation in parallel.
Ideally this would use query pipelining, but this is not
available in psycopg2.
Due to the database thread pool this may result in little
to no parallelization.
The event persistence code used to handle multiple rooms
at a time, but was simplified to only ever be called with a
single room at a time (different rooms are now handled in
parallel). The code is still generic to multiple rooms causing
a lot of work that is unnecessary (e.g. unnecessary loops, and
partitioning data by room).
This strips out the ability to handle multiple rooms at once, greatly
simplifying the code.
Just to standardize on the normal helpers, it might also have
a slight perf improvement on PostgreSQL which will now use
`ANY (?)` instead of `IN (?, ?, ...)`.
This is mostly useful for federated rooms where some users
would get stuck in the invite or knock state when the room
was purged from their homeserver.
This could happen if the last rows in the account data stream were inserted into `account_data`. After a restart the max account ID would be calculated without looking at the `account_data` table, and so have an old ID.
This splits thinsg into two queries, but most of the time we won't have
new event backwards extremities so this shouldn't actually add an extra
RTT for the majority of cases.
Note this removes the check for events with no prev events, but that was
part of MSC2716 work that has since been removed.
This avoids calling cursor_to_dict and then immediately
unpacking the values in the dict for other users. By not
creating the intermediate dictionary we can avoid allocating
the dictionary and strings for the keys, which should generally
be more performant.
Additionally this improves type hints by avoid Dict[str, Any]
dictionaries coming out of the database layer.
Co-authored-by: David Robertson <davidr@element.io>
Co-authored-by: Patrick Cloke <patrickc@matrix.org>
Co-authored-by: Erik Johnston <erik@matrix.org>
Assert that the return type of callables wrapped in @cached
and @cachedList are cachable (aka immutable).
Refresh tokens were not correctly moved to the rehydrated
device (similar to how the access token is currently handled).
This resulted in invalid refresh tokens after rehydration.
* Fix rare bug that broke looping calls
We can't interact with the reactor from the main thread via looping
call.
Introduced in v1.90.0 / #15791.
* Newsfile
* Properly update retry_last_ts when hitting the maximum retry interval
This was broken in 1.87 when the maximum retry interval got changed from
almost infinite to a week (and made configurable).
fixes#16101
Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>
* Add changelog
* Change fix + add test
* Add comment
---------
Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>
Co-authored-by: Mathieu Velten <mathieuv@matrix.org>
If we don't have all the auth events in a room then not all state events will have a chain cover index. Even so, we can still use the chain cover index on the events that do have it, rather than bailing and using the slower functions.
This situation should not arise for newly persisted rooms, as we check we have the full auth chain for each event, but can happen for existing rooms.
c.f. #15245
We were seeing serialization errors when taking out multiple read locks.
The transactions were retried, so isn't causing any failures.
Introduced in #15782.