Fix a potentially-huge sql query (#7274)

We could end up looking up tens of thousands of events, which could cause large
amounts of data to be logged to the postgres log.
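The fix batches the lookups using Synapse's `batch_iter` helper. A minimal sketch of such a chunking helper (an illustration, not the exact Synapse implementation) might look like:

```python
from itertools import islice
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")

def batch_iter(iterable: Iterable[T], size: int) -> Iterator[Tuple[T, ...]]:
    """Yield successive tuples of up to `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = tuple(islice(it, size))
        if not batch:
            return
        yield batch
```

Each batch can then be turned into a bounded `IN (...)` clause, keeping any single query (and any entry it produces in the slow-query log) a manageable size.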
Richard van der Hoff 2020-04-15 10:16:35 +01:00 committed by GitHub
parent f1097e7720
commit f2049a8d21
2 changed files with 17 additions and 7 deletions

changelog.d/7274.bugfix (new file, +1)

@@ -0,0 +1 @@
+Fix a sql query introduced in Synapse 1.12.0 which could cause large amounts of logging to the postgres slow-query log.

@@ -173,19 +173,28 @@ class EventFederationWorkerStore(EventsWorkerStore, SignatureWorkerStore, SQLBas
             for event_id in initial_events
         }

+        # The sorted list of events whose auth chains we should walk.
+        search = []  # type: List[Tuple[int, str]]
+
         # We need to get the depth of the initial events for sorting purposes.
         sql = """
             SELECT depth, event_id FROM events
             WHERE %s
-            ORDER BY depth ASC
         """
-        clause, args = make_in_list_sql_clause(
-            txn.database_engine, "event_id", initial_events
-        )
-        txn.execute(sql % (clause,), args)
+        # the list can be huge, so let's avoid looking them all up in one massive
+        # query.
+        for batch in batch_iter(initial_events, 1000):
+            clause, args = make_in_list_sql_clause(
+                txn.database_engine, "event_id", batch
+            )
+            txn.execute(sql % (clause,), args)

-        # The sorted list of events whose auth chains we should walk.
-        search = txn.fetchall()  # type: List[Tuple[int, str]]
+            # I think building a temporary list with fetchall is more efficient than
+            # just `search.extend(txn)`, but this is unconfirmed
+            search.extend(txn.fetchall())
+
+        # sort by depth
+        search.sort()

         # Map from event to its auth events
         event_to_auth_events = {}  # type: Dict[str, Set[str]]